swh.indexer.metadata_dictionary.base module

class swh.indexer.metadata_dictionary.base.BaseMapping(log_suffix='')[source]

Bases: object

Base class for mappings to inherit from

To implement a new mapping:

  • inherit from this class

  • override the translate function
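
The two steps above can be sketched roughly as follows. This is a minimal illustration rather than the library's own code: the BaseMapping defined here is a local stand-in so the example runs without swh.indexer installed, and ExampleMapping, its "key: value" input format, and the translate signature are all assumptions made for illustration.

```python
from typing import Dict, Optional


class BaseMapping:
    """Local stand-in for the real BaseMapping (assumption, not the library class)."""

    def __init__(self, log_suffix: str = "") -> None:
        self.log_suffix = log_suffix

    @property
    def name(self) -> str:
        raise NotImplementedError

    def translate(self, raw_content: bytes) -> Optional[Dict[str, str]]:
        raise NotImplementedError


class ExampleMapping(BaseMapping):
    """Hypothetical mapping for a made-up plain-text 'key: value' format."""

    # Overrides the base property with a plain class attribute.
    name = "example"

    def translate(self, raw_content: bytes) -> Optional[Dict[str, str]]:
        try:
            text = raw_content.decode("utf-8")
        except UnicodeDecodeError:
            return None  # undecodable input yields no metadata
        metadata = {}
        for line in text.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # keep only lines that actually contain a colon
                metadata[key.strip()] = value.strip()
        return metadata or None
```

With this sketch, ExampleMapping().translate(b"title: demo") returns a one-entry dict, while input that cannot be decoded returns None.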

property name

The name of this mapping, used as an identifier in the indexer storage.

classmethod detect_metadata_files(files)[source]

Detects files potentially containing metadata.

Parameters

file_entries (list) – list of files

Returns

list of sha1 (possibly empty)

Return type

list

class swh.indexer.metadata_dictionary.base.SingleFileMapping(log_suffix='')[source]

Bases: swh.indexer.metadata_dictionary.base.BaseMapping

Base class for all mappings that use a single file as input.

property filename

The .json file to extract metadata from.
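
As a rough illustration of how a single-file mapping might use filename to pick its input, the sketch below matches directory entries by name and returns the sha1 of the first hit. The class SingleFileSketch is hypothetical, and the shape of each entry (a dict with "name" and "sha1" keys holding bytes) and the case-insensitive comparison are assumptions, not something this page states.

```python
from typing import Dict, List


class SingleFileSketch:
    """Hypothetical stand-in for a SingleFileMapping subclass."""

    # Assumption: the file name is stored and compared as bytes.
    filename = b"package.json"

    @classmethod
    def detect_metadata_files(
        cls, file_entries: List[Dict[str, bytes]]
    ) -> List[bytes]:
        # Return the sha1 of the first entry whose name matches, else an empty list.
        for entry in file_entries:
            if entry["name"].lower() == cls.filename.lower():
                return [entry["sha1"]]
        return []
```

The result is always a list, so callers can treat "no metadata file found" and "one file found" uniformly.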

classmethod detect_metadata_files(file_entries)[source]

class swh.indexer.metadata_dictionary.base.DictMapping(log_suffix='')[source]

Bases: swh.indexer.metadata_dictionary.base.BaseMapping

Base class for mappings that take as input a file that is mostly a key-value store (e.g. a shallow JSON dict).

string_fields = []

List of fields that are simple strings, and don’t need any normalization.

property mapping

A translation dict to map dict keys into a canonical name.
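
The interplay between string_fields and mapping could look something like the sketch below. DictMappingSketch, translate_dict, the normalize_<key> hook convention, and the schema.org URIs are all illustrative assumptions; the real class may dispatch differently.

```python
from typing import Any, Dict, List


class DictMappingSketch:
    """Hypothetical stand-in showing how string_fields and mapping might interact."""

    # Fields copied through unchanged when they are plain strings.
    string_fields: List[str] = ["name", "description"]

    # Source key -> canonical term (URIs chosen for illustration only).
    mapping: Dict[str, str] = {
        "name": "http://schema.org/name",
        "description": "http://schema.org/description",
        "license": "http://schema.org/license",
    }

    def normalize_license(self, value: Any) -> Dict[str, Any]:
        # Hypothetical normalizer: wrap the raw value in an identifier object.
        return {"@id": value}

    def translate_dict(self, content: Dict[str, Any]) -> Dict[str, Any]:
        translated: Dict[str, Any] = {}
        for key, value in content.items():
            term = self.mapping.get(key)
            if term is None:
                continue  # key has no canonical name, so it is dropped
            if key in self.string_fields:
                if isinstance(value, str):
                    translated[term] = value
            else:
                # Assumed convention: look up an optional normalize_<key> method.
                normalizer = getattr(self, "normalize_" + key, None)
                if normalizer is not None:
                    translated[term] = normalizer(value)
        return translated
```

Keys absent from mapping are silently ignored here, which keeps the output limited to canonical terms.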

classmethod supported_terms()[source]

class swh.indexer.metadata_dictionary.base.JsonMapping(log_suffix='')[source]

Bases: swh.indexer.metadata_dictionary.base.DictMapping, swh.indexer.metadata_dictionary.base.SingleFileMapping

Base class for all mappings that use a JSON file as input.


Translates content by parsing a bytestring containing JSON data and translating it with the appropriate mapping.

Parameters

raw_content (bytes) – raw content to translate

Returns

translated metadata in JSON-friendly form needed for the indexer

Return type
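
The parse-then-translate flow described above can be sketched as a standalone function. KEY_MAPPING and the choice to return None on malformed input are assumptions made for this example, not behavior confirmed by this page.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical key -> canonical term table, for illustration only.
KEY_MAPPING = {
    "name": "http://schema.org/name",
    "version": "http://schema.org/version",
}


def translate(raw_content: bytes) -> Optional[Dict[str, Any]]:
    """Parse JSON bytes and rename known keys to their canonical terms."""
    try:
        content = json.loads(raw_content.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None  # assumption: malformed input yields no metadata
    if not isinstance(content, dict):
        return None  # only a top-level JSON object can be mapped key by key
    return {KEY_MAPPING[k]: v for k, v in content.items() if k in KEY_MAPPING}
```

Decoding explicitly before parsing keeps the two failure modes (bad encoding, bad JSON) visible in the except clause.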