swh.indexer.metadata_dictionary.python module

class swh.indexer.metadata_dictionary.python.LinebreakPreservingEmailPolicy(**kw)[source]

Bases: email.policy.EmailPolicy

Create new Policy, possibly overriding some defaults.

See class docstring for a list of overridable attributes.

header_fetch_parse(name, value)[source]

Given the header name and the value from the model, return the value to be returned to the application program that is requesting that header. The value passed in by the email package may contain surrogateescaped binary data if the lines were parsed by a BytesParser. The returned value should not contain any surrogateescaped data.

If the value has a ‘name’ attribute, it is returned to unmodified. Otherwise the name and the value with any linesep characters removed are passed to the header_factory method, and the resulting custom header object is returned. Any surrogateescaped bytes get turned into the unicode unknown-character glyph.

class swh.indexer.metadata_dictionary.python.PythonPkginfoMapping(log_suffix='')[source]

Bases: swh.indexer.metadata_dictionary.base.DictMapping, swh.indexer.metadata_dictionary.base.SingleFileMapping

Dedicated class for Python’s PKG-INFO mapping and translation.


name = 'pkg-info'
filename = b'PKG-INFO'
mapping = {'author': 'http://schema.org/author', 'author-email': 'http://schema.org/email', 'description': 'http://schema.org/description', 'download-url': 'http://schema.org/downloadUrl', 'home-page': 'http://schema.org/url', 'keywords': 'http://schema.org/keywords', 'license': 'http://schema.org/license', 'name': 'http://schema.org/name', 'summary': 'http://schema.org/description', 'version': 'http://schema.org/version'}
string_fields = ['name', 'version', 'description', 'summary', 'author', 'author-email']

List of fields that are simple strings, and don’t need any normalization.