swh.indexer.codemeta module

swh.indexer.codemeta.make_absolute_uri(local_name)

Looks up local_name in codemeta.jsonld and returns the @id (the absolute URI) of the term defined there.

>>> make_absolute_uri("name")
'http://schema.org/name'
>>> make_absolute_uri("downloadUrl")
'http://schema.org/downloadUrl'
>>> make_absolute_uri("referencePublication")
'https://codemeta.github.io/terms/referencePublication'
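
The lookup can be pictured as resolving a local name against a JSON-LD context. The following is a minimal sketch and not the module's implementation: `TOY_CONTEXT` and `toy_absolute_uri` are hypothetical names, and the context is a hand-written excerpt standing in for the real codemeta.jsonld.

```python
# Hypothetical, hand-written excerpt standing in for codemeta.jsonld.
TOY_CONTEXT = {
    "schema": "http://schema.org/",
    "codemeta": "https://codemeta.github.io/terms/",
    "name": {"@id": "schema:name"},
    "downloadUrl": {"@id": "schema:downloadUrl"},
    "referencePublication": {"@id": "codemeta:referencePublication"},
}


def toy_absolute_uri(local_name: str, context: dict = TOY_CONTEXT) -> str:
    """Resolve a local term name to an absolute URI using a toy context."""
    definition = context[local_name]
    # A term definition is either a plain string or a dict carrying "@id".
    term_id = definition["@id"] if isinstance(definition, dict) else definition
    prefix, sep, suffix = term_id.partition(":")
    # Expand a compact IRI such as "schema:name" when the prefix is declared
    # in the context; leave already-absolute URIs ("http://...") untouched.
    if sep and isinstance(context.get(prefix), str) and not suffix.startswith("//"):
        return context[prefix] + suffix
    return term_id


print(toy_absolute_uri("name"))  # http://schema.org/name
```

The sketch only handles one level of prefix expansion, which is enough to reproduce the three doctest results above on the toy context.
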

swh.indexer.codemeta.read_crosstable(fd: TextIO) → Tuple[Set[str], Dict[str, Dict[str, URIRef]]]

Given a file-like object for a CodeMeta crosswalk table (either the main cross-table with all columns, or an auxiliary table with just the CodeMeta column and one ecosystem-specific column), returns a set of all CodeMeta terms and a dictionary {ecosystem: {ecosystem_term: codemeta_term}}.
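
To make the shape of the return value concrete, here is a simplified sketch built on the standard csv module. It is an illustration under assumptions, not the library's code: `toy_read_crosstable` is a hypothetical name, the set of non-ecosystem column names is assumed, the sample rows are made up, and terms are kept as plain strings rather than URIRef objects.

```python
import csv
import io
from typing import Dict, Set, TextIO, Tuple

# Assumed names of the columns that do not describe an ecosystem.
NON_ECOSYSTEM_COLUMNS = {"Parent Type", "Property", "Type", "Description"}


def toy_read_crosstable(fd: TextIO) -> Tuple[Set[str], Dict[str, Dict[str, str]]]:
    """Return (all CodeMeta terms, {ecosystem: {ecosystem_term: codemeta_term}})."""
    reader = csv.DictReader(fd)
    ecosystems = [c for c in (reader.fieldnames or []) if c not in NON_ECOSYSTEM_COLUMNS]
    terms: Set[str] = set()
    translation: Dict[str, Dict[str, str]] = {eco: {} for eco in ecosystems}
    for row in reader:
        codemeta_term = (row.get("Property") or "").strip()
        if not codemeta_term:
            continue  # skip rows that do not name a CodeMeta term
        terms.add(codemeta_term)
        for eco in ecosystems:
            ecosystem_term = (row.get(eco) or "").strip()
            if ecosystem_term:
                translation[eco][ecosystem_term] = codemeta_term
    return terms, translation


# Made-up auxiliary table: the CodeMeta column plus one ecosystem column.
sample = io.StringIO(
    "Property,Description,NodeJS\n"
    "name,The name of the item,name\n"
    "codeRepository,Link to the repository,repository\n"
    "license,A license document,\n"
)
terms, translation = toy_read_crosstable(sample)
print(sorted(terms))  # ['codeRepository', 'license', 'name']
print(translation)    # {'NodeJS': {'name': 'name', 'repository': 'codeRepository'}}
```

Note that a CodeMeta term with no equivalent in an ecosystem (`license` in the sample) still appears in the set of terms but not in that ecosystem's mapping.
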

swh.indexer.codemeta.compact(doc, forgefed: bool)

Same as pyld.jsonld.compact, but in the context of CodeMeta.

Parameters:

forgefed – Whether to add ForgeFed and ActivityStreams as compact URIs. This is typically used for extrinsic metadata documents, which frequently use properties from these namespaces.

swh.indexer.codemeta.expand(doc)

Same as pyld.jsonld.expand, but in the context of CodeMeta.
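
For readers unfamiliar with JSON-LD, the relationship between the compact and expanded forms can be sketched as follows. This is a deliberately simplified stand-in that only handles flat documents with literal values; `TOY_TERMS`, `toy_expand`, and `toy_compact` are hypothetical names, and the real functions delegate to pyld, which implements the full JSON-LD algorithms.

```python
# Hypothetical two-term vocabulary standing in for the CodeMeta context.
TOY_TERMS = {
    "name": "http://schema.org/name",
    "version": "http://schema.org/version",
}


def toy_expand(doc: dict) -> list:
    """Replace short keys with absolute URIs and wrap literals in {"@value": ...}."""
    node = {}
    for key, value in doc.items():
        if key == "@context":
            continue  # the context is consumed by expansion, not kept
        values = value if isinstance(value, list) else [value]
        node[TOY_TERMS[key]] = [{"@value": v} for v in values]
    return [node]


def toy_compact(expanded: list) -> dict:
    """Inverse of toy_expand: shorten URIs and unwrap single values."""
    reverse = {uri: term for term, uri in TOY_TERMS.items()}
    doc = {}
    for uri, values in expanded[0].items():
        plain = [v["@value"] for v in values]
        doc[reverse[uri]] = plain[0] if len(plain) == 1 else plain
    return doc


expanded = toy_expand({"name": "example", "version": "1.0"})
print(expanded)
# [{'http://schema.org/name': [{'@value': 'example'}],
#   'http://schema.org/version': [{'@value': '1.0'}]}]
print(toy_compact(expanded))  # {'name': 'example', 'version': '1.0'}
```

The expanded form is unambiguous but verbose, which is why it suits processing, while the compact form is what metadata files usually contain.
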

swh.indexer.codemeta.merge_documents(documents)

Takes a list of metadata dicts, each generated from a different metadata file, and merges them.

Removes duplicates, if any.
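
The idea of merging with deduplication can be sketched on plain dictionaries. This is a minimal illustration, not the algorithm used by merge_documents: `toy_merge_documents` is a hypothetical helper, the input documents are made up, and every value is normalized to a list for simplicity.

```python
from typing import Any, Dict, Iterable, List


def toy_merge_documents(documents: Iterable[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Merge dicts key by key, keeping each distinct value once, in first-seen order."""
    merged: Dict[str, List[Any]] = {}
    for document in documents:
        for key, value in document.items():
            values = value if isinstance(value, list) else [value]
            bucket = merged.setdefault(key, [])
            for item in values:
                if item not in bucket:  # drop duplicates across documents
                    bucket.append(item)
    return merged


docs = [
    {"name": "example", "license": "MIT"},
    {"name": "example", "author": ["Alice"]},
    {"author": ["Alice", "Bob"]},
]
print(toy_merge_documents(docs))
# {'name': ['example'], 'license': ['MIT'], 'author': ['Alice', 'Bob']}
```
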