swh.indexer.metadata_dictionary.utils module

swh.indexer.metadata_dictionary.utils.prettyprint_graph(graph: Graph, root: URIRef)

swh.indexer.metadata_dictionary.utils.add_list(graph: Graph, subject: Node, predicate: Identifier, objects: Sequence[Node]) → None

Adds triples to the graph so that they are equivalent to this JSON-LD object:

{
    "@id": subject,
    predicate: {"@list": objects}
}

This is a naive implementation of the JSON-LD list-to-RDF conversion algorithm: https://json-ld.org/spec/latest/json-ld-api/#list-to-rdf-conversion

swh.indexer.metadata_dictionary.utils.add_map(graph: Graph, subject: Node, predicate: Identifier, f: Callable[[Graph, TValue], Node | None], values: Iterable[TValue]) → None

Helper for add_list() that takes a mapper function f, which is called as f(graph, value) on each item of values to produce the nodes of the list.

swh.indexer.metadata_dictionary.utils.add_url_if_valid(graph: Graph, subject: Node, predicate: Identifier, url: Any) → None

Adds (subject, predicate, url) to the graph if url is well-formed.

This is meant as a workaround for digitalbazaar/pyld#91 to drop URLs that are blatantly invalid early, so PyLD does not crash.

>>> from pprint import pprint
>>> import rdflib
>>> from rdflib import Graph
>>> from swh.indexer.metadata_dictionary.utils import add_url_if_valid
>>> graph = Graph()
>>> subject = rdflib.term.URIRef("http://example.org/test-software")
>>> predicate = rdflib.term.URIRef("http://schema.org/license")
>>> # Missing ":" after the scheme: not added
>>> add_url_if_valid(
...     graph, subject, predicate, "https//www.apache.org/licenses/LICENSE-2.0.txt"
... )
>>> # Stray "s" after the colon: not added
>>> add_url_if_valid(
...     graph, subject, predicate, "http:s//www.apache.org/licenses/LICENSE-2.0.txt"
... )
>>> # Well-formed URL: added
>>> add_url_if_valid(
...     graph, subject, predicate, "https://www.apache.org/licenses/LICENSE-2.0.txt"
... )
>>> # Not a string: not added
>>> add_url_if_valid(
...     graph, subject, predicate, 42
... )
>>> pprint(set(graph.triples((subject, predicate, None))))
{(rdflib.term.URIRef('http://example.org/test-software'),
  rdflib.term.URIRef('http://schema.org/license'),
  rdflib.term.URIRef('https://www.apache.org/licenses/LICENSE-2.0.txt'))}
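
For readers wondering what "well-formed" might mean here, the sketch below shows one plausible check built on the standard library's urllib.parse. The function looks_like_valid_url is hypothetical, and the criteria it applies (a string with no spaces, a scheme, and a network location) are an assumption chosen to match the examples above, not necessarily the module's actual rules.

```python
from typing import Any
from urllib.parse import urlparse


def looks_like_valid_url(url: Any) -> bool:
    """Hypothetical check; the real criteria in add_url_if_valid() may differ."""
    if not isinstance(url, str) or " " in url:
        return False
    try:
        parsed = urlparse(url)
    except ValueError:  # raised, for instance, on an unbalanced "[" in the host
        return False
    # Requiring both a scheme and a network location rejects
    # "https//..." (no scheme) and "http:s//..." (no network location).
    return bool(parsed.scheme and parsed.netloc)


candidates = [
    "https//www.apache.org/licenses/LICENSE-2.0.txt",
    "http:s//www.apache.org/licenses/LICENSE-2.0.txt",
    "https://www.apache.org/licenses/LICENSE-2.0.txt",
    42,
]
accepted = [url for url in candidates if looks_like_valid_url(url)]
```

A check like this is deliberately shallow: it only filters out values that are blatantly not URLs and does not verify that the host exists or that the URL resolves.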