swh.graph.luigi.provenance module
Luigi tasks to help compute the provenance of content blobs
This module contains Luigi tasks that drive the computation of a topological order and of the number of paths to every node.
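To make the two computations named above concrete, here is a minimal pure-Python sketch on a toy DAG. It is not the module's implementation: the real tasks work on the compressed graph, and the function names `topological_order_dfs` and `count_paths` are hypothetical. It also assumes that "paths to a node" means paths starting from any node without predecessors, which the description above does not spell out.

```python
def topological_order_dfs(successors):
    """Return the nodes so that every node appears before its successors.

    `successors` maps each node to an iterable of its successors (a DAG).
    """
    visited = set()
    postorder = []
    for root in successors:
        if root in visited:
            continue
        visited.add(root)
        # Iterative DFS: the stack holds (node, iterator over its successors).
        stack = [(root, iter(successors.get(root, ())))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if child not in visited:
                    visited.add(child)
                    stack.append((child, iter(successors.get(child, ()))))
                    break
            else:
                # All successors were explored: the node is finished.
                postorder.append(node)
                stack.pop()
    # Reversed DFS postorder of a DAG is a topological order.
    return postorder[::-1]


def count_paths(successors, order):
    """Count, for each node, the paths reaching it from any source node.

    Assumption: a source is a node with no predecessor, and it counts
    as one (empty) path to itself.
    """
    has_predecessor = {c for children in successors.values() for c in children}
    counts = {node: (0 if node in has_predecessor else 1) for node in order}
    # Visiting nodes in topological order guarantees counts[node] is final
    # before it is propagated to the node's successors.
    for node in order:
        for child in successors.get(node, ()):
            counts[child] += counts[node]
    return counts


# Toy diamond-shaped graph: a -> b -> d and a -> c -> d.
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
toy_order = topological_order_dfs(toy_graph)
toy_counts = count_paths(toy_graph, toy_order)  # "d" is reached by 2 paths
```

The path count is computed in a single pass because the topological order guarantees that a node's count is complete before any of its successors reads it.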
File layout
This assumes a local compressed graph (from swh.graph.luigi.compressed_graph) is present, and generates/manipulates the following files:
base_dir/
    <date>[_<flavor>]/
        provenance/
            topological_order_dfs.csv.zst
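As an illustration of the layout above, the following sketch builds the expected location of the generated file with `pathlib`. The helper name `topological_order_path` and the example date and flavor values are hypothetical. Only the directory structure comes from the layout shown here.

```python
from pathlib import Path
from typing import Optional


def topological_order_path(
    base_dir: Path, date: str, flavor: Optional[str] = None
) -> Path:
    """Build <base_dir>/<date>[_<flavor>]/provenance/topological_order_dfs.csv.zst."""
    # The flavor suffix is optional, as indicated by the brackets in the layout.
    dataset = date if flavor is None else f"{date}_{flavor}"
    return base_dir / dataset / "provenance" / "topological_order_dfs.csv.zst"


# Hypothetical values, for illustration only.
example_path = topological_order_path(Path("base_dir"), "2024-01-01", "example")
```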
- class swh.graph.luigi.provenance.SortRevrelByDate(*args, **kwargs)
Bases: Task
Creates a file listing the author date and SWHID of every revision and release in a graph export, sorted by date.
- local_export_path = <luigi.parameter.PathParameter object>
- local_graph_path = <luigi.parameter.PathParameter object>
- graph_name = <luigi.parameter.Parameter object>
- provenance_dir = <luigi.parameter.PathParameter object>
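The ordering produced by SortRevrelByDate can be pictured with a small in-memory sketch. This only illustrates the shape of the output (date, then SWHID). The function name `sort_revrel_by_date`, the tie-breaking on SWHID, and the placeholder SWHIDs are assumptions, and the actual task reads from a graph export rather than from a Python list.

```python
from datetime import datetime, timezone


def sort_revrel_by_date(records):
    """Return (author_date, swhid) pairs in date order.

    Assumption: ties on the date are broken by SWHID so the result
    is deterministic; the real task may order ties differently.
    """
    return sorted(records, key=lambda record: (record[0], record[1]))


# Placeholder SWHIDs, not real objects.
example_records = [
    (datetime(2020, 1, 1, tzinfo=timezone.utc), "swh:1:rev:" + "a" * 40),
    (datetime(2019, 6, 1, tzinfo=timezone.utc), "swh:1:rel:" + "b" * 40),
]
example_sorted = sort_revrel_by_date(example_records)
```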
- class swh.graph.luigi.provenance.ListEarliestRevisions(*args, **kwargs)
Bases: Task
Creates a file that lists every directory/content SWHID, along with the author date and SWHID of the earliest revision/release it occurs in.
- local_export_path = <luigi.parameter.PathParameter object>
- local_graph_path = <luigi.parameter.PathParameter object>
- graph_name = <luigi.parameter.Parameter object>
- provenance_dir = <luigi.parameter.PathParameter object>
- property resources
Returns the value of self.max_ram_mb
- requires() → Dict[str, Task]
Returns LocalGraph and SortRevrelByDate instances.
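Because ListEarliestRevisions requires the output of SortRevrelByDate, one way to picture its result is a single pass over revisions/releases in date order, recording the first one that reaches each directory or content. The sketch below is a toy model under that assumption, not the task's actual algorithm. `earliest_revisions` and the `reachable` callable are hypothetical stand-ins for a traversal of the local graph.

```python
def earliest_revisions(sorted_revrels, reachable):
    """Map each directory/content SWHID to its earliest (date, revrel SWHID).

    sorted_revrels: (author_date, revrel_swhid) pairs, already in date order.
    reachable: hypothetical callable returning the directory/content SWHIDs
        reachable from a given revision/release.
    """
    earliest = {}
    for date, revrel in sorted_revrels:
        for swhid in reachable(revrel):
            # Dates are visited in increasing order, so the first
            # revision/release seen for a SWHID is the earliest one.
            earliest.setdefault(swhid, (date, revrel))
    return earliest


# Toy data with shortened, made-up identifiers.
toy_revrels = [(1, "rev1"), (2, "rel2"), (3, "rev3")]
toy_contents = {
    "rev1": ["dir:x", "cnt:y"],
    "rel2": ["cnt:y", "cnt:z"],
    "rev3": ["dir:x"],
}
toy_earliest = earliest_revisions(toy_revrels, lambda revrel: toy_contents[revrel])
```

Processing the input in date order is what lets the sketch keep only the first hit per SWHID, without comparing dates.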