swh.graph.luigi package
Submodules
- swh.graph.luigi.compressed_graph module
- Luigi tasks for compression
  - ObjectTypesParameter, ExtractNodes, ExtractLabels, NodeStats, EdgeStats, LabelStats, Mph, InitialOrder, Bv, BvEf, BfsRoots, Bfs, PermuteAndSimplifyBfs, BfsEf, BfsDcf, Llp, PermuteLlp, Ef, ComposeOrders, Transpose, TransposeEf, Maps, ExtractPersons, PersonsStats, MphPersons, ExtractFullnames, FullnamesEf, NodeProperties, PthashLabels, LabelsOrder, FclLabels, EdgeLabels, EdgeLabelsTranspose, EdgeLabelsEf, EdgeLabelsTransposeEf, Stats, EndToEndCheck
  - CompressGraph: CompressGraph.local_export_path, CompressGraph.local_sensitive_export_path, CompressGraph.graph_name, CompressGraph.local_graph_path, CompressGraph.local_sensitive_graph_path, CompressGraph.previous_graph_path, CompressGraph.batch_size, CompressGraph.rust_executable_dir, CompressGraph.object_types, CompressGraph.check_flavor, CompressGraph.requires(), CompressGraph.output(), CompressGraph.run()
  - UploadGraphToS3, DownloadGraphFromS3, LocalGraph
- swh.graph.luigi.subdataset module
  - SelectTopGithubOrigins, SubdatasetOriginsFromFile, ListSwhidsForSubdataset
  - CreateSubdatasetOnAthena: CreateSubdatasetOnAthena.local_export_path, CreateSubdatasetOnAthena.s3_parent_export_path, CreateSubdatasetOnAthena.s3_export_path, CreateSubdatasetOnAthena.s3_athena_output_location, CreateSubdatasetOnAthena.athena_db_name, CreateSubdatasetOnAthena.athena_parent_db_name, CreateSubdatasetOnAthena.object_types, CreateSubdatasetOnAthena.requires(), CreateSubdatasetOnAthena.output(), CreateSubdatasetOnAthena.run()
- swh.graph.luigi.utils module
Module contents
Luigi tasks
This package contains Luigi tasks. These come in two kinds:
- in swh.graph.luigi.compressed_graph: an alternative to the ‘swh graph compress’ CLI that can be composed with other tasks, such as swh-export’s
- in other submodules: tasks driving the creation of specific datasets that are generated using the compressed graph
The overall directory structure is:
    base_dir/
        <date>[_<flavor>]/
            edges/
                ...
            orc/
                ...
            compressed/
                graph.graph
                graph.mph
                ...
                meta/
                    export.json
                    compression.json
And optionally:
    sensitive_base_dir/
        <date>[_<flavor>]/
            persons_sha256_to_name.csv.zst
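
The layout above can be sketched as a small path helper. This is only an illustration, not part of swh.graph: the function names, the dictionary keys, and the example date and flavor values are hypothetical. It assumes `edges/`, `orc/` and `compressed/` sit directly under `<date>[_<flavor>]/`, and that the flavor suffix is simply omitted when no flavor is given.

```python
from pathlib import Path
from typing import Dict, Optional


def dataset_dir_name(date: str, flavor: Optional[str] = None) -> str:
    """Build the '<date>[_<flavor>]' directory name (hypothetical helper)."""
    return f"{date}_{flavor}" if flavor else date


def layout_paths(
    base_dir: str,
    date: str,
    flavor: Optional[str] = None,
    sensitive_base_dir: Optional[str] = None,
) -> Dict[str, Path]:
    """Return the main paths of the documented directory structure.

    The dictionary keys are illustrative labels, not names used by swh.graph.
    """
    root = Path(base_dir) / dataset_dir_name(date, flavor)
    paths = {
        "edges": root / "edges",
        "orc": root / "orc",
        "compressed": root / "compressed",
        "graph": root / "compressed" / "graph.graph",
        "mph": root / "compressed" / "graph.mph",
    }
    if sensitive_base_dir is not None:
        # The sensitive tree is optional and lives under its own base directory.
        sensitive_root = Path(sensitive_base_dir) / dataset_dir_name(date, flavor)
        paths["persons"] = sensitive_root / "persons_sha256_to_name.csv.zst"
    return paths


if __name__ == "__main__":
    # Example values only; real dates and flavors depend on the export used.
    for label, path in layout_paths("base_dir", "2024-01-01", "example").items():
        print(f"{label}: {path}")
```

Keeping the `<date>[_<flavor>]` logic in one function means the regular and sensitive trees always agree on the directory name.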