swh.datasets.luigi.impact module
Luigi tasks to measure institutional impact
This module contains Luigi tasks that compute the impact of an institution across all origins.
- class swh.datasets.luigi.impact.ComputeRawImpact(*args, **kwargs)
Bases: Task
Creates a file that lists all origins that contain revrels from a given set of persons, as well as the number of revrels and the first/latest timestamp for each origin.
- local_graph_path = <luigi.parameter.PathParameter object>
- graph_name = <luigi.parameter.Parameter object>
- persons_path = <luigi.parameter.PathParameter object>
- raw_impact_path = <luigi.parameter.PathParameter object>
- output_emails = <luigi.parameter.BoolParameter object>
- include_ranges = <luigi.parameter.Parameter object>
- exclude_ranges = <luigi.parameter.Parameter object>
- requires() -> Dict[str, Task]
Returns instances of swh.graph.luigi.compressed_graph.LocalGraph and swh.graph.libs.luigi.topology.ComputeGenerations.
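The per-origin summary described for ComputeRawImpact (number of revrels plus first/latest timestamp, restricted to a given set of persons) can be sketched in plain Python. This is only an illustration of the aggregation, not the task's actual implementation: the `OriginImpact` class, the `summarize_impact` function, and the `(origin, person, timestamp)` input shape are all hypothetical, and the real task works on the compressed graph rather than on in-memory tuples.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass
class OriginImpact:
    """Hypothetical per-origin summary; field names are illustrative only."""

    revrel_count: int
    first_timestamp: int
    latest_timestamp: int


def summarize_impact(
    revrels: Iterable[Tuple[str, str, int]],
    persons: Set[str],
) -> Dict[str, OriginImpact]:
    """Aggregate (origin, person, timestamp) triples per origin.

    Only revrels whose person belongs to ``persons`` are counted.
    """
    summary: Dict[str, OriginImpact] = {}
    for origin, person, timestamp in revrels:
        if person not in persons:
            continue  # revrel by someone outside the set of interest
        entry = summary.get(origin)
        if entry is None:
            # First matching revrel seen for this origin
            summary[origin] = OriginImpact(1, timestamp, timestamp)
        else:
            entry.revrel_count += 1
            entry.first_timestamp = min(entry.first_timestamp, timestamp)
            entry.latest_timestamp = max(entry.latest_timestamp, timestamp)
    return summary
```

Origins with no revrel from the given persons never appear in the result, which matches the description of listing only origins that contain such revrels.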
- class swh.datasets.luigi.impact.ComputeIndexedImpact(*args, **kwargs)
Bases: Task
Removes forks from ComputeRawImpact’s output, unless they contain more revrels (or older/newer ones) than the upstream origin.
- indexer_storage_url = <luigi.parameter.Parameter object>
- swh_scheduler_url = <luigi.parameter.Parameter object>
- FORK_FILTERS = ['all', 'none', 'without-upstream-contribution', 'with-original-content']
- local_graph_path = <luigi.parameter.PathParameter object>
- graph_name = <luigi.parameter.Parameter object>
- persons_path = <luigi.parameter.PathParameter object>
- raw_impact_path = <luigi.parameter.PathParameter object>
- output_emails = <luigi.parameter.BoolParameter object>
- indexed_impact_path = <luigi.parameter.PathParameter object>
- fork_filter = <luigi.parameter.ChoiceParameter object>
- requires() -> Dict[str, Task]
Returns instances of swh.graph.luigi.compressed_graph.LocalGraph and swh.graph.libs.luigi.topology.ComputeGenerations.
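The fork rule stated for ComputeIndexedImpact (a fork is dropped unless it contains more revrels, or older/newer ones, than its upstream origin) can be read as a simple predicate. The sketch below is one possible reading under stated assumptions: the `Impact` tuple and the `keep_fork` function are hypothetical names, and how each `fork_filter` choice maps onto this rule is not described on this page, so it is not modelled here.

```python
from typing import NamedTuple


class Impact(NamedTuple):
    """Hypothetical per-origin figures, mirroring the raw impact description."""

    revrel_count: int
    first_timestamp: int
    latest_timestamp: int


def keep_fork(fork: Impact, upstream: Impact) -> bool:
    """Return True if the fork adds something the upstream origin lacks.

    Assumed reading of the rule: keep the fork when it has more revrels,
    an older first revrel, or a newer latest revrel than its upstream.
    """
    return (
        fork.revrel_count > upstream.revrel_count
        or fork.first_timestamp < upstream.first_timestamp
        or fork.latest_timestamp > upstream.latest_timestamp
    )
```

Under this reading, a fork whose figures are identical to its upstream is removed, while a fork with even one extra or more recent revrel is kept.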