swh.dataset.test.test_journal_processor module#

swh.dataset.test.test_journal_processor.journal_client_config(kafka_server: str, kafka_prefix: str, kafka_consumer_group: str)[source]#
swh.dataset.test.test_journal_processor.journal_writer(kafka_server: str, kafka_prefix: str)[source]#
swh.dataset.test.test_journal_processor.disable_gc(f)[source]#

Decorator for test functions; prevents segfaults in confluent-kafka. See confluentinc/confluent-kafka-python#1761

class swh.dataset.test.test_journal_processor.ListExporter(objects: ListProxy, *args, **kwargs)[source]#

Bases: Exporter

process_object(object_type: ModelObjectType, obj: Dict[str, Any]) None[source]#

Process a SWH object to export.

Override this with your custom exporter.

swh.dataset.test.test_journal_processor.assert_exported_objects(exported_objects: Sequence[Tuple[str, Dict]], expected_objects: Sequence[BaseModel]) None[source]#
swh.dataset.test.test_journal_processor.test_parallel_journal_processor(journal_client_config, journal_writer, tmp_path) None[source]#
swh.dataset.test.test_journal_processor.test_parallel_journal_processor_origin(journal_client_config, journal_writer, tmp_path) None[source]#
swh.dataset.test.test_journal_processor.test_parallel_journal_processor_origin_visit_status(journal_client_config, journal_writer, tmp_path) None[source]#
swh.dataset.test.test_journal_processor.test_parallel_journal_processor_offsets(journal_client_config, journal_writer, tmp_path) None[source]#

Checks the exporter stops at the offsets computed at the beginning of the export

swh.dataset.test.test_journal_processor.test_parallel_journal_processor_masked(journal_client_config, journal_writer, tmp_path) None[source]#
swh.dataset.test.test_journal_processor.test_parallel_journal_processor_masked_origin(journal_client_config, journal_writer, tmp_path) None[source]#
swh.dataset.test.test_journal_processor.test_parallel_journal_processor_masked_origin_visit_statuses(journal_client_config, journal_writer, tmp_path) None[source]#