swh.dataset.exporter module
- class swh.dataset.exporter.Exporter(config: Dict[str, Any], export_path, *args: Any, **kwargs: Any)
Bases: object
Base class for all the exporters.
Each export can have multiple exporters, so the journal is read only once and the objects read from it are exported in several formats without being re-read for each format.
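The read-once, fan-out idea can be sketched as follows. This is only an illustration, not the actual swh.dataset export loop: `export_once`, `CountingExporter` and `CollectingExporter` are hypothetical names, the journal is modeled as a plain iterator of `(object_type, obj)` pairs, and the `process_object(object_type, obj)` signature is an assumption.

```python
from contextlib import ExitStack
from typing import Any, Dict, Iterable, List, Tuple

JournalItem = Tuple[str, Dict[str, Any]]


class CountingExporter:
    """Hypothetical exporter that counts the objects seen per type."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}

    def __enter__(self) -> "CountingExporter":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        return None

    def process_object(self, object_type: str, obj: Dict[str, Any]) -> None:
        self.counts[object_type] = self.counts.get(object_type, 0) + 1


class CollectingExporter:
    """Hypothetical exporter that keeps every object in memory."""

    def __init__(self) -> None:
        self.seen: List[JournalItem] = []

    def __enter__(self) -> "CollectingExporter":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        return None

    def process_object(self, object_type: str, obj: Dict[str, Any]) -> None:
        self.seen.append((object_type, obj))


def export_once(journal: Iterable[JournalItem], exporters: List[Any]) -> None:
    """Read the journal a single time and hand each object to every exporter."""
    with ExitStack() as stack:
        # Enter every exporter so setup and teardown run around the whole pass.
        active = [stack.enter_context(exporter) for exporter in exporters]
        for object_type, obj in journal:
            for exporter in active:
                exporter.process_object(object_type, obj)


# A one-shot iterator: it cannot be consumed twice, so both exporters
# necessarily share the same single pass.
journal = iter(
    [
        ("origin", {"url": "https://example.org/a.git"}),
        ("origin", {"url": "https://example.org/b.git"}),
        ("revision", {"id": "abc123"}),
    ]
)
counting = CountingExporter()
collecting = CollectingExporter()
export_once(journal, [counting, collecting])
```

Because the journal here is a one-shot iterator, neither exporter could have produced its result by re-reading the input.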
Subclass this class to implement the behavior for a specific export format. You must override process_object() so that it writes to the appropriate export files.
You can also put setup and teardown logic in __enter__ and __exit__; they will be called automatically.
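A minimal sketch of such a subclass is shown below. To keep the example runnable on its own, it defines a stand-in `Exporter` base class that only mirrors the constructor signature documented above; it is not the real swh.dataset class. The `JSONLinesExporter` name, the `objects.jsonl` file name and the `process_object(object_type, obj)` signature are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


class Exporter:
    """Stand-in for swh.dataset.exporter.Exporter (assumed shape, not the real class)."""

    def __init__(self, config: Dict[str, Any], export_path, *args: Any, **kwargs: Any) -> None:
        self.config = config
        self.export_path = Path(export_path)

    def __enter__(self) -> "Exporter":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        return None

    def process_object(self, object_type: str, obj: Dict[str, Any]) -> None:
        raise NotImplementedError


class JSONLinesExporter(Exporter):
    """Hypothetical exporter that writes one JSON document per line."""

    def __enter__(self) -> "JSONLinesExporter":
        # Setup: create the export directory and open the output file.
        self.export_path.mkdir(parents=True, exist_ok=True)
        self._file = open(self.export_path / "objects.jsonl", "w", encoding="utf-8")
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        # Teardown: close the file even if processing raised an exception.
        self._file.close()
        return None

    def process_object(self, object_type: str, obj: Dict[str, Any]) -> None:
        record = {"type": object_type, "object": obj}
        self._file.write(json.dumps(record, sort_keys=True) + "\n")


def run_demo() -> List[str]:
    """Export two sample objects to a temporary directory and return the lines written."""
    with tempfile.TemporaryDirectory() as tmp:
        with JSONLinesExporter({}, tmp) as exporter:
            exporter.process_object("origin", {"url": "https://example.org/repo.git"})
            exporter.process_object("revision", {"id": "abc123"})
        return (Path(tmp) / "objects.jsonl").read_text(encoding="utf-8").splitlines()
```

Opening the file in `__enter__` rather than in `__init__` means a constructed but unused exporter holds no resources, and `__exit__` closes the file even when `process_object()` fails partway through.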