- class swh.dataset.exporter.Exporter(config: Dict[str, Any], export_path, *args: Any, **kwargs: Any)
Base class for all the exporters.
Each export can have multiple exporters, so we can read the journal a single time, then export the objects we read in different formats without having to re-read them every time.
Subclass this class to implement the behavior for a specific export format. You must override process_object() so that it writes to the appropriate export files.
You can also put setup and teardown logic in __enter__ and __exit__; they will be called automatically.
- process_object(object_type: str, obj: Dict[str, Any]) → None
Process a SWH object to export.
Override this in your custom exporter.
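The subclassing pattern described above can be sketched roughly as follows. So that the example runs without swh.dataset installed, it defines a minimal stand-in Exporter that only mirrors the documented constructor and method signatures; the stand-in's internals are an assumption, not the library's implementation. JSONLinesExporter, its one-file-per-object-type layout, and run_demo are hypothetical names introduced for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


class Exporter:
    """Stand-in for swh.dataset.exporter.Exporter (assumption, not the real class)."""

    def __init__(self, config: Dict[str, Any], export_path, *args: Any, **kwargs: Any) -> None:
        self.config = config
        self.export_path = Path(export_path)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return None

    def process_object(self, object_type: str, obj: Dict[str, Any]) -> None:
        raise NotImplementedError


class JSONLinesExporter(Exporter):
    """Hypothetical exporter: writes one JSON-lines file per object type."""

    def __enter__(self):
        # Setup: create the export directory and prepare the file-handle cache.
        self.export_path.mkdir(parents=True, exist_ok=True)
        self._files = {}
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Teardown: close every file opened while processing objects.
        for handle in self._files.values():
            handle.close()
        return None

    def process_object(self, object_type: str, obj: Dict[str, Any]) -> None:
        # Lazily open <export_path>/<object_type>.jsonl on first use.
        if object_type not in self._files:
            path = self.export_path / f"{object_type}.jsonl"
            self._files[object_type] = open(path, "w")
        self._files[object_type].write(json.dumps(obj, sort_keys=True) + "\n")


def run_demo() -> List[str]:
    """Export two origins to a temporary directory and return the written lines."""
    with tempfile.TemporaryDirectory() as tmp:
        with JSONLinesExporter({}, tmp) as exporter:
            exporter.process_object("origin", {"url": "https://example.org/repo.git"})
            exporter.process_object("origin", {"url": "https://example.org/other.git"})
        return (Path(tmp) / "origin.jsonl").read_text().splitlines()
```

In real use, the subclass would inherit from swh.dataset.exporter.Exporter instead of the stand-in, and the export machinery, not a hand-written loop, would presumably be what calls process_object().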
- class swh.dataset.exporter.ExporterDispatch(config: Dict[str, Any], export_path, *args: Any, **kwargs: Any)
Like Exporter, but dispatches each object type to a different function (e.g., you can override process_origin(self, object) to process origins).
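A minimal sketch of the dispatch idea, again using a stand-in class so the example is self-contained. It assumes the method name is built as "process_" plus the object type string, and that object types without a matching method are silently skipped; both are assumptions of this sketch rather than confirmed library behavior. OriginURLCollector is a hypothetical subclass.

```python
from typing import Any, Dict, List


class ExporterDispatch:
    """Stand-in (assumption): routes each object to a process_<object_type> method."""

    def __init__(self, config: Dict[str, Any], export_path, *args: Any, **kwargs: Any) -> None:
        self.config = config
        self.export_path = export_path

    def process_object(self, object_type: str, obj: Dict[str, Any]) -> None:
        # Look up a handler named after the object type, e.g. process_origin.
        handler = getattr(self, "process_" + object_type, None)
        if handler is not None:
            handler(obj)


class OriginURLCollector(ExporterDispatch):
    """Hypothetical exporter that only cares about origins."""

    def __init__(self, config: Dict[str, Any], export_path, *args: Any, **kwargs: Any) -> None:
        super().__init__(config, export_path, *args, **kwargs)
        self.urls: List[str] = []

    def process_origin(self, obj: Dict[str, Any]) -> None:
        self.urls.append(obj["url"])


collector = OriginURLCollector({}, "unused-export-path")
collector.process_object("origin", {"url": "https://example.org/repo.git"})
# No process_revision method is defined, so this object is skipped in the sketch.
collector.process_object("revision", {"id": "abc"})
```

The benefit of this layout is that a subclass only defines handlers for the object types it exports, instead of branching on object_type inside a single process_object() method.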