swh.storage.backfill module
Storage backfiller.
The backfiller's goal is to send part or all of the objects in a storage back to the journal topics.
The current implementation consists of the JournalBackfiller class.
It simply reads the objects from the storage and sends every object identifier back to the journal.
-
swh.storage.backfill.directory_converter(db: swh.core.db.BaseDb, directory_d: Dict[str, Any]) → swh.model.model.Directory
Convert directory from the flat representation to swh model compatible objects.
-
swh.storage.backfill.raw_extrinsic_metadata_converter(db: swh.core.db.BaseDb, metadata: Dict[str, Any]) → swh.model.model.RawExtrinsicMetadata
Convert raw extrinsic metadata from the flat representation to swh model compatible objects.
-
swh.storage.backfill.revision_converter(db: swh.core.db.BaseDb, revision_d: Dict[str, Any]) → swh.model.model.Revision
Convert revision from the flat representation to swh model compatible objects.
-
swh.storage.backfill.release_converter(db: swh.core.db.BaseDb, release_d: Dict[str, Any]) → swh.model.model.Release
Convert release from the flat representation to swh model compatible objects.
-
swh.storage.backfill.snapshot_converter(db: swh.core.db.BaseDb, snapshot_d: Dict[str, Any]) → swh.model.model.Snapshot
Convert snapshot from the flat representation to swh model compatible objects.
-
swh.storage.backfill.object_to_offset(object_id, numbits)
Compute the index of the range containing object_id, when dividing the space into 2^numbits ranges.
- Parameters
object_id (str) – The hex representation of object_id
numbits (int) – Number of bits in which we divide input space
- Returns
The index of the range containing object id
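To make the range arithmetic concrete, here is a minimal sketch of one way such an index could be computed. It is not the module's actual implementation: the name `sketch_object_to_offset` is hypothetical, and the sketch assumes that `numbits` is at least 1 and that the ranges are equal-sized, so the index is simply the leading `numbits` bits of the identifier.

```python
def sketch_object_to_offset(object_id: str, numbits: int) -> int:
    """Return the index of the range holding object_id (illustrative only)."""
    if numbits < 1:
        raise ValueError("this sketch expects numbits >= 1")
    # Each hex digit carries 4 bits; keep just enough digits to cover numbits.
    ndigits = -(-numbits // 4)  # ceiling division
    # Pad short identifiers with zeros so the prefix always has ndigits digits.
    prefix = object_id[:ndigits].ljust(ndigits, "0")
    # Drop the low-order bits that fall outside the leading numbits bits.
    return int(prefix, 16) >> (ndigits * 4 - numbits)
```

With `numbits=8`, an identifier made only of `f` digits falls in range 255 and one made only of `0` digits falls in range 0; with `numbits=1`, any identifier starting with `8` or higher falls in range 1.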
-
swh.storage.backfill.byte_ranges(numbits, start_object=None, end_object=None)
Generate start/end pairs of bytes spanning numbits bits and constrained by optional start_object and end_object.
- Parameters
numbits (int) – Number of bits in which we divide input space
start_object (str) – Hex object id contained in the first range returned
end_object (str) – Hex object id contained in the last range returned
- Yields
2^numbits pairs of bytes
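The splitting described above can be sketched as follows. This is an illustration under stated assumptions rather than the module's code: the name `sketch_byte_ranges` is hypothetical, the optional `start_object` and `end_object` constraints are left out, and using `None` for the open lower and upper bounds is a choice made for this sketch.

```python
from typing import Iterator, Optional, Tuple

Bound = Optional[bytes]


def sketch_byte_ranges(numbits: int) -> Iterator[Tuple[Bound, Bound]]:
    """Yield 2**numbits (start, end) pairs covering the whole id space."""
    nbytes = -(-numbits // 8)      # bytes needed to hold numbits bits
    shift = nbytes * 8 - numbits   # unused low-order bits in those bytes
    total = 1 << numbits

    def bound(index: int) -> bytes:
        # Left-align the range index so it acts as a byte prefix.
        return (index << shift).to_bytes(nbytes, "big")

    for index in range(total):
        # Assumption of this sketch: None marks an unbounded side.
        start = None if index == 0 else bound(index)
        end = None if index == total - 1 else bound(index + 1)
        yield start, end
```

For `numbits=2` the sketch yields four pairs whose boundaries are the single bytes `0x40`, `0x80` and `0xc0`, with `None` at both extremes.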
-
swh.storage.backfill.fetch(db, obj_type, start, end)
Fetch all obj_type’s identifiers from db.
This opens one connection, streams the objects and, when done, closes the connection.
- Parameters
db (BaseDb) – Db connection object
obj_type (str) – Object type
start (Union[bytes|Tuple]) – Range start identifier
end (Union[bytes|Tuple]) – Range end identifier
- Raises
ValueError – if obj_type is not supported
- Yields
Objects in the given range
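The open, stream, close pattern described for this function can be illustrated with a generator. The sketch below is not the module's implementation: it uses the standard library's `sqlite3` as a stand-in for the real database layer, and `sketch_fetch`, `demo_connection`, `SKETCH_TABLES`, the table layout and the half-open range `start <= id < end` are all assumptions made for the example.

```python
import sqlite3
from typing import Callable, Iterator, Tuple

# Hypothetical whitelist; the object types the real module accepts may differ.
SKETCH_TABLES = {"revision", "release"}


def sketch_fetch(
    connect: Callable[[], sqlite3.Connection],
    obj_type: str,
    start: bytes,
    end: bytes,
) -> Iterator[Tuple[bytes]]:
    """Stream the ids of obj_type with start <= id < end (illustrative)."""
    if obj_type not in SKETCH_TABLES:
        raise ValueError(f"unsupported object type: {obj_type!r}")
    conn = connect()  # one connection for the whole stream
    try:
        # obj_type was checked against the whitelist before interpolation.
        query = f"SELECT id FROM {obj_type} WHERE id >= ? AND id < ? ORDER BY id"
        for row in conn.execute(query, (start, end)):
            yield row
    finally:
        conn.close()  # runs when the stream is exhausted or abandoned


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway in-memory database holding three revision ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE revision (id BLOB PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO revision VALUES (?)",
        [(bytes([b]),) for b in (0x10, 0x50, 0x90)],
    )
    return conn
```

Calling `list(sketch_fetch(demo_connection, "revision", b"\x40", b"\xa0"))` returns the two ids inside the range. Because the function is a generator, the `ValueError` for an unsupported type is raised when iteration starts, not when the function is called.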
-
class swh.storage.backfill.JournalBackfiller(config=None)
Bases: object
Class in charge of reading the storage’s objects and sending them back to the journal’s topics.
This is designed to be run periodically.