swh.storage.proxies.buffer module

swh.storage.proxies.buffer.estimate_revision_size(revision: Revision) → int

Estimate the size of a revision by summing the sizes of its variable-length fields.

swh.storage.proxies.buffer.estimate_release_size(release: Release) → int

Estimate the size of a release by summing the sizes of its variable-length fields.
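As a rough illustration of what "summing the sizes of variable-length fields" can look like, the sketch below uses a hypothetical FakeRevision dataclass rather than the real Revision model. The field names and the 20-byte parent identifier size are assumptions made for the example; this is not the module's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FakeRevision:
    """Hypothetical stand-in for a revision; not the real model class."""

    message: Optional[bytes] = None
    author_fullname: Optional[bytes] = None
    parents: Tuple[bytes, ...] = ()


def estimate_fake_revision_size(revision: FakeRevision) -> int:
    """Sum the lengths of the variable-length fields only."""
    # Assumption: each parent identifier is a 20-byte hash.
    size = 20 * len(revision.parents)
    if revision.message is not None:
        size += len(revision.message)
    if revision.author_fullname is not None:
        size += len(revision.author_fullname)
    return size


rev = FakeRevision(
    message=b"fix typo",
    author_fullname=b"Jane <jane@example.org>",
    parents=(b"\x00" * 20,),
)
estimated = estimate_fake_revision_size(rev)
```

Such an estimate is cheap to compute and is good enough to decide when a batch has grown too large, which is presumably why only the variable-length fields are counted.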

class swh.storage.proxies.buffer.BufferingProxyStorage(storage: Mapping, min_batch_size: Mapping = {})

Bases: object

content_add(contents: Sequence[Content]) → Dict[str, int]

Add contents to the buffer of objects waiting to be written to the storage.

The following policies apply:

  • if the threshold on the number of buffered contents is hit, flush the buffered contents to the storage.

  • otherwise, if the threshold on the total size of the buffered contents is hit, flush the buffered contents to the storage.
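The two policies above can be sketched with a small, self-contained buffer. TinyContentBuffer, max_count and max_bytes are hypothetical names chosen for this illustration; the real proxy's configuration keys and flushing logic may differ.

```python
from typing import Callable, List


class TinyContentBuffer:
    """Hypothetical buffer with a count threshold and a byte-size threshold."""

    def __init__(
        self,
        write: Callable[[List[bytes]], None],
        max_count: int,
        max_bytes: int,
    ) -> None:
        self._write = write
        self._max_count = max_count
        self._max_bytes = max_bytes
        self._items: List[bytes] = []
        self._bytes = 0

    def add(self, contents: List[bytes]) -> None:
        for data in contents:
            self._items.append(data)
            self._bytes += len(data)
            # Flush when either the count or the total size threshold is hit.
            if len(self._items) >= self._max_count or self._bytes >= self._max_bytes:
                self.flush()

    def flush(self) -> None:
        if self._items:
            # Hand a copy to the backend so clearing does not affect it.
            self._write(list(self._items))
        self._items.clear()
        self._bytes = 0


batches: List[List[bytes]] = []
buf = TinyContentBuffer(batches.append, max_count=3, max_bytes=10)
buf.add([b"a", b"b", b"c"])   # the third item hits the count threshold
buf.add([b"0123456789ab"])    # a single 12-byte item hits the size threshold
buf.add([b"x"])               # stays buffered until an explicit flush
buf.flush()
```

The size threshold matters because a handful of very large contents can exhaust memory long before the count threshold is reached.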

skipped_content_add(contents: Sequence[SkippedContent]) → Dict[str, int]
directory_add(directories: Sequence[Directory]) → Dict[str, int]
revision_add(revisions: Sequence[Revision]) → Dict[str, int]
release_add(releases: Sequence[Release]) → Dict[str, int]
object_add(objects: Sequence[BaseModel], *, object_type: Literal['raw_extrinsic_metadata', 'content', 'skipped_content', 'directory', 'revision', 'release', 'snapshot', 'extid'], keys: Iterable[str]) → Dict[str, int]

Add objects to the buffer of objects waiting to be written to the storage. The buffer is flushed to the storage if its threshold is hit.

flush(object_types: Sequence[Literal['raw_extrinsic_metadata', 'content', 'skipped_content', 'directory', 'revision', 'release', 'snapshot', 'extid']] = ('raw_extrinsic_metadata', 'content', 'skipped_content', 'directory', 'revision', 'release', 'snapshot', 'extid')) → Dict[str, int]

clear_buffers(object_types: Sequence[Literal['raw_extrinsic_metadata', 'content', 'skipped_content', 'directory', 'revision', 'release', 'snapshot', 'extid']] = ('raw_extrinsic_metadata', 'content', 'skipped_content', 'directory', 'revision', 'release', 'snapshot', 'extid')) → None

Clear objects from the current buffers.

Warning

Data that has not been flushed to the storage will be lost when this method is called. It should only be called when flush fails and you want to continue processing.
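The recovery pattern described in the warning might look like the following sketch. FlakyBuffer is a hypothetical toy class whose flush always fails; it only mimics the flush and clear_buffers method names and does not reproduce the real proxy's signatures or behaviour.

```python
from typing import Dict, List


class FlakyBuffer:
    """Hypothetical buffer whose backend write always fails."""

    def __init__(self) -> None:
        self.pending: Dict[str, List[str]] = {"content": [], "revision": []}

    def add(self, object_type: str, obj: str) -> None:
        self.pending[object_type].append(obj)

    def flush(self) -> None:
        # Simulate a backend failure on every flush attempt.
        raise RuntimeError("backend unavailable")

    def clear_buffers(self) -> None:
        # Drop everything that is still buffered.
        for items in self.pending.values():
            items.clear()


buf = FlakyBuffer()
buf.add("content", "c1")
buf.add("revision", "r1")

lost = 0
try:
    buf.flush()
except RuntimeError:
    # Unflushed objects are discarded here; count them so the loss is visible.
    lost = sum(len(items) for items in buf.pending.values())
    buf.clear_buffers()
```

Recording how many objects were dropped before clearing makes the data loss explicit, so the caller can decide whether to retry the affected work later.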