swh.objstorage.backends.generator module

class swh.objstorage.backends.generator.Randomizer[source]

Bases: object

read(size)[source]

swh.objstorage.backends.generator.gen_sizes()[source]

Generates numbers following the rough distribution of file sizes in the SWH archive.

swh.objstorage.backends.generator.gen_random_content(total=None, filesize=None)[source]

Generates random (file) content whose sizes roughly follow the SWH archive file size distribution (by default).

Parameters
  • total (int) – the total number of objects to generate. Infinite if unset.

  • filesize (int) – generate objects with fixed size instead of random ones.
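The `total` and `filesize` semantics described above can be illustrated with a minimal stand-in. This is a sketch, not the module's actual implementation: `fake_sizes` and `sketch_random_content` are hypothetical names, and the size choices are made up for illustration rather than taken from the real SWH distribution.

```python
import itertools
import os
import random


def fake_sizes(seed=0):
    """Hypothetical stand-in for gen_sizes(): yields made-up sizes forever."""
    rng = random.Random(seed)
    while True:
        # Illustrative values only; not the real SWH size distribution.
        yield rng.choice([16, 64, 256, 1024, 4096])


def sketch_random_content(total=None, filesize=None):
    """Yield random byte strings, mimicking the documented parameters.

    total: number of objects to yield; endless when None.
    filesize: fixed size for every object; sizes come from fake_sizes() when None.
    """
    sizes = itertools.repeat(filesize) if filesize is not None else fake_sizes()
    if total is not None:
        # Bound the otherwise infinite stream of sizes.
        sizes = itertools.islice(sizes, total)
    for size in sizes:
        yield os.urandom(size)


# Three objects of exactly 10 bytes each.
blobs = list(sketch_random_content(total=3, filesize=10))
```

With `total` unset the generator never terminates, so a caller would typically bound it, for example with `itertools.islice`.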

class swh.objstorage.backends.generator.RandomGeneratorObjStorage(filesize=None, total=None, **kwargs)[source]

Bases: swh.objstorage.objstorage.ObjStorage

A trivial read-only storage that generates blobs for testing purposes.

property content_generator

check_config(*, check_write)[source]

Check whether the object storage is properly configured.

Parameters
  • check_write (bool) – if True, check if writes to the object storage can succeed.

Returns

True if the configuration check worked; raises an exception if it didn’t.

get(obj_id, *args, **kwargs)[source]

Retrieve the content of a given object.

Parameters

obj_id (bytes) – object id.

Returns

the content of the requested object as bytes.

Raises

ObjNotFoundError – if the requested object is missing.

add(content, obj_id=None, check_presence=True, *args, **kwargs)[source]

Add a new object to the object storage.

Parameters
  • content (bytes) – object’s raw content to add in storage.

  • obj_id (bytes) – checksum of [bytes] using [ID_HASH_ALGO] algorithm. When given, obj_id will be trusted to match the bytes. If missing, obj_id will be computed on the fly.

  • check_presence (bool) – indicate if the presence of the content should be verified before adding the file.

Returns

the id (bytes) of the object in the storage.
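When `obj_id` is omitted, the id is computed from the content. A minimal sketch of that computation follows, assuming the hash algorithm behind `ID_HASH_ALGO` is SHA-1; that value is an assumption here and should be checked against the installed version. `compute_obj_id_sketch` and `ASSUMED_ID_HASH_ALGO` are hypothetical names, not part of the module.

```python
import hashlib

# Assumption: stands in for the library's ID_HASH_ALGO constant.
ASSUMED_ID_HASH_ALGO = "sha1"


def compute_obj_id_sketch(content: bytes, algo: str = ASSUMED_ID_HASH_ALGO) -> bytes:
    """Return the raw digest of content, usable as an object id."""
    return hashlib.new(algo, content).digest()


obj_id = compute_obj_id_sketch(b"hello")
print(obj_id.hex())
```

The same helper can be used to verify a caller-supplied `obj_id` by recomputing the digest and comparing the two byte strings.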

check(obj_id, *args, **kwargs)[source]

Perform an integrity check for a given object.

Verify that the file object is in place and that the content matches the object id.

Parameters

obj_id (bytes) – object identifier.

Raises

delete(obj_id, *args, **kwargs)[source]

Delete an object.

Parameters

obj_id (bytes) – object identifier.

Raises

ObjNotFoundError – if the requested object is missing.

get_stream(obj_id, chunk_size=2097152)[source]

Retrieve the content of a given object as a chunked iterator.

Parameters

obj_id (bytes) – object id.

Returns

an iterator over the content of the requested object, yielded as chunks of bytes.

Raises

ObjNotFoundError – if the requested object is missing.
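The chunked iteration described above can be sketched independently of any storage backend. `iter_chunks` is a hypothetical helper used only for illustration; its default mirrors the documented `chunk_size` of 2097152 bytes (2 MiB).

```python
from typing import Iterator


def iter_chunks(content: bytes, chunk_size: int = 2 * 1024 * 1024) -> Iterator[bytes]:
    """Yield successive slices of content, each at most chunk_size bytes long."""
    for start in range(0, len(content), chunk_size):
        yield content[start:start + chunk_size]


# A small chunk size makes the splitting visible.
chunks = list(iter_chunks(b"abcdefghij", chunk_size=4))
```

Joining the chunks with `b"".join(chunks)` gives back the original content, which is how a consumer of a chunked iterator would reassemble an object.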

list_content(last_obj_id=None, limit=10000)[source]

Generates known object ids.

Parameters
  • last_obj_id (bytes) – object id from which to iterate (excluded).

  • limit (int) – max number of object ids to generate.

Generates:

obj_id (bytes): object ids.
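The `last_obj_id` and `limit` parameters support paging through ids. The following sketch shows one way such paging could behave, assuming ids are returned in sorted order (an assumption; the source does not state the ordering). `list_ids_sketch` is a hypothetical stand-in operating on an in-memory collection.

```python
import bisect


def list_ids_sketch(known_ids, last_obj_id=None, limit=10000):
    """Yield up to limit ids that sort strictly after last_obj_id."""
    ordered = sorted(known_ids)
    # bisect_right skips last_obj_id itself, so it is excluded from the page.
    start = 0 if last_obj_id is None else bisect.bisect_right(ordered, last_obj_id)
    yield from ordered[start:start + limit]


ids = [b"\x03", b"\x01", b"\x05", b"\x02", b"\x04"]
page1 = list(list_ids_sketch(ids, limit=2))
# Resume from the last id of the previous page.
page2 = list(list_ids_sketch(ids, last_obj_id=page1[-1], limit=2))
```

Passing the last id of each page as `last_obj_id` for the next call walks the full set without repeating entries.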