swh.objstorage.multiplexer.multiplexer_objstorage module¶
-
class
swh.objstorage.multiplexer.multiplexer_objstorage.
ObjStorageThread
(storage)[source]¶ Bases:
threading.Thread
-
run
()[source]¶ Method representing the thread’s activity.
You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with positional and keyword arguments taken from the args and kwargs arguments, respectively.
-
queue_command
(command, *args, mailbox=None, **kwargs)[source]¶ Enqueue a new command to be processed by the thread.
- Parameters
command (str) – one of the method names for the underlying storage.
mailbox (queue.Queue) – explicit mailbox if the calling thread wants to override it.
args – arguments for the command.
kwargs – arguments for the command.
- Returns: queue.Queue
The mailbox you can read the response from
-
queue_result
(mailbox, result_type, result)[source]¶ Enqueue a new result in the mailbox
This also provides a reference to the storage, which can be useful when an exceptional condition arises.
- Parameters
mailbox (queue.Queue) – the mailbox to which we need to enqueue the result
result_type (str) – one of ‘result’, ‘exception’
result – the result to pass back to the calling thread
-
static
get_result_from_mailbox
(mailbox, *args, **kwargs)[source]¶ Unpack the result from the mailbox.
- Parameters
mailbox (queue.Queue) – A mailbox to unpack a result from
args – arguments to
mailbox.get()
kwargs – arguments to
mailbox.get()
- Returns
the next result unpacked from the queue
- Raises
either the exception we got back from the underlying storage, or queue.Empty if mailbox.get() raises that.
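The command/mailbox round trip described above can be sketched with the standard library alone. This is a minimal illustration, not the swh implementation: WorkerThread, DictStorage, the commands queue and the None stop sentinel are hypothetical names, and the mailbox entries here are plain (result_type, value) pairs without the storage reference that queue_result is documented to provide.

```python
import queue
import threading


class DictStorage:
    """Toy in-memory storage used only for this sketch (hypothetical)."""

    def __init__(self):
        self.data = {}

    def add(self, content, obj_id):
        self.data[obj_id] = content
        return obj_id

    def get(self, obj_id):
        return self.data[obj_id]  # raises KeyError when missing


class WorkerThread(threading.Thread):
    """Hypothetical stand-in for ObjStorageThread."""

    def __init__(self, storage):
        super().__init__(daemon=True)
        self.storage = storage
        self.commands = queue.Queue()

    def run(self):
        while True:
            item = self.commands.get()
            if item is None:  # assumed stop sentinel, not part of the documented API
                break
            mailbox, command, args, kwargs = item
            try:
                value = getattr(self.storage, command)(*args, **kwargs)
            except Exception as exc:
                mailbox.put(("exception", exc))
            else:
                mailbox.put(("result", value))

    def queue_command(self, command, *args, mailbox=None, **kwargs):
        # Create a fresh mailbox unless the caller supplies one.
        if mailbox is None:
            mailbox = queue.Queue()
        self.commands.put((mailbox, command, args, kwargs))
        return mailbox

    @staticmethod
    def get_result_from_mailbox(mailbox, *args, **kwargs):
        # Extra arguments are forwarded to mailbox.get(), e.g. timeout=5.
        result_type, value = mailbox.get(*args, **kwargs)
        if result_type == "exception":
            raise value
        return value


worker = WorkerThread(DictStorage())
worker.start()

box = worker.queue_command("add", b"hello", obj_id=b"id-1")
added_id = WorkerThread.get_result_from_mailbox(box, timeout=5)

box = worker.queue_command("get", b"id-1")
content = WorkerThread.get_result_from_mailbox(box, timeout=5)

# An exception raised inside the worker is re-raised in the calling thread.
box = worker.queue_command("get", b"missing")
try:
    WorkerThread.get_result_from_mailbox(box, timeout=5)
    missing_error = None
except KeyError as exc:
    missing_error = exc

worker.commands.put(None)
worker.join(timeout=5)
```

Passing timeout to get_result_from_mailbox is what makes queue.Empty a possible outcome: if the worker has not answered in time, mailbox.get() raises it.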
-
-
class
swh.objstorage.multiplexer.multiplexer_objstorage.
MultiplexerObjStorage
(storages, **kwargs)[source]¶ Bases:
swh.objstorage.objstorage.ObjStorage
Implementation of ObjStorage that distributes between multiple storages.
The multiplexer object storage dispatches each input to multiple storages, each of which decides on its own whether to accept it (see the .filter package).
Because ids can differ between storages, no pre-computed ids should be submitted. There is also no guarantee that the returned ids can be used directly with the storages that the multiplexer manages.
Use case examples follow.
Example 1:
    storage_v1 = filter.read_only(PathSlicingObjStorage('/dir1', '0:2/2:4/4:6'))
    storage_v2 = PathSlicingObjStorage('/dir2', '0:1/0:5')
    storage = MultiplexerObjStorage([storage_v1, storage_v2])
When using ‘storage’, new contents are only added to the v2 storage, while contents remain retrievable from both.
Example 2:
    storage_v1 = filter.id_regex(
        PathSlicingObjStorage('/dir1', '0:2/2:4/4:6'),
        r'[^012].*'
    )
    storage_v2 = filter.id_regex(
        PathSlicingObjStorage('/dir2', '0:1/0:5'),
        r'[012].*'
    )
    storage = MultiplexerObjStorage([storage_v1, storage_v2])
When using this storage, contents whose sha1 starts with 0, 1 or 2 are redirected (read AND write) to storage_v2, while all others are redirected to storage_v1. A content whose sha1 starts with 0, 1 or 2 that is already present in storage_v1 is ignored.
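The id-based routing in Example 2 can be illustrated with a self-contained sketch. Nothing here uses the swh API: MemoryStorage, IdRegexFilter and route are hypothetical names, and hex sha1 ids are assumed purely for the illustration.

```python
import hashlib
import re


class MemoryStorage:
    """Toy in-memory storage (hypothetical, for illustration only)."""

    def __init__(self):
        self.objects = {}


class IdRegexFilter:
    """Hypothetical filter: accepts only ids whose hex form matches a regex."""

    def __init__(self, storage, pattern):
        self.storage = storage
        self.regex = re.compile(pattern)

    def accepts(self, hex_id):
        # re.match anchors at the start of the id, so the first character decides.
        return self.regex.match(hex_id) is not None


def route(filters, content):
    """Write content to every filtered storage that accepts its id."""
    hex_id = hashlib.sha1(content).hexdigest()
    targets = [f for f in filters if f.accepts(hex_id)]
    for f in targets:
        f.storage.objects[hex_id] = content
    return hex_id, targets


v1 = IdRegexFilter(MemoryStorage(), r"[^012].*")
v2 = IdRegexFilter(MemoryStorage(), r"[012].*")

routed = [route([v1, v2], str(i).encode()) for i in range(50)]
```

Because the two patterns are complementary, each content lands in exactly one storage. Overlapping patterns would instead write the same content to several storages.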
-
check_config
(*, check_write)[source]¶ Check whether the object storage is properly configured.
- Parameters
check_write (bool) – if True, check if writes to the object storage can succeed.
- Returns
True if the configuration check worked, an exception if it didn’t.
-
add
(content, obj_id=None, check_presence=True)[source]¶ Add a new object to the object storage.
If the adding step works in all the storages that accept this content, this is a success. Otherwise, the whole adding step is an error, even if it succeeds in some of the storages.
- Parameters
content – content of the object to be added to the storage.
obj_id – checksum of the content bytes using the ID_HASH_ALGO algorithm. When given, obj_id will be trusted to match the bytes. If missing, obj_id will be computed on the fly.
check_presence – indicate if the presence of the content should be verified before adding the file.
- Returns
an id of the object in the storage. As the write-storages are always readable as well, any returned id is valid for retrieving the content.
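The success rule for add (an error as soon as one accepting storage fails) might look like the following sketch. ToyStorage, add_everywhere, the RuntimeError and the None return when no storage is given are all assumptions made for illustration; the real multiplexer may report failures differently.

```python
class ToyStorage:
    """Toy storage that can be told to fail on write (hypothetical)."""

    def __init__(self, fail=False):
        self.objects = {}
        self.fail = fail

    def add(self, content, obj_id):
        if self.fail:
            raise IOError("simulated write failure")
        self.objects[obj_id] = content
        return obj_id


def add_everywhere(storages, content, obj_id):
    """Add to every storage; report an error if any single add failed."""
    ids, failures = [], []
    for storage in storages:
        try:
            ids.append(storage.add(content, obj_id))
        except Exception as exc:
            failures.append(exc)
    if failures:
        # Writes that already succeeded are not rolled back in this sketch.
        raise RuntimeError(
            f"{len(failures)} of {len(storages)} storages failed"
        ) from failures[0]
    return ids[0] if ids else None  # assumption: None when nothing was written


good, bad = ToyStorage(), ToyStorage(fail=True)

ok_id = add_everywhere([good], b"data", b"id-1")

try:
    add_everywhere([good, bad], b"more", b"id-2")
    partial_failure = False
except RuntimeError:
    partial_failure = True
```

In the sketch, the content written to the healthy storage stays in place even though the overall call is reported as an error.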
-
restore
(content, obj_id=None)[source]¶ Restore content that has been corrupted.
This function is identical to add but does not check if the object id is already in the file system. The default implementation provided by the current class is suitable for most cases.
- Parameters
content (bytes) – object’s raw content to add in storage
obj_id (bytes) – checksum of bytes as computed by ID_HASH_ALGO. When given, obj_id will be trusted to match bytes. If missing, obj_id will be computed on the fly.
-
get
(obj_id)[source]¶ Retrieve the content of a given object.
- Parameters
obj_id (bytes) – object id.
- Returns
the content of the requested object as bytes.
- Raises
ObjNotFoundError – if the requested object is missing.
-
check
(obj_id)[source]¶ Perform an integrity check for a given object.
Verify that the file object is in place and that the content matches the object id.
- Parameters
obj_id (bytes) – object identifier.
- Raises
ObjNotFoundError – if the requested object is missing.
Error – if the requested object is corrupted.
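An integrity check of this kind amounts to recomputing the checksum and comparing it with the id. The sketch below assumes the id is the raw sha1 digest of the content; ObjNotFoundError and CorruptedError are local stand-ins defined here, not the swh exception classes, and the dict stands in for a storage.

```python
import hashlib


class ObjNotFoundError(Exception):
    """Local stand-in for the documented ObjNotFoundError."""


class CorruptedError(Exception):
    """Local stand-in for the documented Error."""


def check(objects, obj_id):
    """Raise if obj_id is missing or its content no longer matches the id."""
    if obj_id not in objects:
        raise ObjNotFoundError(obj_id)
    # Assumption: ids are sha1 digests of the raw content.
    actual = hashlib.sha1(objects[obj_id]).digest()
    if actual != obj_id:
        raise CorruptedError(f"expected {obj_id.hex()}, got {actual.hex()}")


good_id = hashlib.sha1(b"intact").digest()
bad_id = hashlib.sha1(b"original").digest()
objects = {good_id: b"intact", bad_id: b"tampered"}

check(objects, good_id)  # passes silently

outcomes = {}
for name, obj_id in [("corrupted", bad_id), ("missing", b"\x00" * 20)]:
    try:
        check(objects, obj_id)
        outcomes[name] = None
    except (ObjNotFoundError, CorruptedError) as exc:
        outcomes[name] = type(exc).__name__
```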
-
delete
(obj_id)[source]¶ Delete an object.
- Parameters
obj_id (bytes) – object identifier.
- Raises
ObjNotFoundError – if the requested object is missing.
-
get_random
(batch_size)[source]¶ Get random ids of existing contents.
This method is used in order to get random ids to perform content integrity verifications on random contents.
- Parameters
batch_size (int) – Number of ids that will be given
- Yields
An iterable of ids (bytes) of contents that are in the current object storage.
-