swh.objstorage.api.client module
- class swh.objstorage.api.client.RemoteObjStorage(url: str, timeout: None | Tuple[float, float] | List[float] | float = None, chunk_size: int = 4096, max_retries: int = 3, pool_connections: int = 20, pool_maxsize: int = 100, adapter_kwargs: Dict[str, Any] | None = None, api_exception: Type[Exception] | None = None, reraise_exceptions: List[Type[Exception]] | None = None, enable_requests_retry: bool | None = None, **kwargs)
Bases:
RPCClient
Proxy to a remote object storage.
This class connects to an object storage server over HTTP.
- url
The URL of the server to connect to. Must end with a ‘/’.
- Type:
string
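Since the URL must end with a ‘/’, callers that build it from configuration may want to normalize it first. A minimal sketch: `ensure_trailing_slash` and the example address are hypothetical and not part of swh.objstorage.

```python
def ensure_trailing_slash(url: str) -> str:
    """Return url unchanged if it already ends with '/', else append one."""
    return url if url.endswith("/") else url + "/"


# Hypothetical server address, used only for illustration.
base_url = ensure_trailing_slash("http://localhost:5003")
```

The normalized value can then be passed as the `url` argument of `RemoteObjStorage`.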
- session
The session to send requests.
- api_exception
alias of
ObjStorageAPIError
- reraise_exceptions: List[Type[Exception]] = [<class 'swh.objstorage.exc.ObjNotFoundError'>, <class 'swh.objstorage.exc.Error'>, <class 'swh.objstorage.exc.ObjCorruptedError'>, <class 'swh.objstorage.exc.NoBackendsLeftError'>, <class 'PermissionError'>]
On server errors, if any of the exception classes in this list has the same name as the error name, then the exception will be instantiated and raised instead of a generic RemoteException.
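The name-based matching described above can be illustrated with a small standalone sketch. This is not the actual RPC client code: `exception_for` is a hypothetical helper, and the two exception classes defined here are local stand-ins for the real ones.

```python
from typing import List, Tuple, Type


class RemoteException(Exception):
    """Stand-in for the generic exception used when no class name matches."""


class ObjNotFoundError(Exception):
    """Stand-in for swh.objstorage.exc.ObjNotFoundError."""


def exception_for(
    error_name: str, args: Tuple, reraise: List[Type[Exception]]
) -> Exception:
    """Instantiate the class whose name equals error_name, if there is one."""
    for exc_class in reraise:
        if exc_class.__name__ == error_name:
            return exc_class(*args)
    # No class with that name: fall back to the generic exception.
    return RemoteException(error_name, *args)


candidates = [ObjNotFoundError, PermissionError]
matched = exception_for("ObjNotFoundError", ("abc",), candidates)
unmatched = exception_for("SomethingElse", ("abc",), candidates)
```

Here `matched` is an `ObjNotFoundError` instance, while `unmatched` falls back to the generic stand-in.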
- backend_class
alias of
ObjStorageInterface
- list_content(last_obj_id: bytes | CompositeObjId | None = None, limit: int | None = 10000) → Iterator[CompositeObjId]
- add(content: bytes, obj_id: bytes | CompositeObjId, check_presence: bool = True) → None
Add a new object to the object storage.
- Parameters:
content – object’s raw content to add in storage.
obj_id – either dict of checksums, or single checksum of [bytes] using [ID_HASH_ALGO] algorithm. It is trusted to match the bytes.
check_presence (bool) – indicate if the presence of the content should be verified before adding the file.
- Returns:
the id (bytes) of the object in the storage.
- add_batch(contents: Mapping[bytes, bytes] | Iterable[Tuple[bytes | CompositeObjId, bytes]], check_presence: bool = True) → Dict
Add a batch of new objects to the object storage.
- Parameters:
contents – either mapping from [ID_HASH_ALGO] checksums to object contents, or list of pairs of dict hashes and object contents
- Returns:
a summary of the objects added to the storage (number of objects, total number of bytes).
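To make the two accepted input shapes and the returned summary concrete, here is a rough in-memory sketch. `add_batch_sketch` and the plain `dict` store are hypothetical, only bytes ids are handled, and the summary key names are an assumption rather than a documented contract.

```python
from collections import abc
from typing import Dict, Iterable, Mapping, Tuple, Union

Contents = Union[Mapping[bytes, bytes], Iterable[Tuple[bytes, bytes]]]


def add_batch_sketch(store: Dict[bytes, bytes], contents: Contents) -> Dict[str, int]:
    """Add objects to an in-memory store and summarize what was added."""
    # Accept either a mapping of id -> content or an iterable of (id, content).
    pairs = contents.items() if isinstance(contents, abc.Mapping) else contents
    added = 0
    added_bytes = 0
    for obj_id, content in pairs:
        if obj_id in store:  # mimics check_presence=True: skip known objects
            continue
        store[obj_id] = content
        added += 1
        added_bytes += len(content)
    # Key names are illustrative assumptions.
    return {"object:add": added, "object:add:bytes": added_bytes}


store: Dict[bytes, bytes] = {}
first = add_batch_sketch(store, {b"a": b"xy", b"b": b"z"})
second = add_batch_sketch(store, [(b"a", b"xy"), (b"c", b"1234")])
```

The second call skips `b"a"` because it is already present, so only one object is counted.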
- check(obj_id: bytes | CompositeObjId) → None
Perform an integrity check for a given object.
Verify that the file object is in place and that the content matches the object id.
- Parameters:
obj_id – object identifier.
- Raises:
ObjNotFoundError – if the requested object is missing.
ObjCorruptedError – if the requested object is corrupted.
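The integrity check amounts to recomputing a checksum over the stored content and comparing it with the object id. A minimal sketch over an in-memory dict follows. SHA-1 is an assumption standing in for whatever ID_HASH_ALGO designates, `check_sketch` is hypothetical, and the exception classes are local stand-ins for the real ones.

```python
import hashlib
from typing import Dict


class ObjNotFoundError(Exception):
    """Stand-in for swh.objstorage.exc.ObjNotFoundError."""


class ObjCorruptedError(Exception):
    """Stand-in for swh.objstorage.exc.ObjCorruptedError."""


def check_sketch(store: Dict[bytes, bytes], obj_id: bytes) -> None:
    """Raise if obj_id is absent or its content does not hash to obj_id."""
    if obj_id not in store:
        raise ObjNotFoundError(obj_id)
    # Assumption: the id is the SHA-1 digest of the raw content.
    if hashlib.sha1(store[obj_id]).digest() != obj_id:
        raise ObjCorruptedError(obj_id)


good_id = hashlib.sha1(b"hello").digest()
demo_store = {good_id: b"hello"}
check_sketch(demo_store, good_id)  # passes silently
```

As with `check`, success is signalled by the absence of an exception rather than by a return value.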
- check_config(*, check_write)
Check whether the object storage is properly configured.
- Parameters:
check_write (bool) – if True, check if writes to the object storage can succeed.
- Returns:
True if the configuration check worked, False if ‘check_write’ is True and the object storage is actually read only. Raises an exception if the check failed.
- delete(obj_id: bytes | CompositeObjId)
Delete an object.
- Parameters:
obj_id – object identifier.
- Raises:
ObjNotFoundError – if the requested object is missing.
- download_url(obj_id: bytes | CompositeObjId, content_disposition: str | None = None, expiry: timedelta | None = None) → str | None
Get a direct download link for the object if the objstorage backend supports such a feature.
Some objstorage backends, typically cloud-based ones like Azure or S3, can provide a direct download link for a stored object.
- Parameters:
obj_id – object identifier
content_disposition – set Content-Disposition header for the generated URL response if the objstorage backend supports it
expiry – the duration after which the URL expires, if the objstorage backend supports it. If not provided, the URL expires 24 hours after its creation.
- Returns:
Direct download URL for the object, or None if the objstorage backend does not support such a feature.
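Because the method may return None, callers typically need a fallback path. The sketch below shows one possible pattern with a hypothetical stand-in backend, `NoLinkStorage`, which mimics only the two methods involved. `link_or_content` and the Content-Disposition value are likewise illustrative assumptions.

```python
from datetime import timedelta
from typing import Dict, Optional, Tuple, Union


class NoLinkStorage:
    """Hypothetical backend that cannot produce direct download links."""

    def __init__(self, objects: Dict[bytes, bytes]) -> None:
        self._objects = objects

    def download_url(self, obj_id, content_disposition=None, expiry=None) -> Optional[str]:
        return None  # feature not supported by this backend

    def get(self, obj_id) -> bytes:
        return self._objects[obj_id]


def link_or_content(storage, obj_id: bytes) -> Tuple[str, Union[str, bytes]]:
    """Prefer a direct link; fall back to fetching the content itself."""
    url = storage.download_url(
        obj_id,
        content_disposition='attachment; filename="blob.bin"',
        expiry=timedelta(hours=1),
    )
    if url is not None:
        return ("url", url)
    return ("content", storage.get(obj_id))


result = link_or_content(NoLinkStorage({b"id": b"data"}), b"id")
```

With a backend that does support links, the same helper would return the URL and skip the `get` call.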
- get(obj_id: bytes | CompositeObjId) → bytes
Retrieve the content of a given object.
- Parameters:
obj_id – object id.
- Returns:
the content of the requested object as bytes.
- Raises:
ObjNotFoundError – if the requested object is missing.
- get_batch(obj_ids: Iterable[bytes | CompositeObjId]) → Iterator[bytes | None]
Retrieve objects’ raw content in bulk from storage.
Note: This function has a default implementation in ObjStorage that is suitable for most cases.
Object storages that need to keep the number of requests to a minimum (e.g. remote object storages) can override this method to perform a more efficient operation.
- Parameters:
obj_ids – list of object ids.
- Returns:
the resulting contents, with None in place of any content that could not be retrieved. Does not raise any exception, since a failure for one content should not cancel the whole request.
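Since failures surface as None rather than as exceptions, callers have to pair each result with its id themselves. A small sketch of that pattern, under the assumption that results come back in the same order as the requested ids. `get_batch_sketch` and `split_results` are hypothetical helpers working on an in-memory dict.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def get_batch_sketch(
    store: Dict[bytes, bytes], obj_ids: Iterable[bytes]
) -> Iterator[Optional[bytes]]:
    """Yield each object's content, or None when the object is unavailable."""
    for obj_id in obj_ids:
        yield store.get(obj_id)


def split_results(
    obj_ids: List[bytes], results: Iterable[Optional[bytes]]
) -> Tuple[Dict[bytes, bytes], List[bytes]]:
    """Separate retrieved contents from the ids that yielded None."""
    found: Dict[bytes, bytes] = {}
    missing: List[bytes] = []
    # Assumption: results are in the same order as obj_ids.
    for obj_id, content in zip(obj_ids, results):
        if content is None:
            missing.append(obj_id)
        else:
            found[obj_id] = content
    return found, missing


batch_store = {b"a": b"1"}
ids = [b"a", b"b"]
found, missing = split_results(ids, get_batch_sketch(batch_store, ids))
```

The ids collected in `missing` can then be retried or reported without aborting the rest of the batch.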