swh.objstorage.backends.winery.sharedbase module#

class swh.objstorage.backends.winery.sharedbase.ShardState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

Description of the lifecycle of Winery shards

STANDBY = 'standby'#

The write shard is idle but ready to receive new objects as soon as it is locked.

WRITING = 'writing'#

The write shard is currently locked by a WineryWriter and receiving writes.

FULL = 'full'#

The write shard has reached the size threshold and will not be written to anymore, it is ready to be packed.

PACKING = 'packing'#

The write shard is being packed into its read-only version.

PACKED = 'packed'#

The read-only shard has been finalized, the write shard is pending cleanup as soon as all hosts have acknowledged the read-only shard.

CLEANING = 'cleaning'#

The write shard has been locked for cleanup.

READONLY = 'readonly'#

Only the read-only shard remains.

property locked#

The state corresponds to a locked shard

property image_available#

In this state, the read-only shard is available

property readonly#

In this state, the write shard is unavailable

class swh.objstorage.backends.winery.sharedbase.SignatureState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

INFLIGHT = 'inflight'#
PRESENT = 'present'#
DELETED = 'deleted'#
class swh.objstorage.backends.winery.sharedbase.TemporaryShardLocker(base: SharedBase, current_state: ShardState, new_state: ShardState, min_mapped_hosts: int = 0)[source]#

Bases: object

Opportunistically lock a shard, and provide a context manager to unlock the shard if an operation fails.

Use this through the SharedBase.maybe_lock_one_shard() method.

class swh.objstorage.backends.winery.sharedbase.SharedBase(base_dsn: str, application_name: str | None = None, **kwargs)[source]#

Bases: Database

The main database for a Winery instance.

This handles access to the following tables:

  • shards is the list of shards and their associated ShardState.

  • signature2shard is the mapping between object ids and the shard that contains the associated object.

This class is also used to lock a shard for exclusive use (by moving it to a locked state, and setting a locker id).

current_version: int = 2#
property locked_shard: str#

The name of the shard that is currently locked for writing by this SharedBase.

property locked_shard_id: int#

The numeric ID of the shard that is currently locked for writing by this :py:class`SharedBase`.

set_locked_shard() None[source]#

Lock a shard in ShardState.STANDBY for writing, creating a new write shard (and the associated table) if none is currently available.

lock_one_shard(current_state: ShardState, new_state: ShardState, min_mapped_hosts: int = 0) Tuple[str, int] | None[source]#

Lock one shard in current_state, putting it into new_state. Only lock a shard if it has more than min_mapped_hosts hosts that have registered as having mapped the shard.

maybe_lock_one_shard(current_state: ShardState, new_state: ShardState, min_mapped_hosts: int = 0) TemporaryShardLocker[source]#

Opportunistically lock a shard, and, if a shard was locked, provide a context manager to rollback the locking on failure.

Example:

locked = base.maybe_lock_one_shard(
    current_state=ShardState.FULL,
    new_state=ShardState.PACKING,
)
if not locked:
    wait_a_minute()
    return

with locked:
    do_something_with_locked_shard(locked.name)

If do_something_with_locked_shard fails, the shard will be moved back to the current_state on exit.

set_shard_state(new_state: ShardState, set_locker: bool = False, check_locker: bool = False, name: str | None = None, db: Connection | None = None)[source]#

Set the state of a given shard (or of the shard that is currently locked).

Parameters:
  • new_state – the new ShardState for the shard.

  • set_locker – whether the shard should be marked as locked by the current SharedBase.

  • check_locker – whether state change should only be accepted if the shard is currently locked by us.

  • name – the name of the shard to change the state of (default to the currently locked shard).

  • db – pass an existing psycopg connection to run this in an existing transaction.

create_shard(new_state: ShardState) Tuple[str, int][source]#

Create a new write shard (locked by the current SharedBase), with a generated name.

Parameters:

new_state – the ShardState for the new shard.

Returns:

the name and numeric id of the newly created shard.

Raises:

RuntimeError – if the shard creation failed (for instance if a shard with an identical name was created concurrently).

shard_packing_starts(name: str)[source]#

Record the named shard as being packed now.

shard_packing_ends(name: str)[source]#

Record the completion of packing shard name.

get_shard_info(id: int) Tuple[str, ShardState] | None[source]#

Get the name and ShardState of the shard with the given id.

Returns:

None if the shard with the given id doesn’t exist.

get_shard_state(name: str) ShardState | None[source]#

Get the ShardState of the named shard.

Returns:

None if the shard with the given name doesn’t exist.

list_shards() Iterator[Tuple[str, ShardState]][source]#

List all known shards and their current ShardState.

count_objects(name: str | None = None) int | None[source]#

Count the known objects in a shard.

Parameters:

name – the name of the shard in which objects should be counted (defaults to the currently locked shard)

Returns:

None if no shard exists with the given name.

Raises:

ValueError – if no shard has been specified and no shard is currently locked.

record_shard_mapped(host: str, name: str) Set[str][source]#

Record that the name``d shard has been mapped on the given ``host.

This is used in the distributed winery mode to acknowledge shards that have been seen by hosts, before the write shard is removed for cleanup.

contains(obj_id: bytes) int | None[source]#

Return the id of the shard which contains obj_id, or :py:const`None` if the object is not known (or deleted).

get(obj_id) Tuple[str, ShardState] | None[source]#

Return the name and ShardState of the shard containing obj_id, or None if the object is not known (or deleted).

record_new_obj_id(db: Connection, obj_id: bytes) int | None[source]#

Try to record obj_id as present in the currently locked shard.

Parameters:
  • db – a psycopg database with an open transaction

  • obj_id – the id of the object being added

Returns:

The numeric id of the shard in which the object is recorded as present (which can differ from the currently locked shard, if the object was added in another concurrent transaction).

list_signatures(after_id: bytes | None = None, limit: int | None = None) Iterator[bytes][source]#

List limit known object ids after after_id.

delete(obj_id: bytes)[source]#

Mark obj_id for deletion.

deleted_objects() Iterator[Tuple[bytes, str, ShardState]][source]#

List all objects marked for deletion, with the name and state of the shard in which the object is stored.

Returns:

an iterator over object_id, shard name, ShardState tuples

clean_deleted_object(obj_id) None[source]#

Remove the reference to the deleted object obj_id.