swh.storage.postgresql.storage module
- swh.storage.postgresql.storage.EMPTY_SNAPSHOT_ID = b'\x1a\x88\x93\xe6\xa8oDN\x8b\xe8\xe7\xbd\xa6\xcb4\xfb\x175\xa0\x0e'
Identifier for the empty snapshot
- swh.storage.postgresql.storage.VALIDATION_EXCEPTIONS = (<class 'KeyError'>, <class 'TypeError'>, <class 'ValueError'>, <class 'psycopg2.errors.CheckViolation'>, <class 'psycopg2.IntegrityError'>, <class 'psycopg2.errors.InvalidTextRepresentation'>, <class 'psycopg2.errors.NotNullViolation'>, <class 'psycopg2.errors.NumericValueOutOfRange'>, <class 'psycopg2.errors.UndefinedFunction'>, <class 'psycopg2.errors.ProgramLimitExceeded'>)
Exceptions raised by PostgreSQL when validation of the arguments fails.
- swh.storage.postgresql.storage.convert_validation_exceptions()
Catches PostgreSQL errors related to invalid arguments, and re-raises a StorageArgumentException.
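The catch-and-re-raise behaviour described above can be illustrated with a minimal sketch. It assumes the function is used as a context manager, which this page does not state. The names StorageArgumentException (a local stand-in class), VALIDATION_EXCEPTIONS_SKETCH, convert_validation_exceptions_sketch and parse_limit are hypothetical, and only the built-in members of VALIDATION_EXCEPTIONS are included so that the example runs without psycopg2.

```python
import contextlib


class StorageArgumentException(Exception):
    """Local stand-in for the exception the module re-raises (hypothetical copy)."""


# Reduced stand-in for VALIDATION_EXCEPTIONS: built-in members only,
# so the sketch runs without psycopg2 installed.
VALIDATION_EXCEPTIONS_SKETCH = (KeyError, TypeError, ValueError)


@contextlib.contextmanager
def convert_validation_exceptions_sketch():
    """Re-raise argument-validation errors as StorageArgumentException."""
    try:
        yield
    except VALIDATION_EXCEPTIONS_SKETCH as exc:
        # Keep the original error as the cause for easier debugging.
        raise StorageArgumentException(str(exc)) from exc


def parse_limit(raw):
    """Hypothetical caller: int() raises ValueError on bad input."""
    with convert_validation_exceptions_sketch():
        return int(raw)
```

Exceptions outside the tuple pass through the wrapper unchanged, so unrelated failures are not reported as argument errors.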
- class swh.storage.postgresql.storage.Storage(db, objstorage=None, min_pool_conns=1, max_pool_conns=10, journal_writer=None, query_options=None)
Bases: object
SWH storage datastore proxy, encompassing DB and object storage
Instantiate a storage instance backed by a PostgreSQL database and an objstorage.
When db is passed as a connection string, this module automatically manages a connection pool of between min_pool_conns and max_pool_conns connections. When db is an explicit psycopg2 connection, min_pool_conns and max_pool_conns are ignored and the connection is used directly.
- Parameters:
db – either a libpq connection string, or a psycopg2 connection
objstorage – configuration for the backend ObjStorage; if unset, use a NoopObjStorage
min_pool_conns – min number of connections in the psycopg2 pool
max_pool_conns – max number of connections in the psycopg2 pool
journal_writer – configuration for the JournalWriter
query_options – configuration for the SQL connections; keys of the dict are the names of methods decorated with db_transaction() or db_transaction_generator() (e.g. content_find()), and values are dicts mapping config_name to config_value, used to configure the SQL connection for that method. For example, using {"content_get": {"statement_timeout": 5000}} will override the default statement timeout for the content_get() endpoint from 500ms to 5000ms. See swh.core.db.common for more details.
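As a rough illustration of the query_options shape described above, the following sketch builds the example dict and resolves the effective options for a method. DEFAULT_OPTIONS and options_for are hypothetical helpers written for this example; they are not part of the module, and the real lookup logic lives in swh.core.db.common.

```python
from typing import Any, Dict

# Example from the parameter description: raise the statement timeout
# for content_get() to 5000 ms.
query_options: Dict[str, Dict[str, Any]] = {
    "content_get": {"statement_timeout": 5000},
}

# Hypothetical defaults, for illustration only; the description above only
# states that the default statement timeout is 500 ms.
DEFAULT_OPTIONS: Dict[str, Any] = {"statement_timeout": 500}


def options_for(method_name: str) -> Dict[str, Any]:
    """Return the effective options for one method: defaults, then overrides."""
    effective = dict(DEFAULT_OPTIONS)
    effective.update(query_options.get(method_name, {}))
    return effective
```

Per the constructor signature, such a dict would be passed as the query_options argument of Storage. Methods without an entry, such as content_find() here, keep the defaults.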
- content_get_partition(partition_id: int, nb_partitions: int, page_token: Optional[str] = None, limit: int = 1000) -> PagedResult[Content, str]
- skipped_content_find(content: HashDict) -> List[SkippedContent]
- directory_entry_get_by_path(directory: bytes, paths: List[bytes]) -> Optional[Dict[str, Any]]
- directory_get_entries(directory_id: bytes, page_token: Optional[bytes] = None, limit: int = 1000) -> Optional[PagedResult[DirectoryEntry, str]]
- directory_get_id_partition(partition_id: int, nb_partitions: int, page_token: Optional[str] = None, limit: int = 1000) -> PagedResult[bytes, str]
- revision_get_partition(partition_id: int, nb_partitions: int, page_token: Optional[str] = None, limit: int = 1000) -> PagedResult[Revision, str]
- revision_get(revision_ids: List[bytes], ignore_displayname: bool = False) -> List[Optional[Revision]]
- revision_log(revisions: List[bytes], ignore_displayname: bool = False, limit: Optional[int] = None) -> Iterable[Optional[Dict[str, Any]]]
- revision_shortlog(revisions: List[bytes], limit: Optional[int] = None) -> Iterable[Optional[Tuple[bytes, Tuple[bytes, ...]]]]
- extid_get_from_extid(id_type: str, ids: List[bytes], version: Optional[int] = None) -> List[ExtID]
- extid_get_from_target(target_type: ObjectType, ids: List[bytes], extid_type: Optional[str] = None, extid_version: Optional[int] = None) -> List[ExtID]
- release_get(releases: List[bytes], ignore_displayname: bool = False) -> List[Optional[Release]]
- release_get_partition(partition_id: int, nb_partitions: int, page_token: Optional[str] = None, limit: int = 1000) -> PagedResult[Release, str]
- snapshot_get_id_partition(partition_id: int, nb_partitions: int, page_token: Optional[str] = None, limit: int = 1000) -> PagedResult[bytes, str]
- snapshot_count_branches(snapshot_id: bytes, branch_name_exclude_prefix: Optional[bytes] = None) -> Optional[Dict[Optional[str], int]]
- snapshot_get_branches(snapshot_id: bytes, branches_from: bytes = b'', branches_count: int = 1000, target_types: Optional[List[str]] = None, branch_name_include_substring: Optional[bytes] = None, branch_name_exclude_prefix: Optional[bytes] = None) -> Optional[PartialBranches]
- snapshot_branch_get_by_name(snapshot_id: bytes, branch_name: bytes, follow_alias_chain: bool = True, max_alias_chain_length: int = 100) -> Optional[SnapshotBranchByNameResponse]
- origin_visit_add(visits: List[OriginVisit]) -> Iterable[OriginVisit]
- origin_visit_status_get_latest(origin_url: str, visit: int, allowed_statuses: Optional[List[str]] = None, require_snapshot: bool = False) -> Optional[OriginVisitStatus]
- origin_visit_get(origin: str, page_token: Optional[str] = None, order: ListOrder = ListOrder.ASC, limit: int = 10) -> PagedResult[OriginVisit, str]
- origin_visit_get_with_statuses(origin: str, allowed_statuses: Optional[List[str]] = None, require_snapshot: bool = False, page_token: Optional[str] = None, order: ListOrder = ListOrder.ASC, limit: int = 10) -> PagedResult[OriginVisitWithStatuses, str]
- origin_visit_get_latest(origin: str, type: Optional[str] = None, allowed_statuses: Optional[List[str]] = None, require_snapshot: bool = False) -> Optional[OriginVisit]
- origin_visit_status_get(origin: str, visit: int, page_token: Optional[str] = None, order: ListOrder = ListOrder.ASC, limit: int = 10) -> PagedResult[OriginVisitStatus, str]
- origin_visit_status_get_random(type: str) -> Optional[OriginVisitStatus]
- origin_search(url_pattern: str, page_token: Optional[str] = None, limit: int = 50, regexp: bool = False, with_visit: bool = False, visit_types: Optional[List[str]] = None) -> PagedResult[Origin, str]
- raw_extrinsic_metadata_get(target: ExtendedSWHID, authority: MetadataAuthority, after: Optional[datetime] = None, page_token: Optional[bytes] = None, limit: int = 1000) -> PagedResult[RawExtrinsicMetadata, str]
- raw_extrinsic_metadata_get_authorities(target: ExtendedSWHID) -> List[MetadataAuthority]
- metadata_authority_get(type: MetadataAuthorityType, url: str) -> Optional[MetadataAuthority]
- object_find_recent_references(target_swhid: ExtendedSWHID, limit: int) -> List[ExtendedSWHID]