swh.model.swhids module

Classes to represent SWH persistent IDentifiers.

CoreSWHID represents a SWHID with no qualifier, and QualifiedSWHID represents a SWHID that may have qualifiers. ExtendedSWHID extends the definition of SWHID to other object types, and is used internally in Software Heritage; it does not support qualifiers.

class swh.model.swhids.ObjectType(value)

Bases: enum.Enum

Possible object types of a QualifiedSWHID or CoreSWHID.

The value of each variant is what is used in the SWHID’s string representation.

SNAPSHOT = 'snp'
REVISION = 'rev'
RELEASE = 'rel'
DIRECTORY = 'dir'
CONTENT = 'cnt'
class swh.model.swhids.ExtendedObjectType(value)

Bases: enum.Enum

Possible object types of an ExtendedSWHID.

The variants are a superset of ObjectType’s.

SNAPSHOT = 'snp'
REVISION = 'rev'
RELEASE = 'rel'
DIRECTORY = 'dir'
CONTENT = 'cnt'
ORIGIN = 'ori'
RAW_EXTRINSIC_METADATA = 'emd'
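As a minimal illustration of how the two enums relate, the sketch below re-declares them locally with the standard enum module. These are stand-ins so the snippet runs without swh.model installed; the real classes live in swh.model.swhids. It looks up a variant from its string tag and checks that every core tag is also an extended tag.

```python
import enum


# Local stand-ins mirroring the values listed above; not the library classes.
class ObjectType(enum.Enum):
    SNAPSHOT = "snp"
    REVISION = "rev"
    RELEASE = "rel"
    DIRECTORY = "dir"
    CONTENT = "cnt"


class ExtendedObjectType(enum.Enum):
    SNAPSHOT = "snp"
    REVISION = "rev"
    RELEASE = "rel"
    DIRECTORY = "dir"
    CONTENT = "cnt"
    ORIGIN = "ori"
    RAW_EXTRINSIC_METADATA = "emd"


# Look up a variant from the tag used in the SWHID string.
content_type = ObjectType("cnt")

# Compare the sets of tags to check the superset relation.
core_tags = {t.value for t in ObjectType}
extended_tags = {t.value for t in ExtendedObjectType}
is_superset = core_tags <= extended_tags
extra_tags = sorted(extended_tags - core_tags)

# A core variant can be mapped to its extended counterpart through its value.
extended_dir = ExtendedObjectType(ObjectType.DIRECTORY.value)
```

Because the shared variants carry identical values, converting from the core enum to the extended one by value never fails, while the reverse lookup raises ValueError for 'ori' and 'emd'.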
class swh.model.swhids.CoreSWHID(*, namespace: str = 'swh', scheme_version: int = 1, object_id: bytes, object_type)

Bases: swh.model.swhids._BaseSWHID[swh.model.swhids.ObjectType]

Dataclass holding the relevant info associated to a SoftWare Heritage persistent IDentifier (SWHID).

Unlike QualifiedSWHID, it is restricted to core SWHIDs, i.e. SWHIDs with no qualifiers.

Raises

swh.model.exceptions.ValidationError – In case of invalid object type or id

To get the raw SWHID string from an instance of this class, use the str() function:

>>> swhid = CoreSWHID(
...     object_type=ObjectType.CONTENT,
...     object_id=bytes.fromhex('8ff44f081d43176474b267de5451f2c2e88089d0'),
... )
>>> str(swhid)
'swh:1:cnt:8ff44f081d43176474b267de5451f2c2e88089d0'

And vice-versa with CoreSWHID.from_string():

>>> swhid == CoreSWHID.from_string(
...     "swh:1:cnt:8ff44f081d43176474b267de5451f2c2e88089d0"
... )
True
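To make the string layout in the examples above concrete, here is a stdlib-only sketch that splits a core SWHID into the four fields of the constructor and joins them back. It does not use swh.model: the regular expression and the helper names split_core_swhid and join_core_swhid are assumptions made for illustration, and the sketch raises ValueError where the library raises ValidationError.

```python
import re

# Assumed layout of a version-1 core SWHID: namespace, scheme version,
# three-letter type tag, then 40 lowercase hexadecimal digits.
CORE_SWHID_RE = re.compile(
    r"(?P<namespace>swh):(?P<version>1):"
    r"(?P<type>snp|rev|rel|dir|cnt):(?P<hex_id>[0-9a-f]{40})"
)


def split_core_swhid(text):
    """Return (namespace, scheme_version, type_tag, object_id)."""
    match = CORE_SWHID_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a core SWHID: {text!r}")
    return (
        match["namespace"],
        int(match["version"]),
        match["type"],
        bytes.fromhex(match["hex_id"]),
    )


def join_core_swhid(namespace, scheme_version, type_tag, object_id):
    """Inverse of split_core_swhid: rebuild the string form."""
    return f"{namespace}:{scheme_version}:{type_tag}:{object_id.hex()}"


text = "swh:1:cnt:8ff44f081d43176474b267de5451f2c2e88089d0"
parts = split_core_swhid(text)
round_trip = join_core_swhid(*parts)
```

The object id is kept as 20 raw bytes, matching the object_id: bytes parameter, and is only rendered as hexadecimal when the string is rebuilt.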

Method generated by attrs for class CoreSWHID.

object_type: swh.model.swhids._TObjectType

the type of object the identifier points to

to_extended() → swh.model.swhids.ExtendedSWHID

Converts this CoreSWHID into an ExtendedSWHID.

As ExtendedSWHID is a superset of CoreSWHID, this is lossless.

class swh.model.swhids.QualifiedSWHID(*, namespace: str = 'swh', scheme_version: int = 1, object_id: bytes, object_type, origin: Optional[str] = None, visit: Optional[Union[str, swh.model.swhids.CoreSWHID]] = None, anchor: Optional[Union[str, swh.model.swhids.CoreSWHID]] = None, path: Optional[Union[str, bytes]] = None, lines: Optional[Union[str, Tuple[int, Optional[int]]]] = None)

Bases: swh.model.swhids._BaseSWHID[swh.model.swhids.ObjectType]

Dataclass holding the relevant info associated to a SoftWare Heritage persistent IDentifier (SWHID).

Raises

swh.model.exceptions.ValidationError – In case of invalid object type or id

To get the raw SWHID string from an instance of this class, use the str() function:

>>> swhid = QualifiedSWHID(
...     object_type=ObjectType.CONTENT,
...     object_id=bytes.fromhex('8ff44f081d43176474b267de5451f2c2e88089d0'),
...     lines=(5, 10),
... )
>>> str(swhid)
'swh:1:cnt:8ff44f081d43176474b267de5451f2c2e88089d0;lines=5-10'

And vice-versa with QualifiedSWHID.from_string():

>>> swhid == QualifiedSWHID.from_string(
...     "swh:1:cnt:8ff44f081d43176474b267de5451f2c2e88089d0;lines=5-10"
... )
True
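The ';lines=5-10' suffix in the example above can be taken apart with plain string operations. The following is a rough stdlib-only sketch, not the library's parser: the helper names are hypothetical, percent-decoding of qualifier values is left out, and the (start, end) tuple mirrors the Tuple[int, Optional[int]] type of the lines parameter.

```python
def split_qualifiers(text):
    """Split 'core;key=value;...' into (core, {key: value})."""
    core, *raw_qualifiers = text.split(";")
    qualifiers = {}
    for item in raw_qualifiers:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"qualifier without '=': {item!r}")
        qualifiers[key] = value
    return core, qualifiers


def parse_lines(value):
    """Turn '5-10' into (5, 10) and a single line '5' into (5, None)."""
    start, sep, end = value.partition("-")
    return (int(start), int(end) if sep else None)


def format_lines(lines):
    """Inverse of parse_lines."""
    start, end = lines
    return str(start) if end is None else f"{start}-{end}"


text = "swh:1:cnt:8ff44f081d43176474b267de5451f2c2e88089d0;lines=5-10"
core, qualifiers = split_qualifiers(text)
line_range = parse_lines(qualifiers["lines"])
```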

Method generated by attrs for class QualifiedSWHID.

object_type: swh.model.swhids._TObjectType

the type of object the identifier points to

origin

the software origin where an object has been found or observed in the wild, as a URI

visit

the core identifier of a snapshot corresponding to a specific visit of a repository containing the designated object

anchor

a designated node in the Merkle DAG relative to which a path to the object is specified, as the core identifier of a directory, a revision, a release, or a snapshot

path

the absolute file path, from the root directory associated to the anchor node, to the object; when the anchor denotes a directory or a revision, and almost always when it’s a release, the root directory is uniquely determined; when the anchor denotes a snapshot, the root directory is the one pointed to by HEAD (possibly indirectly), and undefined if such a reference is missing

lines

line number(s) of interest, usually within a content object
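As a hedged sketch of how the qualifiers described above combine into one identifier, the snippet below appends them to a core SWHID in the order they are listed here. It is stdlib-only and not the library's serializer: build_qualified and escape_value are hypothetical helpers, the escaping is simplified to '%' and ';', and the origin URL and anchor id are made-up placeholders.

```python
# Order in which qualifiers are emitted, following the attribute list above.
QUALIFIER_ORDER = ("origin", "visit", "anchor", "path", "lines")


def escape_value(value):
    # Escape '%' first so the '%3B' produced for ';' is not escaped again.
    return value.replace("%", "%25").replace(";", "%3B")


def build_qualified(core, **qualifiers):
    """Append the given qualifiers to a core SWHID string."""
    unknown = set(qualifiers) - set(QUALIFIER_ORDER)
    if unknown:
        raise ValueError(f"unknown qualifiers: {sorted(unknown)}")
    parts = [core]
    for key in QUALIFIER_ORDER:
        value = qualifiers.get(key)
        if value is not None:
            parts.append(f"{key}={escape_value(value)}")
    return ";".join(parts)


core = "swh:1:cnt:8ff44f081d43176474b267de5451f2c2e88089d0"
placeholder_anchor = "swh:1:dir:" + "a" * 40  # made-up id, illustration only
qualified = build_qualified(
    core,
    lines="5-10",
    path="/src/a;b.c",
    anchor=placeholder_anchor,
    origin="https://example.org/repo.git",  # placeholder URL
)
```

Keyword arguments may be passed in any order; the fixed QUALIFIER_ORDER tuple keeps the output stable, and escaping ';' keeps a path containing that character from being read as a qualifier separator.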


check_visit(attribute, value)
check_anchor(attribute, value)
qualifiers() → Dict[str, str]
classmethod from_string(s: str) → swh.model.swhids.QualifiedSWHID
class swh.model.swhids.ExtendedSWHID(*, namespace: str = 'swh', scheme_version: int = 1, object_id: bytes, object_type)

Bases: swh.model.swhids._BaseSWHID[swh.model.swhids.ExtendedObjectType]

Dataclass holding the relevant info associated to a SoftWare Heritage persistent IDentifier (SWHID).

It extends CoreSWHID by allowing non-standard object types, and should only be used internally to Software Heritage.

Raises

swh.model.exceptions.ValidationError – In case of invalid object type or id

To get the raw SWHID string from an instance of this class, use the str() function:

>>> swhid = ExtendedSWHID(
...     object_type=ExtendedObjectType.CONTENT,
...     object_id=bytes.fromhex('8ff44f081d43176474b267de5451f2c2e88089d0'),
... )
>>> str(swhid)
'swh:1:cnt:8ff44f081d43176474b267de5451f2c2e88089d0'

And vice-versa with ExtendedSWHID.from_string():

>>> swhid == ExtendedSWHID.from_string(
...     "swh:1:cnt:8ff44f081d43176474b267de5451f2c2e88089d0"
... )
True
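The practical difference from a core SWHID is the set of accepted type tags. The short stdlib-only sketch below contrasts the two under the assumption that both share the same 'swh:1:<tag>:<40 hex digits>' layout; the patterns and the all-zero origin id are illustrative placeholders, not taken from swh.model.

```python
import re

CORE_TAGS = ("snp", "rev", "rel", "dir", "cnt")
EXTENDED_TAGS = CORE_TAGS + ("ori", "emd")


def make_pattern(tags):
    """Build a pattern accepting only the given type tags."""
    return re.compile(r"swh:1:(?:%s):[0-9a-f]{40}" % "|".join(tags))


CORE_RE = make_pattern(CORE_TAGS)
EXTENDED_RE = make_pattern(EXTENDED_TAGS)

content = "swh:1:cnt:8ff44f081d43176474b267de5451f2c2e88089d0"
origin = "swh:1:ori:" + "0" * 40  # placeholder id, illustration only

# The content identifier is valid in both; the origin one only as extended.
accepted_by_core = [bool(CORE_RE.fullmatch(s)) for s in (content, origin)]
accepted_by_extended = [bool(EXTENDED_RE.fullmatch(s)) for s in (content, origin)]
```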

Method generated by attrs for class ExtendedSWHID.

object_type: swh.model.swhids._TObjectType

the type of object the identifier points to