swh.model.model module¶
-
exception
swh.model.model.
MissingData
[source]¶ Bases:
Exception
Raised by Content.with_data when it has no way of fetching the data (but not when fetching the data fails).
-
swh.model.model.
KeyType
¶ The type returned by BaseModel.unique_key().
alias of Union[Dict[str, str], Dict[str, bytes], bytes]
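Two of the three alternatives in this alias are dictionaries, so a key is not always hashable as-is. The sketch below shows one possible way to normalize a key before using it for deduplication. `hashable_key` and `deduplicate` are hypothetical helpers written for illustration; they are not part of swh.model.

```python
from typing import Dict, Hashable, Iterable, List, Tuple, Union

# Mirrors the alias documented above.
KeyType = Union[Dict[str, str], Dict[str, bytes], bytes]


def hashable_key(key: KeyType) -> Hashable:
    """Turn a KeyType value into something usable as a set member (hypothetical helper)."""
    if isinstance(key, bytes):
        return key
    # Dictionaries are not hashable; a sorted tuple of items is.
    return tuple(sorted(key.items()))


def deduplicate(pairs: Iterable[Tuple[KeyType, object]]) -> List[object]:
    """Keep the first object seen for each unique key (hypothetical helper)."""
    seen = set()
    unique = []
    for key, obj in pairs:
        marker = hashable_key(key)
        if marker not in seen:
            seen.add(marker)
            unique.append(obj)
    return unique
```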
-
swh.model.model.
freeze_optional_dict
(d: Union[None, Dict[KT, VT], swh.model.collections.ImmutableDict[KT, VT]]) → Optional[swh.model.collections.ImmutableDict[KT, VT]][source]¶
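The signature suggests that None passes through unchanged and that any mapping comes back in immutable form. A minimal sketch of that behavior, assuming `types.MappingProxyType` as a standard-library stand-in for swh.model.collections.ImmutableDict (the name `freeze_optional_dict_sketch` is hypothetical):

```python
from types import MappingProxyType
from typing import Mapping, Optional, TypeVar

KT = TypeVar("KT")
VT = TypeVar("VT")


def freeze_optional_dict_sketch(
    d: Optional[Mapping[KT, VT]]
) -> Optional[Mapping[KT, VT]]:
    """Return None unchanged, otherwise a read-only copy of the mapping."""
    if d is None:
        return None
    # Copy first so later changes to the caller's dict do not leak through.
    return MappingProxyType(dict(d))
```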
-
class
swh.model.model.
BaseModel
[source]¶ Bases:
object
Base class for SWH model classes.
Provides serialization to and deserialization from Python dictionaries that are suitable for JSON/msgpack-like formats.
-
to_dict
()[source]¶ Wrapper of attr.asdict that can be overridden by subclasses that have special handling of some of the fields.
-
classmethod
from_dict
(d)[source]¶ Takes a dictionary representing a tree of SWH objects, and recursively builds the corresponding objects.
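The round trip between objects and nested dictionaries can be sketched with standard-library dataclasses. This is an illustration only: the real classes are built on attrs, and the `Sketch*` classes and the free functions `to_dict` and `from_dict` below are hypothetical stand-ins.

```python
from dataclasses import asdict, dataclass, fields, is_dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class SketchTimestamp:
    seconds: int
    microseconds: int


@dataclass(frozen=True)
class SketchTimestampWithTimezone:
    timestamp: SketchTimestamp
    offset: int
    negative_utc: bool


def to_dict(obj: Any) -> Dict[str, Any]:
    """Serialize an object tree into nested plain dictionaries."""
    return asdict(obj)


def from_dict(cls: type, d: Dict[str, Any]) -> Any:
    """Recursively rebuild an object tree from nested dictionaries."""
    kwargs = {}
    for field in fields(cls):
        value = d[field.name]
        # A nested dictionary for a dataclass-typed field is rebuilt first.
        if is_dataclass(field.type) and isinstance(value, dict):
            value = from_dict(field.type, value)
        kwargs[field.name] = value
    return cls(**kwargs)
```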
-
-
class
swh.model.model.
HashableObject
[source]¶ Bases:
object
Mixin to automatically compute object identifier hash when the associated model is instantiated.
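A minimal sketch of the idea, assuming a frozen dataclass in place of the attrs-based model and SHA-1 over a single `payload` field as a placeholder hash. Both choices are illustrative assumptions, not the library's actual identifier computation.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchHashable:
    payload: bytes
    id: bytes = b""

    def compute_hash(self) -> bytes:
        # Placeholder hash; real model classes define their own computation.
        return hashlib.sha1(self.payload).digest()

    def __post_init__(self) -> None:
        # Fill in the identifier only when the caller left it empty.
        if not self.id:
            object.__setattr__(self, "id", self.compute_hash())
```

An explicitly provided `id` is kept as given; only an empty one triggers `compute_hash()`.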
-
class
swh.model.model.
Person
(fullname: bytes, name: Optional[bytes], email: Optional[bytes])[source]¶ Bases:
swh.model.model.BaseModel
Represents the author/committer of a revision or release.
-
object_type
: typing_extensions.Final = 'person'¶
-
fullname
¶
-
name
¶
-
email
¶
-
classmethod
from_fullname
(fullname: bytes)[source]¶ Returns a Person object by guessing the name and email from the fullname, which is expected to be in the name <email> format.
The fullname is left unchanged.
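One way such guessing could work is sketched below. The handling of edge cases (no angle brackets, missing closing bracket) is an assumption made for this example and may differ from the library; `SketchPerson` and `person_from_fullname` are hypothetical names.

```python
from typing import NamedTuple, Optional


class SketchPerson(NamedTuple):
    fullname: bytes
    name: Optional[bytes]
    email: Optional[bytes]


def person_from_fullname(fullname: bytes) -> SketchPerson:
    """Guess name and email from b'name <email>'; fullname is kept as-is."""
    open_bracket = fullname.find(b"<")
    if open_bracket == -1:
        # Assumption: without an email part, the whole string is the name.
        return SketchPerson(fullname, fullname or None, None)
    name = fullname[:open_bracket].strip() or None
    close_bracket = fullname.find(b">", open_bracket)
    if close_bracket == -1:
        # Assumption: an unterminated email runs to the end of the string.
        email = fullname[open_bracket + 1:]
    else:
        email = fullname[open_bracket + 1:close_bracket]
    return SketchPerson(fullname, name, email or None)
```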
-
anonymize
() → swh.model.model.Person[source]¶ Returns an anonymized version of the Person object.
The anonymized Person has the hash of the original fullname as its fullname, and its name and email are unset.
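As a rough illustration of this anonymization, the sketch below hashes the fullname and clears the other fields. The use of SHA-256 is an assumption of this example (the text above does not name the hash function), and `SketchPerson` and `anonymize_person` are hypothetical names.

```python
import hashlib
from typing import NamedTuple, Optional


class SketchPerson(NamedTuple):
    fullname: bytes
    name: Optional[bytes]
    email: Optional[bytes]


def anonymize_person(person: SketchPerson) -> SketchPerson:
    """Return a copy whose fullname is a digest and whose name/email are unset."""
    # Assumption: SHA-256 is used as the hash function.
    digest = hashlib.sha256(person.fullname).digest()
    return SketchPerson(fullname=digest, name=None, email=None)
```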
-
-
class
swh.model.model.
Timestamp
(seconds: int, microseconds: int)[source]¶ Bases:
swh.model.model.BaseModel
Represents a naive timestamp from a VCS.
-
object_type
: typing_extensions.Final = 'timestamp'¶
-
seconds
¶
-
microseconds
¶
-
-
class
swh.model.model.
TimestampWithTimezone
(timestamp: swh.model.model.Timestamp, offset: int, negative_utc: bool)[source]¶ Bases:
swh.model.model.BaseModel
Represents a TZ-aware timestamp from a VCS.
-
object_type
: typing_extensions.Final = 'timestamp_with_timezone'¶
-
timestamp
¶
-
offset
¶
-
negative_utc
¶
-
check_offset
(attribute, value)[source]¶ Checks that the offset is a 16-bit signed integer (in theory, it should always be between -14 and +14 hours).
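The range check itself is small. The sketch below is a standalone function rather than an attrs validator, and it assumes the offset is expressed in minutes (so ±14 hours is ±840), which the text above does not state.

```python
def check_offset(offset: int) -> None:
    """Raise ValueError unless offset fits in a 16-bit signed integer."""
    # A 16-bit signed integer covers -32768 to 32767 inclusive.
    if not -(2**15) <= offset < 2**15:
        raise ValueError(f"offset {offset} does not fit in 16 signed bits")
```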
-
-
class
swh.model.model.
Origin
(url: str)[source]¶ Bases:
swh.model.model.BaseModel
Represents a software source: a VCS and a URL.
-
object_type
: typing_extensions.Final = 'origin'¶
-
url
¶
-
unique_key
() → Union[Dict[str, str], Dict[str, bytes], bytes][source]¶ Returns a unique key for this object that can be used for deduplication.
-
swhid
() → swh.model.identifiers.ExtendedSWHID[source]¶ Returns a SWHID representing this origin.
-
-
class
swh.model.model.
OriginVisit
(origin: str, date: datetime.datetime, type: str, visit: Optional[int] = None)[source]¶ Bases:
swh.model.model.BaseModel
Represents an origin visit with a given type at a given point in time, by a SWH loader.
-
object_type
: typing_extensions.Final = 'origin_visit'¶
-
origin
¶
-
date
¶
-
type
¶ Should not be set before calling ‘origin_visit_add()’.
-
visit
¶
-
-
class
swh.model.model.
OriginVisitStatus
(origin: str, visit: int, date: datetime.datetime, status: str, snapshot: Optional[bytes], type: Optional[str] = None, metadata=None)[source]¶ Bases:
swh.model.model.BaseModel
Represents a visit update of an origin at a given point in time.
-
object_type
: typing_extensions.Final = 'origin_visit_status'¶
-
origin
¶
-
visit
¶
-
date
¶
-
status
¶
-
snapshot
¶
-
type
¶
-
metadata
¶
-
-
class
swh.model.model.
TargetType
(value)[source]¶ Bases:
enum.Enum
The type of content pointed to by a snapshot branch. Usually a revision or an alias.
-
CONTENT
= 'content'¶
-
DIRECTORY
= 'directory'¶
-
REVISION
= 'revision'¶
-
RELEASE
= 'release'¶
-
SNAPSHOT
= 'snapshot'¶
-
ALIAS
= 'alias'¶
-
-
class
swh.model.model.
ObjectType
(value)[source]¶ Bases:
enum.Enum
The type of content pointed to by a release. Usually a revision.
-
CONTENT
= 'content'¶
-
DIRECTORY
= 'directory'¶
-
REVISION
= 'revision'¶
-
RELEASE
= 'release'¶
-
SNAPSHOT
= 'snapshot'¶
-
-
class
swh.model.model.
SnapshotBranch
(target: bytes, target_type: swh.model.model.TargetType)[source]¶ Bases:
swh.model.model.BaseModel
Represents one of the branches of a snapshot.
-
object_type
: typing_extensions.Final = 'snapshot_branch'¶
-
target
¶
-
target_type
¶
-
-
class
swh.model.model.
Snapshot
(branches, id: bytes = b'')[source]¶ Bases:
swh.model.model.HashableObject, swh.model.model.BaseModel
Represents the full state of an origin at a given point in time.
-
object_type
: typing_extensions.Final = 'snapshot'¶
-
branches
¶
-
id
¶
-
compute_hash
() → bytes[source]¶ Derived model classes must implement this to compute the object hash.
This method is called by the object initialization if the id attribute is set to an empty value.
-
classmethod
from_dict
(d)[source]¶ Takes a dictionary representing a tree of SWH objects, and recursively builds the corresponding objects.
-
swhid
() → swh.model.identifiers.CoreSWHID[source]¶ Returns a SWHID representing this object.
-
-
class
swh.model.model.
Release
(name: bytes, message: Optional[bytes], target: Optional[bytes], target_type: swh.model.model.ObjectType, synthetic: bool, author: Optional[swh.model.model.Person] = None, date: Optional[swh.model.model.TimestampWithTimezone] = None, metadata=None, id: bytes = b'')[source]¶ Bases:
swh.model.model.HashableObject, swh.model.model.BaseModel
-
object_type
: typing_extensions.Final = 'release'¶
-
name
¶
-
message
¶
-
target
¶
-
target_type
¶
-
synthetic
¶
-
author
¶
-
date
¶
-
metadata
¶
-
id
¶
-
compute_hash
() → bytes[source]¶ Derived model classes must implement this to compute the object hash.
This method is called by the object initialization if the id attribute is set to an empty value.
-
check_author
(attribute, value)[source]¶ If the author is None, checks that the date is None too.
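The author/date consistency rule can be sketched as a plain function. The name `check_author_and_date`, its two-argument form, and the error message are assumptions of this example.

```python
from typing import Optional


def check_author_and_date(author: Optional[object], date: Optional[object]) -> None:
    """Reject a date that comes without an author (hypothetical helper)."""
    if author is None and date is not None:
        raise ValueError("date must be None when author is None")
```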
-
to_dict
()[source]¶ Wrapper of attr.asdict that can be overridden by subclasses that have special handling of some of the fields.
-
classmethod
from_dict
(d)[source]¶ Takes a dictionary representing a tree of SWH objects, and recursively builds the corresponding objects.
-
swhid
() → swh.model.identifiers.CoreSWHID[source]¶ Returns a SWHID representing this object.
-
anonymize
() → swh.model.model.Release[source]¶ Returns an anonymized version of the Release object.
Anonymization consists in replacing the author with an anonymized Person object.
-
-
class
swh.model.model.
RevisionType
(value)[source]¶ Bases:
enum.Enum
An enumeration of the supported revision types.
-
GIT
= 'git'¶
-
TAR
= 'tar'¶
-
DSC
= 'dsc'¶
-
SUBVERSION
= 'svn'¶
-
MERCURIAL
= 'hg'¶
-
-
class
swh.model.model.
Revision
(message: Optional[bytes], author: swh.model.model.Person, committer: swh.model.model.Person, date: Optional[swh.model.model.TimestampWithTimezone], committer_date: Optional[swh.model.model.TimestampWithTimezone], type: swh.model.model.RevisionType, directory: bytes, synthetic: bool, metadata=None, parents: Tuple[bytes, …] = (), id: bytes = b'', extra_headers=())[source]¶ Bases:
swh.model.model.HashableObject, swh.model.model.BaseModel
-
object_type
: typing_extensions.Final = 'revision'¶
-
message
¶
-
author
¶
-
committer
¶
-
date
¶
-
committer_date
¶
-
type
¶
-
directory
¶
-
synthetic
¶
-
metadata
¶
-
parents
¶
-
id
¶
-
extra_headers
¶
-
compute_hash
() → bytes[source]¶ Derived model classes must implement this to compute the object hash.
This method is called by the object initialization if the id attribute is set to an empty value.
-
classmethod
from_dict
(d)[source]¶ Takes a dictionary representing a tree of SWH objects, and recursively builds the corresponding objects.
-
swhid
() → swh.model.identifiers.CoreSWHID[source]¶ Returns a SWHID representing this object.
-
anonymize
() → swh.model.model.Revision[source]¶ Returns an anonymized version of the Revision object.
Anonymization consists in replacing the author and committer with an anonymized Person object.
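Because model objects are immutable, anonymizing means building a new object with the person fields swapped out. The sketch below shows that pattern with frozen dataclasses and `dataclasses.replace`. The `Sketch*` classes are hypothetical, and hashing the fullname with SHA-256 is an assumption of this example.

```python
import hashlib
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchPerson:
    fullname: bytes
    name: Optional[bytes] = None
    email: Optional[bytes] = None

    def anonymize(self) -> "SketchPerson":
        # Assumption: SHA-256 digest of the fullname, name and email unset.
        return SketchPerson(fullname=hashlib.sha256(self.fullname).digest())


@dataclass(frozen=True)
class SketchRevision:
    message: bytes
    author: SketchPerson
    committer: SketchPerson

    def anonymize(self) -> "SketchRevision":
        # Only the two person fields change; everything else is copied over.
        return replace(
            self,
            author=self.author.anonymize(),
            committer=self.committer.anonymize(),
        )
```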
-
-
class
swh.model.model.
DirectoryEntry
(name: bytes, type: str, target: bytes, perms: int)[source]¶ Bases:
swh.model.model.BaseModel
-
object_type
: typing_extensions.Final = 'directory_entry'¶
-
name
¶
-
type
¶
-
target
¶
-
perms
¶ Usually one of the values of swh.model.from_disk.DentryPerms.
-
-
class
swh.model.model.
Directory
(entries: Tuple[swh.model.model.DirectoryEntry, …], id: bytes = b'')[source]¶ Bases:
swh.model.model.HashableObject, swh.model.model.BaseModel
-
object_type
: typing_extensions.Final = 'directory'¶
-
entries
¶
-
id
¶
-
compute_hash
() → bytes[source]¶ Derived model classes must implement this to compute the object hash.
This method is called by the object initialization if the id attribute is set to an empty value.
-
classmethod
from_dict
(d)[source]¶ Takes a dictionary representing a tree of SWH objects, and recursively builds the corresponding objects.
-
swhid
() → swh.model.identifiers.CoreSWHID[source]¶ Returns a SWHID representing this object.
-
-
class
swh.model.model.
BaseContent
(status: str)[source]¶ Bases:
swh.model.model.BaseModel
-
status
¶
-
-
class
swh.model.model.
Content
(sha1: bytes, sha1_git: bytes, sha256: bytes, blake2s256: bytes, length: int, status: str = 'visible', data: Optional[bytes] = None, ctime: Optional[datetime.datetime] = None)[source]¶ Bases:
swh.model.model.BaseContent
-
object_type
: typing_extensions.Final = 'content'¶
-
sha1
¶
-
sha1_git
¶
-
sha256
¶
-
blake2s256
¶
-
length
¶
-
status
¶
-
data
¶
-
ctime
¶
-
to_dict
()[source]¶ Wrapper of attr.asdict that can be overridden by subclasses that have special handling of some of the fields.
-
classmethod
from_data
(data, status='visible', ctime=None) → swh.model.model.Content[source]¶ Generate a Content from a given data byte string.
This populates the Content with the hashes and length for the data passed as argument, as well as the data itself.
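The four hashes and the length can be reproduced with `hashlib` alone. This sketch assumes that `sha1_git` follows Git's blob hashing convention (SHA-1 over a `blob <length>\0` header plus the data) and that `blake2s256` is BLAKE2s with a 32-byte digest. `content_hashes` is a hypothetical helper returning a plain dictionary, not a Content object.

```python
import hashlib
from typing import Dict, Union


def content_hashes(data: bytes) -> Dict[str, Union[bytes, int]]:
    """Compute the hash fields and length for a byte string (hypothetical helper)."""
    # Assumption: sha1_git uses Git's blob header convention.
    git_header = b"blob %d\x00" % len(data)
    return {
        "sha1": hashlib.sha1(data).digest(),
        "sha1_git": hashlib.sha1(git_header + data).digest(),
        "sha256": hashlib.sha256(data).digest(),
        "blake2s256": hashlib.blake2s(data, digest_size=32).digest(),
        "length": len(data),
        "data": data,
    }
```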
-
classmethod
from_dict
(d)[source]¶ Takes a dictionary representing a tree of SWH objects, and recursively builds the corresponding objects.
-
with_data
() → swh.model.model.Content[source]¶ Loads the data attribute, so that it is guaranteed not to be None after this call.
This call is almost a no-op, but subclasses may override this method to lazy-load data (e.g. from disk or objstorage).
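A lazy-loading subclass might look like the sketch below. All names (`SketchContent`, `LazySketchContent`, `SketchMissingData`, the `loader` field) are hypothetical. The behavior shown is an assumption derived from the descriptions of with_data and MissingData: the exception signals that there is no way to fetch the data, while errors raised by the fetch itself propagate unchanged.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


class SketchMissingData(Exception):
    """Stand-in for MissingData: there is no way to fetch the data."""


@dataclass(frozen=True)
class SketchContent:
    sha1: bytes
    data: Optional[bytes] = None

    def with_data(self) -> "SketchContent":
        # Base behavior: nothing to load, so the data must already be there.
        if self.data is None:
            raise SketchMissingData("content has no data and no loader")
        return self


@dataclass(frozen=True)
class LazySketchContent(SketchContent):
    # Hypothetical hook: maps a sha1 to the content bytes.
    loader: Optional[Callable[[bytes], bytes]] = None

    def with_data(self) -> "SketchContent":
        if self.data is not None:
            return self
        if self.loader is None:
            raise SketchMissingData("content has no data and no loader")
        # Errors raised by the loader itself are deliberately not caught.
        return replace(self, data=self.loader(self.sha1))
```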
-
unique_key
() → Union[Dict[str, str], Dict[str, bytes], bytes][source]¶ Returns a unique key for this object that can be used for deduplication.
-
swhid
() → swh.model.identifiers.CoreSWHID[source]¶ Returns a SWHID representing this object.
-
-
class
swh.model.model.
SkippedContent
(sha1: Optional[bytes], sha1_git: Optional[bytes], sha256: Optional[bytes], blake2s256: Optional[bytes], length: Optional[int], status: str, reason: Optional[str] = None, origin: Optional[str] = None, ctime: Optional[datetime.datetime] = None)[source]¶ Bases:
swh.model.model.BaseContent
-
object_type
: typing_extensions.Final = 'skipped_content'¶
-
sha1
¶
-
sha1_git
¶
-
sha256
¶
-
blake2s256
¶
-
length
¶
-
status
¶
-
reason
¶
-
origin
¶
-
ctime
¶
-
to_dict
()[source]¶ Wrapper of attr.asdict that can be overridden by subclasses that have special handling of some of the fields.
-
classmethod
from_data
(data: bytes, reason: str, ctime: Optional[datetime.datetime] = None) → swh.model.model.SkippedContent[source]¶ Generate a SkippedContent from a given data byte string.
This populates the SkippedContent with the hashes and length for the data passed as argument.
You can use attr.evolve on such a generated content to nullify some of its attributes, e.g. for tests.
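The nullify-after-generation pattern is sketched below with standard-library dataclasses, where `dataclasses.replace` plays the role that attr.evolve plays for attrs classes. `SketchSkippedContent` is a hypothetical, reduced stand-in, and the status value "absent" is an assumption of this example.

```python
import hashlib
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchSkippedContent:
    sha1: Optional[bytes]
    sha256: Optional[bytes]
    length: Optional[int]
    status: str
    reason: Optional[str] = None

    @classmethod
    def from_data(cls, data: bytes, reason: str) -> "SketchSkippedContent":
        # Hashes and length are derived from the data; the data is not kept.
        return cls(
            sha1=hashlib.sha1(data).digest(),
            sha256=hashlib.sha256(data).digest(),
            length=len(data),
            status="absent",  # assumed status value
            reason=reason,
        )


# Start from a fully populated object, then blank out one attribute.
full = SketchSkippedContent.from_data(b"some bytes", reason="too large")
partial = replace(full, sha256=None)
```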
-
-
class
swh.model.model.
MetadataAuthorityType
(value)[source]¶ Bases:
enum.Enum
An enumeration of the supported metadata authority types.
-
DEPOSIT_CLIENT
= 'deposit_client'¶
-
FORGE
= 'forge'¶
-
REGISTRY
= 'registry'¶
-
-
class
swh.model.model.
MetadataAuthority
(type: swh.model.model.MetadataAuthorityType, url: str, metadata=None)[source]¶ Bases:
swh.model.model.BaseModel
Represents an entity that provides metadata about an origin or software artifact.
-
object_type
: typing_extensions.Final = 'metadata_authority'¶
-
type
¶
-
url
¶
-
metadata
¶
-
to_dict
()[source]¶ Wrapper of attr.asdict that can be overridden by subclasses that have special handling of some of the fields.
-
-
class
swh.model.model.
MetadataFetcher
(name: str, version: str, metadata=None)[source]¶ Bases:
swh.model.model.BaseModel
Represents a software component used to fetch metadata from a metadata authority and ingest it into the Software Heritage archive.
-
object_type
: typing_extensions.Final = 'metadata_fetcher'¶
-
name
¶
-
version
¶
-
metadata
¶
-
-
class
swh.model.model.
RawExtrinsicMetadata
(target: swh.model.identifiers.ExtendedSWHID, discovery_date: datetime.datetime, authority: swh.model.model.MetadataAuthority, fetcher: swh.model.model.MetadataFetcher, format: str, metadata: bytes, origin: Optional[str] = None, visit: Optional[int] = None, snapshot: Optional[swh.model.identifiers.CoreSWHID] = None, release: Optional[swh.model.identifiers.CoreSWHID] = None, revision: Optional[swh.model.identifiers.CoreSWHID] = None, path: Optional[bytes] = None, directory: Optional[swh.model.identifiers.CoreSWHID] = None, id: bytes = b'')[source]¶ Bases:
swh.model.model.HashableObject, swh.model.model.BaseModel
-
object_type
: typing_extensions.Final = 'raw_extrinsic_metadata'¶
-
target
¶
-
discovery_date
¶
-
authority
¶
-
fetcher
¶
-
format
¶
-
metadata
¶
-
origin
¶
-
visit
¶
-
snapshot
¶
-
release
¶
-
revision
¶
-
path
¶
-
directory
¶
-
id
¶
-
compute_hash
() → bytes[source]¶ Derived model classes must implement this to compute the object hash.
This method is called by the object initialization if the id attribute is set to an empty value.
-