swh.fuse.fs.artifact module#

class swh.fuse.fs.artifact.Content(name: str, mode: int, depth: int, fuse: Fuse, swhid: CoreSWHID, prefetch: Any = None)[source]#

Bases: FuseFileEntry

Software Heritage content artifact.

Content leaves (AKA blobs) are represented on disk as regular files containing the corresponding bytes, as archived.

Note that permissions are associated with blobs only in the context of directories. Hence, when accessing blobs from the top-level archive/ directory, the permissions of the archive/SWHID file will be arbitrary and not meaningful (e.g., 0o644).

swhid: CoreSWHID#
prefetch: Any = None#

optional prefetched metadata used to set entry attributes

async get_content() bytes[source]#

Return the content of a file entry

async size() int[source]#

Return the size (in bytes) of an entry

class swh.fuse.fs.artifact.Directory(name: str, mode: int, depth: int, fuse: Fuse, swhid: CoreSWHID)[source]#

Bases: FuseDirEntry

Software Heritage directory artifact.

Directory nodes are represented as directories on the file-system, containing one entry for each entry of the archived directory. Entry names and other metadata, including permissions, will correspond to the archived entry metadata.

Note that the FUSE mount is read-only, no matter what the permissions say. A file may therefore be presented as writable in the context of a directory, yet actually writing to it will fail with EPERM.
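Since the permission bits cannot be trusted to predict whether a write succeeds, a caller may prefer to attempt the write and inspect the error. The following is a minimal sketch, not part of swh.fuse; the helper name try_write is hypothetical, and the exact errno reported by a given mount is whatever the kernel returns.

```python
import errno
from typing import Optional


def try_write(path: str, data: bytes = b"x") -> Optional[str]:
    """Try to append to `path`; return the errno name on failure, None on success."""
    try:
        with open(path, "ab") as f:
            f.write(data)
    except OSError as exc:
        # On a read-only mount this branch is expected to be taken,
        # whatever permission bits `ls -l` displays for the file.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

Calling try_write on a file under the mountpoint is expected to return an error name rather than None.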

swhid: CoreSWHID#
async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry

class swh.fuse.fs.artifact.Revision(name: str, mode: int, depth: int, fuse: Fuse, swhid: CoreSWHID)[source]#

Bases: FuseDirEntry

Software Heritage revision artifact.

Revision (AKA commit) nodes are represented on the file-system as directories with the following entries:

  • root: source tree at the time of the commit, as a symlink pointing into archive/, to a SWHID of type dir

  • parents/ (note the plural): a virtual directory containing entries named 1, 2, 3, etc., one for each parent commit. Each of these entries is a symlink pointing into archive/, to the SWHID file for the given parent commit

  • parent (note the singular): present if and only if the current commit has at least one parent commit (which is the most common case). When present it is a symlink pointing into parents/1/

  • history: a virtual directory listing all of the revision's ancestors, sorted in reverse topological order. The history can be listed through by-date/, by-hash/ or by-page/, each with its own sharding policy.

  • meta.json: metadata for the current node, as a symlink pointing to the relevant archive/<SWHID>.json file
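The layout above can be explored with ordinary filesystem calls. The sketch below is illustrative only: describe_revision is a hypothetical helper, and it assumes rev_dir is the path of a revision directory under an existing mount.

```python
import os
from pathlib import Path
from typing import Dict, List


def describe_revision(rev_dir: Path) -> Dict[str, object]:
    """Summarise the root, parents/ and parent entries of a revision directory."""
    parents_dir = rev_dir / "parents"
    parent_names: List[str] = []
    if parents_dir.is_dir():
        # Entries are named 1, 2, 3, ...: sort them numerically, not lexically.
        parent_names = sorted((p.name for p in parents_dir.iterdir()), key=int)
    return {
        # Symlink target of root, pointing at a SWHID of type dir
        "root": os.readlink(rev_dir / "root"),
        # One symlink target per parent commit
        "parents": [os.readlink(parents_dir / n) for n in parent_names],
        # parent (singular) exists only when there is at least one parent
        "has_parent": (rev_dir / "parent").is_symlink(),
    }
```

The numeric sort matters once a commit has ten or more parents, since a lexical sort would place 10 before 2.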

swhid: CoreSWHID#
async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry

class swh.fuse.fs.artifact.RevisionParents(name: str, mode: int, depth: int, fuse: Fuse, parents: List[CoreSWHID])[source]#

Bases: FuseDirEntry

Revision virtual parents/ directory

parents: List[CoreSWHID]#
async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry

class swh.fuse.fs.artifact.RevisionHistory(name: str, mode: int, depth: int, fuse: Fuse, swhid: CoreSWHID)[source]#

Bases: FuseDirEntry

Revision virtual history/ directory

swhid: CoreSWHID#
async prefill_by_date_cache(by_date_dir: FuseDirEntry) None[source]#
async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry

class swh.fuse.fs.artifact.RevisionHistoryShardByDate(name: str, mode: int, depth: int, fuse: Fuse, history_swhid: CoreSWHID, prefix: str = '', is_status_done: bool = False)[source]#

Bases: FuseDirEntry

Revision virtual history/by-date sharded directory

history_swhid: CoreSWHID#
prefix: str = ''#
is_status_done: bool = False#
DATE_FMT = '{year:04d}/{month:02d}/{day:02d}/'#
ENTRIES_REGEXP: Pattern | None = re.compile('^([0-9]{2,4})|(swh:1:(cnt|dir|rel|rev|snp):[0-9a-f]{40})$')#
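As an illustration of how these two attributes relate, the sketch below formats a date with DATE_FMT and checks the resulting path components against ENTRIES_REGEXP. The values are copied from the attributes above; the sample date and SWHID are made up, and how the class itself applies the pattern is not shown here.

```python
import re

# Values copied from the class attributes documented above.
DATE_FMT = "{year:04d}/{month:02d}/{day:02d}/"
ENTRIES_REGEXP = re.compile(
    "^([0-9]{2,4})|(swh:1:(cnt|dir|rel|rev|snp):[0-9a-f]{40})$"
)

# A date becomes a three-level shard path: year/month/day/
shard_path = DATE_FMT.format(year=2020, month=3, day=7)
components = [c for c in shard_path.split("/") if c]

# Each component is a 2- or 4-digit number, so it matches the first alternative.
all_match = all(ENTRIES_REGEXP.match(c) for c in components)

# A full SWHID matches through the second alternative.
swhid = "swh:1:rev:" + "a" * 40  # made-up identifier
swhid_match = ENTRIES_REGEXP.match(swhid) is not None

# Because | has the lowest precedence, ^ binds to the first alternative only
# and $ to the second only, so match() also accepts a digit prefix.
prefix_only = ENTRIES_REGEXP.match("2020abc") is not None
```

Here shard_path is "2020/03/07/", and all three checks evaluate to True.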
class StatusFile(depth: int, fuse: Fuse, history_swhid: CoreSWHID)[source]#

Bases: FuseFileEntry

Temporary file used to indicate loading progress in by-date/

name: str = '.status'#

entry filename

mode: int = 33060#

entry permission mode
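The decimal mode value is easier to read once decoded. A short sketch using only the standard stat module:

```python
import stat

MODE = 33060  # value of StatusFile.mode as documented above

octal = oct(MODE)                # octal form of the full mode
is_regular = stat.S_ISREG(MODE)  # file-type bits say "regular file"
perms = stat.S_IMODE(MODE)       # permission bits only
listing = stat.filemode(MODE)    # ls-style rendering
```

That is, 33060 is 0o100444: a regular file readable by everyone and writable by no one, rendered as -r--r--r--.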

history_swhid: CoreSWHID#
async get_content() bytes[source]#

Return the content of a file entry

async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry

class swh.fuse.fs.artifact.RevisionHistoryShardByHash(name: str, mode: int, depth: int, fuse: Fuse, history_swhid: CoreSWHID, prefix: str = '')[source]#

Bases: FuseDirEntry

Revision virtual history/by-hash sharded directory

history_swhid: CoreSWHID#
prefix: str = ''#
SHARDING_LENGTH = 2#
ENTRIES_REGEXP: Pattern | None = re.compile('^([a-f0-9]+)|(swh:1:(cnt|dir|rel|rev|snp):[0-9a-f]{40})$')#
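A sharding policy of this kind can be sketched as grouping identifiers by the leading characters of their hash. The function below is an assumption-laden illustration, not the class's implementation: shard_by_hash is a hypothetical name, and it assumes the shard key is the first SHARDING_LENGTH hex digits of the SWHID's hash.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

SHARDING_LENGTH = 2  # documented class attribute


def shard_by_hash(swhids: Iterable[str]) -> Dict[str, List[str]]:
    """Group SWHID strings by the first SHARDING_LENGTH hex digits of their hash."""
    shards: Dict[str, List[str]] = defaultdict(list)
    for swhid in swhids:
        # A core SWHID has the form swh:1:<type>:<40 hex digits>
        object_hash = swhid.rsplit(":", 1)[-1]
        shards[object_hash[:SHARDING_LENGTH]].append(swhid)
    return dict(shards)
```

With a length of 2 there are at most 256 shard directories, which keeps each listing short even for long histories.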
async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry

class swh.fuse.fs.artifact.RevisionHistoryShardByPage(name: str, mode: int, depth: int, fuse: Fuse, history_swhid: CoreSWHID, prefix: int | None = None)[source]#

Bases: FuseDirEntry

Revision virtual history/by-page sharded directory

history_swhid: CoreSWHID#
prefix: int | None = None#
PAGE_SIZE = 10000#
PAGE_FMT = '{page_number:03d}'#
ENTRIES_REGEXP: Pattern | None = re.compile('^([0-9]+)|(swh:1:(cnt|dir|rel|rev|snp):[0-9a-f]{40})$')#
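To make the page naming concrete, the sketch below maps a position in the history to a page directory name. It assumes zero-based page numbering, which the attributes above do not state; page_name is a hypothetical helper.

```python
PAGE_SIZE = 10000               # documented class attribute
PAGE_FMT = "{page_number:03d}"  # documented class attribute


def page_name(index: int) -> str:
    """Return the page directory name for the revision at position `index`.

    Assumes pages are numbered from 0 (an assumption of this sketch).
    """
    return PAGE_FMT.format(page_number=index // PAGE_SIZE)
```

The 03d format pads to at least three digits, so page_name(0) gives "000" and page_name(10000) gives "001"; longer numbers are not truncated.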
async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry

class swh.fuse.fs.artifact.Release(name: str, mode: int, depth: int, fuse: Fuse, swhid: CoreSWHID)[source]#

Bases: FuseDirEntry

Software Heritage release artifact.

Release nodes are represented on the file-system as directories with the following entries:

  • target: target node, as a symlink to archive/<SWHID>

  • target_type: regular file containing the type of the target SWHID

  • root: present if and only if the release points to something that (transitively) resolves to a directory. When present it is a symlink pointing into archive/ to the SWHID of the given directory

  • meta.json: metadata for the current node, as a symlink pointing to the relevant archive/<SWHID>.json file
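The "transitively resolves to a directory" condition for root can be sketched as following targets until a directory is reached. This is an illustration over an in-memory mapping, not the behaviour of find_root_directory itself; resolve_root and the targets mapping are hypothetical stand-ins for archive lookups.

```python
from typing import Dict, Optional


def resolve_root(swhid: str, targets: Dict[str, str]) -> Optional[str]:
    """Follow release/revision targets until a directory SWHID is found.

    `targets` maps a release to its target and a revision to its root
    directory (a made-up stand-in for querying the archive).
    """
    seen = set()
    while swhid not in seen:
        seen.add(swhid)
        object_type = swhid.split(":")[2]
        if object_type == "dir":
            return swhid
        if object_type not in ("rel", "rev"):
            return None  # content or snapshot: nothing to use as root
        next_swhid = targets.get(swhid)
        if next_swhid is None:
            return None  # unknown target
        swhid = next_swhid
    return None  # guard against a cyclic mapping
```

A release that points at a revision resolves to that revision's directory, while a release that points at a content object yields None, in which case no root entry would be shown.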

swhid: CoreSWHID#
async find_root_directory(swhid: CoreSWHID) CoreSWHID | None[source]#
async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry

class swh.fuse.fs.artifact.ReleaseType(name: str, mode: int, depth: int, fuse: Fuse, target_type: ObjectType)[source]#

Bases: FuseFileEntry

Release type virtual file

target_type: ObjectType#
async get_content() bytes[source]#

Return the content of a file entry

class swh.fuse.fs.artifact.Snapshot(name: str, mode: int, depth: int, fuse: Fuse, swhid: CoreSWHID, prefix: str = '')[source]#

Bases: FuseDirEntry

Software Heritage snapshot artifact.

Snapshot nodes are represented on the file-system as nested directories following the structure of the branch names. For example, a branch named refs/tags/v1.0 will be represented as a refs directory containing a tags directory containing a v1.0 symlink pointing to the branch target SWHID.
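The mapping from branch names to directories can be sketched as building a nested dictionary by splitting each name on "/". This is a simplified illustration: branches_to_tree is a hypothetical helper, the targets are placeholder strings, and it does not handle a branch name that is also a prefix of another.

```python
from typing import Any, Dict


def branches_to_tree(branches: Dict[str, str]) -> Dict[str, Any]:
    """Turn {branch name: target SWHID} into nested dicts, one level per '/'."""
    tree: Dict[str, Any] = {}
    for name, target in branches.items():
        *dirs, leaf = name.split("/")
        node = tree
        for d in dirs:
            # Intermediate components become directories
            node = node.setdefault(d, {})
        # The last component is the symlink to the branch target
        node[leaf] = target
    return tree
```

For refs/tags/v1.0, this yields a refs dictionary holding a tags dictionary holding a v1.0 leaf, mirroring the directory layout described above.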

swhid: CoreSWHID#
prefix: str = ''#
async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry

class swh.fuse.fs.artifact.Origin(name: str, mode: int, depth: int, fuse: Fuse)[source]#

Bases: FuseDirEntry

Software Heritage origin artifact.

Origin nodes are represented on the file-system as directories with one entry for each origin visit.

Visit directories are named after the visit date (YYYY-MM-DD); if multiple visits occur on the same day, only the first one is kept. Each visit directory contains a meta.json file with the associated metadata for the origin node and, potentially, a snapshot symlink pointing to the visit’s snapshot node.

DATE_FMT = '{year:04d}-{month:02d}-{day:02d}'#
ENTRIES_REGEXP: Pattern | None = re.compile('^[0-9]{4}-[0-9]{2}-[0-9]{2}$')#
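The naming rule for visit directories can be sketched as follows. The helper visit_dir_names is hypothetical, and "first" is taken here to mean first in iteration order, which is an assumption of this sketch.

```python
import re
from datetime import datetime
from typing import Dict, Iterable

# Values copied from the class attributes documented above.
DATE_FMT = "{year:04d}-{month:02d}-{day:02d}"
ENTRIES_REGEXP = re.compile("^[0-9]{4}-[0-9]{2}-[0-9]{2}$")


def visit_dir_names(visit_dates: Iterable[datetime]) -> Dict[str, datetime]:
    """Map each directory name to the first visit seen for that day."""
    kept: Dict[str, datetime] = {}
    for date in visit_dates:
        name = DATE_FMT.format(year=date.year, month=date.month, day=date.day)
        # setdefault leaves an existing entry alone, so later visits
        # on the same day are dropped.
        kept.setdefault(name, date)
    return kept
```

Every name produced this way matches ENTRIES_REGEXP, since the format always yields four, two and two digits for years up to 9999.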
async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry

class swh.fuse.fs.artifact.OriginVisit(name: str, mode: int, depth: int, fuse: Fuse, meta: Dict[str, Any])[source]#

Bases: FuseDirEntry

Origin visit virtual directory

meta: Dict[str, Any]#
class MetaFile(name: str, mode: int, depth: int, fuse: Fuse, content: str)[source]#

Bases: FuseFileEntry

content: str#
async get_content() bytes[source]#

Return the content of a file entry

async compute_entries() AsyncIterator[FuseEntry][source]#

Return the child entries of a directory entry