swh.web.api.views package

Submodules

swh.web.api.views.content module

swh.web.api.views.content.api_content_filetype(request, q)[source]
GET /api/1/content/[(hash_type):](hash)/filetype/

Get information about the detected MIME type of a content object.

Parameters:
  • hash_type (string) – optional parameter specifying which hashing algorithm has been used to compute the content checksum. It can be either sha1, sha1_git, sha256 or blake2s256. If that parameter is not provided, it is assumed that the hashing algorithm used is sha1.
  • hash (string) – hexadecimal representation of the checksum value computed with the specified hashing algorithm.
Response JSON Object:
 
  • content_url (object) – link to GET /api/1/content/[(hash_type):](hash)/ for getting information about the content
  • encoding (string) – the detected content encoding
  • id (string) – the sha1 identifier of the content
  • mimetype (string) – the detected MIME type of the content
  • tool (object) – information about the tool used to detect the content filetype
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/content/sha1:dc2830a9e72f23c1dfebef4413003221baa5fb62/filetype/
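
The [(hash_type):](hash) part of the content endpoints can be assembled mechanically. The following is a minimal sketch, assuming a hypothetical helper named content_url and a facet argument (neither is part of swh.web); it only builds the URL string and performs no request.

```python
API_ROOT = "https://archive.softwareheritage.org/api/1"

# Hash algorithms listed in the parameter description above.
HASH_TYPES = ("sha1", "sha1_git", "sha256", "blake2s256")


def content_url(checksum, hash_type=None, facet=""):
    """Build a content endpoint URL (hypothetical helper, not part of swh.web).

    facet is "" for the content itself, or one of "filetype", "language",
    "license" or "raw" for the sub-endpoints.
    """
    if hash_type is not None and hash_type not in HASH_TYPES:
        raise ValueError(f"unsupported hash type: {hash_type}")
    # When the prefix is omitted, the documentation says sha1 is assumed.
    query = checksum if hash_type is None else f"{hash_type}:{checksum}"
    url = f"{API_ROOT}/content/{query}/"
    if facet:
        url += f"{facet}/"
    return url
```

Calling content_url("dc2830a9e72f23c1dfebef4413003221baa5fb62", "sha1", "filetype") reproduces the example URL above.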
swh.web.api.views.content.api_content_language(request, q)[source]
GET /api/1/content/[(hash_type):](hash)/language/

Get information about the programming language used in a content object.

Note: this endpoint currently returns no data.

Parameters:
  • hash_type (string) – optional parameter specifying which hashing algorithm has been used to compute the content checksum. It can be either sha1, sha1_git, sha256 or blake2s256. If that parameter is not provided, it is assumed that the hashing algorithm used is sha1.
  • hash (string) – hexadecimal representation of the checksum value computed with the specified hashing algorithm.
Response JSON Object:
 
  • content_url (object) – link to GET /api/1/content/[(hash_type):](hash)/ for getting information about the content
  • id (string) – the sha1 identifier of the content
  • lang (string) – the detected programming language if any
  • tool (object) – information about the tool used to detect the programming language
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/content/sha1:dc2830a9e72f23c1dfebef4413003221baa5fb62/language/
swh.web.api.views.content.api_content_license(request, q)[source]
GET /api/1/content/[(hash_type):](hash)/license/

Get information about the license of a content object.

Parameters:
  • hash_type (string) – optional parameter specifying which hashing algorithm has been used to compute the content checksum. It can be either sha1, sha1_git, sha256 or blake2s256. If that parameter is not provided, it is assumed that the hashing algorithm used is sha1.
  • hash (string) – hexadecimal representation of the checksum value computed with the specified hashing algorithm.
Response JSON Object:
 
  • content_url (object) – link to GET /api/1/content/[(hash_type):](hash)/ for getting information about the content
  • id (string) – the sha1 identifier of the content
  • licenses (array) – array of strings containing the detected license names if any
  • tool (object) – information about the tool used to detect the license
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/content/sha1:dc2830a9e72f23c1dfebef4413003221baa5fb62/license/
swh.web.api.views.content.api_content_ctags(request, q)[source]

Get information about all Ctags-style symbols defined in a content object.

swh.web.api.views.content.api_content_raw(request, q)[source]
GET /api/1/content/[(hash_type):](hash)/raw/

Get the raw content of a content object (aka a “blob”), as a byte sequence.

Parameters:
  • hash_type (string) – optional parameter specifying which hashing algorithm has been used to compute the content checksum. It can be either sha1, sha1_git, sha256 or blake2s256. If that parameter is not provided, it is assumed that the hashing algorithm used is sha1.
  • hash (string) – hexadecimal representation of the checksum value computed with the specified hashing algorithm.
Query Parameters:
 
  • filename (string) – if provided, the downloaded content will get that filename
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/content/sha1:dc2830a9e72f23c1dfebef4413003221baa5fb62/raw/
swh.web.api.views.content.api_content_symbol(request, q=None)[source]

Search content objects by Ctags-style symbol (e.g., function name, data type, method, …).

swh.web.api.views.content.api_check_content_known(request, q=None)[source]
GET /api/1/content/known/(sha1)[,(sha1), ...,(sha1)]/

Check whether one or more contents (aka “blobs”) are present in the archive, based on their sha1 checksums.

Parameters:
  • sha1 (string) – hexadecimal representation of the sha1 checksum of the content whose existence should be checked. Multiple values can be provided, separated by ‘,’.
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 
  • search_res (array) – array holding the search result for each provided sha1
  • search_stats (object) – some statistics regarding the number of sha1 provided and the percentage of those found in the archive

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/content/known/dc2830a9e72f23c1dfebef4413003221baa5fb62,0c3f19cb47ebfbe643fb19fa94c874d18fa62d12/
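
Because several checksums are passed in a single path segment, it can help to validate and join them before issuing the request. This is a sketch only: known_url is a hypothetical helper, and the 40-hex-digit check is a local sanity test, not something the endpoint documentation specifies.

```python
import re

API_ROOT = "https://archive.softwareheritage.org/api/1"

# A sha1 digest is 20 bytes, i.e. 40 hexadecimal characters.
_SHA1_RE = re.compile(r"^[0-9a-f]{40}$")


def known_url(sha1s):
    """Build the /content/known/ URL for one or more sha1 checksums."""
    normalized = [s.lower() for s in sha1s]
    if not normalized:
        raise ValueError("at least one sha1 is required")
    for sha1 in normalized:
        if not _SHA1_RE.match(sha1):
            raise ValueError(f"not a hexadecimal sha1: {sha1}")
    # Multiple values are separated by ',' as described above.
    return f"{API_ROOT}/content/known/{','.join(normalized)}/"
```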
swh.web.api.views.content.api_content_metadata(request, q)[source]
GET /api/1/content/[(hash_type):](hash)/

Get information about a content (aka a “blob”) object. In the archive, a content object is identified based on checksum values computed using various hashing algorithms.

Parameters:
  • hash_type (string) – optional parameter specifying which hashing algorithm has been used to compute the content checksum. It can be either sha1, sha1_git, sha256 or blake2s256. If that parameter is not provided, it is assumed that the hashing algorithm used is sha1.
  • hash (string) – hexadecimal representation of the checksum value computed with the specified hashing algorithm.
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

curl -i https://archive.softwareheritage.org/api/1/content/sha1_git:fe95a46679d128ff167b7c55df5d02356c5a1ae1/
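
To look up a local file, its checksum has to be computed first. The sketch below uses only Python's hashlib. It assumes that sha1_git follows Git's blob hashing (SHA-1 over a "blob <length>\0" header plus the data) and that blake2s256 means BLAKE2s with a 32-byte digest; content_checksums is a hypothetical name.

```python
import hashlib


def content_checksums(data: bytes) -> dict:
    """Compute the four checksum flavours accepted as hash_type."""
    # Assumption: sha1_git is the Git blob identifier, i.e. the SHA-1 of
    # b"blob <length>\0" followed by the raw bytes.
    git_header = b"blob %d\x00" % len(data)
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha1_git": hashlib.sha1(git_header + data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        # Assumption: blake2s256 is BLAKE2s with a 256-bit (32-byte) digest.
        "blake2s256": hashlib.blake2s(data, digest_size=32).hexdigest(),
    }
```

Any of the returned values can then be placed after the matching hash_type prefix in the endpoint URL.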

swh.web.api.views.directory module

swh.web.api.views.directory.api_directory(request, sha1_git, path=None)[source]
GET /api/1/directory/(sha1_git)/[(path)/]

Get information about directory objects. Directories are identified by sha1 checksums, compatible with Git directory identifiers. See swh.model.identifiers.directory_identifier() in our data model module for details about how they are computed.

When given only a directory identifier, this endpoint returns information about the directory itself, including its content (usually a list of directory entries). When given a directory identifier and a path, this endpoint returns information about the directory entry pointed to by the relative path, starting path resolution from the given directory.

Parameters:
  • sha1_git (string) – hexadecimal representation of the directory sha1_git identifier
  • path (string) – optional parameter to get information about the directory entry pointed by that relative path
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Array of Objects:
 
  • checksums (object) – object holding the computed checksum values for a directory entry (only for file entries)
  • dir_id (string) – sha1_git identifier of the requested directory
  • length (number) – length of a directory entry in bytes (only for file entries)
  • name (string) – the directory entry name
  • perms (number) – permissions for the directory entry
  • target (string) – sha1_git identifier of the directory entry
  • target_url (string) – link to GET /api/1/content/[(hash_type):](hash)/ or GET /api/1/directory/(sha1_git)/[(path)/] depending on the directory entry type
  • type (string) – the type of the directory entry, can be either dir, file or rev

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/directory/977fc4b98c0e85816348cebd3b12026407c368b6/
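
Once decoded, the response is a list of entry objects that can be processed with ordinary Python. The following sketch groups entry names by type and sums file sizes; the function names are hypothetical, and it relies only on the name, type and length fields listed above.

```python
from collections import defaultdict


def group_entries_by_type(entries):
    """Map each entry type (dir, file or rev) to the names of its entries."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["type"]].append(entry["name"])
    return dict(grouped)


def total_file_bytes(entries):
    """Sum the length field, which is only present for file entries."""
    return sum(e["length"] for e in entries if e["type"] == "file")
```

For instance, with a made-up listing of one sub-directory and two files of 10 and 32 bytes, total_file_bytes returns 42.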

swh.web.api.views.identifiers module

swh.web.api.views.identifiers.api_resolve_swh_pid(request, swh_id)[source]
GET /api/1/resolve/(swh_id)/

Resolve a Software Heritage persistent identifier.

Try to resolve a provided persistent identifier into an url for browsing the pointed archive object. If the provided identifier is valid, the existence of the object in the archive will also be checked.

Parameters:
  • swh_id (string) – a Software Heritage persistent identifier
Response JSON Object:
 
  • browse_url (string) – the url for browsing the pointed object
  • metadata (object) – object holding optional parts of the persistent identifier
  • namespace (string) – the persistent identifier namespace
  • object_id (string) – the hash identifier of the pointed object
  • object_type (string) – the type of the pointed object
  • scheme_version (number) – the scheme version of the persistent identifier
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/resolve/swh:1:rev:96db9023b881d7cd9f379b0c154650d6c108e9a3;origin=https://github.com/openssl/openssl/
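
The identifier in the example can be split client-side into parts that mirror the response fields. This is an illustrative sketch, not the server's parser: parse_swh_pid is a hypothetical function, it assumes the layout namespace:version:type:hash optionally followed by ;key=value parts, and it returns the object type exactly as written in the identifier.

```python
def parse_swh_pid(swh_id):
    """Split a persistent identifier into its visible components."""
    core, *qualifiers = swh_id.split(";")
    parts = core.split(":")
    if len(parts) != 4:
        raise ValueError(f"unexpected identifier layout: {swh_id}")
    namespace, version, object_type, object_id = parts
    metadata = {}
    for qualifier in qualifiers:
        # Split on the first '=' only, since values may be URLs.
        key, sep, value = qualifier.partition("=")
        if not sep:
            raise ValueError(f"malformed optional part: {qualifier}")
        metadata[key] = value
    return {
        "namespace": namespace,
        "scheme_version": int(version),
        "object_type": object_type,
        "object_id": object_id,
        "metadata": metadata,
    }
```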

swh.web.api.views.origin module

swh.web.api.views.origin._enrich_origin(origin)[source]
swh.web.api.views.origin._enrich_origin_visit(origin_visit, *, with_origin_url, with_origin_visit_url)[source]
swh.web.api.views.origin.api_origins(request)[source]
GET /api/1/origins/

Get list of archived software origins.

Origins are sorted by ids before returning them.

Query Parameters:
 
  • origin_from (int) – the first origin id that will be included in the returned results (defaults to 1)
  • origin_count (int) – the maximum number of origins to return (defaults to 100, cannot exceed 10000)
Response JSON Array of Objects:
 
  • id (number) – the origin unique identifier
  • origin_visits_url (string) – link to GET /api/1/origin/(origin_id)/visits/ in order to get information about the visits for that origin
  • type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • url (string) – the origin canonical url
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
  • Content-Type – this depends on Accept header of request
  • Link – indicates that a subsequent or previous result page is available and contains the urls pointing to them

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origins?origin_from=50000&origin_count=500
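
Paginated endpoints advertise neighbouring pages through the Link response header. The sketch below extracts the URLs from such a header value; it assumes the usual <url>; rel="name" syntax of web linking, and the relation names used in the sample ("next", "previous") are assumptions rather than values confirmed by this page.

```python
import re

# Matches one '<url>; rel="name"' item of a Link header value.
_LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="?([^",;\s]+)"?')


def parse_link_header(value):
    """Return a mapping from relation name to URL for a Link header value."""
    return {rel: url for url, rel in _LINK_RE.findall(value)}
```

A client can keep following the URL stored under the relation for the subsequent page until the header no longer contains one.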
swh.web.api.views.origin.api_origin(request, origin_id=None, origin_type=None, origin_url=None)[source]
GET /api/1/origin/(origin_id)/

Get information about a software origin.

Parameters:
  • origin_id (int) – a software origin identifier
Response JSON Object:
 
  • id (number) – the origin unique identifier
  • origin_visits_url (string) – link to GET /api/1/origin/(origin_id)/visits/ in order to get information about the visits for that origin
  • type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • url (string) – the origin canonical url
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origin/1/
GET /api/1/origin/(origin_type)/url/(origin_url)/

Get information about a software origin.

Parameters:
  • origin_type (string) – the origin type (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the origin url
Response JSON Object:
 
  • id (number) – the origin unique identifier
  • origin_visits_url (string) – link to GET /api/1/origin/(origin_id)/visits/ in order to get information about the visits for that origin
  • type (string) – the type of software origin
  • url (string) – the origin canonical url
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origin/git/url/https://github.com/python/cpython/
GET /api/1/origin/search/(url_pattern)/

Search for software origins whose urls contain a provided string pattern or match a provided regular expression. The search is performed in a case insensitive way.

Parameters:
  • url_pattern (string) – a string pattern or a regular expression
Query Parameters:
 
  • offset (int) – the number of found origins to skip before returning results
  • limit (int) – the maximum number of found origins to return
  • regexp (boolean) – if true, consider provided pattern as a regular expression and search origins whose urls match it
  • with_visit (boolean) – if true, only return origins with at least one visit by Software Heritage
Response JSON Array of Objects:
 
  • id (number) – the origin unique identifier
  • origin_visits_url (string) – link to GET /api/1/origin/(origin_id)/visits/ in order to get information about the visits for that origin
  • type (string) – the type of software origin
  • url (string) – the origin canonical url
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origin/search/python/?limit=2
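
The search pattern goes in the path while the options go in the query string. A minimal sketch of building such a URL follows; origin_search_url is a hypothetical helper, and it assumes that booleans are sent as the literal strings true and false and that the server accepts a percent-encoded pattern.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://archive.softwareheritage.org/api/1"


def origin_search_url(url_pattern, *, offset=None, limit=None,
                      regexp=None, with_visit=None):
    """Build an origin search URL, including only the options that are set."""
    params = {}
    if offset is not None:
        params["offset"] = offset
    if limit is not None:
        params["limit"] = limit
    # Assumption: boolean options are spelled "true" / "false".
    if regexp is not None:
        params["regexp"] = "true" if regexp else "false"
    if with_visit is not None:
        params["with_visit"] = "true" if with_visit else "false"
    # Assumption: special characters in the pattern are percent-encoded.
    url = f"{API_ROOT}/origin/search/{quote(url_pattern, safe='')}/"
    if params:
        url += "?" + urlencode(params)
    return url
```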
GET /api/1/origin/metadata-search/

Search for software origins whose metadata (expressed as a JSON-LD/CodeMeta dictionary) match the provided criteria. For now, only full-text search on this dictionary is supported.

Query Parameters:
 
  • fulltext (str) – a string that will be matched against origin metadata; results are ranked and ordered starting with the best ones.
  • limit (int) – the maximum number of found origins to return (bounded to 100)
Response JSON Array of Objects:
 
  • origin_id (number) – the origin unique identifier
  • metadata (dict) – metadata of the origin (as a JSON-LD/CodeMeta dictionary)
  • from_revision (string) – the revision used to extract these metadata (the current HEAD or one of the former HEADs)
  • tool (dict) – the tool used to extract these metadata
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origin/metadata-search/?limit=2&fulltext=Jane%20Doe
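
Since limit is bounded to 100, a client can clamp the value itself before building the query string. This sketch uses a hypothetical metadata_search_url helper; the lower bound of 1 is a local choice, not something the endpoint documents.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://archive.softwareheritage.org/api/1"
MAX_LIMIT = 100  # upper bound stated in the parameter description


def metadata_search_url(fulltext, limit=MAX_LIMIT):
    """Build a metadata-search URL with the limit clamped to 1..100."""
    limit = max(1, min(limit, MAX_LIMIT))
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode({"limit": limit, "fulltext": fulltext}, quote_via=quote)
    return f"{API_ROOT}/origin/metadata-search/?{query}"
```

metadata_search_url("Jane Doe", limit=2) yields the example URL above.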
swh.web.api.views.origin.api_origin_visits(request, origin_id)[source]
GET /api/1/origin/(origin_id)/visits/

Get information about all visits of a software origin. Visits are returned sorted in descending order according to their date.

Parameters:
  • origin_id (int) – a software origin identifier
Query Parameters:
 
  • per_page (int) – specify the number of visits to list, for pagination purposes
  • last_visit (int) – visit to start listing from, for pagination purposes
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
  • Content-Type – this depends on Accept header of request
  • Link – indicates that a subsequent result page is available and contains the url pointing to it
Response JSON Array of Objects:
 
  • date (string) – ISO representation of the visit date (in UTC)
  • id (number) – the unique identifier of the origin
  • origin_visit_url (string) – link to GET /api/1/origin/(origin_id)/visit/(visit_id)/ in order to get information about the visit
  • snapshot (string) – the snapshot identifier of the visit
  • snapshot_url (string) – link to GET /api/1/snapshot/(snapshot_id)/ in order to get information about the snapshot of the visit
  • status (string) – status of the visit (either full, partial or ongoing)
  • visit (number) – the unique identifier of the visit

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origin/1/visits/
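
A common use of the visit list is to find the most recent complete visit and its snapshot. The sketch below does so without relying on the order of the response; latest_full_visit is a hypothetical name, and it assumes the date strings are in a form accepted by datetime.fromisoformat (for example 2016-01-17T00:00:00+00:00).

```python
from datetime import datetime


def latest_full_visit(visits):
    """Return the most recent visit whose status is "full", or None."""
    full = [v for v in visits if v["status"] == "full"]
    if not full:
        return None
    # Compare parsed dates instead of trusting the response ordering.
    return max(full, key=lambda v: datetime.fromisoformat(v["date"]))
```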
swh.web.api.views.origin.api_origin_visit(request, origin_id, visit_id)[source]
GET /api/1/origin/(origin_id)/visit/(visit_id)/

Get information about a specific visit of a software origin.

Parameters:
  • origin_id (int) – a software origin identifier
  • visit_id (int) – a visit identifier
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 
  • date (string) – ISO representation of the visit date (in UTC)
  • origin (number) – the origin unique identifier
  • origin_url (string) – link to get information about the origin
  • status (string) – status of the visit (either full, partial or ongoing)
  • visit (number) – the unique identifier of the visit
Response JSON Array of Objects:
 
  • snapshot (string) – the snapshot identifier of the visit
  • snapshot_url (string) – link to GET /api/1/snapshot/(snapshot_id)/ in order to get information about the snapshot of the visit

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:
  • 200 OK – no error
  • 404 Not Found – requested origin or visit can not be found in the archive

Example:

https://archive.softwareheritage.org/api/1/origin/1500/visit/1/

swh.web.api.views.origin_save module

swh.web.api.views.origin_save.api_save_origin(request, origin_type, origin_url)[source]
GET /api/1/origin/save/(origin_type)/url/(origin_url)/
POST /api/1/origin/save/(origin_type)/url/(origin_url)/

Request the saving of a software origin into the archive or check the status of previously created save requests.

This endpoint makes it possible to create a saving task for a software origin through a POST request.

Depending on the provided origin url, the save request can either be:

  • immediately accepted, for well-known code hosting providers such as GitHub or GitLab
  • rejected, in case the url is blacklisted by Software Heritage
  • put in pending state until a manual check is done in order to determine if it can be loaded or not

Once a saving request has been accepted, its associated saving task status can then be checked through a GET request on the same url. Returned status can either be:

  • not created: no saving task has been created
  • not yet scheduled: saving task has been created but its execution has not yet been scheduled
  • scheduled: the task execution has been scheduled
  • succeed: the saving task has been successfully executed
  • failed: the saving task has been executed but it failed

When issuing a POST request an object will be returned while a GET request will return an array of objects (as multiple save requests might have been submitted for the same origin).
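
Because POST returns one object and GET returns an array, client code usually normalizes both shapes before inspecting the statuses. The following sketch shows one way to do that and to decide whether polling should continue; the helper names and the outcome labels ("done", "waiting", and so on) are hypothetical.

```python
def as_request_list(payload):
    """Normalize a POST (object) or GET (array) response to a list."""
    return payload if isinstance(payload, list) else [payload]


def save_outcome(request):
    """Summarize a save request using the statuses listed above."""
    if request["save_request_status"] == "rejected":
        return "rejected"
    task_status = request["save_task_status"]
    if task_status == "succeed":
        return "done"
    if task_status == "failed":
        return "failed"
    # pending requests and tasks that are not created, not yet scheduled
    # or scheduled all still require waiting.
    return "waiting"
```

A polling loop would repeat the GET request while any entry of interest reports "waiting".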

Parameters:
  • origin_type (string) – the type of origin to save (currently the supported types are git, hg and svn)
  • origin_url (string) – the url of the origin to save
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 
  • origin_url (string) – the url of the origin to save
  • origin_type (string) – the type of the origin to save
  • save_request_date (string) – the date (in ISO format) the save request was issued
  • save_request_status (string) – the status of the save request, either accepted, rejected or pending
  • save_task_status (string) – the status of the origin saving task, either not created, not yet scheduled, scheduled, succeed or failed

Allowed HTTP Methods: GET, POST, HEAD, OPTIONS

Status Codes:

swh.web.api.views.person module

swh.web.api.views.person.api_person(request, person_id)[source]
GET /api/1/person/(person_id)/

Get information about a person in the archive.

Parameters:
  • person_id (int) – a person identifier
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 
  • email (string) – the email of the person
  • fullname (string) – the full name of the person: combination of their name and email
  • id (number) – the unique identifier of the person
  • name (string) – the name of the person

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/person/8275/

swh.web.api.views.release module

swh.web.api.views.release.api_release(request, sha1_git)[source]
GET /api/1/release/(sha1_git)/

Get information about a release in the archive. Releases are identified by sha1 checksums, compatible with Git tag identifiers. See swh.model.identifiers.release_identifier() in our data model module for details about how they are computed.

Parameters:
  • sha1_git (string) – hexadecimal representation of the release sha1_git identifier
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 
  • author (object) – information about the author of the release
  • author_url (string) – link to GET /api/1/person/(person_id)/ to get information about the author of the release
  • date (string) – ISO representation of the release date (in UTC)
  • id (string) – the release unique identifier
  • message (string) – the message associated to the release
  • name (string) – the name of the release
  • target (string) – the target identifier of the release
  • target_type (string) – the type of the target, can be either release, revision, content, directory
  • target_url (string) – a link to the adequate api url based on the target type

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/release/208f61cc7a5dbc9879ae6e5c2f95891e270f09ef/
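
The target_url field depends on target_type. The sketch below illustrates how the four documented target types could map onto the endpoints described on this page; it is an assumption made for illustration (in particular the sha1_git: prefix for contents), and expected_target_path is a hypothetical helper, so the target_url returned by the API remains the authoritative value.

```python
# Assumed mapping from target_type to the endpoint path template.
_TARGET_ENDPOINTS = {
    "release": "/api/1/release/{target}/",
    "revision": "/api/1/revision/{target}/",
    "directory": "/api/1/directory/{target}/",
    # Assumption: content targets are addressed by their sha1_git checksum.
    "content": "/api/1/content/sha1_git:{target}/",
}


def expected_target_path(release):
    """Return the API path a release target is expected to link to."""
    template = _TARGET_ENDPOINTS.get(release["target_type"])
    if template is None:
        raise ValueError(f"unknown target type: {release['target_type']}")
    return template.format(target=release["target"])
```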

swh.web.api.views.revision module

swh.web.api.views.revision._revision_directory_by(revision, path, request_path, limit=100, with_data=False)[source]

Compute the revision matching criterion’s directory or content data.

Parameters:
  • revision – dictionary of criteria representing a revision to look up
  • path – directory’s path to look up
  • request_path – request path which holds the original context to
  • limit – optional query parameter to limit the revisions log (defaults to 100). For now, note that this limit could impede the transitivity conclusion about sha1_git not being an ancestor of
  • with_data – indicate to retrieve the content’s raw data if path resolves to a content.
swh.web.api.views.revision.api_revision_log_by(request, origin_id, branch_name='HEAD', ts=None)[source]
GET /api/1/revision/origin/(origin_id)[/branch/(branch_name)][/ts/(timestamp)]/log

Show the commit log for a revision, searching for it based on software origin, branch name, and/or visit timestamp.

This endpoint behaves like GET /api/1/revision/(sha1_git)[/prev/(prev_sha1s)]/log/, but operates on the revision that has been found at a given software origin, close to a given point in time, pointed to by a given branch.

Parameters:
  • origin_id (int) – a software origin identifier
  • branch_name (string) – optional parameter specifying a fully-qualified branch name associated to the software origin, e.g., “refs/heads/master”. Defaults to the HEAD branch.
  • timestamp (string) – optional parameter specifying a timestamp close to which the revision pointed by the given branch should be looked up. The timestamp can be expressed either as an ISO date or as a Unix one (in UTC). Defaults to now.
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Array of Objects:
 
  • author (object) – information about the author of the revision
  • author_url (string) – link to GET /api/1/person/(person_id)/ to get information about the author of the revision
  • committer (object) – information about the committer of the revision
  • committer_url (string) – link to GET /api/1/person/(person_id)/ to get information about the committer of the revision
  • committer_date (string) – ISO representation of the commit date (in UTC)
  • date (string) – ISO representation of the revision date (in UTC)
  • directory (string) – the unique identifier that revision points to
  • directory_url (string) – link to GET /api/1/directory/(sha1_git)/[(path)/] to get information about the directory associated to the revision
  • id (string) – the revision unique identifier
  • merge (boolean) – whether or not the revision corresponds to a merge commit
  • message (string) – the message associated to the revision
  • parents (array) – the parents of the revision, i.e. the previous revisions that lead directly to it; each entry of that array contains a unique parent revision identifier and a link to GET /api/1/revision/(sha1_git)/ to get more information about it
  • type (string) – the type of the revision

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:
  • 200 OK – no error
  • 404 Not Found – no revision matching the given criteria could be found in the archive

Example:

https://archive.softwareheritage.org/api/1/revision/origin/723566/ts/2016-01-17T00:00:00+00:00/log/
swh.web.api.views.revision.api_directory_through_revision_origin(request, origin_id, branch_name='HEAD', ts=None, path=None, with_data=False)[source]

Display directory or content information through a revision identified by origin/branch/timestamp.

swh.web.api.views.revision.api_revision_with_origin(request, origin_id, branch_name='HEAD', ts=None)[source]
GET /api/1/revision/origin/(origin_id)/[branch/(branch_name)/][ts/(timestamp)/]

Get information about a revision, searching for it based on software origin, branch name, and/or visit timestamp.

This endpoint behaves like GET /api/1/revision/(sha1_git)/, but operates on the revision that has been found at a given software origin, close to a given point in time, pointed to by a given branch.

Parameters:
  • origin_id (int) – a software origin identifier
  • branch_name (string) – optional parameter specifying a fully-qualified branch name associated to the software origin, e.g., “refs/heads/master”. Defaults to the HEAD branch.
  • timestamp (string) – optional parameter specifying a timestamp close to which the revision pointed by the given branch should be looked up. The timestamp can be expressed either as an ISO date or as a Unix one (in UTC). Defaults to now.
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 
  • author (object) – information about the author of the revision
  • author_url (string) – link to GET /api/1/person/(person_id)/ to get information about the author of the revision
  • committer (object) – information about the committer of the revision
  • committer_url (string) – link to GET /api/1/person/(person_id)/ to get information about the committer of the revision
  • committer_date (string) – ISO representation of the commit date (in UTC)
  • date (string) – ISO representation of the revision date (in UTC)
  • directory (string) – the unique identifier that revision points to
  • directory_url (string) – link to GET /api/1/directory/(sha1_git)/[(path)/] to get information about the directory associated to the revision
  • id (string) – the revision unique identifier
  • merge (boolean) – whether or not the revision corresponds to a merge commit
  • message (string) – the message associated to the revision
  • parents (array) – the parents of the revision, i.e. the previous revisions that lead directly to it; each entry of that array contains a unique parent revision identifier and a link to GET /api/1/revision/(sha1_git)/ to get more information about it
  • type (string) – the type of the revision

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:
  • 200 OK – no error
  • 404 Not Found – no revision matching the given criteria could be found in the archive

Example:

https://archive.softwareheritage.org/api/1/revision/origin/13706355/branch/refs/heads/2.7/
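
The optional branch and timestamp segments of the origin-based revision endpoints can be assembled in order. This is a sketch under stated assumptions: revision_by_origin_url and its log flag are hypothetical, and the branch name and timestamp are inserted verbatim (no percent-encoding), as in the example URLs on this page.

```python
API_ROOT = "https://archive.softwareheritage.org/api/1"


def revision_by_origin_url(origin_id, branch_name=None, timestamp=None,
                           log=False):
    """Build a revision-by-origin URL; set log=True for the log variant."""
    segments = ["revision", "origin", str(origin_id)]
    # Optional segments appear in this fixed order: branch, then ts.
    if branch_name is not None:
        segments += ["branch", branch_name]
    if timestamp is not None:
        segments += ["ts", timestamp]
    if log:
        segments.append("log")
    return f"{API_ROOT}/" + "/".join(segments) + "/"
```

Omitting both optional arguments leaves the defaults to the server, i.e. the HEAD branch and the current time.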
swh.web.api.views.revision.api_revision(request, sha1_git)[source]
GET /api/1/revision/(sha1_git)/

Get information about a revision in the archive. Revisions are identified by sha1 checksums, compatible with Git commit identifiers. See swh.model.identifiers.revision_identifier() in our data model module for details about how they are computed.

Parameters:
  • sha1_git (string) – hexadecimal representation of the revision sha1_git identifier
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 
  • author (object) – information about the author of the revision
  • author_url (string) – link to GET /api/1/person/(person_id)/ to get information about the author of the revision
  • committer (object) – information about the committer of the revision
  • committer_url (string) – link to GET /api/1/person/(person_id)/ to get information about the committer of the revision
  • committer_date (string) – ISO representation of the commit date (in UTC)
  • date (string) – ISO representation of the revision date (in UTC)
  • directory (string) – the unique identifier that revision points to
  • directory_url (string) – link to GET /api/1/directory/(sha1_git)/[(path)/] to get information about the directory associated to the revision
  • id (string) – the revision unique identifier
  • merge (boolean) – whether or not the revision corresponds to a merge commit
  • message (string) – the message associated to the revision
  • parents (array) – the parents of the revision, i.e. the previous revisions that lead directly to it; each entry of that array contains a unique parent revision identifier and a link to GET /api/1/revision/(sha1_git)/ to get more information about it
  • type (string) – the type of the revision

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/revision/aafb16d69fd30ff58afdd69036a26047f3aebdc6/
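
A decoded revision object can be inspected for its parents, for example to walk history manually. The sketch below assumes that each entry of the parents array stores the parent identifier under an "id" key; that key name is a guess, since this page only says each entry holds an identifier and a link.

```python
def parent_ids(revision):
    """List parent revision identifiers (assumes an "id" key per entry)."""
    return [parent["id"] for parent in revision.get("parents", [])]


def has_several_parents(revision):
    """True when the revision has more than one parent, as merges do."""
    return len(parent_ids(revision)) > 1
```

The documented merge field already carries this information; counting parents is shown only to illustrate how the array is traversed.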
swh.web.api.views.revision.api_revision_raw_message(request, sha1_git)[source]

Return the raw data of the message of revision identified by sha1_git

swh.web.api.views.revision.api_revision_directory(request, sha1_git, dir_path=None, with_data=False)[source]
GET /api/1/revision/(sha1_git)/directory/[(path)/]

Get information about directory (entry) objects associated to revisions. Each revision is associated to a single “root” directory. This endpoint behaves like GET /api/1/directory/(sha1_git)/[(path)/], but operates on the root directory associated to a given revision.

Parameters:
  • sha1_git (string) – hexadecimal representation of the revision sha1_git identifier
  • path (string) – optional parameter to get information about the directory entry pointed to by that relative path
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 
  • content (array) – directory entries as returned by GET /api/1/directory/(sha1_git)/[(path)/]
  • path (string) – path of the directory relative to the root directory of the revision
  • revision (string) – the unique revision identifier
  • type (string) – the type of the directory

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/revision/f1b94134a4b879bc55c3dacdb496690c8ebdc03f/directory/
swh.web.api.views.revision.api_revision_log(request, sha1_git, prev_sha1s=None)[source]
GET /api/1/revision/(sha1_git)[/prev/(prev_sha1s)]/log/

Get a list of all revisions leading to a given one; in other words, show the commit log.

Parameters:
  • sha1_git (string) – hexadecimal representation of the revision sha1_git identifier
  • prev_sha1s (string) – optional parameter representing the navigation breadcrumbs (descendant revisions previously visited). To pass multiple values, use / as the delimiter. If provided, information about these revisions is added at the beginning of the returned list.
Query Parameters:
 
  • per_page (int) – number of elements in the returned list, for pagination purpose
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
  • Content-Type – this depends on Accept header of request
  • Link – indicates that a subsequent result page is available and contains the url pointing to it
Response JSON Array of Objects:
 
  • author (object) – information about the author of the revision
  • author_url (string) – link to GET /api/1/person/(person_id)/ to get information about the author of the revision
  • committer (object) – information about the committer of the revision
  • committer_url (string) – link to GET /api/1/person/(person_id)/ to get information about the committer of the revision
  • committer_date (string) – ISO representation of the commit date (in UTC)
  • date (string) – ISO representation of the revision date (in UTC)
  • directory (string) – the unique identifier of the directory that the revision points to
  • directory_url (string) – link to GET /api/1/directory/(sha1_git)/[(path)/] to get information about the directory associated to the revision
  • id (string) – the revision unique identifier
  • merge (boolean) – whether or not the revision corresponds to a merge commit
  • message (string) – the message associated to the revision
  • parents (array) – the parents of the revision, i.e. the previous revisions that lead directly to it; each entry of that array contains a unique parent revision identifier and a link to GET /api/1/revision/(sha1_git)/ to get more information about it
  • type (string) – the type of the revision

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/revision/e1a315fa3fa734e2a6154ed7b5b9ae0eb8987aad/log/
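
The URL pattern and the Link response header described above can be combined to walk the log page by page. The following is a minimal sketch using only the standard library; the helper names revision_log_url and next_page_url are hypothetical, and the assumption that the Link header tags the following page with rel="next" comes from common HTTP practice rather than from this documentation.

```python
import re
from typing import Optional, Sequence
from urllib.parse import urlencode

API_ROOT = "https://archive.softwareheritage.org/api/1"


def revision_log_url(sha1_git: str,
                     prev_sha1s: Sequence[str] = (),
                     per_page: Optional[int] = None) -> str:
    """Build a log URL following /revision/(sha1_git)[/prev/(prev_sha1s)]/log/."""
    url = f"{API_ROOT}/revision/{sha1_git}"
    if prev_sha1s:
        # Multiple breadcrumb values are delimited with "/".
        url += "/prev/" + "/".join(prev_sha1s)
    url += "/log/"
    if per_page is not None:
        url += "?" + urlencode({"per_page": per_page})
    return url


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next" in a Link header, or None.

    Assumption: entries look like `<url>; rel="next"`.
    """
    if not link_header:
        return None
    pattern = r'<([^>]+)>\s*;\s*rel="?([^",;]+)"?'
    for match in re.finditer(pattern, link_header):
        url, rel = match.groups()
        if rel.strip() == "next":
            return url
    return None
```

A client would request the URL returned by revision_log_url, then keep following next_page_url(response_headers.get("Link")) until it returns None.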

swh.web.api.views.snapshot module

swh.web.api.views.snapshot.api_snapshot(request, snapshot_id)[source]
GET /api/1/snapshot/(snapshot_id)/

Get information about a snapshot in the archive.

A snapshot is a set of named branches, which are pointers to objects at any level of the Software Heritage DAG. It represents a full picture of an origin at a given time.

As well as pointing to other objects in the Software Heritage DAG, branches can also be aliases, in which case their target is the name of another branch in the same snapshot, or dangling, in which case the target is unknown.

A snapshot identifier is a salted sha1. See swh.model.identifiers.snapshot_identifier() in our data model module for details about how they are computed.

Parameters:
  • snapshot_id (sha1) – a snapshot identifier
Query Parameters:
 
  • branches_from (str) – optional parameter used to skip branches whose name sorts before the given value
  • branches_count (int) – optional parameter used to limit the number of returned branches (defaults to 1000)
  • target_types (str) – optional comma separated list parameter used to filter the target types of branch to return (possible values that can be contained in that list are content, directory, revision, release, snapshot or alias)
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
  • Content-Type – this depends on Accept header of request
  • Link – indicates that a subsequent result page is available and contains the url pointing to it
Response JSON Object:
 
  • branches (object) – object containing all branches associated with the snapshot; for each branch, the target type and id are given, along with a link to get information about that target
  • id (string) – the unique identifier of the snapshot

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/snapshot/6a3a2cf0b2b90ce7ae1cf0a221ed68035b686f5a/
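
As an illustration of the query parameters and of alias branches described above, here is a minimal sketch. The helper names snapshot_url and resolve_branch are hypothetical, and the per-branch field names "target_type" and "target" are assumptions about the shape of the branches object, which this page does not spell out.

```python
from typing import Any, Dict, Iterable, Optional
from urllib.parse import urlencode

API_ROOT = "https://archive.softwareheritage.org/api/1"


def snapshot_url(snapshot_id: str,
                 branches_from: Optional[str] = None,
                 branches_count: Optional[int] = None,
                 target_types: Optional[Iterable[str]] = None) -> str:
    """Build a snapshot URL with the optional query parameters."""
    params: Dict[str, Any] = {}
    if branches_from is not None:
        params["branches_from"] = branches_from
    if branches_count is not None:
        params["branches_count"] = branches_count
    if target_types:
        # target_types is documented as a comma separated list.
        params["target_types"] = ",".join(target_types)
    url = f"{API_ROOT}/snapshot/{snapshot_id}/"
    if params:
        url += "?" + urlencode(params, safe=",")
    return url


def resolve_branch(branches: Dict[str, Optional[dict]],
                   name: str) -> Optional[dict]:
    """Follow alias branches until a non-alias branch is reached.

    Returns None for unknown or dangling branches and for alias cycles.
    Assumed branch shape: {"target_type": ..., "target": ...}.
    """
    seen = set()
    while name not in seen:
        seen.add(name)
        branch = branches.get(name)
        if branch is None:
            return None
        if branch["target_type"] != "alias":
            return branch
        # For an alias, the target is the name of another branch.
        name = branch["target"]
    return None
```

Because the result may be paginated, an alias whose target lies on another page resolves to None in this sketch; a fuller client would fetch the following pages first.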

swh.web.api.views.stat module

swh.web.api.views.stat.api_stats(request)[source]
GET /api/1/stat/counters/

Get statistics about the content of the archive.

Response JSON Object:
 
  • content (number) – current number of content objects (aka files) in the archive
  • directory (number) – current number of directory objects in the archive
  • origin (number) – current number of software origins (an origin is a “place” where source code can be found, e.g. a git repository, a tarball, …) in the archive
  • origin_visit (number) – current number of visits on software origins to fill the archive
  • person (number) – current number of persons (source code authors or committers) in the archive
  • release (number) – current number of release objects in the archive
  • revision (number) – current number of revision objects (aka commits) in the archive
  • skipped_content (number) – current number of content objects (aka files) which were not inserted in the archive
  • snapshot (number) – current number of snapshot objects (aka set of named branches) in the archive
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/stat/counters/

swh.web.api.views.utils module

swh.web.api.views.utils.api_lookup(lookup_fn, *args, notfound_msg='Object not found', enrich_fn=None)[source]
Capture the recurring behavior of:
  • looking up the backend with a criterion (be it an identifier or checksum) passed to the function lookup_fn
  • if nothing is found, raising a NotFoundExc exception with error message notfound_msg
  • otherwise, if something is returned:
    • as a list, map or generator: mapping the enrich_fn function over it and returning the resulting data structure as a list
    • as a dict: passing it to enrich_fn and returning the enriched dict
Parameters:
  • lookup_fn (-) – function expecting one criterion and optional supplementary *args.
  • notfound_msg (-) – if nothing matching the criterion is found, raise NotFoundExc with this error message.
  • enrich_fn (-) – function to use to enrich the result returned by lookup_fn. Defaults to the identity function if not provided.
  • *args (-) – supplementary arguments to pass to lookup_fn.
Raises:

NotFoundExc or whatever lookup_fn raises.
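
The behavior described above can be sketched as follows. This is an illustrative reimplementation, not the actual source of swh.web: the NotFoundExc class below is a local stand-in, the name api_lookup_sketch is hypothetical, and treating a None result as “nothing found” is an assumption.

```python
from types import GeneratorType


class NotFoundExc(Exception):
    """Local stand-in for the exception named in the documentation."""


def api_lookup_sketch(lookup_fn, *args,
                      notfound_msg="Object not found", enrich_fn=None):
    """Look up an object, raise if absent, otherwise enrich the result."""
    if enrich_fn is None:
        # Default to the identity function, as documented.
        def enrich_fn(x):
            return x

    res = lookup_fn(*args)
    if res is None:  # assumption: None means "nothing found"
        raise NotFoundExc(notfound_msg)
    if isinstance(res, (list, map, GeneratorType)):
        # Sequences are enriched element by element and returned as a list.
        return [enrich_fn(item) for item in res]
    # Anything else (typically a dict) is enriched as a whole.
    return enrich_fn(res)
```

For instance, api_lookup_sketch(lookup, "abc", enrich_fn=add_urls) would call lookup("abc") and pass the resulting dict through add_urls, where both lookup and add_urls are caller-supplied functions.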

swh.web.api.views.utils.api_home(self, request, *args, **kwargs)[source]
swh.web.api.views.utils.api_endpoints(request)[source]

Display the list of open API endpoints.

swh.web.api.views.vault module

swh.web.api.views.vault._dispatch_cook_progress(request, obj_type, obj_id)[source]
swh.web.api.views.vault.api_vault_cook_directory(request, dir_id)[source]
GET /api/1/vault/directory/(dir_id)/
POST /api/1/vault/directory/(dir_id)/

Request the cooking of an archive for a directory or check its cooking status.

This endpoint creates a vault cooking task for a directory through a POST request, or checks the status of a previously created one through a GET request.

Once the cooking task has been executed, the resulting archive can be downloaded using the dedicated endpoint GET /api/1/vault/directory/(dir_id)/raw/.

Then to extract the cooked directory in the current one, use:

$ tar xvf path/to/directory.tar.gz
Parameters:
  • dir_id (string) – the directory’s sha1 identifier
Query Parameters:
 
  • email (string) – e-mail to notify when the archive is ready
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 
  • fetch_url (string) – the url from which to download the archive once it has been cooked (see GET /api/1/vault/directory/(dir_id)/raw/)
  • obj_type (string) – the type of object to cook (directory or revision)
  • progress_message (string) – message describing the cooking task progress
  • id (number) – the cooking task id
  • status (string) – the cooking task status (either new, pending, done or failed)
  • obj_id (string) – the identifier of the object to cook

Allowed HTTP Methods: GET, POST, HEAD, OPTIONS

Status Codes:
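
The POST-then-GET workflow above amounts to a polling loop on the status field. The sketch below keeps the HTTP layer out of the picture: get_status is a hypothetical caller-supplied function that performs the GET request and returns the decoded JSON object, and the polling interval and retry limit are arbitrary choices.

```python
import time
from typing import Callable, Dict


def wait_for_cooking(get_status: Callable[[], Dict],
                     poll_interval: float = 5.0,
                     max_polls: int = 60,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll a cooking task until it is done and return its fetch_url.

    Raises RuntimeError if the task fails and TimeoutError if it is
    still new or pending after max_polls status checks.
    """
    for _ in range(max_polls):
        task = get_status()
        status = task["status"]
        if status == "done":
            return task["fetch_url"]
        if status == "failed":
            raise RuntimeError(task.get("progress_message") or "cooking failed")
        # Status is new or pending: wait before checking again.
        sleep(poll_interval)
    raise TimeoutError("cooking task did not finish in time")
```

Passing sleep as a parameter makes the loop easy to exercise without waiting, for example with a fake get_status that yields a fixed sequence of statuses.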
swh.web.api.views.vault.api_vault_fetch_directory(request, dir_id)[source]
GET /api/1/vault/directory/(dir_id)/raw/

Fetch the cooked archive for a directory.

See GET /api/1/vault/directory/(dir_id)/ to get more details on directory cooking.

Parameters:
  • dir_id (string) – the directory’s sha1 identifier
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:
swh.web.api.views.vault.api_vault_cook_revision_gitfast(request, rev_id)[source]
GET /api/1/vault/revision/(rev_id)/gitfast/
POST /api/1/vault/revision/(rev_id)/gitfast/

Request the cooking of a gitfast archive for a revision or check its cooking status.

This endpoint creates a vault cooking task for a revision through a POST request, or checks the status of a previously created one through a GET request.

Once the cooking task has been executed, the resulting gitfast archive can be downloaded using the dedicated endpoint GET /api/1/vault/revision/(rev_id)/gitfast/raw/.

Then to import the revision in the current directory, use:

$ git init
$ zcat path/to/revision.gitfast.gz | git fast-import
$ git checkout HEAD
Parameters:
  • rev_id (string) – the revision’s sha1 identifier
Query Parameters:
 
  • email (string) – e-mail to notify when the gitfast archive is ready
Request Headers:
 
  • Accept – the requested response content type, either application/json (default) or application/yaml
Response Headers:
 
Response JSON Object:
 
  • fetch_url (string) – the url from which to download the archive once it has been cooked (see GET /api/1/vault/revision/(rev_id)/gitfast/raw/)
  • obj_type (string) – the type of object to cook (directory or revision)
  • progress_message (string) – message describing the cooking task progress
  • id (number) – the cooking task id
  • status (string) – the cooking task status (new/pending/done/failed)
  • obj_id (string) – the identifier of the object to cook

Allowed HTTP Methods: GET, POST, HEAD, OPTIONS

Status Codes:
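
For reference, the cooking and download URLs for a revision can be derived from the patterns above. This is a small sketch; the function name gitfast_urls and the keys of the returned dict are hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

API_ROOT = "https://archive.softwareheritage.org/api/1"


def gitfast_urls(rev_id: str, email: Optional[str] = None) -> Dict[str, str]:
    """Return the cook (GET/POST) and fetch (GET) URLs for a revision."""
    base = f"{API_ROOT}/vault/revision/{rev_id}/gitfast/"
    cook = base
    if email:
        # The optional email query parameter requests a notification
        # when the gitfast archive is ready.
        cook += "?" + urlencode({"email": email})
    return {"cook": cook, "fetch": base + "raw/"}
```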
swh.web.api.views.vault.api_vault_fetch_revision_gitfast(request, rev_id)[source]
GET /api/1/vault/revision/(rev_id)/gitfast/raw/

Fetch the cooked gitfast archive for a revision.

See GET /api/1/vault/revision/(rev_id)/gitfast/ to get more details on revision cooking.

Parameters:
  • rev_id (string) – the revision’s sha1 identifier
Response Headers:
 

Allowed HTTP Methods: GET, HEAD, OPTIONS

Status Codes:

Module contents