URI scheme for swh-web Browse application

This web application provides HTML views for easily navigating the archive, so it is meant to be reached from a web browser. If you intend to query the archive programmatically through any HTTP client, please refer to the swh-web API URLs section instead.

Context-independent browsing

Context-independent URLs provide information about objects (e.g., revisions, directories, contents, persons, …), independently of the contexts where they have been found (e.g., specific repositories, branches, commits, …).

The following endpoints are the same as in the API case (see below), and just render the corresponding information for user consumption. Where hyperlinks are created, they always point to other context-independent user URLs:

Context-dependent browsing

Context-dependent URLs provide information about objects, limited to specific contexts where the objects have been found.

For instance, instead of having to specify a (root) revision by sha1_git, users might want to specify a place and a time. In Software Heritage a “place” is an origin, with an optional branch name; a “time” is a timestamp at which some place has been observed by Software Heritage crawlers.

Wherever a revision context is expected in a path (i.e., a /browse/revision/(sha1_git)/ path fragment) we can put in its stead a path fragment of the form /origin/(origin_type)/url/(origin_url)/[visit/(timestamp)/][?branch=(branch)]. Such a fragment is resolved, internally by the archive, to a revision sha1_git as follows:

  • if timestamp is absent: look for the most recent crawl of the origin identified by origin_type and origin_url
  • if timestamp is given: look for the crawl of the origin identified by origin_type and origin_url that is closest to the given timestamp
  • if branch is given as a query parameter: look for the branch branch
  • if branch is absent: look for branch “HEAD” or “master”
  • return the revision sha1_git pointed to by the chosen branch
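
The resolution steps above can be sketched in Python as follows. This is only an illustration working on in-memory data: the Visit class and the resolve_revision function are hypothetical names, and trying "HEAD" before "master" is an assumption, since the text only says that one of the two is looked up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Visit:
    """Hypothetical record of one crawl of an origin."""
    timestamp: int  # Unix epoch seconds
    # branch name -> revision sha1_git
    branches: Dict[str, str] = field(default_factory=dict)


def resolve_revision(visits: List[Visit],
                     timestamp: Optional[int] = None,
                     branch: Optional[str] = None) -> Optional[str]:
    """Resolve a (time, branch) context to a revision sha1_git, or None."""
    if not visits:
        return None
    if timestamp is None:
        # No timestamp: take the most recent crawl.
        visit = max(visits, key=lambda v: v.timestamp)
    else:
        # Timestamp given: take the crawl closest to it.
        visit = min(visits, key=lambda v: abs(v.timestamp - timestamp))
    # Branch given: use it; otherwise try "HEAD", then "master" (assumed order).
    candidates = [branch] if branch is not None else ["HEAD", "master"]
    for name in candidates:
        if name in visit.branches:
            return visit.branches[name]
    return None
```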

The already mentioned URLs for revision contexts can therefore be alternatively specified by users as:

Typing:

  • origin_type corresponds to the type of the archived origin: git, svn, hg, deb, pypi, ftp or deposit
  • origin_url corresponds to the URL the origin was crawled from, for instance https://github.com/(user)/(repo)/
  • branch name is given as per the corresponding VCS (e.g., Git) as a query parameter to the requested URL.
  • timestamp is given in a format as liberal as possible, to uphold the principle of least surprise. At the very minimum it is possible to enter timestamps as:
    • Unix epoch timestamp (see for instance the output of date +%s)
    • ISO 8601 timestamps (see for instance the output of date -I, date -Is)
    • YYYY[MM[DD[HH[MM[SS]]]]] ad-hoc format
    • YYYY[-MM[-DD[ HH[:MM[:SS]]]]] ad-hoc format
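
To make the accepted timestamp forms more concrete, here is a minimal Python sketch of a liberal parser. It is not the parser used by swh-web: the function name parse_timestamp is hypothetical, values without a time zone are assumed to be UTC, and an all-digit string is read as a compact date first and as a Unix epoch only when it is not a valid calendar date, which is one possible way to resolve the ambiguity between the two.

```python
import re
from datetime import datetime, timezone

# YYYY[-MM[-DD[ HH[:MM[:SS]]]]], also accepting "T" between date and time.
_DASHED = re.compile(
    r"^(\d{4})(?:-(\d{2})(?:-(\d{2})(?:[ T](\d{2})(?::(\d{2})(?::(\d{2}))?)?)?)?)?$"
)
# YYYY[MM[DD[HH[MM[SS]]]]]
_COMPACT = re.compile(r"^\d{4}(?:\d{2}){0,5}$")
# Default values for year, month, day, hour, minute, second.
_DEFAULTS = [1, 1, 1, 0, 0, 0]


def _from_fields(fields):
    """Build a UTC datetime, filling the missing trailing fields with defaults."""
    parts = fields + _DEFAULTS[len(fields):]
    return datetime(*parts, tzinfo=timezone.utc)


def parse_timestamp(value: str) -> datetime:
    """Parse a visit timestamp given in one of the forms listed above."""
    value = value.strip()
    match = _DASHED.match(value)
    if match:
        return _from_fields([int(g) for g in match.groups() if g is not None])
    if re.fullmatch(r"[0-9]+", value):
        if _COMPACT.match(value):
            try:
                fields = [int(value[:4])]
                fields += [int(value[i:i + 2]) for i in range(4, len(value), 2)]
                return _from_fields(fields)
            except ValueError:
                pass  # not a valid calendar date: fall back to epoch seconds
        return datetime.fromtimestamp(int(value), tz=timezone.utc)
    # Anything else is handed to the standard ISO 8601 parser.
    parsed = datetime.fromisoformat(value)
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
```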

swh-web Browse URLs

Content

GET /browse/content/[(algo_hash):](hash)/

HTML view that displays a content identified by its hash value.

If the content to display is textual, it will be highlighted client-side if possible using highlightjs. In order for that operation to be performed, a programming language must first be associated to the content. The following procedure is used in order to find the language:

  1. First try to find a language from the content filename (provided as query parameter when navigating from a directory view).
  2. If no language has been found from the filename, try to find one from the content mime type. The mime type is retrieved from the content metadata stored in the archive or is computed server-side using the Python magic module.
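
The two-step lookup above could look roughly like the following Python sketch. The lookup tables and the guess_language function are hypothetical and deliberately tiny; the real mapping used by the application is not described here.

```python
import os
from typing import Optional

# Hypothetical lookup tables, far smaller than what a real deployment needs.
EXT_TO_LANG = {".py": "python", ".c": "c", ".cpp": "cpp", ".js": "javascript"}
MIME_TO_LANG = {"text/x-python": "python", "text/x-c": "c",
                "text/x-shellscript": "bash"}


def guess_language(filename: Optional[str], mime_type: Optional[str]) -> Optional[str]:
    """Return a language name for highlighting, or None if none is found."""
    # Step 1: try the file name (its extension, in this sketch).
    if filename:
        extension = os.path.splitext(filename)[1].lower()
        language = EXT_TO_LANG.get(extension)
        if language is not None:
            return language
    # Step 2: fall back to the mime type of the content.
    if mime_type:
        return MIME_TO_LANG.get(mime_type)
    return None
```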

It is also possible to highlight specific lines of a textual content (not in terms of syntax highlighting but to emphasize some relevant content part) by either:

  • clicking on line numbers (holding shift to highlight a range of lines)
  • using a URL fragment in the form ‘#Ln’ or ‘#Lm-Ln’
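
As an illustration of the fragment syntax, the following Python sketch extracts the line range from such a fragment. The function name parse_line_fragment is hypothetical, and the actual highlighting is performed client-side by the web application, not by code like this.

```python
import re
from typing import Optional, Tuple

# Matches "#L23" or "#L23-L41"; the leading "#" is optional.
_FRAGMENT = re.compile(r"^#?L(\d+)(?:-L(\d+))?$")


def parse_line_fragment(fragment: str) -> Optional[Tuple[int, int]]:
    """Return the (first, last) line numbers to highlight, or None."""
    match = _FRAGMENT.match(fragment)
    if match is None:
        return None
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    # Accept reversed ranges by normalising the order.
    return (min(first, last), max(first, last))
```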

When that view is called in the context of a navigation coming from a directory view, a breadcrumb will be displayed on top of the rendered content in order to easily navigate up to the associated root directory. In that case, the path query parameter will be used and filled with the path of the file relative to the root directory.
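
For illustration, a URL for this view could be assembled as in the following Python sketch. The helper name content_url and the table of checksum lengths are assumptions made for this example and are not part of swh-web; only the URL layout, the supported algorithms and the path query parameter come from this page.

```python
import re
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://archive.softwareheritage.org"

# Number of hexadecimal digits of a checksum for each supported algorithm.
HEX_LENGTHS = {"sha1": 40, "sha1_git": 40, "sha256": 64, "blake2s256": 64}


def content_url(hash_hex: str, algo: str = "sha1",
                path: Optional[str] = None) -> str:
    """Build a /browse/content/ URL (hypothetical helper)."""
    if algo not in HEX_LENGTHS:
        raise ValueError(f"unsupported algorithm: {algo}")
    if len(hash_hex) != HEX_LENGTHS[algo]:
        raise ValueError(f"{algo} checksums have {HEX_LENGTHS[algo]} hex digits")
    if re.fullmatch(r"[0-9a-f]+", hash_hex) is None:
        raise ValueError("checksum must be lowercase hexadecimal")
    url = f"{BASE_URL}/browse/content/{algo}:{hash_hex}/"
    if path is not None:
        # The path is percent-encoded when passed as a query parameter.
        url += "?" + urlencode({"path": path})
    return url
```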

Parameters:
  • algo_hash (string) – optional parameter to indicate the algorithm used to compute the content checksum (can be either sha1, sha1_git, sha256 or blake2s256, defaults to sha1)
  • hash (string) – hexadecimal representation for the checksum from which to retrieve the associated content in the archive
Query Parameters:
 
  • path (string) – describe the path of the content relative to a root directory (used to add context aware navigation links when navigating from a directory view)
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/content/sha1_git:f5d0b39a0cdddb91a31a537052b7d8d31a4aa79f/
https://archive.softwareheritage.org/browse/content/sha1_git:f5d0b39a0cdddb91a31a537052b7d8d31a4aa79f/#L23-L41
https://archive.softwareheritage.org/browse/content/blake2s256:1cc1e3124957c9be8a454c58e92eb925cf4aa9823984bd01451c5b7e0fee99d1/
https://archive.softwareheritage.org/browse/content/sha1:1cb1447c1c7ddc1b03eac88398e40bd914d46b62/
https://archive.softwareheritage.org/browse/content/sha256:8ceb4b9ee5adedde47b31e975c1d90c73ad27b6b165a1dcd80c7c545eb65b903/

GET /browse/content/[(algo_hash):](hash)/raw/

HTML view that produces a raw display of a content identified by its hash value.

The behaviour of that view depends on the mime type of the requested content. If the mime type is from the text family, the view will return a response whose content type is ‘text/plain’ that will be rendered by the browser. Otherwise, the view will return a response whose content type is ‘application/octet-stream’ and the browser will then offer to download the file.

In the context of a navigation coming from a directory view, the filename query parameter will be used in order to provide the real name of the file when one wants to save it locally.
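
The behaviour described above can be summarised with a small Python sketch. It is a simplification under stated assumptions: the "text family" is taken to mean mime types starting with "text/", and passing the file name through a Content-Disposition header is a guess at how the filename parameter reaches the browser, not something this page specifies. The function name raw_response_headers is hypothetical.

```python
from typing import Dict, Optional


def raw_response_headers(mime_type: str,
                         filename: Optional[str] = None) -> Dict[str, str]:
    """Choose response headers for a raw content view (illustrative only)."""
    is_text = mime_type.startswith("text/")
    content_type = "text/plain" if is_text else "application/octet-stream"
    headers = {"Content-Type": content_type}
    if filename:
        # Assumption: the file name is conveyed through Content-Disposition.
        disposition = "inline" if is_text else "attachment"
        headers["Content-Disposition"] = f'{disposition}; filename="{filename}"'
    return headers
```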

Parameters:
  • algo_hash (string) – optional parameter to indicate the algorithm used to compute the content checksum (can be either sha1, sha1_git, sha256 or blake2s256, defaults to sha1)
  • hash (string) – hexadecimal representation for the checksum from which to retrieve the associated content in the archive
Query Parameters:
 
  • filename (string) – indicate the name of the file holding the requested content (used when one wants to save the content to a local file)
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/content/sha1_git:f5d0b39a0cdddb91a31a537052b7d8d31a4aa79f/raw/?filename=LICENSE
https://archive.softwareheritage.org/browse/content/blake2s256:1cc1e3124957c9be8a454c58e92eb925cf4aa9823984bd01451c5b7e0fee99d1/raw/?filename=MAINTAINERS
https://archive.softwareheritage.org/browse/content/sha1:1cb1447c1c7ddc1b03eac88398e40bd914d46b62/raw/
https://archive.softwareheritage.org/browse/content/sha256:8ceb4b9ee5adedde47b31e975c1d90c73ad27b6b165a1dcd80c7c545eb65b903/raw/?filename=COPYING

Directory

GET /browse/directory/(sha1_git)/[(path)/]

HTML view for browsing the content of a directory reachable from the provided root one (including itself) identified by its sha1_git value.

The content of the directory is first sorted in lexicographical order and the sub-directories are displayed before the regular files.
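
The ordering described above can be obtained with a single sort key, as in the following Python sketch. Representing entries as (name, type) tuples with the types "dir" and "file" is an assumption made for this example.

```python
from typing import List, Tuple


def sort_entries(entries: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (name, type) entries: sub-directories first, then by name."""
    # False sorts before True, so directories come before regular files;
    # within each group, names are compared lexicographically.
    return sorted(entries, key=lambda entry: (entry[1] != "dir", entry[0]))
```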

From the requested directory, the view lets you navigate recursively to the directories reachable from it, and also back up to the root directory. A breadcrumb located in the top part of the view keeps track of the paths navigated so far.

Parameters:
  • sha1_git (string) – hexadecimal representation for the sha1_git identifier of the directory to browse
  • path (string) – optional parameter used to specify the path of a directory reachable from the provided root one
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/directory/977fc4b98c0e85816348cebd3b12026407c368b6/
https://archive.softwareheritage.org/browse/directory/9650ed370c0330d2cd2b6fd1e9febf649ffe538d/kernel/sched/

Origin

This describes the URI scheme when one wants to browse the Software Heritage archive in the context of an origin (for instance, a repository crawled from GitHub or a Debian source package). All the views pointed to by that scheme offer quick links to browse objects as found during the associated crawls performed by Software Heritage:

  • the root directory of the origin
  • the list of branches of the origin
  • the list of releases of the origin

Origin visits

GET /browse/origin/[(origin_type)/url/](origin_url)/visits/

HTML view that displays a report of the visits for a software origin identified by its type and url.

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/torvalds/linux/visits/
https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/python/cpython/visits/
https://archive.softwareheritage.org/browse/origin/deb://Debian-Security/packages/mediawiki/visits/
https://archive.softwareheritage.org/browse/origin/https://gitorious.org/qt/qtbase.git/visits/

Origin directory

GET /browse/origin/[(origin_type)/url/](origin_url)/directory/[(path)/]

HTML view for browsing the content of a directory reachable from the root directory (including itself) associated to the latest full visit of a software origin.

The content of the directory is first sorted in lexicographical order and the sub-directories are displayed before the regular files.

From the requested directory, the view lets you navigate recursively to the directories reachable from it, and also back up to the origin root directory. A breadcrumb located in the top part of the view keeps track of the paths navigated so far.

The view also makes it easy to switch between the origin branches and releases through a dropdown menu.

The origin branch (defaults to master) from which to retrieve the directory content can also be specified by using the branch query parameter.

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
  • path (string) – optional parameter used to specify the path of a directory reachable from the origin root one
Query Parameters:
 
  • branch (string) – specify the origin branch name from which to retrieve the root directory
  • release (string) – specify the origin release name from which to retrieve the root directory
  • revision (string) – specify the origin revision, identified by the hexadecimal representation of its sha1_git value, from which to retrieve the root directory
  • visit_id (int) – specify a visit id to retrieve the directory from instead of using the latest full visit by default
Status Codes:
  • 200 OK – no error
  • 404 Not Found – requested origin can not be found in the archive or the provided path does not exist from the origin root directory

Examples:

https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/torvalds/linux/directory/
https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/torvalds/linux/directory/net/ethernet/
https://archive.softwareheritage.org/browse/origin/https://github.com/python/cpython/directory/
https://archive.softwareheritage.org/browse/origin/https://github.com/python/cpython/directory/Python/
https://archive.softwareheritage.org/browse/origin/https://github.com/python/cpython/directory/?branch=refs/heads/2.7

GET /browse/origin/[(origin_type)/url/](origin_url)/visit/(timestamp)/directory/[(path)/]

HTML view for browsing the content of a directory reachable from the root directory (including itself) associated to a visit of a software origin closest to a provided timestamp.

The content of the directory is first sorted in lexicographical order and the sub-directories are displayed before the regular files.

From the requested directory, the view lets you navigate recursively to the directories reachable from it, and also back up to the origin root directory. A breadcrumb located in the top part of the view keeps track of the paths navigated so far.

The view also makes it easy to switch between the origin branches and releases through a dropdown menu.

The origin branch (defaults to master) from which to retrieve the directory content can also be specified by using the branch query parameter.

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
  • timestamp (string) – a date string (any format parsable by dateutil.parser.parse) or Unix timestamp to parse in order to find the closest visit.
  • path (string) – optional parameter used to specify the path of a directory reachable from the origin root one
Query Parameters:
 
  • branch (string) – specify the origin branch name from which to retrieve the root directory
  • release (string) – specify the origin release name from which to retrieve the root directory
  • revision (string) – specify the origin revision, identified by the hexadecimal representation of its sha1_git value, from which to retrieve the directory
  • visit_id (int) – specify a visit id to retrieve the directory from instead of using the provided timestamp
Status Codes:
  • 200 OK – no error
  • 404 Not Found – requested origin can not be found in the archive, requested visit timestamp does not exist or the provided path does not exist from the origin root directory

Examples:

https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/torvalds/linux/visit/1493926809/directory/
https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/torvalds/linux/visit/2016-09-14T10:36:21/directory/net/ethernet/
https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/python/cpython/visit/1474620651/directory/
https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/python/cpython/visit/2017-05-05/directory/Python/
https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/python/cpython/visit/2015-08/directory/?branch=refs/heads/2.7
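
The two origin directory URL patterns above differ only by the optional origin type and visit timestamp, so they can be produced by one helper, sketched below in Python. The function name origin_directory_url is hypothetical, and leaving the origin URL and the slashes of branch names unescaped simply mirrors the examples on this page.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://archive.softwareheritage.org"


def origin_directory_url(origin_url: str,
                         origin_type: Optional[str] = None,
                         timestamp: Optional[str] = None,
                         path: Optional[str] = None,
                         **query: str) -> str:
    """Build an origin directory URL (hypothetical helper)."""
    parts = ["/browse/origin/"]
    if origin_type:
        parts.append(f"{origin_type}/url/")
    parts.append(origin_url.rstrip("/") + "/")
    if timestamp:
        parts.append(f"visit/{timestamp}/")
    parts.append("directory/")
    if path:
        parts.append(path.strip("/") + "/")
    url = BASE_URL + "".join(parts)
    if query:
        # Keep "/" readable in values such as refs/heads/2.7.
        url += "?" + urlencode(query, safe="/")
    return url
```

Query parameters such as branch, release, revision or visit_id are passed as keyword arguments, for example origin_directory_url("https://github.com/python/cpython/", branch="refs/heads/2.7").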

Origin content

GET /browse/origin/[(origin_type)/url/](origin_url)/content/(path)/

HTML view that produces a display of a content associated to the latest full visit of a software origin.

If the content to display is textual, it will be highlighted client-side if possible using highlightjs. The procedure to perform that task is described in GET /browse/content/[(algo_hash):](hash)/.

It is also possible to highlight specific lines of a textual content (not in terms of syntax highlighting but to emphasize some relevant content part) by either:

  • clicking on line numbers (holding shift to highlight a range of lines)
  • using a URL fragment in the form ‘#Ln’ or ‘#Lm-Ln’

The view displays a breadcrumb on top of the rendered content in order to easily navigate up to the origin root directory.

The view also makes it easy to switch between the origin branches and releases through a dropdown menu.

The origin branch (defaults to master) from which to retrieve the content can also be specified by using the branch query parameter.

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
  • path (string) – path of a content reachable from the origin root directory
Query Parameters:
 
  • branch (string) – specify the origin branch name from which to retrieve the content
  • release (string) – specify the origin release name from which to retrieve the content
  • revision (string) – specify the origin revision, identified by the hexadecimal representation of its sha1_git value, from which to retrieve the content
  • visit_id (int) – specify a visit id to retrieve the content from instead of using the latest full visit by default
Status Codes:
  • 200 OK – no error
  • 404 Not Found – requested origin can not be found in the archive, or the provided content path does not exist from the origin root directory

Examples:

https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/git/git/content/git.c/
https://archive.softwareheritage.org/browse/origin/https://github.com/mozilla/gecko-dev/content/js/src/json.cpp/
https://archive.softwareheritage.org/browse/origin/https://github.com/git/git/content/git.c/?branch=refs/heads/next

GET /browse/origin/[(origin_type)/url/](origin_url)/visit/(timestamp)/content/(path)/

HTML view that produces a display of a content associated to a visit of a software origin closest to a provided timestamp.

If the content to display is textual, it will be highlighted client-side if possible using highlightjs. The procedure to perform that task is described in GET /browse/content/[(algo_hash):](hash)/.

It is also possible to highlight specific lines of a textual content (not in terms of syntax highlighting but to emphasize some relevant content part) by either:

  • clicking on line numbers (holding shift to highlight a range of lines)
  • using a URL fragment in the form ‘#Ln’ or ‘#Lm-Ln’

The view displays a breadcrumb on top of the rendered content in order to easily navigate up to the origin root directory.

The view also makes it easy to switch between the origin branches and releases through a dropdown menu.

The origin branch (defaults to master) from which to retrieve the content can also be specified by using the branch query parameter.

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
  • timestamp (string) – a date string (any format parsable by dateutil.parser.parse) or Unix timestamp to parse in order to find the closest visit.
  • path (string) – path of a content reachable from the origin root directory
Query Parameters:
 
  • branch (string) – specify the origin branch name from which to retrieve the content
  • release (string) – specify the origin release name from which to retrieve the content
  • revision (string) – specify the origin revision, identified by the hexadecimal representation of its sha1_git value, from which to retrieve the content
  • visit_id (int) – specify a visit id to retrieve the content from instead of using the provided timestamp
Status Codes:
  • 200 OK – no error
  • 404 Not Found – requested origin can not be found in the archive, requested visit timestamp does not exist or the provided content path does not exist from the origin root directory

Examples:

https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/git/git/visit/1473933564/content/git.c/
https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/git/git/visit/2016-05-05T00:00:00+00:00/content/git.c/
https://archive.softwareheritage.org/browse/origin/https://github.com/mozilla/gecko-dev/visit/1490126182/content/js/src/json.cpp/
https://archive.softwareheritage.org/browse/origin/https://github.com/mozilla/gecko-dev/visit/2017-03-21/content/js/src/json.cpp/#L904-L931
https://archive.softwareheritage.org/browse/origin/https://github.com/git/git/visit/2017-09-15/content/git.c/?branch=refs/heads/next

Origin history

GET /browse/origin/[(origin_type)/url/](origin_url)/log/

HTML view that displays the revision history leading to the last revision found during the latest full visit of a software origin. In other words, it shows the commit log associated to the latest full visit of a software origin.

The following data are displayed for each log entry:

  • link to browse the associated revision in the origin context
  • author of the revision
  • date of the revision
  • message associated to the revision
  • commit date of the revision

By default, the revisions are ordered in reverse chronological order of their commit date.

N log entries are displayed per page (default is 100). In order to navigate in a large history, two buttons are present at the bottom of the view:

  • Newer: fetch and display if available the N more recent log entries than the ones currently displayed
  • Older: fetch and display if available the N older log entries than the ones currently displayed
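
The Newer and Older buttons can be thought of as moving the offset query parameter by one page, as in the following Python sketch. The function name page_offsets and the optional total argument are assumptions made for this example; the page does not say how the view detects the end of the history.

```python
from typing import Optional, Tuple


def page_offsets(offset: int, per_page: int = 100,
                 total: Optional[int] = None) -> Tuple[Optional[int], Optional[int]]:
    """Return the offsets for the (Newer, Older) buttons; None disables one."""
    # Newer entries exist only if some revisions were skipped already.
    newer = max(offset - per_page, 0) if offset > 0 else None
    older: Optional[int] = offset + per_page
    # When the history size is known, stop once the last page is reached.
    if total is not None and older >= total:
        older = None
    return newer, older
```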

The view also makes it easy to switch between the origin branches and releases through a dropdown menu.

The origin branch (defaults to master) from which to retrieve the commit log can also be specified by using the branch query parameter.

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
Query Parameters:
 
  • per_page (int) – the number of log entries to display per page
  • offset (int) – the number of revisions to skip before returning those to display
  • revs_ordering (str) – specify the revisions ordering, possible values are committer_date, dfs, dfs_post and bfs
  • branch (string) – specify the origin branch name from which to retrieve the commit log
  • release (string) – specify the origin release name from which to retrieve the commit log
  • revision (string) – specify the origin revision, identified by the hexadecimal representation of its sha1_git value, from which to retrieve the commit log
  • visit_id (int) – specify a visit id to retrieve the history log from instead of using the latest visit by default
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/videolan/vlc/log/
https://archive.softwareheritage.org/browse/origin/https://github.com/Kitware/CMake/log/
https://archive.softwareheritage.org/browse/origin/https://github.com/Kitware/CMake/log/?branch=refs/heads/release

GET /browse/origin/[(origin_type)/url/](origin_url)/visit/(timestamp)/log/

HTML view that displays the revision history leading to the last revision found during the visit of a software origin closest to the provided timestamp. In other words, it shows the commit log associated to the visit of a software origin closest to the provided timestamp.

The following data are displayed for each log entry:

  • author of the revision
  • link to the revision metadata
  • message associated to the revision
  • date of the revision
  • link to browse the associated source tree in the origin context

N log entries are displayed per page (default is 20). In order to navigate in a large history, two buttons are present at the bottom of the view:

  • Newer: fetch and display if available the N more recent log entries than the ones currently displayed
  • Older: fetch and display if available the N older log entries than the ones currently displayed

The view also makes it easy to switch between the origin branches and releases through a dropdown menu.

The origin branch (defaults to master) from which to retrieve the commit log can also be specified by using the branch query parameter.

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
  • timestamp (string) – a date string (any format parsable by dateutil.parser.parse) or Unix timestamp to parse in order to find the closest visit.
Query Parameters:
 
  • revs_breadcrumb (string) – used internally to store the navigation breadcrumbs (i.e. the list of descendant revisions visited so far). It must be a string in the form “(rev_1)[/(rev_2)/…/(rev_n)]” where rev_i corresponds to a revision sha1_git.
  • per_page (int) – the number of log entries to display per page (default is 20, max is 50)
  • branch (string) – specify the origin branch name from which to retrieve the commit log
  • release (string) – specify the origin release name from which to retrieve the commit log
  • revision (string) – specify the origin revision, identified by the hexadecimal representation of its sha1_git value, from which to retrieve the commit log
  • visit_id (int) – specify a visit id to retrieve the history log from instead of using the provided timestamp
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/videolan/vlc/visit/1459651262/log/
https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/Kitware/CMake/visit/2016-04-01/log/
https://archive.softwareheritage.org/browse/origin/https://github.com/Kitware/CMake/visit/1438116814/log/?branch=refs/heads/release
https://archive.softwareheritage.org/browse/origin/https://github.com/Kitware/CMake/visit/2017-05-05T03:14:23/log/?branch=refs/heads/release
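
The format of the revs_breadcrumb query parameter described above, a slash-separated list of revision identifiers, can be checked with a few lines of Python. This is only a sketch: the function name parse_revs_breadcrumb is hypothetical, and it assumes that each sha1_git is written as 40 lowercase hexadecimal digits.

```python
import re
from typing import List

_SHA1_GIT = re.compile(r"[0-9a-f]{40}")


def parse_revs_breadcrumb(value: str) -> List[str]:
    """Split "(rev_1)[/(rev_2)/.../(rev_n)]" into a list of revision ids."""
    revisions = value.strip("/").split("/")
    for revision in revisions:
        if _SHA1_GIT.fullmatch(revision) is None:
            raise ValueError(f"not a sha1_git identifier: {revision!r}")
    return revisions
```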

Origin branches

GET /browse/origin/[(origin_type)/url/](origin_url)/branches/

HTML view that produces a display of the list of branches found during the latest full visit of a software origin.

The following data are displayed for each branch:

  • its name
  • a link to browse the associated directory
  • a link to browse the associated revision
  • last commit message
  • last commit date

That list of branches is paginated, each page displaying a maximum of 100 branches.
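
As a rough illustration of that pagination, the following Python sketch splits a list of branch names into pages of at most 100 entries. The function name paginate is hypothetical and says nothing about how the application implements paging internally.

```python
from typing import List, Sequence


def paginate(items: Sequence[str], page_size: int = 100) -> List[List[str]]:
    """Split items into consecutive pages of at most page_size entries."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return [list(items[start:start + page_size])
            for start in range(0, len(items), page_size)]
```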

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/origin/deb/url/deb://Debian/packages/linux/branches/
https://archive.softwareheritage.org/browse/origin/https://github.com/webpack/webpack/branches/

GET /browse/origin/[(origin_type)/url/](origin_url)/visit/(timestamp)/branches/

HTML view that produces a display of the list of branches found during a visit of a software origin closest to the provided timestamp.

The following data are displayed for each branch:

  • its name
  • a link to browse the associated directory
  • a link to browse the associated revision
  • last commit message
  • last commit date

That list of branches is paginated, each page displaying a maximum of 100 branches.

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
  • timestamp (string) – a date string (any format parsable by dateutil.parser.parse) or Unix timestamp to parse in order to find the closest visit.
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/kripken/emscripten/visit/2017-05-05T12:02:03/branches/
https://archive.softwareheritage.org/browse/origin/deb://Debian/packages/apache2-mod-xforward/visit/2017-11-15T05:15:09/branches/

Origin releases

GET /browse/origin/[(origin_type)/url/](origin_url)/releases/

HTML view that produces a display of the list of releases found during the latest full visit of a software origin.

The following data are displayed for each release:

  • its name
  • a link to browse the release details
  • its target type (revision, directory, content or release)
  • its associated message
  • its date

That list of releases is paginated, each page displaying a maximum of 100 releases.

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/git/git/releases/
https://archive.softwareheritage.org/browse/origin/https://github.com/webpack/webpack/releases/

GET /browse/origin/[(origin_type)/url/](origin_url)/visit/(timestamp)/releases/

HTML view that produces a display of the list of releases found during a visit of a software origin closest to the provided timestamp.

The following data are displayed for each release:

  • its name
  • a link to browse the release details
  • its target type (revision, directory, content or release)
  • its associated message
  • its date

That list of releases is paginated, each page displaying a maximum of 100 releases.

Parameters:
  • origin_type (string) – the type of software origin (possible values are git, svn, hg, deb, pypi, ftp or deposit)
  • origin_url (string) – the url of the origin (e.g. https://github.com/(user)/(repo)/)
  • timestamp (string) – a date string (any format parsable by dateutil.parser.parse) or Unix timestamp to parse in order to find the closest visit.
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/torvalds/linux/visit/2017-11-21T19:37:42/releases/
https://archive.softwareheritage.org/browse/origin/https://github.com/Kitware/CMake/visit/2016-09-23T14:06:35/releases/

Person

GET /browse/person/(person_id)/

HTML view that displays information regarding a person.

Parameters:
  • person_id (int) – the id of a person
Status Codes:

Release

GET /browse/release/(sha1_git)/

HTML view that displays metadata associated to a release:

  • the author
  • the release date
  • the release name
  • the associated message
  • the type of target the release points to (revision, directory, content or release)
  • the link to browse the release target

Parameters:
  • sha1_git (string) – hexadecimal representation for the sha1_git identifier of a release
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/release/208f61cc7a5dbc9879ae6e5c2f95891e270f09ef/
https://archive.softwareheritage.org/browse/release/f883596e997fe5bcbc5e89bee01b869721326109/

Revision

GET /browse/revision/(sha1_git)/

HTML view to browse a revision. It notably shows the revision date and message but also offers links to get more details on:

  • its author
  • its parent revisions
  • the history log reachable from it

The view also lets you navigate the source tree associated to the revision and browse its content.

Last but not least, the view displays the list of file changes introduced in the revision, along with the diff of each changed file.

Parameters:
  • sha1_git (string) – hexadecimal representation for the sha1_git identifier of a revision
Query Parameters:
 
  • origin_type (string) – used internally to associate a software origin type (possible values are git, svn, hg, deb, pypi, ftp or deposit) to the revision
  • origin_url (string) – used internally to associate an origin url (e.g. https://github.com/user/repo) to the revision
  • timestamp (string) – used internally to associate an origin visit to the revision, must be a date string (any format parsable by dateutil.parser.parse) or Unix timestamp to parse in order to find the closest visit.
  • visit_id (int) – used internally to specify a visit id instead of using the provided timestamp
  • path (string) – used internally when navigating in the source tree associated to the revision
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/revision/f1b94134a4b879bc55c3dacdb496690c8ebdc03f/
https://archive.softwareheritage.org/browse/revision/d1aa2b3f607b35dc5dbf613b2334b6d243ec2bda/

GET /browse/revision/(sha1_git)/log/

HTML view that displays the list of revisions leading to a given one. In other words, it shows a commit log. The following data are displayed for each log entry:

  • link to browse the revision
  • author of the revision
  • date of the revision
  • message associated to the revision
  • commit date of the revision

By default, the revisions are ordered in reverse chronological order of their commit date.

N log entries are displayed per page (default is 100). In order to navigate in a large history, two buttons are present at the bottom of the view:

  • Newer: fetch and display, if available, the N log entries that are more recent than the ones currently displayed
  • Older: fetch and display, if available, the N log entries that are older than the ones currently displayed

Parameters:
  • sha1_git (string) – hexadecimal representation for the sha1_git identifier of a revision
Query Parameters:
  • per_page (int) – the number of log entries to display per page
  • offset (int) – the number of revisions to skip before returning those to display
  • revs_ordering (str) – specify the revisions ordering, possible values are committer_date, dfs, dfs_post and bfs
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/revision/f1b94134a4b879bc55c3dacdb496690c8ebdc03f/log/
https://archive.softwareheritage.org/browse/revision/d1aa2b3f607b35dc5dbf613b2334b6d243ec2bda/log/
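
The pagination described above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the way the Newer and Older buttons map to the offset parameter is an assumption derived from the parameter descriptions, not from the swh-web code.

```python
from urllib.parse import urlencode

BROWSE_ROOT = "https://archive.softwareheritage.org/browse"
# Orderings listed for the revs_ordering query parameter.
REVS_ORDERINGS = ("committer_date", "dfs", "dfs_post", "bfs")


def revision_log_url(sha1_git, per_page=100, offset=0, revs_ordering=None):
    """Return the URL of one page of a revision log (hypothetical helper)."""
    params = {"per_page": per_page, "offset": offset}
    if revs_ordering is not None:
        if revs_ordering not in REVS_ORDERINGS:
            raise ValueError(f"unknown revisions ordering: {revs_ordering!r}")
        params["revs_ordering"] = revs_ordering
    return f"{BROWSE_ROOT}/revision/{sha1_git}/log/?{urlencode(params)}"


def older_offset(offset, per_page):
    """Assumed behaviour of Older: skip one more page of revisions."""
    return offset + per_page


def newer_offset(offset, per_page):
    """Assumed behaviour of Newer: skip one page less, never below zero."""
    return max(offset - per_page, 0)


rev = "d1aa2b3f607b35dc5dbf613b2334b6d243ec2bda"
print(revision_log_url(rev, offset=older_offset(0, 100)))
```

Keeping the offset arithmetic in separate functions makes it easy to adjust if the actual buttons behave differently.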

Snapshot

GET /browse/snapshot/(snapshot_id)/

HTML view that displays the content of a snapshot from its identifier (see swh.model.identifiers.snapshot_identifier() in our data model module for details about how such identifiers are computed).

A snapshot is a set of named branches, which are pointers to objects at any level of the Software Heritage DAG. It represents a full picture of an origin at a given time. Thus, multiple visits of different origins can point to the same snapshot (for instance, when several projects are forks of a common one).

Currently, this endpoint simply redirects to GET /browse/snapshot/(snapshot_id)/directory/[(path)/] in order to display the root directory associated with the default snapshot branch (usually master).

Parameters:
  • snapshot_id (string) – hexadecimal representation of the snapshot sha1 identifier
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/snapshot/baebc2109e4a2ec22a1129a3859647e191d04df4/
https://archive.softwareheritage.org/browse/snapshot/673156c31a876c5b99b2fe3e89615529de9a3c44/
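
To illustrate how visits of different origins can point to the same snapshot, here is a minimal sketch. The `Visit` class, its field names, the origin URLs and the snapshot identifier are all hypothetical; they do not reflect the actual swh.model data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    """One crawl of an origin; all field names are illustrative."""
    origin_url: str
    date: str
    snapshot_id: str


def origins_sharing_snapshot(visits, snapshot_id):
    """Return the sorted origin URLs with a visit pointing to snapshot_id."""
    return sorted({v.origin_url for v in visits if v.snapshot_id == snapshot_id})


# Hypothetical data: a project and its fork observed in the same state.
SNAPSHOT = "a" * 40
visits = [
    Visit("https://example.org/project.git", "2018-01-01", SNAPSHOT),
    Visit("https://example.org/fork.git", "2018-02-01", SNAPSHOT),
]
print(origins_sharing_snapshot(visits, SNAPSHOT))
```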

Snapshot directory

GET /browse/snapshot/(snapshot_id)/directory/[(path)/]

HTML view that displays the content of a directory reachable from a snapshot.

The features offered by the view are similar to those for browsing a directory in an origin context (see GET /browse/origin/[(origin_type)/url/](origin_url)/directory/[(path)/]).

Parameters:
  • snapshot_id (string) – hexadecimal representation of the snapshot sha1 identifier
  • path (string) – optional parameter used to specify the path of a directory reachable from the snapshot root directory
Query Parameters:
  • branch (string) – specify the snapshot branch name from which to retrieve the root directory
  • release (string) – specify the snapshot release name from which to retrieve the root directory
  • revision (string) – specify the snapshot revision, identified by the hexadecimal representation of its sha1_git value, from which to retrieve the root directory
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/snapshot/baebc2109e4a2ec22a1129a3859647e191d04df4/directory/drivers/gpu/
https://archive.softwareheritage.org/browse/snapshot/673156c31a876c5b99b2fe3e89615529de9a3c44/directory/src/opengl/
https://archive.softwareheritage.org/browse/snapshot/673156c31a876c5b99b2fe3e89615529de9a3c44/log/?release=v5.7.0
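
A snapshot directory URL can be assembled from the path and query parameters listed above, as in the following sketch. The helper name is hypothetical, and whether branch, release and revision may be combined in a single request is not specified here, so the sketch simply forwards whichever ones are given.

```python
from urllib.parse import quote, urlencode

BROWSE_ROOT = "https://archive.softwareheritage.org/browse"


def snapshot_directory_url(snapshot_id, path="", branch=None, release=None,
                           revision=None):
    """Return the browse URL of a directory in a snapshot (hypothetical helper)."""
    url = f"{BROWSE_ROOT}/snapshot/{snapshot_id}/directory/"
    if path:
        # quote() leaves "/" untouched by default, so sub-paths are preserved.
        url += quote(path.strip("/")) + "/"
    selectors = (("branch", branch), ("release", release), ("revision", revision))
    params = {name: value for name, value in selectors if value is not None}
    if params:
        # Keep "/" readable in values such as refs/heads/<name>.
        url += "?" + urlencode(params, safe="/")
    return url


print(snapshot_directory_url("baebc2109e4a2ec22a1129a3859647e191d04df4",
                             "drivers/gpu"))
```

With only a snapshot identifier and a path, the helper reproduces the first example URL of this section.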

Snapshot content

GET /browse/snapshot/(snapshot_id)/content/(path)/

HTML view that displays a content reachable from a snapshot.

The features offered by the view are similar to those for browsing a content in an origin context (see GET /browse/origin/[(origin_type)/url/](origin_url)/content/(path)/).

Parameters:
  • snapshot_id (string) – hexadecimal representation of the snapshot sha1 identifier
  • path (string) – path of a content reachable from the snapshot root directory
Query Parameters:
  • branch (string) – specify the snapshot branch name from which to retrieve the content
  • release (string) – specify the snapshot release name from which to retrieve the content
  • revision (string) – specify the snapshot revision, identified by the hexadecimal representation of its sha1_git value, from which to retrieve the content
Status Codes:
  • 200 OK – no error
  • 400 Bad Request – an invalid snapshot identifier has been provided
  • 404 Not Found – requested snapshot cannot be found in the archive, or the provided content path does not exist under the root directory

Examples:

https://archive.softwareheritage.org/browse/snapshot/baebc2109e4a2ec22a1129a3859647e191d04df4/content/init/initramfs.c
https://archive.softwareheritage.org/browse/snapshot/673156c31a876c5b99b2fe3e89615529de9a3c44/content/src/opengl/qglbuffer.h/
https://archive.softwareheritage.org/browse/snapshot/673156c31a876c5b99b2fe3e89615529de9a3c44/content/src/opengl/qglbuffer.h/?release=v5.0.0
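
A client fetching this view may want to turn the status codes listed above into readable messages. The sketch below does so with the standard `http.HTTPStatus` enumeration; the function name and the fallback message are assumptions made for this example.

```python
from http import HTTPStatus

# Meanings documented for the snapshot content view.
STATUS_MEANINGS = {
    HTTPStatus.OK: "no error",
    HTTPStatus.BAD_REQUEST: "an invalid snapshot identifier has been provided",
    HTTPStatus.NOT_FOUND: "snapshot or content path not found in the archive",
}


def describe_status(code):
    """Map an HTTP status code to its documented meaning (hypothetical helper)."""
    try:
        return STATUS_MEANINGS[HTTPStatus(code)]
    except (ValueError, KeyError):
        # ValueError: not a known HTTP status; KeyError: not documented above.
        return f"unexpected status {code}"


print(describe_status(404))
```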

Snapshot history

GET /browse/snapshot/(snapshot_id)/log/

HTML view that displays the revision history (a.k.a. the commit log) leading to the last revision collected in a snapshot.

The features offered by the view are similar to those for browsing the history in an origin context (see GET /browse/origin/[(origin_type)/url/](origin_url)/log/).

Parameters:
  • snapshot_id (string) – hexadecimal representation of the snapshot sha1 identifier
Query Parameters:
  • revs_breadcrumb (string) – used internally to store the navigation breadcrumbs (i.e. the list of descendant revisions visited so far). It must be a string in the form “(rev_1)[/(rev_2)/…/(rev_n)]” where rev_i corresponds to a revision sha1_git.
  • per_page (int) – the number of log entries to display per page (default is 20, max is 50)
  • branch (string) – specify the snapshot branch name from which to retrieve the commit log
  • release (string) – specify the snapshot release name from which to retrieve the commit log
  • revision (string) – specify the snapshot revision, identified by the hexadecimal representation of its sha1_git value, from which to retrieve the commit log
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/snapshot/a274b44111f777209556e94920b7e71cf5c305cd/log/
https://archive.softwareheritage.org/browse/snapshot/9ca9e75279df5f4e3fee19bf5190ed672dcdfb33/log/?branch=refs/heads/emacs-unicode
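
The revs_breadcrumb format described above, “(rev_1)[/(rev_2)/…/(rev_n)]”, can be parsed and rebuilt as sketched below. Both helper names are hypothetical, and the strict 40-character hexadecimal check on each rev_i is an assumption based on the length of a sha1 digest.

```python
import re

# A sha1_git is a sha1 digest, i.e. 40 hexadecimal characters.
SHA1_HEX = re.compile(r"[0-9a-fA-F]{40}")


def parse_revs_breadcrumb(value):
    """Split a breadcrumb string into its list of revision identifiers."""
    revs = value.split("/")
    for rev in revs:
        if not SHA1_HEX.fullmatch(rev):
            raise ValueError(f"invalid revision sha1_git in breadcrumb: {rev!r}")
    return revs


def format_revs_breadcrumb(revs):
    """Join revision identifiers back into a breadcrumb string."""
    return "/".join(revs)


crumb = ("f1b94134a4b879bc55c3dacdb496690c8ebdc03f/"
         "d1aa2b3f607b35dc5dbf613b2334b6d243ec2bda")
print(parse_revs_breadcrumb(crumb))
```

Parsing followed by formatting returns the original string, which makes the pair convenient for appending a newly visited revision.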

Snapshot branches

GET /browse/snapshot/(snapshot_id)/branches/

HTML view that displays the list of branches collected in a snapshot.

The features offered by the view are similar to those for browsing the list of branches in an origin context (see GET /browse/origin/[(origin_type)/url/](origin_url)/branches/).

Parameters:
  • snapshot_id (string) – hexadecimal representation of the snapshot sha1 identifier
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/snapshot/03d7897352541e78ee7b13a580dc836778e8126a/branches/
https://archive.softwareheritage.org/browse/snapshot/f37563b953327f8fd83e39af6ebb929ef85103d5/branches/

Snapshot releases

GET /browse/snapshot/(snapshot_id)/releases/

HTML view that displays the list of releases collected in a snapshot.

The features offered by the view are similar to those for browsing the list of releases in an origin context (see GET /browse/origin/[(origin_type)/url/](origin_url)/releases/).

Parameters:
  • snapshot_id (string) – hexadecimal representation of the snapshot sha1 identifier
Status Codes:

Examples:

https://archive.softwareheritage.org/browse/snapshot/673156c31a876c5b99b2fe3e89615529de9a3c44/releases/
https://archive.softwareheritage.org/browse/snapshot/23e6fb084a60cc909b9e222d80d89fdb98756dee/releases/