Origin

GET /api/1/origins/

Get list of archived software origins.

Warning

This endpoint used to provide an origin_from query parameter, and guarantee an order on results. This is no longer true, and only the Link header should be used for paginating through results.

Query Parameters:
  • origin_count (int) – the maximum number of origins to return (defaults to 100, cannot exceed 10000)

Response JSON Array of Objects:
  • origin_visits_url (string) – link to GET /api/1/origin/(origin_url)/visits/ in order to get information about the visits for that origin

  • url (string) – the origin canonical url

  • metadata_authorities_url (string) – link to GET /api/1/raw-extrinsic-metadata/swhid/(target)/authorities/ to get the list of metadata authorities providing extrinsic metadata on this origin (and, indirectly, to the origin’s extrinsic metadata itself)

  • has_visits (boolean) – indicates if Software Heritage made at least one full visit of the origin

Request Headers:
  • Accept – the requested response content type, either application/json (default) or application/yaml

Response Headers:
  • Content-Type – depends on the Accept header of the request

  • Link – indicates that a subsequent result page is available and contains the url pointing to it

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origins?origin_count=500
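
Since results are no longer ordered, a client can only walk the full list by following the Link response header until it is absent. The following is a minimal sketch of that loop using the Python standard library. The helper names parse_next_link and iter_origins are illustrative and not part of the API, and the sketch assumes the header uses the usual `<url>; rel="next"` form.

```python
import json
import re
import urllib.request
from urllib.parse import urljoin

# Host taken from the example URL above.
API_ROOT = "https://archive.softwareheritage.org"

# Matches one `<url>; rel="next"` entry of a Link header (assumed format).
_NEXT_LINK = re.compile(r'<([^>]+)>\s*;[^,]*\brel="?next"?')


def parse_next_link(link_header):
    """Return the URL of the next page, or None when no next page is advertised."""
    if not link_header:
        return None
    match = _NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def iter_origins(origin_count=100):
    """Yield origin objects page by page, relying only on the Link header."""
    url = f"{API_ROOT}/api/1/origins/?origin_count={origin_count}"
    while url:
        request = urllib.request.Request(url, headers={"Accept": "application/json"})
        with urllib.request.urlopen(request) as response:
            yield from json.load(response)
            next_url = parse_next_link(response.headers.get("Link"))
            # urljoin keeps absolute URLs as-is and resolves relative ones.
            url = urljoin(url, next_url) if next_url else None
```

Iterating over `iter_origins(500)` would then request pages of up to 500 origins each and stop once a response carries no next link.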

GET /api/1/origin/(origin_url)/get/

Get information about a software origin.

Parameters:
  • origin_url (string) – the origin url

Response JSON Object:
  • origin_visits_url (string) – link to GET /api/1/origin/(origin_url)/visits/ in order to get information about the visits for that origin

  • url (string) – the origin canonical url

  • metadata_authorities_url (string) – link to GET /api/1/raw-extrinsic-metadata/swhid/(target)/authorities/ to get the list of metadata authorities providing extrinsic metadata on this origin (and, indirectly, to the origin’s extrinsic metadata itself)

  • has_visits (boolean) – indicates if Software Heritage made at least one full visit of the origin

  • visit_types (array) – set of visit types for that origin

Request Headers:
  • Accept – the requested response content type, either application/json (default) or application/yaml

Response Headers:
Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origin/https://github.com/python/cpython/get/
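
As a rough illustration of how the Accept request header selects the response format, the sketch below builds a request for this endpoint without sending it. The function name build_origin_request is hypothetical. The origin URL is placed in the path verbatim, as in the example above; whether origin URLs containing special characters need escaping is not covered here.

```python
import urllib.request

# Host taken from the example URL above.
API_ROOT = "https://archive.softwareheritage.org"

# The two content types listed under "Request Headers".
SUPPORTED_TYPES = ("application/json", "application/yaml")


def build_origin_request(origin_url, accept="application/json"):
    """Build (but do not send) a request for information about one origin."""
    if accept not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported content type: {accept}")
    url = f"{API_ROOT}/api/1/origin/{origin_url}/get/"
    return urllib.request.Request(url, headers={"Accept": accept})
```

Passing the returned object to `urllib.request.urlopen` would perform the call; with `accept="application/yaml"` the body would need a YAML parser rather than `json`.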

GET /api/1/origin/search/(url_pattern)/

Search for software origins whose URLs contain a provided string pattern or match a provided regular expression. The search is case-insensitive.

Warning

This endpoint used to provide an offset query parameter, and guarantee an order on results. This is no longer true, and only the Link header should be used for paginating through results.

Parameters:
  • url_pattern (string) – a string pattern

Query Parameters:
  • use_ql (boolean) – whether to use swh search query language or not

  • limit (int) – the maximum number of found origins to return (bounded to 1000)

  • with_visit (boolean) – if true, only return origins with at least one visit by Software Heritage

  • visit_type (string) – if provided, only return origins with that specific visit type (currently the supported types are ???)

Response JSON Array of Objects:
  • origin_visits_url (string) – link to GET /api/1/origin/(origin_url)/visits/ in order to get information about the visits for that origin

  • url (string) – the origin canonical url

  • metadata_authorities_url (string) – link to GET /api/1/raw-extrinsic-metadata/swhid/(target)/authorities/ to get the list of metadata authorities providing extrinsic metadata on this origin (and, indirectly, to the origin’s extrinsic metadata itself)

  • has_visits (boolean) – indicates if Software Heritage made at least one full visit of the origin

Request Headers:
  • Accept – the requested response content type, either application/json (default) or application/yaml

Response Headers:
  • Content-Type – depends on the Accept header of the request

  • Link – indicates that a subsequent result page is available and contains the url pointing to it

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origin/search/python/?limit=2
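
The query parameters above can be combined into a request URL along these lines. This is a sketch under assumptions: the function name build_search_url is made up for illustration, booleans are serialized as "true"/"false", and the pattern is percent-encoded before being placed in the path. Neither serialization choice is confirmed by this page.

```python
from urllib.parse import quote, urlencode

# Host taken from the example URL above.
API_ROOT = "https://archive.softwareheritage.org"


def build_search_url(url_pattern, limit=None, with_visit=None,
                     visit_type=None, use_ql=None):
    """Build the URL of an origin search; unset parameters are left out."""
    params = {}
    if limit is not None:
        params["limit"] = limit
    if with_visit is not None:
        # Assumption: booleans are sent as lowercase "true"/"false".
        params["with_visit"] = "true" if with_visit else "false"
    if visit_type is not None:
        params["visit_type"] = visit_type
    if use_ql is not None:
        params["use_ql"] = "true" if use_ql else "false"
    # Assumption: the pattern is percent-encoded so it stays one path segment.
    path = f"/api/1/origin/search/{quote(url_pattern, safe='')}/"
    query = urlencode(params)
    return f"{API_ROOT}{path}?{query}" if query else f"{API_ROOT}{path}"
```

Calling `build_search_url("python", limit=2)` reproduces the example URL above. Further result pages should still be reached through the Link header rather than by building URLs by hand.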

GET /api/1/origin/(origin_url)/visits/

Get information about all visits of a software origin. Visits are returned sorted in descending order according to their date.

Parameters:
  • origin_url (str) – a software origin URL

Query Parameters:
  • per_page (int) – specify the number of visits to list, for pagination purposes

  • last_visit (int) – visit to start listing from, for pagination purposes

Request Headers:
  • Accept – the requested response content type, either application/json (default) or application/yaml

Response Headers:
  • Content-Type – depends on the Accept header of the request

  • Link – indicates that a subsequent result page is available and contains the url pointing to it

Response JSON Array of Objects:
  • date (string) – ISO8601/RFC3339 representation of the visit date (in UTC)

  • origin (str) – the origin canonical url

  • origin_url (string) – link to get information about the origin

  • snapshot (string) – the snapshot identifier of the visit (may be null if status is not full).

  • snapshot_url (string) – link to GET /api/1/snapshot/(snapshot_id)/ in order to get information about the snapshot of the visit (may be null if status is not full).

  • status (string) – status of the visit (either full, partial or ongoing)

  • type (string) – visit type for the origin

  • visit (number) – the unique identifier of the visit

  • id (number) – the unique identifier of the origin

  • origin_visit_url (string) – link to GET /api/1/origin/(origin_url)/visit/(visit_id)/ in order to get information about the visit

Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origin/https://github.com/hylang/hy/visits/
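
Because snapshot may be null when status is not full, a client usually filters the returned array before using it. The sketch below picks the most recent full visit that has a snapshot from an already decoded response. The function name latest_full_visit and the sample records are made up for illustration; only the field names come from the list above.

```python
from datetime import datetime


def latest_full_visit(visits):
    """Return the most recent visit with status "full" and a snapshot, or None."""
    candidates = [
        visit for visit in visits
        if visit.get("status") == "full" and visit.get("snapshot")
    ]
    # Comparing parsed dates keeps this correct even if several pages
    # of results were concatenated in arbitrary order.
    return max(
        candidates,
        key=lambda visit: datetime.fromisoformat(visit["date"]),
        default=None,
    )


# Made-up records using the documented field names; the values are placeholders.
sample_visits = [
    {"visit": 3, "status": "ongoing", "snapshot": None,
     "date": "2021-03-01T10:00:00+00:00"},
    {"visit": 2, "status": "full", "snapshot": "placeholder-snapshot-id-2",
     "date": "2021-02-01T10:00:00+00:00"},
    {"visit": 1, "status": "full", "snapshot": "placeholder-snapshot-id-1",
     "date": "2021-01-01T10:00:00+00:00"},
]

best = latest_full_visit(sample_visits)  # the record whose "visit" is 2
```

The sample dates use an explicit "+00:00" offset; a date ending in "Z" would need to be normalized first on Python versions before 3.11, where `datetime.fromisoformat` rejects that suffix.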

GET /api/1/origin/(origin_url)/visit/(visit_id)/

Get information about a specific visit of a software origin.

Parameters:
  • origin_url (str) – a software origin URL

  • visit_id (int) – a visit identifier

Request Headers:
  • Accept – the requested response content type, either application/json (default) or application/yaml

Response Headers:
Response JSON Object:
  • date (string) – ISO8601/RFC3339 representation of the visit date (in UTC)

  • origin (str) – the origin canonical url

  • origin_url (string) – link to get information about the origin

  • snapshot (string) – the snapshot identifier of the visit (may be null if status is not full).

  • snapshot_url (string) – link to GET /api/1/snapshot/(snapshot_id)/ in order to get information about the snapshot of the visit (may be null if status is not full).

  • status (string) – status of the visit (either full, partial or ongoing)

  • type (string) – visit type for the origin

  • visit (number) – the unique identifier of the visit

Status Codes:
  • 200 OK – no error

  • 404 Not Found – requested origin or visit cannot be found in the archive

Example:

https://archive.softwareheritage.org/api/1/origin/https://github.com/hylang/hy/visit/1/

GET /api/1/origin/(origin_url)/visit/latest/

Get information about the latest visit of a software origin.

Parameters:
  • origin_url (str) – a software origin URL

Query Parameters:
  • require_snapshot (boolean) – if true, only return a visit with a snapshot

  • visit_type (str) – if provided, filter visits by type

Request Headers:
  • Accept – the requested response content type, either application/json (default) or application/yaml

Response Headers:
Response JSON Object:
  • date (string) – ISO8601/RFC3339 representation of the visit date (in UTC)

  • origin (str) – the origin canonical url

  • origin_url (string) – link to get information about the origin

  • snapshot (string) – the snapshot identifier of the visit (may be null if status is not full).

  • snapshot_url (string) – link to GET /api/1/snapshot/(snapshot_id)/ in order to get information about the snapshot of the visit (may be null if status is not full).

  • status (string) – status of the visit (either full, partial or ongoing)

  • type (string) – visit type for the origin

  • visit (number) – the unique identifier of the visit

Status Codes:
  • 200 OK – no error

  • 404 Not Found – requested origin or visit cannot be found in the archive

Example:

https://archive.softwareheritage.org/api/1/origin/https://github.com/hylang/hy/visit/latest/
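
The two status codes above suggest treating 404 as "nothing found" rather than as a failure. The following is a hedged sketch of that pattern with the Python standard library. The names latest_visit_url and fetch_latest_visit, the opener parameter (there only so the network call can be swapped out), and the "true" serialization of require_snapshot are assumptions, not part of the API.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import urlencode

# Host taken from the example URL above.
API_ROOT = "https://archive.softwareheritage.org"


def latest_visit_url(origin_url, require_snapshot=False, visit_type=None):
    """Build the URL of the latest-visit endpoint with optional filters."""
    params = {}
    if require_snapshot:
        params["require_snapshot"] = "true"  # assumed boolean serialization
    if visit_type is not None:
        params["visit_type"] = visit_type
    url = f"{API_ROOT}/api/1/origin/{origin_url}/visit/latest/"
    return f"{url}?{urlencode(params)}" if params else url


def fetch_latest_visit(origin_url, opener=urllib.request.urlopen, **filters):
    """Return the latest visit as a dict, or None on 404 Not Found."""
    request = urllib.request.Request(
        latest_visit_url(origin_url, **filters),
        headers={"Accept": "application/json"},
    )
    try:
        with opener(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None  # origin or visit not found in the archive
        raise  # any other HTTP error is left to the caller
```

A call such as `fetch_latest_visit("https://github.com/hylang/hy", require_snapshot=True)` would then return either a visit object with the fields listed above or None.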

GET /api/1/origin/metadata-search/

Search for software origins whose metadata (expressed as a JSON-LD/CodeMeta dictionary) match the provided criteria. For now, only full-text search on this dictionary is supported.

Query Parameters:
  • fulltext (str) – a string that will be matched against origin metadata; results are ranked and ordered starting with the best ones.

  • limit (int) – the maximum number of found origins to return (bounded to 100)

Response JSON Array of Objects:
  • origin_visits_url (string) – link to GET /api/1/origin/(origin_url)/visits/ in order to get information about the visits for that origin

  • url (string) – the origin canonical url

  • metadata_authorities_url (string) – link to GET /api/1/raw-extrinsic-metadata/swhid/(target)/authorities/ to get the list of metadata authorities providing extrinsic metadata on this origin (and, indirectly, to the origin’s extrinsic metadata itself)

  • has_visits (boolean) – indicates if Software Heritage made at least one full visit of the origin

Request Headers:
  • Accept – the requested response content type, either application/json (default) or application/yaml

Response Headers:
Status Codes:

Example:

https://archive.softwareheritage.org/api/1/origin/metadata-search/?limit=2&fulltext=node-red-nodegen

GET /api/1/intrinsic-metadata/origin/

Get intrinsic metadata of a software origin (as a JSON-LD/CodeMeta dictionary).

Query Parameters:
  • origin_url (string) – the URL of the origin

Response JSON Array of Objects:
  • ??? (???) – intrinsic metadata field of the origin

Request Headers:
  • Accept – the requested response content type, either application/json (default) or application/yaml

Response Headers:
Status Codes:

Example:

https://archive.softwareheritage.org/api/1/intrinsic-metadata/origin/?origin_url=https://github.com/node-red/node-red-nodegen

GET /api/1/extrinsic-metadata/origin/

Get extrinsic metadata of a software origin (as a JSON-LD/CodeMeta dictionary).

Query Parameters:
  • origin_url (str) – the URL of the origin

Response JSON Array of Objects:
  • ??? (???) – extrinsic metadata field of the origin

Request Headers:
  • Accept – the requested response content type, either application/json (default) or application/yaml

Response Headers:
Status Codes:

Example:

https://archive.softwareheritage.org/api/1/extrinsic-metadata/origin/?origin_url=https://github.com/node-red/node-red-nodegen