Origin
- GET /api/1/origins/
Get list of archived software origins.
Warning
This endpoint used to provide an origin_from query parameter and to guarantee an order on results. This is no longer the case: use only the Link header to paginate through results.
- Query Parameters:
origin_count (int) – the maximum number of origins to return (defaults to 100, cannot exceed 10000)
- Response JSON Array of Objects:
origin_visits_url (string) – link to GET /api/1/origin/(origin_url)/visits/ in order to get information about the visits for that origin
url (string) – the origin canonical url
metadata_authorities_url (string) – link to GET /api/1/raw-extrinsic-metadata/swhid/(target)/authorities/ to get the list of metadata authorities providing extrinsic metadata on this origin (and, indirectly, to the origin’s extrinsic metadata itself)
has_visits (boolean) – indicates if Software Heritage made at least one full visit of the origin
- Request Headers:
Accept – the requested response content type, either application/json (default) or application/yaml
- Response Headers:
Content-Type – depends on the Accept header of the request
Link – indicates that a subsequent result page is available and contains the url pointing to it
- Status Codes:
200 OK – no error
Example:
https://archive.softwareheritage.org/api/1/origins?origin_count=500
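Paginating through the Link header, as the warning above requires, could look like the sketch below. The names `next_page_url`, `iter_origins`, and `fetch` are hypothetical and not part of the API. `fetch` stands for any callable that performs the HTTP request and returns the decoded JSON array together with the raw Link header value (or None). The sketch assumes the following page is tagged rel="next", which is the usual Link header convention but is not stated in this reference.

```python
import re
from typing import Callable, Iterator, List, Optional, Tuple

# Matches one Link header entry, e.g. <https://example.org/page2>; rel="next"
_LINK_ENTRY = re.compile(r'<([^>]+)>\s*;\s*rel="?([^",;]+)"?')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next" in a Link header, or None."""
    if not link_header:
        return None
    for url, rel in _LINK_ENTRY.findall(link_header):
        if rel.strip() == "next":
            return url
    return None


def iter_origins(
    first_url: str,
    fetch: Callable[[str], Tuple[List[dict], Optional[str]]],
) -> Iterator[dict]:
    """Yield origins page by page until no further page is advertised.

    `fetch` is a hypothetical helper: it takes a URL and returns
    (list of origin objects, Link header value or None).
    """
    url: Optional[str] = first_url
    while url:
        origins, link_header = fetch(url)
        yield from origins
        url = next_page_url(link_header)
```

Keeping the HTTP call behind `fetch` lets the pagination logic be exercised without network access. Because result order is no longer guaranteed, the sketch never computes an offset itself and relies only on the advertised URL.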
- GET /api/1/origin/(origin_url)/get/
Get information about a software origin.
- Parameters:
origin_url (string) – the origin url
- Response JSON Object:
origin_visits_url (string) – link to GET /api/1/origin/(origin_url)/visits/ in order to get information about the visits for that origin
url (string) – the origin canonical url
metadata_authorities_url (string) – link to GET /api/1/raw-extrinsic-metadata/swhid/(target)/authorities/ to get the list of metadata authorities providing extrinsic metadata on this origin (and, indirectly, to the origin’s extrinsic metadata itself)
has_visits (boolean) – indicates if Software Heritage made at least one full visit of the origin
visit_types (array) – set of visit types for that origin
- Request Headers:
Accept – the requested response content type, either application/json (default) or application/yaml
- Response Headers:
Content-Type – depends on the Accept header of the request
- Status Codes:
200 OK – no error
404 Not Found – requested origin cannot be found in the archive
Example:
https://archive.softwareheritage.org/api/1/origin/https://github.com/python/cpython/get/
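Several endpoints in this section embed the origin URL directly in the request path. A minimal sketch of a URL builder follows. `API_ROOT` and `origin_endpoint` are hypothetical names introduced for illustration. The origin URL is inserted verbatim, mirroring the example above; whether origin URLs containing characters such as `?` or `#` need percent-encoding is not covered by this reference, so treat that as an open assumption.

```python
API_ROOT = "https://archive.softwareheritage.org/api/1"


def origin_endpoint(origin_url: str, suffix: str) -> str:
    """Build the URL of an origin-scoped endpoint.

    `suffix` is the path after the origin URL, e.g. "get", "visits",
    "visit/1" or "visit/latest". The origin URL is inserted as-is,
    as in the documented examples (assumption: no escaping needed).
    """
    return f"{API_ROOT}/origin/{origin_url}/{suffix.strip('/')}/"


if __name__ == "__main__":
    print(origin_endpoint("https://github.com/python/cpython", "get"))
```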
- GET /api/1/origin/search/(url_pattern)/
Search for software origins whose URLs contain a provided string pattern or match a provided regular expression. The search is case-insensitive.
Warning
This endpoint used to provide an offset query parameter and to guarantee an order on results. This is no longer the case: use only the Link header to paginate through results.
- Parameters:
url_pattern (string) – a string pattern
- Query Parameters:
use_ql (boolean) – whether to use swh search query language or not
limit (int) – the maximum number of found origins to return (bounded to 1000)
with_visit (boolean) – if true, only return origins with at least one visit by Software Heritage
visit_type (string) – if provided, only return origins with that specific visit type (currently the supported types are ???)
- Response JSON Array of Objects:
origin_visits_url (string) – link to GET /api/1/origin/(origin_url)/visits/ in order to get information about the visits for that origin
url (string) – the origin canonical url
metadata_authorities_url (string) – link to GET /api/1/raw-extrinsic-metadata/swhid/(target)/authorities/ to get the list of metadata authorities providing extrinsic metadata on this origin (and, indirectly, to the origin’s extrinsic metadata itself)
has_visits (boolean) – indicates if Software Heritage made at least one full visit of the origin
- Request Headers:
Accept – the requested response content type, either application/json (default) or application/yaml
- Response Headers:
Content-Type – depends on the Accept header of the request
Link – indicates that a subsequent result page is available and contains the url pointing to it
- Status Codes:
200 OK – no error
Example:
https://archive.softwareheritage.org/api/1/origin/search/python/?limit=2
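The query parameters listed above can be assembled with the standard library. The sketch below is illustrative only: `API_ROOT` and `search_url` are hypothetical names, serializing booleans as the strings "true" and "false" is an assumption, and so is percent-encoding the pattern before placing it in the path.

```python
from typing import Optional
from urllib.parse import quote, urlencode

API_ROOT = "https://archive.softwareheritage.org/api/1"


def search_url(
    url_pattern: str,
    *,
    limit: Optional[int] = None,
    with_visit: Optional[bool] = None,
    visit_type: Optional[str] = None,
    use_ql: Optional[bool] = None,
) -> str:
    """Build an origin search URL, adding only the parameters that were set."""
    params = {}
    if limit is not None:
        params["limit"] = limit
    if with_visit is not None:
        # Assumption: booleans are passed as lowercase strings.
        params["with_visit"] = "true" if with_visit else "false"
    if visit_type is not None:
        params["visit_type"] = visit_type
    if use_ql is not None:
        params["use_ql"] = "true" if use_ql else "false"
    # Assumption: the pattern is percent-encoded so that characters such
    # as "/" or spaces cannot alter the path structure.
    path = f"{API_ROOT}/origin/search/{quote(url_pattern, safe='')}/"
    query = urlencode(params)
    return f"{path}?{query}" if query else path


if __name__ == "__main__":
    print(search_url("python", limit=2))
```

Further result pages should still be taken from the Link header of each response rather than computed by the client.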
- GET /api/1/origin/(origin_url)/visits/
Get information about all visits of a software origin. Visits are returned sorted in descending order according to their date.
- Parameters:
origin_url (str) – a software origin URL
- Query Parameters:
per_page (int) – specify the number of visits to list, for pagination purposes
last_visit (int) – visit to start listing from, for pagination purposes
- Request Headers:
Accept – the requested response content type, either application/json (default) or application/yaml
- Response Headers:
Content-Type – depends on the Accept header of the request
Link – indicates that a subsequent result page is available and contains the url pointing to it
- Response JSON Array of Objects:
date (string) – ISO8601/RFC3339 representation of the visit date (in UTC)
origin (str) – the origin canonical url
origin_url (string) – link to get information about the origin
snapshot (string) – the snapshot identifier of the visit (may be null if status is not full).
snapshot_url (string) – link to GET /api/1/snapshot/(snapshot_id)/ in order to get information about the snapshot of the visit (may be null if status is not full).
status (string) – status of the visit (either full, partial or ongoing)
type (string) – visit type for the origin
visit (number) – the unique identifier of the visit
id (number) – the unique identifier of the origin
origin_visit_url (string) – link to GET /api/1/origin/(origin_url)/visit/(visit_id)/ in order to get information about the visit
- Status Codes:
200 OK – no error
404 Not Found – requested origin cannot be found in the archive
Example:
https://archive.softwareheritage.org/api/1/origin/https://github.com/hylang/hy/visits/
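Since snapshot may be null when status is not full, a client usually has to filter the visit list before using it. The sketch below picks the snapshot of the most recent full visit from an already decoded response. `latest_full_snapshot` is a hypothetical helper, the sample data in the usage block is made up, and the date handling assumes ISO 8601 strings that `datetime.fromisoformat` accepts once a trailing "Z" is rewritten as "+00:00".

```python
from datetime import datetime
from typing import List, Optional


def _parse_date(value: str) -> datetime:
    """Parse a visit date; rewrite a trailing 'Z' for older Python versions."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def latest_full_snapshot(visits: List[dict]) -> Optional[str]:
    """Return the snapshot identifier of the most recent full visit, or None."""
    full = [v for v in visits if v.get("status") == "full" and v.get("snapshot")]
    if not full:
        return None
    # Compare by date explicitly rather than relying on response order.
    newest = max(full, key=lambda v: _parse_date(v["date"]))
    return newest["snapshot"]


if __name__ == "__main__":
    # Made-up sample shaped like the response fields documented above.
    sample = [
        {"date": "2021-03-01T10:00:00+00:00", "status": "partial", "snapshot": None, "visit": 3},
        {"date": "2021-02-01T10:00:00+00:00", "status": "full", "snapshot": "snap-b", "visit": 2},
    ]
    print(latest_full_snapshot(sample))
```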
- GET /api/1/origin/(origin_url)/visit/(visit_id)/
Get information about a specific visit of a software origin.
- Parameters:
origin_url (str) – a software origin URL
visit_id (int) – a visit identifier
- Request Headers:
Accept – the requested response content type, either application/json (default) or application/yaml
- Response Headers:
Content-Type – depends on the Accept header of the request
- Response JSON Object:
date (string) – ISO8601/RFC3339 representation of the visit date (in UTC)
origin (str) – the origin canonical url
origin_url (string) – link to get information about the origin
snapshot (string) – the snapshot identifier of the visit (may be null if status is not full).
snapshot_url (string) – link to GET /api/1/snapshot/(snapshot_id)/ in order to get information about the snapshot of the visit (may be null if status is not full).
status (string) – status of the visit (either full, partial or ongoing)
type (string) – visit type for the origin
visit (number) – the unique identifier of the visit
- Status Codes:
200 OK – no error
404 Not Found – requested origin or visit cannot be found in the archive
Example:
https://archive.softwareheritage.org/api/1/origin/https://github.com/hylang/hy/visit/1/
- GET /api/1/origin/(origin_url)/visit/latest/
Get information about the latest visit of a software origin.
- Parameters:
origin_url (str) – a software origin URL
- Query Parameters:
require_snapshot (boolean) – if true, only return a visit with a snapshot
visit_type (str) – if provided, filter visits by type
- Request Headers:
Accept – the requested response content type, either application/json (default) or application/yaml
- Response Headers:
Content-Type – depends on the Accept header of the request
- Response JSON Object:
date (string) – ISO8601/RFC3339 representation of the visit date (in UTC)
origin (str) – the origin canonical url
origin_url (string) – link to get information about the origin
snapshot (string) – the snapshot identifier of the visit (may be null if status is not full).
snapshot_url (string) – link to GET /api/1/snapshot/(snapshot_id)/ in order to get information about the snapshot of the visit (may be null if status is not full).
status (string) – status of the visit (either full, partial or ongoing)
type (string) – visit type for the origin
visit (number) – the unique identifier of the visit
- Status Codes:
200 OK – no error
404 Not Found – requested origin or visit cannot be found in the archive
Example:
https://archive.softwareheritage.org/api/1/origin/https://github.com/hylang/hy/visit/latest/
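A client typically wants to tell the documented 404 response apart from other failures. The following sketch uses only the Python standard library. `fetch_latest_visit` and its `opener` argument are hypothetical, the latter allowing the network call to be replaced in tests. Passing booleans as the string "true" is an assumption, as is returning None on 404.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional
from urllib.parse import urlencode

API_ROOT = "https://archive.softwareheritage.org/api/1"


def fetch_latest_visit(
    origin_url: str,
    *,
    require_snapshot: bool = False,
    visit_type: Optional[str] = None,
    opener: Callable = urllib.request.urlopen,
) -> Optional[dict]:
    """Return the latest visit as a dict, or None if the API answers 404."""
    params = {}
    if require_snapshot:
        params["require_snapshot"] = "true"  # assumption: lowercase string
    if visit_type:
        params["visit_type"] = visit_type
    url = f"{API_ROOT}/origin/{origin_url}/visit/latest/"
    if params:
        url += "?" + urlencode(params)
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    try:
        with opener(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Origin or visit not found in the archive.
            return None
        raise  # any other HTTP error is left to the caller
```

A call such as `fetch_latest_visit("https://github.com/hylang/hy", require_snapshot=True)` would then return either the visit object described above or None.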
- GET /api/1/origin/metadata-search/
Search for software origins whose metadata (expressed as a JSON-LD/CodeMeta dictionary) match the provided criteria. For now, only full-text search on this dictionary is supported.
- Query Parameters:
fulltext (str) – a string that will be matched against origin metadata; results are ranked and ordered starting with the best ones.
limit (int) – the maximum number of found origins to return (bounded to 100)
- Response JSON Array of Objects:
origin_visits_url (string) – link to GET /api/1/origin/(origin_url)/visits/ in order to get information about the visits for that origin
url (string) – the origin canonical url
metadata_authorities_url (string) – link to GET /api/1/raw-extrinsic-metadata/swhid/(target)/authorities/ to get the list of metadata authorities providing extrinsic metadata on this origin (and, indirectly, to the origin’s extrinsic metadata itself)
has_visits (boolean) – indicates if Software Heritage made at least one full visit of the origin
- Request Headers:
Accept – the requested response content type, either application/json (default) or application/yaml
- Response Headers:
Content-Type – depends on the Accept header of the request
- Status Codes:
200 OK – no error
Example:
https://archive.softwareheritage.org/api/1/origin/metadata-search/?limit=2&fulltext=node-red-nodegen
- GET /api/1/intrinsic-metadata/origin/
Get intrinsic metadata of a software origin (as a JSON-LD/CodeMeta dictionary).
- Query Parameters:
origin_url (string) – the URL of the origin
- Response JSON Array of Objects:
??? (???) – intrinsic metadata field of the origin
- Request Headers:
Accept – the requested response content type, either application/json (default) or application/yaml
- Response Headers:
Content-Type – depends on the Accept header of the request
- Status Codes:
200 OK – no error
404 Not Found – requested origin cannot be found in the archive
Example:
https://archive.softwareheritage.org/api/1/intrinsic-metadata/origin/?origin_url=https://github.com/node-red/node-red-nodegen
- GET /api/1/extrinsic-metadata/origin/
Get extrinsic metadata of a software origin (as a JSON-LD/CodeMeta dictionary).
- Query Parameters:
origin_url (str) – the URL of the origin
- Response JSON Array of Objects:
??? (???) – extrinsic metadata field of the origin
- Request Headers:
Accept – the requested response content type, either application/json (default) or application/yaml
- Response Headers:
Content-Type – depends on the Accept header of the request
- Status Codes:
200 OK – no error
404 Not Found – requested origin cannot be found in the archive
Example:
https://archive.softwareheritage.org/api/1/extrinsic-metadata/origin/?origin_url=https://github.com/node-red/node-red-nodegen
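Unlike the visit endpoints, the intrinsic and extrinsic metadata endpoints take the origin URL as a query parameter. A minimal sketch of building such a request URL follows. `API_ROOT` and `origin_metadata_url` are hypothetical names. `urlencode` percent-encodes the origin URL, which is standard query-string handling but differs from the unencoded form shown in the examples, so equivalence on the server side is an assumption.

```python
from urllib.parse import urlencode

API_ROOT = "https://archive.softwareheritage.org/api/1"


def origin_metadata_url(origin_url: str, kind: str = "intrinsic") -> str:
    """Build the URL of the intrinsic or extrinsic metadata endpoint."""
    if kind not in ("intrinsic", "extrinsic"):
        raise ValueError("kind must be 'intrinsic' or 'extrinsic'")
    # The origin URL travels in the query string, so it is percent-encoded.
    query = urlencode({"origin_url": origin_url})
    return f"{API_ROOT}/{kind}-metadata/origin/?{query}"


if __name__ == "__main__":
    print(origin_metadata_url("https://github.com/node-red/node-red-nodegen"))
```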