Graph traversal API

Terminology

This API uses the following notions:

  • Node: a node in the Software Heritage graph, represented by a persistent identifier (abbreviated as SWH PID, or simply PID).
  • Node type: the 3-letter specifier from the node PID (cnt, dir, rel, rev, snp), or * for all node types.
  • Edge type: a comma-separated list of src:dst strings where src and dst are node types, or * for all edge types.

Examples

  • swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2 is the PID of a content node containing the full text of the GPL3 license.
  • swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35 is the PID of a revision node corresponding to the Linux commit that merged the ‘x86/urgent’ branch on 31 December 2017.
  • "dir:dir,dir:cnt" is an edge type allowing edges from directory nodes to directory nodes, or from directory nodes to content nodes.
  • "rev:rev,dir:*" is an edge type allowing edges from revision nodes to revision nodes, or from directory nodes to nodes of any type.
  • "*:rel" is an edge type allowing all edges to release nodes.
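As a minimal sketch of how an edge-type string could be interpreted, the Python snippet below checks whether a single edge is permitted by a restriction such as "rev:rev,dir:*". The helper name edge_allowed is hypothetical and not part of the API; it only mirrors the notation described above.

```python
def edge_allowed(edge_types: str, src_type: str, dst_type: str) -> bool:
    """Return True if an edge src_type -> dst_type matches the restriction.

    Hypothetical helper, not part of the API: it follows the notation
    described above (comma-separated src:dst pairs, "*" as a wildcard).
    """
    if edge_types == "*":  # a lone "*" allows every edge type
        return True
    for pair in edge_types.split(","):
        src, dst = pair.strip().split(":")
        if src in ("*", src_type) and dst in ("*", dst_type):
            return True
    return False


print(edge_allowed("dir:dir,dir:cnt", "dir", "cnt"))
print(edge_allowed("dir:dir,dir:cnt", "rev", "dir"))
```

The first call matches the dir:cnt pair; the second matches neither pair, because no listed pair starts from a revision.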

Leaves

GET /graph/leaves/:src

Performs a graph traversal and returns the leaves of the subgraph rooted at the specified source node.

Parameters:
  • src (string) – source node specified as a SWH PID
Query Parameters:
  • edges (string) – edge types the traversal can follow; defaults to "*"
  • direction (string) – direction in which graph edges are followed; either forward or backward; defaults to forward

Status Codes:

HTTP/1.1 200 OK
Content-Type: application/json

[
    "swh:1:cnt:669ac7c32292798644b21dbb5a0dc657125f444d",
    "swh:1:cnt:da4cb28febe66172a9fdf1a235525ae6c00cde1d",
    …
]
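A request to this endpoint could be assembled as in the following Python sketch, which only builds the URL and does not contact any server. The base URL is a placeholder and the function name leaves_url is hypothetical; neither comes from this documentation.

```python
from urllib.parse import quote, urlencode

# Assumption: placeholder host, the documentation does not say where the API is served.
BASE_URL = "https://graph.example.org"


def leaves_url(src: str, edges: str = "*", direction: str = "forward") -> str:
    """Build the URL of a GET /graph/leaves/:src request (hypothetical helper)."""
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    # urlencode percent-encodes ':' and ',' inside the query string.
    query = urlencode({"edges": edges, "direction": direction})
    # Keep the colons of the PID readable in the path component.
    return f"{BASE_URL}/graph/leaves/{quote(src, safe=':')}?{query}"


url = leaves_url(
    "swh:1:dir:d198bc9d7a6bcf6db04f476d29314f157507d505",
    edges="dir:dir,dir:cnt",
)
print(url)
```

The resulting string can be handed to any HTTP client; the response body is expected to be a JSON array of PIDs like the one shown above.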

Neighbors

GET /graph/neighbors/:src

Returns the direct neighbors of a node (those linked to it by exactly one edge) in the graph.

Parameters:
  • src (string) – source node specified as a SWH PID
Query Parameters:
  • edges (string) – edge types allowed when listing neighbors; defaults to "*"
  • direction (string) – direction in which graph edges are followed; either forward or backward; defaults to forward

Status Codes:

HTTP/1.1 200 OK
Content-Type: application/json

[
    "swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2",
    "swh:1:dir:d198bc9d7a6bcf6db04f476d29314f157507d505",
    …
]
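The effect of the direction parameter can be illustrated on a toy in-memory graph. This is only a sketch: the adjacency list, the shortened identifiers (not real PIDs) and the function neighbors are made up for illustration, and say nothing about how the service is implemented.

```python
from typing import Dict, List

# Toy adjacency list with made-up, shortened identifiers (not real PIDs).
EDGES: Dict[str, List[str]] = {
    "rev:a": ["rev:b", "dir:c"],
    "dir:c": ["cnt:d"],
}


def neighbors(edges: Dict[str, List[str]], src: str, direction: str = "forward") -> List[str]:
    """Nodes linked to src by exactly one edge, in the given direction."""
    if direction == "forward":
        return list(edges.get(src, []))
    if direction == "backward":
        # Follow edges in reverse: every node that points at src.
        return [node for node, targets in edges.items() if src in targets]
    raise ValueError("direction must be 'forward' or 'backward'")


print(neighbors(EDGES, "rev:a"))
print(neighbors(EDGES, "dir:c", direction="backward"))
```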

Walk

GET /graph/walk/:src/:dst

Performs a graph traversal and returns the first path found from the source to the destination (destination node included).

Parameters:
  • src (string) – starting node specified as a SWH PID
  • dst (string) – destination node, given either as a node PID or as a node type. The traversal stops at the first node encountered that matches the desired destination.

Query Parameters:
  • edges (string) – edge types the traversal can follow; defaults to "*"
  • traversal (string) – traversal algorithm; either dfs or bfs; defaults to dfs
  • direction (string) – direction in which graph edges are followed; either forward or backward; defaults to forward

Status Codes:

HTTP/1.1 200 OK
Content-Type: application/json

[
    "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
    "swh:1:rev:52c90f2d32bfa7d6eccd66a56c44ace1f78fbadd",
    "swh:1:rev:cea92e843e40452c08ba313abc39f59efbb4c29c",
    "swh:1:rev:8d517bdfb57154b8a11d7f1682ecc0f79abf8e02",
    …
]
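To make the difference between dfs and bfs concrete, here is a small Python sketch of a "first path found" walk over a toy graph, where the destination may be a full identifier or a bare node type. The graph, the shortened identifiers (not real PIDs) and the function walk are illustrative assumptions and do not describe the service's actual implementation.

```python
from collections import deque
from typing import Dict, List, Optional

# Toy graph with made-up, shortened identifiers of the form "type:name".
GRAPH: Dict[str, List[str]] = {
    "rev:a": ["rev:b", "dir:c"],
    "rev:b": ["dir:e"],
    "dir:c": ["cnt:d"],
    "dir:e": ["cnt:f"],
}


def walk(graph: Dict[str, List[str]], src: str, dst: str,
         traversal: str = "dfs") -> Optional[List[str]]:
    """Return the first path found from src to a node matching dst, or None."""
    def matches(node: str) -> bool:
        # dst is either a full identifier or a bare node type such as "cnt".
        return node == dst or node.split(":")[0] == dst

    frontier = deque([[src]])
    seen = {src}
    while frontier:
        # dfs takes the most recently discovered path, bfs the oldest one.
        path = frontier.pop() if traversal == "dfs" else frontier.popleft()
        node = path[-1]
        if matches(node):
            return path
        successors = graph.get(node, [])
        # Reversed for dfs so that the first-listed successor is explored first.
        for nxt in (reversed(successors) if traversal == "dfs" else successors):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None


print(walk(GRAPH, "rev:a", "dir", traversal="dfs"))
print(walk(GRAPH, "rev:a", "dir", traversal="bfs"))
```

On this toy graph the depth-first walk reaches a directory through rev:b, while the breadth-first walk returns the shorter path through dir:c; both are valid "first found" paths for their respective algorithms.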

Visit

GET /graph/visit/:src
GET /graph/visit/nodes/:src
GET /graph/visit/paths/:src

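Performs a graph traversal and returns the explored nodes and/or paths, in traversal order: /graph/visit returns both, /graph/visit/nodes returns only the nodes, and /graph/visit/paths returns only the paths.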

Parameters:
  • src (string) – starting node specified as a SWH PID
Query Parameters:
  • edges (string) – edge types the traversal can follow; defaults to "*"
  • direction (string) – direction in which graph edges are followed; either forward or backward; defaults to forward

Status Codes:

GET /graph/visit/
HTTP/1.1 200 OK
Content-Type: application/json

{
    "paths": [
        [
            "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
            "swh:1:rev:52c90f2d32bfa7d6eccd66a56c44ace1f78fbadd",
            …
        ],
        [
            "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
            "swh:1:rev:a31e58e129f73ab5b04016330b13ed51fde7a961",
            …
        ]
    ],
    "nodes": [
        "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
        "swh:1:rev:52c90f2d32bfa7d6eccd66a56c44ace1f78fbadd",
        …
        "swh:1:rev:a31e58e129f73ab5b04016330b13ed51fde7a961",
        …
    ]
}


GET /graph/visit/nodes/
HTTP/1.1 200 OK
Content-Type: application/json

[
    "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
    "swh:1:rev:52c90f2d32bfa7d6eccd66a56c44ace1f78fbadd",
    …
    "swh:1:rev:a31e58e129f73ab5b04016330b13ed51fde7a961",
    …
]


GET /graph/visit/paths/
HTTP/1.1 200 OK
Content-Type: application/json

[
    [
        "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
        "swh:1:rev:52c90f2d32bfa7d6eccd66a56c44ace1f78fbadd",
        …
    ],
    [
        "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
        "swh:1:rev:a31e58e129f73ab5b04016330b13ed51fde7a961",
        …
    ]
]
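The examples above suggest that the node list contains each node from the paths once, in order of first appearance. That relationship is an assumption drawn from the sample responses, not something the documentation states. Under that assumption, a client holding only the paths could rebuild the node list with a sketch like this (the function name nodes_from_paths is hypothetical):

```python
from typing import List


def nodes_from_paths(paths: List[List[str]]) -> List[str]:
    """Distinct nodes across all paths, in order of first appearance.

    Assumption: this mirrors how the "nodes" list relates to "paths" in the
    sample responses; the real service may order nodes differently.
    """
    seen = set()
    ordered = []
    for path in paths:
        for node in path:
            if node not in seen:
                seen.add(node)
                ordered.append(node)
    return ordered


paths = [
    ["swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
     "swh:1:rev:52c90f2d32bfa7d6eccd66a56c44ace1f78fbadd"],
    ["swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
     "swh:1:rev:a31e58e129f73ab5b04016330b13ed51fde7a961"],
]
print(nodes_from_paths(paths))
```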

Stats

GET /graph/stats

Returns statistics on the compressed graph.

Status Codes:

HTTP/1.1 200 OK
Content-Type: application/json

{
    "counts": {
        "nodes": 16222788,
        "edges": 9907464
    },
    "ratios": {
        "compression": 0.367,
        "bits_per_node": 5.846,
        "bits_per_edge": 9.573,
        "avg_locality": 270.369
    },
    "indegree": {
        "min": 0,
        "max": 12382,
        "avg": 0.6107127825377487
    },
    "outdegree": {
        "min": 0,
        "max": 1,
        "avg": 0.6107127825377487
    }
}
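In the sample response the average in-degree and out-degree are identical. This follows from the counts: every edge adds one to the in-degree of its target and one to the out-degree of its source, so both averages equal edges / nodes. The Python sketch below checks this on the values shown above; the helper name average_degree is hypothetical.

```python
import json
import math

# Subset of the sample response shown above.
STATS = json.loads("""
{
    "counts": {"nodes": 16222788, "edges": 9907464},
    "indegree": {"min": 0, "max": 12382, "avg": 0.6107127825377487},
    "outdegree": {"min": 0, "max": 1, "avg": 0.6107127825377487}
}
""")


def average_degree(stats: dict) -> float:
    """Average in-degree (and out-degree) implied by the node and edge counts."""
    return stats["counts"]["edges"] / stats["counts"]["nodes"]


avg = average_degree(STATS)
print(math.isclose(avg, STATS["indegree"]["avg"], rel_tol=1e-9))
print(math.isclose(avg, STATS["outdegree"]["avg"], rel_tol=1e-9))
```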