Traversal REST API

Terminology

This API uses the following notions:

  • Node: a node in the Software Heritage graph, represented by a persistent identifier (abbreviated as SWH PID, or simply PID).
  • Node type: the 3-letter specifier from the node PID (cnt, dir, rel, rev, snp), or * for all node types.
  • Edge type: a comma-separated list of src:dst strings where src and dst are node types, or * for all edge types.

Examples

  • swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2 is the PID of a node of type content, containing the full text of the GPL3 license.
  • swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35 is the PID of a node of type revision, corresponding to the commit in Linux that merged the ‘x86/urgent’ branch on 31 December 2017.
  • "dir:dir,dir:cnt" is an edge type allowing edges from directory nodes to directory nodes, or from directory nodes to content nodes.
  • "rev:rev,dir:*" is an edge type allowing edges from revision nodes to revision nodes, or from directory nodes to nodes of any type.
  • "*:rel" is an edge type allowing all edges that point to release nodes.
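The edge type strings above can be read mechanically. The following is a minimal Python sketch, not the server's own parser: the helper names parse_edge_types and edge_allowed are hypothetical, and it assumes that * may appear on either side of a src:dst pair, as the examples suggest.

```python
NODE_TYPES = {"cnt", "dir", "rel", "rev", "snp"}


def parse_edge_types(spec):
    """Turn an edge type string such as "dir:dir,dir:cnt" into (src, dst) pairs."""
    if spec == "*":
        # A bare * stands for all edge types.
        return [("*", "*")]
    pairs = []
    for item in spec.split(","):
        src, dst = item.split(":")
        for node_type in (src, dst):
            if node_type != "*" and node_type not in NODE_TYPES:
                raise ValueError(f"unknown node type: {node_type!r}")
        pairs.append((src, dst))
    return pairs


def edge_allowed(spec, src_type, dst_type):
    """Return True if an edge src_type -> dst_type matches the edge type string."""
    return any(
        src in ("*", src_type) and dst in ("*", dst_type)
        for src, dst in parse_edge_types(spec)
    )
```

With these helpers, edge_allowed("rev:rev,dir:*", "dir", "cnt") holds, while edge_allowed("*:rel", "rev", "rev") does not.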

Metadata

Extra metadata are given in addition to the result:

  • timings: only when configured to do so (see the server’s README):

    • traversal: time in seconds to do the actual graph traversal.
    • pid2node: time in seconds to convert input PID to node id.
    • node2pid: time in seconds to convert output node ids to PIDs.
  • nb_edges_accessed: number of edges accessed during the traversal operation.
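Because timings is only present when the server is configured to report it, client code should probably treat it as optional. A small Python sketch of reading the metadata follows; the function name summarize_meta and the shape of its return value are illustrative assumptions, not part of the API.

```python
def summarize_meta(meta):
    """Summarize the "meta" object of a response, tolerating missing timings."""
    # "timings" is absent unless the server was configured to report it.
    timings = meta.get("timings")
    total_seconds = sum(timings.values()) if timings else None
    return {
        "nb_edges_accessed": meta["nb_edges_accessed"],
        "total_seconds": total_seconds,
    }
```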

Leaves

GET /graph/leaves/:src

Performs a graph traversal and returns the leaves of the subgraph rooted at the specified source node.

Parameters:
  • src (string) – source node specified as a SWH PID
Query Parameters:
 
  • edges (string) – edge types the traversal can follow; defaults to "*"
  • direction (string) – direction in which graph edges are followed; either forward or backward, defaults to forward
Status Codes:
HTTP/1.1 200 OK
Content-Type: application/json

{
    "result": [
        "swh:1:cnt:669ac7c32292798644b21dbb5a0dc657125f444d",
        "swh:1:cnt:da4cb28febe66172a9fdf1a235525ae6c00cde1d",
        ...
    ],
    "meta": {
        "timings": {
            "traversal": 0.002942681,
            "pid2node": 0.000178051,
            "node2pid": 0.000956569
        },
        "nb_edges_accessed": 12
    }
}
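As an illustration of how the path and query parameters combine, here is a hedged Python sketch that only builds the request URL for the leaves endpoint. The function name leaves_url and the base_url argument are assumptions; this document does not state where the server listens.

```python
from urllib.parse import urlencode


def leaves_url(base_url, src, edges="*", direction="forward"):
    """Build the URL of a GET /graph/leaves/:src request."""
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    # Keep ':', ',' and '*' unescaped so the edge type string stays readable.
    query = urlencode({"edges": edges, "direction": direction}, safe=":,*")
    return f"{base_url}/graph/leaves/{src}?{query}"
```

The resulting URL can be passed to any HTTP client; the JSON body then carries the result and meta keys shown above.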

Neighbors

GET /graph/neighbors/:src

Returns the direct neighbors of a node (those linked to it by exactly one edge) in the graph.

Parameters:
  • src (string) – source node specified as a SWH PID
Query Parameters:
 
  • edges (string) – edge types allowed to be listed as neighbors; defaults to "*"
  • direction (string) – direction in which graph edges are followed; either forward or backward, defaults to forward
Status Codes:
HTTP/1.1 200 OK
Content-Type: application/json

{
    "result": [
        "swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2",
        "swh:1:dir:d198bc9d7a6bcf6db04f476d29314f157507d505",
        ...
    ],
    "meta": {
        "timings": {
            "traversal": 0.002942681,
            "pid2node": 0.000178051,
            "node2pid": 0.000956569
        },
        "nb_edges_accessed": 12
    }
}
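Since neighbors may be of several node types, a client may want to group the returned PIDs by type. The Python sketch below does this by splitting each PID on ":"; the helper name group_by_type is hypothetical, and it assumes every PID has the four-field form seen in the examples.

```python
from collections import defaultdict


def group_by_type(pids):
    """Group SWH PIDs by their 3-letter node type (cnt, dir, rel, rev, snp)."""
    groups = defaultdict(list)
    for pid in pids:
        # A PID looks like swh:1:<node type>:<hex digest>.
        _scheme, _version, node_type, _digest = pid.split(":")
        groups[node_type].append(pid)
    return dict(groups)
```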

Walk

GET /graph/walk/:src/:dst

Performs a graph traversal and returns the first path found from the source to the destination (the final destination node is included).

Parameters:
  • src (string) – starting node specified as a SWH PID
  • dst (string) – destination node, either as a node PID or a node type. The traversal will stop at the first node encountered matching the desired destination.
Query Parameters:
 
  • edges (string) – edge types the traversal can follow; defaults to "*"
  • traversal (string) – traversal algorithm; either dfs or bfs, defaults to dfs
  • direction (string) – direction in which graph edges are followed; either forward or backward, defaults to forward
Status Codes:
HTTP/1.1 200 OK
Content-Type: application/json

{
    "result": [
        "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
        "swh:1:rev:52c90f2d32bfa7d6eccd66a56c44ace1f78fbadd",
        "swh:1:rev:cea92e843e40452c08ba313abc39f59efbb4c29c",
        "swh:1:rev:8d517bdfb57154b8a11d7f1682ecc0f79abf8e02",
        ...
    ],
    "meta": {
        "timings": {
            "traversal": 0.002942681,
            "pid2node": 0.000178051,
            "node2pid": 0.000956569
        },
        "nb_edges_accessed": 12
    }
}
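The dst parameter accepts either a PID or a node type, which a client can validate before sending the request. The following Python sketch builds a walk URL under a few stated assumptions: walk_url and base_url are hypothetical names, the PID pattern (version 1, 40 hexadecimal digits) is inferred from the examples in this document, and only the five concrete node types are accepted as a destination type.

```python
import re
from urllib.parse import urlencode

NODE_TYPES = {"cnt", "dir", "rel", "rev", "snp"}
# Assumed PID shape, based on the examples: swh:1:<type>:<40 hex digits>.
PID_RE = re.compile(r"^swh:1:(cnt|dir|rel|rev|snp):[0-9a-f]{40}$")


def walk_url(base_url, src, dst, edges="*", traversal="dfs", direction="forward"):
    """Build the URL of a GET /graph/walk/:src/:dst request."""
    if not PID_RE.match(src):
        raise ValueError(f"src is not a PID: {src!r}")
    if dst not in NODE_TYPES and not PID_RE.match(dst):
        raise ValueError(f"dst is neither a PID nor a node type: {dst!r}")
    if traversal not in ("dfs", "bfs"):
        raise ValueError("traversal must be 'dfs' or 'bfs'")
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    params = {"edges": edges, "traversal": traversal, "direction": direction}
    return f"{base_url}/graph/walk/{src}/{dst}?{urlencode(params, safe=':,*')}"
```

For example, passing "rel" as dst asks for a path from the source that ends at the first release node encountered.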

Visit

GET /graph/visit/nodes/:src
GET /graph/visit/paths/:src

Performs a graph traversal and returns explored nodes or paths (in the order of the traversal).

Parameters:
  • src (string) – starting node specified as a SWH PID
Query Parameters:
 
  • edges (string) – edge types the traversal can follow; defaults to "*"
  • direction (string) – direction in which graph edges are followed; either forward or backward, defaults to forward
Status Codes:
GET /graph/visit/nodes/
HTTP/1.1 200 OK
Content-Type: application/json

{
    "result": [
        "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
        "swh:1:rev:52c90f2d32bfa7d6eccd66a56c44ace1f78fbadd",
        ...
        "swh:1:rev:a31e58e129f73ab5b04016330b13ed51fde7a961",
        ...
    ],
    "meta": {
        "timings": {
            "traversal": 0.002942681,
            "pid2node": 0.000178051,
            "node2pid": 0.000956569
        },
        "nb_edges_accessed": 12
    }
}
GET /graph/visit/paths/
HTTP/1.1 200 OK
Content-Type: application/json

{
    "result": [
        [
            "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
            "swh:1:rev:52c90f2d32bfa7d6eccd66a56c44ace1f78fbadd",
            ...
        ],
        [
            "swh:1:rev:f39d7d78b70e0f39facb1e4fab77ad3df5c52a35",
            "swh:1:rev:a31e58e129f73ab5b04016330b13ed51fde7a961",
            ...
        ],
        ...
    ],
    "meta": {
        "timings": {
            "traversal": 0.002942681,
            "pid2node": 0.000178051,
            "node2pid": 0.000956569
        },
        "nb_edges_accessed": 12
    }
}
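The paths variant returns a list of paths, each itself a list of PIDs. As a hedged illustration of working with that nested shape, the Python sketch below collects the distinct nodes that appear in a list of paths, in order of first appearance. The helper name nodes_in_paths is hypothetical, and the sketch makes no claim that its output matches what /graph/visit/nodes/ would return for the same source.

```python
def nodes_in_paths(paths):
    """Return the distinct PIDs found in a list of paths, in first-seen order."""
    seen = set()
    ordered = []
    for path in paths:
        for pid in path:
            if pid not in seen:
                seen.add(pid)
                ordered.append(pid)
    return ordered
```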

Stats

GET /graph/stats

Returns statistics on the compressed graph.

Status Codes:
HTTP/1.1 200 OK
Content-Type: application/json

{
    "counts": {
        "nodes": 16222788,
        "edges": 9907464
    },
    "ratios": {
        "compression": 0.367,
        "bits_per_node": 5.846,
        "bits_per_edge": 9.573,
        "avg_locality": 270.369
    },
    "indegree": {
        "min": 0,
        "max": 12382,
        "avg": 0.6107127825377487
    },
    "outdegree": {
        "min": 0,
        "max": 1,
        "avg": 0.6107127825377487
    }
}
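The figures in this response are related: every edge adds one to the outdegree of its source and one to the indegree of its target, so both averages equal edges divided by nodes. The Python sketch below checks this against the sample values above; the function name average_degree is an illustrative assumption.

```python
import math


def average_degree(stats):
    """Average in- or outdegree implied by the node and edge counts."""
    counts = stats["counts"]
    return counts["edges"] / counts["nodes"]


# Values copied from the sample response above.
stats = {
    "counts": {"nodes": 16222788, "edges": 9907464},
    "indegree": {"min": 0, "max": 12382, "avg": 0.6107127825377487},
    "outdegree": {"min": 0, "max": 1, "avg": 0.6107127825377487},
}

consistent = math.isclose(
    average_degree(stats), stats["indegree"]["avg"], rel_tol=1e-6
)
```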