Search Query Language#

Every query is composed of filters separated by and or or. These filters have 3 components in the order : Name Operator Value

Some of the examples are :
  • origin : plasma and language in [python] and visits >= 5

  • last_visit > 2021-01-01 or last_visit < 2020-01-01

  • visited = false and metadata : "kubernetes" or origin : "minikube"

  • keyword in ["orchestration", "kubectl"] and license in ["GPLv3+", "GPLv3"]

  • (origin : debian or visit_type = ["deb"]) and license in ["GPL-3"]

  • Whitespaces are optional between the three components of a filter.

  • The conjunction operators have left precedence. Therefore foo and bar and baz means (foo and bar) and baz

  • and has higher precedence than or. Therefore foo or bar and baz means foo or (bar and baz)

  • Precedence can be overridden using parentheses: ( and ). For example, you can override the default precedence in the previous query as: (foo or bar) and baz

  • To actually search for and or or as strings, just put them within quotes. Example : metadata : "vcs history and metadata", or even just metadata : "and" to search for the string and in the metadata

The filters have been classified based on the type of value that they expect.

Pattern filters#

Returns origins having the given keywords in their url or intrinsic metadata

  • Name:
    • origin: Keywords from the origin url

    • metadata: Keywords from all the intrinsic metadata fields

  • Operator: :

  • Value: String wrapped in quotation marks(" or ')

Note: If a string has no whitespace then the quotation marks become optional.


  • origin :

  • origin : kubernetes

  • origin : "github python"

  • metadata : orchestration

  • metadata : "javascript language"

Boolean filters#

Returns origins having their boolean type values equal to given values

  • Name: visited : Whether the origin has been visited

  • Operator: =

  • Value: true or false


  • visited = true

  • visited = false

Numeric filters#

Returns origins having their numeric type values in the given range

  • Name: visits : Number of visits of an origin

  • Operator: < <= = != > >=

  • Value: Positive integer


  • visits > 2

  • visits = 5

  • visits <= 10

Un-bounded List filters#

Returns origins that satisfy the criteria based on a given list

  • Name:
    • language : Programming languages used

    • license : License used

    • keyword : keywords (often same as tags) or description (includes README) from the metadata

  • Operator: in not in

  • Value: Array of strings

  • If a string has no whitespace then the quotation marks become optional.

  • The keyword filter gives more priority to the keywords field of intrinsic metadata than the description field. So origins having the queried term in their intrinsic metadata keyword will appear first.


  • language in [python, js]

  • license in ["GPL 3.0 or later", MIT]

  • keyword in ["Software Heritage", swh]

Bounded List filters#

Returns origins that satisfy the criteria based on a list of fixed options


  • Name: visit_type : Returns only origins with at least one of the specified visit types

  • Operator: =

  • Value: Array of the following values

    any cran deb deposit ftp hg git nixguix npm pypi svn tar


  • Name: sort_by : Sorts origins based on the given list of origin attributes

  • Operator: =

  • Value: Array of the following values

    visits last_visit last_eventful_visit last_revision last_release created modified published


  • visit_type = [svn, npm]

  • visit_type = [nixguix, "ftp"]

  • sort_by = ["last_visit", created]

  • sort_by = [visits, modified]

Date filters#

Returns origins having their date type values in the given range

  • Name:

    • last_visit : Latest visit date

    • last_eventful_visit : Latest visit date where a new snapshot was detected

    • last_revision : Latest commit date

    • last_release : Latest release date

    • created Creation date

    • modified Modification date

    • published Published date

  • Operator: < <= = != > >=

  • Value: Date in Standard ISO format

Note: The last three date filters are based on metadata that has to be manually entered by the repository authors. So they might not be correct or up-to-date.


  • last_visit > 2001-01-01 and last_visit < 2101-01-01

  • last_revision = "2000-01-01 18:35Z"

  • last_release != "2021-07-17T18:35:00Z"

  • created <= "2021-07-17 18:35"

Limit filter#

Limits the number of results to at most N

  • Name: limit

  • Operator: =

  • Value: Positive Integer

Note: The default value of the limit is 50


  • limit = 1

  • limit = 15