swh.search.api.client module

class swh.search.api.client.RemoteSearch(url, api_exception=None, timeout=None, chunk_size=4096, reraise_exceptions=None, **kwargs)

Bases: swh.core.api.RPCClient

Proxy to a remote search API


alias of swh.search.elasticsearch.ElasticSearch

flush() → None

Blocks until all previous calls to _update() are completely applied.
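As a rough illustration of what "blocks until all previous updates are applied" can mean for a caller, the sketch below models a write-then-flush index in memory. BufferedIndex, its buffering behaviour, visible_urls(), and the "url" document key are all assumptions made for this example; none of it is the actual swh.search backend.

```python
from typing import Dict, Iterable, List


class BufferedIndex:
    """Hypothetical in-memory model of write-then-flush semantics.

    This is NOT the swh.search backend; it only illustrates the contract.
    """

    def __init__(self) -> None:
        self._pending: List[dict] = []       # updates not yet applied
        self._visible: Dict[str, dict] = {}  # documents a search could see

    def origin_update(self, documents: Iterable[dict]) -> None:
        # Queue the documents; in this model they are not searchable yet.
        self._pending.extend(documents)

    def flush(self) -> None:
        # Apply every queued update before returning.
        for doc in self._pending:
            # Assumption: each document carries a "url" key identifying it.
            self._visible.setdefault(doc["url"], {}).update(doc)
        self._pending.clear()

    def visible_urls(self) -> List[str]:
        # Hypothetical helper, used here only to observe the index state.
        return sorted(self._visible)


index = BufferedIndex()
index.origin_update([{"url": "https://example.org/repo"}])
before = index.visible_urls()  # nothing applied yet in this model
index.flush()
after = index.visible_urls()   # the update is now visible
```

In this model a caller that needs to read its own writes, for instance in a test, calls flush() between the update and the search.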

Searches for origins matching the url_pattern.

Parameters

  • url_pattern (str) – Part of the URL to search for

  • with_visit (bool) – Whether origins with no visit are to be filtered out

  • page_token (str) – Opaque value used for pagination.

  • count (int) – Number of results to return.


Return type

a dictionary with keys:

  • next_page_token: opaque value used for fetching more results. None if there are no more results.

  • results: list of dictionaries, each with the key url (URL of a matching origin)
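The pagination contract described above (pass page_token back until next_page_token is None) can be sketched as a small generator. The method name origin_search is an assumption, since the signature is not shown in this section; InMemorySearch, its has_visits field, and iter_origin_urls are hypothetical helpers written only so that the example runs without a search server.

```python
from typing import Any, Dict, Iterator, List, Optional


class InMemorySearch:
    """Hypothetical stand-in for a search client; NOT part of swh.search."""

    def __init__(self, origins: List[Dict[str, Any]]) -> None:
        self._origins = origins

    def origin_search(  # method name assumed for this sketch
        self,
        url_pattern: str,
        with_visit: bool = False,
        page_token: Optional[str] = None,
        count: int = 50,
    ) -> Dict[str, Any]:
        # Keep origins whose URL contains the pattern; when with_visit is
        # set, drop origins that have no visit ("has_visits" is hypothetical).
        matches = [
            o for o in self._origins
            if url_pattern in o["url"] and (o.get("has_visits") or not with_visit)
        ]
        # This fake encodes an offset in the token; callers must not rely on it.
        start = int(page_token) if page_token is not None else 0
        page = matches[start:start + count]
        more = start + count < len(matches)
        return {
            "next_page_token": str(start + count) if more else None,
            "results": [{"url": o["url"]} for o in page],
        }


def iter_origin_urls(
    search: Any, url_pattern: str, with_visit: bool = False, count: int = 50
) -> Iterator[str]:
    """Yield every matching origin URL, following next_page_token."""
    page_token: Optional[str] = None
    while True:
        page = search.origin_search(
            url_pattern=url_pattern,
            with_visit=with_visit,
            page_token=page_token,
            count=count,
        )
        for result in page["results"]:
            yield result["url"]
        # The token is treated as opaque: it is only passed back unchanged.
        page_token = page["next_page_token"]
        if page_token is None:
            break


search = InMemorySearch([
    {"url": "https://example.org/a", "has_visits": True},
    {"url": "https://example.org/b"},
    {"url": "https://example.org/c", "has_visits": True},
    {"url": "https://other.test/x", "has_visits": True},
])
all_urls = list(iter_origin_urls(search, "example.org", count=2))
visited_urls = list(iter_origin_urls(search, "example.org", with_visit=True, count=1))
```

Because the helper only calls the search method with keyword arguments and reads the two documented keys, the same loop should work with any object that honours this contract, under the assumption that the method is indeed named origin_search.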

origin_update(documents: Iterable[dict]) → None