swh.lister.npm.lister module

class swh.lister.npm.lister.NpmListerBase(url='https://replicate.npmjs.com', per_page=1000, override_config=None)[source]

Bases: swh.lister.core.indexing_lister.IndexingHttpLister

List packages available in the npm registry in a paginated way

MODEL

alias of swh.lister.npm.models.NpmModel

LISTER_NAME = 'npm'
instance = 'npm'
property ADDITIONAL_CONFIG

(Override) Add extra configuration

get_model_from_repo(repo_name: str) → Dict[str, str][source]

(Override) Transform from npm package name to model
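
A minimal sketch of what such a transformation could look like. The function name, the dictionary keys and the https://www.npmjs.com/package/ URL prefix are assumptions made for illustration; the actual fields are whatever NpmModel defines.

```python
from typing import Dict

# Assumed prefix of the public package page; not confirmed by this reference.
PACKAGE_URL_PREFIX = "https://www.npmjs.com/package/"


def model_from_package_name(repo_name: str) -> Dict[str, str]:
    """Map an npm package name to a flat dict (hypothetical field names)."""
    package_url = PACKAGE_URL_PREFIX + repo_name
    return {
        "uid": repo_name,
        # The package name doubles as the listing index.
        "indexable": repo_name,
        "name": repo_name,
        "origin_url": package_url,
        "origin_type": "npm",
    }
```

Using the package name as the index is consistent with the note on string_pattern_check: indices are names, not fixed-length strings.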

task_dict(origin_type: str, origin_url: str, **kwargs)[source]

(Override) Return the task dict for loading an npm package into the archive.

This overrides the lister_base implementation because creating the ingestion task requires more information.

request_headers() → Dict[str, Any][source]

(Override) Set the request headers to send when querying the npm registry.
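
As an illustration only, an override of this kind might extend the inherited headers rather than replace them. The helper name and the choice of the Accept header below are assumptions, not taken from the lister source.

```python
from typing import Any, Dict


def npm_request_headers(base_headers: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base_headers that asks the registry for JSON."""
    headers = dict(base_headers)  # copy, so the caller's dict is not mutated
    headers["Accept"] = "application/json"  # assumed header, for illustration
    return headers
```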

string_pattern_check(inner: int, lower: int, upper: int = None)[source]

(Override) Disable this check: package indices are package names, so they do not follow any fixed-length string pattern.

class swh.lister.npm.lister.NpmLister(url='https://replicate.npmjs.com', per_page=1000, override_config=None)[source]

Bases: swh.lister.npm.lister.NpmListerBase

List all packages available in the npm registry in a paginated way

PATH_TEMPLATE = '/_all_docs?startkey="%s"'
get_next_target_from_response(response: requests.models.Response) → Optional[str][source]

(Override) Get next npm package name to continue the listing

transport_response_simplified(response: requests.models.Response) → List[Dict[str, str]][source]

(Override) Transform npm registry response to list for model manipulation
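
The two overrides above can be sketched on plain dictionaries. The "rows" and "id" fields follow the CouchDB _all_docs response format. The extra limit parameter, the "request one row more than per_page" stopping rule and the helper names are assumptions for illustration, not the lister's confirmed behaviour.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import quote

PATH_TEMPLATE = '/_all_docs?startkey="%s"'


def page_url(base_url: str, start_key: str, per_page: int) -> str:
    """Build the URL of one page; `limit` is an assumed extra parameter."""
    path = PATH_TEMPLATE % quote(start_key, safe="")
    # Ask for one extra row: it marks where the next page starts.
    return f"{base_url}{path}&limit={per_page + 1}"


def next_target(body: Dict[str, Any], per_page: int) -> Optional[str]:
    """Return the package name that starts the next page, or None when done."""
    rows = body["rows"]
    return rows[-1]["id"] if len(rows) == per_page + 1 else None


def simplify(body: Dict[str, Any], per_page: int) -> List[Dict[str, str]]:
    """Keep the rows of this page only, as small dicts (hypothetical key)."""
    rows = body["rows"]
    if len(rows) == per_page + 1:
        rows = rows[:-1]  # the extra row belongs to the next page
    return [{"name": row["id"]} for row in rows]
```

Because startkey is inclusive, dropping the extra row from the current page avoids listing the same package twice.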

class swh.lister.npm.lister.NpmIncrementalLister(url='https://replicate.npmjs.com', per_page=1000, override_config=None)[source]

Bases: swh.lister.npm.lister.NpmListerBase

List, in a paginated way, the packages in the npm registry that were updated since a specific update_seq value of the underlying CouchDB database.

PATH_TEMPLATE = '/_changes?since=%s'
property CONFIG_BASE_FILENAME

get_next_target_from_response(response: requests.models.Response) → Optional[str][source]

(Override) Get next npm package name to continue the listing.

transport_response_simplified(response: requests.models.Response) → List[Dict[str, str]][source]

(Override) Transform npm registry response to list for model manipulation.
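
A rough sketch of how a page of the changes feed could be handled. The "results", "seq" and "id" fields follow the CouchDB _changes response format, with integer sequence numbers assumed. The helper names and the "a full page means more changes remain" rule are assumptions for illustration.

```python
from typing import Any, Dict, List, Optional

PATH_TEMPLATE = "/_changes?since=%s"


def changes_path(since: int) -> str:
    """Build the request path for changes after sequence number `since`."""
    return PATH_TEMPLATE % since


def changed_names(body: Dict[str, Any]) -> List[str]:
    """Extract the document ids (package names) from one page of changes."""
    return [change["id"] for change in body["results"]]


def next_since(body: Dict[str, Any], per_page: int) -> Optional[int]:
    """Return the sequence number to resume from, or None when done."""
    results = body["results"]
    # A full page suggests more changes remain (assumed stopping rule).
    return results[-1]["seq"] if len(results) == per_page else None
```

Here the index is an integer sequence number rather than a package name, unlike the full listing.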

filter_before_inject(models_list: List[Dict[str, Any]])[source]

(Override) Filter out documents in the CouchDB database that are not related to an npm package.
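
One plausible form of such a filter, assuming the unrelated documents are CouchDB design documents, whose ids start with "_design/". The function name and the "name" key are hypothetical.

```python
from typing import Any, Dict, List


def drop_non_packages(models_list: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only entries whose name does not look like a CouchDB design document."""
    return [
        model
        for model in models_list
        if not model["name"].startswith("_design/")  # assumed criterion
    ]
```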

disable_deleted_repo_tasks(start, end, keep_these)[source]

(Override) Disable the processing performed by this method, as it is not relevant to the incremental lister. The inherited implementation would also raise an exception here because the index type differs (int instead of str).