swh.web.save_code_now.origin_save module
- swh.web.save_code_now.origin_save.get_origin_save_authorized_urls() → List[str]
Get the list of origin url prefixes authorized to be immediately loaded into the archive (whitelist).
- Returns:
The list of authorized origin url prefixes
- Return type:
List[str]
- swh.web.save_code_now.origin_save.get_origin_save_unauthorized_urls() → List[str]
Get the list of origin url prefixes forbidden to be loaded into the archive (blacklist).
- Returns:
The list of unauthorized origin url prefixes
- Return type:
List[str]
- swh.web.save_code_now.origin_save.can_save_origin(origin_url: str, bypass_pending_review: bool = False) → str
Check if a software origin can be saved into the archive.
Based on the origin url, the save request will be either:
immediately accepted if the url is whitelisted
rejected if the url is blacklisted
put in pending state for manual review otherwise
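The three outcomes above can be sketched as a prefix match against the two lists. This is only an illustration, not the module's implementation: the helper name `classify_origin_url`, the example prefixes, and the choice to check the blacklist before the whitelist are all assumptions.

```python
from typing import Sequence


def classify_origin_url(
    origin_url: str,
    authorized_prefixes: Sequence[str],
    unauthorized_prefixes: Sequence[str],
) -> str:
    """Return "accepted", "rejected" or "pending" for an origin URL.

    Hypothetical helper: checking the blacklist first is an assumption
    about precedence, not documented behavior.
    """
    if any(origin_url.startswith(prefix) for prefix in unauthorized_prefixes):
        return "rejected"
    if any(origin_url.startswith(prefix) for prefix in authorized_prefixes):
        return "accepted"
    # Neither list matched: the request waits for manual review.
    return "pending"


# Example prefixes, invented for illustration only.
AUTHORIZED = ["https://github.com/", "https://gitlab.com/"]
UNAUTHORIZED = ["https://gitlab.com/spam-group/"]

print(classify_origin_url("https://github.com/user/repo", AUTHORIZED, UNAUTHORIZED))
print(classify_origin_url("https://gitlab.com/spam-group/repo", AUTHORIZED, UNAUTHORIZED))
print(classify_origin_url("https://example.org/repo.git", AUTHORIZED, UNAUTHORIZED))
```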
- swh.web.save_code_now.origin_save.get_savable_visit_types_dict(privileged_user: bool = False) → Dict
Return the supported task types the user has access to.
- Parameters:
privileged_user – Flag to determine if all visit types should be returned or not. Defaults to False, which lists only unprivileged visit types.
- Returns:
the dict of supported visit types for the user
- swh.web.save_code_now.origin_save.get_savable_visit_types(privileged_user: bool = False) → List[str]
Return the list of visit types the user can perform save requests on.
- Parameters:
privileged_user – Flag to determine if all visit types should be returned or not. Defaults to False, which lists only unprivileged visit types.
- Returns:
the list of savable visit types
- swh.web.save_code_now.origin_save.validate_origin_url(origin_url: str) → None
Check that an origin URL is well formed and does not contain a password.
- Parameters:
origin_url – The URL to check
- Raises:
BadInputExc – if one of the checks failed
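A minimal sketch of such a check, using only the standard library. The helper `check_origin_url` and the exception class `BadInputError` are hypothetical stand-ins for illustration; the actual validation rules applied by swh-web may be stricter or different.

```python
from urllib.parse import urlsplit


class BadInputError(ValueError):
    """Hypothetical stand-in for the BadInputExc exception."""


def check_origin_url(origin_url: str) -> None:
    """Raise BadInputError if the URL is malformed or embeds a password."""
    try:
        parts = urlsplit(origin_url)
    except ValueError as exc:
        raise BadInputError(f"Invalid origin url: {origin_url!r}") from exc
    # Assumption: "well formed" means at least a scheme and a host.
    if not parts.scheme or not parts.netloc:
        raise BadInputError(f"Invalid origin url: {origin_url!r}")
    # Credentials of the form user:password@host are refused.
    if parts.password is not None:
        raise BadInputError("The origin url must not contain a password")


check_origin_url("https://example.org/repo.git")  # passes silently
```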
- swh.web.save_code_now.origin_save.origin_exists(origin_url: str) → OriginExistenceCheckInfo
Check the origin url for existence. If it exists, extract some more useful information on the origin.
- swh.web.save_code_now.origin_save.create_save_origin_request(visit_type: str, origin_url: str, privileged_user: bool = False, user_id: int | None = None, from_webhook: bool = False, webhook_origin: str | None = None, **kwargs) → SaveOriginRequestInfo
Create a loading task to save a software origin into the archive.
This function aims to create a software origin loading task through the use of the swh-scheduler component.
First, some checks are performed to see whether the visit type and origin url are valid, and whether the save request can be accepted. For the ‘archives’ visit type, this also ensures that the artifacts actually exist. If those checks pass, the loading task is created. Otherwise, the save request is put in pending or rejected state.
All the submitted save requests are logged into the swh-web database to keep track of them.
- Parameters:
visit_type – the type of visit to perform (e.g. git, hg, svn, archives, …)
origin_url – the url of the origin to save
privileged_user – Whether the user has more privileges than other users (bypass review, access to privileged visit types)
user_id – User identifier (provided when authenticated)
from_webhook – Indicates if the save request is created from a webhook receiver
webhook_origin – Indicates which forge type sent the webhook
kwargs – Optional parameters (e.g. artifact_url, artifact_filename, artifact_version)
- Raises:
BadInputExc – the visit type or origin url is invalid or does not exist
ForbiddenExc – the provided origin url is blacklisted
- Returns:
A dict describing the save request with the following keys:
visit_type: the type of visit to perform
origin_url: the url of the origin
save_request_date: the date the request was submitted
save_request_status: the request status, either accepted, rejected or pending
save_task_status: the origin loading task status, either not created, pending, scheduled, running, succeeded or failed
- Return type:
SaveOriginRequestInfo
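To make the documented keys and status values concrete, here is a hedged sketch that checks a plain dict against them. The helper `check_save_request_info` and the sample values are invented for illustration; the date representation in particular is an assumption, and the real SaveOriginRequestInfo may carry additional keys.

```python
from typing import Any, Dict

# Keys and status values as listed in the documentation above.
REQUIRED_KEYS = (
    "visit_type",
    "origin_url",
    "save_request_date",
    "save_request_status",
    "save_task_status",
)
SAVE_REQUEST_STATUSES = {"accepted", "rejected", "pending"}
SAVE_TASK_STATUSES = {
    "not created", "pending", "scheduled", "running", "succeeded", "failed",
}


def check_save_request_info(info: Dict[str, Any]) -> bool:
    """Hypothetical helper: True if the dict has the documented shape."""
    if any(key not in info for key in REQUIRED_KEYS):
        return False
    return (
        info["save_request_status"] in SAVE_REQUEST_STATUSES
        and info["save_task_status"] in SAVE_TASK_STATUSES
    )


# Sample values invented for illustration; the date format is an assumption.
example = {
    "visit_type": "git",
    "origin_url": "https://example.org/repo.git",
    "save_request_date": "2024-01-01T00:00:00+00:00",
    "save_request_status": "pending",
    "save_task_status": "not created",
}
print(check_save_request_info(example))
```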
- swh.web.save_code_now.origin_save.update_save_origin_requests_from_queryset(requests_queryset: QuerySet) → List[SaveOriginRequestInfo]
Update the status of every save request in a SaveOriginRequest queryset, store the new statuses in the database, and return the list of impacted save requests.
- Parameters:
requests_queryset – input SaveOriginRequest queryset
- Returns:
A list of save origin request info dicts as described in
swh.web.save_code_now.origin_save.create_save_origin_request()
- Return type:
List[SaveOriginRequestInfo]
- swh.web.save_code_now.origin_save.get_save_origin_requests_to_update(origin_url: str | None = None) → QuerySet
Get the set of recent save origin requests that have non-terminal statuses and require an update.
Non-terminal requests are those whose status is accepted and whose task status is either created, pending, scheduled or running.
- Parameters:
origin_url – If provided, only return requests to update for the given origin URL
- Returns:
Django queryset of requests to update
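The selection rule described above can be sketched on plain dicts. This is an assumption-laden illustration: the actual function returns a Django queryset and also restricts the result to recent requests, which is left out here; the helper name `select_requests_to_update` is hypothetical.

```python
from typing import Any, Dict, List, Optional

# Task statuses that the documentation treats as non-terminal.
NON_TERMINAL_TASK_STATUSES = {"created", "pending", "scheduled", "running"}


def select_requests_to_update(
    requests: List[Dict[str, Any]], origin_url: Optional[str] = None
) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep accepted requests with a non-terminal task."""
    selected = []
    for request in requests:
        if request["save_request_status"] != "accepted":
            continue
        if request["save_task_status"] not in NON_TERMINAL_TASK_STATUSES:
            continue
        # Optional narrowing to a single origin, as with the origin_url parameter.
        if origin_url is not None and request["origin_url"] != origin_url:
            continue
        selected.append(request)
    return selected


# Sample requests invented for illustration.
sample_requests = [
    {"origin_url": "https://example.org/a", "save_request_status": "accepted",
     "save_task_status": "running"},
    {"origin_url": "https://example.org/b", "save_request_status": "accepted",
     "save_task_status": "succeeded"},
    {"origin_url": "https://example.org/c", "save_request_status": "pending",
     "save_task_status": "not created"},
]
print([r["origin_url"] for r in select_requests_to_update(sample_requests)])
```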
- swh.web.save_code_now.origin_save.refresh_save_origin_request_statuses() → List[SaveOriginRequestInfo]
Refresh non-terminal save origin requests (SOR) in the backend.
Non-terminal SOR are requests whose status is accepted and whose task status is either created, pending, scheduled or running.
This computes the list of such save requests, checks their status in the scheduler, then updates them in the database.
Finally, this returns the refreshed information on those save requests.
- swh.web.save_code_now.origin_save.get_save_origin_requests(visit_type: str, origin_url: str) → List[SaveOriginRequestInfo]
Get all save requests for a given software origin.
- Parameters:
visit_type – the type of visit
origin_url – the url of the origin
- Raises:
BadInputExc – the visit type or origin url is invalid
swh.web.utils.exc.NotFoundExc – no save requests can be found for the given origin
- Returns:
A list of save origin request dicts as described in
swh.web.save_code_now.origin_save.create_save_origin_request()
- Return type:
List[SaveOriginRequestInfo]
- swh.web.save_code_now.origin_save.get_save_origin_request(request_id: int) → SaveOriginRequestInfo
Get save request with given identifier.
- Parameters:
request_id – the save request identifier
- Raises:
swh.web.utils.exc.NotFoundExc – no save request can be found for the given identifier
- Returns:
A save origin request dict as described in
swh.web.save_code_now.origin_save.create_save_origin_request()
- swh.web.save_code_now.origin_save.get_save_origin_task_info(save_request_id: int) → Dict[str, Any]
Get detailed information about an accepted save origin request and its associated loading task.
If the associated loading task info is archived and removed from the scheduler database, returns an empty dictionary.
- Parameters:
save_request_id – identifier of a save origin request
- Returns:
A dictionary with the following keys:
type: loading task type
arguments: loading task arguments
id: loading task database identifier
backend_id: loading task celery identifier
scheduled: loading task scheduling date
ended: loading task termination date
status: loading task execution status
visit_status: actual visit status
metadata: any other metadata related to the loading task; typically comes with the error for a failed task
- Return type:
Dict[str, Any]
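Because the returned dictionary may be empty when the task info has been archived, callers likely need to handle that case before reading any key. A small sketch of such handling follows; the helper `describe_task_info` and the sample values are invented for illustration.

```python
from typing import Any, Dict


def describe_task_info(task_info: Dict[str, Any]) -> str:
    """Hypothetical helper: one-line summary of a task info dictionary."""
    if not task_info:
        # Empty dict: the task info was archived and removed from the scheduler.
        return "task info no longer available"
    summary = f"{task_info['type']} task {task_info['id']}: {task_info['status']}"
    # The metadata key is optional here; it typically carries failure details.
    if task_info.get("metadata"):
        summary += f" (metadata: {task_info['metadata']})"
    return summary


# Sample values invented for illustration.
print(describe_task_info({}))
print(describe_task_info({"type": "load-git", "id": 42, "status": "succeeded"}))
```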
- swh.web.save_code_now.origin_save.schedule_origins_recurrent_visits(save_requests: List[SaveOriginRequestInfo]) → int
Schedule recurrent visits of origin URLs submitted to Save Code Now.
- Parameters:
save_requests – List of save origin requests from which to schedule recurrent visits
- Returns:
The number of origins that were scheduled for recurrent visits