swh.scheduler.celery_backend.utils module

swh.scheduler.celery_backend.utils.get_loader_task_type(scheduler: SchedulerInterface, visit_type: str) → TaskType | None

Given a visit type, return its associated task type.
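As a rough illustration of this lookup, the sketch below uses a hypothetical in-memory stand-in for the scheduler. It assumes that loader task types are registered under names of the form `load-<visit_type>`. That naming convention, the `FakeScheduler` class, the `lookup_loader_task_type` helper, and the example backend name are all assumptions made for illustration; they are not taken from this page.

```python
from typing import Dict, Optional


class FakeScheduler:
    """Hypothetical stand-in for SchedulerInterface (illustration only)."""

    def __init__(self, task_types: Dict[str, dict]) -> None:
        self._task_types = task_types

    def get_task_type(self, name: str) -> Optional[dict]:
        # Return the registered task type, or None when the name is unknown.
        return self._task_types.get(name)


def lookup_loader_task_type(scheduler: FakeScheduler, visit_type: str) -> Optional[dict]:
    # Assumption: loader task types are named "load-<visit_type>".
    return scheduler.get_task_type(f"load-{visit_type}")


# Made-up registry with a single loader task type.
scheduler = FakeScheduler(
    {"load-git": {"type": "load-git", "backend_name": "example.tasks.load_git"}}
)

git_task_type = lookup_loader_task_type(scheduler, "git")
unknown_task_type = lookup_loader_task_type(scheduler, "no-such-type")
```

Here an unknown visit type yields `None`, matching the `TaskType | None` return annotation in the signature.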

swh.scheduler.celery_backend.utils.send_to_celery(scheduler: SchedulerInterface, visit_type_to_queue: Dict[str, str], enabled: bool = True, lister_name: str | None = None, lister_instance_name: str | None = None, policy: str = 'oldest_scheduled_first', tablesample: float | None = None, absolute_cooldown: timedelta | None = None, scheduled_cooldown: timedelta | None = None, failed_cooldown: timedelta | None = None, not_found_cooldown: timedelta | None = None)

Utility function to read tasks from the scheduler and send those directly to celery.

Parameters:
  • visit_type_to_queue – Mapping of visit/loader type (e.g. git, svn, …) to the queue to which its tasks are sent.

  • enabled – Whether to list enabled or disabled origins. By default, only enabled origins are listed; in some edge cases, the disabled ones may be wanted instead.

  • lister_name – Only consider origins listed by the lister with this name

  • lister_instance_name – Only consider origins listed by the lister with this instance name

  • policy – the scheduling policy used to select which visits to schedule

  • tablesample – the percentage of the table on which we run the query (None: no sampling)

  • absolute_cooldown – the minimal interval between two visits of the same origin

  • scheduled_cooldown – the minimal interval before which we can schedule the same origin again if it has not been visited

  • failed_cooldown – the minimal interval before which we can reschedule a failed origin

  • not_found_cooldown – the minimal interval before which we can reschedule a not_found origin

Returns:

The number of tasks sent to celery
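
A minimal call sketch follows. So that it runs without a scheduler or a Celery broker, it calls a hypothetical stand-in, `send_to_celery_stub`, that mirrors the keyword parameters documented above and sends nothing. In real use you would import `send_to_celery` from `swh.scheduler.celery_backend.utils` and pass an actual `SchedulerInterface` instance. The queue names, sampling percentage, and cooldown values are made up for illustration.

```python
from datetime import timedelta
from typing import Any, Dict, Optional


def send_to_celery_stub(
    scheduler: Any,
    visit_type_to_queue: Dict[str, str],
    enabled: bool = True,
    lister_name: Optional[str] = None,
    lister_instance_name: Optional[str] = None,
    policy: str = "oldest_scheduled_first",
    tablesample: Optional[float] = None,
    absolute_cooldown: Optional[timedelta] = None,
    scheduled_cooldown: Optional[timedelta] = None,
    failed_cooldown: Optional[timedelta] = None,
    not_found_cooldown: Optional[timedelta] = None,
) -> int:
    """Hypothetical stand-in mirroring the documented signature; sends nothing."""
    return 0


call_kwargs = dict(
    # Hypothetical queue names, one per visit type.
    visit_type_to_queue={"git": "example.queue.git", "svn": "example.queue.svn"},
    enabled=True,
    policy="oldest_scheduled_first",
    tablesample=10.0,  # run the query on 10 percent of the table
    absolute_cooldown=timedelta(hours=12),
    failed_cooldown=timedelta(days=14),
    not_found_cooldown=timedelta(days=31),
)

# The real function returns the number of tasks sent; the stub always returns 0.
nb_tasks_sent = send_to_celery_stub(scheduler=None, **call_kwargs)
print(f"{nb_tasks_sent} tasks sent")
```

Passing the cooldowns as `timedelta` objects matches the type annotations in the signature; leaving any of them as `None` keeps the default.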