Command-line interface#
Scheduler task utilities#
swh scheduler task#
Manipulate tasks.
Expected configuration:
swh scheduler task [OPTIONS] COMMAND [ARGS]...
add#
Schedule one task from arguments.
The first argument is the name of the task type. Flag options (policy, priority) are task configuration. Further options are positional and keyword argument(s) of the task, in YAML format. Keyword args are of the form key=value.
Usage sample:
swh-scheduler --database 'service=swh-scheduler' \
task add list-pypi
swh-scheduler --database 'service=swh-scheduler' \
task add list-debian-distribution --policy=oneshot distribution=stretch
Note: if no priority is given, the task is created without one, which is treated as the lowest priority level.
swh scheduler task add [OPTIONS] TASK_TYPE_NAME [OPTIONS]...
Options
- -p, --policy <policy>#
- Options:
recurring | oneshot
- -P, --priority <priority>#
- Options:
low | normal | high
- -n, --next-run <next_run>#
Arguments
- TASK_TYPE_NAME#
Required argument
- OPTIONS#
Optional argument(s)
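As a minimal sketch of how the pieces of an add invocation fit together, the Python snippet below assembles the argument list from a task type, the flag options, and the key=value keyword arguments. The helper name build_task_add is hypothetical and not part of swh-scheduler; it only builds the command line and does not run it. It also assumes that simple scalar values can be written as-is in YAML.

```python
import shlex


def build_task_add(task_type, *args, policy=None, priority=None, **kwargs):
    """Assemble an argv list for `swh scheduler task add`.

    Hypothetical helper for illustration; it is not part of swh-scheduler.
    """
    argv = ["swh", "scheduler", "task", "add"]
    # Flag options (policy, priority) are task configuration.
    if policy is not None:
        argv.append(f"--policy={policy}")
    if priority is not None:
        argv.append(f"--priority={priority}")
    # The first argument is the name of the task type.
    argv.append(task_type)
    # Positional task arguments, then keyword arguments as key=value.
    # Assumption: values are simple scalars that are valid YAML as written.
    argv.extend(str(arg) for arg in args)
    argv.extend(f"{key}={value}" for key, value in kwargs.items())
    return argv


argv = build_task_add(
    "list-debian-distribution", policy="oneshot", distribution="stretch"
)
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run once the scheduler configuration (for example the --database option shown in the usage sample) has been supplied.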
list#
List tasks.
swh scheduler task list [OPTIONS]
Options
- -i, --task-id <ID>#
List only tasks whose id is ID.
- -t, --task-type <TYPE>#
List only tasks of type TYPE
- -l, --limit <limit>#
The maximum number of tasks to fetch.
- -s, --status <STATUS>#
List tasks whose status is STATUS.
- Options:
next_run_not_scheduled | next_run_scheduled | completed | disabled
- -p, --policy <policy>#
List tasks whose policy is POLICY.
- Options:
recurring | oneshot
- -P, --priority <priority>#
List tasks whose priority is PRIORITY.
- Options:
all | low | normal | high
- -b, --before <DATETIME>#
Limit to tasks supposed to run before the given date.
- -a, --after <DATETIME>#
Limit to tasks supposed to run after the given date.
- -r, --list-runs#
Also list past executions of each task.
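The filters above accept a fixed set of values for status, policy and priority. The following Python sketch checks those values before building a list command, which can catch typos before the CLI is invoked. The helper build_task_list and the ALLOWED table are hypothetical names introduced here; the allowed values are copied from the option descriptions above.

```python
# Allowed values, copied from the documented choices of `task list`.
ALLOWED = {
    "status": {
        "next_run_not_scheduled",
        "next_run_scheduled",
        "completed",
        "disabled",
    },
    "policy": {"recurring", "oneshot"},
    "priority": {"all", "low", "normal", "high"},
}


def build_task_list(limit=None, list_runs=False, **filters):
    """Assemble an argv list for `swh scheduler task list` (hypothetical helper)."""
    argv = ["swh", "scheduler", "task", "list"]
    for name, value in filters.items():
        if name not in ALLOWED:
            raise ValueError(f"unsupported filter: {name}")
        if value not in ALLOWED[name]:
            raise ValueError(f"invalid {name}: {value}")
        argv += [f"--{name}", value]
    if limit is not None:
        argv += ["--limit", str(limit)]
    if list_runs:
        argv.append("--list-runs")
    return argv


argv = build_task_list(status="disabled", policy="oneshot", limit=5, list_runs=True)
print(" ".join(argv))
```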
list-pending#
List tasks with no priority that are going to be run.
You can override the number of tasks to fetch with the --limit flag.
swh scheduler task list-pending [OPTIONS] TASK_TYPES...
Options
- -l, --limit <num_tasks>#
The maximum number of tasks to fetch
- -b, --before <before>#
List all jobs supposed to run before the given date
Arguments
- TASK_TYPES#
Required argument(s)
respawn#
Respawn tasks.
Respawn tasks given by their ids (see the ‘task list’ command to find task ids) at the given date (immediately by default).
For example:
swh-scheduler task respawn 1 3 12
swh scheduler task respawn [OPTIONS] TASK_IDS...
Options
- -n, --next-run <DATETIME>#
Respawn the selected tasks at this date
Arguments
- TASK_IDS#
Required argument(s)
schedule#
Schedule tasks from a CSV input file.
The expected columns can be set through the -c option; the accepted column names are type, args, kwargs, policy and next_run.
The CSV can be read either from a named file, or from stdin (use - as filename).
Usage sample:
cat scheduling-task.txt | \
python3 -m swh.scheduler.cli \
--database 'service=swh-scheduler-dev' \
task schedule \
--columns type --columns kwargs --columns policy \
--delimiter ';' -
swh scheduler task schedule [OPTIONS] FILE
Options
- -c, --columns <columns>#
columns present in the CSV file
- Options:
type | args | kwargs | policy | next_run
- -d, --delimiter <delimiter>#
Arguments
- FILE#
Required argument
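To make the input format more concrete, here is a minimal Python sketch that produces a CSV file matching the usage sample (columns type, kwargs and policy, with ';' as delimiter). It assumes that the kwargs column holds a JSON-encoded object; this page does not state the encoding, so treat that as an assumption. The function name render_tasks_csv and the sample rows are hypothetical.

```python
import csv
import io
import json

# Hypothetical rows: (task type, keyword arguments, policy).
rows = [
    ("list-pypi", {}, "recurring"),
    ("list-debian-distribution", {"distribution": "stretch"}, "oneshot"),
]


def render_tasks_csv(rows, delimiter=";"):
    """Render rows as CSV text with columns type, kwargs, policy."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for task_type, kwargs, policy in rows:
        # Assumption: kwargs are serialized as a JSON object.
        writer.writerow([task_type, json.dumps(kwargs), policy])
    return buffer.getvalue()


csv_text = render_tasks_csv(rows)
print(csv_text, end="")
```

The text can be written to a file and passed as FILE, or piped to the command with - as the filename, as in the usage sample.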
schedule_origins#
Schedules tasks for origins that are already known.
The first argument is the name of the task type, further ones are keyword argument(s) of the task in the form key=value, where value is in YAML format.
Usage sample:
swh-scheduler --database 'service=swh-scheduler' \
task schedule_origins index-origin-metadata
swh scheduler task schedule_origins [OPTIONS] TYPE [OPTIONS]...
Options
- -b, --batch-size <origin_batch_size>#
Number of origins per task
- Default:
10
- --page-token <page_token>#
Only schedule tasks for origins whose ID is greater than this page token
- Default:
0
- --limit <limit>#
Schedule at most this number of tasks
- -g, --storage-url <storage_url>#
URL of the (graph) storage API
- --dry-run, --no-dry-run#
List only what would be scheduled.
Arguments
- TYPE#
Required argument
- OPTIONS#
Optional argument(s)
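The interplay of --batch-size and --limit can be pictured with a short Python sketch: origins are grouped into batches of a fixed size, one batch per task, and the number of resulting tasks is capped. This only illustrates the documented semantics and is not the scheduler's actual implementation; the helper batched and the origin identifiers are hypothetical.

```python
from itertools import islice


def batched(origins, batch_size=10):
    """Yield successive lists of at most batch_size origins."""
    iterator = iter(origins)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical origin identifiers, for illustration only.
origin_ids = range(1, 26)

# One task per batch, using the default batch size of 10.
batches = list(batched(origin_ids))
print([len(batch) for batch in batches])

# Capping the number of tasks, in the spirit of --limit 2.
limited = list(islice(batched(origin_ids), 2))
print(len(limited))
```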
swh scheduler task_type#
Manipulate task types.
Expected configuration:
swh scheduler task_type [OPTIONS] COMMAND [ARGS]...
add#
Create a new task type
swh scheduler task_type add [OPTIONS] TYPE TASK_NAME DESCRIPTION
Options
- -i, --default-interval <default_interval>#
Default interval (“90 days” by default)
- --min-interval <min_interval>#
Minimum interval (default interval if not set)
- -i, --max-interval <max_interval>#
Maximum interval (default interval if not set)
- -f, --backoff-factor <backoff_factor>#
Backoff factor
Arguments
- TYPE#
Required argument
- TASK_NAME#
Required argument
- DESCRIPTION#
Required argument
list#
swh scheduler task_type list [OPTIONS]
Options
- -v, --verbose#
Verbose mode
- -t, --task_type <task_type>#
List task types of given type
- -n, --task_name <task_name>#
List task types of given backend task name
register#
Register missing task-type entries in the scheduler.
Entries are derived from the tasks declared in each loaded worker plugin (e.g. lister, loader, …).
swh scheduler task_type register [OPTIONS]
Options
- -p, --plugins <plugins>#
Registers task-types for provided plugins. Defaults to all
Scheduler server utilities#
swh scheduler runner#
Starts a swh-scheduler runner service.
This process is responsible for checking for ready-to-run tasks and scheduling them.
Expected configuration:
swh scheduler runner [OPTIONS]
Options
- -p, --period <period>#
Period (in s) at which pending tasks are checked and executed. Set to 0 (default) for a one-shot run.
- --task-type <task_type_names>#
Task types to schedule. If not provided, this iterates over every task type referenced in the scheduler backend.
- --with-priority, --without-priority#
Determine whether the tasks to schedule are the ones with a priority or not. By default, this deals with tasks without any priority.
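The --period semantics can be sketched in a few lines of Python: a period of 0 means a single pass, while any other value repeats the check after sleeping for that many seconds. This is an illustration of the documented behaviour only, not the runner's actual code; run_periodically, check_once and max_rounds are hypothetical names, and max_rounds exists solely so that the example terminates.

```python
import time


def run_periodically(check_once, period=0, max_rounds=None):
    """Call check_once once (period == 0) or repeatedly every period seconds.

    Returns the number of rounds performed.
    """
    rounds = 0
    while True:
        check_once()
        rounds += 1
        # A period of 0 means a single one-shot pass.
        if period == 0:
            return rounds
        # Hypothetical stop condition so the sketch does not loop forever.
        if max_rounds is not None and rounds >= max_rounds:
            return rounds
        time.sleep(period)


calls = []
rounds = run_periodically(lambda: calls.append("checked"), period=0)
print(rounds, len(calls))
```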
swh scheduler listener#
Starts a swh-scheduler listener service.
This service is responsible for listening for task lifecycle events and handling their workflow status in the database.
Expected configuration:
swh scheduler listener [OPTIONS]
swh scheduler rpc-serve#
Starts a swh-scheduler API HTTP server.
Expected configuration:
swh scheduler rpc-serve [OPTIONS]
Options
- --host <host>#
Host on which to run the scheduler server API
- --port <port>#
Binding port of the server
- --debug, --nodebug#
Indicates if the server should run in debug mode. Defaults to True if log-level is DEBUG, False otherwise.
swh scheduler celery-monitor#
Monitoring of Celery
swh scheduler celery-monitor [OPTIONS] COMMAND [ARGS]...
Options
- --timeout <timeout>#
Timeout for celery remote control
- --pattern <pattern>#
Celery destination pattern
list-running#
List running tasks on the lister workers
swh scheduler celery-monitor list-running [OPTIONS]
Options
- --format <format>#
Output format
- Options:
pretty | csv
ping-workers#
Check which workers respond to the celery remote control
swh scheduler celery-monitor ping-workers [OPTIONS]