
Restart scheduling regularly origins with relevant scheduling policies
Closed, Migrated

Description

A tmux session was running on saatchi to:

  • schedule regular origin listing for at least git and npm [1]
  • periodically update scheduler metrics (this one needs to be installed as a cron job)

We need to restart those so that origins are visited according to their dedicated policy.

As far as I am able to tell, the policies are, per visit type:

  • npm: already_visited_order_by_lag (to avoid visiting those too often)
  • git: never_visited_oldest_update_first (to reduce the GitHub lag)

[1] This is a temporary workaround: we still need an extra tool to orchestrate the
scheduling through a feedback loop based on the metrics (work on that part has not
begun yet)

Event Timeline

ardumont renamed this task from Restart scheduling regularly origins to Restart scheduling regularly origins with relevant scheduling policies. Jul 28 2021, 11:56 AM
ardumont triaged this task as High priority.
ardumont created this task.

The tmux session is started as the root user (so it's shareable).
Commands are executed in a virtualenv as the swhscheduler user.

Metric update command to execute regularly (I'll install it as a cron job, but I ran it manually first to check that it works; it does):

(ve) swhscheduler@saatchi:~$ SWH_CONFIG_FILENAME=/etc/softwareheritage/scheduler/backend.yml swh scheduler --config-file /etc/softwareheritage/scheduler/backend.yml origin update-metrics
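A minimal sketch of how that command could be wrapped for cron, assuming it ends up in a small script run as swhscheduler from the virtualenv. The function name `update_metrics`, the `BACKEND_CONFIG` variable, the script path and the 10-minute interval are all assumptions for illustration, not what is actually deployed:

```shell
# Sketch only: update_metrics, BACKEND_CONFIG and the script path below are
# hypothetical names, not part of swh-scheduler.
BACKEND_CONFIG="${BACKEND_CONFIG:-/etc/softwareheritage/scheduler/backend.yml}"

update_metrics() {
  # Same command as the one run by hand, with the config path factored out.
  SWH_CONFIG_FILENAME="$BACKEND_CONFIG" \
    swh scheduler --config-file "$BACKEND_CONFIG" origin update-metrics
}

# Possible /etc/cron.d entry (interval and path are assumptions), for a script
# that activates the virtualenv, sources this function and calls it:
#   */10 * * * * swhscheduler /usr/local/bin/update-scheduler-metrics.sh
```

Keeping the config path in one variable avoids repeating it in both the environment variable and the `--config-file` option.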

git:

(ve) swhscheduler@saatchi:~$ while true; do
>   SWH_CONFIG_FILENAME=/etc/softwareheritage/scheduler/listener-runner.yml \
>     swh scheduler -C /etc/softwareheritage/scheduler/listener-runner.yml \
>       origin send-to-celery -p never_visited_oldest_update_first git
>   sleep 600;
> done
2 slots available in celery queue
...

npm:

(ve) swhscheduler@saatchi:~$ while true; do
>   SWH_CONFIG_FILENAME=/etc/softwareheritage/scheduler/listener-runner.yml \
>     swh scheduler -C /etc/softwareheritage/scheduler/listener-runner.yml \
>       origin send-to-celery \
>       -p already_visited_order_by_lag npm;
>   sleep 1200;
> done
7500 slots available in celery queue
112 visits to send to celery
...
ardumont changed the task status from Open to Work in Progress. Jul 28 2021, 12:05 PM
ardumont moved this task from Backlog to in-progress on the System administration board.

Heads up, this is now running slightly differently.

Three processes are now running in three different shells (still within the same tmux
session on saatchi), one per (visit type, sleep interval) pair: (git, 900s),
(pypi, 1800s), (npm, 3600s).

This seems to be enough to regularly schedule more interesting origins than the old
scheduler runner [1].

visit_type=git; sleep=900; while true; do
  for policy in never_visited_oldest_update_first already_visited_order_by_lag; do
    echo "$(date) scheduling $visit_type origins with policy ${policy}"
    SWH_CONFIG_FILENAME=/etc/softwareheritage/scheduler/listener-runner.yml \
      swh scheduler -C /etc/softwareheritage/scheduler/listener-runner.yml \
        origin send-to-celery --policy "$policy" "$visit_type"
    echo "$(date) sleep $sleep"
    sleep "$sleep"
  done
done
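Since the three shells only differ by visit type and sleep interval, the loop above could be factored into one function. This is a sketch under assumptions: `schedule_loop`, `CONFIG` and `MAX_ROUNDS` are hypothetical names introduced here. Only the `swh scheduler ... origin send-to-celery --policy` call comes from the commands above. As in the original loop, it sleeps after each policy, not after each full round.

```shell
# Sketch only: schedule_loop, CONFIG and MAX_ROUNDS are hypothetical names.
CONFIG="${CONFIG:-/etc/softwareheritage/scheduler/listener-runner.yml}"

# Usage: schedule_loop VISIT_TYPE SLEEP_SECONDS POLICY [POLICY...]
schedule_loop() {
  local visit_type="$1" pause="$2"
  shift 2
  local rounds=0 policy
  while true; do
    for policy in "$@"; do
      echo "$(date) scheduling $visit_type origins with policy $policy"
      # Keep looping even if one scheduling call fails.
      SWH_CONFIG_FILENAME="$CONFIG" \
        swh scheduler -C "$CONFIG" \
          origin send-to-celery --policy "$policy" "$visit_type" \
        || echo "$(date) send-to-celery failed for $visit_type ($policy)" >&2
      echo "$(date) sleep $pause"
      sleep "$pause"
    done
    rounds=$((rounds + 1))
    # MAX_ROUNDS unset or 0 means loop forever, like the original loop.
    if [ "${MAX_ROUNDS:-0}" -gt 0 ] && [ "$rounds" -ge "${MAX_ROUNDS:-0}" ]; then
      break
    fi
  done
}
```

Each shell would then run one call, for instance `schedule_loop git 900 never_visited_oldest_update_first already_visited_order_by_lag`. Setting `MAX_ROUNDS=1` gives a bounded run, which is handy to check the output before leaving it unattended.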

[1] Somehow it started scheduling git origins again as well, and most of those visits
seemed to be no-op events (resulting in a high overall load on the infra). I could not
find the SQL query to deactivate those again. So in the meantime, I have patched the
swh-scheduler-runner service on saatchi so that it skips that "git origins without
priority" scheduling (there are tags on grafana mentioning this).