
swh-docker-dev: `load-tar` tasks are never executed
Closed, Migrated

Description

With the swh-docker-dev stack, the load-tar tasks generated by the GNU lister are never executed. Note that tasks generated by the GitLab lister don't have this issue.

It's quite easy to reproduce: with a fresh swh-docker-dev environment, run the GNU lister from the lister container:

docker-compose exec swh-lister python3 -c 'import logging; from swh.lister.gnu.tasks import gnu_lister; logging.basicConfig(level=logging.DEBUG); gnu_lister()'

More than 300 tasks are created. Here is a fragment of the scheduler's task list:

docker-compose exec swh-scheduler-api swh scheduler task list
...
Task 384
  Next run: 7 minutes ago (2019-09-04 17:35:39+00:00)
  Interval: 64 days, 0:00:00
  Type: load-tar
  Policy: recurring
  Status: next_run_scheduled
  Priority: 
  Args:
    'glibc'
    'https://ftp.gnu.org/gnu/glibc/'
  Keyword args:
    tarballs: [{'date': '859017600', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-2.0.2.tar.gz'}, {'date': '862470000', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-2.0.3-m68k-linux.bin.tar.gz'}, {'date': '861692400', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-2.0.3.tar.gz'}, {'date': '865321200', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-2.0.4-m68k-linux.bin.tar.gz'}, {'date': '865148400', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-2.0.4.bin.i386.tar.gz'}, {'date': '864716400', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-2.0.4.tar.gz'}, {'date': '872492400', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-2.0.5.tar.gz'}, {'date': '854352000', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-2.0.bin.i386.tar.gz'}, {'date': '854352000', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-2.0.tar.gz'}, {'date': '859017600', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-linuxthreads-2.0.2.tar.gz'}, {'date': '861951600', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-linuxthreads-2.0.3a.tar.gz'}, {'date': '864716400', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-linuxthreads-2.0.4.tar.gz'}, {'date': '872492400', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-linuxthreads-2.0.5.tar.gz'}, {'date': '854352000', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-linuxthreads-2.0.tar.gz'}, {'date': '859017600', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-localedata-2.0.2.tar.gz'}, {'date': '861692400', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-localedata-2.0.3.tar.gz'}, {'date': '864716400', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-localedata-2.0.4.tar.gz'}, {'date': '872492400', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-localedata-2.0.5.tar.gz'}, {'date': '854352000', 'archive': 'https://ftp.gnu.org/old-gnu/glibc/glibc-localedata-2.0.tar.gz'}]

As you can see, the status of the task is next_run_scheduled, but the task is never executed.

Event Timeline

lewo created this object in space S1 Public.

Given the current state, I can assume that:

  • the lister db is properly set up, since the lister ran (we see new scheduling entries)
  • the scheduler db is properly set up as well
  • the scheduler runner ran (it reads scheduling entries and pushes them to the rabbitmq queues)

It just seems that nothing is subscribed to the load-tar queue.
As mentioned on IRC, check conf/loader.yml: I don't see the load-tar task listed there.
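The situation described above (tasks pushed to a queue that no worker consumes) can be illustrated with a small toy model. This is only a sketch: the `Broker` class and the queue names `load-tar` and `load-git` are hypothetical stand-ins chosen for illustration, not the real rabbitmq or Celery API and not the actual queue names used by the stack.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory stand-in for a message broker with named queues."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def publish(self, queue, message):
        # The scheduler runner's role: push a task onto its queue.
        self.queues[queue].append(message)

    def consume(self, subscribed):
        # A worker's role: drain only the queues it is subscribed to.
        done = []
        for name in subscribed:
            while self.queues[name]:
                done.append(self.queues[name].popleft())
        return done


broker = Broker()
broker.publish("load-tar", "glibc")      # hypothetical queue name
broker.publish("load-git", "some-repo")  # hypothetical queue name

# The worker subscribes to "load-git" only, so "load-tar" is never drained.
worker_queues = ["load-git"]
executed = broker.consume(worker_queues)
pending = {name: len(q) for name, q in broker.queues.items() if q}

print("executed:", executed)
print("still pending:", pending)
```

In this model the `load-tar` message stays in its queue indefinitely, which matches the observed symptom: the task is marked as scheduled but nothing ever picks it up.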

Note:
some other tools can help, by the way (you may know them already):

  • docker-compose logs --follow <docker-container-name-as-reported-in-docker-compose.yml>
  • docker-compose ps <- to see what's running
ardumont renamed this task from `load-tar` tasks are never executed to swh-docker-dev: `load-tar` tasks are never executed (Sep 5 2019, 9:56 AM).
ardumont changed the task status from Open to Work in Progress.
ardumont triaged this task as Normal priority.
ardumont added a project: Docker environment.

Thanks to this change (provided by @olasd) in the swh-docker-dev repository, the tasks are now executed:

diff --git a/conf/loader.yml b/conf/loader.yml
index 4a4fb54..0cc07e6 100644
--- a/conf/loader.yml
+++ b/conf/loader.yml
@@ -5,6 +5,7 @@ storage:
 celery:
   task_broker: amqp://guest:guest@amqp//
   task_modules:
+    - swh.loader.package.tasks
     - swh.loader.debian.tasks
     - swh.loader.dir.tasks
     - swh.loader.git.tasks
@@ -16,6 +17,7 @@ celery:
     - swh.deposit.loader.tasks
 
   task_queues:
+    - swh.loader.package.tasks.LoadGNU
     - swh.loader.debian.tasks.LoadDebianPackage
     - swh.loader.dir.tasks.LoadDirRepository
     - swh.loader.git.tasks.LoadDiskGitRepository
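
A quick way to catch this kind of omission is to compare the queues a worker is configured for against the ones you expect. The sketch below is an assumption-laden illustration: it represents an excerpt of conf/loader.yml as a plain Python dict (only the entries visible in the diff above, so the lists are incomplete), and `missing_queues` is a hypothetical helper, not part of any swh package.

```python
def missing_queues(config, wanted):
    """Return the queue names from `wanted` that are absent from
    config["celery"]["task_queues"] (hypothetical helper)."""
    subscribed = set(config.get("celery", {}).get("task_queues", []))
    return [queue for queue in wanted if queue not in subscribed]


# Partial excerpt of conf/loader.yml before the fix, written as a dict.
# Only the entries shown in the diff are listed here.
config_before = {
    "celery": {
        "task_broker": "amqp://guest:guest@amqp//",
        "task_queues": [
            "swh.loader.debian.tasks.LoadDebianPackage",
            "swh.loader.dir.tasks.LoadDirRepository",
            "swh.loader.git.tasks.LoadDiskGitRepository",
        ],
    }
}

# The same excerpt after the fix: the LoadGNU queue is added.
config_after = {
    "celery": {
        **config_before["celery"],
        "task_queues": ["swh.loader.package.tasks.LoadGNU"]
        + config_before["celery"]["task_queues"],
    }
}

wanted = ["swh.loader.package.tasks.LoadGNU"]
print("missing before fix:", missing_queues(config_before, wanted))
print("missing after fix:", missing_queues(config_after, wanted))
```

With a real configuration file, the dict would come from parsing the YAML (for example with a YAML library) rather than being written by hand.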

Hm, I don't know how to close this issue, but it should be closed now.
Thanks.

ardumont claimed this task.