Page MenuHomeSoftware Heritage

tutorial: How to run a new lister (within docker-dev)

Authored by nahimilega on May 3 2019, 10:23 AM.



Added a note informing about adding the name of new lister in the main in order to make it functional

As there is no documentation of the procedure to run lister with docker
we can extend the approach of this diff to -

  • Include how to mount your local development modifications (docker-compose.override.yml).
  • Add a section titled "How to run a *new* lister"

Diff Detail

rDLS Listers
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Build has FAILED

You need to rebase your change on master

I have one diff (gnu lister) already rebased on the master, how can I have both the diffs rebased on master?

Use git branches (rebased on latest master) and arc diff from those.

Added an important note in lister tutorial

This revision is now accepted and ready to land.Jun 6 2019, 5:06 PM
This revision now requires review to proceed.Jun 12 2019, 1:56 PM
nahimilega edited the summary of this revision. (Show Details)

Add testing and how to run a new lister in docker section in tutorial, also fixed few grammatical errors

ardumont added inline comments.

I'd keep the original which i found clearer.


Please keep backward here, it's correct.




I'm wondering if we want to enter into so much details.

@douardda What do you think? Isn't docker-compose up enough?


Truncate the output.
Also, keep it formatted as the real cli do, it's nicer to read with indentation.


is creating new loading task not yet registered, you need to register that task type as well.


Well, yes and no.
We will try to avoid using python top-level, let's use the scheduler instead.

You need to schedule a task, for example with the gnu lister:

swh scheduler --url http://localhost:5008/ task add list-gnu-full --policy oneshot

It's missing a new chapter that explicits we are changing subject here.

Ok, i think i see now.
Keep the original chapter as it is already.

Then add a new one to explicit how to run (and test) lister within the docker-dev environment.
After this one...

Let's discuss this in irc...

This revision now requires changes to proceed.Jun 13 2019, 2:02 PM
nahimilega added inline comments.

Here I intentionally did this because docker-compose up will start all the containers with could be harsh on the pc. Running all the docker containers while working eats up all my RAM

nahimilega marked 6 inline comments as done.

Made suggested changes

ardumont retitled this revision from Added an important note in lister tutorial. to tutorial: How to run a new lister (within docker-dev).Jun 13 2019, 7:34 PM
  • Add new page "run_a new_lister"
ardumont added inline comments.
4 ↗(On Diff #5251)

rename this to .. _run-lister-tutorial

You should then be able to link it

9 ↗(On Diff #5251)

docker-dev. It provides an almost production-like environment.

11 ↗(On Diff #5251)

when running in production.

I think we can drop Hence ... deployment.

16 ↗(On Diff #5251)
  1. Edit a docker-compose.override.yml, following the sample provided:
29 ↗(On Diff #5251)

Highlight Preparation steps and Configuration

63 ↗(On Diff #5251)

. You (missing a space in between)

64 ↗(On Diff #5251)



You say it already in the page, just propose it:

After tests, it's suggested to run your new lister in docker-dev: `How to run a lister <Link_to_the_page>`.

Yes, but that's a detail of your machine (as annoying as it is).
There should be a global note (which i think already exists) in the main part of the docker-dev (or getting-started) which explains the caveats.

Also, by exposing the service names here, that pose a potential problem if we ever change it (docker-compose won't though).

This revision now requires changes to proceed.Jun 14 2019, 6:43 PM
67–68 ↗(On Diff #5251)

I tested this method of running lister, but when I used it, it only created a task of "list-gnu-full", nothing else. No other code was executed, neither more task of "load-gnu" created.
Whereas when I used in python code which was mentioned in README, it ran pretty nice and created loading tasks.

Can you please tell me what could be the reason for this behaviour?

nahimilega marked 8 inline comments as done.
  • Fixed the docunment according to suggestion
ardumont added inline comments.
9 ↗(On Diff #5257)
... docker-dev. This provides ...

Prefer short sentence when possible.

12 ↗(On Diff #5257)
To run a lister within your local environment:
27 ↗(On Diff #5257)

Please explain that this use docker with mount points on your local environment (your modified code) and not the pypi one.
That allows to execute the development version.

39 ↗(On Diff #5257)

You are missing the backend name (fqdn of the lister task needed for the code to run) as 2nd parameter of the cli, please fix this to:

swh scheduler task-type add list-gnu-full "swh.lister.gnu.tasks.GNUListerTask" "Full GNU lister" \
  --default-interval '1 day' \
  --backoff-factor 1

Note: that must be the reason of your error below.

When developing a new lister, it's important to test. 
For this, add the tests (check `swh/lister/*/tests/`) and
register the celery tasks in the main (`swh/lister....`).

Another important step is to actually run it within the 
docker-dev (:ref:`run-lister-tutorial`).
This revision now requires changes to proceed.Jun 17 2019, 2:39 PM
nahimilega marked 4 inline comments as done.
  • Made recommended changes
39 ↗(On Diff #5257)

I added the task type as you recommended.

(swh) archit@work-pc:~/swh-environment/swh-lister$ swh scheduler task-type list -v
list-gnu-full: swh.lister.gnu.tasks.GNUListerTask
  Full GNU lister
  interval: 1 day, 0:00:00 [None, None]
  backoff_factor: 1.0
  max_queue_length: None
  num_retries: None
  retry_delay: None

Now I created the task as

swh scheduler --url http://localhost:5008/ task add \
      list-gnu-full --policy oneshot

But when I checked for tasks created by the lister, there were none

(swh) archit@work-pc:~/swh-environment/swh-lister$ swh scheduler task list
Found 3 tasks

Task 16809
  Next run: 12 minutes ago (2019-06-17 19:02:11+00:00)
  Interval: 1 day, 0:00:00
  Type: list-gnu-full
  Policy: oneshot
  Status: disabled
  Keyword args:

There is something still something wrong. The lister is not creating the loading task.
Can you please help me find where the steps went wrong.

39 ↗(On Diff #5257)

Have you added the associated load-gnu task (as correctly mentioned in the doc ;)?

I did not see it earlier, in your command sample, please insert a space between $ and the start of the command.
I think it's way clearer to read.
Also it eases copy/paste from a user perspective (one cannot always see clearly the limit between the prompt and the start of the cli she wants to execute).

Almost there (again).
But this time, i'm right (hopefully ;)

73 ↗(On Diff #5292)

You can also add a note to check that the lister's cache (table in db) has been indeed filled.


That paragraph is redundant with the previous one which is much better (which also has sphinx ready link).
Please remove it.

This revision now requires changes to proceed.Jun 19 2019, 8:52 AM
nahimilega marked an inline comment as done.
  • Add point 3 in 'run a new lister' and made recommended changes
This revision is now accepted and ready to land.Jun 20 2019, 10:00 AM

I am not able to push this

(swh) archit@work-pc:~/swh-environment/swh-lister$ git push
Counting objects: 9, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (9/9), 3.58 KiB | 1.19 MiB/s, done.
Total 9 (delta 6), reused 0 (delta 0)
remote: This push was rejected by Herald push rule H16.
remote:     Change: commit/
remote:       Rule: Non-developers require code review for pushing content
remote:     Reason: needs review
remote: Transcript:
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to ''

I am not able to push this

I think because I have rebased it on latest master after merging D1612. This could be the reason

This revision was landed with ongoing or failed builds.Jun 20 2019, 10:46 AM
This revision was automatically updated to reflect the committed changes.