
models: Keep scheduler task ids reference on deposit model
Closed, Public

Authored by ardumont on May 7 2019, 1:53 PM.

Details

Summary

First step toward easing the rescheduling of new deposits: keep a reference to
the scheduler task identifiers on the deposit side.

Related T1703

Test Plan

tox

Then using docker-dev:

doco up -d
  • Trigger a deposit
swh deposit upload --url http://localhost:5006/1 --username test \
                   --password test --collection test \
                   --archive ../swh-docker-dev.tgz  --author mg \
                   --name 'swh-docker-dev'
  • Check in the db that check_task_id and load_task_id are referenced within the deposit record
$ doco exec swh-deposit bash -c 'psql swh-deposit -c "select id, status, swh_id, check_task_id, load_task_id from deposit"'
 id | status |                       swh_id                       | check_task_id | load_task_id
----+--------+----------------------------------------------------+---------------+--------------
  1 | done   | swh:1:dir:3b0919ddd42be1ba0405d33f383b6e0ee8dedcba | 1             | 2
  (1 row)
  • Check that those correspond to the scheduler tasks:
$ swh scheduler task list
Found 2 tasks

Task 1
  Next run: 19 minutes ago (2019-05-07 11:25:32+00:00)
  Interval: 1 day, 0:00:00
  Type: swh-deposit-archive-checks
  Policy: oneshot
  Status: completed
  Priority:
  Args:
  Keyword args:
    deposit_check_url: '/1/private/test/1/check/'

Task 2
  Next run: 3 minutes ago (2019-05-07 11:41:27+00:00)
  Interval: 1 day, 0:00:00
  Type: swh-deposit-archive-loading
  Policy: oneshot
  Status: next_run_scheduled  # <- strange as the scheduling took place, issue unrelated to the deposit's code though
  Priority:
  Args:
  Keyword args:
    archive_url: '/1/private/test/1/raw/'
    deposit_meta_url: '/1/private/test/1/meta/'
    deposit_update_url: '/1/private/test/1/update/'
  • Empty the record and change status to 'verified'
swh-deposit=# update deposit
swh-deposit-# set status='verified', swh_id=null, swh_anchor_id=null, swh_id_context=null, swh_anchor_id_context=null
swh-deposit-# where id=1;
UPDATE 1
swh-deposit=# select id, status, swh_id, check_task_id, load_task_id from deposit;
 id |  status  | swh_id | check_task_id | load_task_id
----+----------+--------+---------------+--------------
  1 | verified |        | 1             | 2
  (1 row)
  • Manually respawn the loading task using the associated task id
$ swh scheduler task respawn 2
  • Wait for the loading to complete
  • Check the deposit's status is 'done' again with the right ids (same as initial)
swh-deposit=# select id, status, swh_id, check_task_id, load_task_id from deposit;
 id | status |                       swh_id                       | check_task_id | load_task_id
----+--------+----------------------------------------------------+---------------+--------------
  1 | done   | swh:1:dir:3b0919ddd42be1ba0405d33f383b6e0ee8dedcba | 1             | 2
  (1 row)
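The reset-and-respawn part of the test plan can be sketched in a few lines of Python. This is only an illustration: it uses an in-memory SQLite table in place of the PostgreSQL swh-deposit database, keeps only the columns shown in the queries above, and the helper names make_db and reset_for_reload are hypothetical. It shows why storing load_task_id on the deposit row is enough to know which scheduler task to respawn.

```python
import sqlite3


def make_db():
    """Build a throwaway table mirroring the columns queried in the test plan."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table deposit ("
        " id integer primary key,"
        " status text,"
        " swh_id text,"
        " check_task_id text,"
        " load_task_id text)"
    )
    conn.execute(
        "insert into deposit values (1, 'done', ?, '1', '2')",
        ("swh:1:dir:3b0919ddd42be1ba0405d33f383b6e0ee8dedcba",),
    )
    return conn


def reset_for_reload(conn, deposit_id):
    """Blank the loading result, set status back to 'verified',
    and return the id of the loading task to respawn."""
    conn.execute(
        "update deposit set status='verified', swh_id=null where id=?",
        (deposit_id,),
    )
    row = conn.execute(
        "select load_task_id from deposit where id=?", (deposit_id,)
    ).fetchone()
    return row[0]


conn = make_db()
task_id = reset_for_reload(conn, 1)
# The command to run next, as in the test plan above.
print(f"swh scheduler task respawn {task_id}")
```

The task ids are left untouched by the reset, so the same row can be respawned again later if the loading fails once more.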

Diff Detail

Repository
rDDEP Push deposit
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 5622
Build 7666: tox-on-jenkins (Jenkins)
Build 7665: arc lint + arc unit

Event Timeline

ardumont edited the test plan for this revision.
ardumont added a project: SWORD deposit.
ardumont edited the test plan for this revision.
swh/deposit/signals.py
76

This is because this function call will also be triggered by the instance.save() below...

swh/deposit/signals.py
76

So put this in a comment in the code instead of here (IMHO)

swh/deposit/signals.py
76

Right, did not really know where to put it ;)
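One plausible reading of the remark about instance.save(), sketched with plain Python rather than Django: a handler that runs after every save and itself calls save() will be entered a second time, so it has to recognize that second pass. The Deposit class, the scheduled list, and the guard on load_task_id are all assumptions made for illustration; the actual code at line 76 of signals.py is not shown in this thread.

```python
scheduled = []  # stands in for the scheduler backend (hypothetical)


class Deposit:
    """Toy stand-in for the deposit model, not the real swh.deposit class."""

    def __init__(self):
        self.status = "verified"
        self.load_task_id = None

    def save(self):
        # Like a post_save signal, every save() calls the handler again.
        post_save_handler(self)


def post_save_handler(instance):
    if instance.status != "verified":
        return
    if instance.load_task_id is not None:
        # Second pass, triggered by the save() below: nothing left to do.
        return
    scheduled.append("load")
    instance.load_task_id = str(len(scheduled))
    instance.save()  # re-enters this handler, which now returns early


deposit = Deposit()
deposit.save()
print(scheduled, deposit.load_task_id)
```

Without the guard on load_task_id, the nested save() would schedule a task on every pass and recurse until the interpreter's recursion limit is hit.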

swh/deposit/models.py
125–126

I'm gonna be an annoying nitpicker, but there should be something in there (comment or description string) giving a clue about what these fields are for.

douardda requested changes to this revision. May 7 2019, 3:55 PM
This revision now requires changes to proceed. May 7 2019, 3:55 PM
swh/deposit/models.py
125–126

That ain't nitpicking ;)

swh/deposit/models.py
125–126

TIL: https://docs.djangoproject.com/en/2.2/topics/db/models/#verbose-field-names

There is a verbose_name parameter for that ;)

Add verbose_name to new fields

This revision is now accepted and ready to land. May 7 2019, 5:35 PM
This revision was automatically updated to reflect the committed changes.