
visit_stats: Update references when per event date is more recent
Abandoned · Public · Draft

Authored by ardumont on Jan 13 2021, 6:44 PM.

Details

Summary

Prior to this commit, the upsert did not take into account the dates and systematically
overwrote previous "most recent" inputs.

This now tries to update each reference only when the provided date (per event) is
more recent than the stored one (events: eventful, uneventful, failed).

The last_snapshot entry is then only updated when the last_eventful date is more
recent than the one currently stored.
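
The update rule described above can be sketched in plain Python. This is only an illustration of the intended semantics, not the actual upsert in swh/scheduler/backend.py: the `VisitStats` dataclass, `is_more_recent`, and `merge_visit_stats` are hypothetical names, and the sketch assumes that a missing stored date always loses to a provided one.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class VisitStats:
    """Hypothetical, simplified stand-in for one origin_visit_stats row."""

    last_eventful: Optional[datetime] = None
    last_uneventful: Optional[datetime] = None
    last_failed: Optional[datetime] = None
    last_snapshot: Optional[bytes] = None


def is_more_recent(new: Optional[datetime], stored: Optional[datetime]) -> bool:
    """True when `new` is set and strictly newer than `stored` (or `stored` is unset)."""
    if new is None:
        return False
    return stored is None or new > stored


def merge_visit_stats(stored: VisitStats, incoming: VisitStats) -> VisitStats:
    """Merge `incoming` into `stored`, keeping the most recent date per event."""
    merged = stored
    if is_more_recent(incoming.last_eventful, stored.last_eventful):
        # The snapshot reference follows the eventful date, and only that date.
        merged = replace(
            merged,
            last_eventful=incoming.last_eventful,
            last_snapshot=incoming.last_snapshot,
        )
    if is_more_recent(incoming.last_uneventful, stored.last_uneventful):
        merged = replace(merged, last_uneventful=incoming.last_uneventful)
    if is_more_recent(incoming.last_failed, stored.last_failed):
        merged = replace(merged, last_failed=incoming.last_failed)
    return merged


# Usage: an older eventful visit arrives late, together with a newer failure.
older = datetime(2021, 1, 1, tzinfo=timezone.utc)
newer = datetime(2021, 1, 13, tzinfo=timezone.utc)
stored = VisitStats(last_eventful=newer, last_snapshot=b"new", last_failed=older)
incoming = VisitStats(last_eventful=older, last_snapshot=b"old", last_failed=newer)
result = merge_visit_stats(stored, incoming)
# result keeps last_eventful and last_snapshot from `stored`,
# and takes last_failed from `incoming`.
```

Each event date is compared independently, so a late message can still advance one date (here last_failed) while leaving the others untouched.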

Related to T2967
Depends on D4853

Test Plan

tox
(failing until swh.model > 0.9 [1] is released)

[1] D4848

Diff Detail

Repository
rDSCH Scheduling utilities
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 18340
Build 28332: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 28331: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D4859 (id=17194)

Could not rebase; Attempt merge onto a62003397d...

Updating a620033..382f40d
Fast-forward
 sql/updates/20.sql                         |  4 ++
 swh/scheduler/backend.py                   | 39 ++++++++----
 swh/scheduler/journal_client.py            | 47 +++++++++++++++
 swh/scheduler/model.py                     |  3 +
 swh/scheduler/sql/10-superuser-init.sql    |  1 +
 swh/scheduler/sql/30-schema.sql            |  4 +-
 swh/scheduler/tests/test_journal_client.py | 97 ++++++++++++++++++++++++++++++
 swh/scheduler/tests/test_scheduler.py      | 76 +++++++++++++++++++++++
 8 files changed, 258 insertions(+), 13 deletions(-)
 create mode 100644 sql/updates/20.sql
 create mode 100644 swh/scheduler/journal_client.py
 create mode 100644 swh/scheduler/tests/test_journal_client.py
Changes applied before test
commit 382f40d4b11876d2f1823fd8efa2d450f00c5697
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jan 13 18:36:49 2021 +0100

    visit_stats: Update references when per event date is more recent
    
    Prior to this commit, the upsert did not take into account the dates and systematically
    overwrote previous inputs.
    
    This now tries to update only when the provided date (per event) are most
    recent (events: eventful, uneventful, failed).
    
    The last_snapshot entry is then only updated when the `last_eventful` date is most
    recent then the one currently stored.
    
    Related to T2967

commit 1e05e7154156a481f86d08c41e6986b6a3b5ef99
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jan 13 15:32:36 2021 +0100

    visit_stats: Keep a reference to the last snapshot
    
    Related to T2967

commit ab9871d3f14ccc4510ef0912af45716259630878
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jan 13 13:03:28 2021 +0100

    Populate origin_visit_stats table out of the origin_visit_status topic
    
    Related to T2967

Link to build: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/111/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/111/console

Harbormaster returned this revision to the author for changes because remote builds failed. (Jan 13 2021, 6:46 PM)
Harbormaster failed remote builds in B18338: Diff 17194!

Build has FAILED

Patch application report for D4859 (id=17196)

Could not rebase; Attempt merge onto a62003397d...

Updating a620033..f4e593d
Fast-forward
 sql/updates/21.sql                         |  4 ++
 swh/scheduler/backend.py                   | 39 ++++++++----
 swh/scheduler/journal_client.py            | 47 +++++++++++++++
 swh/scheduler/model.py                     |  3 +
 swh/scheduler/sql/30-schema.sql            |  4 +-
 swh/scheduler/tests/test_journal_client.py | 97 ++++++++++++++++++++++++++++++
 swh/scheduler/tests/test_scheduler.py      | 76 +++++++++++++++++++++++
 7 files changed, 257 insertions(+), 13 deletions(-)
 create mode 100644 sql/updates/21.sql
 create mode 100644 swh/scheduler/journal_client.py
 create mode 100644 swh/scheduler/tests/test_journal_client.py
Changes applied before test
commit f4e593dd5fa2aa95c7c7efb2c733607404db3ce9
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jan 13 18:36:49 2021 +0100

    visit_stats: Update references when per event date is more recent
    
    Prior to this commit, the upsert did not take into account the dates and systematically
    overwrote previous inputs.
    
    This now tries to update only when the provided date (per event) are most
    recent (events: eventful, uneventful, failed).
    
    The last_snapshot entry is then only updated when the `last_eventful` date is most
    recent then the one currently stored.
    
    Related to T2967

commit be20e09b877837302e1aeade259be29a93988798
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jan 13 15:32:36 2021 +0100

    visit_stats: Keep a reference to the last snapshot
    
    Related to T2967

commit ab9871d3f14ccc4510ef0912af45716259630878
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jan 13 13:03:28 2021 +0100

    Populate origin_visit_stats table out of the origin_visit_status topic
    
    Related to T2967

Link to build: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/113/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/113/console

Harbormaster returned this revision to the author for changes because remote builds failed. (Jan 13 2021, 6:50 PM)
Harbormaster failed remote builds in B18340: Diff 17196!
swh/scheduler/backend.py, line 799

I explicitly kept the `when` cases for now, but they can be shortened.