Page MenuHomeSoftware Heritage

package: Mark visit status as failed when relevant
ClosedPublic

Authored by ardumont on Feb 5 2021, 12:26 PM.

Details

Summary

When:

  • failure to communicate internally with the storage
  • absolutely no revision got loaded during a visit

Note:
Coverage got filled in where status changes happened.

Related to T3030

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D5024 (id=17913)

Could not rebase; Attempt merge onto b332f4acd1...

Updating b332f4a..c6a4f5f
Fast-forward
 swh/loader/core/loader.py                  |  2 ++
 swh/loader/package/loader.py               |  9 +++++----
 swh/loader/package/npm/tests/test_npm.py   |  2 +-
 swh/loader/package/pypi/tests/test_pypi.py | 32 +++++++++++++++++++++++++++++-
 4 files changed, 39 insertions(+), 6 deletions(-)
Changes applied before test
commit c6a4f5f6c6fff953902fa097c75b57ebd8aac397
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Feb 5 12:25:12 2021 +0100

    package: Make visit status 'failed' when relevant
    
    When:
    - failure to communicate internally with the storage
    - absolutely no revision got loaded during a visit
    
    Related to T3030

commit 265d3b2b2ac6047cee40e26b145be500a9749a41
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Feb 5 11:41:41 2021 +0100

    loader: Make loader write the origin_visit_status' type
    
    Related to T3030

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/343/ for more details.

vlorentz requested changes to this revision.Feb 5 2021, 2:10 PM
vlorentz added a subscriber: vlorentz.
vlorentz added inline comments.
swh/loader/package/loader.py
343

alternatively

486
497
This revision now requires changes to proceed.Feb 5 2021, 2:10 PM
swh/loader/package/loader.py
486

we'll potentially end up with visits in failed status with a snapshot, is that something we want?

We settled on no hence why that stayed that way, same for your 2nd suggestion.

swh/loader/package/loader.py
486

which should be fine if we update the current swh.scheduler.journal_client as well to deal with the failed status (it does it partially currently).

swh/loader/package/loader.py
486

see scheduler journal client adaptation in D5028

Adapt according to review

This needs D5028 adaptations in the scheduler with the current diff adaptation now.

Build is green

Patch application report for D5024 (id=17934)

Rebasing onto 65a513578d...

Current branch diff-target is up to date.
Changes applied before test
commit 838ef55bf932e1c545cdc53fcf5a00cf4885c16f
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Feb 5 12:25:12 2021 +0100

    package: Make visit status 'failed' when relevant
    
    When:
    - failure to communicate internally with the storage
    - absolutely no revision got loaded during a visit
    
    Related to T3030

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/346/ for more details.

ardumont marked 2 inline comments as done.

Drop unneeded changes

Build is green

Patch application report for D5024 (id=17936)

Rebasing onto 65a513578d...

Current branch diff-target is up to date.
Changes applied before test
commit a77e748c184349c3ab3f4d674a6edb3f9f187e63
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Feb 5 12:25:12 2021 +0100

    package: Make visit status 'failed' when relevant
    
    When:
    - failure to communicate internally with the storage
    - absolutely no revision got loaded during a visit
    
    Related to T3030

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/347/ for more details.

swh/loader/package/loader.py
486

oh, that's the difference between failed an partial? "partial" is fine then

swh/loader/package/loader.py
486

oh, that's the difference between failed an partial? "partial" is fine then

Well, now i'm mostly expecting to have failed tasks without snapshot.
Whereas with partial, that can happen.

I'm fine with settling on failing the task here because something went wrong.
That's less question to ask oneself when coding [1]

"What's the status here? Well, it failed so failed!"

[1] As long as we merge D5028 along, that is.

ardumont retitled this revision from package: Make visit status 'failed' when relevant to package: Mark visit status as failed when relevant.Feb 5 2021, 5:44 PM

Build is green

Patch application report for D5024 (id=17944)

Rebasing onto 65a513578d...

Current branch diff-target is up to date.
Changes applied before test
commit 3b4a023468714a23f42eba4ef4e848ca209cc6b0
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Feb 5 12:25:12 2021 +0100

    package: Mark visit status as failed when relevant
    
    When:
    - failure to communicate internally with the storage
    - absolutely no revision got loaded during a visit
    
    Related to T3030

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/349/ for more details.

swh/loader/package/loader.py
486

then what is the meaning of "partial"?

swh/loader/package/loader.py
486

partially ingested stuff (so partially failing to ingest those)...

Here it'd make sense to keep partial as most if not all got ingested (thus the initial diff state).

Adapt according to my understanding on the review: Keep status partial when visit only
failed during the load extrinsic metadata step.

@vlorentz you good ^ ?

Build is green

Patch application report for D5024 (id=17947)

Rebasing onto 65a513578d...

Current branch diff-target is up to date.
Changes applied before test
commit 4500063a2fc5467fdd3ee2c0da25fa70a1fd919a
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Feb 5 12:25:12 2021 +0100

    package: Mark visit status as failed when relevant
    
    When:
    - failure to communicate internally with the storage
    - absolutely no revision got loaded during a visit
    
    Related to T3030

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/352/ for more details.

This revision is now accepted and ready to land.Feb 5 2021, 11:21 PM