
nixguix: Deal with mistyped origins
ClosedPublic

Authored by ardumont on Oct 4 2022, 1:02 PM.

Details

Summary

Some origins are listed as urls although they are not plain urls; they are possibly vcs
repositories. This commit tries to detect and deal with those where possible. When that
is not possible, they are skipped.

Related to T3781
Related to P1470
Depends on D8605
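
To make the idea in the summary concrete, here is a minimal sketch of how an entry declared as a plain url could be reclassified. The function name `guess_origin_type`, the extension list and the vcs heuristics are assumptions made for this illustration; they are not taken from swh/lister/nixguix/lister.py.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical extension list, for illustration only.
TARBALL_EXTENSIONS = (".tar.gz", ".tgz", ".tar.bz2", ".tar.xz", ".zip")


def guess_origin_type(url: str) -> Optional[str]:
    """Guess what a manifest entry declared as "url" really points to.

    Returns "tar", "git", "svn" or "hg", or None when nothing is recognized
    (in which case a caller would skip the origin).
    """
    parsed = urlparse(url)
    path = parsed.path.lower()
    if path.endswith(TARBALL_EXTENSIONS):
        return "tar"
    # Scheme-based hints: git://, svn://, or prefixed forms such as git+https://
    scheme = parsed.scheme.lower()
    for vcs in ("git", "svn", "hg"):
        if scheme == vcs or scheme.startswith(vcs + "+"):
            return vcs
    # A trailing ".git" on an http(s) url is a common, though not certain, hint.
    if path.endswith(".git"):
        return "git"
    return None
```

Returning None rather than raising keeps the "skip when not recognized" behaviour a decision of the caller, which is one possible design among others.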

Diff Detail

Repository
rDLS Listers
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

swh/lister/nixguix/tests/test_lister.py
163

can be dropped now.

Build is green

Patch application report for D8606 (id=31088)

Could not rebase; Attempt merge onto fa1205c4df...

Updating fa1205c..706abfa
Fast-forward
 requirements-swh.txt                               |   2 +-
 setup.py                                           |   1 +
 swh/lister/__init__.py                             |  22 ++
 swh/lister/gnu/tree.py                             |  21 +-
 swh/lister/nixguix/__init__.py                     |  38 ++
 swh/lister/nixguix/lister.py                       | 410 +++++++++++++++++++++
 swh/lister/nixguix/tasks.py                        |  14 +
 swh/lister/nixguix/tests/__init__.py               |   0
 .../nixguix/tests/data/guix-swh_sources.json       |  24 ++
 .../nixguix/tests/data/nixpkgs-swh_sources.json    |  57 +++
 swh/lister/nixguix/tests/test_lister.py            | 265 +++++++++++++
 swh/lister/nixguix/tests/test_tasks.py             |  27 ++
 swh/lister/tests/test_cli.py                       |   4 +
 13 files changed, 867 insertions(+), 18 deletions(-)
 create mode 100644 swh/lister/nixguix/__init__.py
 create mode 100644 swh/lister/nixguix/lister.py
 create mode 100644 swh/lister/nixguix/tasks.py
 create mode 100644 swh/lister/nixguix/tests/__init__.py
 create mode 100644 swh/lister/nixguix/tests/data/guix-swh_sources.json
 create mode 100644 swh/lister/nixguix/tests/data/nixpkgs-swh_sources.json
 create mode 100644 swh/lister/nixguix/tests/test_lister.py
 create mode 100644 swh/lister/nixguix/tests/test_tasks.py
Changes applied before test
commit 706abfa9cefc688c9e2c3fb065fd6cba9566d325
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 4 13:00:18 2022 +0200

    nixguix: Deal with mistyped origins
    
    Some origins are listed as urls whereas they are possibly vcs. So detect and try to deal
    with those. If eventually, they are not recognized, they are skipped.
    
    Related to T3781
    Related to P1470

commit 1b4fe51f62c706a9ef77b8eea74e111bb8be3542
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 4 10:57:32 2022 +0200

    nixguix: Randomize order of listed origins
    
    The end goal is to ingest sparsely the origins, that would avoid hitting the various
    servers around the same time for colocated origins in the upstream manifest (especially
    file or tarball).
    
    Related to T3781

commit 94b6dbea0a7f602be0711a3bb1f9bb9e16fc48ce
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sat Oct 1 16:41:48 2022 +0200

    nixguix: Document lister
    
    Related to T3781

commit 6d2e7aa17808e39ba9f493b65d662d0ddef5796c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sat Oct 1 16:12:46 2022 +0200

    nixguix: Register task
    
    Related to T3781

commit fbfdf88ea4fe79c4846ecd48f2a1322f5d3995fc
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Aug 30 11:17:33 2022 +0200

    nixguix: Add lister
    
    Related to T3781

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/737/ for more details.

Should we try asking Nixpkgs devs to fix their manifest instead?

Yes, that too. In the meantime, let's make it work.

I've got another fix to do about expired certs...

anlambert added inline comments.
swh/lister/nixguix/lister.py
135

There are also tarball URLs with the ftp scheme in the nixguix listing, for instance ftp://ftp.ourproject.org/pub/ytalk/ytalk-3.3.0.tar.gz.

swh/lister/nixguix/lister.py
136

More standard

326–328

ditto

ardumont marked 3 inline comments as done.

Adapt according to review:

  • let ftp urls through
  • format log or exception message with <%s>
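
For illustration, the two review points could look roughly like the sketch below. The names `ALLOWED_SCHEMES` and `is_supported_url` and the exact log wording are hypothetical; only the ftp scheme and the `<%s>` formatting come from the review itself.

```python
import logging
from urllib.parse import urlparse

logger = logging.getLogger(__name__)

# Assumed allowlist for this sketch; "ftp" is included per the review comment.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def is_supported_url(url: str) -> bool:
    """Return True when the url scheme is allowed, logging skipped urls."""
    scheme = urlparse(url).scheme.lower()
    if scheme in ALLOWED_SCHEMES:
        return True
    # Lazy %-style arguments; the angle brackets delimit the url in the log line.
    logger.warning("Skipping url <%s>: unsupported scheme <%s>", url, scheme)
    return False
```

Passing the values as logging arguments instead of pre-formatting the string means the message is only built when the record is actually emitted.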

Build is green

Patch application report for D8606 (id=31089)

Could not rebase; Attempt merge onto fa1205c4df...

Updating fa1205c..acfac04
Fast-forward
 requirements-swh.txt                               |   2 +-
 setup.py                                           |   1 +
 swh/lister/__init__.py                             |  22 ++
 swh/lister/gnu/tree.py                             |  21 +-
 swh/lister/nixguix/__init__.py                     |  38 ++
 swh/lister/nixguix/lister.py                       | 410 +++++++++++++++++++++
 swh/lister/nixguix/tasks.py                        |  14 +
 swh/lister/nixguix/tests/__init__.py               |   0
 .../nixguix/tests/data/guix-swh_sources.json       |  24 ++
 .../nixguix/tests/data/nixpkgs-swh_sources.json    |  64 ++++
 swh/lister/nixguix/tests/test_lister.py            | 259 +++++++++++++
 swh/lister/nixguix/tests/test_tasks.py             |  27 ++
 swh/lister/tests/test_cli.py                       |   4 +
 13 files changed, 868 insertions(+), 18 deletions(-)
 create mode 100644 swh/lister/nixguix/__init__.py
 create mode 100644 swh/lister/nixguix/lister.py
 create mode 100644 swh/lister/nixguix/tasks.py
 create mode 100644 swh/lister/nixguix/tests/__init__.py
 create mode 100644 swh/lister/nixguix/tests/data/guix-swh_sources.json
 create mode 100644 swh/lister/nixguix/tests/data/nixpkgs-swh_sources.json
 create mode 100644 swh/lister/nixguix/tests/test_lister.py
 create mode 100644 swh/lister/nixguix/tests/test_tasks.py
Changes applied before test
commit acfac0462a4b0919891158d25053a358ad0894f8
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 4 13:00:18 2022 +0200

    nixguix: Deal with mistyped origins
    
    Some origins are listed as urls whereas they are possibly vcs. So detect this case to
    try and deal with those if possible. If not possible, they are skipped.
    
    Related to T3781
    Related to P1470

commit 1b4fe51f62c706a9ef77b8eea74e111bb8be3542
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 4 10:57:32 2022 +0200

    nixguix: Randomize order of listed origins
    
    The end goal is to ingest sparsely the origins, that would avoid hitting the various
    servers around the same time for colocated origins in the upstream manifest (especially
    file or tarball).
    
    Related to T3781

commit 94b6dbea0a7f602be0711a3bb1f9bb9e16fc48ce
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sat Oct 1 16:41:48 2022 +0200

    nixguix: Document lister
    
    Related to T3781

commit 6d2e7aa17808e39ba9f493b65d662d0ddef5796c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sat Oct 1 16:12:46 2022 +0200

    nixguix: Register task
    
    Related to T3781

commit fbfdf88ea4fe79c4846ecd48f2a1322f5d3995fc
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Aug 30 11:17:33 2022 +0200

    nixguix: Add lister
    
    Related to T3781

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/738/ for more details.

Deal with ftp urls targeting a file without an extension, which would trigger a head
request and fail. Such urls are now skipped, as nothing simple can be done to detect the
nature of the file in that case.
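
A rough sketch of that decision, under the assumption that the check is driven by the file suffix and the url scheme. The function name `detection_strategy` and its three return values are invented for this example and do not reflect the actual lister code.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def detection_strategy(url: str) -> str:
    """Say how the nature of the file behind `url` could be determined.

    "extension": the file name carries a suffix that can be inspected.
    "head": no suffix, but an HTTP HEAD request can be attempted.
    "skip": no suffix and not http(s), e.g. ftp; nothing simple can be done.
    """
    parsed = urlparse(url)
    # Only the last path component is considered when looking for a suffix.
    if PurePosixPath(parsed.path).suffix:
        return "extension"
    if parsed.scheme.lower() in ("http", "https"):
        return "head"
    return "skip"
```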

what's been said before (with the right local commit inside this time...)

Build was aborted

Patch application report for D8606 (id=31090)

Could not rebase; Attempt merge onto fa1205c4df...

Updating fa1205c..a4de171
Fast-forward
 requirements-swh.txt                               |   2 +-
 setup.py                                           |   1 +
 swh/lister/__init__.py                             |  22 ++
 swh/lister/gnu/tree.py                             |  21 +-
 swh/lister/nixguix/__init__.py                     |  38 ++
 swh/lister/nixguix/lister.py                       | 417 +++++++++++++++++++++
 swh/lister/nixguix/tasks.py                        |  14 +
 swh/lister/nixguix/tests/__init__.py               |   0
 .../nixguix/tests/data/guix-swh_sources.json       |  31 ++
 .../nixguix/tests/data/nixpkgs-swh_sources.json    |  64 ++++
 swh/lister/nixguix/tests/test_lister.py            | 265 +++++++++++++
 swh/lister/nixguix/tests/test_tasks.py             |  27 ++
 swh/lister/tests/test_cli.py                       |   4 +
 13 files changed, 888 insertions(+), 18 deletions(-)
 create mode 100644 swh/lister/nixguix/__init__.py
 create mode 100644 swh/lister/nixguix/lister.py
 create mode 100644 swh/lister/nixguix/tasks.py
 create mode 100644 swh/lister/nixguix/tests/__init__.py
 create mode 100644 swh/lister/nixguix/tests/data/guix-swh_sources.json
 create mode 100644 swh/lister/nixguix/tests/data/nixpkgs-swh_sources.json
 create mode 100644 swh/lister/nixguix/tests/test_lister.py
 create mode 100644 swh/lister/nixguix/tests/test_tasks.py
Changes applied before test
commit a4de1715eaaff83d1a7d1a6a764467b4c936a947
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 4 13:00:18 2022 +0200

    nixguix: Deal with mistyped origins
    
    Some origins are listed as urls whereas they are possibly vcs. So detect this case to
    try and deal with those if possible. If not possible, they are skipped.
    
    Related to T3781
    Related to P1470

commit 1b4fe51f62c706a9ef77b8eea74e111bb8be3542
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 4 10:57:32 2022 +0200

    nixguix: Randomize order of listed origins
    
    The end goal is to ingest sparsely the origins, that would avoid hitting the various
    servers around the same time for colocated origins in the upstream manifest (especially
    file or tarball).
    
    Related to T3781

commit 94b6dbea0a7f602be0711a3bb1f9bb9e16fc48ce
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sat Oct 1 16:41:48 2022 +0200

    nixguix: Document lister
    
    Related to T3781

commit 6d2e7aa17808e39ba9f493b65d662d0ddef5796c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sat Oct 1 16:12:46 2022 +0200

    nixguix: Register task
    
    Related to T3781

commit fbfdf88ea4fe79c4846ecd48f2a1322f5d3995fc
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Aug 30 11:17:33 2022 +0200

    nixguix: Add lister
    
    Related to T3781

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/739/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/739/console

Build is green

Patch application report for D8606 (id=31091)

Could not rebase; Attempt merge onto fa1205c4df...

Updating fa1205c..a94b75f
Fast-forward
 requirements-swh.txt                               |   2 +-
 setup.py                                           |   1 +
 swh/lister/__init__.py                             |  22 ++
 swh/lister/gnu/tree.py                             |  21 +-
 swh/lister/nixguix/__init__.py                     |  38 ++
 swh/lister/nixguix/lister.py                       | 417 +++++++++++++++++++++
 swh/lister/nixguix/tasks.py                        |  14 +
 swh/lister/nixguix/tests/__init__.py               |   0
 .../nixguix/tests/data/guix-swh_sources.json       |  31 ++
 .../nixguix/tests/data/nixpkgs-swh_sources.json    |  64 ++++
 swh/lister/nixguix/tests/test_lister.py            | 265 +++++++++++++
 swh/lister/nixguix/tests/test_tasks.py             |  27 ++
 swh/lister/tests/test_cli.py                       |   4 +
 13 files changed, 888 insertions(+), 18 deletions(-)
 create mode 100644 swh/lister/nixguix/__init__.py
 create mode 100644 swh/lister/nixguix/lister.py
 create mode 100644 swh/lister/nixguix/tasks.py
 create mode 100644 swh/lister/nixguix/tests/__init__.py
 create mode 100644 swh/lister/nixguix/tests/data/guix-swh_sources.json
 create mode 100644 swh/lister/nixguix/tests/data/nixpkgs-swh_sources.json
 create mode 100644 swh/lister/nixguix/tests/test_lister.py
 create mode 100644 swh/lister/nixguix/tests/test_tasks.py
Changes applied before test
commit a94b75f366be5722c47022ce7afb55384bc8fbb6
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 4 13:00:18 2022 +0200

    nixguix: Deal with mistyped origins
    
    Some origins are listed as urls while they are not. They are possibly vcs. So this
    commit tries to detect and and deal with those if possible. If not possible, they are
    skipped.
    
    Related to T3781
    Related to P1470

commit 1b4fe51f62c706a9ef77b8eea74e111bb8be3542
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 4 10:57:32 2022 +0200

    nixguix: Randomize order of listed origins
    
    The end goal is to ingest sparsely the origins, that would avoid hitting the various
    servers around the same time for colocated origins in the upstream manifest (especially
    file or tarball).
    
    Related to T3781

commit 94b6dbea0a7f602be0711a3bb1f9bb9e16fc48ce
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sat Oct 1 16:41:48 2022 +0200

    nixguix: Document lister
    
    Related to T3781

commit 6d2e7aa17808e39ba9f493b65d662d0ddef5796c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sat Oct 1 16:12:46 2022 +0200

    nixguix: Register task
    
    Related to T3781

commit fbfdf88ea4fe79c4846ecd48f2a1322f5d3995fc
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Aug 30 11:17:33 2022 +0200

    nixguix: Add lister
    
    Related to T3781

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/740/ for more details.

This revision is now accepted and ready to land. Oct 4 2022, 2:32 PM
This revision was automatically updated to reflect the committed changes.