Add incremental function to Golang Lister
Closed, Public

Authored by Alphare on Aug 23 2022, 5:11 PM.

Diff Detail

Repository: rDLS Listers
Branch: golang-D8298
Lint: No Linters Available
Unit: No Unit Test Coverage
Build Status: Buildable 31047
    Build 48573: Phabricator diff pipeline on jenkins
    Build 48572: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D8298 (id=29962)

Could not rebase; Attempt merge onto 4b511b4181...

Auto-merging setup.py
Auto-merging README.md
Merge made by the 'recursive' strategy.
 README.md                               |   3 +-
 setup.py                                |   1 +
 swh/lister/golang/__init__.py           |  12 +++
 swh/lister/golang/lister.py             | 152 +++++++++++++++++++++++++++
 swh/lister/golang/tasks.py              |  18 ++++
 swh/lister/golang/tests/__init__.py     |   0
 swh/lister/golang/tests/data/page-1.txt |   5 +
 swh/lister/golang/tests/data/page-2.txt |   4 +
 swh/lister/golang/tests/data/page-3.txt |  10 ++
 swh/lister/golang/tests/test_lister.py  | 177 ++++++++++++++++++++++++++++++++
 swh/lister/golang/tests/test_tasks.py   |  32 ++++++
 11 files changed, 413 insertions(+), 1 deletion(-)
 create mode 100644 swh/lister/golang/__init__.py
 create mode 100644 swh/lister/golang/lister.py
 create mode 100644 swh/lister/golang/tasks.py
 create mode 100644 swh/lister/golang/tests/__init__.py
 create mode 100644 swh/lister/golang/tests/data/page-1.txt
 create mode 100644 swh/lister/golang/tests/data/page-2.txt
 create mode 100644 swh/lister/golang/tests/data/page-3.txt
 create mode 100644 swh/lister/golang/tests/test_lister.py
 create mode 100644 swh/lister/golang/tests/test_tasks.py
Changes applied before test
commit f611c4f9b4780f23606c7e6894f782cc8353952c
Merge: 4b511b4 ea62f31
Author: Jenkins user <jenkins@localhost>
Date:   Tue Aug 23 15:11:14 2022 +0000

    Merge branch 'diff-target' into HEAD

commit ea62f31444b187191550f68bb122bd7b7fba38f6
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Tue Aug 23 17:05:30 2022 +0200

    Add incremental function to Golang Lister

commit 785924535018cda297252f18ffa4cfa6d667c921
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Wed Mar 9 22:35:40 2022 +0100

    Add non-incremental Golang modules lister
    
    This uses https://index.golang.org and lists origins to be loaded using
    the Golang Module Proxy Protocol. An associated loader will be sent in
    the near future, as well as an incremental version of this lister.
    
    [1] https://go.dev/ref/mod#goproxy-protocol

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/607/ for more details.

vlorentz added a subscriber: vlorentz.
vlorentz added inline comments.
swh/lister/golang/tests/test_lister.py
113

;)

(ditto above)

This revision is now accepted and ready to land. Aug 23 2022, 5:29 PM
swh/lister/golang/tests/test_lister.py
113

I had already tried that, but datadir is a str and not a Path. Maybe that's something that should be changed?
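
One way to work with a string datadir without changing the fixture is to wrap it in pathlib.Path at the point of use. This is a minimal sketch, assuming datadir is a plain string path to the test data directory as described above; the helper name read_page and the temporary directory standing in for the fixture are made up for illustration.

```python
import tempfile
from pathlib import Path


def read_page(datadir: str, name: str) -> str:
    """Read a test data file, accepting datadir as a plain string."""
    # Path() accepts str segments, so no change to the fixture is needed.
    return Path(datadir, name).read_text()


# Stand-in for the pytest fixture: a temporary directory passed as a str.
with tempfile.TemporaryDirectory() as datadir:
    Path(datadir, "page-1.txt").write_text("example\n")
    content = read_page(datadir, "page-1.txt")
```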

Build is green

Patch application report for D8298 (id=29970)

Could not rebase; Attempt merge onto 4b511b4181...

Auto-merging setup.py
Auto-merging README.md
Merge made by the 'recursive' strategy.
 README.md                               |   3 +-
 setup.py                                |   1 +
 swh/lister/golang/__init__.py           |  12 +++
 swh/lister/golang/lister.py             | 155 ++++++++++++++++++++++++++++
 swh/lister/golang/tasks.py              |  18 ++++
 swh/lister/golang/tests/__init__.py     |   0
 swh/lister/golang/tests/data/page-1.txt |   5 +
 swh/lister/golang/tests/data/page-2.txt |   4 +
 swh/lister/golang/tests/data/page-3.txt |  10 ++
 swh/lister/golang/tests/test_lister.py  | 177 ++++++++++++++++++++++++++++++++
 swh/lister/golang/tests/test_tasks.py   |  32 ++++++
 11 files changed, 416 insertions(+), 1 deletion(-)
 create mode 100644 swh/lister/golang/__init__.py
 create mode 100644 swh/lister/golang/lister.py
 create mode 100644 swh/lister/golang/tasks.py
 create mode 100644 swh/lister/golang/tests/__init__.py
 create mode 100644 swh/lister/golang/tests/data/page-1.txt
 create mode 100644 swh/lister/golang/tests/data/page-2.txt
 create mode 100644 swh/lister/golang/tests/data/page-3.txt
 create mode 100644 swh/lister/golang/tests/test_lister.py
 create mode 100644 swh/lister/golang/tests/test_tasks.py
Changes applied before test
commit b9e32698babc336e82ac3e9e67c1045258e4e25d
Merge: 4b511b4 8f42183
Author: Jenkins user <jenkins@localhost>
Date:   Wed Aug 24 09:13:41 2022 +0000

    Merge branch 'diff-target' into HEAD

commit 8f4218362b645b17771b409ca81738559cec0631
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Tue Aug 23 17:05:30 2022 +0200

    Add incremental function to Golang Lister

commit 3f50c01e63feffafad1c92e09a4829ad847f8e89
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Wed Mar 9 22:35:40 2022 +0100

    Add non-incremental Golang modules lister
    
    This uses https://index.golang.org and lists origins to be loaded using
    the Golang Module Proxy Protocol. An associated loader will be sent in
    the near future, as well as an incremental version of this lister.
    
    [1] https://go.dev/ref/mod#goproxy-protocol

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/609/ for more details.

List per package, not per version, prefix with proxy URL

This does not group the ListedOrigin per package yet. I still feel like this
is not necessary since they are upserted and batched, so the performance
penalty should be negligible. Either way, I don't have time to write that today,
so I might as well post a correct implementation in the meantime.
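
For reference, the per-package grouping that is deferred here could look roughly like the sketch below. It assumes the index returns one JSON object per line with Path, Version and Timestamp fields, and that timestamps carry six fractional digits; the sample values and the function name latest_per_package are illustrative only, not taken from the diff.

```python
import json
from datetime import datetime
from typing import Dict, Iterable

# Illustrative index lines; the values are made up.
SAMPLE_LINES = [
    '{"Path":"golang.org/x/text","Version":"v0.3.0","Timestamp":"2019-04-10T19:08:52.997264Z"}',
    '{"Path":"github.com/example/mod","Version":"v1.0.0","Timestamp":"2019-04-11T10:00:00.000000Z"}',
    '{"Path":"golang.org/x/text","Version":"v0.3.1","Timestamp":"2019-04-12T08:30:00.000000Z"}',
]


def parse_timestamp(raw: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp ending in 'Z' (six fractional digits assumed)."""
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


def latest_per_package(lines: Iterable[str]) -> Dict[str, datetime]:
    """Collapse one-entry-per-version lines into one entry per package path."""
    latest: Dict[str, datetime] = {}
    for line in lines:
        entry = json.loads(line)
        path = entry["Path"]
        seen_at = parse_timestamp(entry["Timestamp"])
        # Keep only the most recent timestamp seen for each package.
        if path not in latest or seen_at > latest[path]:
            latest[path] = seen_at
    return latest


packages = latest_per_package(SAMPLE_LINES)
```

Since the origins are upserted anyway, grouping like this would mainly reduce the number of records sent per batch.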

Build is green

Patch application report for D8298 (id=29998)

Could not rebase; Attempt merge onto 4b511b4181...

Auto-merging setup.py
Auto-merging README.md
Merge made by the 'recursive' strategy.
 README.md                               |   3 +-
 setup.py                                |   1 +
 swh/lister/golang/__init__.py           |  12 +++
 swh/lister/golang/lister.py             | 154 ++++++++++++++++++++++++++++++++
 swh/lister/golang/tasks.py              |  18 ++++
 swh/lister/golang/tests/__init__.py     |   0
 swh/lister/golang/tests/data/page-1.txt |   5 ++
 swh/lister/golang/tests/data/page-2.txt |   4 +
 swh/lister/golang/tests/data/page-3.txt |  10 +++
 swh/lister/golang/tests/test_lister.py  | 146 ++++++++++++++++++++++++++++++
 swh/lister/golang/tests/test_tasks.py   |  32 +++++++
 11 files changed, 384 insertions(+), 1 deletion(-)
 create mode 100644 swh/lister/golang/__init__.py
 create mode 100644 swh/lister/golang/lister.py
 create mode 100644 swh/lister/golang/tasks.py
 create mode 100644 swh/lister/golang/tests/__init__.py
 create mode 100644 swh/lister/golang/tests/data/page-1.txt
 create mode 100644 swh/lister/golang/tests/data/page-2.txt
 create mode 100644 swh/lister/golang/tests/data/page-3.txt
 create mode 100644 swh/lister/golang/tests/test_lister.py
 create mode 100644 swh/lister/golang/tests/test_tasks.py
Changes applied before test
commit 05b1e8adf6c810efd520cea484950514d3c88e26
Merge: 4b511b4 364fc90
Author: Jenkins user <jenkins@localhost>
Date:   Wed Aug 24 16:35:32 2022 +0000

    Merge branch 'diff-target' into HEAD

commit 364fc907c75b611b8fc2ee5a4225101f53a18283
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Tue Aug 23 17:05:30 2022 +0200

    Add incremental function to Golang Lister

commit c9815b48b14efc2a3c81cb4f5d5827ba2d2885c1
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Wed Mar 9 22:35:40 2022 +0100

    Add non-incremental Golang modules lister
    
    This uses https://index.golang.org and lists origins to be loaded using
    the Golang Module Proxy Protocol. An associated loader will be sent in
    the near future, as well as an incremental version of this lister.
    
    [1] https://go.dev/ref/mod#goproxy-protocol

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/614/ for more details.

anlambert added inline comments.
swh/lister/golang/lister.py
148

Could you use https://pkg.go.dev/{path} instead?

Current origin URL yields a 404 when trying to browse it, which is not great.

For instance for the golang.org/x/text package:

swh/lister/golang/lister.py
148

Sure, then the loader will need to expect those URLs and substitute them with the proxy prefix. If that's fine with you, I can make that change.

swh/lister/golang/lister.py
148

Sounds good to me, thanks!
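
The substitution agreed on in this thread could be sketched as follows. The lister side builds a browsable pkg.go.dev origin URL, and the loader side swaps the prefix for a proxy URL. The use of https://proxy.golang.org as the proxy and all function names are assumptions for illustration; the case-encoding of uppercase letters follows the GOPROXY protocol's escaped-path rule.

```python
PKG_GO_DEV = "https://pkg.go.dev/"
# Assumption: the public proxy; any GOPROXY-compatible server would work.
PROXY = "https://proxy.golang.org/"


def origin_url(module_path: str) -> str:
    """Lister side: browsable origin URL for a module path."""
    return PKG_GO_DEV + module_path


def escape_module_path(module_path: str) -> str:
    """Case-encode a module path: each uppercase ASCII letter becomes '!' + lowercase."""
    return "".join(
        "!" + char.lower() if "A" <= char <= "Z" else char for char in module_path
    )


def proxy_list_url(origin: str) -> str:
    """Loader side: turn an origin URL back into the proxy's version-list URL."""
    if not origin.startswith(PKG_GO_DEV):
        raise ValueError(f"unexpected origin URL: {origin}")
    module_path = origin[len(PKG_GO_DEV):]
    return f"{PROXY}{escape_module_path(module_path)}/@v/list"


text_origin = origin_url("golang.org/x/text")
azure_list = proxy_list_url(origin_url("github.com/Azure/azure-sdk-for-go"))
```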

Build is green

Patch application report for D8298 (id=30020)

Could not rebase; Attempt merge onto ce72969de5...

Auto-merging setup.py
Auto-merging README.md
Merge made by the 'recursive' strategy.
 README.md                               |   3 +-
 setup.py                                |   1 +
 swh/lister/golang/__init__.py           |  12 +++
 swh/lister/golang/lister.py             | 161 ++++++++++++++++++++++++++++++++
 swh/lister/golang/tasks.py              |  18 ++++
 swh/lister/golang/tests/__init__.py     |   0
 swh/lister/golang/tests/data/page-1.txt |   5 +
 swh/lister/golang/tests/data/page-2.txt |   4 +
 swh/lister/golang/tests/data/page-3.txt |  10 ++
 swh/lister/golang/tests/test_lister.py  | 146 +++++++++++++++++++++++++++++
 swh/lister/golang/tests/test_tasks.py   |  32 +++++++
 11 files changed, 391 insertions(+), 1 deletion(-)
 create mode 100644 swh/lister/golang/__init__.py
 create mode 100644 swh/lister/golang/lister.py
 create mode 100644 swh/lister/golang/tasks.py
 create mode 100644 swh/lister/golang/tests/__init__.py
 create mode 100644 swh/lister/golang/tests/data/page-1.txt
 create mode 100644 swh/lister/golang/tests/data/page-2.txt
 create mode 100644 swh/lister/golang/tests/data/page-3.txt
 create mode 100644 swh/lister/golang/tests/test_lister.py
 create mode 100644 swh/lister/golang/tests/test_tasks.py
Changes applied before test
commit 8fef46cc2e8d3fd18c7bab86e1dfb0bd4f0c8ae6
Merge: ce72969 bce3c52
Author: Jenkins user <jenkins@localhost>
Date:   Thu Aug 25 10:00:51 2022 +0000

    Merge branch 'diff-target' into HEAD

commit bce3c520c668a2e0a2b1420514ac4c5228cc07c5
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Tue Aug 23 17:05:30 2022 +0200

    Add incremental function to Golang Lister

commit 1cc2c6c404ca60cf1b6b77bd1778e34edc5a029b
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Wed Mar 9 22:35:40 2022 +0100

    Add non-incremental Golang modules lister
    
    This uses https://index.golang.org. An associated loader will be sent in
    the near future, as well as an incremental version of this lister.
    
    [1] https://go.dev/ref/mod#goproxy-protocol

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/619/ for more details.

Please add an incremental parameter to the lister so that a full relisting of Go packages can be performed when it is set to False
(other listers with an incremental feature work this way).

You also need to add the celery task for the incremental lister in tasks.py:

@shared_task(name=__name__ + ".IncrementalGolangLister")
def list_golang_incremental(**lister_args):
    """Incremental update of Golang packages"""
    lister = GolangLister.from_configfile(incremental=True, **lister_args)
    return lister.run().dict()
swh/lister/golang/lister.py
55

You should add an incremental parameter to allow full relisting if set to False.

132–133

You should ignore the state here if the lister is not in incremental mode.
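
To make the request concrete, here is a hedged sketch of how the stored state might only influence the index request in incremental mode. It assumes the index accepts since and limit query parameters; the function name index_request_url, the default limit and the timestamp format are illustrative choices, not the code under review.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode

INDEX_URL = "https://index.golang.org/index"


def index_request_url(
    incremental: bool, last_seen: Optional[datetime], limit: int = 2000
) -> str:
    """Build the index URL, using the saved timestamp only in incremental mode."""
    params: dict = {"limit": limit}
    if incremental and last_seen is not None:
        # Resume from the last timestamp recorded by a previous run.
        params["since"] = last_seen.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%S.%fZ"
        )
    # In full mode the state is ignored and listing starts from the beginning.
    return f"{INDEX_URL}?{urlencode(params)}"


saved = datetime(2022, 8, 25, 15, 58, tzinfo=timezone.utc)
full_url = index_request_url(incremental=False, last_seen=saved)
incremental_url = index_request_url(incremental=True, last_seen=saved)
```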

This revision now requires changes to proceed. Aug 25 2022, 3:58 PM

Incremental mode, better testing

Build is green

Patch application report for D8298 (id=30084)

Could not rebase; Attempt merge onto 5410b6e3f3...

Auto-merging setup.py
Auto-merging README.md
Merge made by the 'recursive' strategy.
 README.md                               |   3 +-
 setup.py                                |   1 +
 swh/lister/golang/__init__.py           |  12 ++
 swh/lister/golang/lister.py             | 188 +++++++++++++++++++++++++++++++
 swh/lister/golang/tasks.py              |  18 +++
 swh/lister/golang/tests/__init__.py     |   0
 swh/lister/golang/tests/data/page-1.txt |   5 +
 swh/lister/golang/tests/data/page-2.txt |   4 +
 swh/lister/golang/tests/data/page-3.txt |  10 ++
 swh/lister/golang/tests/test_lister.py  | 193 ++++++++++++++++++++++++++++++++
 swh/lister/golang/tests/test_tasks.py   |  32 ++++++
 11 files changed, 465 insertions(+), 1 deletion(-)
 create mode 100644 swh/lister/golang/__init__.py
 create mode 100644 swh/lister/golang/lister.py
 create mode 100644 swh/lister/golang/tasks.py
 create mode 100644 swh/lister/golang/tests/__init__.py
 create mode 100644 swh/lister/golang/tests/data/page-1.txt
 create mode 100644 swh/lister/golang/tests/data/page-2.txt
 create mode 100644 swh/lister/golang/tests/data/page-3.txt
 create mode 100644 swh/lister/golang/tests/test_lister.py
 create mode 100644 swh/lister/golang/tests/test_tasks.py
Changes applied before test
commit e348307dc2b58bab1e721f7d923540f135cec781
Merge: 5410b6e c22857e
Author: Jenkins user <jenkins@localhost>
Date:   Mon Aug 29 10:21:13 2022 +0000

    Merge branch 'diff-target' into HEAD

commit c22857e342894cf4f9fb05ac16dc6614ca1ab049
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Tue Aug 23 17:05:30 2022 +0200

    Add incremental function to Golang Lister

commit efea636215507073f845c7ab02109cda98c863bb
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Wed Mar 9 22:35:40 2022 +0100

    Add non-incremental Golang modules lister
    
    This uses https://index.golang.org. An associated loader will be sent in
    the near future, as well as an incremental version of this lister.
    
    [1] https://go.dev/ref/mod#goproxy-protocol

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/623/ for more details.

You also need to add the celery task for the incremental lister in tasks.py

Sorry, I actually didn't see the comments on this diff before updating (though I had already done most of it). I'll add the celery task.

swh/lister/golang/lister.py
132–133

I should ignore it for listing, but not for updating the state, correct? That way a full run will still save the last timestamp for the next incremental run.

Add incremental celery task and its test

Build is green

Patch application report for D8298 (id=30085)

Could not rebase; Attempt merge onto 5410b6e3f3...

Auto-merging setup.py
Auto-merging README.md
Merge made by the 'recursive' strategy.
 README.md                               |   3 +-
 setup.py                                |   1 +
 swh/lister/golang/__init__.py           |  12 ++
 swh/lister/golang/lister.py             | 188 +++++++++++++++++++++++++++++++
 swh/lister/golang/tasks.py              |  25 +++++
 swh/lister/golang/tests/__init__.py     |   0
 swh/lister/golang/tests/data/page-1.txt |   5 +
 swh/lister/golang/tests/data/page-2.txt |   4 +
 swh/lister/golang/tests/data/page-3.txt |  10 ++
 swh/lister/golang/tests/test_lister.py  | 193 ++++++++++++++++++++++++++++++++
 swh/lister/golang/tests/test_tasks.py   |  52 +++++++++
 11 files changed, 492 insertions(+), 1 deletion(-)
 create mode 100644 swh/lister/golang/__init__.py
 create mode 100644 swh/lister/golang/lister.py
 create mode 100644 swh/lister/golang/tasks.py
 create mode 100644 swh/lister/golang/tests/__init__.py
 create mode 100644 swh/lister/golang/tests/data/page-1.txt
 create mode 100644 swh/lister/golang/tests/data/page-2.txt
 create mode 100644 swh/lister/golang/tests/data/page-3.txt
 create mode 100644 swh/lister/golang/tests/test_lister.py
 create mode 100644 swh/lister/golang/tests/test_tasks.py
Changes applied before test
commit 090d9adf467fffa56fe58f41d5c5ecea5530c1da
Merge: 5410b6e ef30be1
Author: Jenkins user <jenkins@localhost>
Date:   Mon Aug 29 10:36:27 2022 +0000

    Merge branch 'diff-target' into HEAD

commit ef30be144170bc37a8b6addb73467e7a2a06c31f
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Tue Aug 23 17:05:30 2022 +0200

    Add incremental function to Golang Lister

commit efea636215507073f845c7ab02109cda98c863bb
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Wed Mar 9 22:35:40 2022 +0100

    Add non-incremental Golang modules lister
    
    This uses https://index.golang.org. An associated loader will be sent in
    the near future, as well as an incremental version of this lister.
    
    [1] https://go.dev/ref/mod#goproxy-protocol

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/624/ for more details.

swh/lister/golang/lister.py
132–133

Other listers do not save any state when they are executed in non-incremental mode, so you should do the same, IMHO.
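
The behaviour requested here can be illustrated with a toy class. This is not the swh.lister API: ListerState, SketchLister and the updated flag are hypothetical names, used only to show state being recorded and saved in incremental mode and left untouched in a full run.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ListerState:
    """Hypothetical state: the last index timestamp seen."""
    last_seen: Optional[datetime] = None


class SketchLister:
    """Toy stand-in for a lister with an incremental switch."""

    def __init__(self, incremental: bool, state: ListerState) -> None:
        self.incremental = incremental
        self.state = state
        self.updated = False  # whether the state should be persisted

    def record(self, timestamp: datetime) -> None:
        # A full listing never touches the state.
        if not self.incremental:
            return
        if self.state.last_seen is None or timestamp > self.state.last_seen:
            self.state.last_seen = timestamp

    def finalize(self) -> None:
        # Only an incremental run marks the state as needing a save.
        if self.incremental and self.state.last_seen is not None:
            self.updated = True


ts = datetime(2022, 8, 29, 10, 36, 27)

full = SketchLister(incremental=False, state=ListerState())
full.record(ts)
full.finalize()

incr = SketchLister(incremental=True, state=ListerState())
incr.record(ts)
incr.finalize()
```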

Remove state save when full listing

Build is green

Patch application report for D8298 (id=30095)

Could not rebase; Attempt merge onto ceae8c42b5...

Auto-merging setup.py
Auto-merging README.md
Merge made by the 'recursive' strategy.
 README.md                               |   3 +-
 setup.py                                |   1 +
 swh/lister/golang/__init__.py           |  12 ++
 swh/lister/golang/lister.py             | 188 +++++++++++++++++++++++++++++++
 swh/lister/golang/tasks.py              |  25 +++++
 swh/lister/golang/tests/__init__.py     |   0
 swh/lister/golang/tests/data/page-1.txt |   5 +
 swh/lister/golang/tests/data/page-2.txt |   4 +
 swh/lister/golang/tests/data/page-3.txt |  10 ++
 swh/lister/golang/tests/test_lister.py  | 193 ++++++++++++++++++++++++++++++++
 swh/lister/golang/tests/test_tasks.py   |  52 +++++++++
 11 files changed, 492 insertions(+), 1 deletion(-)
 create mode 100644 swh/lister/golang/__init__.py
 create mode 100644 swh/lister/golang/lister.py
 create mode 100644 swh/lister/golang/tasks.py
 create mode 100644 swh/lister/golang/tests/__init__.py
 create mode 100644 swh/lister/golang/tests/data/page-1.txt
 create mode 100644 swh/lister/golang/tests/data/page-2.txt
 create mode 100644 swh/lister/golang/tests/data/page-3.txt
 create mode 100644 swh/lister/golang/tests/test_lister.py
 create mode 100644 swh/lister/golang/tests/test_tasks.py
Changes applied before test
commit df56e62754222a9332890fbe57649c4460ffb0b9
Merge: ceae8c4 f1dd211
Author: Jenkins user <jenkins@localhost>
Date:   Mon Aug 29 14:52:57 2022 +0000

    Merge branch 'diff-target' into HEAD

commit f1dd211ed03a5265cfe31959b7d30ba5bcd59b18
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Tue Aug 23 17:05:30 2022 +0200

    Add incremental function to Golang Lister

commit efea636215507073f845c7ab02109cda98c863bb
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Wed Mar 9 22:35:40 2022 +0100

    Add non-incremental Golang modules lister
    
    This uses https://index.golang.org. An associated loader will be sent in
    the near future, as well as an incremental version of this lister.
    
    [1] https://go.dev/ref/mod#goproxy-protocol

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/626/ for more details.

This revision is now accepted and ready to land. Aug 30 2022, 12:30 PM
This revision was landed with ongoing or failed builds. Aug 30 2022, 2:35 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D8298 (id=30134)

Could not rebase; Attempt merge onto 0acf5b0f4f...

Updating 0acf5b0..c6ce862
Fast-forward
 README.md                               |   3 +-
 setup.py                                |   1 +
 swh/lister/golang/__init__.py           |  12 ++
 swh/lister/golang/lister.py             | 188 +++++++++++++++++++++++++++++++
 swh/lister/golang/tasks.py              |  25 +++++
 swh/lister/golang/tests/__init__.py     |   0
 swh/lister/golang/tests/data/page-1.txt |   5 +
 swh/lister/golang/tests/data/page-2.txt |   4 +
 swh/lister/golang/tests/data/page-3.txt |  10 ++
 swh/lister/golang/tests/test_lister.py  | 193 ++++++++++++++++++++++++++++++++
 swh/lister/golang/tests/test_tasks.py   |  52 +++++++++
 11 files changed, 492 insertions(+), 1 deletion(-)
 create mode 100644 swh/lister/golang/__init__.py
 create mode 100644 swh/lister/golang/lister.py
 create mode 100644 swh/lister/golang/tasks.py
 create mode 100644 swh/lister/golang/tests/__init__.py
 create mode 100644 swh/lister/golang/tests/data/page-1.txt
 create mode 100644 swh/lister/golang/tests/data/page-2.txt
 create mode 100644 swh/lister/golang/tests/data/page-3.txt
 create mode 100644 swh/lister/golang/tests/test_lister.py
 create mode 100644 swh/lister/golang/tests/test_tasks.py
Changes applied before test
commit c6ce862d3250b910b1a5c123d343aa6d2539892f
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Tue Aug 23 17:05:30 2022 +0200

    Add incremental function to Golang Lister

commit 60405e78aefde8566ead1c9d2901ab64b689129d
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Wed Mar 9 22:35:40 2022 +0100

    Add non-incremental Golang modules lister
    
    This uses https://index.golang.org. An associated loader will be sent in
    the near future, as well as an incremental version of this lister.
    
    [1] https://go.dev/ref/mod#goproxy-protocol

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/632/ for more details.