Will give some insights into what is taking the most time
Details
- Reviewers
anlambert olasd - Group Reviewers
Reviewers - Maniphest Tasks
- T4219: Investigate why GitHub fork detection did not bring a speed-up
- Commits
- rDLDBASE6ca6d5cf9cef: loader.core: Add statsd timing metrics
Not tested
Diff Detail
- Repository
- rDLDBASE Generic VCS/Package Loader
- Branch
- metrics
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 29007 Build 45346: Phabricator diff pipeline on jenkins Jenkins console · Jenkins Build 45345: arc lint + arc unit
Event Timeline
Build has FAILED
Patch application report for D7726 (id=27936)
Rebasing onto 82e1bfad5c...
Current branch diff-target is up to date.
Changes applied before test
commit 619f8aef0fe93725a254b8978220b328b559a3a3 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Mon May 2 15:20:52 2022 +0200 loader.core: Add statsd timing metrics
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/764/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/764/console
Build is green
Patch application report for D7726 (id=27936)
Rebasing onto 82e1bfad5c...
Current branch diff-target is up to date.
Changes applied before test
commit 619f8aef0fe93725a254b8978220b328b559a3a3 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Mon May 2 15:20:52 2022 +0200 loader.core: Add statsd timing metrics
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/766/ for more details.
I think you should add some tests checking the expected metrics are sent to statsd by mocking the statsd_timed method.
This will prevent possible regressions on that feature if the code of BaseLoader gets modified.
swh/loader/core/loader.py | ||
---|---|---|
426 | success my be True here if we did not enter the except block. | |
429 | same here |
Build is green
Patch application report for D7726 (id=27949)
Rebasing onto c06305e78a...
First, rewinding head to replay your work on top of it... Applying: loader.core: Add statsd timing metrics
Changes applied before test
commit d16c32648030ea5488d1a0d89651fe2d3a62524e Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Mon May 2 15:20:52 2022 +0200 loader.core: Add statsd timing metrics
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/769/ for more details.
Looks good to me.
swh/loader/core/tests/test_loader.py | ||
---|---|---|
340 ↗ | (On Diff #27949) | You can directly use mocker.call to gain an import statement. |
swh/loader/core/loader.py | ||
---|---|---|
41 | Could probably just be swh_loader as it's fully generic. | |
474–488 |
|
swh/loader/core/loader.py | ||
---|---|---|
377–378 | Are you sure the * 1000 is needed? I'm pretty sure our statsd client does this conversion already and expects seconds as arguments. |
Build is green
Patch application report for D7726 (id=28000)
Rebasing onto a097a946c2...
First, rewinding head to replay your work on top of it... Applying: loader.core: Add statsd timing metrics
Changes applied before test
commit 34985af30d02575639937d89a444dc0fd91bd0d1 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Mon May 2 15:20:52 2022 +0200 loader.core: Add statsd timing metrics
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/774/ for more details.
Build is green
Patch application report for D7726 (id=28003)
Rebasing onto a097a946c2...
First, rewinding head to replay your work on top of it... Applying: loader.core: Add statsd timing metrics
Changes applied before test
commit 9682fb6fceac4a7e18aa219e0659900f238ee23b Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Mon May 2 15:20:52 2022 +0200 loader.core: Add statsd timing metrics
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/776/ for more details.
Very nice, thanks!
After reviewing D7732, I'm getting the feeling that it may be worth adding a swh.core.statsd.Statsd instance as attribute of the loader instance, with a set namespace and a set of constant tags that we would not need to repeat for each statsd call (for instance, the visit_type).
This would also allow the loader to collect new constant tags over its lifetime. In this specific case, you could add the success/status tags as soon as they are known. And the git loader could have the "incremental mode" information introduced by D7732 automatically applied to further calls to statsd within the lifetime of the loader, allowing us to get distinct histograms for every use case.
What do you think?
swh/loader/core/loader.py | ||
---|---|---|
474–488 | The prometheus statsd exporter converts it to sensible units, so _seconds is correct |
Build is green
Patch application report for D7726 (id=28037)
Rebasing onto a097a946c2...
Current branch diff-target is up to date.
Changes applied before test
commit 6ca6d5cf9cefb595ee8f1ad3fcf5decf4008aafb Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Mon May 2 15:20:52 2022 +0200 loader.core: Add statsd timing metrics
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/782/ for more details.