Page MenuHomeSoftware Heritage

loader.core: Add statsd timing metrics
ClosedPublic

Authored by vlorentz on May 2 2022, 3:21 PM.

Details

Summary

Will give some insights into what is taking the most time

Test Plan

Not tested

Event Timeline

Build has FAILED

Patch application report for D7726 (id=27936)

Rebasing onto 82e1bfad5c...

Current branch diff-target is up to date.
Changes applied before test
commit 619f8aef0fe93725a254b8978220b328b559a3a3
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon May 2 15:20:52 2022 +0200

    loader.core: Add statsd timing metrics

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/764/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/764/console

Harbormaster returned this revision to the author for changes because remote builds failed.May 2 2022, 3:22 PM
Harbormaster failed remote builds in B29007: Diff 27936!

Build is green

Patch application report for D7726 (id=27936)

Rebasing onto 82e1bfad5c...

Current branch diff-target is up to date.
Changes applied before test
commit 619f8aef0fe93725a254b8978220b328b559a3a3
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon May 2 15:20:52 2022 +0200

    loader.core: Add statsd timing metrics

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/766/ for more details.

I think you should add some tests checking the expected metrics are sent to statsd by mocking the statsd_timed method.
This will prevent possible regressions on that feature if the code of BaseLoader gets modified.

swh/loader/core/loader.py
433–440

success my be True here if we did not enter the except block.

435

same here

add test and fix lots of bugs

Build is green

Patch application report for D7726 (id=27949)

Rebasing onto c06305e78a...

First, rewinding head to replay your work on top of it...
Applying: loader.core: Add statsd timing metrics
Changes applied before test
commit d16c32648030ea5488d1a0d89651fe2d3a62524e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon May 2 15:20:52 2022 +0200

    loader.core: Add statsd timing metrics

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/769/ for more details.

Looks good to me.

swh/loader/core/tests/test_loader.py
340

You can directly use mocker.call to gain an import statement.

This revision is now accepted and ready to land.May 3 2022, 1:57 PM
olasd added inline comments.
swh/loader/core/loader.py
42

Could probably just be swh_loader as it's fully generic.

484–498
  • We want the metric name to finish with the unit to match the prometheus naming conventions (our statsd exporter configuration only translates dots to underscores)
  • I would suggest making the operation name a tag, as it'll be easier to make generic dashboards that way, even when we add new operations.
swh/loader/core/loader.py
378–379

Are you sure the * 1000 is needed? I'm pretty sure our statsd client does this conversion already and expects seconds as arguments.

swh/loader/core/loader.py
378–379

It's done in TimedContextManagerDecorator, which is used by statsd_timed (a decoractor and context manager), not by statsd_timing (a one-off function).

You can see in the test that in the end, everything is multiplied by 1000.

484–498

good point, will do.

rename metrics as suggested

Build is green

Patch application report for D7726 (id=28000)

Rebasing onto a097a946c2...

First, rewinding head to replay your work on top of it...
Applying: loader.core: Add statsd timing metrics
Changes applied before test
commit 34985af30d02575639937d89a444dc0fd91bd0d1
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon May 2 15:20:52 2022 +0200

    loader.core: Add statsd timing metrics

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/774/ for more details.

olasd added inline comments.
swh/loader/core/loader.py
378–379

Barf. Okay.

swh/loader/core/tests/test_loader.py
340

Hmpf. We should probably use either dots or underscores consistently (not that it really matters much, it's all getting underscored in the prometheus exporter).

consistently use underscores

Build is green

Patch application report for D7726 (id=28003)

Rebasing onto a097a946c2...

First, rewinding head to replay your work on top of it...
Applying: loader.core: Add statsd timing metrics
Changes applied before test
commit 9682fb6fceac4a7e18aa219e0659900f238ee23b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon May 2 15:20:52 2022 +0200

    loader.core: Add statsd timing metrics

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/776/ for more details.

Very nice, thanks!

After reviewing D7732, I'm getting the feeling that it may be worth adding a swh.core.statsd.Statsd instance as attribute of the loader instance, with a set namespace and a set of constant tags that we would not need to repeat for each statsd call (for instance, the visit_type).

This would also allow the loader to collect new constant tags over its lifetime. In this specific case, you could add the success/status tags as soon as they are known. And the git loader could have the "incremental mode" information introduced by D7732 automatically applied to further calls to statsd within the lifetime of the loader, allowing us to get distinct histograms for every use case.

What do you think?

Good idea, I'll send another diff (based on D7727) to do this

swh/loader/core/loader.py
484–498

Is it fine to call it _seconds even though it's in millisecond; or should I change the suffix to _ms?

swh/loader/core/loader.py
484–498

The prometheus statsd exporter converts it to sensible units, so _seconds is correct

Good idea, I'll send another diff (based on D7727) to do this

Cool, thanks

Build is green

Patch application report for D7726 (id=28037)

Rebasing onto a097a946c2...

Current branch diff-target is up to date.
Changes applied before test
commit 6ca6d5cf9cefb595ee8f1ad3fcf5decf4008aafb
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon May 2 15:20:52 2022 +0200

    loader.core: Add statsd timing metrics

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/782/ for more details.