
First iteration of prometheus export of the e2e metrics
Closed, Public

Authored by vsellier on Jan 12 2022, 3:07 PM.

Details

Summary

TODO:

  • generate the metrics for the vault
  • test the content of the exported file
  • add an info field containing the result of the test to help the diagnosis in the monitoring

Related to T3129

Test Plan

Example of generated metrics:

  • scn:
# HELP swh_e2e_duration_seconds 
# TYPE swh_e2e_duration_seconds gauge
swh_e2e_duration_seconds{application="scn",status="succeeded"} 30.0
# HELP swh_e2e_status 
# TYPE swh_e2e_status gauge
swh_e2e_status{application="scn"} 0.0
  • deposit:
# HELP swh_e2e_duration_seconds 
# TYPE swh_e2e_duration_seconds gauge
swh_e2e_duration_seconds{application="deposit",stage="validation",status="ok"} 10.0
swh_e2e_duration_seconds{application="deposit",stage="loading",status="failed"} 20.0
swh_e2e_duration_seconds{application="deposit",stage="total",status="failed"} 30.0
# HELP swh_e2e_status 
# TYPE swh_e2e_status gauge
swh_e2e_status{application="deposit"} 2.0
  • vault:
# HELP swh_e2e_status 
# TYPE swh_e2e_status gauge
swh_e2e_status{application="vault"} 2.0
# HELP swh_e2e_duration_seconds 
# TYPE swh_e2e_duration_seconds gauge
swh_e2e_duration_seconds{application="vault",status="failed",step="end"} 10.0
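
The samples above follow the Prometheus text exposition format: a `# HELP` line, a `# TYPE` line, then one line per label set. As a minimal sketch of how such output can be produced (this is not the plugin's actual code; `render_gauge` and its parameters are hypothetical names used only for illustration):

```python
from typing import Dict, List, Tuple

# One sample is a (labels, value) pair. These type names are illustrative only.
Sample = Tuple[Dict[str, str], float]


def _escape(value: str) -> str:
    """Escape a label value as required by the text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, samples: List[Sample], help_text: str = "") -> str:
    """Render one gauge metric (hypothetical helper, not part of the plugin)."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        # Sort label names so the output is stable from one run to the next.
        rendered = ",".join(
            f'{key}="{_escape(val)}"' for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{rendered}}} {float(value)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        render_gauge(
            "swh_e2e_duration_seconds",
            [({"application": "scn", "status": "succeeded"}, 30)],
        ),
        end="",
    )
```

With an empty help text, the `# HELP` line ends right after the metric name, as in the examples above.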

Diff Detail

Repository
rDICP Icinga plugins
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build has FAILED

Patch application report for D6926 (id=25102)

Rebasing onto c4f025e849...

Current branch diff-target is up to date.
Changes applied before test
commit d4d2c0449bc279618cab1f5198f8e81f52002b33
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Wed Jan 12 15:03:06 2022 +0100

    WIP - First iteration of prometheus export of the e2e metrics
    
    TODO:
    - generate the metrics for the vault
    - test the content of the exported file
    - add an info field containing the result of the test to help
      the diagnosis in the monitoring
    
    Related to T3129

Link to build: https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/38/
See console output for more information: https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/38/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jan 12 2022, 3:09 PM
Harbormaster failed remote builds in B25976: Diff 25102!

Build has FAILED

Patch application report for D6926 (id=25104)

Rebasing onto c4f025e849...

Current branch diff-target is up to date.
Changes applied before test
commit 54f100898bd6a925a02a1733d4aae187bf44d738
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Wed Jan 12 15:03:06 2022 +0100

    WIP - First iteration of prometheus export of the e2e metrics
    
    TODO:
    - generate the metrics for the vault
    - test the content of the exported file
    - add an info field containing the result of the test to help
      the diagnosis in the monitoring
    
    Related to T3129

Link to build: https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/39/
See console output for more information: https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/39/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jan 12 2022, 3:17 PM
Harbormaster failed remote builds in B25978: Diff 25104!
  • fix tests
  • add metrics on the vault

Build has FAILED

Patch application report for D6926 (id=25555)

Rebasing onto c4f025e849...

Current branch diff-target is up to date.
Changes applied before test
commit e4bb1261ec2c4fbfd7adf9ccf63c5004e35ae321
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Wed Jan 12 15:03:06 2022 +0100

    WIP - First iteration of prometheus export of the e2e metrics
    
    TODO:
    - generate the metrics for the vault
    - test the content of the exported file
    - add an info field containing the result of the test to help
      the diagnosis in the monitoring
    
    Related to T3129

Link to build: https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/40/
See console output for more information: https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/40/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jan 28 2022, 4:27 PM
Harbormaster failed remote builds in B26386: Diff 25555!

minor changes to make mypy happy

Build has FAILED

Patch application report for D6926 (id=25556)

Rebasing onto c4f025e849...

Current branch diff-target is up to date.
Changes applied before test
commit 24f1afe34b0be999c26eadbb2359460d4ce5cffb
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Wed Jan 12 15:03:06 2022 +0100

    WIP - First iteration of prometheus export of the e2e metrics
    
    TODO:
    - generate the metrics for the vault
    - test the content of the exported file
    - add an info field containing the result of the test to help
      the diagnosis in the monitoring
    
    Related to T3129

Link to build: https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/41/
See console output for more information: https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/41/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jan 28 2022, 4:40 PM
Harbormaster failed remote builds in B26387: Diff 25556!

Build has FAILED

Patch application report for D6926 (id=25559)

Rebasing onto 742053536f...

Current branch diff-target is up to date.
Changes applied before test
commit 7c6d921180322ccef54ddec337366808c7f5e61f
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Wed Jan 12 15:03:06 2022 +0100

    WIP - First iteration of prometheus export of the e2e metrics
    
    TODO:
    - generate the metrics for the vault
    - test the content of the exported file
    - add an info field containing the result of the test to help
      the diagnosis in the monitoring
    
    Related to T3129

Link to build: https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/43/
See console output for more information: https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/43/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jan 28 2022, 4:53 PM
Harbormaster failed remote builds in B26390: Diff 25559!
vsellier edited the test plan for this revision.

Build is green

Patch application report for D6926 (id=25560)

Rebasing onto 742053536f...

Current branch diff-target is up to date.
Changes applied before test
commit 04ba6906d071a045c4f2a10edb7956283f112f45
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Wed Jan 12 15:03:06 2022 +0100

    WIP - First iteration of prometheus export of the e2e metrics
    
    TODO:
    - generate the metrics for the vault
    - test the content of the exported file
    - add an info field containing the result of the test to help
      the diagnosis in the monitoring
    
    Related to T3129

See https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/44/ for more details.

Build is green

Patch application report for D6926 (id=27348)

Rebasing onto 95df818bb8...

First, rewinding head to replay your work on top of it...
Applying: WIP - First iteration of prometheus export of the e2e metrics
Using index info to reconstruct a base tree...
M	swh/icinga_plugins/cli.py
M	swh/icinga_plugins/deposit.py
M	swh/icinga_plugins/tests/test_deposit.py
M	swh/icinga_plugins/tests/test_save_code_now.py
M	swh/icinga_plugins/tests/test_vault.py
Falling back to patching base and 3-way merge...
Auto-merging swh/icinga_plugins/tests/test_vault.py
Auto-merging swh/icinga_plugins/tests/test_save_code_now.py
Auto-merging swh/icinga_plugins/tests/test_deposit.py
Auto-merging swh/icinga_plugins/deposit.py
Auto-merging swh/icinga_plugins/cli.py
CONFLICT (content): Merge conflict in swh/icinga_plugins/cli.py
Patch failed at 0001 WIP - First iteration of prometheus export of the e2e metrics

Resolve all conflicts manually, mark them as resolved with
"git add/rm <conflicted_files>", then run "git rebase --continue".
You can instead skip this commit: run "git rebase --skip".
To abort and get back to the state before "git rebase", run "git rebase --abort".

Rebase failed (ret=1)!

Could not rebase; Attempt merge onto 95df818bb8...

Already up to date.
Changes applied before test
commit 9f80eac01aeeeabb4924455d8006927c58a7f471
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Fri Apr 8 16:17:13 2022 +0200

    WIP - First iteration of prometheus export of the e2e metrics
    
    Summary:
    TODO:
    - ~~generate the metrics for the vault~~
    - test the content of the exported file
    - add an info field containing the result of the test to help
      the diagnosis in the monitoring
    
    Related to T3129
    
    Test Plan:
    Example of generated metrics:
    - scn:
```
swh_e2e_duration_seconds{application="scn",status="succeeded"} 30.0
swh_e2e_status{application="scn"} 0.0
```

- deposit:
```
swh_e2e_duration_seconds{application="deposit",stage="validation",status="ok"} 10.0
swh_e2e_duration_seconds{application="deposit",stage="loading",status="failed"} 20.0
swh_e2e_duration_seconds{application="deposit",stage="total",status="failed"} 30.0
swh_e2e_status{application="deposit"} 2.0
```

- vault
```
swh_e2e_status{application="vault"} 2.0
swh_e2e_duration_seconds{application="vault",status="failed",step="end"} 10.0
```

Reviewers: #system_administrators

Maniphest Tasks: T3129

Differential Revision: https://forge.softwareheritage.org/D6926
See https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/68/ for more details.

Build is green

Patch application report for D6926 (id=27559)

Rebasing onto 95df818bb8...

Current branch diff-target is up to date.
Changes applied before test
commit d7330f61231c9005b740133ec859a894445e039a
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Fri Apr 8 16:17:13 2022 +0200

    WIP - First iteration of prometheus export of the e2e metrics
    
    Summary:
    TODO:
    - ~~generate the metrics for the vault~~
    - test the content of the exported file
    - add an info field containing the result of the test to help
      the diagnosis in the monitoring
    
    Related to T3129
    
    Test Plan:
    Example of generated metrics:
    - scn:
```
swh_e2e_duration_seconds{application="scn",status="succeeded"} 30.0
swh_e2e_status{application="scn"} 0.0
```

- deposit:
```
swh_e2e_duration_seconds{application="deposit",stage="validation",status="ok"} 10.0
swh_e2e_duration_seconds{application="deposit",stage="loading",status="failed"} 20.0
swh_e2e_duration_seconds{application="deposit",stage="total",status="failed"} 30.0
swh_e2e_status{application="deposit"} 2.0
```

- vault
```
swh_e2e_status{application="vault"} 2.0
swh_e2e_duration_seconds{application="vault",status="failed",step="end"} 10.0
```

Reviewers: #system_administrators

Maniphest Tasks: T3129

Differential Revision: https://forge.softwareheritage.org/D6926
See https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/69/ for more details.
  • rebase
  • test the prometheus exporter file creation

Build is green

Patch application report for D6926 (id=27945)

Rebasing onto f1804f1cab...

Current branch diff-target is up to date.
Changes applied before test
commit e3be943d0c98beffb5f2a1fb3f68f5fcc909f94b
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Fri Apr 8 16:17:13 2022 +0200

    WIP - First iteration of prometheus export of the e2e metrics
    
    Summary:
    TODO:
    - ~~generate the metrics for the vault~~
    - test the content of the exported file
    - add an info field containing the result of the test to help
      the diagnosis in the monitoring
    
    Related to T3129
    
    Test Plan:
    Example of generated metrics:
    - scn:
```
swh_e2e_duration_seconds{application="scn",status="succeeded"} 30.0
swh_e2e_status{application="scn"} 0.0
```

- deposit:
```
swh_e2e_duration_seconds{application="deposit",stage="validation",status="ok"} 10.0
swh_e2e_duration_seconds{application="deposit",stage="loading",status="failed"} 20.0
swh_e2e_duration_seconds{application="deposit",stage="total",status="failed"} 30.0
swh_e2e_status{application="deposit"} 2.0
```

- vault
```
swh_e2e_status{application="vault"} 2.0
swh_e2e_duration_seconds{application="vault",status="failed",step="end"} 10.0
```

Reviewers: #system_administrators

Maniphest Tasks: T3129

Differential Revision: https://forge.softwareheritage.org/D6926
See https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/71/ for more details.
vsellier retitled this revision from WIP - First iteration of prometheus export of the e2e metrics to First iteration of prometheus export of the e2e metrics. May 6 2022, 5:30 PM
ardumont added inline comments.
swh/icinga_plugins/base_check.py
60

?

(No idea whether duplicates are possible, or whether that would be an issue, but better to ask.)

swh/icinga_plugins/base_check.py
106

Please add a docstring mentioning what it does.
Afaict, it writes the metrics to a Prometheus file.
Also mention it's a callback method triggered when the icinga check is done.
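
For illustration only, a docstring along the requested lines might look like the sketch below. The class name `ExampleCheck`, the method name `on_check_done`, and the attributes are all hypothetical and not taken from `base_check.py`. Writing to a temporary file and renaming it is an assumption of this sketch, not something the revision states.

```python
import os
from typing import List


class ExampleCheck:
    """Hypothetical stand-in for a check class; not the plugin's real API."""

    def __init__(self, metrics_path: str) -> None:
        self.metrics_path = metrics_path
        # Already-rendered exposition lines collected while the check runs.
        self.metric_lines: List[str] = []

    def on_check_done(self) -> None:
        """Write the collected metrics to the Prometheus export file.

        This is a callback, meant to be triggered once the icinga check
        has finished, whatever its outcome.
        """
        tmp_path = self.metrics_path + ".tmp"
        with open(tmp_path, "w") as stream:
            stream.write("\n".join(self.metric_lines) + "\n")
        # Rename over the target so a reader never sees a half-written file.
        os.replace(tmp_path, self.metrics_path)
```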

swh/icinga_plugins/deposit.py
266

?

lgtm

A couple of suggestions inline.
And maybe even a typo in the status code to fix.

This revision is now accepted and ready to land. May 9 2022, 5:58 PM
  • rebase
  • update according to the review feedback
vsellier marked an inline comment as not done. May 10 2022, 8:54 AM
vsellier added inline comments.
swh/icinga_plugins/base_check.py
60

It should not be a problem. The list must match the names as created in _get_labels_names; since the logic is the same, it's OK.
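
One way to keep label names and label values aligned is to derive both from the same ordering logic. The sketch below is an assumption about how that could look, not the code under review. `get_label_names` only loosely mirrors the `_get_labels_names` helper mentioned above, and `get_label_values` is a hypothetical counterpart.

```python
from typing import Dict, List


def get_label_names(labels: Dict[str, str]) -> List[str]:
    """Return the label names in a deterministic (sorted) order."""
    return sorted(labels)


def get_label_values(labels: Dict[str, str]) -> List[str]:
    """Return the values in the same order as get_label_names()."""
    return [labels[name] for name in get_label_names(labels)]


if __name__ == "__main__":
    labels = {"status": "failed", "application": "deposit", "stage": "loading"}
    # Names and values stay paired because both use the same ordering.
    print(list(zip(get_label_names(labels), get_label_values(labels))))
```

Because the labels come from a dictionary in this sketch, a duplicated name cannot occur.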

106

done

swh/icinga_plugins/deposit.py
266

true, good catch

Build is green

Patch application report for D6926 (id=28160)

Rebasing onto 3e33ab9f97...

Current branch diff-target is up to date.
Changes applied before test
commit 9812ac8f7b1d82003f95b90640bf1c5a26a944e6
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Fri Apr 8 16:17:13 2022 +0200

    First iteration of prometheus export of the e2e metrics
    
    Related to T3129

See https://jenkins.softwareheritage.org/job/DICP/job/tests-on-diff/73/ for more details.