misc/coverage: Revamp and improve archive coverage widget
ClosedPublic

Authored by anlambert on Jul 16 2021, 6:21 PM.

Details

Summary

In order to implement T3127, revamp the archive coverage widget by:

  • adding new origin types that we now archive
  • splitting origins into multiple categories: listed, legacy, deposited and miscellaneous
  • adding global origin counts per origin type
  • adding detailed origin counts for each listed forge
  • removing dead code

The default view presents a high level overview of the archived origins.

To get more details about listed forges, collapsible tables have been added to the widget.
Clicking on any total origin count expands all those tables to display detailed lister
metrics and links to search the associated origins.

Below is how it looks so far:



Diff Detail

Repository
rDWAPPS Web applications
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Thanks for this, and for the screenshots, they look gorgeous!

As a minor suggestion I propose the following heading changes:

  • listed origins -> regular crawling
  • legacy origins -> discontinued hosting
  • deposited origins -> on demand archival

This would slightly reduce the repetition between the title headings and the explanations just below them (the explanations themselves look great, on the other hand).

As another minor comment, the vertical space taken by the counters looks significant overall, as it adds up throughout the page; maybe we can reduce the font size a bit? (YMMV!)

As a minor suggestion I propose the following heading changes:

  • listed origins -> regular crawling
  • legacy origins -> discontinued hosting
  • deposited origins -> on demand archival

Better naming indeed, will update code and screenshots once it is done.

As another minor comment, the vertical space taken by the counters looks significant overall, as it adds up throughout the page; maybe we can reduce the font size a bit? (YMMV!)

Font sizes are fine imho, but I think we can gain some vertical space by tweaking some paddings and margins.

Update:

  • Rename section titles according to @zack's suggestions
  • Remove vertical padding around counters to gain vertical space
  • Add a fallback for when scheduler metrics or deposit lists are not available: the widget with logos will still be displayed, but without counter info

Screenshots in task description have been updated.

anlambert added inline comments.
swh/web/misc/coverage.py
205–206

I ensured that the widget will still be displayed when metrics are not available; only the counter info will be missing in that case.
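
As an illustration of that fallback, here is a minimal sketch. It is not the code from the diff: `origins_with_counts` and the `fetch_metrics` callable are hypothetical names standing in for whatever retrieves the scheduler metrics. The idea is only that a failure to get metrics yields entries without counts rather than an error.

```python
from typing import Callable, Dict, List, Optional


def origins_with_counts(
    origin_types: List[str],
    fetch_metrics: Callable[[], Dict[str, int]],
) -> List[Dict[str, Optional[object]]]:
    """Return one entry per origin type, with a count when metrics are available."""
    try:
        metrics = fetch_metrics()
    except Exception:
        # Hypothetical failure mode: the metrics backend is unreachable.
        # Keep every origin type so logos can still be rendered, without counters.
        metrics = {}
    return [
        {"type": origin_type, "count": metrics.get(origin_type)}
        for origin_type in origin_types
    ]
```

A template consuming these entries could then display the counter only when `count` is not `None`.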

Build is green

Patch application report for D6004 (id=22130)

Rebasing onto 87cc9e042d...

Current branch diff-target is up to date.
Changes applied before test
commit 45725285f1ec269a24ec34414ffde989702d6ed6
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp archive coverage widget (WIP)

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1013/ for more details.

anlambert edited the summary of this revision. (Show Details)
anlambert marked an inline comment as done.

Next step: write tests for that updated view.

Rebase and add test for coverage view.

Build has FAILED

Patch application report for D6004 (id=22172)

Rebasing onto 8acd726369...

Current branch diff-target is up to date.
Changes applied before test
commit 5fbc8666c341421a76ac3c04ca414c44c1beea87
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget

Link to build: https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1022/
See console output for more information: https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1022/console

Build is green

Patch application report for D6004 (id=22173)

Rebasing onto 8acd726369...

Current branch diff-target is up to date.
Changes applied before test
commit f50c2eb3b4580080f9ff6fbc6d5e84534d846383
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1023/ for more details.

Ensure lru caches are cleared before each test.
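
For context, a small sketch of why cached functions need resetting between tests. The function name `get_listers_metrics` and the call counter are assumptions made for illustration and are not taken from the diff; only `functools.lru_cache` and its `cache_clear()` method are standard library features.

```python
from functools import lru_cache

# Hypothetical call counter, used only to make the caching visible.
CALLS = {"n": 0}


@lru_cache()
def get_listers_metrics() -> dict:
    """Hypothetical cached lookup; the body stands in for an expensive query."""
    CALLS["n"] += 1
    return {"git": CALLS["n"]}


def clear_caches() -> None:
    """Reset every cached function so each test starts from a clean state."""
    for cached in (get_listers_metrics,):
        cached.cache_clear()
```

With pytest, a helper like `clear_caches` would typically be called from an autouse fixture, so that data mocked in one test does not leak into the next through the cache.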

Build is green

Patch application report for D6004 (id=22175)

Rebasing onto 8acd726369...

Current branch diff-target is up to date.
Changes applied before test
commit 9bd604ff2f640e327bf1e59e491eaae955e7262a
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1024/ for more details.

Build is green

Patch application report for D6004 (id=22176)

Rebasing onto 8acd726369...

Current branch diff-target is up to date.
Changes applied before test
commit 25fec609363f88542761992d159650535b2e9040
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1025/ for more details.

Update:

  • rebase
  • improve test to check origins logos and search links are present in the rendered HTML page
  • improve commit message
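
A rough sketch of the kind of check described in the second item, using only the standard library. The helper name and the sample paths in the usage note are hypothetical; the actual test in the diff may well rely on the project's own test helpers instead.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class _Collector(HTMLParser):
    """Collect image sources and link targets from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.img_srcs: List[str] = []
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.img_srcs.append(attributes["src"])
        elif tag == "a" and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def collect_logos_and_links(html: str) -> Tuple[List[str], List[str]]:
    """Return (image sources, link targets) found in the rendered page."""
    parser = _Collector()
    parser.feed(html)
    return parser.img_srcs, parser.hrefs
```

A test could then assert that each expected logo path appears in the first list and each expected search URL in the second.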

It is time to remove the WIP state for that diff now.

anlambert retitled this revision from misc/coverage: Revamp archive coverage widget (WIP) to misc/coverage: Revamp and improve archive coverage widget. Aug 25 2021, 12:05 PM

Build is green

Patch application report for D6004 (id=22191)

Rebasing onto 773d6c2e06...

Current branch diff-target is up to date.
Changes applied before test
commit b2bdcf5041940dc187fda7c59f477a4ee878cb7e
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget in homepage
    
    Add new origins types that we now archived.
    
    Update source code provider logos to png only in order to ease the
    integration of future ones
    
    Split origins in multiple categories: listed, legacy, deposited and
    miscellaneous.
    
    Add global origins counts per origin type.
    
    Add detailed origins counts for each listed forge, those details are
    hidden by default and can be displayed on demand through collapsible
    elements.
    
    Remove dead code.
    
    Related to T3127

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1028/ for more details.

Update:

  • set 3 origins types per row instead of 4 to improve readability
  • hide phabricator origins for the moment as most of them have not been loaded into the archive

Build is green

Patch application report for D6004 (id=22202)

Rebasing onto 773d6c2e06...

Current branch diff-target is up to date.
Changes applied before test
commit 27168249259f23d2d9dd3906b9236fb6c9aeac9d
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget in homepage
    
    Add new origins types that we now archived.
    
    Update source code provider logos to png only in order to ease the
    integration of future ones
    
    Split origins in multiple categories: listed, legacy, deposited and
    miscellaneous.
    
    Add global origins counts per origin type.
    
    Add detailed origins counts for each listed forge, those details are
    hidden by default and can be displayed on demand through collapsible
    elements.
    
    Remove dead code.
    
    Related to T3127

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1030/ for more details.

Rebase, improve tuple iteration code and add some docstrings.

Build is green

Patch application report for D6004 (id=22307)

Rebasing onto f15e17c406...

Current branch diff-target is up to date.
Changes applied before test
commit a0950d85888c1724de9c5234fe50b3a44428e6b3
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget in homepage
    
    Add new origins types that we now archived.
    
    Update source code provider logos to png only in order to ease the
    integration of future ones
    
    Split origins in multiple categories: listed, legacy, deposited and
    miscellaneous.
    
    Add global origins counts per origin type.
    
    Add detailed origins counts for each listed forge, those details are
    hidden by default and can be displayed on demand through collapsible
    elements.
    
    Remove dead code.
    
    Related to T3127

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1043/ for more details.

Update:

  • Open search links in new browser tab
  • Put coverage widget CSS rules in a dedicated file

Screenshots in diff description have also been updated.

quick comment on the "Miscellaneous" category:

  • it's not a great name, and it really feels they are "less important" than the others even if we say explicitly they aren't (or maybe because we say so :-))
  • and shouldn't the two items in there (nix, guix) go under "regular crawling" anyway? (that would trivially solve the previous point)

the rest LGTM

In D6004#159479, @zack wrote:

quick comment on the "Miscellaneous" category:

  • it's not a great name, and it really feels they are "less important" than the others even if we say explicitly they aren't (or maybe because we say so :-))
  • and shouldn't the two items in there (nix, guix) go under "regular crawling" anyway? (that would trivially solve the previous point)

the rest LGTM

Ack, I understand the concerns. I will put the nix/guix origins into the regular crawling section then.

Build has FAILED

Patch application report for D6004 (id=22335)

Rebasing onto f15e17c406...

Current branch diff-target is up to date.
Changes applied before test
commit 34a719d152fff31f4fccc90ab192ea8dcd3e4d4b
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget in homepage
    
    Add new origins types that we now archived.
    
    Update source code provider logos to png only in order to ease the
    integration of future ones
    
    Split origins in multiple categories: listed, legacy, deposited and
    miscellaneous.
    
    Add global origins counts per origin type.
    
    Add detailed origins counts for each listed forge, those details are
    hidden by default and can be displayed on demand through collapsible
    elements.
    
    Remove dead code.
    
    Related to T3127

Link to build: https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1044/
See console output for more information: https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1044/console

Update:

  • fix test
  • remove miscellaneous section and move guix/nixos origins into the regular crawling one
  • implement functions to get the guix/nixos origin counts, as there are no scheduler metrics for those

Screenshots in task description have also been updated.

This should be good to land and deploy now.

Build is green

Patch application report for D6004 (id=22338)

Rebasing onto f15e17c406...

Current branch diff-target is up to date.
Changes applied before test
commit 82ed1cc5f7f2536ecc7bbd93df89dd63bec0acba
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget in homepage
    
    Add new origins types that we now archived.
    
    Update source code provider logos to png only in order to ease the
    integration of future ones
    
    Split origins in multiple categories: listed, legacy and deposited.
    
    Add global origins counts per origin type.
    
    Add detailed origins counts for each listed forge, those details are
    hidden by default and can be displayed on demand through collapsible
    elements.
    
    Remove dead code.
    
    Related to T3127

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1045/ for more details.

swh/web/misc/coverage.py
31

or swh-web's?

86

This should be like the other one: the nixguix loader visit type is nixguix [1].
But those are complicated, as they deal with other types such as 'tar', for example.

That probably explains why we got so few in your screenshot; 26 is too few for my taste ;)

[1] https://forge.softwareheritage.org/source/swh-loader-core/browse/master/swh/loader/package/nixguix/loader.py$68

103

I'd say nixpkgs here.

swh/web/misc/coverage.py
31

That info could be needed by other swh components, so I would not put it in the swh-web db.

86

I missed the nixguix loader configuration for guix. The correct origin count is the number of branches in that origin; I will adapt the code to retrieve it.

103

I use the type value to get the associated png logo in static assets, so I will keep the nixos value here.

swh/web/misc/coverage.py
103

NixOS is the Linux distribution (which uses nix and nix's DSL for everything).
Nixpkgs is the set of packages offered by the nix package manager. That's what we are ingesting with the nixguix loader implementation (well, a derivative of it, through the json file just below).

I understood nixpkgs as the superset of all packages (including the ones we can install for NixOS).
Hence why I suggest we use nixpkgs here ;)

255
swh/web/misc/coverage.py
31

sounds fair.

86

right!

103

I use the type value to get the associated png logo in static assets, so I will keep the nixos value here.

ack

swh/web/misc/coverage.py
255

Scratch my previous suggestion change then ;)
Given one of your last comments, this method should be named _nixguix_.
The nixpkgs and guix origin counts should be computed as this function does (by counting the number of branches in the snapshots).
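
To make the counting idea concrete, a minimal sketch follows. It assumes a snapshot represented as a plain dictionary with a `branches` mapping; both that shape and the function name are assumptions made for illustration, not the code under review.

```python
from typing import Any, Dict, Optional


def count_snapshot_branches(snapshot: Optional[Dict[str, Any]]) -> int:
    """Return the number of branches in a snapshot, or 0 when there is none."""
    if not snapshot:
        # No snapshot available yet (e.g. the origin has not been visited).
        return 0
    return len(snapshot.get("branches", {}))
```

Under this approach, the count shown for nixpkgs or guix would be the size of the latest snapshot of the corresponding origin rather than a scheduler metric.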

swh/web/misc/coverage.py
255

already done ;-) update incoming

swh/web/misc/coverage.py
255

*thumbs up*

swh/web/templates/misc/coverage.html
102
126

Update:

  • rebase
  • fix guix origins count

Screenshots have also been updated in diff description.

Add missing trailing spaces in django template

Awesome work!

That looks gorgeous, can't wait to see this deployed ;)

(couple of remaining remarks inline but nothing blocking)

This revision is now accepted and ready to land. Sep 2 2021, 3:07 PM

Build was aborted

Patch application report for D6004 (id=22359)

Rebasing onto 2a6ce0bc00...

Current branch diff-target is up to date.
Changes applied before test
commit 5afa65e107cb15d341dac6438065832562fe60dc
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget in homepage
    
    Add new origins types that we now archived.
    
    Update source code provider logos to png only in order to ease the
    integration of future ones
    
    Split origins in multiple categories: listed, legacy and deposited.
    
    Add global origins counts per origin type.
    
    Add detailed origins counts for each listed forge, those details are
    hidden by default and can be displayed on demand through collapsible
    elements.
    
    Remove dead code.
    
    Related to T3127

Link to build: https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1050/
See console output for more information: https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1050/console

Add alt attribute to logos img.

Build is green

Patch application report for D6004 (id=22362)

Rebasing onto 2a6ce0bc00...

Current branch diff-target is up to date.
Changes applied before test
commit d610a20d92969a2ab286414ea15bb9d0f83e5901
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget in homepage
    
    Add new origins types that we now archived.
    
    Update source code provider logos to png only in order to ease the
    integration of future ones
    
    Split origins in multiple categories: listed, legacy and deposited.
    
    Add global origins counts per origin type.
    
    Add detailed origins counts for each listed forge, those details are
    hidden by default and can be displayed on demand through collapsible
    elements.
    
    Remove dead code.
    
    Related to T3127

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1053/ for more details.

Build is green

Patch application report for D6004 (id=22364)

Rebasing onto 2a6ce0bc00...

Current branch diff-target is up to date.
Changes applied before test
commit d0335365a461b215875671c53aaaf78c050ce5a6
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jul 16 18:09:30 2021 +0200

    misc/coverage: Revamp and improve archive coverage widget in homepage
    
    Add new origins types that we now archived.
    
    Update source code provider logos to png only in order to ease the
    integration of future ones
    
    Split origins in multiple categories: listed, legacy and deposited.
    
    Add global origins counts per origin type.
    
    Add detailed origins counts for each listed forge, those details are
    hidden by default and can be displayed on demand through collapsible
    elements.
    
    Remove dead code.
    
    Related to T3127

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1056/ for more details.