revamp archive coverage page to list instances of mentioned listers
Closed, Migrated

Assigned To

Authored By

	zack
	Jul 1 2019, 6:11 PM

Description

Now that we are increasing archive coverage quite a bit, the archive coverage page is starting to show some limits. In particular, we need a structured way to list the various instances of supported listers.

- As a first approximation, we can make the tooltip of each listed logo include a list of instances. This would work for now, but it won't scale far, because there is only so much usable space in a tooltip.
- Alternatively, we can make each logo link to a dedicated page listing all deployed instances of the lister (which will probably mean solving T1266 as a prerequisite).
- Or perhaps something in between: e.g., a box that opens when clicking on a logo, containing a proper <div> with links to each instance.

A proper solution (T1266) would require some work, but the current state is no longer good enough in terms of clarity of what we currently archive… Thoughts?

Revisions and Commits

rDWAPPS Web applications
	D6004	rDWAPPSd0335365a461 misc/coverage: Revamp and improve archive coverage widget in homepage

Related Objects

Mentioned In: T2640: Add link from the main archive to the Bitbucket mercurial case (https://bitbucket-archive.softwareheritage.org/)
T2468: add to archive coverage page a breakdown of the number of origins per lister [instance]
T2442: Provide a unified API for listers to interact with the scheduler
Mentioned Here: T1538: Add "forge" now
T1266: automatically generate archive coverage page

Event Timeline

zack triaged this task as Normal priority. Jul 1 2019, 6:11 PM

zack created this task.

My gut feeling is that we're already past the point where maintaining the list by hand is workable: in the last week or two, we've added a dozen new sources and we're going to keep adding more (at a slower pace, but probably more on an "on the fly" basis).

I'm also not quite sure where to draw the line between "platform software supported for archival" (e.g. gitlab, bitbucket, PyPI archives, Debian archives, CGit) and "currently archived hosting platforms" (e.g. gitlab.com, framagit, bitbucket.org, debian.org, kernel.org, gnu.org). Showing the distinction will probably make more sense when/if we decide to tackle T1538.

On the question of showing which actual hosters we archive code from, I think there is a middle ground: we curate a list of "prominent" sources, kept at the top of the section (hard-coded in a first iteration, then determined programmatically by taking the top N sources once T1266 is solved), and then show a random sample of other origins on a second line below.
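The two-line layout described above could be sketched roughly as follows. This is only an illustration, not existing code: the function name, its parameters, and the idea of feeding it a mapping from source name to origin count are all assumptions made for the example.

```python
import random


def pick_displayed_sources(origin_counts, prominent, sample_size, top_n=None, seed=None):
    """Return (first_line, second_line) of sources to display.

    origin_counts: hypothetical mapping of source name -> number of origins.
    prominent: hand-curated list of "prominent" sources (first iteration).
    top_n: if set, ignore `prominent` and take the N largest sources instead.
    """
    if top_n is not None:
        # Programmatic variant: rank by origin count, break ties by name.
        ranked = sorted(origin_counts, key=lambda s: (-origin_counts[s], s))
        first_line = ranked[:top_n]
    else:
        # Hard-coded variant: keep the curated order, drop unknown sources.
        first_line = [s for s in prominent if s in origin_counts]

    # Everything not on the first line is a candidate for the random sample.
    others = sorted(s for s in origin_counts if s not in first_line)
    rng = random.Random(seed)
    second_line = rng.sample(others, min(sample_size, len(others)))
    return first_line, second_line
```

Passing a seed keeps the sample stable between page loads if that is preferred; leaving it unset gives a different second line each time.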

In T1870#34563, @olasd wrote:

> My gut feeling is that we're already past the point where maintaining the list by hand is workable

I concur.

> I'm also not quite sure where to draw the line between "platform software supported for archival" (e.g. gitlab, bitbucket, PyPI archives, Debian archives, CGit) and "currently archived hosting platforms" (e.g. gitlab.com, framagit, bitbucket.org, debian.org, kernel.org, gnu.org). Showing the distinction will probably make more sense when/if we decide to tackle T1538.

Yeah, that too.

As a way forward, let me note down here a proposal (by @vlorentz on IRC) which I quite like and which would allow us to make progress on this task without having to go all the way down this rabbit hole:

- create a new page, e.g., archive.s.o/coverage, that is automatically generated by the web app and contains:
  - a table of all the listers currently in production, one per row
  - for each lister, the lister type (e.g., gitlab) and the instance URL (e.g., https://gitlab.com)
  - rows grouped by lister type, so that all, say, gitlab listers come together
  - a section heading for each group of listers of the same type, where we can have
    - the lister type logo (this information should hence be made machine-readable somewhere)
    - an anchor that we can link to, e.g., archive.s.o/coverage/gitlab
- the current list of logos on archive.s.o remains curated by hand for now, and
  - for logos of listers that do appear in the table, we make them link to the corresponding archive.s.o/coverage anchors, so that one can easily access the list of all instances of a given lister

This way we can still have "exceptions" on the main archive.s.o page (e.g., HAL) but still progress toward a more automated solution.
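To make the proposal a bit more concrete, here is a minimal sketch of the grouping step, assuming the web app can obtain a flat list of (lister type, instance URL) pairs from somewhere. The `ListerInstance` class, the `build_coverage_sections` function, and the output dictionary keys are hypothetical names invented for this example; the logo lookup is left out since where that information would live is still open.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ListerInstance:
    lister_type: str   # e.g. "gitlab"
    instance_url: str  # e.g. "https://gitlab.com"


def build_coverage_sections(listers, base_path="/coverage"):
    """Group lister instances by type, one section per lister type."""
    grouped = defaultdict(list)
    for lister in listers:
        grouped[lister.lister_type].append(lister.instance_url)

    sections = []
    for lister_type in sorted(grouped):
        sections.append({
            "lister_type": lister_type,
            # Link target for the logo on the main page (assumed URL scheme).
            "anchor": f"{base_path}/{lister_type}",
            # One table row per instance, sorted for a stable display order.
            "instances": sorted(grouped[lister_type]),
        })
    return sections
```

A template would then render one heading per section and one table row per instance, and the hand-curated logos on the main page would link to each section's anchor.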

I fully support this last proposal; it makes total sense.
I would like to see an API entrypoint that provides the information that will go in archive.s.o/coverage.
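As a rough idea of what such an entrypoint might return, the sketch below serializes a mapping from lister type to instance URLs into JSON. The function name and the response shape (a top-level `listers` array with `type` and `instances` keys) are assumptions for discussion, not an existing API.

```python
import json


def coverage_api_payload(instances_by_type):
    """Serialize {lister type: [instance URLs]} to a JSON string.

    The response shape is hypothetical and only meant as a starting point.
    """
    payload = {
        "listers": [
            {"type": lister_type, "instances": sorted(urls)}
            for lister_type, urls in sorted(instances_by_type.items())
        ]
    }
    return json.dumps(payload, indent=2)
```

Having the coverage page and the API share the same underlying data would keep the two from drifting apart.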

olasd mentioned this in T2442: Provide a unified API for listers to interact with the scheduler. Jun 9 2020, 4:15 PM

zack mentioned this in T2468: add to archive coverage page a breakdown of the number of origins per lister [instance]. Jun 27 2020, 1:24 PM

anlambert mentioned this in T2640: Add link from the main archive to the Bitbucket mercurial case (https://bitbucket-archive.softwareheritage.org/). Sep 24 2020, 11:20 AM

KShivendu added a subscriber: KShivendu. Mar 13 2021, 1:41 PM

vlorentz assigned this task to anlambert. Apr 23 2021, 4:53 PM

anlambert added a revision: D6004: misc/coverage: Revamp and improve archive coverage widget. Aug 27 2021, 3:04 PM

anlambert added a commit: rDWAPPSd0335365a461: misc/coverage: Revamp and improve archive coverage widget in homepage. Sep 2 2021, 3:58 PM

Closing this as resolved: the archive coverage widget has been updated accordingly and deployed.

This task has been migrated to GitLab.
