Feed All Stories

Oct 7 2021

anlambert added a comment to T3608: Deprecate most of the /browse/origin/.* URLs.

Just to be clear, you're looking to keep these URLs working, but turn them into redirects over to swhid-centric URLs with context parameters (and drop the original view code from these URLs), correct?

Oct 7 2021, 11:08 AM · Web app
zack added 1 blocking reviewer(s) for D6401: Filter out pull request related branches: zack.

This should stay pending until we resolve the archiving policy discussion in T3627, so I'm marking it as such.

Oct 7 2021, 10:57 AM
zack added a comment to T3627: Consider dropping pull request references from the git loader ingestion.

Thanks for your feedback @olasd. I see three main arguments raised there: (1) the raciness of archiving those data via other means (= related forks), (2) the completeness of our canvassing of synthetic refs, (3) annotating rather than not archiving "synthetic" refs.

Oct 7 2021, 10:54 AM · Git loader
olasd added a comment to T3608: Deprecate most of the /browse/origin/.* URLs.

Awesome, thanks for confirming this!

Oct 7 2021, 10:54 AM · Web app
olasd committed rSPSITE6a233452cd48: Drop netdev ignored device from prometheus config (authored by olasd).
Drop netdev ignored device from prometheus config
Oct 7 2021, 10:50 AM
olasd committed rSPSITE09689dd703c7: Add missing newline to data/common/common.yaml (authored by olasd).
Add missing newline to data/common/common.yaml
Oct 7 2021, 10:50 AM
jayeshv added a comment to T3608: Deprecate most of the /browse/origin/.* URLs.
In T3608#71803, @olasd wrote:

I'm asking this because using predictable origin-centric URLs is generally much more user friendly than having to use multiple APIs to look up the SWHID of a given object before being able to construct the URL, and one would always have to do dynamic API calls to generate the URL for browsing the "latest archival" of a given origin.

For instance, the "archived origin" SWH badge https://www.softwareheritage.org/2020/01/13/the-swh-badges-are-here/ uses an origin-centric URL.

Oct 7 2021, 10:48 AM · Web app
jayeshv added a revision to T3608: Deprecate most of the /browse/origin/.* URLs: D6430: WIP - Deprecate most of the /browse/origin/.* URLs.
Oct 7 2021, 10:45 AM · Web app
jayeshv added a comment to T3608: Deprecate most of the /browse/origin/.* URLs.
In T3608#71802, @olasd wrote:

Just to be clear, you're looking to keep these URLs working, but turn them into redirects over to swhid-centric URLs with context parameters (and drop the original view code from these URLs), correct?

Oct 7 2021, 10:40 AM · Web app
vlorentz requested review of D6429: docs: Update for the new API + remove references to deprecated module swh.model.identifiers.
Oct 7 2021, 10:32 AM
vlorentz closed D6420: Rename imports of swh.model.identifiers to fix deprecation warnings..
Oct 7 2021, 10:29 AM
vlorentz committed rDVAU6fd1f523e394: Rename imports of swh.model.identifiers to fix deprecation warnings. (authored by vlorentz).
Rename imports of swh.model.identifiers to fix deprecation warnings.
Oct 7 2021, 10:29 AM
olasd added a comment to T3608: Deprecate most of the /browse/origin/.* URLs.

I'm asking this because using predictable origin-centric URLs is generally much more user friendly than having to use multiple APIs to look up the SWHID of a given object before being able to construct the URL, and one would always have to do dynamic API calls to generate the URL for browsing the "latest archival" of a given origin.

Oct 7 2021, 10:22 AM · Web app
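
To make the contrast in the comment above concrete, here is a minimal sketch of the two URL styles. The archive host, the `/browse/origin/directory/` path, the `;origin=` qualifier syntax and both function names are assumptions chosen for illustration, not taken from this thread. The point is only that the origin-centric URL can be built from the origin URL alone, while the SWHID-centric one needs an identifier that must first be looked up.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed base URL of the archive web app (illustrative).
ARCHIVE = "https://archive.softwareheritage.org"


def origin_browse_url(origin_url: str) -> str:
    """Build an origin-centric URL: predictable from the origin URL alone."""
    query = urlencode({"origin_url": origin_url})
    return f"{ARCHIVE}/browse/origin/directory/?{query}"


def swhid_browse_url(core_swhid: str, origin_url: Optional[str] = None) -> str:
    """Build a SWHID-centric URL: the core SWHID must already be known,
    typically from a prior API lookup. The origin is attached as a
    context qualifier when provided."""
    qualified = core_swhid
    if origin_url is not None:
        qualified += f";origin={origin_url}"
    return f"{ARCHIVE}/{qualified}"
```

With this shape, a badge or a bookmark can call `origin_browse_url` with nothing but the repository URL, whereas `swhid_browse_url` requires resolving the latest snapshot or directory identifier first.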
olasd added a comment to T3608: Deprecate most of the /browse/origin/.* URLs.

Just to be clear, you're looking to keep these URLs working, but turn them into redirects over to swhid-centric URLs with context parameters (and drop the original view code from these URLs), correct?

Oct 7 2021, 10:19 AM · Web app
vlorentz closed D6413: Rename imports of swh.model.identifiers to fix deprecation warnings..
Oct 7 2021, 10:16 AM
vlorentz committed rDDEP179dff0dfaa4: Rename imports of swh.model.identifiers to fix deprecation warnings. (authored by vlorentz).
Rename imports of swh.model.identifiers to fix deprecation warnings.
Oct 7 2021, 10:16 AM
vlorentz closed D6418: test_utils.py: Remove incorrect mocks.
Oct 7 2021, 10:16 AM
vlorentz committed rDDEP915c877fd7df: test_utils.py: Remove incorrect mocks (authored by vlorentz).
test_utils.py: Remove incorrect mocks
Oct 7 2021, 10:16 AM
jayeshv added a comment to T3608: Deprecate most of the /browse/origin/.* URLs.

@anlambert do you think we can deprecate the following routes as well? I think they can be redirected to the corresponding swh/web/browse/views/<object_type>.py routes.

Oct 7 2021, 10:13 AM · Web app
olasd added a comment to T3625: Reduce git loader memory footprint.

While we're at it, we should probably be adding some thresholds in the buffer proxy for:

  • cumulated length of messages for revisions and releases
  • cumulated number of parents for revisions
Oct 7 2021, 10:11 AM · Git loader
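
The thresholds suggested in the comment above could look roughly like the following sketch. It is not the swh.storage buffer proxy: `Revision`, `RevisionBuffer`, the `flush` callback and the default limits are all hypothetical names and values chosen for illustration. The buffer flushes as soon as any one of three limits is reached: object count, cumulated message length, or cumulated number of parents.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Revision:
    # Hypothetical, simplified stand-in for a revision object.
    message: bytes
    parents: Tuple[bytes, ...] = ()


@dataclass
class RevisionBuffer:
    # Callback receiving each flushed batch (e.g. a storage "add" call).
    flush: Callable[[List[Revision]], None]
    # Illustrative limits; real values would need tuning.
    max_count: int = 1000
    max_message_bytes: int = 1_000_000
    max_parents: int = 10_000
    _items: List[Revision] = field(default_factory=list)
    _message_bytes: int = 0
    _parents: int = 0

    def add(self, rev: Revision) -> None:
        """Buffer one revision, flushing when any threshold is reached."""
        self._items.append(rev)
        self._message_bytes += len(rev.message)
        self._parents += len(rev.parents)
        if (
            len(self._items) >= self.max_count
            or self._message_bytes >= self.max_message_bytes
            or self._parents >= self.max_parents
        ):
            self.flush_now()

    def flush_now(self) -> None:
        """Send the pending batch, if any, and reset all counters."""
        if self._items:
            self.flush(self._items)
        self._items = []
        self._message_bytes = 0
        self._parents = 0
```

Tracking cumulated sizes rather than only the object count bounds the memory held by a batch even when a few objects are unusually large.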
olasd added a comment to T3625: Reduce git loader memory footprint.

(this also matches the fact that we've seen, on our main ingestion database, directory_add operations that would take multiple hours, and have knock-on effects on backups and replications because of the long-running insertion transactions)

Oct 7 2021, 10:09 AM · Git loader
olasd added a comment to T3625: Reduce git loader memory footprint.

So, after doing some more analysis of memory usage patterns on these edge case repositories, my suspicion is that the high memory usage is generally being caused by the loader processing batches of large directories, closely packed together, at the same time.

Oct 7 2021, 10:08 AM · Git loader
olasd requested changes to D6401: Filter out pull request related branches.

This should stay pending until we resolve the archiving policy discussion in T3627, so I'm marking it as such.

Oct 7 2021, 9:57 AM
olasd added a comment to T3627: Consider dropping pull request references from the git loader ingestion.

Yes, we must filter this stuff out (we discussed this issue with @zack some time ago)

Oct 7 2021, 9:53 AM · Git loader
olasd added a comment to D6405: Respect task configuration to allow ignoring task result event.

This looks like an okay thing to do, but instead of only ignoring results (which would only cut down a third of the messages), we should probably be deactivating events completely on these workers.

Yes, I started with that config because I did not initially find a way to configure send_events to False (or something).

Oct 7 2021, 9:49 AM
ardumont added a comment to T3338: Load the archived bitbucket mercurial repositories.

A first run of bitbucket origins has been scheduled and is mostly ingested now [1]
(only 13 large ones are still ongoing).

Oct 7 2021, 9:42 AM · System administration, Mercurial loader
douardda accepted D6401: Filter out pull request related branches.

LGTM

Oct 7 2021, 9:32 AM
olasd added a revision to T3625: Reduce git loader memory footprint: D6427: swh.storage filter/buffer improvements.
Oct 7 2021, 9:23 AM · Git loader
ardumont updated the summary of D6428: docs: Add a save forge documentation.
Oct 7 2021, 8:36 AM

Oct 6 2021

rdicosmo added a comment to T3627: Consider dropping pull request references from the git loader ingestion.

Yes, we must filter this stuff out (we discussed this issue with @zack some time ago, and you may see Torvalds' opinion too https://www.zdnet.com/article/linux-boosts-microsoft-ntfs-support-as-linus-torvalds-complains-about-github-merges/ )

Oct 6 2021, 10:23 PM · Git loader
dachary requested review of D6424: Perfect hashmap C implementation.
Oct 6 2021, 7:13 PM
Harbormaster failed to build B24262: rDWAPPS6c72612902a6: settings/tests: ensure sqlite db file has sqlite3 extension for rDWAPPS6c72612902a6: settings/tests: ensure sqlite db file has sqlite3 extension!
Oct 6 2021, 7:10 PM
olasd committed rCDFJ9486c2dd1559: Just run a full apt dist-upgrade first (authored by olasd).
Just run a full apt dist-upgrade first
Oct 6 2021, 7:00 PM
olasd committed rCDFJ97ec136332eb: base-buster: Make sure to upgrade apt and dpkg first (authored by olasd).
base-buster: Make sure to upgrade apt and dpkg first
Oct 6 2021, 6:57 PM
olasd committed rCDFJ8aa3af08f8b7: base-buster: add libcmph-dev for swh.perfecthash (authored by olasd).
base-buster: add libcmph-dev for swh.perfecthash
Oct 6 2021, 6:50 PM
olasd accepted D6420: Rename imports of swh.model.identifiers to fix deprecation warnings..

Looks fine (i.e. the identifiers DeprecationWarnings are gone in tox, except for one that gets triggered by some pytest internal assertion rewrite).

Oct 6 2021, 6:47 PM
ardumont requested review of D6428: docs: Add a save forge documentation.
Oct 6 2021, 6:46 PM
ardumont added a revision to T3629: doc: Add a "how to save a forge" as in how it's currently done: D6428: docs: Add a save forge documentation.
Oct 6 2021, 6:43 PM · Documentation
ardumont requested review of D6426: docs: Rename run a new lister doc into develop a new lister.
Oct 6 2021, 6:36 PM
ardumont added a revision to T3629: doc: Add a "how to save a forge" as in how it's currently done: D6426: docs: Rename run a new lister doc into develop a new lister.
Oct 6 2021, 6:33 PM · Documentation
ardumont added a parent task for T3629: doc: Add a "how to save a forge" as in how it's currently done: T1538: Add "forge" now.
Oct 6 2021, 6:24 PM · Documentation
ardumont added a subtask for T1538: Add "forge" now: T3629: doc: Add a "how to save a forge" as in how it's currently done.
Oct 6 2021, 6:24 PM · Add Forge Now , Roadmap 2022, meta-task, Roadmap 2021
vsellier closed T3615: Adapt rabbitmq monitoring for bullseye, a subtask of T3487: Installation of the new provenance server, as Resolved.
Oct 6 2021, 6:19 PM · System administration
vsellier closed T3615: Adapt rabbitmq monitoring for bullseye as Resolved.
Oct 6 2021, 6:19 PM · System administration
vsellier closed D6367: Adapt the prometheus rabbitmq plugin for bullseye.
Oct 6 2021, 6:13 PM · System administration
vsellier committed rSPSITE6277c45abc11: Adapt the prometheus rabbitmq plugin for bullseye (authored by ardumont).
Adapt the prometheus rabbitmq plugin for bullseye
Oct 6 2021, 6:13 PM
vsellier updated the diff for D6367: Adapt the prometheus rabbitmq plugin for bullseye.

update commit message

Oct 6 2021, 6:11 PM · System administration
ardumont accepted D6425: origin_save: Lift save request creation restrictions with permission.

jsyk, that's the kind of slight adaptation we'll want to add in the webapp so we can let
yannick access the deposit moderation view at some point. Create a new role, assign that
role to specific users (in keycloak) and slightly adapt the webapp code to check the
logged in user has the proper role to let them access. (I don't recall the task id if
there is one ;)

Oct 6 2021, 6:05 PM
vsellier changed the status of T3621: Create a production read-only objstorage from Open to Work in Progress.
Oct 6 2021, 6:02 PM · System administration
ardumont accepted D6367: Adapt the prometheus rabbitmq plugin for bullseye.

lgtm

Oct 6 2021, 5:58 PM · System administration
vsellier retitled D6367: Adapt the prometheus rabbitmq plugin for bullseye from wip: Adapt the prometheus rabbitmq plugin for bullseye to Adapt the prometheus rabbitmq plugin for bullseye.
Oct 6 2021, 5:48 PM · System administration
Harbormaster failed to build B24262: rDWAPPS6c72612902a6: settings/tests: ensure sqlite db file has sqlite3 extension for rDWAPPS6c72612902a6: settings/tests: ensure sqlite db file has sqlite3 extension!
Oct 6 2021, 5:47 PM
vsellier updated the diff for D6367: Adapt the prometheus rabbitmq plugin for bullseye.
  • factorize the exported configuration
  • use the right exporter port on met
Oct 6 2021, 5:45 PM · System administration
anlambert requested review of D6425: origin_save: Lift save request creation restrictions with permission.
Oct 6 2021, 5:39 PM
anlambert added a project to T3286: Use journal clients for webapp and deposit to subscribe to events: Save Code Now.
Oct 6 2021, 5:14 PM · Save Code Now, SWORD deposit, Web app
olasd added a comment to D6408: Stop sending next-gen scheduled task results to scheduler listener.

Rather than doing this, we should probably disable worker task events altogether (that is, run celery worker without the --events/--task-events flag)

Oct 6 2021, 4:51 PM
ardumont added a comment to D6405: Respect task configuration to allow ignoring task result event.

This looks like an okay thing to do, but instead of only ignoring results (which would only cut down a third of the messages), we should probably be deactivating events completely on these workers.

Oct 6 2021, 4:47 PM
zack added a reviewer for D6401: Filter out pull request related branches: zack.
Oct 6 2021, 4:45 PM
ardumont accepted D6420: Rename imports of swh.model.identifiers to fix deprecation warnings..
Oct 6 2021, 4:44 PM
ardumont accepted D6413: Rename imports of swh.model.identifiers to fix deprecation warnings..
Oct 6 2021, 4:43 PM
Harbormaster failed to build B24262: rDWAPPS6c72612902a6: settings/tests: ensure sqlite db file has sqlite3 extension for rDWAPPS6c72612902a6: settings/tests: ensure sqlite db file has sqlite3 extension!
Oct 6 2021, 4:42 PM
Harbormaster failed to build B24261: rDWAPPScc60a4a5c88e: assets, cypress: Remove debug logs for rDWAPPScc60a4a5c88e: assets, cypress: Remove debug logs!
Oct 6 2021, 4:42 PM
Harbormaster failed to build B24260: rDWAPPSc8d26ac59080: package.json: Upgrade dependencies for rDWAPPSc8d26ac59080: package.json: Upgrade dependencies!
Oct 6 2021, 4:42 PM
olasd accepted D6405: Respect task configuration to allow ignoring task result event.

This looks like an okay thing to do, but instead of only ignoring results (which would only cut down a third of the messages), we should probably be deactivating events completely on these workers.

Oct 6 2021, 4:41 PM
olasd committed rDENV5003c8e917b6: Add swh-perfecthash (authored by olasd).
Add swh-perfecthash
Oct 6 2021, 4:38 PM
ardumont accepted D6418: test_utils.py: Remove incorrect mocks.

I absolutely do not remember what those are.

Oct 6 2021, 4:38 PM
ardumont accepted D6417: setup: fix 404 url to diffusion.
Oct 6 2021, 4:36 PM
anlambert committed rDWAPPS6c72612902a6: settings/tests: ensure sqlite db file has sqlite3 extension (authored by anlambert).
settings/tests: ensure sqlite db file has sqlite3 extension
Oct 6 2021, 4:33 PM
anlambert committed rDWAPPScc60a4a5c88e: assets, cypress: Remove debug logs (authored by anlambert).
assets, cypress: Remove debug logs
Oct 6 2021, 4:33 PM
anlambert committed rDWAPPSc8d26ac59080: package.json: Upgrade dependencies (authored by anlambert).
package.json: Upgrade dependencies
Oct 6 2021, 4:33 PM
anlambert closed D6422: assets/origin/bundles: Restore visualizations state on page reload.
Oct 6 2021, 4:33 PM
anlambert committed rDWAPPS2bcf33c2e527: assets/origin/bundles: Restore visualizations state on page reload (authored by anlambert).
assets/origin/bundles: Restore visualizations state on page reload
Oct 6 2021, 4:33 PM
anlambert closed D6421: assets/bundles/origin: Add missing statuses to visits reporting.
Oct 6 2021, 4:33 PM
anlambert committed rDWAPPSd394d9452dbd: assets/bundles/origin: Add missing statuses to visits reporting (authored by anlambert).
assets/bundles/origin: Add missing statuses to visits reporting
Oct 6 2021, 4:33 PM
dachary added a comment to T3634: Create swh-perfecthash module.

@olasd these are the failed dependencies you told me to expect, right? The missing package is ... libcmph-dev.

Oct 6 2021, 4:30 PM · Object storage
dachary added a revision to T3634: Create swh-perfecthash module: D6424: Perfect hashmap C implementation.
Oct 6 2021, 4:25 PM · Object storage
vsellier updated the diff for D6367: Adapt the prometheus rabbitmq plugin for bullseye.

rebase

Oct 6 2021, 4:14 PM · System administration
vsellier commandeered D6367: Adapt the prometheus rabbitmq plugin for bullseye.
Oct 6 2021, 4:13 PM · System administration
olasd updated the task description for T3634: Create swh-perfecthash module.
Oct 6 2021, 3:58 PM · Object storage
olasd updated the task description for T3634: Create swh-perfecthash module.
Oct 6 2021, 3:55 PM · Object storage
olasd committed rCJSWH337d62d50e2a: Add swh-perfecthash (authored by olasd).
Add swh-perfecthash
Oct 6 2021, 3:54 PM
olasd changed the status of T3634: Create swh-perfecthash module, a subtask of T3104: Persistent readonly perfect hash table, from Open to Work in Progress.
Oct 6 2021, 3:52 PM · Object storage (RedHat collaboration)
olasd changed the status of T3634: Create swh-perfecthash module from Open to Work in Progress.
Oct 6 2021, 3:52 PM · Object storage
Harbormaster failed to build B24255: rDOPH8229acfd1692: import template from swh-py-template (init-py-repo) for rDOPH8229acfd1692: import template from swh-py-template (init-py-repo)!
Oct 6 2021, 3:51 PM
olasd committed rDOPH8229acfd1692: import template from swh-py-template (init-py-repo) (authored by olasd).
import template from swh-py-template (init-py-repo)
Oct 6 2021, 3:51 PM
vlorentz accepted D6422: assets/origin/bundles: Restore visualizations state on page reload.

ahah, nice!

Oct 6 2021, 3:47 PM
vlorentz accepted D6421: assets/bundles/origin: Add missing statuses to visits reporting.

thanks!

Oct 6 2021, 3:47 PM
olasd triaged T3634: Create swh-perfecthash module as Normal priority.
Oct 6 2021, 3:38 PM · Object storage
dachary added a comment to T3104: Persistent readonly perfect hash table.

I'd like to create a new package (swh-objstorage-hash) and https://docs.softwareheritage.org/devel/tutorials/add-new-package.html is presumably the guide to do that. However, I do not have the required permissions: would someone be so kind as to work with me on this?

Oct 6 2021, 3:32 PM · Object storage (RedHat collaboration)
anlambert requested review of D6422: assets/origin/bundles: Restore visualizations state on page reload.
Oct 6 2021, 3:24 PM
anlambert requested review of D6421: assets/bundles/origin: Add missing statuses to visits reporting.
Oct 6 2021, 3:21 PM
douardda added a comment to T3627: Consider dropping pull request references from the git loader ingestion.

FTR without D6401, the packfile received from GH for the CocoaPods/Specs repo contains 21162 references, 21146 of which start with refs/pull/ and 7126 of which end with /merge (even though those have explicitly not been requested, thanks to the filtering in RepoRepresentation.determine_wanted()).
When D6401 is applied, we only get the 20-ish references that are not pull request related.

Oct 6 2021, 2:56 PM · Git loader
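
As a rough illustration of the filtering discussed above, the sketch below drops every reference under `refs/pull/` from a mapping of ref names to targets. It is not the code from D6401: the function name and the `dict` of bytes are assumptions, and a real loader would apply the filter wherever it negotiates which refs to fetch.

```python
from typing import Dict

# Prefix used for pull request references, as cited in the comment above.
PULL_REQUEST_PREFIX = b"refs/pull/"


def drop_pull_request_refs(refs: Dict[bytes, bytes]) -> Dict[bytes, bytes]:
    """Return a copy of `refs` without any pull request related entry."""
    return {
        name: target
        for name, target in refs.items()
        if not name.startswith(PULL_REQUEST_PREFIX)
    }
```

Filtering on the prefix alone removes both the `/head` and the `/merge` variants, with no other conditions attached.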
swh-public-ci added a comment to D6417: setup: fix 404 url to diffusion.

Build is green

Oct 6 2021, 2:56 PM
dachary updated the diff for D6417: setup: fix 404 url to diffusion.

make it more readable as suggested

Oct 6 2021, 2:54 PM
dachary added a comment to D6417: setup: fix 404 url to diffusion.

I agree

Oct 6 2021, 2:51 PM
vsellier closed T3320: Test rancher pros/cons as Resolved.

I think the issue can be closed.
The pros are:

  • it simplifies cluster management (creation, configuration and, most of all, Kubernetes upgrades)
  • it centralizes the global view of the cluster and what is running on it
  • OSS and transparent policy
Oct 6 2021, 2:36 PM · System administration
ardumont added a revision to T3625: Reduce git loader memory footprint: D6419: Add some tracemalloc hooks.
Oct 6 2021, 2:19 PM · Git loader
vlorentz requested review of D6420: Rename imports of swh.model.identifiers to fix deprecation warnings..
Oct 6 2021, 2:18 PM
ardumont updated subscribers of T3627: Consider dropping pull request references from the git loader ingestion.

So I'm actually proposing that we filter out all branches whose names start with refs/pulls (with no other conditions attached).

Oct 6 2021, 2:18 PM · Git loader
vlorentz requested review of D6413: Rename imports of swh.model.identifiers to fix deprecation warnings..
Oct 6 2021, 1:16 PM