Note that the filtering may need to be done at every level: not only origins, but also SWHIDs in general.
A real example is a takedown request for just one specific commit in a repository: we do not want to dereference the rest of the repository.
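
To make the two levels concrete, here is a minimal Python sketch of a blocklist that can hide either a whole origin or a single SWHID. The `Blocklist` class, the `filter_results` helper, and the result dictionaries with `origin` and `swhid` keys are all hypothetical names chosen for illustration; they are not the web app's actual API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Blocklist:
    """Hypothetical blocklist holding blocked origin URLs and blocked SWHIDs."""
    origins: set = field(default_factory=set)
    swhids: set = field(default_factory=set)

    def is_blocked(self, origin: Optional[str] = None,
                   swhid: Optional[str] = None) -> bool:
        # An entry is hidden if either its origin or its own SWHID is listed.
        return (origin in self.origins) or (swhid in self.swhids)


def filter_results(results, blocklist):
    """Drop search/browse results (plain dicts here) that are blocklisted."""
    return [
        r for r in results
        if not blocklist.is_blocked(r.get("origin"), r.get("swhid"))
    ]


# Placeholder identifiers, not real archive objects.
REPO = "https://example.org/repo.git"
COMMIT_A = "swh:1:rev:" + "a" * 40
COMMIT_B = "swh:1:rev:" + "b" * 40

results = [
    {"origin": REPO, "swhid": COMMIT_A},
    {"origin": REPO, "swhid": COMMIT_B},
]

# Takedown of a single commit: the rest of the repository stays visible.
visible = filter_results(results, Blocklist(swhids={COMMIT_A}))
```

Blocking by SWHID leaves the other commit of the same origin in `visible`, whereas putting `REPO` in `origins` would hide both entries.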

Mar 4 2021, 7:25 PM · General, Web app

rdicosmo renamed T1099: support origin and SWHID blocklist for archive search and browse from support origin blacklist for archive search and browse to support origin and SWHID blacklist for archive search and browse.

Mar 4 2021, 7:23 PM · General, Web app

rdicosmo added a parent task for T1099: support origin and SWHID blocklist for archive search and browse: T3087: Implement support for takedown notices (infra, admin tools, workflow).

Mar 4 2021, 7:21 PM · General, Web app

rdicosmo added a subtask for T3087: Implement support for takedown notices (infra, admin tools, workflow): T1099: support origin and SWHID blocklist for archive search and browse.

Mar 4 2021, 7:21 PM · Roadmap 2022, meta-task, Roadmap 2021, Web app

rdicosmo triaged T3087: Implement support for takedown notices (infra, admin tools, workflow) as Normal priority.

Mar 4 2021, 7:21 PM · Roadmap 2022, meta-task, Roadmap 2021, Web app

rdicosmo added a parent task for T1954: Up-to-date objstorage mirror on S3: T3085: Complete and updated copy of the archive on S3 (objects+graph).

Mar 4 2021, 5:50 PM · System administration, Object storage

rdicosmo added a subtask for T3085: Complete and updated copy of the archive on S3 (objects+graph): T1954: Up-to-date objstorage mirror on S3.

Mar 4 2021, 5:50 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage

rdicosmo created T3085: Complete and updated copy of the archive on S3 (objects+graph).

Mar 4 2021, 5:50 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage

rdicosmo merged task T1914: Keep mirror of contents on S3 up to date into T1954: Up-to-date objstorage mirror on S3.

Mar 4 2021, 5:44 PM · Mirror, Datasets

rdicosmo merged T1914: Keep mirror of contents on S3 up to date into T1954: Up-to-date objstorage mirror on S3.

Mar 4 2021, 5:44 PM · System administration, Object storage

rdicosmo triaged T3084: Fast track save code now requests as Normal priority.

Mar 4 2021, 5:36 PM · System administration, Web app

rdicosmo created T3084: Fast track save code now requests.

Mar 4 2021, 5:35 PM · System administration, Web app

rdicosmo added a parent task for T1524: save code now: also add new origins for unknown repos: T3082: Improve Save Code Now handling.

Mar 4 2021, 10:37 AM · Save Code Now, Web app

rdicosmo added a subtask for T3082: Improve Save Code Now handling: T1524: save code now: also add new origins for unknown repos.

Mar 4 2021, 10:37 AM · Save Code Now, meta-task, Roadmap 2021, Web app

rdicosmo added a parent task for T1481: add metric to monitor "save code now" efficiency: T3082: Improve Save Code Now handling.

Mar 4 2021, 10:36 AM · Save Code Now, System administration, Metrics/monitoring

rdicosmo added a subtask for T3082: Improve Save Code Now handling: T1481: add metric to monitor "save code now" efficiency.

Mar 4 2021, 10:36 AM · Save Code Now, meta-task, Roadmap 2021, Web app

rdicosmo added a parent task for T2117: Save Code Now: End to End monitoring: T3082: Improve Save Code Now handling.

Mar 4 2021, 10:35 AM · System administration, Monitoring, Roadmap 2021

rdicosmo added a subtask for T3082: Improve Save Code Now handling: T2117: Save Code Now: End to End monitoring.

Mar 4 2021, 10:35 AM · Save Code Now, meta-task, Roadmap 2021, Web app

rdicosmo added a parent task for T2168: Add a grafana dashboard to monitor "Save code now" requests: T3082: Improve Save Code Now handling.

Mar 4 2021, 10:34 AM · System administration

rdicosmo added a subtask for T3082: Improve Save Code Now handling: T2168: Add a grafana dashboard to monitor "Save code now" requests.

Mar 4 2021, 10:34 AM · Save Code Now, meta-task, Roadmap 2021, Web app

rdicosmo added a parent task for T2750: mercurial loader fails on save code now: T3082: Improve Save Code Now handling.

Mar 4 2021, 10:33 AM · Mercurial loader

rdicosmo added a subtask for T3082: Improve Save Code Now handling: T2750: mercurial loader fails on save code now.

Mar 4 2021, 10:33 AM · Save Code Now, meta-task, Roadmap 2021, Web app

rdicosmo added a parent task for T2891: "Save code now" may fail with "OperationalError: database is locked": T3082: Improve Save Code Now handling.

Mar 4 2021, 10:32 AM · System administration, Web app

rdicosmo added a subtask for T3082: Improve Save Code Now handling: T2891: "Save code now" may fail with "OperationalError: database is locked".

Mar 4 2021, 10:32 AM · Save Code Now, meta-task, Roadmap 2021, Web app

rdicosmo created T3082: Improve Save Code Now handling.

Mar 4 2021, 10:31 AM · Save Code Now, meta-task, Roadmap 2021, Web app

rdicosmo raised the priority of T1538: Add "forge" now from Low to Normal.

Mar 4 2021, 10:22 AM · Add Forge Now, Roadmap 2022, meta-task, Roadmap 2021

rdicosmo merged T2199: "Save Forge Now" into T1538: Add "forge" now.

Mar 4 2021, 10:21 AM · Add Forge Now, Roadmap 2022, meta-task, Roadmap 2021

rdicosmo merged task T2199: "Save Forge Now" into T1538: Add "forge" now.

Mar 4 2021, 10:21 AM · Roadmap 2020

Feb 10 2021

rdicosmo added a comment to T376: ingest git.eclipse.org repositories.

In T376#58337, @ardumont wrote:

Note that does not mean this is or will be ingested anytime soon though.
We are still missing at least the one cog to actually schedule those listed origins.

More details in T2345#58247

Feb 10 2021, 12:31 PM · Archive coverage

Feb 4 2021

rdicosmo added a comment to T2912: Next generation archive counters.

I asked one of the authors of the original HyperLogLog paper (not Philippe, who unfortunately passed away years ago :-()
The original HyperLogLog has three different behaviours: one for small cardinalities, another for medium cardinalities, and a third for very large cardinalities.
There is indeed a risk of breaking monotonicity at the boundaries between segments, but within each segment the estimator is monotonic.
Our counters are already in the "very large cardinality" zone, so we should be safe with any implementation.
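
As a rough illustration of the three regimes, here is a minimal Python sketch of the original HyperLogLog estimator with 32-bit hashes. The class name, the use of SHA-1 as the hash function, and the precision `p=10` are assumptions made for this example; it is not the implementation used for the archive counters.

```python
import hashlib
import math


class HyperLogLog:
    """Sketch of the original HyperLogLog estimator (32-bit hash values)."""

    def __init__(self, p: int = 10):
        self.p = p                 # number of index bits
        self.m = 1 << p            # number of registers (needs m >= 128 for alpha)
        self.registers = [0] * self.m

    def add(self, item: bytes) -> None:
        h = int.from_bytes(hashlib.sha1(item).digest()[:4], "big")
        idx = h >> (32 - self.p)                  # first p bits pick a register
        rest = h & ((1 << (32 - self.p)) - 1)     # remaining 32 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (32 - self.p) - rest.bit_length() + 1
        # Registers only ever grow, so re-adding an item changes nothing.
        self.registers[idx] = max(self.registers[idx], rank)

    def raw_estimate(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)
        return alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)

    def estimate(self) -> float:
        raw = self.raw_estimate()
        two32 = float(1 << 32)
        if raw <= 2.5 * self.m:
            # Small range: fall back to linear counting on empty registers.
            zeros = self.registers.count(0)
            return self.m * math.log(self.m / zeros) if zeros else raw
        if raw > two32 / 30:
            # Very large range: correct for 32-bit hash collisions
            # (this sketch assumes raw stays below 2**32).
            return -two32 * math.log(1 - raw / two32)
        return raw                 # Intermediate range: no correction.
```

Because registers can only increase, the raw estimate never decreases as items are added, and each corrected formula is monotonic on its own; a downward jump can only appear where `estimate` switches from one formula to the next.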

Feb 4 2021, 10:31 PM · Roadmap 2021, System administration, Monitoring, Web app

Feb 3 2021

rdicosmo added a comment to T2912: Next generation archive counters.

In T2912#58063, @zack wrote:

In T2912#58062, @rdicosmo wrote:

Thanks @vsellier, that seems quite ok indeed. The only question left is to know if the estimator implemented is monotonic (i.e. we will never have negative bumps in the graph :-))

may I suggest (for reasons discussed in the past) to just remove the graphs from the main archive.s.o page

We decided to keep the counters.

Feb 3 2021, 4:21 PM · Roadmap 2021, System administration, Monitoring, Web app

rdicosmo added a comment to T2912: Next generation archive counters.

Thanks @vsellier, that seems quite ok indeed. The only question left is to know if the estimator implemented is monotonic (i.e. we will never have negative bumps in the graph :-))

Feb 3 2021, 4:01 PM · Roadmap 2021, System administration, Monitoring, Web app

Feb 2 2021

rdicosmo closed D4977: Update persistent identifiers doc with pip install info.

Feb 2 2021, 1:55 PM

rdicosmo committed rDMOD0c1658128322: Make explicit Python 3 dependency (authored by rdicosmo).

Make explicit Python 3 dependency

Feb 2 2021, 1:55 PM

rdicosmo committed rDMOD1bfdf717c55a: Update persistent identifiers doc with pip install info (authored by rdicosmo).

Update persistent identifiers doc with pip install info

Feb 2 2021, 1:55 PM

rdicosmo updated the diff for D4977: Update persistent identifiers doc with pip install info.

Make explicit Python3 dependency

Feb 2 2021, 1:52 PM

Feb 1 2021

rdicosmo added a comment to T376: ingest git.eclipse.org repositories.

In T376#57824, @rdicosmo wrote:

Thanks @ardumont, that's great! If you think this does not need any more support on the Eclipse side, could you let them know?

Feb 1 2021, 5:59 PM · Archive coverage

rdicosmo added a comment to T376: ingest git.eclipse.org repositories.

Thanks @ardumont, that's great! If you think this does not need any more support on the Eclipse side, could you let them know?

Feb 1 2021, 5:58 PM · Archive coverage

rdicosmo moved T2952: Ambassador programme at SWH Wikipedia page from Restricted Project Column to Restricted Project Column on the Unknown Object (Project) board.

Feb 1 2021, 9:51 AM · Unknown Object (Project)

Jan 31 2021

rdicosmo moved T2943: Contribute a research highlight on CIG (geodynamics.org) from Restricted Project Column to Restricted Project Column on the Unknown Object (Project) board.

Jan 31 2021, 1:05 PM · Unknown Object (Project)

Jan 30 2021

rdicosmo requested review of D4977: Update persistent identifiers doc with pip install info.

Jan 30 2021, 12:56 AM

Jan 29 2021

rdicosmo added a comment to T2912: Next generation archive counters.

In T2912#57643, @vlorentz wrote:

I don't think this solves the issue of overestimating the number of objects, when two threads insert the same objects at the same time.

In T2912#57655, @vsellier wrote:

I'm not sure to understand,

Jan 29 2021, 1:10 PM · Roadmap 2021, System administration, Monitoring, Web app

rdicosmo added a comment to T376: ingest git.eclipse.org repositories.

Thanks @ardumont for experimenting with this. The 500 seems normal: we need to tell Eclipse about us first, I'll put you in touch. So maybe it's still a no-brainer, and we just need to document the "contact the owner to get whitelisted" human step :-)

Jan 29 2021, 10:04 AM · Archive coverage

Jan 28 2021

rdicosmo added a comment to T2912: Next generation archive counters.

Bloom filters are still on the table for other use cases, like testing super quickly for contents that we do not have, but if nobody has strong objections, this seems the way to go for the counters (very small footprint, small under/over counting errors, thanks to Philippe Flajolet's magic :-))
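
For the "contents that we do not have" use case, a minimal Bloom filter sketch in Python could look as follows. The class name, the filter size, the number of hash functions, and the SHA-256 based double hashing are all assumptions for illustration, not an existing archive component.

```python
import hashlib


class BloomFilter:
    """Sketch of a Bloom filter: no false negatives, some false positives."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        assert size_bits % 8 == 0, "size must be a whole number of bytes"
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions from two 64-bit halves of one digest
        # (double hashing); the second value is forced odd so it is non-zero.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.k)]

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(item)
        )


known = BloomFilter()
for i in range(1000):
    known.add(f"content-{i}".encode())
```

A `False` answer from `might_contain` is definitive, which is what makes the structure suitable for quickly ruling out contents that are not in the archive; a `True` answer still needs a real lookup.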

Jan 28 2021, 7:27 PM · Roadmap 2021, System administration, Monitoring, Web app