Mar 8 2021

rdicosmo updated the task description for T2204: Full-text search on source code (prototype).
Mar 8 2021, 10:50 AM · Roadmap 2021
rdicosmo added a parent task for T2220: swh-graph in production: T2204: Full-text search on source code (prototype).
Mar 8 2021, 10:47 AM · Roadmap 2022, meta-task, Roadmap 2021, Compressed graph service
rdicosmo added a subtask for T2204: Full-text search on source code (prototype): T2220: swh-graph in production.
Mar 8 2021, 10:47 AM · Roadmap 2021
rdicosmo lowered the priority of T2204: Full-text search on source code (prototype) from Normal to Wishlist.
Mar 8 2021, 10:46 AM · Roadmap 2021
rdicosmo edited projects for T2204: Full-text search on source code (prototype), added: Roadmap 2021; removed Roadmap 2020.
Mar 8 2021, 10:46 AM · Roadmap 2021
rdicosmo updated the task description for T3097: Expose metadata in the WebApp and make it searchable.
Mar 8 2021, 10:44 AM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task
rdicosmo added a parent task for T2270: Add to intrinsic metadata files to be indexed: AUTHORS and CONTRIBUTORS: T2064: Add metadata from deposits to metadata search.
Mar 8 2021, 10:34 AM · Intrinsic metadata, Indexer
rdicosmo added a subtask for T2064: Add metadata from deposits to metadata search: T2270: Add to intrinsic metadata files to be indexed: AUTHORS and CONTRIBUTORS.
Mar 8 2021, 10:34 AM · Metadata workflow
rdicosmo added a parent task for T2073: Index extrinsic metadata from the journal in swh-search/Elasticsearch: T3097: Expose metadata in the WebApp and make it searchable.
Mar 8 2021, 10:33 AM · Archive search, Metadata workflow
rdicosmo added a parent task for T2088: Specify and draw metadata view on web-app: T3097: Expose metadata in the WebApp and make it searchable.
Mar 8 2021, 10:33 AM · Roadmap 2020, Web app, Metadata workflow
rdicosmo added a parent task for T2191: Metadata Views: T3097: Expose metadata in the WebApp and make it searchable.
Mar 8 2021, 10:33 AM · Metadata workflow, Web app, Roadmap 2020
rdicosmo added a parent task for T2938: Create API endpoint to access raw_extrinsic_metadata: T3097: Expose metadata in the WebApp and make it searchable.
Mar 8 2021, 10:33 AM · Web app, Metadata workflow
rdicosmo added subtasks for T3097: Expose metadata in the WebApp and make it searchable: T2073: Index extrinsic metadata from the journal in swh-search/Elasticsearch, T2938: Create API endpoint to access raw_extrinsic_metadata, T2088: Specify and draw metadata view on web-app, T2191: Metadata Views.
Mar 8 2021, 10:33 AM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task
rdicosmo created T3097: Expose metadata in the WebApp and make it searchable.
Mar 8 2021, 10:31 AM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task
rdicosmo updated the task description for T2194: Archive Integration (Web API).
Mar 8 2021, 10:20 AM · Roadmap 2021, meta-task
rdicosmo added a parent task for T1510: Have a look at openAPI and decide whether we want to follow these specs: T1805: Public API v2.
Mar 8 2021, 10:19 AM · Web app
rdicosmo added a subtask for T1805: Public API v2: T1510: Have a look at openAPI and decide whether we want to follow these specs.
Mar 8 2021, 10:19 AM · meta-task, Web app
rdicosmo edited projects for T2194: Archive Integration (Web API), added: Roadmap 2021; removed Roadmap 2020.
Mar 8 2021, 10:17 AM · Roadmap 2021, meta-task
rdicosmo added a project to T3085: Complete and updated copy of the archive on S3 (objects+graph): meta-task.
Mar 8 2021, 10:13 AM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
rdicosmo updated the task description for T1538: Add "forge" now.
Mar 8 2021, 10:13 AM · Add Forge Now, Roadmap 2022, meta-task, Roadmap 2021
rdicosmo updated the task description for T2220: swh-graph in production.
Mar 8 2021, 10:12 AM · Roadmap 2022, meta-task, Roadmap 2021, Compressed graph service
rdicosmo added a project to T3082: Improve Save Code Now handling: meta-task.
Mar 8 2021, 10:11 AM · Save Code Now, meta-task, Roadmap 2021, Web app
rdicosmo added a project to T3087: Implement support for takedown notices (infra, admin tools, workflow): meta-task.
Mar 8 2021, 10:11 AM · Roadmap 2022, meta-task, Roadmap 2021, Web app
rdicosmo added a project to T3096: Efficient and reliable download via the Vault: meta-task.
Mar 8 2021, 10:11 AM · meta-task, Roadmap 2021, Vault
rdicosmo merged task T2195: Web API 2 into T1805: Public API v2.
Mar 8 2021, 10:10 AM · Roadmap 2020
rdicosmo merged T2195: Web API 2 into T1805: Public API v2.
Mar 8 2021, 10:10 AM · meta-task, Web app
rdicosmo updated the task description for T3096: Efficient and reliable download via the Vault.
Mar 8 2021, 10:09 AM · meta-task, Roadmap 2021, Vault
rdicosmo updated the task description for T3096: Efficient and reliable download via the Vault.
Mar 8 2021, 10:08 AM · meta-task, Roadmap 2021, Vault
rdicosmo added a parent task for T2220: swh-graph in production: T3096: Efficient and reliable download via the Vault.
Mar 8 2021, 10:08 AM · Roadmap 2022, meta-task, Roadmap 2021, Compressed graph service
rdicosmo added a subtask for T3096: Efficient and reliable download via the Vault: T2220: swh-graph in production.
Mar 8 2021, 10:08 AM · meta-task, Roadmap 2021, Vault
rdicosmo updated the task description for T3096: Efficient and reliable download via the Vault.
Mar 8 2021, 10:07 AM · meta-task, Roadmap 2021, Vault
rdicosmo added a project to T3096: Efficient and reliable download via the Vault: Roadmap 2021.
Mar 8 2021, 10:05 AM · meta-task, Roadmap 2021, Vault
rdicosmo created T3096: Efficient and reliable download via the Vault.
Mar 8 2021, 10:04 AM · meta-task, Roadmap 2021, Vault
rdicosmo removed a project from T2220: swh-graph in production: Roadmap 2020.
Mar 8 2021, 9:57 AM · Roadmap 2022, meta-task, Roadmap 2021, Compressed graph service
rdicosmo moved T3085: Complete and updated copy of the archive on S3 (objects+graph) from Backlog to Work in progress on the Roadmap 2021 board.
Mar 8 2021, 9:57 AM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
rdicosmo moved T3085: Complete and updated copy of the archive on S3 (objects+graph) from Done to Backlog on the Roadmap 2021 board.
Mar 8 2021, 9:57 AM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
rdicosmo moved T3085: Complete and updated copy of the archive on S3 (objects+graph) from Backlog to Done on the Roadmap 2021 board.
Mar 8 2021, 9:56 AM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
rdicosmo added a project to T2220: swh-graph in production: Roadmap 2021.
Mar 8 2021, 9:52 AM · Roadmap 2022, meta-task, Roadmap 2021, Compressed graph service
rdicosmo added a project to T3085: Complete and updated copy of the archive on S3 (objects+graph): Roadmap 2021.
Mar 8 2021, 9:49 AM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
rdicosmo added a parent task for T1743: create a nice landing web page for exported dataset: T3085: Complete and updated copy of the archive on S3 (objects+graph).
Mar 8 2021, 9:49 AM · Datasets
rdicosmo added a subtask for T3085: Complete and updated copy of the archive on S3 (objects+graph): T1743: create a nice landing web page for exported dataset.
Mar 8 2021, 9:49 AM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
rdicosmo added a comment to T1406: Documentation/tutorial for using public datasets (Athena/AWS).

Should this be closed now? The documentation is at https://docs.softwareheritage.org/devel/swh-dataset/

Mar 8 2021, 9:48 AM · Documentation, Sprint 2018 12
rdicosmo added a subtask for T3085: Complete and updated copy of the archive on S3 (objects+graph): T1848: refresh graph dataset export.
Mar 8 2021, 9:45 AM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
rdicosmo added a parent task for T1848: refresh graph dataset export: T3085: Complete and updated copy of the archive on S3 (objects+graph).
Mar 8 2021, 9:45 AM · Datasets
rdicosmo added a project to T1538: Add "forge" now: Roadmap 2021.
Mar 8 2021, 9:43 AM · Add Forge Now, Roadmap 2022, meta-task, Roadmap 2021
rdicosmo added a project to T3082: Improve Save Code Now handling: Roadmap 2021.
Mar 8 2021, 9:43 AM · Save Code Now, meta-task, Roadmap 2021, Web app
rdicosmo added a project to T3087: Implement support for takedown notices (infra, admin tools, workflow): Roadmap 2021.
Mar 8 2021, 9:41 AM · Roadmap 2022, meta-task, Roadmap 2021, Web app
rdicosmo set the icon for Roadmap 2021 to Goal.
Mar 8 2021, 9:41 AM
rdicosmo created Roadmap 2021.
Mar 8 2021, 9:40 AM
rdicosmo triaged T3007: Nicolas - Hall of fame as Unbreak Now! priority.
Mar 8 2021, 9:26 AM · Unknown Object (Project)
rdicosmo closed T2952: Ambassador programme at SWH Wikipedia page as Resolved.
Mar 8 2021, 9:24 AM · Unknown Object (Project)

Mar 7 2021

rdicosmo triaged T3095: Add LIP6 gitlab instance to regular crawling list as Normal priority.
Mar 7 2021, 8:40 AM · Scientific Community Building, Archive coverage
rdicosmo updated the task description for T3082: Improve Save Code Now handling.
Mar 7 2021, 8:26 AM · Save Code Now, meta-task, Roadmap 2021, Web app

Mar 4 2021

rdicosmo added a comment to T1099: support origin and SWHID blocklist for archive search and browse.

Notice that the filtering may need to be done at all levels: origins, but also SWHIDs in general.
An example (real) use case is a takedown request for just one specific commit in a repository: we do not want to dereference all the rest.

Mar 4 2021, 7:25 PM · General, Web app
rdicosmo renamed T1099: support origin and SWHID blocklist for archive search and browse from support origin blacklist for archive search and browse to support origin and SWHID blacklist for archive search and browse.
Mar 4 2021, 7:23 PM · General, Web app
rdicosmo added a parent task for T1099: support origin and SWHID blocklist for archive search and browse: T3087: Implement support for takedown notices (infra, admin tools, workflow).
Mar 4 2021, 7:21 PM · General, Web app
rdicosmo added a subtask for T3087: Implement support for takedown notices (infra, admin tools, workflow): T1099: support origin and SWHID blocklist for archive search and browse.
Mar 4 2021, 7:21 PM · Roadmap 2022, meta-task, Roadmap 2021, Web app
rdicosmo triaged T3087: Implement support for takedown notices (infra, admin tools, workflow) as Normal priority.
Mar 4 2021, 7:21 PM · Roadmap 2022, meta-task, Roadmap 2021, Web app
rdicosmo added a parent task for T1954: Up-to-date objstorage mirror on S3: T3085: Complete and updated copy of the archive on S3 (objects+graph).
Mar 4 2021, 5:50 PM · System administration, Object storage
rdicosmo added a subtask for T3085: Complete and updated copy of the archive on S3 (objects+graph): T1954: Up-to-date objstorage mirror on S3.
Mar 4 2021, 5:50 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
rdicosmo created T3085: Complete and updated copy of the archive on S3 (objects+graph).
Mar 4 2021, 5:50 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
rdicosmo merged task T1914: Keep mirror of contents on S3 up to date into T1954: Up-to-date objstorage mirror on S3.
Mar 4 2021, 5:44 PM · Mirror, Datasets
rdicosmo merged T1914: Keep mirror of contents on S3 up to date into T1954: Up-to-date objstorage mirror on S3.
Mar 4 2021, 5:44 PM · System administration, Object storage
rdicosmo triaged T3084: Fast track save code now requests as Normal priority.
Mar 4 2021, 5:36 PM · System administration, Web app
rdicosmo created T3084: Fast track save code now requests.
Mar 4 2021, 5:35 PM · System administration, Web app
rdicosmo added a parent task for T1524: save code now: also add new origins for unknown repos: T3082: Improve Save Code Now handling.
Mar 4 2021, 10:37 AM · Save Code Now, Web app
rdicosmo added a subtask for T3082: Improve Save Code Now handling: T1524: save code now: also add new origins for unknown repos.
Mar 4 2021, 10:37 AM · Save Code Now, meta-task, Roadmap 2021, Web app
rdicosmo added a parent task for T1481: add metric to monitor "save code now" efficiency: T3082: Improve Save Code Now handling.
Mar 4 2021, 10:36 AM · Save Code Now, System administration, Metrics/monitoring
rdicosmo added a subtask for T3082: Improve Save Code Now handling: T1481: add metric to monitor "save code now" efficiency.
Mar 4 2021, 10:36 AM · Save Code Now, meta-task, Roadmap 2021, Web app
rdicosmo added a parent task for T2117: Save Code Now: End to End monitoring: T3082: Improve Save Code Now handling.
Mar 4 2021, 10:35 AM · System administration, Monitoring, Roadmap 2021
rdicosmo added a subtask for T3082: Improve Save Code Now handling: T2117: Save Code Now: End to End monitoring.
Mar 4 2021, 10:35 AM · Save Code Now, meta-task, Roadmap 2021, Web app
rdicosmo added a parent task for T2168: Add a grafana dashboard to monitor "Save code now" requests: T3082: Improve Save Code Now handling.
Mar 4 2021, 10:34 AM · System administration
rdicosmo added a subtask for T3082: Improve Save Code Now handling: T2168: Add a grafana dashboard to monitor "Save code now" requests.
Mar 4 2021, 10:34 AM · Save Code Now, meta-task, Roadmap 2021, Web app
rdicosmo added a parent task for T2750: mercurial loader fails on save code now: T3082: Improve Save Code Now handling.
Mar 4 2021, 10:33 AM · Mercurial loader
rdicosmo added a subtask for T3082: Improve Save Code Now handling: T2750: mercurial loader fails on save code now.
Mar 4 2021, 10:33 AM · Save Code Now, meta-task, Roadmap 2021, Web app
rdicosmo added a parent task for T2891: "Save code now" may fail with "OperationalError: database is locked": T3082: Improve Save Code Now handling.
Mar 4 2021, 10:32 AM · System administration, Web app
rdicosmo added a subtask for T3082: Improve Save Code Now handling: T2891: "Save code now" may fail with "OperationalError: database is locked".
Mar 4 2021, 10:32 AM · Save Code Now, meta-task, Roadmap 2021, Web app
rdicosmo created T3082: Improve Save Code Now handling.
Mar 4 2021, 10:31 AM · Save Code Now, meta-task, Roadmap 2021, Web app
rdicosmo raised the priority of T1538: Add "forge" now from Low to Normal.
Mar 4 2021, 10:22 AM · Add Forge Now, Roadmap 2022, meta-task, Roadmap 2021
rdicosmo merged T2199: "Save Forge Now" into T1538: Add "forge" now.
Mar 4 2021, 10:21 AM · Add Forge Now, Roadmap 2022, meta-task, Roadmap 2021
rdicosmo merged task T2199: "Save Forge Now" into T1538: Add "forge" now.
Mar 4 2021, 10:21 AM · Roadmap 2020

Feb 10 2021

rdicosmo added a comment to T376: ingest git.eclipse.org repositories.

Note that this does not mean it is or will be ingested anytime soon, though.
We are still missing at least the one cog to actually schedule those listed origins.

More details in T2345#58247

Feb 10 2021, 12:31 PM · Archive coverage

Feb 4 2021

rdicosmo added a comment to T2912: Next generation archive counters.

I asked one of the authors of the original HyperLogLog paper (not Philippe, who unfortunately passed away years ago :-()
The original HyperLogLog has three different behaviours: one for small cardinalities, another for medium cardinalities, and a third for very large cardinalities.
There is indeed a risk of breaking monotonicity at the boundaries between segments, but within each segment it is monotonic.
Our counters are already in the "very large cardinality" zone, so we should be safe with any implementation.
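
To make the segment behaviour above concrete, here is a minimal HyperLogLog sketch in Python. It is an illustration only, not the estimator deployed for the archive counters: the class name, the choice of SHA-1 as the hash, and the precision `p=12` are all assumptions. It implements the small-cardinality fallback (linear counting) and the plain estimate from the original paper. The paper's third regime corrects for 32-bit hash saturation and is omitted here because the sketch uses a 64-bit hash.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog estimator (hypothetical, for illustration only)."""

    def __init__(self, p: int = 12) -> None:
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant from the original paper (valid for m >= 128).
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: bytes) -> None:
        # 64-bit hash: the top p bits select a register, the rest give the rank.
        h = int.from_bytes(hashlib.sha1(item).digest()[:8], "big")
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        # Registers only ever grow, so adding the same item twice changes nothing.
        self.registers[idx] = max(self.registers[idx], rank)

    def raw_estimate(self) -> float:
        # Harmonic-mean estimate; non-decreasing because registers never shrink.
        return self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)

    def estimate(self) -> float:
        e = self.raw_estimate()
        zeros = self.registers.count(0)
        if e <= 2.5 * self.m and zeros:
            # Small-cardinality segment: switch to linear counting.
            return self.m * math.log(self.m / zeros)
        # Medium and large segment (no 32-bit correction needed with 64-bit hashes).
        return e
```

Inside the upper segment the estimate can only grow, since each register is a running maximum. The switch between linear counting and the raw estimate is the one place in this sketch where a downward jump could occur. The same running-maximum property means that re-inserting an object that was already counted leaves the registers untouched.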

Feb 4 2021, 10:31 PM · Roadmap 2021, System administration, Monitoring, Web app

Feb 3 2021

rdicosmo added a comment to T2912: Next generation archive counters.
In T2912#58063, @zack wrote:

Thanks @vsellier, that seems quite ok indeed. The only question left is to know if the estimator implemented is monotonic (i.e. we will never have negative bumps in the graph :-))

may I suggest (for reasons discussed in the past) to just remove the graphs from the main archive.s.o page

We decided to keep the counters.

Feb 3 2021, 4:21 PM · Roadmap 2021, System administration, Monitoring, Web app
rdicosmo added a comment to T2912: Next generation archive counters.

Thanks @vsellier, that seems quite ok indeed. The only question left is to know if the estimator implemented is monotonic (i.e. we will never have negative bumps in the graph :-))

Feb 3 2021, 4:01 PM · Roadmap 2021, System administration, Monitoring, Web app

Feb 2 2021

rdicosmo closed D4977: Update persistent identifiers doc with pip install info.
Feb 2 2021, 1:55 PM
rdicosmo committed rDMOD0c1658128322: Make explicit Python 3 dependency (authored by rdicosmo).
Make explicit Python 3 dependency
Feb 2 2021, 1:55 PM
rdicosmo committed rDMOD1bfdf717c55a: Update persistent identifiers doc with pip install info (authored by rdicosmo).
Update persistent identifiers doc with pip install info
Feb 2 2021, 1:55 PM
rdicosmo updated the diff for D4977: Update persistent identifiers doc with pip install info.

Make explicit Python3 dependency

Feb 2 2021, 1:52 PM

Feb 1 2021

rdicosmo added a comment to T376: ingest git.eclipse.org repositories.

Thanks @ardumont, that's great! If you think this does not need any more support on the Eclipse side, could you let them know?

Feb 1 2021, 5:59 PM · Archive coverage
rdicosmo added a comment to T376: ingest git.eclipse.org repositories.

Thanks @ardumont, that's great! If you think this does not need any more support on the Eclipse side, could you let them know?

Feb 1 2021, 5:58 PM · Archive coverage
rdicosmo moved T2952: Ambassador programme at SWH Wikipedia page from Restricted Project Column to Restricted Project Column on the Unknown Object (Project) board.
Feb 1 2021, 9:51 AM · Unknown Object (Project)

Jan 31 2021

rdicosmo moved T2943: Contribute a research highlight on CIG (geodynamics.org) from Restricted Project Column to Restricted Project Column on the Unknown Object (Project) board.
Jan 31 2021, 1:05 PM · Unknown Object (Project)

Jan 30 2021

rdicosmo requested review of D4977: Update persistent identifiers doc with pip install info.
Jan 30 2021, 12:56 AM

Jan 29 2021

rdicosmo added a comment to T2912: Next generation archive counters.

I don't think this solves the issue of overestimating the number of objects, when two threads insert the same objects at the same time.

In T2912#57655, @vsellier wrote:

I'm not sure to understand,

Jan 29 2021, 1:10 PM · Roadmap 2021, System administration, Monitoring, Web app
rdicosmo added a comment to T376: ingest git.eclipse.org repositories.

Thanks @ardumont for experimenting with this. The 500 seems normal: we need to tell Eclipse about us first, I'll put you in touch. So maybe it's still a no-brainer, and we just need to document the "contact the owner to get whitelisted" human step :-)

Jan 29 2021, 10:04 AM · Archive coverage

Jan 28 2021

rdicosmo added a comment to T2912: Next generation archive counters.

Bloom filters are still on the table for other use cases, like testing super quickly for contents that we do not have, but if nobody has strong objections, this seems the way to go for the counters (very small footprint, small under/over counting errors, thanks to Philippe Flajolet's magic :-))
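
For the "contents that we do not have" use case, a Bloom filter answers "definitely absent" or "possibly present". The sketch below is a minimal illustration under assumed parameters (the class name, bit-array size, number of hashes, and use of SHA-256 are all hypothetical), not a description of anything deployed in the archive.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (hypothetical, for illustration only)."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: bytes) -> list:
        # Derive k bit positions from two halves of one digest (double hashing).
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False means the item was never added; True may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

A `False` answer is always correct, which is what makes the structure attractive for quickly ruling out missing contents. A `True` answer still needs a lookup in the real store.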

Jan 28 2021, 7:27 PM · Roadmap 2021, System administration, Monitoring, Web app

Jan 27 2021

rdicosmo committed rMSLDb1b1173ba3d2: Final version PIDApalooza (authored by rdicosmo).
Final version PIDApalooza
Jan 27 2021, 8:11 PM

Jan 25 2021

rdicosmo assigned T376: ingest git.eclipse.org repositories to ardumont.
Jan 25 2021, 9:03 PM · Archive coverage