
May 7 2021

rdicosmo committed rMSLDd4b4e016ea3c: Add pillar of OS entry in ARDC module (authored by rdicosmo).
Add pillar of OS entry in ARDC module
May 7 2021, 10:20 AM
rdicosmo added a comment to T3312: web API rate limit: 10x more quota for authenticated users.

@anlambert: ping me when this is done, so we can answer some pending requests :-)

May 7 2021, 9:44 AM · Web app

Apr 29 2021

rdicosmo added a comment to T3298: Consider making SWHID handling case insensitive.

So for SWHID v1, the resolver should turn the core part into lowercase, am I right?
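
A minimal sketch of what such lowercasing could look like, assuming the SWHID v1 layout of a core part (`swh:1:<type>:<hex>`) optionally followed by `;`-separated qualifiers. The function name `lowercase_swhid_core` is hypothetical and is not taken from the resolver code; qualifiers are left untouched because they may carry case-sensitive paths or origin URLs.

```python
def lowercase_swhid_core(swhid: str) -> str:
    """Lowercase only the core part of a SWHID, keeping qualifiers as given.

    Hypothetical helper for illustration; not the actual resolver logic.
    """
    # Everything before the first ';' is the core part; the rest are qualifiers.
    core, separator, qualifiers = swhid.partition(";")
    return core.lower() + separator + qualifiers


if __name__ == "__main__":
    mixed = "swh:1:cnt:94A9ED024D3859793618152EA559A168BBCBB5E2;path=/README.md"
    print(lowercase_swhid_core(mixed))
```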

Apr 29 2021, 1:16 PM · Data Model, Web app
rdicosmo added a comment to T3298: Consider making SWHID handling case insensitive.
In T3298#64426, @zack wrote:

This is going to be an interesting challenge/trade-off for SWHIDv2. Because I was considering there to use more compact encodings than hex, in order to shorten the SWHID length, like base58, but those are case-sensitive in order to be more dense.

So, as a counterargument to the "SHOULD" idea above, we need to be careful about promoting a practice now that might change when switching from SWHIDv1 to SWHIDv2.

Apr 29 2021, 12:19 PM · Data Model, Web app
rdicosmo updated the task description for T3298: Consider making SWHID handling case insensitive.
Apr 29 2021, 12:03 PM · Data Model, Web app
rdicosmo triaged T3298: Consider making SWHID handling case insensitive as Normal priority.
Apr 29 2021, 12:02 PM · Data Model, Web app
rdicosmo created T3298: Consider making SWHID handling case insensitive.
Apr 29 2021, 12:02 PM · Data Model, Web app

Apr 28 2021

rdicosmo added a comment to T2912: Next generation archive counters.

> I also recall now that vincent added a graph [1] recently enough.

This is to try to compare the counter approaches with each other.

So that's still using the old plumbing at least for that part.

[1] https://grafana.softwareheritage.org/goto/BlkwHorMz

Apr 28 2021, 5:23 PM · Roadmap 2021, System administration, Monitoring, Web app

Apr 27 2021

rdicosmo created T3295: Archive the Kermit historical source code collection.
Apr 27 2021, 10:41 AM · Community Building

Apr 26 2021

rdicosmo added a comment to T2912: Next generation archive counters.

Last bits deployed on archive.s.o (including the author counters).

Apr 26 2021, 1:33 PM · Roadmap 2021, System administration, Monitoring, Web app
rdicosmo moved T2912: Next generation archive counters from Work in progress to Pending validation on the Roadmap 2021 board.
Apr 26 2021, 10:50 AM · Roadmap 2021, System administration, Monitoring, Web app
rdicosmo closed T3163: Call For Participation Grants as Resolved.
Apr 26 2021, 9:23 AM · Unknown Object (Project)

Apr 24 2021

rdicosmo added a comment to T3213: Enable save code now of software source code archives for specific users.

I recall it's part of creating a primary key (of sorts) composed of all the properties mentioned
above (when the artifact does not already provide some hashes).
This is to avoid fetching again things that were already fetched.

Apr 24 2021, 3:20 PM · Save Code Now, Web app
rdicosmo added a comment to T3213: Enable save code now of software source code archives for specific users.

Currently, users only provide a URL in the save code now form, while the loader expects a bit more
[1] (recall that it is the lister which actually provides those).

The loader expects to be provided with a list of artifacts (could be only 1 in our
case). Still, such artifacts are described through the following:

  • artifact url
  • time
  • length (could be derived from the url when discussing with the server, but not all servers provide it...)
  • version (could be derived heuristically from the url as well, but that's regexp-hell-ish and error-prone)
  • filename (could be derived from the url without too much risk, I think...)

I gather the save code now ui could be enriched (and displayed according to chosen visit
type) but that becomes more involved for people in general.

Another road would be to make some of those properties optional...

Thoughts?

[1]

 "url": "https://ftp.gnu.org/old-gnu/emacs/",
 "artifacts": [{"url": "https://ftp.gnu.org/old-gnu/emacs/elib-1.0.tar.gz",
                "time": "1995-12-12T08:00:00+00:00",
                "length": 58335,
                "version": "1.0",
                "filename": "elib-1.0.tar.gz",
                },
                ...
               ]
...
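
The filename derivation mentioned in the list above could be sketched as follows. This is only an illustration under stated assumptions: `filename_from_url` and `artifact_from_url` are hypothetical names, not loader or lister functions, and the fields that cannot be derived safely from the URL are simply left as `None`, in the spirit of making some properties optional.

```python
from posixpath import basename
from urllib.parse import unquote, urlparse


def filename_from_url(url: str) -> str:
    """Return the last path component of the URL (empty for directory URLs)."""
    return basename(unquote(urlparse(url).path))


def artifact_from_url(url: str) -> dict:
    """Build a partial artifact description from a bare URL (hypothetical helper).

    Only 'url' and 'filename' are filled in; the other keys mirror the
    example payload and stay None because they are not derived here.
    """
    return {
        "url": url,
        "filename": filename_from_url(url),
        "time": None,
        "length": None,
        "version": None,
    }


if __name__ == "__main__":
    print(artifact_from_url("https://ftp.gnu.org/old-gnu/emacs/elib-1.0.tar.gz"))
```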
Apr 24 2021, 9:53 AM · Save Code Now, Web app

Apr 21 2021

rdicosmo added a comment to T3213: Enable save code now of software source code archives for specific users.

Thanks @ardumont ... so it appears that adapting the logic is easy... could you do it?
@anlambert could you look into the needed modification of the UI, to enable the new type of save code now payloads for selected authenticated users?

Apr 21 2021, 6:58 PM · Save Code Now, Web app
rdicosmo added a comment to T3087: Implement support for takedown notices (infra, admin tools, workflow).

So what about exports of the archive available on git-annex?

Apr 21 2021, 6:53 PM · Roadmap 2022, meta-task, Roadmap 2021, Web app

Apr 20 2021

rdicosmo added a comment to T3278: Check older pending save code now requests apparently stuck and reschedule those.

Thanks, this is quite useful indeed.

Apr 20 2021, 7:28 PM · System administration, Save Code Now
rdicosmo added a comment to T3278: Check older pending save code now requests apparently stuck and reschedule those.

Thanks for looking into this. If I look at https://grafana.softwareheritage.org/d/WXRVVc_Mz/save-code-now?viewPanel=4&orgId=1&from=1617954242247&to=1617975842247&var-environment=production&var-instance=moma.internal.softwareheritage.org&var-status=All&var-load_task_status=All&var-visit_type=All it seems there are also some 255 requests "not yet scheduled". Maybe it's the same issue?

Apr 20 2021, 11:00 AM · System administration, Save Code Now

Apr 19 2021

rdicosmo committed rMSLDd525b5e493d9: RDA Data granularity WG presentation (authored by rdicosmo).
RDA Data granularity WG presentation
Apr 19 2021, 8:17 PM
rdicosmo added a comment to T3234: Handle gracefully trailing slashes when resolving SWHID in search box.

Thanks, it is indeed an urgent matter, as various journals depend on this!

Apr 19 2021, 6:46 PM · Web app
rdicosmo reopened T3234: Handle gracefully trailing slashes when resolving SWHID in search box as "Open".

Well, it seems we have been hit by this again, in a different form:

Apr 19 2021, 6:10 PM · Web app
rdicosmo added a comment to T3247: Implement SWHID validation in frontend.

Cool!

Apr 19 2021, 3:58 PM · Web app
rdicosmo moved T3246: Document takedown request processing workflow from Backlog to Work in progress on the Roadmap 2021 board.
Apr 19 2021, 11:53 AM · Archive content
rdicosmo moved T3077: Ease integration of fundraising campaigns from Pending validation to Done on the Roadmap 2021 board.
Apr 19 2021, 11:53 AM · Community Building, Roadmap 2021, Website

Apr 16 2021

rdicosmo added a comment to T3252: Better handling of erroneous origins submitted to save code now.

Thanks to all of you for this discussion and proposals.

Apr 16 2021, 1:39 PM · System administration, Save Code Now, Web app
rdicosmo added a comment to T3256: Propose reason for rejecting a save code now.

Great. In addition to the content of the free form field, the standard answer should contain proper boilerplate reminding what is expected in a Save Code Now request, along the lines of what is written in the "Help" tab of https://archive.softwareheritage.org/save/

Apr 16 2021, 1:24 PM · Save Code Now, Easy hack, Web app
rdicosmo added a comment to T2117: Save Code Now: End to End monitoring.

On a related note, it may be useful to regularly report requests that did not complete (either as success or failure) in a reasonable amount of time after being scheduled.
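
As a rough illustration of such a report, here is a sketch that filters out requests still unfinished after a chosen delay. Everything about the data shape is an assumption made for the example: the `scheduled_at` and `status` keys, the status names, and the one-day threshold are hypothetical and do not describe the actual save code now tables.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical status names marking a request as complete.
FINISHED_STATUSES = {"succeeded", "failed"}


def overdue_requests(requests, now, max_age=timedelta(days=1)):
    """Return requests scheduled more than max_age ago that have not completed.

    Each request is a dict with assumed keys 'scheduled_at' (aware datetime)
    and 'status' (string).
    """
    return [
        request
        for request in requests
        if request["status"] not in FINISHED_STATUSES
        and now - request["scheduled_at"] > max_age
    ]


if __name__ == "__main__":
    now = datetime(2021, 4, 16, 9, 0, tzinfo=timezone.utc)
    sample = [
        {"id": 1, "status": "scheduled", "scheduled_at": now - timedelta(days=3)},
        {"id": 2, "status": "succeeded", "scheduled_at": now - timedelta(days=3)},
        {"id": 3, "status": "scheduled", "scheduled_at": now - timedelta(hours=2)},
    ]
    for request in overdue_requests(sample, now):
        print(request["id"])
```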

Apr 16 2021, 9:06 AM · System administration, Monitoring, Roadmap 2021

Apr 15 2021

rdicosmo triaged T3252: Better handling of erroneous origins submitted to save code now as Normal priority.
Apr 15 2021, 10:47 PM · System administration, Save Code Now, Web app
rdicosmo added a comment to T2912: Next generation archive counters.

This kind of journal client will be necessary in any case if we want to extend the usage of the counters for other perimeters (metadata count, origin per forge, ...)

Apr 15 2021, 3:35 PM · Roadmap 2021, System administration, Monitoring, Web app
rdicosmo added a comment to T3084: Fast track save code now requests.

Pushed, packaged, deployed.

The scheduler runner happily continues to schedule existing tasks, plus a new task with priority:

Apr 15 13:12:51 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 2084 tasks load-git
Apr 15 13:12:54 saatchi swh[234257]: INFO:swh.scheduler.cli.admin.runner:Scheduled 4128 tasks
Apr 15 13:14:06 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks load-pypi
Apr 15 13:14:06 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks load-git (priority)
...

That task got done almost immediately...
So there you go ;)

Apr 15 2021, 3:30 PM · System administration, Web app
rdicosmo added a comment to T2912: Next generation archive counters.

Staging webapp[1] and webapp1 on production [2] are now configured to use swh-counters to display the historical values and the live object counts.

Apr 15 2021, 12:09 PM · Roadmap 2021, System administration, Monitoring, Web app

Apr 14 2021

rdicosmo added a comment to T3084: Fast track save code now requests.

Great news :-)

Apr 14 2021, 7:01 PM · System administration, Web app

Apr 13 2021

rdicosmo committed rMSLD5beda4268f79: Slides for RDA SSC IG (authored by rdicosmo).
Slides for RDA SSC IG
Apr 13 2021, 7:32 PM
rdicosmo committed R238:9cf42fd2074c: variant for renater gforge (authored by rdicosmo).
variant for renater gforge
Apr 13 2021, 6:54 PM
rdicosmo added a comment to D267: [WIP] add first implementation of FusionForge lister.

What would be left to do to make this lister work? It already seems to be in good shape, and it would be useful to index gforge.inria.fr since it will be closed soon (https://gforge.inria.fr/forum/forum.php?forum_id=11543). For the gforge.inria.fr case specifically, it is worth noting that project creation is already closed, so a one-shot listing could be an option if it is lighter to set up: I wrote a small script to do that, but after a few requests to https://archive.softwareheritage.org/save/, requests are throttled. I would be happy to send you a listing of the public projects hosted on gforge.inria.fr if it could help.

Apr 13 2021, 5:00 PM
rdicosmo committed R238:4e2907a3f4fa: Prepare to generalize (authored by rdicosmo).
Prepare to generalize
Apr 13 2021, 4:06 PM
rdicosmo raised the priority of T3087: Implement support for takedown notices (infra, admin tools, workflow) from Normal to High.
Apr 13 2021, 2:53 PM · Roadmap 2022, meta-task, Roadmap 2021, Web app
rdicosmo added a comment to T3247: Implement SWHID validation in frontend.

Ok, this is converging with the discussion in T3234: we fully agree that having proper errors reported to the user is the way to go, so let's forget about the "sanitization" approach.

Apr 13 2021, 12:38 PM · Web app
rdicosmo added a comment to T3234: Handle gracefully trailing slashes when resolving SWHID in search box.
Apr 13 2021, 12:23 PM · Web app
rdicosmo added a comment to D5485: docs/persistent-identifiers: Add guidelines for fixing invalid SWHIDs..

Ok, so no need to change the specification document for SWHIDs.

Apr 13 2021, 12:06 PM
rdicosmo added a comment to T3234: Handle gracefully trailing slashes when resolving SWHID in search box.

@vlorentz, @anlambert: thanks for progressing the discussion on this issue.
After mulling over your inputs, here is my current understanding:

Apr 13 2021, 12:00 PM · Web app
rdicosmo added a comment to T3247: Implement SWHID validation in frontend.

I wonder if this is not overkill: SWHID may evolve in the future, and maintaining two implementations (one of them in JS!) may be a source of headaches down the line.
A simple "sanitization" phase in the frontend catching the most common issues (trailing slashes, leading or trailing tabs or spaces, etc.) would probably be enough for our purpose.
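
A sketch of what such a sanitization phase might do, written in Python for illustration rather than in the frontend's JavaScript. The function name `sanitize_swhid_input` is hypothetical. One assumption is made explicit in the code: trailing slashes are only removed when the input has no qualifiers, since a qualifier value such as a path may legitimately end with a slash.

```python
def sanitize_swhid_input(raw: str) -> str:
    """Clean up the most common copy/paste issues in a SWHID typed by a user.

    Hypothetical helper: strips surrounding whitespace, then removes trailing
    slashes from inputs that carry no ';'-separated qualifiers.
    """
    cleaned = raw.strip(" \t\r\n")
    if ";" not in cleaned:
        # Assumption: a bare core SWHID never ends with '/', so it is safe to drop.
        cleaned = cleaned.rstrip("/")
    return cleaned


if __name__ == "__main__":
    print(sanitize_swhid_input("\t swh:1:dir:94a9ed024d3859793618152ea559a168bbcbb5e2/ "))
```

This does not validate the identifier; it only normalizes the input before it is handed to whatever validation or resolution step follows.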

Apr 13 2021, 11:34 AM · Web app

Apr 10 2021

rdicosmo added a comment to T3234: Handle gracefully trailing slashes when resolving SWHID in search box.

As a compromise, we could accept this trailing slash, but show a warning on the interface and/or codify in the SWHID specification an exhaustive list of "fixes" that user interfaces can/should do.

Apr 10 2021, 1:12 PM · Web app
rdicosmo added a comment to T3234: Handle gracefully trailing slashes when resolving SWHID in search box.

There are already many URLs in the open, so even if we remove the trailing slash now, that does not solve the problem.

Apr 10 2021, 11:44 AM · Web app
rdicosmo triaged T3234: Handle gracefully trailing slashes when resolving SWHID in search box as Normal priority.
Apr 10 2021, 11:16 AM · Web app
rdicosmo triaged T3233: Missing Apollo 11 virtual AGC repository from Google Code as Normal priority.
Apr 10 2021, 9:32 AM · SVN Loader

Apr 9 2021

rdicosmo raised the priority of T3230: Add various markdown variants to list of intrinsic metadata files to be indexed from Low to Normal.
Apr 9 2021, 4:45 PM · Intrinsic metadata, Indexer
rdicosmo committed rMSLD7b622587a441: Added general Open Science presentation (authored by rdicosmo).
Added general Open Science presentation
Apr 9 2021, 2:14 PM
rdicosmo updated the task description for T3230: Add various markdown variants to list of intrinsic metadata files to be indexed .
Apr 9 2021, 1:33 PM · Intrinsic metadata, Indexer
rdicosmo created T3230: Add various markdown variants to list of intrinsic metadata files to be indexed .
Apr 9 2021, 1:32 PM · Intrinsic metadata, Indexer

Apr 6 2021

rdicosmo triaged T3213: Enable save code now of software source code archives for specific users as Normal priority.
Apr 6 2021, 9:33 PM · Save Code Now, Web app
rdicosmo assigned T3213: Enable save code now of software source code archives for specific users to anlambert.
Apr 6 2021, 9:32 PM · Save Code Now, Web app
rdicosmo added a subtask for T3082: Improve Save Code Now handling: T3213: Enable save code now of software source code archives for specific users.
Apr 6 2021, 9:32 PM · Save Code Now, meta-task, Roadmap 2021, Web app
rdicosmo added a parent task for T3213: Enable save code now of software source code archives for specific users: T3082: Improve Save Code Now handling.
Apr 6 2021, 9:32 PM · Save Code Now, Web app
rdicosmo created T3213: Enable save code now of software source code archives for specific users.
Apr 6 2021, 9:31 PM · Save Code Now, Web app
rdicosmo moved T3077: Ease integration of fundraising campaigns from Work in progress to Pending validation on the Roadmap 2021 board.
Apr 6 2021, 11:45 AM · Community Building, Roadmap 2021, Website
rdicosmo committed rMSLD4570184c6adf: Update growth (authored by rdicosmo).
Update growth
Apr 6 2021, 11:29 AM
rdicosmo committed rMSLD280818490775: Update to LLW talk (authored by rdicosmo).
Update to LLW talk
Apr 6 2021, 11:16 AM
rdicosmo committed rMSLD994029b254db: Module deck (authored by rdicosmo).
Module deck
Apr 6 2021, 11:05 AM
rdicosmo committed rMSLD9affd6c0f92f: DIG (authored by rdicosmo).
DIG
Apr 6 2021, 11:05 AM

Apr 5 2021

rdicosmo assigned T3128: Improve deposit integration, management and display to moranegg.
Apr 5 2021, 12:13 PM · meta-task, Roadmap 2021, Monitoring, SWORD deposit, Web app
rdicosmo updated subscribers of T3128: Improve deposit integration, management and display.
Apr 5 2021, 12:13 PM · meta-task, Roadmap 2021, Monitoring, SWORD deposit, Web app
rdicosmo added a parent task for T2540: support the loading of metadata-only deposits in the metadata storage: T3128: Improve deposit integration, management and display.
Apr 5 2021, 12:13 PM · Roadmap 2020, SWORD deposit, Scientific Community Building
rdicosmo added a parent task for T2344: Build a connector for software deposit via Zenodo/InvenioRDM: T3128: Improve deposit integration, management and display.
Apr 5 2021, 12:13 PM · meta-task, Roadmap 2022, Roadmap 2020, SWORD deposit, Scientific Community Building
rdicosmo added subtasks for T3128: Improve deposit integration, management and display: T2344: Build a connector for software deposit via Zenodo/InvenioRDM, T2540: support the loading of metadata-only deposits in the metadata storage.
Apr 5 2021, 12:13 PM · meta-task, Roadmap 2021, Monitoring, SWORD deposit, Web app
rdicosmo renamed T3128: Improve deposit integration, management and display from Improve deposit management and display to Improve deposit integration, management and display.
Apr 5 2021, 12:12 PM · meta-task, Roadmap 2021, Monitoring, SWORD deposit, Web app
rdicosmo closed T3204: Extract List of all IPOL deposits in csv, a subtask of T3192: Add possibility to fetch last SWHID for a deposit using an origin on deposit cli, as Resolved.
Apr 5 2021, 11:52 AM · Easy hack, SWORD deposit
rdicosmo closed T3204: Extract List of all IPOL deposits in csv as Resolved.
Apr 5 2021, 11:52 AM · Easy hack, SWORD deposit
rdicosmo added a comment to T3202: Help new users discover the features available in the archive browsing view.

Ok, this is the way we'll go, merging in T3196 that is now obsolete.

Apr 5 2021, 10:52 AM · Web app
rdicosmo merged task T3196: Improve discoverability of the permalinks tab into T3202: Help new users discover the features available in the archive browsing view.
Apr 5 2021, 10:49 AM · Web app
rdicosmo merged T3196: Improve discoverability of the permalinks tab into T3202: Help new users discover the features available in the archive browsing view.
Apr 5 2021, 10:49 AM · Web app

Apr 2 2021

rdicosmo added a subtask for T3202: Help new users discover the features available in the archive browsing view: T3196: Improve discoverability of the permalinks tab.
Apr 2 2021, 10:10 AM · Web app
rdicosmo added a parent task for T3196: Improve discoverability of the permalinks tab: T3202: Help new users discover the features available in the archive browsing view.
Apr 2 2021, 10:10 AM · Web app
rdicosmo triaged T3202: Help new users discover the features available in the archive browsing view as Normal priority.
Apr 2 2021, 10:10 AM · Web app
rdicosmo updated the task description for T3196: Improve discoverability of the permalinks tab.
Apr 2 2021, 9:55 AM · Web app
rdicosmo added a comment to T3196: Improve discoverability of the permalinks tab.

Let me make explicit a key requirement that apparently is not as obvious as I thought: the improvement we look for must not change the current UI in any way. Remember that we have tutorials, papers, instructions all over the place with the current UI, and many active users of the current UI, including journals that have trained their vendors about it, and are quite happy: it is too late (and counterproductive) to make changes now. Please do not debate this point: it is a firm decision.

Apr 2 2021, 9:48 AM · Web app

Apr 1 2021

rdicosmo triaged T3196: Improve discoverability of the permalinks tab as Normal priority.
Apr 1 2021, 7:48 PM · Web app
rdicosmo committed R238:53ca45741ecc: Simple script to save the Inria gForge one repo at a time (authored by rdicosmo).
Simple script to save the Inria gForge one repo at a time
Apr 1 2021, 6:50 PM
rdicosmo triaged T3195: Some SWHIDs provided by the WebApp produce 404 when resolved as High priority.
Apr 1 2021, 3:56 PM · Web app

Mar 31 2021

rdicosmo updated the task description for T1538: Add "forge" now.
Mar 31 2021, 12:35 PM · Add Forge Now , Roadmap 2022, meta-task, Roadmap 2021

Mar 29 2021

rdicosmo committed rMSLDa7712c4d342a: Add simple script to generate expanded imports for a list of modules (authored by rdicosmo).
Add simple script to generate expanded imports for a list of modules
Mar 29 2021, 11:32 AM
rdicosmo committed rMSLD83984b37df5d: update (authored by rdicosmo).
update
Mar 29 2021, 11:32 AM

Mar 19 2021

rdicosmo moved T2220: swh-graph in production from Backlog to Work in progress on the Roadmap 2021 board.
Mar 19 2021, 12:45 PM · Roadmap 2022, meta-task, Roadmap 2021, Compressed graph service

Mar 17 2021

rdicosmo added a comment to T1724: Maven Central repository support.

After recent exchanges with @hboutemy and Charles Sabourdin, here is a clarification of the scope of this task.
We need a Maven repository lister that addresses the following issues:

Mar 17 2021, 10:40 AM · Maven loader, Maven lister, GSoC 2019, Archive coverage

Mar 15 2021

rdicosmo moved T3112: Provenance index for the full archive from Backlog to Work in progress on the Roadmap 2021 board.
Mar 15 2021, 9:10 PM · Roadmap 2022, Provenance database, Roadmap 2021, meta-task
rdicosmo moved T3119: FAQ from Backlog to Work in progress on the Roadmap 2021 board.
Mar 15 2021, 9:10 PM · Community Building
rdicosmo moved T3077: Ease integration of fundraising campaigns from Backlog to Work in progress on the Roadmap 2021 board.
Mar 15 2021, 9:10 PM · Community Building, Roadmap 2021, Website
rdicosmo moved T2912: Next generation archive counters from Backlog to Work in progress on the Roadmap 2021 board.
Mar 15 2021, 9:09 PM · Roadmap 2021, System administration, Monitoring, Web app
rdicosmo updated the task description for T2202: Collect extrinsic metadata.
Mar 15 2021, 9:08 PM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
rdicosmo added a parent task for T3112: Provenance index for the full archive: T3136: Prior art detection service.
Mar 15 2021, 8:59 PM · Roadmap 2022, Provenance database, Roadmap 2021, meta-task
rdicosmo added a subtask for T3136: Prior art detection service: T3112: Provenance index for the full archive.
Mar 15 2021, 8:59 PM · Roadmap 2022, Code scanner, Scientific Community Building, Roadmap 2021, meta-task
rdicosmo added a project to T3136: Prior art detection service: Code scanner.
Mar 15 2021, 8:58 PM · Roadmap 2022, Code scanner, Scientific Community Building, Roadmap 2021, meta-task
rdicosmo created T3136: Prior art detection service.
Mar 15 2021, 8:57 PM · Roadmap 2022, Code scanner, Scientific Community Building, Roadmap 2021, meta-task
rdicosmo added a subtask for T3135: Improve integrity of ingested content: T399: (Re-)Compute data checksums before insertion.
Mar 15 2021, 8:48 PM · Storage manager, Roadmap 2021, meta-task
rdicosmo added a parent task for T399: (Re-)Compute data checksums before insertion: T3135: Improve integrity of ingested content.
Mar 15 2021, 8:48 PM · Storage manager
rdicosmo created T3135: Improve integrity of ingested content.
Mar 15 2021, 8:47 PM · Storage manager, Roadmap 2021, meta-task
rdicosmo updated the task description for T3134: SWHID v2.
Mar 15 2021, 8:11 PM · Roadmap 2022, Roadmap 2020, Data Model, Web app, meta-task, Roadmap 2021
rdicosmo created T3134: SWHID v2.
Mar 15 2021, 8:09 PM · Roadmap 2022, Roadmap 2020, Data Model, Web app, meta-task, Roadmap 2021
rdicosmo added a subtask for T2234: Write use case-specific documentation: T3035: Triage documentation in wiki.
Mar 15 2021, 8:01 PM · Roadmap 2021, meta-task, Documentation
rdicosmo added a parent task for T3035: Triage documentation in wiki: T2234: Write use case-specific documentation.
Mar 15 2021, 8:01 PM · Documentation