Jun 14 2021
Jun 11 2021
Great, it seems we are getting there :-)
In T3365#66091, @anlambert wrote:
Jun 7 2021
Thanks @ardumont for investigating this. The fact that the IA does not provide the LastModified information may make sense for their specific case (it is possible that they did not keep the LastModified info from the original location).
May 29 2021
May 28 2021
In T3213#65579, @anlambert wrote: The feature has been implemented and looks ready for production use.
I just tested it using the Web API and the docker environment for a real world example: the Kermit Software Source Code Archive.
May 25 2021
That will be helpful in general (to answer questions like: which endpoint is over/underused for specific use cases) and also for seeing who over/underuses rate limits (e.g., to identify the need for more generous rate limits for specific use cases).
May 20 2021
May 19 2021
In T3202#65230, @anlambert wrote: If we want non staff users to give a try to the tour before official release, we could take advantage of authentication here and activate the guided tour only for users with a dedicated permission.
In T3202#65224, @anlambert wrote: Is the Help page linked from some other place? (i.e.: are we risking 404s if we dump it?)
I mean dumping the link not the page but I could move it in the footer to still reach the page.
In T3202#65222, @anlambert wrote: After some brainstorming on the subject, I was thinking of launching the guided tour through the Help link in the left sidebar and thus dumping the Help page.
May 12 2021
In T3202#65029, @anlambert wrote: So we have a winner here.
May 11 2021
May 10 2021
In T1226#64927, @anlambert wrote: Is this feature still needed?
I think so, some origins can take a long time to load into the archive (a huge svn repo for instance), so having a mail notification would be of interest here.
If yes, is it easy to implement it now?
Not at the moment, we need to resolve T3286 first.
A lot has changed since this was opened:
May 8 2021
May 7 2021
In T3312#64763, @anlambert wrote: If we need to tune rate limits for specific types of users, this could easily be added to the new throttling code I am currently working on.
In T3312#64760, @anlambert wrote:
@anlambert: ping me when this is done, so we can answer some pending requests :-)
Apr 29 2021
In T3298#64431, @anlambert wrote: So for SWHID v1, the resolver should turn the core part into lowercase, am I right?
In T3298#64426, @zack wrote: This is going to be an interesting challenge/trade-off for SWHIDv2, because I was considering using more compact encodings than hex there, such as base58, in order to shorten the SWHID length, but those are case-sensitive in order to be more dense.
So, as a counter-argument to the "SHOULD" idea, we need to be careful about promoting a practice now that might change when switching from SWHIDv1 to SWHIDv2.
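The lowercase question for SWHID v1 can be illustrated with a minimal sketch. This is not the resolver's actual code: the function name `normalize_swhid_v1` is hypothetical, and the sketch assumes a core identifier of the form `swh:1:<type>:<40 hex digits>`, optionally followed by `;`-separated qualifiers that are left untouched.

```python
import re

# Assumed shape of a SWHID v1 core identifier. The match is deliberately
# case-insensitive so that upper-case input can be accepted and normalized.
_CORE_RE = re.compile(r"swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}", re.IGNORECASE)


def normalize_swhid_v1(swhid: str) -> str:
    """Lowercase the core part of a SWHID v1, keeping qualifiers as given."""
    # Everything before the first ';' is the core part; the rest are qualifiers.
    core, sep, qualifiers = swhid.partition(";")
    if not _CORE_RE.fullmatch(core):
        raise ValueError(f"not a SWHID v1 core identifier: {core!r}")
    return core.lower() + sep + qualifiers
```

Such a normalization only makes sense for a hex-encoded core part; with a case-sensitive encoding like the base58 mentioned above, lowercasing would change the identifier.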
Apr 28 2021
> I also recall now that vincent added a graph [1] recently enough.
This is to try and compare the counter approaches with each other.
So that's still using the old plumbing at least for that part.
Apr 27 2021
Apr 26 2021
In T2912#64174, @ardumont wrote: Last bits deployed on archive.s.o (including the author counters).
Apr 24 2021
In T3213#64118, @ardumont wrote: I recall it's part of creating a primary key (of sorts) composed of all the properties mentioned above (when the artifact does not already provide some hashes).
This is to avoid fetching again things that were already fetched.
In T3213#64001, @ardumont wrote: Currently users only provide a URL in the save code now, while the loader expects a bit more [1] (recall it's the lister which actually provides those).
The loader expects to be provided with a list of artifacts (could be only 1 in our case). Still, such artifacts are described through the following:
- artifact url
- time
- length (could be derived from the url when discussing with the server but not all servers provide it...)
- version (could be derived with heuristics from the url as well but that's regexp-hell-ish and error-prone)
- filename (could be derived from the url without too much risk I think...)
I gather the save code now UI could be enriched (and displayed according to the chosen visit type) but that becomes more involved for people in general.
Another road would be to make some of those properties optional...
Thoughts?
[1]
"url": "https://ftp.gnu.org/old-gnu/emacs/",
"artifacts": [
    {
        "url": "https://ftp.gnu.org/old-gnu/emacs/elib-1.0.tar.gz",
        "time": "1995-12-12T08:00:00+00:00",
        "length": 58335,
        "version": "1.0",
        "filename": "elib-1.0.tar.gz",
    },
    ...
]
...
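The "optional properties" road mentioned in the quote above could look roughly like the following sketch. It is only an illustration, not the loader's or the web UI's code: the helper name `describe_artifact` is hypothetical, the filename is derived from the last path segment of the URL, and the other properties are simply omitted when unknown.

```python
import posixpath
from typing import Any, Dict, Optional
from urllib.parse import urlparse


def describe_artifact(
    url: str,
    time: Optional[str] = None,
    length: Optional[int] = None,
    version: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an artifact description from a URL, omitting unknown properties."""
    # The filename is taken from the last segment of the URL path.
    filename = posixpath.basename(urlparse(url).path)
    if not filename:
        raise ValueError(f"cannot derive a filename from {url!r}")
    artifact: Dict[str, Any] = {"url": url, "filename": filename}
    # Only keep the optional properties that were actually provided.
    optional = {"time": time, "length": length, "version": version}
    artifact.update({key: value for key, value in optional.items() if value is not None})
    return artifact
```

Deriving the version is deliberately left out of the sketch, since the quote above notes that guessing it from the URL is error-prone.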
Apr 21 2021
Thanks @ardumont ... so it appears that adapting the logic is easy... could you do it?
@anlambert could you look into the needed modification of the UI, to enable the new type of save code now payloads for selected authenticated users?
In T3087#63791, @douardda wrote: So what about exports of the archive available on git-annex?
Apr 20 2021
Thanks, this is quite useful indeed.
Thanks for looking into this. If I look at https://grafana.softwareheritage.org/d/WXRVVc_Mz/save-code-now?viewPanel=4&orgId=1&from=1617954242247&to=1617975842247&var-environment=production&var-instance=moma.internal.softwareheritage.org&var-status=All&var-load_task_status=All&var-visit_type=All it seems there are also some 255 requests "not yet scheduled". Maybe it's the same issue?
Apr 19 2021
Thanks, it is indeed an urgent matter, as various journals depend on this!
Well, it seems we have been hit by this again, in a different form:
Cool!
Apr 16 2021
Thanks to all of you for this discussion and these proposals.
Great. In addition to the content of the free form field, the standard answer should contain proper boilerplate reminding what is expected in a Save Code Now request, along the lines of what is written in the "Help" tab of https://archive.softwareheritage.org/save/
On a related note, it may be useful to regularly report requests that did not complete (either as success or failure) in a reasonable amount of time after being scheduled.
Apr 15 2021
In T2912#63245, @vsellier wrote: This kind of journal client will be necessary in any case if we want to extend the usage of the counters for other perimeters (metadata count, origin per forge, ...)
In T3084#63278, @ardumont wrote: Pushed, packaged, deployed.
The scheduler runner happily continues to schedule existing tasks and some new tasks with priority:
Apr 15 13:12:51 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 2084 tasks load-git
Apr 15 13:12:54 saatchi swh[234257]: INFO:swh.scheduler.cli.admin.runner:Scheduled 4128 tasks
Apr 15 13:14:06 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks load-pypi
Apr 15 13:14:06 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks load-git (priority)
...
That task got done almost immediately...
So there you go ;)
In T2912#63242, @vsellier wrote: Staging webapp [1] and webapp1 on production [2] are now configured to use swh-counters to display the historical values and the live object counts.
Apr 14 2021
Great news :-)
Apr 13 2021
In D267#84923, @thierry-martinez wrote: What would be left to do to make this lister work? It seems to be in a good state already, and it would be useful to index gforge.inria.fr since it will be closed soon (https://gforge.inria.fr/forum/forum.php?forum_id=11543). For the gforge.inria.fr case specifically, it is worth noting that project creation is closed already, so a one-shot listing could be an option if it is lighter to set up: I wrote a small script to do that, but after a few requests to https://archive.softwareheritage.org/save/, requests are throttled. I would be happy to send you a listing of the public projects hosted on gforge.inria.fr if it could help.
Ok, this is converging with the discussion in T3234: we fully agree that having proper errors reported to the user is the way to go, so let's forget about the "sanitization" approach.
In T3234#63111, @anlambert wrote:
Ok, so no need to change the specification document for SWHIDs.
@vlorentz, @anlambert: thanks for progressing the discussion on this issue.
After mulling over your inputs, here is my current understanding:
I wonder if this is not overkill: SWHID may evolve in the future, and maintaining two implementations (one of them in JS!) may be a source of headaches down the line.
A simple "sanitization" phase in the frontend catching the most common issues (trailing slashes, leading or trailing tabs or spaces, etc.) would probably be enough for our purpose.
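To make the idea concrete, here is a minimal sketch of what such a sanitization step might do. It is written in Python for illustration only (the frontend discussed here would be JavaScript), and the function name `sanitize_swhid_input` is hypothetical.

```python
def sanitize_swhid_input(raw: str) -> str:
    """Remove leading/trailing whitespace and any trailing slashes."""
    # str.strip() with no argument removes spaces, tabs and newlines.
    cleaned = raw.strip()
    # Drop trailing slashes, along with any whitespace hiding before them.
    while cleaned.endswith("/"):
        cleaned = cleaned[:-1].rstrip()
    return cleaned
```

Note that this sketch would also strip a trailing slash that legitimately belongs to a qualifier value, such as an origin URL, so it is only safe for bare core identifiers.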
Apr 10 2021
As a compromise, we could accept this trailing slash, but show a warning on the interface and/or codify in the SWHID specification an exhaustive list of "fixes" that user interfaces can/should do.
There are already many URLs in the open, so even if we remove the trailing slash now, that does not solve the problem.
Apr 9 2021
Apr 6 2021
Apr 5 2021
Ok, this is the way we'll go, merging in T3196 that is now obsolete.