Deployed.
Apr 22 2021
Tried to bisect the issue and failed (multiple dimensions got the best of me: master is OK, the Debian build in stable is KO... I need to improve my tooling there).
So, the debian/unstable build broke, which is fixed now [1] (it was missing one new dependency needed to
test the migration part).
Apr 21 2021
Landed. Remains to deploy. I'll attend to that tomorrow.
Thanks @ardumont ... so it appears that adapting the logic is easy... may you do it?
In T3213#63913, @rdicosmo wrote:Thanks @ardumont ... so it appears that adapting the logic is easy... may you do it?
@anlambert may you look into the needed modification of the UI, to enable the new type of save code now payloads for selected authenticated users?
In T3087#63791, @douardda wrote:So what about exports of the archive available on git-annex?
For the support of other origin visit types, @ardumont should know better than me how this could be integrated in the scheduler.
Apr 20 2021
So what about exports of the archive available on git-annex?
In T3084#63569, @douardda wrote:is there a grafana dashboard dedicated to this queue?
Apr 19 2021
D5556 landed and deployed to production, my bad for this, closing this again.
In T3234#63762, @rdicosmo wrote:Thanks, it is indeed an urgent matter, as various journals depend on this!
Thanks, it is indeed an urgent matter, as various journals depend on this!
In T3234#63747, @anlambert wrote:That's not a SWHID URL but rather the resolved browse one (here the trailing slash is part of the snapshot query parameter).
The SWHID resolver always produced the same browse URL without a trailing / so we are good here (I mean the 404 error was already raised prior to my changes to SWHID URLs).
That's not a SWHID URL but rather the resolved browse one (here the trailing slash is part of the snapshot query parameter).
It'd be best if we could distinguish such details directly from the listing view.
Well, it seems we have been hit by this again, in a different form:
Cool!
SWHIDs are now validated in each search input in production.
The fix has just been deployed to production; SWHID URLs no longer have a trailing slash.
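To illustrate the kind of check mentioned above, here is a minimal sketch of validating the core part of a SWHID typed into a search input. It is not the validator used by swh-web: the function name `is_valid_core_swhid` is hypothetical, and qualifiers (anything after the first `;`) are deliberately ignored rather than checked.

```python
import re

# Core SWHID syntax: swh:1:<object type>:<40 lowercase hex digits>.
# Object types: snapshot, release, revision, directory, content.
_CORE_SWHID_RE = re.compile(r"swh:1:(snp|rel|rev|dir|cnt):[0-9a-f]{40}")


def is_valid_core_swhid(text: str) -> bool:
    """Return True if the part of `text` before any ';' is a well-formed core SWHID.

    Hypothetical helper for illustration only; qualifiers are not validated.
    """
    core = text.strip().split(";", 1)[0]
    return _CORE_SWHID_RE.fullmatch(core) is not None


if __name__ == "__main__":
    sample = "swh:1:rev:" + "a" * 40
    print(is_valid_core_swhid(sample))        # well-formed identifier
    print(is_valid_core_swhid(sample + "/"))  # a trailing slash is rejected
```

With such a check, an identifier followed by a trailing slash is rejected at input time instead of surfacing later as a 404.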
In T3252#63370, @anlambert wrote:In T3252#63315, @zack wrote:Oh, and now that we have user profile pages, we should have a list of "my" save code now requests with their status visible in the user profile, for those who want to check synchronously the status of their requests (and might have disabled email notifications).
+1, great idea !
Now that we have authentication and authorization in place, and that Software Heritage ambassadors are coming, we can relax this constraint, allowing specific users the ability to trigger "save code now" also for .tar, .zip, packages etc.
is there a grafana dashboard dedicated to this queue?
Apr 16 2021
yes, great ;)
@rdicosmo great summary, I'm certainly on that page :)
Thanks to all of you for this discussion and proposals.
Great. In addition to the content of the free form field, the standard answer should contain proper boilerplate reminding what is expected in a Save Code Now request, along the lines of what is written in the "Help" tab of https://archive.softwareheritage.org/save/
thanks !
In T3252#63410, @ardumont wrote:
+1, can you create a task about it ? This could be handled by a GSOC student who chooses to
work on the webapp.
In T3252#63374, @zack wrote:but adding an email field (auto filled for registered users) to send a notification after the origin was loaded seems a good tradeoff. To implement the email notification, we will have to add a journal client in swh-web processing origin visit messages.
Adding an email field is a poor UX solution (it needs to be reentered every time or saved in a cookie) which we used for the vault at the time because we didn't have user registration.
Now that we have user registration we can just tell users that if they want to be notified, they should log in. (Which is indeed something independent from requiring user registration for being able to submit.) That will encourage users to register to get added-value functionalities, like notifications.
And then we should go back to all places that could use notifications (vault, save code now, deposit, "save again" button) and make things uniform.
but adding an email field (auto filled for registered users) to send a notification after the origin was loaded seems a good tradeoff. To implement the email notification, we will have to add a journal client in swh-web processing origin visit messages.
In T3252#63334, @ardumont wrote: As a first step towards giving more feedback to users who submitted wrong origins for
ingestion (e.g. organization links, tarballs with wrong visit type, link to html page
probably for listing, etc.), we could give the operator who rejects the origins a
free-form input field so they can explain the reason for the rejection. It'd be a less
brutal rejection.
This does not require the user registration part discussed above, nor does it exclude it.
Bonus point for this, it's an easy hack ;)
As an incremental step after that, we could turn that into a configurable selection box of
predefined rejection reasons, as I don't think there are that many different reasons
after all (unsupported for now, not an origin of type <type>, not a repository link,
...). Drawing stats from the first implementation could help in designing the initial
rejection templates.
Which could be another easy hack once the first part is done (if we want).
As suggested to @anlambert recently (@antoine, I gave it a bit more thought and added the
second incremental part since then, thus the ping ;)
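The two steps proposed above (a free-form note first, predefined reasons later) could be sketched roughly as follows. Everything here is hypothetical: the template keys, the wording, and the helper names are only derived from the example reasons listed in the comment and do not come from swh-web.

```python
from collections import Counter
from string import Template

# Hypothetical predefined reasons, based on the examples given in the comment.
REJECTION_TEMPLATES = {
    "unsupported": Template("This kind of origin is unsupported for now."),
    "wrong_type": Template("This is not an origin of type $visit_type."),
    "not_a_repository": Template("This is not a repository link."),
}


def rejection_message(reason: str, note: str = "", **params: str) -> str:
    """Build the message for a rejected request.

    `reason` selects a predefined template; `note` is the operator's
    optional free-form explanation appended after it.
    """
    message = REJECTION_TEMPLATES[reason].substitute(**params)
    return f"{message} {note}".strip()


def reason_stats(reasons: list[str]) -> Counter:
    """Count how often each reason was used, to help refine the templates."""
    return Counter(reasons)


if __name__ == "__main__":
    print(rejection_message("wrong_type", visit_type="git"))
    print(reason_stats(["unsupported", "wrong_type", "unsupported"]))
```

Keeping the free-form note alongside the template means the first step (free text only) stays usable once the selection box exists.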
In T3252#63315, @zack wrote:Oh, and now that we have user profile pages, we should have a list of "my" save code now requests with their status visible in the user profile, for those who want to check synchronously the status of their requests (and might have disabled email notifications).
In T3252#63314, @zack wrote:It would be desirable to provide the user with feedback that helps fix the issue.
Totally.
Now that we have a decent user registration system I think we should consider:
- requiring user registration for submitting save code now requests (which will also provide an audit trail for users that repeatedly submit bogus if not actively harmful requests)
- sending email notifications by default about the outcome of save code now requests, both successes and failures, with the possibility of disabling email notifications in the user profile
This will make the overall UX of interacting with the archive feel much more "reliable" for users, whereas right now it feels much like a leap of faith whether it will work or not, in good part due to the lack of systematic out-of-band notifications.
Deployed, so the queue is now consumed.
Well, that was simple enough.
As a first step towards giving more feedback to users who submitted wrong origins for
ingestion (e.g. organization links, tarballs with wrong visit type, link to html page
probably for listing, etc.), we could give the operator who rejects the origins a
free-form input field so they can explain the reason for the rejection. It'd be a less
brutal rejection.
Apr 15 2021
Oh, and now that we have user profile pages, we should have a list of "my" save code now requests with their status visible in the user profile, for those who want to check synchronously the status of their requests (and might have disabled email notifications).
It would be desirable to provide the user with feedback that helps fix the issue.
don't forget to count committers too
Let's go for it, then. May you take this over?
In T2912#63245, @vsellier wrote:This kind of journal client will be necessary in any case if we want to extend the usage of the counters for other perimeters (metadata count, origin per forge, ...)
I saw a parmap origin which got scheduled (la la la ;)
In T3084#63278, @ardumont wrote:Pushed, packaged, deployed.
The scheduler runner happily continues to schedule existing tasks and some new tasks with priority:
Apr 15 13:12:51 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 2084 tasks load-git
Apr 15 13:12:54 saatchi swh[234257]: INFO:swh.scheduler.cli.admin.runner:Scheduled 4128 tasks
Apr 15 13:14:06 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks load-pypi
Apr 15 13:14:06 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks load-git (priority)
...
That task got done almost immediately...
So there you go ;)
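For anyone wanting to check this from the logs rather than by eye, here is a small sketch that tallies the "Grabbed N tasks" lines per task type, separating priority tasks from regular ones. It only assumes the line format visible in the excerpt above; the function name `count_grabbed` is made up for the example.

```python
import re
from collections import Counter

# Matches e.g. "Grabbed 1 tasks load-git (priority)" as seen in the runner logs above.
_GRABBED_RE = re.compile(r"Grabbed (\d+) tasks (\S+)( \(priority\))?")


def count_grabbed(lines):
    """Return (regular, priority) counters of grabbed tasks per task type."""
    regular, priority = Counter(), Counter()
    for line in lines:
        match = _GRABBED_RE.search(line)
        if match is None:
            continue  # not a "Grabbed" line (e.g. the "Scheduled" summary)
        count, task_type, is_priority = int(match.group(1)), match.group(2), match.group(3)
        target = priority if is_priority else regular
        target[task_type] += count
    return regular, priority


if __name__ == "__main__":
    log = [
        "Apr 15 13:12:51 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 2084 tasks load-git",
        "Apr 15 13:12:54 saatchi swh[234257]: INFO:swh.scheduler.cli.admin.runner:Scheduled 4128 tasks",
        "Apr 15 13:14:06 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks load-pypi",
        "Apr 15 13:14:06 saatchi swh[234257]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks load-git (priority)",
    ]
    regular, priority = count_grabbed(log)
    print(dict(regular), dict(priority))
```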
I would really like to keep the author counter: how complex is it to add it?
In T2912#63242, @vsellier wrote:Staging webapp[1] and webapp1 on production [2] are now configured to use swh-counters to display the historical values and the live object counts.
Staging webapp[1] and webapp1 on production [2] are now configured to use swh-counters to display the historical values and the live object counts.
Deployment done on staging and production. The new counters are currently only activated on webapp1.
Apr 14 2021
Proper fix is in D5530.