Feed All Stories

Oct 28 2021

ardumont closed D6543: docker: Install scheduler runner which deals with recurrent tasks.
Oct 28 2021, 2:22 PM
ardumont committed rDENV6823cd881262: docker: Install scheduler runner which deals with recurrent tasks (authored by ardumont).
docker: Install scheduler runner which deals with recurrent tasks
Oct 28 2021, 2:22 PM
ardumont closed D6575: loader: Let logging instructions do the formatting.
Oct 28 2021, 2:21 PM
ardumont committed rDLDSVN1573529ad487: loader: Let logging instructions do the formatting (authored by ardumont).
loader: Let logging instructions do the formatting
Oct 28 2021, 2:21 PM
vlorentz accepted D6565: Pass the object_type to JournalClient.value_serializer().
Oct 28 2021, 2:12 PM
swh-public-ci added a comment to D6565: Pass the object_type to JournalClient.value_serializer().

Build is green

Oct 28 2021, 2:12 PM
douardda updated the diff for D6565: Pass the object_type to JournalClient.value_serializer().

^x^s...

Oct 28 2021, 2:08 PM
swh-public-ci added a comment to D6565: Pass the object_type to JournalClient.value_serializer().

Build is green

Oct 28 2021, 2:07 PM
vlorentz added a comment to D6576: loader: Rename start_from_scratch parameter to incremental.

What about "incremental", for consistency with the listers' terminology?

Oct 28 2021, 2:04 PM
vlorentz accepted D6575: loader: Let logging instructions do the formatting.
Oct 28 2021, 2:03 PM
douardda updated the diff for D6565: Pass the object_type to JournalClient.value_serializer().

typo

Oct 28 2021, 2:03 PM
vlorentz accepted D6543: docker: Install scheduler runner which deals with recurrent tasks.
Oct 28 2021, 2:02 PM
anlambert added a comment to D6572: interface: Add origin_snapshot_get_all method.

I'd rather name it origin_snapshot_get_all, what do you think?

I also hesitated to use that name, I used origin_snapshot_get to match naming of other methods in storage interface but I agree origin_snapshot_get_all is more explicit about what the method does, will rename then.

Oct 28 2021, 2:01 PM
swh-public-ci added a comment to D6565: Pass the object_type to JournalClient.value_serializer().

Build is green

Oct 28 2021, 2:00 PM
jayeshv triaged T3696: Code cleanup as Normal priority.
Oct 28 2021, 1:42 PM · Web app
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCHadd752a3cef2: Updated backport on buster-swh from debian/0.19.0-1_swh1 (unstable-swh) (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Updated backport on buster-swh from debian/0.19.0-1_swh1 (unstable-swh)
Oct 28 2021, 1:17 PM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCH0722eba630ab: Merge tag 'debian/0.19.0-1_swh1' into debian/buster-swh (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Merge tag 'debian/0.19.0-1_swh1' into debian/buster-swh
Oct 28 2021, 1:17 PM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCH83d25c5194ce: Updated debian changelog for version 0.19.0 (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Updated debian changelog for version 0.19.0
Oct 28 2021, 1:15 PM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCHe78e56d34447: Update upstream source from tag 'debian/upstream/0.19.0' (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Update upstream source from tag 'debian/upstream/0.19.0'
Oct 28 2021, 1:15 PM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCH7de5d27c5d4f: pristine-tar data for swh-scheduler_0.19.0.orig.tar.gz (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
pristine-tar data for swh-scheduler_0.19.0.orig.tar.gz
Oct 28 2021, 1:15 PM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCHae3c3c6f4e95: New upstream version 0.19.0 (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
New upstream version 0.19.0
Oct 28 2021, 1:15 PM
ardumont closed D6520: Add a new cli endpoint to schedule recurrent visits in Celery.
Oct 28 2021, 1:10 PM
ardumont committed rDSCH50d7fd7ff49b: Add a new cli endpoint to schedule recurrent visits in Celery (authored by olasd).
Add a new cli endpoint to schedule recurrent visits in Celery
Oct 28 2021, 1:10 PM
swh-public-ci added a comment to D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

Build is green

Oct 28 2021, 1:09 PM
ardumont requested review of D6576: loader: Rename start_from_scratch parameter to incremental.
Oct 28 2021, 1:07 PM
ardumont updated the diff for D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

Rebase

Oct 28 2021, 1:07 PM
ardumont added a revision to T3695: Investigate revision reconstruction discrepancy with subversion export: D6576: loader: Rename start_from_scratch parameter to incremental.
Oct 28 2021, 1:06 PM · SVN Loader
swh-public-ci added a comment to D6396: Implement maven jar source files loader.

Build is green

Oct 28 2021, 12:59 PM
ardumont requested review of D6575: loader: Let logging instructions do the formatting.
Oct 28 2021, 12:58 PM
borisbaldassari updated the diff for D6396: Implement maven jar source files loader.
  • maven-lister: Fix copyright in header
Oct 28 2021, 12:57 PM
ardumont renamed T3695: Investigate revision reconstruction discrepancy with subversion export from loading svn origin while ignoring history raises to loading some svn origin while ignoring history sometimes raises.
Oct 28 2021, 12:49 PM · SVN Loader
ardumont added a comment to T3695: Investigate revision reconstruction discrepancy with subversion export.

Ah, it's not for all origins though...
I tried with other origins which demonstrate the same issue [1] and they did not fail...

Oct 28 2021, 12:48 PM · SVN Loader
ardumont added inline comments to D6395: lister: Add new maven lister.
Oct 28 2021, 12:47 PM
swh-public-ci added a comment to D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

Build is green

Oct 28 2021, 12:44 PM
ardumont updated the diff for D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

Drop unused period parameter in cli...

Oct 28 2021, 12:42 PM
ardumont updated the diff for D6543: docker: Install scheduler runner which deals with recurrent tasks.

Drop spurious change

Oct 28 2021, 12:39 PM
ardumont added inline comments to D6543: docker: Install scheduler runner which deals with recurrent tasks.
Oct 28 2021, 12:38 PM
vlorentz added inline comments to D6543: docker: Install scheduler runner which deals with recurrent tasks.
Oct 28 2021, 12:37 PM
douardda accepted D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

Overall it looks very good to me. There is room for improvement, for sure, but let's put this to work and see how it performs.

Oct 28 2021, 12:36 PM
ardumont updated the test plan for D6574: scheduler: Add schedule recurrent tasks service.
Oct 28 2021, 12:34 PM
ardumont updated the diff for D6574: scheduler: Add schedule recurrent tasks service.

Drop no longer defined --period 10 call for non default runner.

Oct 28 2021, 12:34 PM
vlorentz added a comment to D6554: [WIP] Add a (redis-based) validation error reporting facility.

Finally, having to implement the computation of the key like that is a bit depressing...
(who said "SWHID everywhere!"?)

Oct 28 2021, 12:33 PM
Harbormaster failed remote builds in B24770: Diff 23890 for D6520: Add a new cli endpoint to schedule recurrent visits in Celery!
Oct 28 2021, 12:33 PM
swh-public-ci added a comment to D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

Build has FAILED

Oct 28 2021, 12:33 PM
ardumont updated the diff for D6520: Add a new cli endpoint to schedule recurrent visits in Celery.
  • Adapt according to suggestions
  • drop unused --period flag in the cli
Oct 28 2021, 12:30 PM
olasd added a comment to D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

What's not clear in the description that needs rework so it's clearer?

What isn't clear to me is why we need to implement this, because the scheduler already does it.

Oct 28 2021, 12:18 PM
ardumont added a comment to D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

What's not clear in the description that needs rework so it's clearer?

What isn't clear to me is why we need to implement this, because the scheduler already does it.

Oct 28 2021, 12:18 PM
vlorentz added a comment to D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

nvm, @olasd just explained to me this used to be done manually

Oct 28 2021, 12:17 PM
vlorentz added a comment to D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

What's not clear in the description that needs rework so it's clearer?

Oct 28 2021, 12:16 PM
ardumont updated the diff for D6543: docker: Install scheduler runner which deals with recurrent tasks.

Drop --period 10 which is unused.

Oct 28 2021, 12:16 PM
Harbormaster failed remote builds in B24768: Diff 23888 for D6565: Pass the object_type to JournalClient.value_serializer()!
Oct 28 2021, 12:15 PM
swh-public-ci added a comment to D6565: Pass the object_type to JournalClient.value_serializer().

Build has FAILED

Oct 28 2021, 12:15 PM
vlorentz accepted D6572: interface: Add origin_snapshot_get_all method.

I'd rather name it origin_snapshot_get_all, what do you think?

Oct 28 2021, 12:12 PM
douardda updated the diff for D6565: Pass the object_type to JournalClient.value_serializer().

Document a bit more the value_deserializer and add a test for it

Oct 28 2021, 12:11 PM
vlorentz added a comment to D6570: Remove now useless revision date checker in fixer.

You checked on the whole journal, right?

Oct 28 2021, 12:07 PM
ardumont added a comment to D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

I don't really understand what this is for

Oct 28 2021, 12:07 PM
vlorentz added inline comments to D6569: Add a --type option to 'swh storage replay'.
Oct 28 2021, 12:06 PM
vlorentz added a comment to D6520: Add a new cli endpoint to schedule recurrent visits in Celery.

I don't really understand what this is for, but lgtm.

Oct 28 2021, 12:01 PM
vlorentz added a comment to D6395: lister: Add new maven lister.

You use three similar but subtly different conditionals:
Could you somehow unify them?

No, I don't think so, or not without risking losing entries.

Basically we parse the big index file, and as we go through it, some entries can be yielded immediately (jars), while others need some post-treatment (scm), mainly for uniqueness because we have a *lot* of duplicates here. So we store them during the parsing, and only *afterwards* do we start yielding the scm's.

If we get interrupted during the first part (index parsing), we need to be able to yield again jars from where we stopped (but at this point in time we still haven't yielded any scm). If we get interrupted during the second part (scm post-treatment), all jars have already been yielded and we only need to yield the scm. Interestingly if we are interrupted in-between (say we finished parsing the index, yielded all jars but not scm), we still need to re-read the index to re-build the list of unique scm's and then yield them.

So IMHO we really need 2 different counters, and the slightly different conditionals. I may be missing something obvious as I've been struggling with that for some time, do you think it could be improved or optimised?
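
The two-counter scheme described above can be sketched roughly as follows. This is only a minimal illustration, not the lister's actual code: `ListerState`, `list_origins`, the counter names and the `(kind, url)` entry shape are all hypothetical, and the index is assumed to be re-readable in the same order on every run.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Dict, Iterable, Iterator, Tuple

Entry = Tuple[str, str]  # (kind, url); kind is assumed to be "jar" or "scm"


@dataclass
class ListerState:
    """Hypothetical checkpoint kept between runs (names are illustrative)."""

    last_jar_position: int = 0  # index position of the last jar handed out
    last_scm_position: int = 0  # rank of the last unique scm handed out


def list_origins(index: Iterable[Entry], state: ListerState) -> Iterator[Entry]:
    # A dict keeps insertion order, so scm ranks are stable across runs.
    unique_scm: Dict[str, None] = {}

    # Phase 1: always re-read the whole index so the scm set is rebuilt,
    # but only yield jars located past the first checkpoint.
    for position, (kind, url) in enumerate(index, start=1):
        if kind == "scm":
            unique_scm.setdefault(url, None)
        elif position > state.last_jar_position:
            state.last_jar_position = position  # checkpoint, then hand out
            yield ("jar", url)

    # Phase 2: yield deduplicated scm urls past the second checkpoint.
    for rank, url in enumerate(unique_scm, start=1):
        if rank > state.last_scm_position:
            state.last_scm_position = rank
            yield ("scm", url)


if __name__ == "__main__":
    sample_index = [
        ("jar", "a.jar"),
        ("scm", "git://x"),
        ("jar", "b.jar"),
        ("scm", "git://x"),  # duplicate, dropped in phase 2
        ("scm", "git://y"),
    ]
    state = ListerState()
    # Simulate an interruption after two entries, then resume with the same state.
    first = list(islice(list_origins(sample_index, state), 2))
    rest = list(list_origins(sample_index, state))
    print(first + rest)
```

Whatever the point of interruption, the resumed run re-reads the index to rebuild the scm set, skips the jars already handed out, and continues from the scm rank it had reached, which is why the two conditionals differ slightly.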

Oct 28 2021, 11:51 AM
ardumont added a comment to D6395: lister: Add new maven lister.

why? What's the blocking point?

Oct 28 2021, 11:45 AM
douardda added inline comments to D6565: Pass the object_type to JournalClient.value_serializer().
Oct 28 2021, 11:42 AM
vlorentz requested changes to D6565: Pass the object_type to JournalClient.value_serializer().
Oct 28 2021, 11:38 AM
ardumont updated the diff for D6574: scheduler: Add schedule recurrent tasks service.

Drop unused variables

Oct 28 2021, 11:38 AM
ardumont accepted D6564: Do call consumer.commit() even if not objects have been received.

ok

Oct 28 2021, 11:37 AM
vlorentz accepted D6564: Do call consumer.commit() even if not objects have been received.

"ok then"

Oct 28 2021, 11:36 AM
ardumont added a reviewer for D6574: scheduler: Add schedule recurrent tasks service: System administrators.
Oct 28 2021, 11:35 AM
ardumont requested review of D6574: scheduler: Add schedule recurrent tasks service.
Oct 28 2021, 11:35 AM
ardumont added a revision to T3667: Orchestrate origins scheduling according to scheduler metrics feedback: D6574: scheduler: Add schedule recurrent tasks service.
Oct 28 2021, 11:35 AM · System administration, Sprint 2021 01, Archive coverage, Scheduling utilities
douardda requested review of D6571: Add support for a redis-based reporting for invalid mirrorred objects.

asking for review even if tests are expected to fail because it depends on D6565 (in swh-journal)

Oct 28 2021, 11:26 AM
douardda added a comment to D6564: Do call consumer.commit() even if not objects have been received.

why if i may ask?

Oct 28 2021, 11:21 AM
ardumont accepted D6396: Implement maven jar source files loader.

One copyright header remark inline though.

Oct 28 2021, 11:10 AM
ardumont accepted D6395: lister: Add new maven lister.

Well, fine to me.
I'll let @vlorentz have the final word.

Oct 28 2021, 11:07 AM
ardumont added 1 blocking reviewer(s) for D6395: lister: Add new maven lister: Reviewers.
Oct 28 2021, 11:06 AM
ardumont removed reviewers for D6395: lister: Add new maven lister: Reviewers, ardumont.
Oct 28 2021, 11:06 AM
ardumont added a comment to D6570: Remove now useless revision date checker in fixer.

A small description to mention why it's now useless would be great.

Oct 28 2021, 11:00 AM
ardumont requested review of D6573: Migrate openvpn page to the sysadm documentation.
Oct 28 2021, 10:56 AM
ardumont renamed T3695: Investigate revision reconstruction discrepancy with subversion export from loading svn origins from scratch raise to loading svn origin while ignoring history raises.
Oct 28 2021, 10:49 AM · SVN Loader
ardumont added a revision to T3154: sysadm docs: Move relevant and public doc from intranet to swh-docs: D6573: Migrate openvpn page to the sysadm documentation.
Oct 28 2021, 10:46 AM · System administration, Documentation
ardumont committed rDDOCa9918626c530: Update onboarding/outboarding page with the intended audience (authored by ardumont).
Update onboarding/outboarding page with the intended audience
Oct 28 2021, 10:45 AM
zack closed D6567: FAQ: point to developer setup + minor fixes.
Oct 28 2021, 10:40 AM
zack committed rDDOC7e3d259a1598: FAQ: point to developer setup + minor fixes (authored by zack).
FAQ: point to developer setup + minor fixes
Oct 28 2021, 10:40 AM
swh-public-ci added a comment to D6567: FAQ: point to developer setup + minor fixes.

Build is green

Oct 28 2021, 10:35 AM
zack updated the diff for D6567: FAQ: point to developer setup + minor fixes.

rebase

Oct 28 2021, 10:24 AM
douardda requested review of D6570: Remove now useless revision date checker in fixer.
Oct 28 2021, 10:23 AM
douardda requested review of D6569: Add a --type option to 'swh storage replay'.
Oct 28 2021, 10:22 AM
ardumont added a comment to T3694: Investigate svn loading failure.

For this one, I had a thought about it recently, and it should be enough to trigger a
run from scratch (the loader has a flag for it, which is tested but was never used).

Oct 28 2021, 10:13 AM · SVN Loader
ardumont triaged T3695: Investigate revision reconstruction discrepancy with subversion export as Normal priority.
Oct 28 2021, 10:12 AM · SVN Loader
ardumont added a comment to T3694: Investigate svn loading failure.

So I've had a look in the end [1]; it's indeed the history-altered issue (2.).

Oct 28 2021, 9:58 AM · SVN Loader
ardumont closed D6568: Migrate the dns setup page into its own page.
Oct 28 2021, 9:40 AM
ardumont committed rDDOCe6b0c5010ca5: Migrate the dns setup page into its own page (authored by ardumont).
Migrate the dns setup page into its own page
Oct 28 2021, 9:40 AM
ardumont added a comment to T3694: Investigate svn loading failure.

Off the top of my head, 2 possible issues:

  1. svn external properties on the origin. That will get detected by the loader and raise.

Oct 28 2021, 9:36 AM · SVN Loader
ardumont triaged T3694: Investigate svn loading failure as Normal priority.
Oct 28 2021, 9:33 AM · SVN Loader

Oct 27 2021

swh-public-ci added a comment to D6395: lister: Add new maven lister.

Build is green

Oct 27 2021, 11:15 PM
borisbaldassari updated the diff for D6395: lister: Add new maven lister.
  • maven-lister: Fix README, 88 chars, minor typos
Oct 27 2021, 11:12 PM
borisbaldassari added a comment to D6395: lister: Add new maven lister.

You use three similar but subtly different conditionals:
Could you somehow unify them?

No, I don't think so, or not without risking losing entries.

Oct 27 2021, 11:07 PM
borisbaldassari updated subscribers of D6395: lister: Add new maven lister.

Thanks for the great work.
This looks mostly ok to me.

Thanks, it's appreciated. :-)

Oct 27 2021, 10:33 PM
borisbaldassari added a comment to D6395: lister: Add new maven lister.

I can't manage to test the full toolchain (maven lister - maven loader),

why? What's the blocking point?

I've found a great doc to run the lister in a docker-dev environment, and couldn't find anything for the loader.
Well, actually the real pain point is I could not finish the execution of the lister yet as it takes days.

Oct 27 2021, 10:26 PM
swh-public-ci added a comment to D6396: Implement maven jar source files loader.

Build is green

Oct 27 2021, 8:36 PM
borisbaldassari updated the diff for D6396: Implement maven jar source files loader.
  • maven-loader: fixes after ardumont's review in D6396
Oct 27 2021, 8:34 PM
anlambert requested review of D6572: interface: Add origin_snapshot_get_all method.
Oct 27 2021, 6:52 PM