Software Heritage
Feed All Stories

Oct 21 2022

swh-sentry-integration assigned T4651: indexer/extrinsic_metadata: ParseError: syntax error: line 1, column 0 to vlorentz.
Oct 21 2022, 5:29 PM · Indexer
douardda committed rDSTOfe0eaee8718b: Make the replayer not crash on kafka messages that fail to be converted as… (authored by douardda).
Make the replayer not crash on kafka messages that fail to be converted as…
Oct 21 2022, 4:59 PM
douardda closed D8751: Make the replayer not crash on kafka messages that fail to be converted as model objects.
Oct 21 2022, 4:59 PM
douardda committed rDSTO242e37a7be35: Add a comment that should have been "kept" from 850a7553b (authored by douardda).
Add a comment that should have been "kept" from 850a7553b
Oct 21 2022, 4:59 PM
swh-public-ci added a comment to D8751: Make the replayer not crash on kafka messages that fail to be converted as model objects.

Build is green

Oct 21 2022, 4:59 PM
douardda updated the diff for D8751: Make the replayer not crash on kafka messages that fail to be converted as model objects.

rebase

Oct 21 2022, 4:43 PM
swh-public-ci added a comment to D8751: Make the replayer not crash on kafka messages that fail to be converted as model objects.

Build is green

Oct 21 2022, 4:38 PM
douardda closed D8705: writer: make calling flush() in write_addition(s)() optional.
Oct 21 2022, 4:31 PM
douardda committed rDJNL7c83ee11ecab: writer: make calling flush() in write_addition(s)() optional (authored by douardda).
writer: make calling flush() in write_addition(s)() optional
Oct 21 2022, 4:31 PM
douardda closed D8697: tests: make the kafka_server_base fixture a function scoped one.
Oct 21 2022, 4:31 PM
douardda committed rDJNL04a84dba6a17: tests: make the kafka_server_base fixture a function scoped one (authored by douardda).
tests: make the kafka_server_base fixture a function scoped one
Oct 21 2022, 4:31 PM
swh-public-ci added a comment to D8705: writer: make calling flush() in write_addition(s)() optional.

Build is green

Oct 21 2022, 4:30 PM
douardda updated the diff for D8705: writer: make calling flush() in write_addition(s)() optional.

typos

Oct 21 2022, 4:26 PM
douardda updated the diff for D8751: Make the replayer not crash on kafka messages that fail to be converted as model objects.

fix typos as reported by anlambert

Oct 21 2022, 4:22 PM
swh-public-ci added a comment to D8705: writer: make calling flush() in write_addition(s)() optional.

Build is green

Oct 21 2022, 4:22 PM
douardda updated the diff for D8705: writer: make calling flush() in write_addition(s)() optional.

Add the missing docstring entry

Oct 21 2022, 4:19 PM
douardda added a comment to D8751: Make the replayer not crash on kafka messages that fail to be converted as model objects.

Looks good to me, just noticed some remaining typos: one in code and another in commit message (s/sctructure/structure).

Oct 21 2022, 4:10 PM
ardumont updated subscribers of T4538: Consider archiving NAR hashes.

fwiw, I've iterated a bit over @zimoun's code and pushed it into the snippets repository
(see commits above and their commit description message). It's also able to deal with
git, hg and svn trees (ignoring their respective top metadata folder .git, .svn, ...
without impacting the performance).
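
A minimal sketch of the filtering idea described above, assuming (as the "Only filter on the first level" commit title suggests) that only the top-level metadata folders are skipped. The name iter_tree and the VCS_DIRS set are hypothetical and not taken from the snippets repository, and no NAR hash is computed here:

```python
import os
import tempfile

# Hypothetical set of VCS metadata folders to skip at the top level only.
VCS_DIRS = {".git", ".hg", ".svn"}


def iter_tree(root):
    """Yield file paths relative to root, pruning VCS folders found directly under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            # Editing dirnames in place tells os.walk not to descend into them.
            dirnames[:] = [d for d in dirnames if d not in VCS_DIRS]
        for name in filenames:
            yield os.path.relpath(os.path.join(dirpath, name), root)


# Small demo tree: a top-level .git is ignored, a nested one is kept.
with tempfile.TemporaryDirectory() as root:
    for rel in (".git/config", "README", "src/a.py", "src/.git/x"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    kept = sorted(p.replace(os.sep, "/") for p in iter_tree(root))
```

Pruning during the walk, rather than filtering paths afterwards, avoids reading the metadata folder at all, which is one plausible reason the filtering would not impact performance.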

Oct 21 2022, 3:46 PM · SVN Loader, Tarball loader, Nixguix loader
anlambert accepted D8751: Make the replayer not crash on kafka messages that fail to be converted as model objects.

Looks good to me, just noticed some remaining typos: one in code and another in commit message (s/sctructure/structure).

Oct 21 2022, 3:42 PM
douardda requested review of D8751: Make the replayer not crash on kafka messages that fail to be converted as model objects.
Oct 21 2022, 2:39 PM
olasd added a comment to T2513: Copy metadata on revisions to the extrinsic metadata storage.

I've relaunched the latest version of the migrate_extrinsic_metadata script on getty...

Oct 21 2022, 1:43 PM · Metadata workflow, Roadmap 2020
ardumont committed rDSNIP36e3d021495d: nixguix/nar: Only filter on the first level (authored by ardumont).
nixguix/nar: Only filter on the first level
Oct 21 2022, 12:41 PM
anlambert closed D8749: utils/highlightjs: Do not report pygments exception to sentry.
Oct 21 2022, 12:29 PM
anlambert committed rDWAPPSa2526ef5bb45: utils/highlightjs: Do not report pygments exception to sentry (authored by anlambert).
utils/highlightjs: Do not report pygments exception to sentry
Oct 21 2022, 12:29 PM
ardumont committed rDSNIP9c5d75a6e74f: nixguix: Make nar ignore .hg and .svn folders as well (authored by ardumont).
nixguix: Make nar ignore .hg and .svn folders as well
Oct 21 2022, 12:16 PM
ardumont committed rDSNIPf3979ba25e52: nixguix: Document a bit the nar class (authored by ardumont).
nixguix: Document a bit the nar class
Oct 21 2022, 12:16 PM
ardumont committed rDSNIP6724ceccd446: nixguix: Make nar ignore .git folder like the guix hash command (authored by ardumont).
nixguix: Make nar ignore .git folder like the guix hash command
Oct 21 2022, 12:16 PM
ardumont committed rDSNIPd50f518335ee: nixguix: Make nar cli consistent with guix hash cli interface (authored by ardumont).
nixguix: Make nar cli consistent with guix hash cli interface
Oct 21 2022, 12:16 PM
ardumont committed rDSNIP58d139c71b58: nixguix: Add nar.py from @zimoun (authored by ardumont).
nixguix: Add nar.py from @zimoun
Oct 21 2022, 12:16 PM
anlambert added a project to T4648: Introduce RPM Loader: RPM loader.
Oct 21 2022, 11:38 AM · RPM loader
anlambert edited projects for T4448: Implementation of Fedora Lister, added: RPM lister; removed Lister.
Oct 21 2022, 11:38 AM · RPM lister, Archive coverage
anlambert created RPM loader.
Oct 21 2022, 11:38 AM
anlambert created RPM lister.
Oct 21 2022, 11:37 AM
anlambert accepted D8566: Conda: Anaconda packages archive loader.

Looks good to me, thanks!

Oct 21 2022, 11:35 AM
anlambert closed T4646: CVSProtocolError: Error from CVS server: b"E cvs checkout: Skipping `$Log$' keyword due to excessive comment leader... as Resolved by committing rDLDCVSd00badc39fa3: cvsclient: Do not abort checkout when skipping $Log$ keyword expansion.
Oct 21 2022, 10:45 AM · CVS loader
anlambert closed D8750: cvsclient: Do not abort checkout when skipping $Log$ keyword expansion.
Oct 21 2022, 10:45 AM
anlambert committed rDLDCVSd00badc39fa3: cvsclient: Do not abort checkout when skipping $Log$ keyword expansion (authored by anlambert).
cvsclient: Do not abort checkout when skipping $Log$ keyword expansion
Oct 21 2022, 10:45 AM
ardumont accepted D8750: cvsclient: Do not abort checkout when skipping $Log$ keyword expansion.
Oct 21 2022, 10:42 AM
swh-public-ci added a comment to D8566: Conda: Anaconda packages archive loader.

Build is green

Oct 21 2022, 10:26 AM
franckbret updated the summary of D8566: Conda: Anaconda packages archive loader.
Oct 21 2022, 10:25 AM
franckbret updated the diff for D8566: Conda: Anaconda packages archive loader.

Manage case where author or last_update is empty

Oct 21 2022, 10:22 AM
jayeshv triaged T4650: Improve the search result UI as Normal priority.
Oct 21 2022, 10:19 AM · Archive search
bchauvet renamed T4648: Introduce RPM Loader from Introduce RPM Lister to Introduce RPM Loader.
Oct 21 2022, 10:18 AM · RPM loader
bchauvet added a parent task for T4648: Introduce RPM Loader: Unknown Object (Maniphest Task).
Oct 21 2022, 10:17 AM · RPM loader
jayeshv closed T4613: Generalize and simplify the query language, a subtask of T3952: Make the search query language a first class citizen, as Invalid.
Oct 21 2022, 10:16 AM · meta-task, Roadmap 2022, Archive search
jayeshv closed T4613: Generalize and simplify the query language as Invalid.
Oct 21 2022, 10:16 AM · Archive search
bchauvet edited parent tasks for T4448: Implementation of Fedora Lister, added: Unknown Object (Maniphest Task); removed: Unknown Object (Maniphest Task).
Oct 21 2022, 9:55 AM · RPM lister, Archive coverage
KShivendu updated subscribers of T4648: Introduce RPM Loader.
Oct 21 2022, 9:40 AM · RPM loader
KShivendu triaged T4648: Introduce RPM Loader as Normal priority.
Oct 21 2022, 9:40 AM · RPM loader

Oct 20 2022

seirl edited P1502 GRPC python client example.
Oct 20 2022, 5:14 PM
seirl created P1502 GRPC python client example.
Oct 20 2022, 5:13 PM
swh-public-ci added a comment to D8747: conda: Yield listed origins after all artifacts in a page are processed.

Build is green

Oct 20 2022, 3:45 PM
anlambert updated the diff for D8747: conda: Yield listed origins after all artifacts in a page are processed.

Add missing test

Oct 20 2022, 3:40 PM
anlambert retitled D8747: conda: Yield listed origins after all artifacts in a page are processed from conda: Yield listed origins after all artifacts in a page processed to conda: Yield listed origins after all artifacts in a page are processed.
Oct 20 2022, 3:39 PM
vsellier committed R261:40fc6712eec3: fix the template directory name (authored by vsellier).
fix the template directory name
Oct 20 2022, 3:35 PM
vsellier committed R261:042fab806d7a: Test the gitlab issues templating (authored by vsellier).
Test the gitlab issues templating
Oct 20 2022, 3:34 PM
anlambert requested changes to D8569: Add rubygems loader.

@Alphare, fyi I improved the rubygems lister in that commit in order to gather all artifacts related to a gem and send this info to the loader as extra arguments.
Below is an example of the lister output for a gem:

ListedOrigin(
    url="https://rubygems.org/gems/haar_joke",
    visit_type="rubygems",
    last_update=iso8601.parse_date("2016-11-05T00:00:00+00:00"),
    extra_loader_arguments={
        "artifacts": [
            {
                "url": "https://rubygems.org/downloads/haar_joke-0.0.2.gem",
                "length": 8704,
                "version": "0.0.2",
                "filename": "haar_joke-0.0.2.gem",
                "checksums": {
                    "sha256": "85a8cf5f41890e9605265eeebfe9e99aa0350a01a3c799f9f55a0615a31a2f5f"
                },
            },
            {
                "url": "https://rubygems.org/downloads/haar_joke-0.0.1.gem",
                "length": 8704,
                "version": "0.0.1",
                "filename": "haar_joke-0.0.1.gem",
                "checksums": {
                    "sha256": "a2ee7052fb8ffcfc4ec0fdb77fae9a36e473f859af196a36870a0f386b5ab55e"
                },
            },
        ],
        "rubygem_metadata": [
            {
                "date": "2016-11-05T00:00:00+00:00",
                "authors": "Gemma Gotch",
                "version": "0.0.2",
                "extrinsic_metadata_url": "https://rubygems.org/api/v2/rubygems/haar_joke/versions/0.0.2.json",
            },
            {
                "date": "2016-07-23T00:00:00+00:00",
                "authors": "Gemma Gotch",
                "version": "0.0.1",
                "extrinsic_metadata_url": "https://rubygems.org/api/v2/rubygems/haar_joke/versions/0.0.1.json",
            },
        ],
    },
)

This makes it possible to improve the scheduling of loading tasks for Ruby gems (by providing a last_update value to ListedOrigin),
and it will save you a couple of calls to the RubyGems Web API in the loader to fetch the list of versions for a gem.
The loader implementation must therefore be adapted to use these new arguments.
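
As a rough illustration of how a loader might consume these extra arguments, the sketch below pairs each artifact with the metadata entry that shares its version. The helper name merge_by_version is hypothetical and is not part of the rubygems loader; the data is abridged from the lister output above:

```python
from typing import Any, Dict, List


def merge_by_version(
    artifacts: List[Dict[str, Any]],
    rubygem_metadata: List[Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Pair each artifact with the metadata entry that has the same version."""
    metadata_by_version = {m["version"]: m for m in rubygem_metadata}
    merged = {}
    for artifact in artifacts:
        version = artifact["version"]
        # Fall back to an empty dict if no metadata was sent for this version.
        merged[version] = {**artifact, **metadata_by_version.get(version, {})}
    return merged


# Abridged copies of the two lists shown in the lister output.
artifacts = [
    {"url": "https://rubygems.org/downloads/haar_joke-0.0.2.gem",
     "version": "0.0.2", "filename": "haar_joke-0.0.2.gem"},
    {"url": "https://rubygems.org/downloads/haar_joke-0.0.1.gem",
     "version": "0.0.1", "filename": "haar_joke-0.0.1.gem"},
]
rubygem_metadata = [
    {"date": "2016-11-05T00:00:00+00:00", "authors": "Gemma Gotch", "version": "0.0.2"},
    {"date": "2016-07-23T00:00:00+00:00", "authors": "Gemma Gotch", "version": "0.0.1"},
]

releases = merge_by_version(artifacts, rubygem_metadata)
```

With the two lists joined this way, each release carries its download URL, filename, date and authors without any extra call to list the versions of the gem.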

Oct 20 2022, 3:25 PM
anlambert requested review of D8750: cvsclient: Do not abort checkout when skipping $Log$ keyword expansion.
Oct 20 2022, 3:22 PM
jayeshv updated the task description for T4647: GraphQL: Make cursors truly opaque.
Oct 20 2022, 3:21 PM · GraphQL API
anlambert added a revision to T4646: CVSProtocolError: Error from CVS server: b"E cvs checkout: Skipping `$Log$' keyword due to excessive comment leader...: D8750: cvsclient: Do not abort checkout when skipping $Log$ keyword expansion.
Oct 20 2022, 3:20 PM · CVS loader
jayeshv triaged T4647: GraphQL: Make cursors truly opaque as Normal priority.
Oct 20 2022, 3:19 PM · GraphQL API
douardda added a comment to D8697: tests: make the kafka_server_base fixture a function scoped one.

I don't see the difference comparing the master build or the diff build times but why not ;)

Oct 20 2022, 3:18 PM
olasd accepted D8696: tests: simplify and (possibly) fix the grpc_server helper context manager.

Woohoo, green tests!

Oct 20 2022, 3:18 PM
olasd added a comment to D8705: writer: make calling flush() in write_addition(s)() optional.

...ah... maybe, it's for other tests in another repository, right?

Oct 20 2022, 3:18 PM
anlambert triaged T4646: CVSProtocolError: Error from CVS server: b"E cvs checkout: Skipping `$Log$' keyword due to excessive comment leader... as Normal priority.
Oct 20 2022, 3:18 PM · CVS loader
olasd updated subscribers of D8705: writer: make calling flush() in write_addition(s)() optional.

@vlorentz mentions that this is missing a docstring change

Oct 20 2022, 3:18 PM
ardumont accepted D8749: utils/highlightjs: Do not report pygments exception to sentry.
Oct 20 2022, 3:18 PM
olasd added a comment to D8697: tests: make the kafka_server_base fixture a function scoped one.

Yeah, I'm a bit surprised too that this would decrease overall test times, and seems like a latent bug, but this at least doesn't seem to /increase/ test times in this module, so it's probably fine?

Oct 20 2022, 3:18 PM
vsellier placed T4646: CVSProtocolError: Error from CVS server: b"E cvs checkout: Skipping `$Log$' keyword due to excessive comment leader... up for grabs.
Oct 20 2022, 3:18 PM · CVS loader
ardumont added a comment to D8705: writer: make calling flush() in write_addition(s)() optional.

If it's for test speedup only, don't you want to keep the actual behavior instead?
That is, make the auto_flush be False by default, and then you explicitly set it up to True in the calling tests.

I believe that auto_flush=True is the current default, so the diff does what you're asking :-)

I'd be keen on having a warning, at least when we __del__ the object and some deliveries are still pending, because even if not perfect, that'd be the sign of a bug.
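
The behaviour under discussion can be sketched with a toy writer. This is not the swh.journal implementation: the class name BufferedWriter and the send callback are assumptions made for illustration, and only the auto_flush parameter and the write_addition(s)()/flush() names come from the diff title and the comments above:

```python
import warnings
from typing import Any, Callable, Iterable, List


class BufferedWriter:
    """Toy stand-in for a journal writer with an optional flush on each write."""

    def __init__(self, send: Callable[[Any], None], auto_flush: bool = True):
        self._send = send
        self._pending: List[Any] = []
        # True keeps the historical behaviour: flush after every write call.
        self.auto_flush = auto_flush

    def write_addition(self, obj: Any) -> None:
        self._pending.append(obj)
        if self.auto_flush:
            self.flush()

    def write_additions(self, objs: Iterable[Any]) -> None:
        self._pending.extend(objs)
        if self.auto_flush:
            self.flush()

    def flush(self) -> None:
        # Deliver buffered messages in the order they were written.
        while self._pending:
            self._send(self._pending.pop(0))

    def __del__(self):
        # Warn if the writer is destroyed while deliveries are still pending.
        pending = getattr(self, "_pending", [])
        if pending:
            warnings.warn(
                f"{len(pending)} message(s) were never flushed", ResourceWarning
            )


delivered: List[Any] = []
writer = BufferedWriter(delivered.append, auto_flush=False)
writer.write_addition("a")
writer.write_additions(["b", "c"])
before_flush = list(delivered)  # nothing has been sent yet
writer.flush()
```

Keeping auto_flush=True as the default preserves the existing behaviour for callers that do not opt out, while tests can pass auto_flush=False and flush once at the end.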

Oct 20 2022, 3:18 PM
olasd accepted D8705: writer: make calling flush() in write_addition(s)() optional.

If it's for test speedup only, don't you want to keep the actual behavior instead?
That is, make the auto_flush be False by default, and then you explicitly set it up to True in the calling tests.

Oct 20 2022, 3:18 PM
swh-sentry-integration assigned T4646: CVSProtocolError: Error from CVS server: b"E cvs checkout: Skipping `$Log$' keyword due to excessive comment leader... to vsellier.
Oct 20 2022, 3:18 PM · CVS loader
ardumont added a comment to D8705: writer: make calling flush() in write_addition(s)() optional.

If it's for test speedup only, don't you want to keep the actual behavior instead?
That is, make the auto_flush be False by default, and then you explicitly set it up to True in the calling tests.

Oct 20 2022, 3:18 PM
douardda added a comment to D8747: conda: Yield listed origins after all artifacts in a page are processed.

shouldn't this fix come with a test of some sort?

Oct 20 2022, 3:18 PM
douardda added a comment to D8748: Nuget: Implement incremental listing.

it's unclear to me whether this actually implements the iteration protocol described in the API doc (https://learn.microsoft.com/en-us/nuget/api/catalog-resource#iterating-over-catalog-items) or not.

Oct 20 2022, 3:18 PM
douardda added a comment to D8747: conda: Yield listed origins after all artifacts in a page are processed.

I believe the commit title lacks a verb, doesn't it? ("conda: Yield listed origins after all artifacts in a page are processed" or something similar?)

Oct 20 2022, 3:18 PM
douardda added a comment to D8748: Nuget: Implement incremental listing.

it's unclear to me whether this actually implements the iteration protocol described in the API doc (https://learn.microsoft.com/en-us/nuget/api/catalog-resource#iterating-over-catalog-items) or not.

Oct 20 2022, 3:18 PM
olasd committed rDSNIPb0425cf33b1e: takedowns: new code dump :-( (authored by olasd).
takedowns: new code dump :-(
Oct 20 2022, 3:18 PM
olasd archived Test tag please ignore.
Oct 20 2022, 3:18 PM
ardumont accepted D8697: tests: make the kafka_server_base fixture a function scoped one.

I don't see the difference comparing the master build or the diff build times but why not ;)

Oct 20 2022, 3:18 PM

Oct 19 2022

gitlab-migration changed the status of T4639: Deploy swh-scrubber v0.1.1, a subtask of T4527: scrubber: keep a state file for postgresql datastores, from Resolved to Migrated.
Oct 19 2022, 6:08 PM · Datastore Scrubber
gitlab-migration changed the status of T4639: Deploy swh-scrubber v0.1.1 from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · System administration, Datastore Scrubber
gitlab-migration closed T4626: staging: ingest openbsd.org cvs forge, a subtask of T3691: Implement CVS loader, as Migrated.
Oct 19 2022, 6:08 PM · CVS loader, Archive coverage
gitlab-migration closed T4626: staging: ingest openbsd.org cvs forge as Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · System administration, Archive coverage
gitlab-migration closed T4625: staging: ingest netbsd.org cvs forge as Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · System administration, Archive coverage
gitlab-migration closed T4621: staging: Autoscaling of the indexers not properly configured as Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · System administration
gitlab-migration closed T4625: staging: ingest netbsd.org cvs forge, a subtask of T3691: Implement CVS loader, as Migrated.
Oct 19 2022, 6:08 PM · CVS loader, Archive coverage
gitlab-migration changed the status of T4618: Migrate large workers (worker17-18) to elastic workers from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · System administration
gitlab-migration changed the status of T4615: jenkins: Unstuck buster-swh image build from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · System administration
gitlab-migration changed the status of T4614: Deploy swh-search v0.16.4, a subtask of T4599: Github descriptions are not used to search origins, from Resolved to Migrated.
Oct 19 2022, 6:08 PM · Metadata workflow, Archive search
gitlab-migration changed the status of T4614: Deploy swh-search v0.16.4 from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · System administration, Archive search
gitlab-migration closed T4612: Most indexers are consuming journal topics slower than messages are produced as Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · Indexer, System administration
gitlab-migration changed the status of T4607: GraphQL: staging - Deploy version v0.0.6 from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · System administration, GraphQL API
gitlab-migration changed the status of T4610: upgrade staging instance to 15.4, a subtask of T2221: Development workflow & code quality, from Resolved to Migrated.
Oct 19 2022, 6:08 PM · meta-task, Roadmap 2020
gitlab-migration changed the status of T4610: upgrade staging instance to 15.4 from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · System administration
gitlab-migration closed T4606: Deploy swh-indexer v2.7.0, a subtask of T4457: Index metadata from Gitea/Gogs, as Migrated.
Oct 19 2022, 6:08 PM · Origin-Gitea/Gogs, Extrinsic metadata, Indexer
gitlab-migration closed T4606: Deploy swh-indexer v2.7.0, a subtask of T2064: Add metadata from deposits to metadata search, as Migrated.
Oct 19 2022, 6:08 PM · Metadata workflow
gitlab-migration closed T4606: Deploy swh-indexer v2.7.0, a subtask of T4605: Deploy swh-loader-metadata v0.0.3, as Migrated.
Oct 19 2022, 6:08 PM · Metadata Loaders, System administration
gitlab-migration closed T4606: Deploy swh-indexer v2.7.0, a subtask of T4401: Index metadata from the deposit, as Migrated.
Oct 19 2022, 6:08 PM · SWORD deposit, Indexer, Metadata workflow
gitlab-migration closed T4606: Deploy swh-indexer v2.7.0 as Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · System administration, Indexer