
Indexer
Active · Public

Members

  • This project does not have any members.

Details

Description

Miners/indexers on objects contained in the Software Heritage archive.

Recent Activity

Tue, Feb 25

ardumont closed T1788: indexer-license: Investigate timeouts as Resolved by committing rDCIDXfc7a19e80874: storage.db: Improve content range queries to actually finish.
Tue, Feb 25, 11:28 AM · Indexer
ardumont closed D2709: idx.storage.db: Improve content range queries to actually finish.
Tue, Feb 25, 11:28 AM · Indexer
vlorentz accepted D2709: idx.storage.db: Improve content range queries to actually finish.
Tue, Feb 25, 11:07 AM · Indexer

Mon, Feb 24

ardumont updated the test plan for D2709: idx.storage.db: Improve content range queries to actually finish.
Mon, Feb 24, 2:23 PM · Indexer
ardumont added a revision to T1788: indexer-license: Investigate timeouts: D2709: idx.storage.db: Improve content range queries to actually finish.
Mon, Feb 24, 2:14 PM · Indexer

Thu, Feb 13

moranegg added a comment to T2270: Add to intrinsic metadata files to be indexed: AUTHORS and CONTRIBUTORS.

example: https://archive.softwareheritage.org/swh:1:cnt:a6463d2ce390990c31c2f2fa8019606721f0ca13;origin=http://git.savannah.gnu.org/git/gnugo.git/

Thu, Feb 13, 2:46 PM · Easy hack, Metadata workflow, Indexer

Tue, Feb 11

krithikvaidya closed T2258: Add type annotations to indexer classes, a subtask of T2257: Fully annotate swh-indexer with types, as Resolved.
Tue, Feb 11, 4:35 AM · Indexer
krithikvaidya closed T2258: Add type annotations to indexer classes as Resolved by committing rDCIDX5f49b59e6aa3: Add type annotations to indexer classes.
Tue, Feb 11, 4:35 AM · Easy hack, Indexer

Fri, Feb 7

vlorentz added a comment to T2270: Add to intrinsic metadata files to be indexed: AUTHORS and CONTRIBUTORS.

I think it qualifies, yes

Fri, Feb 7, 11:09 AM · Easy hack, Metadata workflow, Indexer
moranegg added a comment to T2270: Add to intrinsic metadata files to be indexed: AUTHORS and CONTRIBUTORS.

@vlorentz I'm not sure it is an easy hack, could you review the task and decide?

Fri, Feb 7, 11:03 AM · Easy hack, Metadata workflow, Indexer
moranegg triaged T2270: Add to intrinsic metadata files to be indexed: AUTHORS and CONTRIBUTORS as Normal priority.
Fri, Feb 7, 11:03 AM · Easy hack, Metadata workflow, Indexer

Tue, Feb 4

krithikvaidya added a revision to T2258: Add type annotations to indexer classes: D2622: Add type annotations to indexer classes.
Tue, Feb 4, 3:31 PM · Easy hack, Indexer
vlorentz added a comment to T2258: Add type annotations to indexer classes.

Yes, but you should just open a diff instead.

Tue, Feb 4, 12:15 PM · Easy hack, Indexer
krithikvaidya added a comment to T2258: Add type annotations to indexer classes.

And does Jenkins also run code in non-master branches through the pipeline?

Tue, Feb 4, 3:10 AM · Easy hack, Indexer

Mon, Feb 3

ardumont added a comment to T2258: Add type annotations to indexer classes.

and apologies for the delay.

Mon, Feb 3, 10:41 AM · Easy hack, Indexer

Sun, Feb 2

krithikvaidya added a comment to T2258: Add type annotations to indexer classes.

Thanks for the detailed reply 🙂, and apologies for the delay. Things are clearer now 👍

Sun, Feb 2, 4:09 AM · Easy hack, Indexer

Sat, Feb 1

twentyse7en added a comment to T2259: Add type annotations to metadata mappings.

hey, I would like to contribute.

Sat, Feb 1, 6:49 PM · Easy hack, Indexer
ardumont added a comment to T2258: Add type annotations to indexer classes.

The pytest tests are succeeding in the swh-indexer module, but failing in some other modules. Since this issue pertains to only the swh-indexer module, it shouldn't cause problems, right?

Sat, Feb 1, 10:42 AM · Easy hack, Indexer
krithikvaidya added a comment to T2258: Add type annotations to indexer classes.

Hi, I'd like to take up this issue as my first issue here :). But before I take it up, I just had a few queries:

Sat, Feb 1, 9:58 AM · Easy hack, Indexer

Wed, Jan 29

vlorentz removed a project from T2257: Fully annotate swh-indexer with types: Restricted Project.
Wed, Jan 29, 5:06 PM · Indexer
vlorentz removed a project from T2258: Add type annotations to indexer classes: Restricted Project.
Wed, Jan 29, 5:06 PM · Easy hack, Indexer
vlorentz removed a project from T2259: Add type annotations to metadata mappings: Restricted Project.
Wed, Jan 29, 5:06 PM · Easy hack, Indexer
vlorentz added a project to T2259: Add type annotations to metadata mappings: Easy hack.
Wed, Jan 29, 3:35 PM · Easy hack, Indexer
vlorentz triaged T2259: Add type annotations to metadata mappings as Low priority.
Wed, Jan 29, 3:35 PM · Easy hack, Indexer
vlorentz triaged T2258: Add type annotations to indexer classes as Low priority.
Wed, Jan 29, 3:34 PM · Easy hack, Indexer
vlorentz renamed T2257: Fully annotate swh-indexer with types from Fully annotate swh-index with types to Fully annotate swh-indexer with types.
Wed, Jan 29, 3:32 PM · Indexer
vlorentz triaged T2257: Fully annotate swh-indexer with types as Low priority.
Wed, Jan 29, 3:31 PM · Indexer

Jan 27 2020

vlorentz updated the task description for T1475: Test more edge cases of metadata indexer mappings.
Jan 27 2020, 4:41 PM · Easy hack, Indexer
vlorentz renamed T1475: Test more edge cases of metadata indexer mappings from Add more tests for edge cases of indexer mappings to Test more edge cases of metadata indexer mappings.
Jan 27 2020, 4:39 PM · Easy hack, Indexer
vlorentz added a project to T1475: Test more edge cases of metadata indexer mappings: Easy hack.
Jan 27 2020, 4:35 PM · Easy hack, Indexer

Jan 23 2020

ardumont added a comment to T1788: indexer-license: Investigate timeouts.

sample with our shiny sentry: https://sentry.softwareheritage.org/share/issue/f4a40625783b4a5588980005ddc5a5e6/

Jan 23 2020, 9:05 AM · Indexer

Jan 22 2020

vlorentz placed T1475: Test more edge cases of metadata indexer mappings up for grabs.
Jan 22 2020, 3:36 PM · Easy hack, Indexer

Jan 13 2020

vlorentz closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2127: Standalone Indexer Testing, as Resolved.
Jan 13 2020, 3:23 PM · Indexer, Sprint 2019/12 (Monitor and Conquer)

Dec 20 2019

vlorentz added a subtask for T2127: Standalone Indexer Testing: T2144: Define an architecture for end-to-end monitoring/testing.
Dec 20 2019, 3:09 PM · Indexer, Sprint 2019/12 (Monitor and Conquer)

Dec 3 2019

vlorentz lowered the priority of T2127: Standalone Indexer Testing from High to Normal.
Dec 3 2019, 5:50 PM · Indexer, Sprint 2019/12 (Monitor and Conquer)
vlorentz triaged T2127: Standalone Indexer Testing as High priority.
Dec 3 2019, 3:17 PM · Indexer, Sprint 2019/12 (Monitor and Conquer)

Dec 2 2019

olasd created T2127: Standalone Indexer Testing.
Dec 2 2019, 2:26 PM · Indexer, Sprint 2019/12 (Monitor and Conquer)

Nov 22 2019

vlorentz placed T1464: Auto-detect indexer tool versions instead of reading them from the config up for grabs.
Nov 22 2019, 10:54 AM · Indexer
vlorentz closed T861: mimetype indexer: edge case makes the indexer fail miserably as Resolved.

Fixed by D896.

Nov 22 2019, 10:54 AM · Indexer
vlorentz closed T861: mimetype indexer: edge case makes the indexer fail miserably, a subtask of T713: Index existing contents (mimetype, language, license), as Resolved.
Nov 22 2019, 10:54 AM · Indexer

Nov 14 2019

vlorentz added a comment to T1513: The indexer journal client is unstable.

Does it still happen? The journal client has changed a lot since this task was opened, including a switch of backend library.

Nov 14 2019, 12:27 PM · Indexer

Nov 8 2019

vlorentz closed T2060: Many rows in origin_intrinsic_metadata still do not have an origin_url as Resolved.

Fixed by @olasd

Nov 8 2019, 5:32 PM · Indexer

Nov 5 2019

vlorentz added a project to T2060: Many rows in origin_intrinsic_metadata still do not have an origin_url: Indexer.
Nov 5 2019, 1:59 PM · Indexer

Sep 30 2019

ardumont added a comment to T1788: indexer-license: Investigate timeouts.

That's the postgresql statement_timeout variable that we set for some methods on the storage backends.

Sep 30 2019, 5:47 PM · Indexer
olasd added a comment to T1788: indexer-license: Investigate timeouts.

We can investigate 2 things:

  • check postgresql options to kill queries that take too long (solely indexer-db right now) -> and find some way to report those
Sep 30 2019, 1:53 PM · Indexer

Sep 27 2019

ardumont added a comment to T1788: indexer-license: Investigate timeouts.

We can investigate 2 things:

  • check postgresql options to kill queries that take too long (solely indexer-db right now) -> and find some way to report those
  • push the proxy client storage idea (started within T1389 for the storage, currently wip) up to the indexer-storage.
Sep 27 2019, 9:53 AM · Indexer

Sep 24 2019

ardumont updated subscribers of T1788: indexer-license: Investigate timeouts.

Note: ... comments stack pop ... (-> been there a while apparently)

Sep 24 2019, 4:25 PM · Indexer

Sep 6 2019

ardumont added a comment to P520 [fixed] broken ci: index failure status.

In the end, it was missing data initialization steps.

Sep 6 2019, 9:43 AM · Indexer
ardumont updated the title for P520 [fixed] broken ci: index failure status from wip: broken ci: index failure status to [fixed] broken ci: index failure status.
Sep 6 2019, 9:42 AM · Indexer

Sep 5 2019

ardumont updated the title for P520 [fixed] broken ci: index failure status from failure in indexer to wip: broken ci: index failure status.
Sep 5 2019, 3:01 PM · Indexer