
Jan 8 2019

zack added a comment to T1464: Auto-detect indexer tool versions instead of reading them from the config.

Can you elaborate on how this would be implemented?

Jan 8 2019, 3:03 PM · Indexer
vlorentz claimed T1464: Auto-detect indexer tool versions instead of reading them from the config.
Jan 8 2019, 2:47 PM · Indexer
vlorentz updated subscribers of T1464: Auto-detect indexer tool versions instead of reading them from the config.
Jan 8 2019, 2:38 PM · Indexer
vlorentz added a parent task for T1464: Auto-detect indexer tool versions instead of reading them from the config: T861: mimetype indexer: edge case makes the indexer fail miserably.
Jan 8 2019, 2:37 PM · Indexer
vlorentz added a subtask for T861: mimetype indexer: edge case makes the indexer fail miserably: T1464: Auto-detect indexer tool versions instead of reading them from the config.
Jan 8 2019, 2:37 PM · Indexer
vlorentz triaged T1464: Auto-detect indexer tool versions instead of reading them from the config as Normal priority.
Jan 8 2019, 2:37 PM · Indexer
ardumont closed T1462: mimetype indexer: fails with TypeError: 'NoneType' object is not subscriptable as Resolved by committing rDCIDXa9cff246ba20: indexer: Fix type check on indexing result.
Jan 8 2019, 1:38 PM · Indexer
ardumont added a comment to T1462: mimetype indexer: fails with TypeError: 'NoneType' object is not subscriptable.

It is in D894 ;)

Jan 8 2019, 12:17 PM · Indexer
vlorentz added a comment to T1462: mimetype indexer: fails with TypeError: 'NoneType' object is not subscriptable.

hmm wait nvm, it should still be fixed

Jan 8 2019, 12:16 PM · Indexer
vlorentz added a comment to T1462: mimetype indexer: fails with TypeError: 'NoneType' object is not subscriptable.

Forget about it, it's directly caused by T861

Jan 8 2019, 12:15 PM · Indexer
ardumont added a revision to T1462: mimetype indexer: fails with TypeError: 'NoneType' object is not subscriptable: D894: indexer: Fix type check on indexing result.
Jan 8 2019, 12:12 PM · Indexer
zack renamed T1462: mimetype indexer: fails with TypeError: 'NoneType' object is not subscriptable from mimetype indexer: fix new error to mimetype indexer: fails with TypeError: 'NoneType' object is not subscriptable.
Jan 8 2019, 12:07 PM · Indexer
ardumont updated the task description for T1462: mimetype indexer: fails with TypeError: 'NoneType' object is not subscriptable.
Jan 8 2019, 12:06 PM · Indexer
ardumont triaged T1462: mimetype indexer: fails with TypeError: 'NoneType' object is not subscriptable as Normal priority.
Jan 8 2019, 12:03 PM · Indexer
vlorentz triaged T1456: Make metadata indexers support ranges as Normal priority.
Jan 8 2019, 11:38 AM · Indexer
vlorentz triaged T1455: Add a journal client that schedules oneshot tasks for metadata indexers as Normal priority.
Jan 8 2019, 11:37 AM · Indexer
vlorentz claimed T861: mimetype indexer: edge case makes the indexer fail miserably.
Jan 8 2019, 11:36 AM · Indexer

Dec 21 2018

vlorentz added a revision to T1327: Add Python metadata indexer: D879: Add Python PKG-INFO mapping..
Dec 21 2018, 2:57 PM · Restricted Project, Indexer

Dec 19 2018

ardumont added a comment to T803: Indexer - Retrieval error when contents is too big.

I suppose it all depends on the current storage's configuration.

Dec 19 2018, 10:21 AM · Indexer, Object storage

Dec 18 2018

ardumont added a comment to T803: Indexer - Retrieval error when contents is too big.

In the objstorage's pathslicing implementation, there is the get_stream implementation which is not used [1]

Dec 18 2018, 5:10 PM · Indexer, Object storage
vlorentz triaged T1448: Use swh.model.hashutil.MultiHash in swh.indexer.tests.test_utils.fill_storage as Wishlist priority.
Dec 18 2018, 4:45 PM · Easy hack, Indexer
vlorentz created T1448: Use swh.model.hashutil.MultiHash in swh.indexer.tests.test_utils.fill_storage.
Dec 18 2018, 4:45 PM · Easy hack, Indexer
vlorentz added a subtask for T803: Indexer - Retrieval error when contents is too big: T1446: Add support for slices in Storage.content_get.
Dec 18 2018, 4:35 PM · Indexer, Object storage

Dec 13 2018

vlorentz edited projects for T1433: Refactor output of indexer storage's `get` methods., added: Easy hack; removed Good first contribution.
Dec 13 2018, 3:07 PM · Easy hack, Indexer
vlorentz raised the priority of T1433: Refactor output of indexer storage's `get` methods. from Low to Normal.
Dec 13 2018, 1:56 PM · Easy hack, Indexer
vlorentz added a project to T1403: Document architecture of metadata mappings.: Documentation.
Dec 13 2018, 1:48 PM · Documentation, Indexer
vlorentz edited projects for T1433: Refactor output of indexer storage's `get` methods., added: Good first contribution; removed Easy hack.
Dec 13 2018, 12:33 PM · Easy hack, Indexer

Dec 6 2018

vlorentz updated the task description for T1433: Refactor output of indexer storage's `get` methods..
Dec 6 2018, 4:07 PM · Easy hack, Indexer
vlorentz updated the task description for T1433: Refactor output of indexer storage's `get` methods..
Dec 6 2018, 4:07 PM · Easy hack, Indexer
vlorentz triaged T1433: Refactor output of indexer storage's `get` methods. as Low priority.
Dec 6 2018, 4:04 PM · Easy hack, Indexer

Dec 4 2018

ardumont closed T1227: General improvments of the indexer: Schedule indexer tasks as Resolved.
Dec 4 2018, 11:47 AM · Indexer, Scheduling utilities
ardumont closed T1227: General improvments of the indexer: Schedule indexer tasks, a subtask of T359: Indexers: batch content analyzer infrastructure, as Resolved.
Dec 4 2018, 11:47 AM · Indexer, General
ardumont removed a subtask for T1227: General improvments of the indexer: Schedule indexer tasks: T1386: Refactor indexers' initialization step.
Dec 4 2018, 11:47 AM · Indexer, Scheduling utilities
ardumont removed a parent task for T1386: Refactor indexers' initialization step: T1227: General improvments of the indexer: Schedule indexer tasks.
Dec 4 2018, 11:47 AM · Indexer, Scheduling utilities
vlorentz added a parent task for T1386: Refactor indexers' initialization step: T1410: Kill implicit configuration: new configuration scheme.
Dec 4 2018, 11:29 AM · Indexer, Scheduling utilities
vlorentz lowered the priority of T1327: Add Python metadata indexer from Normal to Low.
Dec 4 2018, 11:23 AM · Restricted Project, Indexer
vlorentz lowered the priority of T1328: Add Ruby/Gem metadata indexer from Normal to Low.
Dec 4 2018, 11:23 AM · Restricted Project, Indexer

Dec 3 2018

vlorentz added a revision to T1384: Document indexer architecture / metadata pipeline: D747: Document the metadata workflow..
Dec 3 2018, 3:14 PM · Indexer, Documentation
vlorentz triaged T1403: Document architecture of metadata mappings. as Normal priority.
Dec 3 2018, 11:46 AM · Documentation, Indexer
vlorentz created T1403: Document architecture of metadata mappings..
Dec 3 2018, 11:45 AM · Documentation, Indexer

Nov 30 2018

vlorentz closed T1402: prefer codemeta properties for all metadata keys as Resolved.

Resolved by D758.

Nov 30 2018, 4:59 PM · Indexer, Metadata workflow
zack updated subscribers of T1402: prefer codemeta properties for all metadata keys.

This comes from a discussion between myself and @vlorentz. To summarize my position: I would like to be able to tell downstream users "the keys we use in our metadata are (a subset of) the codemeta ones". AFAICT that means that they should be able to look them up here: https://codemeta.github.io/terms/. As such, we should not use keys like schema:author, but simply author; similarly, we should not use codemeta:issueTracker but simply issueTracker.
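
To illustrate the renaming being asked for, here is a minimal sketch. The helper names `to_codemeta_key` and `normalize_keys`, and the `KNOWN_PREFIXES` set, are hypothetical and not taken from swh-indexer. It only shows the idea of dropping a `schema:` or `codemeta:` prefix so that keys match the bare terms listed on the codemeta terms page.

```python
# Hypothetical helpers, for illustration only; not the swh-indexer API.
KNOWN_PREFIXES = {"schema", "codemeta"}  # assumed set of prefixes to drop


def to_codemeta_key(key: str) -> str:
    """Return the bare term for a prefixed key, e.g. schema:author -> author."""
    prefix, sep, term = key.partition(":")
    if sep and prefix in KNOWN_PREFIXES:
        return term
    # Keys without a known prefix (including URLs) are left untouched.
    return key


def normalize_keys(metadata: dict) -> dict:
    """Apply to_codemeta_key to the top-level keys of a metadata dict."""
    return {to_codemeta_key(key): value for key, value in metadata.items()}
```

For example, `normalize_keys({"schema:author": "Jane", "codemeta:issueTracker": "https://example.org/issues"})` would yield a dict keyed by `author` and `issueTracker`. A real mapping would also have to decide what to do when two prefixed keys collapse to the same bare term, which this sketch ignores.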

Nov 30 2018, 2:47 PM · Indexer, Metadata workflow
vlorentz updated the task description for T1402: prefer codemeta properties for all metadata keys.
Nov 30 2018, 2:41 PM · Indexer, Metadata workflow
vlorentz triaged T1402: prefer codemeta properties for all metadata keys as Normal priority.
Nov 30 2018, 2:41 PM · Indexer, Metadata workflow

Nov 29 2018

vlorentz triaged T1397: Update swh-indexers/docs/dev-info.rst to remove orchestrator as Normal priority.
Nov 29 2018, 6:43 PM · Indexer, Documentation
vlorentz added a project to T1394: Make swh/indexer/tests/test_origin_metadata.py run faster.: Indexer.
Nov 29 2018, 10:24 AM · Indexer

Nov 28 2018

ardumont closed T1374: content indexer: Determine the identifier ranges to use to schedule those as Resolved.

As we have around 5 billion contents for now, I went for 100,000 ranges of ~50k contents each.
They have been scheduled and the content indexers are now consuming again.
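
The figures above can be checked with a short sketch. It assumes that content identifiers are 20-byte hashes spread roughly uniformly over the identifier space, and that ranges are made by cutting that space into equal slices. The function `split_id_space` is a hypothetical helper, not the code that was actually used to schedule the tasks.

```python
from typing import Iterator, Tuple


def split_id_space(nb_ranges: int, id_bytes: int = 20) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (start, end) inclusive bounds cutting the id space into equal slices.

    Assumption: identifiers are fixed-size byte strings compared as big-endian
    integers (20 bytes here, the size of a sha1 digest).
    """
    space = 2 ** (8 * id_bytes)
    step = space // nb_ranges
    for i in range(nb_ranges):
        start = i * step
        # The last slice absorbs the remainder of the integer division.
        end = space - 1 if i == nb_ranges - 1 else (i + 1) * step - 1
        yield start.to_bytes(id_bytes, "big"), end.to_bytes(id_bytes, "big")


# Expected number of contents per range, if ids are uniformly distributed.
contents_per_range = 5_000_000_000 // 100_000  # 50_000
```

With 100,000 slices over about 5 billion contents, each range covers about 50,000 contents on average, which matches the "~50k each" estimate. The actual count per range varies, since it depends on how the identifiers are distributed.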

Nov 28 2018, 12:33 PM · Indexer, Scheduling utilities
ardumont closed T1374: content indexer: Determine the identifier ranges to use to schedule those, a subtask of T1227: General improvments of the indexer: Schedule indexer tasks, as Resolved.
Nov 28 2018, 12:33 PM · Indexer, Scheduling utilities
ardumont closed T818: indexer DB should not use bytea for mimetype and encoding columns as Resolved.
Nov 28 2018, 12:26 PM · Storage manager, Indexer
ardumont closed T818: indexer DB should not use bytea for mimetype and encoding columns, a subtask of T1374: content indexer: Determine the identifier ranges to use to schedule those, as Resolved.
Nov 28 2018, 12:26 PM · Indexer, Scheduling utilities

Nov 27 2018

ardumont renamed T1277: swh-journal: Create a journal client for listing origin visits from swh-journal: Create a journal client for listing origins to swh-journal: Create a journal client for listing origin visits.
Nov 27 2018, 11:59 AM · Indexer, Journal
vlorentz added a subtask for T1385: Monitor output of metadata indexers: T359: Indexers: batch content analyzer infrastructure.
Nov 27 2018, 11:58 AM · Indexer
vlorentz added a parent task for T359: Indexers: batch content analyzer infrastructure: T1385: Monitor output of metadata indexers.
Nov 27 2018, 11:58 AM · Indexer, General
ardumont triaged T1386: Refactor indexers' initialization step as Normal priority.
Nov 27 2018, 11:58 AM · Indexer, Scheduling utilities
vlorentz triaged T1385: Monitor output of metadata indexers as High priority.
Nov 27 2018, 11:56 AM · Indexer
vlorentz triaged T1384: Document indexer architecture / metadata pipeline as Normal priority.
Nov 27 2018, 11:55 AM · Indexer, Documentation
ardumont added a subtask for T1374: content indexer: Determine the identifier ranges to use to schedule those: T818: indexer DB should not use bytea for mimetype and encoding columns.
Nov 27 2018, 11:53 AM · Indexer, Scheduling utilities
ardumont added a parent task for T818: indexer DB should not use bytea for mimetype and encoding columns: T1374: content indexer: Determine the identifier ranges to use to schedule those.
Nov 27 2018, 11:53 AM · Storage manager, Indexer
vlorentz closed T1375: Deploy revision metadata indexer as Resolved.
Nov 27 2018, 11:45 AM · Indexer
vlorentz closed T1375: Deploy revision metadata indexer, a subtask of T1324: Deploy metadata indexers in production, as Resolved.
Nov 27 2018, 11:45 AM · Indexer
vlorentz closed T1376: Deploy origin indexer, a subtask of T1324: Deploy metadata indexers in production, as Resolved.
Nov 27 2018, 11:45 AM · Indexer
vlorentz closed T1376: Deploy origin indexer as Resolved.
Nov 27 2018, 11:45 AM · Indexer
vlorentz closed T1324: Deploy metadata indexers in production as Resolved.
Nov 27 2018, 11:44 AM · Indexer
vlorentz closed T1324: Deploy metadata indexers in production, a subtask of T1227: General improvments of the indexer: Schedule indexer tasks, as Resolved.
Nov 27 2018, 11:44 AM · Indexer, Scheduling utilities
vlorentz closed T1326: metadata indexer: Deploy origin head as Resolved.
Nov 27 2018, 11:44 AM · Metadata workflow, Indexer
vlorentz closed T1326: metadata indexer: Deploy origin head, a subtask of T1324: Deploy metadata indexers in production, as Resolved.
Nov 27 2018, 11:44 AM · Indexer

Nov 21 2018

vlorentz added a revision to T1375: Deploy revision metadata indexer: D691: Deploy metadata indexers.
Nov 21 2018, 4:40 PM · Indexer
vlorentz added a revision to T1376: Deploy origin indexer: D691: Deploy metadata indexers.
Nov 21 2018, 4:40 PM · Indexer
ardumont updated the task description for T1324: Deploy metadata indexers in production.
Nov 21 2018, 4:30 PM · Indexer
ardumont triaged T1376: Deploy origin indexer as Normal priority.
Nov 21 2018, 4:30 PM · Indexer
vlorentz renamed T1324: Deploy metadata indexers in production from Deploy metadata indexer in production to Deploy metadata indexers in production.
Nov 21 2018, 4:29 PM · Indexer
ardumont updated the task description for T1324: Deploy metadata indexers in production.
Nov 21 2018, 4:28 PM · Indexer
ardumont triaged T1375: Deploy revision metadata indexer as Normal priority.
Nov 21 2018, 4:28 PM · Indexer
ardumont renamed T1324: Deploy metadata indexers in production from metadata indexer: Put them in production to Deploy metadata indexer in production.
Nov 21 2018, 4:22 PM · Indexer
vlorentz removed a parent task for T1326: metadata indexer: Deploy origin head: T1336: Deploy search over origin intrinsic metadata.
Nov 21 2018, 4:21 PM · Metadata workflow, Indexer
vlorentz added a parent task for T1324: Deploy metadata indexers in production: T1336: Deploy search over origin intrinsic metadata.
Nov 21 2018, 4:21 PM · Indexer
ardumont added a comment to T1326: metadata indexer: Deploy origin head.

And it's done:

Nov 21 2018, 4:20 PM · Metadata workflow, Indexer
ardumont renamed T991: Indexers: Send range of ids instead of list of ids from Indexers: Send range of ids instead of raw ids to Indexers: Send range of ids instead of list of ids.
Nov 21 2018, 3:26 PM · Indexer
ardumont triaged T1374: content indexer: Determine the identifier ranges to use to schedule those as Normal priority.
Nov 21 2018, 3:25 PM · Indexer, Scheduling utilities
ardumont closed T1310: Simplify indexer design: move away from the pipeline approach as Resolved.
Nov 21 2018, 3:23 PM · Indexer, Scheduling utilities
ardumont closed T1310: Simplify indexer design: move away from the pipeline approach, a subtask of T1227: General improvments of the indexer: Schedule indexer tasks, as Resolved.
Nov 21 2018, 3:23 PM · Indexer, Scheduling utilities
ardumont closed T991: Indexers: Send range of ids instead of list of ids, a subtask of T1326: metadata indexer: Deploy origin head, as Resolved.
Nov 21 2018, 3:23 PM · Metadata workflow, Indexer
ardumont closed T991: Indexers: Send range of ids instead of list of ids, a subtask of T1310: Simplify indexer design: move away from the pipeline approach, as Resolved.
Nov 21 2018, 3:23 PM · Indexer, Scheduling utilities
ardumont closed T991: Indexers: Send range of ids instead of list of ids as Resolved.
Nov 21 2018, 3:23 PM · Indexer
ardumont updated the task description for T1324: Deploy metadata indexers in production.
Nov 21 2018, 3:22 PM · Indexer
ardumont updated the task description for T1324: Deploy metadata indexers in production.
Nov 21 2018, 3:21 PM · Indexer
ardumont added a subtask for T1324: Deploy metadata indexers in production: T1326: metadata indexer: Deploy origin head.
Nov 21 2018, 3:21 PM · Indexer
ardumont added a parent task for T1326: metadata indexer: Deploy origin head: T1324: Deploy metadata indexers in production.
Nov 21 2018, 3:21 PM · Metadata workflow, Indexer
ardumont removed a parent task for T1324: Deploy metadata indexers in production: T1326: metadata indexer: Deploy origin head.
Nov 21 2018, 3:20 PM · Indexer
ardumont removed a subtask for T1326: metadata indexer: Deploy origin head: T1324: Deploy metadata indexers in production.
Nov 21 2018, 3:20 PM · Metadata workflow, Indexer
ardumont renamed T1326: metadata indexer: Deploy origin head from Deploy revision/origin metadata indexers to metadata indexer: Deploy origin head.
Nov 21 2018, 3:19 PM · Metadata workflow, Indexer
ardumont added a comment to T1326: metadata indexer: Deploy origin head.

Let me fix that.

Nov 21 2018, 3:19 PM · Metadata workflow, Indexer
ardumont added a comment to T1326: metadata indexer: Deploy origin head.

Actually, that's not a duplicate

yes, it is ;)

Nov 21 2018, 3:18 PM · Metadata workflow, Indexer
ardumont closed T1312: indexer: Adapt textual content indexer to actually filter textual content themselves as Resolved.

Only fossology_license_indexer needs that.

Nov 21 2018, 9:41 AM · Indexer, Scheduling utilities
ardumont closed T1312: indexer: Adapt textual content indexer to actually filter textual content themselves, a subtask of T1310: Simplify indexer design: move away from the pipeline approach, as Resolved.
Nov 21 2018, 9:41 AM · Indexer, Scheduling utilities

Nov 20 2018

ardumont added a comment to T991: Indexers: Send range of ids instead of list of ids.

By the way, status on this:

Nov 20 2018, 5:47 PM · Indexer
vlorentz added a parent task for T1326: metadata indexer: Deploy origin head: T1336: Deploy search over origin intrinsic metadata.
Nov 20 2018, 4:07 PM · Metadata workflow, Indexer
vlorentz removed a parent task for T1324: Deploy metadata indexers in production: T1336: Deploy search over origin intrinsic metadata.
Nov 20 2018, 4:07 PM · Indexer
vlorentz added a parent task for T991: Indexers: Send range of ids instead of list of ids: T1326: metadata indexer: Deploy origin head.
Nov 20 2018, 4:07 PM · Indexer
vlorentz added subtasks for T1326: metadata indexer: Deploy origin head: T1324: Deploy metadata indexers in production, T991: Indexers: Send range of ids instead of list of ids.
Nov 20 2018, 4:07 PM · Metadata workflow, Indexer