Differential D783

rm ctags mocks + add ctags to idx db + fix doc.
Closed, Public

Authored by vlorentz on Dec 5 2018, 5:05 PM.

Tags

None

Details

Maniphest Tasks

T1307: Remove mock storages used in tests.
T1432: Remove mock storages from the indexers

Commits

rDCIDXfb34e1aabb2a: rm ctags mocks + add ctags to idx db + fix doc.

Diff Detail

Repository

rDCIDX Metadata indexer

Lint

Automatic diff as part of commit; lint not applicable.

Unit

Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Remove useless var

vlorentz added a task: T1432: Remove mock storages from the indexers. Dec 5 2018, 5:07 PM

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/112/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/112/console

Harbormaster failed remote builds in B2891: Diff 2469! Dec 5 2018, 5:08 PM

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/114/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/114/console

Harbormaster failed remote builds in B2892: Diff 2470! Dec 5 2018, 5:12 PM

fix

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/118/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/118/console

Harbormaster failed remote builds in B2893: Diff 2471! Dec 5 2018, 5:24 PM

Update CommonContentIndexerTest to work with non-mocked storage.

Build is green
See https://jenkins.softwareheritage.org/job/DCIDX/job/tox/122/ for more details.

Harbormaster completed remote builds in B2894: Diff 2472. Dec 5 2018, 5:33 PM

ardumont added inline comments. Dec 5 2018, 5:35 PM

swh/indexer/storage/in_memory.py
169	This is not pointless. This is an implementation detail of the indexer storage. I expected the multiple ctags implementations (universal, exuberant, etc.) to be idempotent in their computations (and still do). So in the indexer storage, the function that adds this data simply ignores the conflicting rows (which should be exactly the same as before). In the end, only read operations are expected when we pass over the same content again. Why would we pass over the same content, you might ask? Because not so long ago, the indexers were a pipeline, so adding a new indexer would have triggered such behavior: the orchestrator would have broadcast the same contents again to the new indexer, and possibly to the other indexers as well. As it's an implementation detail, in theory you could implement this as you wish here, as long as the tests are fine ;)
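The "ignore the conflicting data" behavior described in this comment can be sketched in memory roughly as follows. This is only an illustration under stated assumptions: the class name `InMemoryCtags`, the `add` method, the `writes` counter, and the row fields are all hypothetical and are not taken from `swh/indexer/storage/in_memory.py`.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical key shape: content id, symbol name, kind, line, lang, tool id.
Key = Tuple[bytes, str, str, int, str, int]


class InMemoryCtags:
    """Toy in-memory store that ignores conflicting rows on insert."""

    def __init__(self) -> None:
        self._rows: Dict[Key, dict] = {}
        self.writes = 0  # number of rows actually stored

    def add(self, rows: Iterable[dict]) -> None:
        for row in rows:
            key = (row["id"], row["name"], row["kind"],
                   row["line"], row["lang"], row["tool_id"])
            if key in self._rows:
                # Same tool, same content: the data is assumed to be
                # identical, so the conflicting row is dropped unwritten.
                continue
            self._rows[key] = dict(row)
            self.writes += 1


store = InMemoryCtags()
row = {"id": b"\x01", "name": "main", "kind": "function",
       "line": 3, "lang": "C", "tool_id": 1}
store.add([row])
store.add([row])  # second pass over the same content
print(store.writes)
```

With this shape, a second pass over already-indexed content only performs lookups, which is the read-only behavior the comment expects.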

vlorentz marked an inline comment as done. Dec 5 2018, 5:56 PM

vlorentz added inline comments.

swh/indexer/storage/in_memory.py
169	> I expected the multiple ctags implementations (universal, exuberant, etc...) to be idempotent in their computations (still do).

ctags implementations are registered as tools, and rows from different tools do not conflict with each other: `create unique index on content_ctags(id, hash_sha1(name), kind, line, lang, indexer_configuration_id);`
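To see why the tool id in the unique index keeps different tools from conflicting, here is a small sketch using Python's standard `sqlite3` module as a stand-in for the real PostgreSQL database. The table layout and the index name `content_ctags_idx` are assumptions, and `hash_sha1(name)` is replaced by the plain `name` column because SQLite has no such function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table content_ctags ("
    " id blob, name text, kind text, line integer, lang text,"
    " indexer_configuration_id integer)"
)
# SQLite stand-in for the index quoted above (assumption: plain name
# column instead of hash_sha1(name)).
conn.execute(
    "create unique index content_ctags_idx on content_ctags"
    "(id, name, kind, line, lang, indexer_configuration_id)"
)

insert = "insert into content_ctags values (?, ?, ?, ?, ?, ?)"
symbol = (b"\x01", "main", "function", 3, "C")

conn.execute(insert, symbol + (1,))  # tool 1
conn.execute(insert, symbol + (2,))  # tool 2: same symbol, no conflict

try:
    conn.execute(insert, symbol + (1,))  # tool 1 again: conflict
    conflicted = False
except sqlite3.IntegrityError:
    conflicted = True

row_count = conn.execute("select count(*) from content_ctags").fetchone()[0]
print(row_count, conflicted)
```

Only the repeated insert for the same tool violates the index; the row from the second tool is stored alongside the first.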

ardumont added inline comments. Dec 6 2018, 10:07 AM

swh/indexer/storage/in_memory.py
169	> as tools, and rows from different tools do not conflict with each other

Yes, I did that. I should have avoided the reference to multiple implementations; that's noise. I mentioned both because I tested both (or more?), and both gave me the same result given the same input (independently from each other). What I meant was that for the case `same tool, same content`, the computed data is the same. So the SQL insertion function will simply drop the conflicting data. There will be no merge, as there is not supposed to be any divergent new data. The other supposed gain here is that there are no write operations with this approach, so it's supposedly faster (we'd need metrics to confirm that ;). This is as opposed to what you proposed, which would always write. Like I said early on, it's an implementation detail. Hoping this is clearer.
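The `same tool, same content` case can be illustrated with a short sketch, again using Python's standard `sqlite3` module rather than the real PostgreSQL function. SQLite's `insert or ignore` is used here as an assumed analogue of an insertion that drops conflicting rows; the table layout is hypothetical and `hash_sha1(name)` is simplified to the plain `name` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table content_ctags ("
    " id blob, name text, kind text, line integer, lang text,"
    " indexer_configuration_id integer,"
    " unique (id, name, kind, line, lang, indexer_configuration_id))"
)

row = (b"\x01", "main", "function", 3, "C", 1)
insert_ignore = "insert or ignore into content_ctags values (?, ?, ?, ?, ?, ?)"

conn.execute(insert_ignore, row)  # first pass: the row is written
after_first = conn.total_changes

conn.execute(insert_ignore, row)  # second pass: the conflicting row is dropped
after_second = conn.total_changes

# Rows changed by the second pass over the same tool and content.
second_pass_writes = after_second - after_first
stored = conn.execute("select count(*) from content_ctags").fetchone()[0]
print(second_pass_writes, stored)
```

Whether skipping the write is actually faster than always writing is not shown here; as the comment says, that would need measurement.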

Remove todo

Build is green
See https://jenkins.softwareheritage.org/job/DCIDX/job/tox/124/ for more details.

Harbormaster completed remote builds in B2895: Diff 2473. Dec 6 2018, 2:37 PM

vlorentz mentioned this in D784: Add content_fossology_license_{get,add} to the mem idx storage. Dec 6 2018, 3:35 PM

vlorentz added a child revision: D784: Add content_fossology_license_{get,add} to the mem idx storage. Dec 6 2018, 4:30 PM

vlorentz mentioned this in D785: Start removing mocks for the mimetype indexer. Dec 6 2018, 4:44 PM

vlorentz added a child revision: D785: Start removing mocks for the mimetype indexer. Dec 6 2018, 4:47 PM

vlorentz mentioned this in D786: Add content_fossology_license_get_range + Remove mock from non-range tests. Dec 6 2018, 4:52 PM

vlorentz added a child revision: D786: Add content_fossology_license_get_range + Remove mock from non-range tests. Dec 6 2018, 4:52 PM

ardumont accepted this revision. Dec 6 2018, 9:24 PM

This revision is now accepted and ready to land. Dec 6 2018, 9:24 PM

squash

Closed by commit rDCIDXfb34e1aabb2a: rm ctags mocks + add ctags to idx db + fix doc. (authored by vlorentz). Dec 7 2018, 10:04 AM

This revision was automatically updated to reflect the committed changes.

Harbormaster failed remote builds in B2903: Diff 2481! Dec 7 2018, 10:05 AM

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/132/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/132/console

vlorentz added a task: T1307: Remove mock storages used in tests. Dec 7 2018, 5:13 PM

Revision Contents
Changeset List

Path

Size

swh/

indexer/

storage/

4 lines

175 lines

tests/

storage/

test_in_memory.py

24 lines

8 lines

Diff 2482

swh/indexer/storage/__init__.py

swh/indexer/storage/in_memory.py

swh/indexer/tests/storage/test_in_memory.py

swh/indexer/tests/test_ctags.py