Details

Reviewers

vlorentz

Group Reviewers

Reviewers

Commits

rDCIDX12801f81f2fb: tests/utils: Fix various test data model issues failing validation

Summary

Following the latest data model validation, we have our test data which is not passing those.

This is to fix those:

Fix wrong b'target_type' in revision dict input
Add missing snapshot 'target_type' key field
Add missing revision {type, committer_date, message}
Remove unnecessary fields from origins (lister, project)
Fix directory entries data
Fix initialization data in various metadata integration tests

Test Plan

tox

Diff Detail

Repository

rDCIDX Metadata indexer

Branch

master

Lint

No Linters Available

Unit

No Unit Test Coverage

Build Status

Buildable 7690
Build 11043: tox-on-jenkins	Jenkins
Build 11042: arc lint + arc unit

Event Timeline

ardumont created this revision.Sep 4 2019, 4:21 PM

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/607/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/607/console

Harbormaster failed remote builds in B7667: Diff 6591!Sep 4 2019, 4:23 PM

ardumont planned changes to this revision.Sep 4 2019, 4:24 PM

vlorentz added inline comments.Sep 5 2019, 11:11 AM

swh/indexer/tests/utils.py
166–170	nitpick: use the same formatting as the author date
189–190	the var name should be `DIRECTORY_ENTRIES`, btw

tests/utils: Fix directory data
test_metadata: Fix directory setup
tests/utils: Fix indentation
tests/utils: Fix variable name

This tries to fix the test data to pass the new validation. Sadly, this is not
enough to make the tests green again as some side-effects were actually making
the tests work. Trying to untangle what's running wrong is somehow tedious and
not that simple.

ardumont planned changes to this revision.Sep 5 2019, 2:44 PM

ardumont marked 2 inline comments as done.

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/609/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/609/console

Harbormaster failed remote builds in B7677: Diff 6601!Sep 5 2019, 2:44 PM

fix typo

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/610/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/610/console

Harbormaster failed remote builds in B7678: Diff 6602!Sep 5 2019, 2:47 PM

ardumont planned changes to this revision.Sep 5 2019, 2:49 PM

grrrr, current status: P520

test_metadata: Fix the content id to the one actually used

printing all around... the incongruity came up:

##### dir-ls:  <generator object Storage.directory_ls at 0x7fe96f6716d8>
##### files:  [{'status': 'visible', 'sha1': b'abc', 'sha1_git': b'abc', 'sha256': b'abc', 'length': 1029, 'name': b'index.js', 'type': 'file', 'target': b'abc', 'perms': 33188, 'dir_id': b'10'}, {'status': 'visible', 'sha1': b'aab', 'sha1_git': b'aab', 'sha256': b'aab', 'length': 1029, 'name': b'package.json', 'type': 'file', 'target': b'aab', 'perms': 33188, 'dir_id': b'10'}]
##### detected-files {'NpmMapping': [b'aab']}
############### context:  NpmMapping
############### files [b'aab']
############### metadata  dict_keys([(b'cde', -9039153642638674709)]) # <- indexer storage in memory state

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/611/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/611/console

Harbormaster failed remote builds in B7687: Diff 6611!Sep 5 2019, 5:41 PM

ardumont planned changes to this revision.Sep 5 2019, 5:41 PM

ardumont added inline comments.Sep 5 2019, 5:53 PM

swh/indexer/tests/test_metadata.py
1183	I made the `test_metadata.py` ok by using this id which was what ended being looked up ... but i did not follow through why this needs to be changed...

fix typo
test_metadata: Fix the content id to the one actually used
test_origin_metadata: Make tests green

As mentioned in one of the commits, i did not care for the duplication (i aimed at fixing those tests). I'll have to check again in a bit but not right now.

@vlorentz please have a look to check there is nothing strikingly wrong already ;)

Build is green
See https://jenkins.softwareheritage.org/job/DCIDX/job/tox/612/ for more details.

Harbormaster completed remote builds in B7690: Diff 6614.Sep 5 2019, 6:31 PM

ardumont edited the summary of this revision. (Show Details)Sep 6 2019, 9:37 AM

ardumont added a child revision: D1959: Publish origin_intrinsic_metadata to Kafka..Sep 6 2019, 9:39 AM

ardumont added inline comments.Sep 6 2019, 9:44 AM

swh/indexer/tests/test_metadata.py
1183	Context of what i am saying is P520.

ardumont retitled this revision from tests/utils: Fix various test data model issues to indexer: ci: broken master build: Fix various test/utils data model issues (following the new model validation delivery).Sep 6 2019, 9:45 AM

vlorentz requested changes to this revision.Sep 6 2019, 10:36 AM

vlorentz added inline comments.

swh/indexer/tests/test_metadata.py
1183	This is caused by:
swh/indexer/tests/utils.py
206–215	because of this, directory_ls used to return `{'target': b'aab', .., 'sha1': b'cde'}` (`b'aab'` being a a sha1_git).
578–581	But, this new code adds `{'target': b'aab', .., 'sha1': b'aab'}`; so the sha1 changed. You should not populate `storage.content_add` from `DIRECTORY_ENTRIES`. Add a new constant (with the same data that was deleted) instead.

This revision now requires changes to proceed.Sep 6 2019, 10:36 AM

ardumont added inline comments.Sep 6 2019, 11:07 AM

swh/indexer/tests/utils.py
578–581	But, this new code adds {'target': b'aab', .., 'sha1': b'aab'}; so the sha1 changed. yes, i hear you. But, I remember i tried to add a constants in the objstorage (b'aab', b'abc') first but it failed the same. The lookup of the sha1 stopped being b'cde' (within the indexer run) when removing the extra fields in the DIRECTORY_ENTRIES (necessary to pass the validation). Looks like as you said some misbehavior in the in-memory storage somehow...
578–581	The actual fix is to target b'cde' for that particular entry... (then i don't have to fix any other failing tests)

vlorentz added inline comments.Sep 6 2019, 11:09 AM

swh/indexer/tests/utils.py
578–581	No. The actual fix is to use `storage.content_add([{'target': b'aab', .., 'sha1': b'cde'}])`

ardumont added inline comments.Sep 6 2019, 11:15 AM

swh/indexer/tests/utils.py
578–581	Well, i mean the real fix would be to untangle this, this is not maintainable as is. It's not clear where what is coming from where. As a first approximation though, just changing the less cogs possible (only the requisite from the validation introduction) to have the tests green as before is the way. Then improving. And No. The actual fix is to use storage.content_add([{'target': b'aab', .., 'sha1': b'cde'}]) Even though correct, that does not add up to clarity...

Or let's use actual hashes instead of placeholders like 'aab' or 'cde', so it doesn't break again when we add hash validation in the model.

just changing the less cogs possible

That's what I'm suggesting. Move the extra fields from DIRECTORY_ENTRIES to a new CONTENT var.

tests/utils: Fix various test data model issues failing validation

Because of the test data were wrongly structured.
The validation check introduced made it apparent.

Summary:

Fix wrong b'target_type' in revision dict input
Add missing snapshot 'target_type' key field
Add missing revision {type, committer_date, message}
Remove unnecessary fields from origins (lister, project)
Remove wrong extra directory entry fields

(+ Rebase to squash commits)

Build is green
See https://jenkins.softwareheritage.org/job/DCIDX/job/tox/614/ for more details.

Harbormaster completed remote builds in B7697: Diff 6622.Sep 6 2019, 11:31 AM

vlorentz accepted this revision.Sep 6 2019, 11:35 AM

vlorentz added inline comments.

swh/indexer/tests/utils.py
560–561	Still not happy about this, but let's fix it later.

This revision is now accepted and ready to land.Sep 6 2019, 11:35 AM

Closed by commit rDCIDX12801f81f2fb: tests/utils: Fix various test data model issues failing validation (authored by ardumont). · Explain WhySep 6 2019, 11:42 AM

This revision was automatically updated to reflect the committed changes.

ardumont added inline comments.Sep 6 2019, 11:42 AM

swh/indexer/tests/utils.py
560–561	neither am i but yeah, not right now ;)

indexer: ci: broken master build: Fix various test/utils data model issues (following the new model validation delivery)
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 6614

swh/indexer/tests/test_metadata.py

swh/indexer/tests/test_origin_metadata.py

swh/indexer/tests/utils.py

indexer: ci: broken master build: Fix various test/utils data model issues (following the new model validation delivery)ClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 6614

swh/indexer/tests/test_metadata.py

swh/indexer/tests/test_origin_metadata.py

swh/indexer/tests/utils.py

indexer: ci: broken master build: Fix various test/utils data model issues (following the new model validation delivery)
ClosedPublic
Actions

Revision Contents
Changeset List