
origin_search: Add keyword search for intrinsic_metadata keywords/description
ClosedPublic

Authored by KShivendu on Jul 2 2021, 1:24 PM.

Details

Summary

intrinsic_metadata contains keywords and a description, both of which are useful for
finding relevant origins from a list of keywords provided by the user.
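
As a rough illustration of the idea (not the code in this diff), keyword search over these two fields can be sketched as a plain substring match. The flat field names `keywords` and `description`, the document layout, and the helper `matches_keywords` are assumptions made for this example; the actual index may store metadata under different keys.

```python
from typing import Any, Dict, List


def matches_keywords(document: Dict[str, Any], keywords: List[str]) -> bool:
    """Return True if any queried keyword occurs (case-insensitively, as a
    substring) in the document's metadata keywords or description."""
    # Assumption: metadata is a flat dict with "keywords" and "description".
    metadata = document.get("intrinsic_metadata") or {}
    text_parts: List[str] = []
    for field in ("keywords", "description"):
        value = metadata.get(field) or []
        if isinstance(value, str):
            value = [value]  # normalise a single string to a list
        text_parts.extend(str(item).lower() for item in value)
    haystack = " ".join(text_parts)
    return any(keyword.lower() in haystack for keyword in keywords)


if __name__ == "__main__":
    origin = {
        "url": "https://example.org/repo",  # hypothetical origin
        "intrinsic_metadata": {
            "keywords": ["Parser", "json"],
            "description": "A fast tool",
        },
    }
    print(matches_keywords(origin, ["parser"]))
    print(matches_keywords(origin, ["compiler"]))
```

A real search backend would tokenize and score matches rather than use substring tests; the sketch only shows which fields take part in the lookup.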

Diff Detail

Repository
rDSEA Archive search
Branch
metadata-description
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 22456
Build 34980: Phabricator diff pipeline on jenkins
Build 34979: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D5963 (id=21434)

Rebasing onto 2e1fb86387...

Current branch diff-target is up to date.
Changes applied before test
commit c100cd9f5d6dfd7042c0a717cff3fa609e1b3ac1
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 11:19:08 2021 +0000

    origin_search: Add keyword search for instrinsic_metadata keywords/description

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/183/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/183/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jul 2 2021, 1:29 PM
Harbormaster failed remote builds in B22412: Diff 21434!
  • test_search: Fix failing tests

Build has FAILED

Patch application report for D5963 (id=21435)

Rebasing onto 2e1fb86387...

Current branch diff-target is up to date.
Changes applied before test
commit f8ded206f956b441750cc17a111fa1c810660dcf
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 12:15:55 2021 +0000

    test_search: Fix failing tests

commit c100cd9f5d6dfd7042c0a717cff3fa609e1b3ac1
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 11:19:08 2021 +0000

    origin_search: Add keyword search for instrinsic_metadata keywords/description

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/184/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/184/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jul 2 2021, 2:48 PM
Harbormaster failed remote builds in B22413: Diff 21435!
  • origin_search: Fix signature

Build is green

Patch application report for D5963 (id=21442)

Rebasing onto 2e1fb86387...

Current branch diff-target is up to date.
Changes applied before test
commit 8238dd0163136d45a51fbb44b5c32f0f73eecc8e
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 14:01:43 2021 +0000

    origin_search: Fix signature

commit f8ded206f956b441750cc17a111fa1c810660dcf
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 12:15:55 2021 +0000

    test_search: Fix failing tests

commit c100cd9f5d6dfd7042c0a717cff3fa609e1b3ac1
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 11:19:08 2021 +0000

    origin_search: Add keyword search for instrinsic_metadata keywords/description

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/185/ for more details.

vlorentz added inline comments.
swh/search/elasticsearch.py
448

what does ^2 do?

457–463

why these two changes?

swh/search/in_memory.py
436–441

you can sort in-place, no need to copy
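
For context, the difference the reviewer points at can be sketched as follows. The hit structure (a dict with a `score` key) and both function names are hypothetical and are not taken from `swh/search/in_memory.py`.

```python
from typing import Any, Dict, List


def sort_hits_with_copy(hits: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return a new list sorted by descending score; the input is untouched."""
    return sorted(hits, key=lambda hit: hit["score"], reverse=True)


def sort_hits_in_place(hits: List[Dict[str, Any]]) -> None:
    """Sort the given list by descending score without allocating a copy."""
    hits.sort(key=lambda hit: hit["score"], reverse=True)


if __name__ == "__main__":
    hits = [{"url": "a", "score": 1.0}, {"url": "b", "score": 3.0}]
    ranked = sort_hits_with_copy(hits)  # new list, hits keeps its order
    sort_hits_in_place(hits)            # hits itself is now reordered
    print([hit["url"] for hit in ranked], [hit["url"] for hit in hits])
```

Sorting in place is enough when the list is a local value that no other code relies on in its original order.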

  • origin_search: Polish the code with get_expansion and other methods

Build is green

Patch application report for D5963 (id=21471)

Rebasing onto 2e1fb86387...

Current branch diff-target is up to date.
Changes applied before test
commit cd468bdb2fcfc577edf83432f329faa943d46918
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 16:14:08 2021 +0000

    origin_search: Polish the code with get_expansion and other methods

commit 8238dd0163136d45a51fbb44b5c32f0f73eecc8e
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 14:01:43 2021 +0000

    origin_search: Fix signature

commit f8ded206f956b441750cc17a111fa1c810660dcf
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 12:15:55 2021 +0000

    test_search: Fix failing tests

commit c100cd9f5d6dfd7042c0a717cff3fa609e1b3ac1
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 11:19:08 2021 +0000

    origin_search: Add keyword search for instrinsic_metadata keywords/description

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/187/ for more details.

KShivendu added inline comments.
swh/search/elasticsearch.py
448

It boosts a result's score roughly twofold (not necessarily exactly 2x, but that's the idea) when the queried keywords are found in that field.
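
A minimal sketch of such a boost, assuming an Elasticsearch `multi_match` query: the `^2` suffix on a field name multiplies that field's contribution to the relevance score. The field paths and the helper `build_keyword_query` are illustrative assumptions, not the names used in this diff.

```python
from typing import Any, Dict, List


def build_keyword_query(keywords: List[str]) -> Dict[str, Any]:
    """Build a multi_match query body in which matches in the keywords
    field weigh twice as much as matches in the description field."""
    return {
        "multi_match": {
            "query": " ".join(keywords),
            "fields": [
                # Hypothetical field paths; "^2" is the per-field boost.
                "intrinsic_metadata.keywords^2",
                "intrinsic_metadata.description",
            ],
        }
    }


if __name__ == "__main__":
    print(build_keyword_query(["json", "parser"]))
```

Because the final score also depends on term frequency, field length, and how the per-field scores are combined, a boosted match does not necessarily score exactly twice as high.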

swh/search/tests/test_search.py
458–467

These comments are outdated; I'll update them.
In the meantime, you can check whether these test cases are sufficient.

This revision is now accepted and ready to land. Jul 5 2021, 11:54 AM
KShivendu marked an inline comment as done.
  • Squash
  • Minor polishes

Build is green

Patch application report for D5963 (id=21477)

Rebasing onto 2e1fb86387...

Current branch diff-target is up to date.
Changes applied before test
commit d2c137d31d482439f17420cfa12c2bba05c6a6ec
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 11:19:08 2021 +0000

    origin_search: Add keyword search for instrinsic_metadata keywords/description

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/188/ for more details.

Build is green

Patch application report for D5963 (id=21478)

Rebasing onto 2e1fb86387...

Current branch diff-target is up to date.
Changes applied before test
commit f378a989e972f1896f4b99fac7e916e4fb133a00
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 2 11:19:08 2021 +0000

    origin_search: Add keyword search for instrinsic_metadata keywords/description
    
    instrinsic_metadata contains keywords and description which are very useful for
    finding desirable origins based on a list of keywords provided by the user.

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/189/ for more details.