
scanner: use 'extract_regex_objs' from swh.model
Closed, Public

Authored by DanSeraf on Mar 26 2021, 2:39 PM.

Details

Summary

The function extract_regex_objs was moved to swh.model.from_disk, so the scanner should use that implementation instead of keeping its own copy. This removes the code duplicated between the two packages.

Tests were updated to match the new function.

Fixes T2679
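
For readers unfamiliar with the helper, here is a minimal stand-in sketch of what a function like extract_regex_objs does conceptually: it turns glob-style exclusion patterns (as bytes) into compiled regex objects that can be matched against byte paths. This is not the swh.model implementation; the names compile_exclude_patterns and is_excluded are hypothetical, and the real function may take other arguments and perform extra validation.

```python
import fnmatch
import re
from typing import Iterable, Iterator, List, Pattern


def compile_exclude_patterns(patterns: Iterable[bytes]) -> Iterator[Pattern[bytes]]:
    """Hypothetical stand-in: compile glob patterns into bytes regexes."""
    for pattern in patterns:
        # fnmatch.translate works on str, so decode, translate, re-encode.
        regex = fnmatch.translate(pattern.decode())
        yield re.compile(regex.encode())


def is_excluded(path: bytes, regexes: List[Pattern[bytes]]) -> bool:
    """Return True if the byte path matches any compiled exclusion regex."""
    return any(regex.match(path) for regex in regexes)


# Example patterns are assumptions chosen for illustration only.
regexes = list(compile_exclude_patterns([b"*.pyc", b"build/*"]))
```

With these example patterns, is_excluded(b"foo.pyc", regexes) is true and is_excluded(b"foo.py", regexes) is false. Keeping one shared copy of such a helper means the scanner and the model cannot drift apart in how they interpret patterns.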

Diff Detail

Repository
rDTSCN Code scanner
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D5359 (id=19197)

Rebasing onto eb04424434...

Current branch diff-target is up to date.
Changes applied before test
commit 7b06a22ac5c910b496d5d4c7106f08220ba2dd2e
Author: Daniele Serafini <me@danieleserafini.eu>
Date:   Thu Mar 25 17:20:05 2021 +0100

    use 'extract_regex_objs' from swh.model

See https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/113/ for more details.

vlorentz added inline comments.
swh/scanner/scanner.py
218

why not root_path.encode()?

swh/scanner/scanner.py
218

honestly i didn't think about it, but it's probably better to convert all the strings with .encode() as you said

swh/scanner/scanner.py
218

oh, i just noticed that i don't need to convert the root_path there, thanks!
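
As background for the exchange above, a small sketch of converting a path string to bytes with str.encode(). The diff itself is not reproduced on this page, so the original conversion is unknown; the path value below is a hypothetical example.

```python
import os

# Hypothetical path, used only for illustration.
root_path = "/tmp/project"

# str.encode() defaults to UTF-8, so no explicit encoding argument is needed.
as_bytes = root_path.encode()

# os.fsencode() uses the filesystem encoding instead; for a plain ASCII path
# the result is the same bytes.
fs_bytes = os.fsencode(root_path)

# Decoding the bytes restores the original string.
round_trip = as_bytes.decode()
```

When the receiving code already handles the conversion, as the author found here, the call can simply be dropped.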

removed unnecessary conversion

Build is green

Patch application report for D5359 (id=19209)

Rebasing onto eb04424434...

Current branch diff-target is up to date.
Changes applied before test
commit b3256c87728eef4ed44c91c3c8daa59960debbbd
Author: Daniele Serafini <me@danieleserafini.eu>
Date:   Thu Mar 25 17:20:05 2021 +0100

    use 'extract_regex_objs' from swh.model

See https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/114/ for more details.

This revision is now accepted and ready to land.
Mar 26 2021, 4:12 PM
This revision was automatically updated to reflect the committed changes.