Add -r/--recursive flag in swh identify command
Related T3160
Details
- Reviewers
KShivendu - Group Reviewers
Reviewers - Maniphest Tasks
- T3160: swh identify: add a -R/--recursive flag
Diff Detail
- Repository
- rDMOD Data model
- Branch
- identify-cli-recursive-flag
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 20431 Build 31711: Phabricator diff pipeline on jenkins Jenkins console · Jenkins Build 31710: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D5420 (id=19374)
Rebasing onto 9523be0552...
First, rewinding head to replay your work on top of it... Applying: cli/identify: Add support for --recursive
Changes applied before test
commit 1a71c52340e44370163fc478a44a80fcb8a6e81a Author: KShivendu <shivendu@iitbhilai.ac.in> Date: Tue Apr 6 00:06:34 2021 +0530 cli/identify: Add support for --recursive
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/294/ for more details.
swh/model/cli.py | ||
---|---|---|
257 | I will fix this and add a test soon. Do you see any other mistake in the diff? |
swh/model/cli.py | ||
---|---|---|
182 | Agreed. Unless you want to go the extra mile and behave like most git commands do. That is: detect if output is being piped to another command. If it is, disable coloring by default, but allow override with a flag. If it isn't being piped, enable coloring by default, allowing override with a flag. |
swh/model/cli.py | ||
---|---|---|
182 | That's a great idea. I will do it. |
swh/model/cli.py | ||
---|---|---|
6–10 | is this interleaving of from/import/from/import for modules all coming from the stdlib ok w.r.t. coding guidelines and isort? (i'm not 100% sure of the answer, so maybe just double-check it) | |
175 | this should be called something like "TerminalColor" or "ColorEscapes", it's not about colors in general | |
176–178 | enum values should be all uppercase | |
181 | make this function a class method of the Enum | |
200 | as SWHID are always computed recursively, just not shown by default, the help message should be something like "Show computed SWHID recursively" | |
284 | This is not the right way to implement this. Because you're recomputing the SWHIDs at all level, making the complexity of this thing quadratic (O(n^2)). By reading this I realized that this issue is probably not an easy hack as I initially thought, but does require in fact some refactoring. Instead of calling the high-level functions to compute the SWHID of a single file or dir like we are doing right now, we should build a single model object for the top-level dir, and either output its SWHID, or traverse it (without recomputing SWHIDs) to output all of it. As observed in T1136 it is a feature that already exists in swh-scanner too, that should be refactored. |
Updating D5420: cli/identify: Use TerminalColor Enum and change recursive flag's description
Build is green
Patch application report for D5420 (id=19537)
Rebasing onto eeedac7af7...
First, rewinding head to replay your work on top of it... Applying: cli/identify: Add support for --recursive
Changes applied before test
commit 37b91cbec9620c5b8f67ce9622a1759cd0180b1b Author: KShivendu <shivendu@iitbhilai.ac.in> Date: Tue Apr 6 00:06:34 2021 +0530 cli/identify: Add support for --recursive
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/300/ for more details.
swh/model/cli.py | ||
---|---|---|
284 | Where do I start? Can you please give me some pointers for the same? |
we should build a single model object for the top-level dir, and either output its SWHID, or traverse it (without recomputing SWHIDs) to output all of it
Where should I start? Can you please give me some pointers? I don't know much about the swh-models.
@KShivendu zack's comment explains how the code should work, and gives a pointer to an existing implementation of the technique (swh-scanner); this should be enough to start.
But don't feel obligated to continue working on this; as mentioned the task is harder than we expected.
Right.
As an update, we briefly discussed the best way forward on this the other day on #swh-devel.
The next step seems to be to check if swh.scanner can use only swh.model.merkle as its model of the filesystem, without needing a separate swh.scanner.model. If that's the case, the recursive support should be based on that (and we should get rid of swh.scanner.model ; or, at the very least, remove from it all the duplicate logic from swh.model.merkle).