That's useful for tests, and there's no harm in keeping it.
Diff Detail
- Repository
- rDLDG Git loader
- Branch
- eventful-whatsnew
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 2518 Build 3119: tox-on-jenkins Jenkins Build 3118: arc lint + arc unit
Event Timeline
Build is green
See https://jenkins.softwareheritage.org/job/DLDG/job/tox/29/ for more details.
Have you tried runnning this change (with a loader with remote storage setup).
For example:
python3 -m swh.loader.git.updater --origin-url /path/to/one-repo/you/can/control cd /path/to/one-repo/you/can/control # do something # commit git commit -am 'Dummy action to commit stuff' python3 -m swh.loader.git.updater --origin-url /path/to/one-repo/you/can/control
I ask because that status result will get serialized and i'm not sure of the result on whatsnew here (id are bytes).
Also, you justify this change with "That's useful for tests, and there's no harm in keeping it."
First passertion might be acceptable, but it would then make sense to generalize what you do in test_load_unchanged (ie. add the returned status as third argument of the assertion).
Second argument, well I guess you know now ;-)
swh/loader/git/updater.py | ||
---|---|---|
486–497 | Generally speaking, it's best to have only one exit point in a function/method, as far as possible. In this case, I see no reason not to do so: status = {'status': 'uneventful'} if self.base_snapshot: if self.snapshot['id'] != self.base_snapshot['id']: status['status'] = 'eventful' status['whatsnew'] = 'snapshot: {} -> {}'.format( self.base_snapshot['id'], self.snapshot['id'])} elif self.snapshot['branches']: status['status'] = 'eventful' status['whatsnew'] = 'new branches: {}'.format( ', '.join(map(repr, self.snapshot['branches'])))} return status |
repr() always returns a string, so it's fine
I was not even on branches, i was speaking about the base_snapshot['id'] which must be bytes, most probably.
Also, my proposed manual test is not the one which will cause issue.
It's more running with a celery instance.
(and i forgot to hit enter yesterday... nice)
I remember why, i wanted to test something prior to hit enter ;)
In the end, that will work but the log won't be that human readable:
>>> raw_id b'4\x972t\xcc\xefj\xb4\xdf\xaa\xf8e\x99y/\xa9\xc3\xfeF\x89' >>> 'hello {}'. format(raw_id) "hello b'4\\x972t\\xcc\\xefj\\xb4\\xdf\\xaa\\xf8e\\x99y/\\xa9\\xc3\\xfeF\\x89'"
I think you need to hashutil.hash_to_hex the id to keep it human readable.
AFAIR, that's what is/was done with the loaders.
- Show the load status on assertion failures.
- Simplify the logic in load_status.
- hash_to_hex on snapshot ids.
Build is green
See https://jenkins.softwareheritage.org/job/DLDG/job/tox/32/ for more details.
This was about
there's no harm in keeping it
which smells like a small yagni. Either it's useful or not. If not then there is harm keeping it: every written line of code needs maintenance, testing (which lines are not, in fact) etc. so there is always a cost (not counting the time taken to write the code at first place).
Once again, I'm against this diff, I just want to point how easy it is add code, and how difficult it is, sometimes, not to write code. One must always ask him/herself whether the written line of code needs to exists, needs to be taken care of for the next 5-15 years.
If you think that's a bad idea, I can abandon this diff; that's also why we have code reviews.