Page MenuHomeSoftware Heritage

Add merge detection to revision information

Authored by jbertran on Jul 29 2016, 3:02 PM.

Diff Detail

rDWAPPS Web applications
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jbertran retitled this revision from to Add merge detection to revision information.
jbertran updated this object.
jbertran edited the test plan for this revision. (Show Details)
zack requested changes to this revision.Jul 31 2016, 3:09 PM
zack edited edge metadata.

Two classes of comments about this diff:

  1. minor stylistic stuff at the beginning
  2. lotsa copy/paste canned values in the tests, that should really be factored out before accepting this

When addressing (2), please use as constant/static attribute names something that makes it obvious what they are used for in the tests, e.g., EXPECTED_REVISION, NON_EXISTENT_REVISION, ROBOT_COMMITTER, etc. That practice makes tests much more readable and maintainable in the long run.


96 ↗(On Diff #282)

typo here "do" -> "to"

99 ↗(On Diff #282)

stylistic matter: the src/dest variables are used only once (hence no speed up in using them), and naming them doesn't add anything to the readability of the code. It is minor point, in cases like this one putting the regexps directly in the sub() arguments would be preferrable


extra empty line here? or is it intended?


same here: extra empty line?


same for the 3 empty lines here: are they intended? if so (for readability?), leave them around, otherwise please remove them


several constants here seem to be reused in multiple tests, at least:

  • the commit id
  • the directory id
  • the committer name, fullname, and email

please define them either as module-level constants, or as class-level attributes of the test module, and reference them from the various tests where they are used (using self.* or not, depending on the choice)


details about this artifact look like other candidates for defining constants/attributes, and referencing them from multiple tests


here's another one: this revision, or at least many metadata about it, are reused in multiple tests. Please factorize this out, … and DRY (Don't Repeat Yourself) :-)


other stuff to be factored out

This revision now requires changes to proceed.Jul 31 2016, 3:09 PM
jbertran added a subscriber: ardumont.
jbertran added inline comments.

I'd rather remove these entirely than define them as constants, since they're irrelevant to the tests I'm performing. Does that seem fair?


Same here.


I should make a standalone task for this, since there are a lot of other objects that would benefit from factorizing in On the other hand, I believe @ardumont's rationale behind avoiding factorizing was that it much improves readability of the test itself. Since performance is far from being an issue in these tests, I'm tempted to argue against factorizing.


Nothing about my comments is about performance. They're all about readability/maintanability of the tests.
Having to change test data in 10 difference places instead of 1 is problematic, period. Every time the same thing is written more than once in code, there's a potential problem; when something is written more than two, there's a certain problem :-)
And reading self.COMMIT_ID is much easier and meaningful than reading 18d8be353ed3480476f032475e7c233eff7371d5.

I'm not against a separate task for fixing this for old code, but your proposed change is contributing to the problem. So please avoid introducing even more test value duplication with this change, and submit a separate task for the old code. (I think it'd be easier to just fix this all at once, but YMMV.)

I suggest looking at for one way of keeping the test data "factored out".

jbertran edited edge metadata.
  • tests: factorize and cleanup
  • renderers: fix typo
  • revision log template: remove empty lines
zack requested changes to this revision.Aug 1 2016, 2:49 PM
zack edited edge metadata.

Almost there. Deduplication is great, you just need to push it further, to also include dictionary structures :-)


Sure. Just note it down as low priority task, maybe?


OK, almost there! The constants you added are great and reduce duplication a lot.
… but now what is duplicated is the above records, with all its 7 fields, which appears in several places :-)

I get there are two versions of it: one with _BIN/_RAW, the other with strings, but they are just 2 which is better than 10.
So please add two additional constants, e.g.: REVISION_RAW, and REVISION, defined using the other more fine grained constants, and in places like this just go with:

mock_backend.revision_get = MagicMock(return_value=REVISION_RAW)

this will become

self.assertEqual(actual_revision, REVISION)

this OTOH it is totally fine as is, as you need a finer grained version of the macro-constant


This could remain as is, as you need to do something different for this case.
But it'd be much nicer as something like:

stub_rev = REVISION_RAW
stub_rev['id'] = ...
stub_rev['message'] = ...

which has the added benefit of making clear what is different in the test data of this test case from other ones.

(Note: the above has potentially variable aliasing issues though, so you should either use a copy constructor, or make sure the constants are in a place that get refreshed for each test case.)



This revision now requires changes to proceed.Aug 1 2016, 2:49 PM
jbertran edited edge metadata.
  • Service tests: factorize test data structures

I finally get what you're saying. Thanks for putting up with the iterations ^^

zack edited edge metadata.

*thumbs up*

This revision is now accepted and ready to land.Aug 1 2016, 4:02 PM
This revision was automatically updated to reflect the committed changes.

@ardumont's rationale behind avoiding factorizing was that it much improves readability of the test itself...

@jbertran Precision: avoiding factorizing in a test context.

I do abide by the DRY principle but not fully in a test context.
I think now i was simply plain wrong.

We have too much duplication now.

Regarding this factorization, we could also create some intermediary helper functions make_revision, make_tree (or something) which takes as parameter the important part (in regard of that test, it seems to be the id here).