Add mercurial.from_disk.HgLoaderFromDisk
Closed, Public

Authored by acezar on Jul 6 2020, 4:27 PM.

Details

Summary

Rather than relying on a Mercurial bundle, this loader expects a local
clone of the repository.

Diff Detail

Repository
rDENV Development environment
Branch
acezar
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 17092
Build 26380: arc lint + arc unit

Event Timeline

swh/loader/mercurial/__init__.py
10–17 ↗(On Diff #16107)

The switch will be done in loader.py

swh/loader/mercurial/cli.py
6–8 ↗(On Diff #16107)

When you commit a change to a file whose imports had not been sorted before isort was added to the pre-commit hooks, the imports get reordered.

47–51 ↗(On Diff #16107)

Same answer: loader.py

swh/loader/mercurial/from_bundle.py
14–40 ↗(On Diff #16107)

Same answer: isort in precommit

swh/loader/mercurial/from_disk.py
349–350 ↗(On Diff #16107)

to_model converts from_disk.Content to model.BaseContent, but storage.content_add only accepts model.Content. In our case, to_model should only ever return model.Content; anything else is not handled and should be an error.
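
A minimal sketch of that kind of check, using local stand-in classes rather than the real swh.model.model types. The helper name ensure_content is hypothetical; it only illustrates rejecting anything that is not a Content before it would be handed to storage.content_add:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseContent:
    """Stand-in for model.BaseContent (assumption: not the real class)."""

    sha1_git: bytes


@dataclass(frozen=True)
class Content(BaseContent):
    """Stand-in for model.Content: a content that carries its raw data."""

    data: bytes = b""


@dataclass(frozen=True)
class SkippedContent(BaseContent):
    """Stand-in for a content whose data is not stored."""

    reason: str = ""


def ensure_content(obj: BaseContent) -> Content:
    """Return obj unchanged if it is a Content, otherwise raise TypeError."""
    if not isinstance(obj, Content):
        raise TypeError(f"expected Content, got {type(obj).__name__}")
    return obj
```

Failing loudly here keeps an unexpected BaseContent subclass from silently reaching the storage layer.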

swh/loader/mercurial/tests/test_hgutil.py
23 ↗(On Diff #16107)

https://docs.pytest.org/en/stable/monkeypatch.html

All modifications will be undone after the requesting test function or fixture has finished.
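
As a rough illustration of that guarantee (the helper and test names below are made up and are not taken from test_hgutil.py), a value patched through the monkeypatch fixture is only visible inside the test that requested it:

```python
import os


def hg_encoding() -> str:
    """Hypothetical helper: read an encoding name from the environment."""
    return os.environ.get("HGENCODING", "utf-8")


def test_encoding_override(monkeypatch):
    # The override lives only for the duration of this test; pytest
    # restores the previous environment once the test has finished,
    # so no manual cleanup is needed.
    monkeypatch.setenv("HGENCODING", "latin-1")
    assert hg_encoding() == "latin-1"
```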

Build is green

Patch application report for D3435 (id=16202)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..b355360
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   2 +
 swh/loader/mercurial/__init__.py                   |   4 +-
 swh/loader/mercurial/cli.py                        |   6 +-
 swh/loader/mercurial/from_bundle.py                | 641 ++++++++++++++++++++
 swh/loader/mercurial/from_disk.py                  | 454 +++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++
 swh/loader/mercurial/loader.py                     | 645 +--------------------
 swh/loader/mercurial/tasks.py                      |   8 +-
 swh/loader/mercurial/tests/data/build.py           | 265 +++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 ++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 .../tests/{test_loader.py => test_from_bundle.py}  |  14 +-
 swh/loader/mercurial/tests/test_from_disk.py       | 199 +++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 ----
 swh/loader/mercurial/tests/test_tasks.py           |   6 +-
 24 files changed, 2467 insertions(+), 774 deletions(-)
 create mode 100644 swh/loader/mercurial/from_bundle.py
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 rename swh/loader/mercurial/tests/{test_loader.py => test_from_bundle.py} (93%)
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
Changes applied before test
commit b35536071623338213dcf22352c3a8f332b32344
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.
    
    Differential Revision: https://forge.softwareheritage.org/D3435

commit c8c91ab674a9ade49caacd63a5b507bab67df9dc
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Mon Oct 19 16:42:21 2020 +0200

    Add new example repository generated from script
    
    First updatable example repository documented by its generation script.

commit bc32e1280cfd6a59df595cdcbcc2c2b51b3618aa
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Mon Oct 19 16:22:07 2020 +0200

    Add `Hg20BundleLoader` tests from json files
    
    Generated json files with `swh/loader/mercurial/tests/data/build.py` for
    existing repositories and added them to `Hg20BundleLoader` tests.
    
    Introduce `LoaderChecker` as a standardized way to test repositories
    against json files.

commit ff11f77f1b493bd1c8ed257e790ded8da276101c
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Fri Oct 16 11:28:35 2020 +0200

    Add testing repository builder
    
    This build script purpose is to create example repositories from bash scripts
    and extract assertion data from them into json files.
    
    Advantages:
    
        - the bash script documents the repository creation
        - automating creation allow easy repository update
        - automation extraction allow easier update of assertion data

commit a2e9cf16919a5f81a06f955a533a254a9b3c9689
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Thu Oct 8 18:07:50 2020 +0200

    add swh-hg-identify a cli to identify hg objects

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/107/ for more details.

swh/loader/mercurial/__init__.py
10–17 ↗(On Diff #16107)

Will add loader.mercurial_from_disk=swh.loader.mercurial:register_from_disk. Thanks for the hint.

Build is green

Patch application report for D3435 (id=16207)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..ab26d6f
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/cli.py                        |   6 +-
 swh/loader/mercurial/from_disk.py                  | 454 +++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/loader.py                     |   6 +-
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 199 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 22 files changed, 1893 insertions(+), 127 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit ab26d6fef18cc4d7939e2566de3d398b5882e582
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.
    
    Differential Revision: https://forge.softwareheritage.org/D3435

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/110/ for more details.

swh/loader/mercurial/cli.py
6–8 ↗(On Diff #16107)

Okay, so it looks like the imports should have been sorted when the pre-commit hook was added, but that is out of scope for this change, I guess.

swh/loader/mercurial/from_disk.py
313 ↗(On Diff #16207)

Do we need to keep the tip tag because the previous loader was loading it? Or is there another reason? (It would be nice if the doc were clearer about that.)

403 ↗(On Diff #16207)

As we discussed before, we need to better define Archive here.

181–182 ↗(On Diff #16107)

(gentle ping)

246 ↗(On Diff #16107)

Looks like Phabricator let this comment drift a bit. Do you remember what it was about?

277 ↗(On Diff #16107)

I think that this feedback about self._revision_nodeid_to_swhid[parent_hg_nodeid] still applies. Am I missing something?

337–340 ↗(On Diff #16107)

(gentle ping)

349–350 ↗(On Diff #16107)

I see.

What about the binary content of the file? When is it dropped?

378 ↗(On Diff #16107)

gentle ping

swh/loader/mercurial/hgutil.py
20–31 ↗(On Diff #16107)

Can you give some details about what happened on this topic?

acezar marked 9 inline comments as done.

Followup

Build is green

Patch application report for D3435 (id=16257)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..ede3e31
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 468 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 199 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1901 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit ede3e31d7b8b654c81607a967982e7330d88c98a
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.
    
    Differential Revision: https://forge.softwareheritage.org/D3435

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/111/ for more details.

acezar added inline comments.
swh/loader/mercurial/from_disk.py
313 ↗(On Diff #16207)

That was a typo: tip must be ignored because, as said, a release corresponds to a user-defined tag, and tip is not user-defined.
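
A small sketch of that rule, assuming the tags are available as a plain mapping from tag name to node id (both as bytes). The function name user_defined_tags is hypothetical and is not the loader's actual code:

```python
from typing import Dict

# "tip" is Mercurial's built-in tag; it always points at the most
# recently added changeset and is never created by a user.
TIP = b"tip"


def user_defined_tags(tags: Dict[bytes, bytes]) -> Dict[bytes, bytes]:
    """Return only the tags that should be considered for releases."""
    return {name: node for name, node in tags.items() if name != TIP}
```

For example, a mapping holding both tip and v1.0 would come back with only v1.0.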

246 ↗(On Diff #16107)

It was about passing the context rather than the nodeid, instead of asking the repo for the context again inside the method (it was on self.store_directories(hg_nodeid)).

277 ↗(On Diff #16107)

I thought it was about the whole parent block, which is why I added self.get_revision_parents(rev_ctx).

349–350 ↗(On Diff #16107)

Made sure that only the necessary data is returned at the end of the method. The garbage collector should clean things up, provided storage does not hold a reference until flush.
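
To illustrate the idea only (Blob and summarize are hypothetical stand-ins, not the loader's classes), the sketch below builds a small summary from an object holding file data and then checks, through a weak reference, that the data holder can be collected once the caller drops it:

```python
import gc
import hashlib
import weakref


class Blob:
    """Hypothetical holder for a file's raw bytes."""

    def __init__(self, data: bytes) -> None:
        self.data = data


def summarize(blob: Blob, perms: int) -> dict:
    """Return only what later steps need: a hash and the permissions."""
    header = b"blob %d\0" % len(blob.data)
    sha1_git = hashlib.sha1(header + blob.data).hexdigest()
    # The returned dict holds no reference to blob or its data.
    return {"sha1_git": sha1_git, "perms": perms}


blob = Blob(b"x" * 1_000_000)
probe = weakref.ref(blob)  # lets us observe whether blob is still alive
entry = summarize(blob, 0o100644)

del blob      # drop the last strong reference
gc.collect()  # not required on CPython, but harmless elsewhere
blob_was_freed = probe() is None
```

If anything else (for example a buffering storage layer) kept a reference to the data holder, probe() would still return it after the collection.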

swh/loader/mercurial/hgutil.py
20–31 ↗(On Diff #16107)

Removed the duplication; kept the hgutil.py version.

swh/loader/mercurial/from_disk.py
349–350 ↗(On Diff #16107)

Sure, but which part drops the reference to the actual file data?

The data is passed into content_data, which is passed into Content, which is passed to ModelContent, which still holds the data because self.storage.content_add is called on it.

swh/loader/mercurial/hgutil.py
20–31 ↗(On Diff #16107)

Okay thanks

swh/loader/mercurial/from_disk.py
349–350 ↗(On Diff #16107)

Never mind, I just noticed that we return a new object at the end of the method:

# Here we make sure to return only necessary data.
return Content({"sha1_git": content.hash, "perms": perms})

Is this code new, or did I miss it on the first pass?

swh/loader/mercurial/from_disk.py
189–191 ↗(On Diff #16257)

It is still unclear to me when and why we have this special case. Can you clarify that point?

349–350 ↗(On Diff #16107)

This is indeed new code.

acezar marked 5 inline comments as done.

Followup

Build is green

Patch application report for D3435 (id=16277)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..f14a65f
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 475 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 199 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1908 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit f14a65f97b272db087b2ad823dbdeb44fda768e0
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/114/ for more details.

Build is green

Patch application report for D3435 (id=16286)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..12411aa
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 475 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 205 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1914 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit 12411aa64133e2578f20444ab09e117ccb4634d5
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/118/ for more details.

I only have one last question. The rest seems fine.

swh/loader/mercurial/from_disk.py
193 ↗(On Diff #16286)

Small nit: if there is no other usage than the CLI call, maybe pulling is not necessary.

251–255 ↗(On Diff #14564)

The questions still stand.

Build is green

Patch application report for D3435 (id=16302)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..e19a069
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 477 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 205 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1916 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit e19a06944997022b177b8caf252e91b2017190db
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/120/ for more details.

acezar marked an inline comment as done.

Followup

Build is green

Patch application report for D3435 (id=16303)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..58f1de0
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 475 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 205 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1914 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit 58f1de0a88a14687354ac5cbef85a431a56b7849
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/121/ for more details.

acezar added inline comments.
swh/loader/mercurial/from_disk.py
193 ↗(On Diff #16286)

Removed the TODO

349–350 ↗(On Diff #16107)

New in this diff, but was already in the next revision: D4541 (without the comment).

251–255 ↗(On Diff #14564)

Added a comment stating that it matches the historical implementation.

467–475 ↗(On Diff #14564)

The revision hash uses the freeform fullname when provided, and we create the author using Person.from_fullname.

There is (maybe) a last small bit to document, but the patch looks good overall.
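To illustrate the comment above, here is a minimal sketch of splitting a freeform fullname into a name and an email while keeping the original bytes untouched for hashing. It is not the swh.model code: `Author` and `author_from_fullname` are hypothetical names, and the parsing rules are an assumption made for illustration only.

```python
from typing import NamedTuple, Optional


class Author(NamedTuple):
    """Hypothetical stand-in for a person record (not swh.model.Person)."""

    fullname: bytes  # kept verbatim, so a hash computed over it stays stable
    name: Optional[bytes]
    email: Optional[bytes]


def author_from_fullname(fullname: bytes) -> Author:
    """Derive name and email from a freeform b"Name <email>" byte string."""
    start = fullname.find(b"<")
    if start == -1:
        # No email part at all: the whole string is treated as the name.
        return Author(fullname, fullname.strip() or None, None)

    end = fullname.rfind(b">")
    if end < start:
        # Unterminated bracket: take everything after "<" as the email.
        end = len(fullname)

    name = fullname[:start].strip() or None
    email = fullname[start + 1 : end] or None
    return Author(fullname, name, email)
```

For example, `author_from_fullname(b"Jane Doe <jane@example.org>")` keeps the full byte string as `fullname` and derives `b"Jane Doe"` and `b"jane@example.org"` from it, while a bare nickname yields a name and no email.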

swh/loader/mercurial/from_disk.py
467–475 ↗(On Diff #14564)

Is this documented?

acezar marked 2 inline comments as done.

Followup

acezar added inline comments.
swh/loader/mercurial/from_disk.py
467–475 ↗(On Diff #14564)

Added comment

Build is green

Patch application report for D3435 (id=16304)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..bfc44ab
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 478 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 205 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1917 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit bfc44ab8688a8b74ccaf7ecb25be5fb8db27f548
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/122/ for more details.

OK, we are getting close, I think. Just a few more comments to handle and we are done.

swh/loader/mercurial/from_disk.py
53 ↗(On Diff #16304)

s/data/date/ I guess

122 ↗(On Diff #16304)

shouldn't the repo_directory attribute be "declared" here too?

128 ↗(On Diff #16304)

I may be missing something, but why is this a property? Why not just initialize self.repo in the __fetch_data__ method (where repo_directory is initialized)?

This revision now requires changes to proceed. Nov 27 2020, 4:37 PM
acezar marked 4 inline comments as done.

Followup

swh/loader/mercurial/from_disk.py
128 ↗(On Diff #16304)

You're right, it was added for an old need and is no longer necessary.
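The two designs discussed in this thread can be sketched as follows. This is an illustration under assumptions, not the code from from_disk.py: `FakeRepo`, `LoaderWithProperty` and `LoaderWithAttribute` are hypothetical names, and `fetch_data` takes a path argument only to keep the example self-contained.

```python
from typing import Optional


class FakeRepo:
    """Hypothetical stand-in for an opened Mercurial repository handle."""

    def __init__(self, path: Optional[str]) -> None:
        self.path = path


class LoaderWithProperty:
    """Variant questioned in review: the repository is opened lazily."""

    def __init__(self) -> None:
        self._repo_directory: Optional[str] = None
        self._repo: Optional[FakeRepo] = None

    def fetch_data(self, path: str) -> None:
        self._repo_directory = path

    @property
    def repo(self) -> FakeRepo:
        # Opened on first access; accessing it before fetch_data() would
        # silently build a handle on a None path.
        if self._repo is None:
            self._repo = FakeRepo(self._repo_directory)
        return self._repo


class LoaderWithAttribute:
    """Variant suggested in review: plain attributes declared in __init__."""

    def __init__(self) -> None:
        # Both attributes are declared up front, then set together.
        self.repo_directory: Optional[str] = None
        self.repo: Optional[FakeRepo] = None

    def fetch_data(self, path: str) -> None:
        self.repo_directory = path
        self.repo = FakeRepo(path)
```

With the plain-attribute variant, `repo` stays `None` until `fetch_data` runs, which makes a premature access easier to spot than with the lazy property.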

Build is green

Patch application report for D3435 (id=16424)

Rebasing onto c8c91ab674...

Current branch diff-target is up to date.
Changes applied before test
commit a1c8afa5e42cc58eef255c38ce0585aa71eac0a6
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/127/ for more details.

This revision is now accepted and ready to land. Nov 30 2020, 11:43 AM
This revision was automatically updated to reflect the committed changes.