Add mercurial.from_disk.HgLoaderFromDisk
Closed, Public

Authored by acezar on Jul 6 2020, 4:27 PM.

Details

Summary

Rather than relying on a Mercurial bundle, this loader expects a local
clone of the repository.

Diff Detail

Event Timeline

swh/loader/mercurial/__init__.py
10–27

The switch will be done in loader.py

swh/loader/mercurial/cli.py
6–8 ↗(On Diff #16107)

When you commit a change to a file that had not been sorted before isort was added to pre-commit, its imports get reordered.

47–51 ↗(On Diff #16107)

Same answer: loader.py

swh/loader/mercurial/from_bundle.py
14–40 ↗(On Diff #16107)

Same answer: isort in precommit

swh/loader/mercurial/from_disk.py
349–350

to_model converts from_disk.Content to model.BaseContent, but storage.content_add only accepts model.Content. In our case to_model should only ever return model.Content; anything else is not handled and should be treated as an error.
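To make the intent concrete, here is a minimal sketch of narrowing a base type to the subtype the storage call accepts. The three classes are simplified stand-ins, not the real swh.model.model classes, and narrow_to_content is a hypothetical helper name.

```python
from dataclasses import dataclass


# Simplified stand-ins for the model classes discussed above (assumption:
# the real classes carry more fields and validation).
@dataclass(frozen=True)
class BaseContent:
    sha1_git: bytes


@dataclass(frozen=True)
class Content(BaseContent):
    data: bytes = b""


@dataclass(frozen=True)
class SkippedContent(BaseContent):
    reason: str = ""


def narrow_to_content(obj: BaseContent) -> Content:
    """Return obj unchanged if it is a Content, otherwise raise TypeError.

    Hypothetical helper: it turns the "should never happen" case into an
    explicit error instead of passing an unsupported object downstream.
    """
    if not isinstance(obj, Content):
        raise TypeError(f"expected Content, got {type(obj).__name__}")
    return obj
```

An explicit isinstance check also lets a type checker treat the returned value as Content, so no cast is needed at the call site.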

swh/loader/mercurial/tests/test_hgutil.py
23

https://docs.pytest.org/en/stable/monkeypatch.html

All modifications will be undone after the requesting test function or fixture has finished.
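The undo-on-exit behaviour quoted above can be illustrated with the standard library. This sketch deliberately uses unittest.mock.patch.dict rather than pytest's monkeypatch fixture so that it runs without pytest installed; the function names and the EDITOR variable are only illustrative and are not taken from test_hgutil.py.

```python
import os
from unittest import mock


def current_editor() -> str:
    """Read the EDITOR environment variable, with a fallback."""
    return os.environ.get("EDITOR", "unset")


def test_editor_is_patched_only_inside_the_block() -> None:
    before = os.environ.get("EDITOR")
    # The patch is active only inside the with-block.
    with mock.patch.dict(os.environ, {"EDITOR": "patched"}):
        assert current_editor() == "patched"
    # On exit the environment is restored to its previous state.
    assert os.environ.get("EDITOR") == before
```

With pytest, the monkeypatch fixture gives the same guarantee at the scope of the requesting test function or fixture, without the explicit with-block.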

Build is green

Patch application report for D3435 (id=16202)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..b355360
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   2 +
 swh/loader/mercurial/__init__.py                   |   4 +-
 swh/loader/mercurial/cli.py                        |   6 +-
 swh/loader/mercurial/from_bundle.py                | 641 ++++++++++++++++++++
 swh/loader/mercurial/from_disk.py                  | 454 +++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++
 swh/loader/mercurial/loader.py                     | 645 +--------------------
 swh/loader/mercurial/tasks.py                      |   8 +-
 swh/loader/mercurial/tests/data/build.py           | 265 +++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 ++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 .../tests/{test_loader.py => test_from_bundle.py}  |  14 +-
 swh/loader/mercurial/tests/test_from_disk.py       | 199 +++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 ----
 swh/loader/mercurial/tests/test_tasks.py           |   6 +-
 24 files changed, 2467 insertions(+), 774 deletions(-)
 create mode 100644 swh/loader/mercurial/from_bundle.py
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 rename swh/loader/mercurial/tests/{test_loader.py => test_from_bundle.py} (93%)
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
Changes applied before test
commit b35536071623338213dcf22352c3a8f332b32344
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.
    
    Differential Revision: https://forge.softwareheritage.org/D3435

commit c8c91ab674a9ade49caacd63a5b507bab67df9dc
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Mon Oct 19 16:42:21 2020 +0200

    Add new example repository generated from script
    
    First updatable example repository documented by its generation script.

commit bc32e1280cfd6a59df595cdcbcc2c2b51b3618aa
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Mon Oct 19 16:22:07 2020 +0200

    Add `Hg20BundleLoader` tests from json files
    
    Generated json files with `swh/loader/mercurial/tests/data/build.py` for
    existing repositories and added them to `Hg20BundleLoader` tests.
    
    Introduce `LoaderChecker` as a standardized way to test repositories
    against json files.

commit ff11f77f1b493bd1c8ed257e790ded8da276101c
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Fri Oct 16 11:28:35 2020 +0200

    Add testing repository builder
    
    This build script purpose is to create example repositories from bash scripts
    and extract assertion data from them into json files.
    
    Advantages:
    
        - the bash script documents the repository creation
        - automating creation allow easy repository update
        - automation extraction allow easier update of assertion data

commit a2e9cf16919a5f81a06f955a533a254a9b3c9689
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Thu Oct 8 18:07:50 2020 +0200

    add swh-hg-identify a cli to identify hg objects

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/107/ for more details.

swh/loader/mercurial/__init__.py
10–27

I will add a loader.mercurial_from_disk=swh.loader.mercurial:register_from_disk entry. Thanks for the hint.
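For readers unfamiliar with the name=module:attribute notation used in that entry, the sketch below splits it into its parts. parse_entry_point and the EntryPoint tuple are hypothetical helpers written for illustration; packaging tools do this parsing themselves.

```python
from typing import NamedTuple


class EntryPoint(NamedTuple):
    name: str    # key under which the object is registered
    module: str  # dotted module path to import
    attr: str    # attribute (here, a function) to look up in that module


def parse_entry_point(spec: str) -> EntryPoint:
    """Split a 'name=module:attr' string into its three components."""
    name, _, target = spec.partition("=")
    module, _, attr = target.partition(":")
    return EntryPoint(name.strip(), module.strip(), attr.strip())


entry = parse_entry_point(
    "loader.mercurial_from_disk=swh.loader.mercurial:register_from_disk"
)
```

Here entry.name is the registration key, while entry.module and entry.attr identify the function that would be imported and called.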

Build is green

Patch application report for D3435 (id=16207)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..ab26d6f
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/cli.py                        |   6 +-
 swh/loader/mercurial/from_disk.py                  | 454 +++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/loader.py                     |   6 +-
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 199 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 22 files changed, 1893 insertions(+), 127 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit ab26d6fef18cc4d7939e2566de3d398b5882e582
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.
    
    Differential Revision: https://forge.softwareheritage.org/D3435

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/110/ for more details.

swh/loader/mercurial/cli.py
6–8 ↗(On Diff #16107)

Okay, so it looks like the imports should have been sorted when the pre-commit hook was added, but that is out of scope for this change, I guess.

swh/loader/mercurial/from_disk.py
181–182

(gentle ping)

246

Looks like Phabricator let this comment drift a bit. Do you remember what it was about?

277

I think that this feedback about self._revision_nodeid_to_swhid[parent_hg_nodeid] still applies. Am I missing something?

313

Do we need to keep the tip tag because the previous loader was loading it? Or is there another reason? (It would be nice if the doc were clearer about that.)

337–340

(gentle ping)

349–350

I see.

What about the binary content of the file? When is it dropped?

378

gentle ping

403

As we discussed before, we need to better define Archive here.

swh/loader/mercurial/hgutil.py
20–31

Can you give some details about what happened on this topic?

acezar marked 9 inline comments as done.

Followup

Build is green

Patch application report for D3435 (id=16257)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..ede3e31
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 468 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 199 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1901 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit ede3e31d7b8b654c81607a967982e7330d88c98a
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.
    
    Differential Revision: https://forge.softwareheritage.org/D3435

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/111/ for more details.

acezar added inline comments.
swh/loader/mercurial/from_disk.py
246

It was about passing the context rather than the nodeid, instead of asking the repo for the context again inside the method. (It was on self.store_directories(hg_nodeid).)

277

I thought it was about the whole parent block, which is why I added self.get_revision_parents(rev_ctx).

313

That was a typo: tip must be ignored because, as said, a release corresponds to a user-defined tag and tip is not user-defined.
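A minimal sketch of that rule, assuming tags arrive as a mapping from tag name to node id, both as bytes. user_defined_tags is a hypothetical helper and not the loader's actual code.

```python
from typing import Dict


def user_defined_tags(tags: Dict[bytes, bytes]) -> Dict[bytes, bytes]:
    """Drop the implicit b"tip" tag and keep only user-defined tags.

    Mercurial always reports "tip" as pointing at the most recent
    changeset; nobody created it, so it should not become a release.
    """
    return {name: node for name, node in tags.items() if name != b"tip"}
```

For example, a mapping holding b"tip" and b"v1.0" would come back with only the b"v1.0" entry.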

349–350

I made sure that only the necessary data is returned at the end of the method. The garbage collector should clean things up, provided storage does not hold a reference until flush.

swh/loader/mercurial/hgutil.py
20–31

Removed the duplication and kept the hgutil.py version.

swh/loader/mercurial/from_disk.py
349–350

Sure, but which part drops the reference to the actual file data?

The data is passed into content_data, which is passed into Content, which is passed to ModelContent, which still holds the data because self.storage.content_add is called on it.

swh/loader/mercurial/hgutil.py
20–31

Okay thanks

swh/loader/mercurial/from_disk.py
349–350

Never mind, I just noticed that we return a new object at the end of the method:

# Here we make sure to return only necessary data.
return Content({"sha1_git": content.hash, "perms": perms})

Is this code new, or did I miss it on the first pass?
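The effect of returning a fresh, small object can be sketched as follows. FullContent, ContentRef, store_content and the observed list are hypothetical names used only for this illustration, and the sketch assumes nothing else (for example a storage buffer) keeps a reference to the full object.

```python
import gc
import hashlib
import weakref
from dataclasses import dataclass


@dataclass
class FullContent:
    """Hypothetical stand-in for a content object that carries file bytes."""
    sha1_git: bytes
    data: bytes


@dataclass
class ContentRef:
    """Hypothetical lightweight result: identifier and permissions only."""
    sha1_git: bytes
    perms: int


# Weak references, recorded only so the example can observe object lifetime.
observed: list = []


def git_blob_sha1(data: bytes) -> bytes:
    """Compute the git blob identifier: sha1 over a 'blob <len>\\0' header plus data."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).digest()


def store_content(data: bytes, perms: int) -> ContentRef:
    full = FullContent(sha1_git=git_blob_sha1(data), data=data)
    observed.append(weakref.ref(full))
    # A real loader would hand `full` to storage here. If storage buffered
    # the object, it would stay alive until that buffer is flushed.
    # Return only what callers need, so `full` can be freed after this call.
    return ContentRef(sha1_git=full.sha1_git, perms=perms)
```

After store_content returns, the only remaining reference to the full object is the weak one, so the file bytes can be reclaimed while the caller keeps the small ContentRef.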

swh/loader/mercurial/from_disk.py
189–191

It is still unclear to me when and why we have this special case. Can you clarify that point?

349–350

This is indeed new code.

acezar marked 5 inline comments as done.

Followup

Build is green

Patch application report for D3435 (id=16277)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..f14a65f
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 475 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 199 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1908 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit f14a65f97b272db087b2ad823dbdeb44fda768e0
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/114/ for more details.

Build is green

Patch application report for D3435 (id=16286)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..12411aa
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 475 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 205 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1914 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit 12411aa64133e2578f20444ab09e117ccb4634d5
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/118/ for more details.

I only have one last question. The rest seems fine.

swh/loader/mercurial/from_disk.py
193

Small nit: if there is no other usage than the CLI call, maybe pulling is not necessary.

251–255

The questions still stand.

Build is green

Patch application report for D3435 (id=16302)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..e19a069
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 477 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 205 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1916 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit e19a06944997022b177b8caf252e91b2017190db
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/120/ for more details.

acezar marked an inline comment as done.

Followup

Build is green

Patch application report for D3435 (id=16303)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..58f1de0
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 475 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 205 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1914 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit 58f1de0a88a14687354ac5cbef85a431a56b7849
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

commit c8c91ab674a9ade49caacd63a5b507bab67df9dc
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Mon Oct 19 16:42:21 2020 +0200

    Add new example repository generated from script
    
    First updatable example repository documented by its generation script.

commit bc32e1280cfd6a59df595cdcbcc2c2b51b3618aa
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Mon Oct 19 16:22:07 2020 +0200

    Add `Hg20BundleLoader` tests from json files
    
    Generated json files with `swh/loader/mercurial/tests/data/build.py` for
    existing repositories and added them to `Hg20BundleLoader` tests.
    
    Introduce `LoaderChecker` as a standardized way to test repositories
    against json files.

commit ff11f77f1b493bd1c8ed257e790ded8da276101c
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Fri Oct 16 11:28:35 2020 +0200

    Add testing repository builder
    
    This build script purpose is to create example repositories from bash scripts
    and extract assertion data from them into json files.
    
    Advantages:
    
        - the bash script documents the repository creation
        - automating creation allow easy repository update
        - automation extraction allow easier update of assertion data

commit a2e9cf16919a5f81a06f955a533a254a9b3c9689
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Thu Oct 8 18:07:50 2020 +0200

    add swh-hg-identify a cli to identify hg objects

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/121/ for more details.

acezar added inline comments.
swh/loader/mercurial/from_disk.py
193

Removed the TODO

251–255

Added a comment stating that it matches the historical implementation.

349–350

New in this diff, but was already in the next revision: D4541 (without the comment).

467–475

The revision hash uses the freeform fullname when provided, and we create the author using Person.from_fullname.
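To illustrate the idea of keeping the freeform fullname intact while still deriving a name and an e-mail from it, here is a minimal sketch. It is not the `Person.from_fullname` implementation from swh.model; `Author` and `author_from_fullname` are hypothetical names, and the real method may treat edge cases differently.

```python
import re
from typing import NamedTuple, Optional


class Author(NamedTuple):
    """Hypothetical stand-in for an author record (not swh.model.Person)."""

    fullname: bytes
    name: Optional[bytes]
    email: Optional[bytes]


def author_from_fullname(fullname: bytes) -> Author:
    """Split b'Name <email>' into parts while keeping the raw fullname as-is."""
    match = re.match(rb"^(?P<name>[^<]*?)\s*<(?P<email>[^>]*)>", fullname)
    if match is None:
        # No angle-bracketed e-mail: treat the whole string as the name.
        return Author(fullname=fullname, name=fullname or None, email=None)
    # Empty groups become None so that "missing" is explicit.
    name = match.group("name") or None
    email = match.group("email") or None
    return Author(fullname=fullname, name=name, email=email)
```

The raw `fullname` bytes are never rewritten, so anything hashed from them stays stable even when the parsed name or e-mail is missing.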

There is (maybe) one last small bit to document, but the patch looks good overall.

swh/loader/mercurial/from_disk.py
467–475

Is this documented?

acezar marked 2 inline comments as done.

Followup

acezar added inline comments.
swh/loader/mercurial/from_disk.py
467–475

Added comment

Build is green

Patch application report for D3435 (id=16304)

Could not rebase; Attempt merge onto bd914dec39...

Updating bd914de..bfc44ab
Fast-forward
 requirements.txt                                   |   1 +
 setup.py                                           |   3 +
 swh/loader/mercurial/__init__.py                   |  10 +
 swh/loader/mercurial/from_disk.py                  | 478 ++++++++++++++++++
 swh/loader/mercurial/hgutil.py                     |  77 +++
 swh/loader/mercurial/identify.py                   | 541 +++++++++++++++++++++
 swh/loader/mercurial/tasks_from_disk.py            |  33 ++
 swh/loader/mercurial/tests/data/build.py           | 265 ++++++++++
 swh/loader/mercurial/tests/data/example.json       |   1 +
 swh/loader/mercurial/tests/data/example.sh         |  59 +++
 swh/loader/mercurial/tests/data/example.tgz        | Bin 0 -> 51200 bytes
 swh/loader/mercurial/tests/data/hello.json         |   1 +
 swh/loader/mercurial/tests/data/the-sandbox.json   |   1 +
 swh/loader/mercurial/tests/data/transplant.json    |   1 +
 swh/loader/mercurial/tests/loader_checker.py       |  74 +++
 swh/loader/mercurial/tests/test_from_disk.py       | 205 ++++++++
 swh/loader/mercurial/tests/test_hgutil.py          |  46 ++
 swh/loader/mercurial/tests/test_identify.py        |  74 +++
 swh/loader/mercurial/tests/test_loader.org         | 121 -----
 swh/loader/mercurial/tests/test_tasks_from_disk.py |  47 ++
 20 files changed, 1917 insertions(+), 121 deletions(-)
 create mode 100644 swh/loader/mercurial/from_disk.py
 create mode 100644 swh/loader/mercurial/hgutil.py
 create mode 100644 swh/loader/mercurial/identify.py
 create mode 100644 swh/loader/mercurial/tasks_from_disk.py
 create mode 100755 swh/loader/mercurial/tests/data/build.py
 create mode 100644 swh/loader/mercurial/tests/data/example.json
 create mode 100644 swh/loader/mercurial/tests/data/example.sh
 create mode 100644 swh/loader/mercurial/tests/data/example.tgz
 create mode 100644 swh/loader/mercurial/tests/data/hello.json
 create mode 100644 swh/loader/mercurial/tests/data/the-sandbox.json
 create mode 100644 swh/loader/mercurial/tests/data/transplant.json
 create mode 100644 swh/loader/mercurial/tests/loader_checker.py
 create mode 100644 swh/loader/mercurial/tests/test_from_disk.py
 create mode 100644 swh/loader/mercurial/tests/test_hgutil.py
 create mode 100644 swh/loader/mercurial/tests/test_identify.py
 delete mode 100644 swh/loader/mercurial/tests/test_loader.org
 create mode 100644 swh/loader/mercurial/tests/test_tasks_from_disk.py
Changes applied before test
commit bfc44ab8688a8b74ccaf7ecb25be5fb8db27f548
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/122/ for more details.

OK, we are getting close, I think. Just a few more comments to handle and we are done.

swh/loader/mercurial/from_disk.py
53

s/data/date/ I guess

122

Shouldn't the repo_directory attribute be "declared" here too?

128

I may be missing something but why is this a property? why not just initialize a self.repo in the __fetch_data__ method (where the repo_directory is initialized)?

This revision now requires changes to proceed. Nov 27 2020, 4:37 PM
acezar marked 4 inline comments as done.

Followup

swh/loader/mercurial/from_disk.py
128

You're right, it was added for an old need and is no longer necessary.
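A minimal sketch of the shape suggested in this thread: both attributes are declared in the constructor and initialized together in the fetch step, so no property is needed. All names here (`LoaderSketch`, `FakeRepo`, `fetch_data`, the underscore-prefixed attributes) are hypothetical and do not reproduce the actual from_disk.py code.

```python
from typing import Optional


class FakeRepo:
    """Hypothetical stand-in for a repository handle."""

    def __init__(self, path: str) -> None:
        self.path = path


class LoaderSketch:
    def __init__(self, directory: str) -> None:
        self.directory = directory
        # Declare both attributes up front so the full state is visible.
        self._repo_directory: Optional[str] = None
        self._repo: Optional[FakeRepo] = None

    def fetch_data(self) -> bool:
        # Initialize the directory and the repository in the same place.
        self._repo_directory = self.directory
        self._repo = FakeRepo(self._repo_directory)
        return False  # assumption: nothing more to fetch
```

Declaring the attributes as `None` in the constructor also makes it clear that the repository is unavailable until the fetch step has run.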

Build is green

Patch application report for D3435 (id=16424)

Rebasing onto c8c91ab674...

Current branch diff-target is up to date.
Changes applied before test
commit a1c8afa5e42cc58eef255c38ce0585aa71eac0a6
Author: Antoine Cezar <antoine.cezar@octobus.net>
Date:   Wed Oct 28 11:33:17 2020 +0100

    Add mercurial.from_disk.HgLoaderFromDisk
    
    Rather than relying on mercurial bundles this loader expect a local repository.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/127/ for more details.

This revision is now accepted and ready to land. Nov 30 2020, 11:43 AM
This revision was automatically updated to reflect the committed changes.