
Adapt the git loader to swh-model >0.4
Closed, Public

Authored by douardda on Jul 7 2020, 7:03 PM.

Details

Summary

In swh-model >0.4, extra_headers is now a top-level attribute of a Revision; adapt the git loader accordingly.

Add tests that actually exercise extra_headers in loaded git revisions.
For this, the fast-import based dump of the test git repo is replaced by
a bundle dump (required to ensure things like gpg signatures are kept).
A few git revisions including gpg signature, encoding and mergetag
headers are also added to the git repo.
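
To illustrate what "extra headers" means here, the following is a minimal, stdlib-only sketch of splitting a raw git commit object into its standard headers and the remaining ones (encoding, mergetag, gpgsig, ...). It is not the loader's converter: the function name `split_extra_headers` and the `STANDARD_HEADERS` set are assumptions made for this example.

```python
from typing import List, Tuple

# Headers assumed to map to dedicated revision fields (hypothetical choice
# for this sketch); every other header is treated as "extra".
STANDARD_HEADERS = {b"tree", b"parent", b"author", b"committer"}


def split_extra_headers(raw_commit: bytes) -> Tuple[Tuple[bytes, bytes], ...]:
    """Return the non-standard headers of a raw commit as (key, value) pairs."""
    # The header block ends at the first empty line; the message follows it.
    header_block, _, _message = raw_commit.partition(b"\n\n")
    headers: List[Tuple[bytes, bytes]] = []
    for line in header_block.split(b"\n"):
        if line.startswith(b" ") and headers:
            # Continuation line of a multi-line value (e.g. gpgsig):
            # drop the leading space and append to the previous header.
            key, value = headers[-1]
            headers[-1] = (key, value + b"\n" + line[1:])
        else:
            key, _, value = line.partition(b" ")
            headers.append((key, value))
    # A tuple of pairs keeps header order and is immutable.
    return tuple((k, v) for k, v in headers if k not in STANDARD_HEADERS)


raw = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A <a@example.org> 0 +0000\n"
    b"committer A <a@example.org> 0 +0000\n"
    b"encoding ISO-8859-1\n"
    b"gpgsig -----BEGIN PGP SIGNATURE-----\n"
    b" \n"
    b" data\n"
    b" -----END PGP SIGNATURE-----\n"
    b"\n"
    b"message\n"
)
extra_headers = split_extra_headers(raw)
```

This also shows why a fast-export dump is not enough for such tests: the signature lives in the commit object's header block, so the test fixture has to preserve the object byte for byte.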

Diff Detail

Repository
rDLDG Git loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D3454 (id=12229)

Rebasing onto fc2ec733aa...

First, rewinding head to replay your work on top of it...
Applying: Adapt the git loader to swh-model >0.4
Changes applied before test
commit 5ba5d65e90e336adc232838163a6154a32d90fdc
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue Jul 7 15:48:40 2020 +0200

    Adapt the git loader to swh-model >0.4
    
    in which extra_headers is now a top-level attribute of a Revision.
    
    Add tests that actually include having extra_header in loaded git revisions.
    For this, the fast-import based dump of the test git repo is replaced by
    a bundle dump (required to ensure things like gpg signatures are kept).
    A few git revisions including gpg signature, encoding and mergetag
    headers are also added to the git repo.

See https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/33/ for more details.

olasd added a subscriber: olasd.

Nice!

This revision is now accepted and ready to land. (Jul 7 2020, 9:08 PM)

Oh, that needs a requirements-swh.txt bump as well, I guess.

In D3454#84921, @olasd wrote:

Oh, that needs a requirements-swh.txt bump as well, I guess.

Yes, but I need to tag swh-storage first, since this change also requires the new revision API from the storage.

Inline comment on swh/loader/git/converters.py, line 116:

That might help:
https://github.com/dulwich/dulwich/blob/a8c7a6e3173537798c6a8f77591aac9ced88a135/dulwich/objects.py#L1229-L1252

It seems the extra fields are a supplementary tuple of freeform information on the commit.
So for the test, take an existing revision from the bundle, amend it with some of
those extra fields, and check that they come back in the extra headers.
Would that be enough?
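
The test idea above could be modelled roughly as follows. This is a stdlib-only sketch that works on raw commit bytes; a real test would go through dulwich and the loader's converter instead. The helpers `amend_with_extra_fields` and `read_headers` are hypothetical names introduced for this example.

```python
from typing import List, Sequence, Tuple

Header = Tuple[bytes, bytes]


def amend_with_extra_fields(raw_commit: bytes, extra: Sequence[Header]) -> bytes:
    """Append extra header lines just before the blank line preceding the message."""
    headers, sep, message = raw_commit.partition(b"\n\n")
    for key, value in extra:
        # Multi-line values are stored with each continuation line
        # prefixed by a single space.
        headers += b"\n" + key + b" " + value.replace(b"\n", b"\n ")
    return headers + sep + message


def read_headers(raw_commit: bytes) -> List[Header]:
    """Read all headers back, re-joining continuation lines."""
    block = raw_commit.partition(b"\n\n")[0]
    out: List[Header] = []
    for line in block.split(b"\n"):
        if line[:1] == b" " and out:
            out[-1] = (out[-1][0], out[-1][1] + b"\n" + line[1:])
        else:
            key, _, value = line.partition(b" ")
            out.append((key, value))
    return out


original = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A <a@example.org> 0 +0000\n"
    b"committer A <a@example.org> 0 +0000\n"
    b"\n"
    b"initial\n"
)
extra = [(b"x-custom", b"one"), (b"x-multi", b"line1\nline2")]
amended = amend_with_extra_fields(original, extra)

# The amended fields should come back unchanged, after the standard headers.
round_tripped = read_headers(amended)[-len(extra):]
```

Note that amending a commit this way changes its object id, since the id is a hash of the commit's content.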

Build has FAILED

Patch application report for D3454 (id=12241)

Rebasing onto fc2ec733aa...

First, rewinding head to replay your work on top of it...
Applying: Adapt the git loader to swh-model >0.4
Using index info to reconstruct a base tree...
M	requirements-swh.txt
Falling back to patching base and 3-way merge...
Removing swh/loader/git/tests/data/git-repos/example-submodule.fast-export.xz
Auto-merging requirements-swh.txt
CONFLICT (content): Merge conflict in requirements-swh.txt
Patch failed at 0001 Adapt the git loader to swh-model >0.4

Resolve all conflicts manually, mark them as resolved with
"git add/rm <conflicted_files>", then run "git rebase --continue".
You can instead skip this commit: run "git rebase --skip".
To abort and get back to the state before "git rebase", run "git rebase --abort".

Rebase failed (ret=1)!

Could not rebase; Attempt merge onto fc2ec733aa...

Already up to date.
Changes applied before test
commit f6a523cd2fcaa5e0d63f57602e35922269dcb054
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue Jul 7 15:48:40 2020 +0200

Link to build: https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/34/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/34/console

fix MANIFEST.in to include the git bundle file

Build is green

Patch application report for D3454 (id=12248)

Rebasing onto fc2ec733aa...

First, rewinding head to replay your work on top of it...
Applying: Adapt the git loader to swh-model >0.4
Using index info to reconstruct a base tree...
M	MANIFEST.in
M	requirements-swh.txt
Falling back to patching base and 3-way merge...
Removing swh/loader/git/tests/data/git-repos/example-submodule.fast-export.xz
Auto-merging requirements-swh.txt
CONFLICT (content): Merge conflict in requirements-swh.txt
Auto-merging MANIFEST.in
CONFLICT (content): Merge conflict in MANIFEST.in
Patch failed at 0001 Adapt the git loader to swh-model >0.4

Rebase failed (ret=1)!

Could not rebase; Attempt merge onto fc2ec733aa...

Already up to date.
Changes applied before test
commit 338d8bb46cc97c407d2b14cc5be99fe8f5456fea
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue Jul 7 15:48:40 2020 +0200

See https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/35/ for more details.

Build is green

Patch application report for D3454 (id=12249)

Rebasing onto fc2ec733aa...

Current branch diff-target is up to date.
Changes applied before test
commit 0394f0f85edf6d78548980cd5db0c4436ad7488e
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue Jul 7 15:48:40 2020 +0200

See https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/36/ for more details.

anlambert added a subscriber: anlambert.
anlambert added inline comments.
Inline comments on swh/loader/git/converters.py, lines 121-123:

I think you can remove these lines.

Line 137:
extra_headers=tuple(git_metadata),

small simplification as suggested by anlambert

Build is green

Patch application report for D3454 (id=12252)

Rebasing onto fc2ec733aa...

Current branch diff-target is up to date.
Changes applied before test
commit 0394f0f85edf6d78548980cd5db0c4436ad7488e
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue Jul 7 15:48:40 2020 +0200

See https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/37/ for more details.

This revision was automatically updated to reflect the committed changes.