Rename bundle types and use SWHIDs everywhere instead of raw sha1_git
Closed, Public

Authored by vlorentz on Aug 18 2021, 3:34 PM.

Details

Summary

The interfaces were a bit messy, as bundle types sometimes included the object
type ('directory' was also a bundle type) and sometimes did not.

Bundle types now do not include the object type at all, which means users
do not have to care about which cooker will cook their bundle; they just
need to provide a bundle type.

The information about the object type is now carried alongside the object
id, in a SWHID.

This allows simplifying interfaces and removing multiple conversions
(especially between hex and bytes).
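
As a rough illustration of the idea (not the code from this diff), the sketch below parses a core SWHID of the form `swh:1:<type>:<40 hex digits>` and uses the object type it carries to pick a cooker for a given bundle type. The `parse_swhid` and `pick_cooker` helpers, the `COOKERS` table, and the bundle type names `flat`, `gitfast` and `git_bare` are assumptions made for this example; the actual code presumably relies on the SWHID classes from `swh.model` rather than a hand-written parser.

```python
import re
from typing import Dict, NamedTuple, Tuple

# Core SWHID syntax: swh:1:<object type>:<40 lowercase hex digits>
_SWHID_RE = re.compile(r"swh:1:(cnt|dir|rev|rel|snp):([0-9a-f]{40})")


class ParsedSWHID(NamedTuple):
    object_type: str  # "cnt", "dir", "rev", "rel" or "snp"
    object_id: bytes  # raw 20-byte sha1_git


def parse_swhid(swhid: str) -> ParsedSWHID:
    """Split a core SWHID into its object type and raw object id."""
    match = _SWHID_RE.fullmatch(swhid)
    if match is None:
        raise ValueError(f"invalid core SWHID: {swhid!r}")
    # The hex-to-bytes conversion happens once, here, instead of at
    # every interface boundary.
    return ParsedSWHID(match.group(1), bytes.fromhex(match.group(2)))


# Hypothetical registry: (bundle type, object type) -> cooker name.
# The bundle type names are assumed for illustration.
COOKERS: Dict[Tuple[str, str], str] = {
    ("flat", "dir"): "DirectoryCooker",
    ("flat", "rev"): "RevisionFlatCooker",
    ("gitfast", "rev"): "RevisionGitfastCooker",
    ("git_bare", "rev"): "GitBareCooker",
}


def pick_cooker(bundle_type: str, swhid: str) -> str:
    """Select a cooker from the bundle type plus the type inside the SWHID."""
    parsed = parse_swhid(swhid)
    key = (bundle_type, parsed.object_type)
    if key not in COOKERS:
        raise ValueError(
            f"no cooker for bundle type {bundle_type!r} "
            f"and object type {parsed.object_type!r}"
        )
    return COOKERS[key]
```

With this shape, a caller only passes a bundle type and a SWHID, for example `pick_cooker("flat", "swh:1:dir:" + "0" * 40)`, and never has to name the cooker or convert the id itself.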

This is unfortunately a pretty large commit, as it is easier to change
everything at once than to do it incrementally and add temporary
conversions at shifting boundaries.

It also invalidates the existing cache entirely, as the cache used bundle
types and object ids as keys, and both have now changed.
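
To see why every existing entry becomes unreachable, consider a cache that derives its internal key by hashing the bundle type together with the object id. This is a minimal sketch under that assumption: `cache_key` is a hypothetical helper, the old bundle type name `revision_gitfast` is inferred from the cooker file names, and the new name `gitfast` is assumed, so the actual key scheme in `swh/vault/cache.py` may differ.

```python
import hashlib


def cache_key(bundle_type: str, object_ref: str) -> str:
    """Hypothetical key derivation: SHA-1 of "<bundle type>:<object reference>"."""
    return hashlib.sha1(f"{bundle_type}:{object_ref}".encode()).hexdigest()


hex_id = "0123456789abcdef0123456789abcdef01234567"

# Before: the bundle type embeds the object type, and the object is
# referenced by its hex sha1_git.
old_key = cache_key("revision_gitfast", hex_id)

# After: generic bundle type, and the object is referenced by a SWHID.
new_key = cache_key("gitfast", f"swh:1:rev:{hex_id}")

# Both inputs to the key changed, so a lookup with the new key cannot
# find a bundle stored under the old one.
keys_differ = old_key != new_key
```

Since both components of the key change, no old entry can be found under a new key, which is consistent with dropping and recreating the cache rather than migrating it.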

Depends on D6111.

Event Timeline

Build has FAILED

Patch application report for D6112 (id=22116)

Could not rebase; Attempt merge onto c269238587...

Updating c269238..8ef973b
Fast-forward
 swh/vault/api/client.py                 |   4 +
 swh/vault/api/server.py                 |   5 +-
 swh/vault/backend.py                    | 188 ++++++++++++----------------
 swh/vault/cache.py                      |  32 ++---
 swh/vault/cli.py                        |  52 ++++----
 swh/vault/cookers/__init__.py           |  52 +++++---
 swh/vault/cookers/base.py               |  52 ++++----
 swh/vault/cookers/directory.py          |   7 +-
 swh/vault/cookers/git_bare.py           |  29 ++---
 swh/vault/cookers/revision_flat.py      |  10 +-
 swh/vault/cookers/revision_gitfast.py   |   6 +-
 swh/vault/in_memory_backend.py          |  24 ++--
 swh/vault/interface.py                  |  19 ++-
 swh/vault/sql/30-schema.sql             |  14 +--
 swh/vault/tests/test_backend.py         | 172 ++++++++++++-------------
 swh/vault/tests/test_cache.py           |  58 ++++-----
 swh/vault/tests/test_cli.py             |  21 ++--
 swh/vault/tests/test_cookers.py         | 214 ++++++++++++++++----------------
 swh/vault/tests/test_cookers_base.py    |  29 +++--
 swh/vault/tests/test_git_bare_cooker.py |  76 ++++++++----
 swh/vault/tests/test_init_cookers.py    |   8 +-
 swh/vault/tests/test_server.py          |  15 ++-
 22 files changed, 550 insertions(+), 537 deletions(-)
Changes applied before test
commit 8ef973b7e88ffa5dad803bced371269017ae7a83
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 18 15:33:29 2021 +0200

    Rename bundle types and use SWHIDs everywhere instead of raw sha1_git

commit d9e712cf70821a80e391c444d748fa3c66d130a0
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 18 12:25:46 2021 +0200

    Add support for releases pointing to other releases or contents.

Link to build: https://jenkins.softwareheritage.org/job/DVAU/job/tests-on-diff/170/
See console output for more information: https://jenkins.softwareheritage.org/job/DVAU/job/tests-on-diff/170/console

Harbormaster returned this revision to the author for changes because remote builds failed. Aug 18 2021, 3:35 PM
Harbormaster failed remote builds in B23073: Diff 22116!

Build is green

Patch application report for D6112 (id=22117)

Could not rebase; Attempt merge onto c269238587...

Updating c269238..6e3a193
Fast-forward
 swh/vault/api/client.py                 |   4 +
 swh/vault/api/serializers.py            |  17 +++
 swh/vault/api/server.py                 |   5 +-
 swh/vault/backend.py                    | 188 ++++++++++++----------------
 swh/vault/cache.py                      |  32 ++---
 swh/vault/cli.py                        |  52 ++++----
 swh/vault/cookers/__init__.py           |  52 +++++---
 swh/vault/cookers/base.py               |  52 ++++----
 swh/vault/cookers/directory.py          |   7 +-
 swh/vault/cookers/git_bare.py           |  29 ++---
 swh/vault/cookers/revision_flat.py      |  10 +-
 swh/vault/cookers/revision_gitfast.py   |   6 +-
 swh/vault/in_memory_backend.py          |  24 ++--
 swh/vault/interface.py                  |  19 ++-
 swh/vault/sql/30-schema.sql             |  14 +--
 swh/vault/tests/test_backend.py         | 172 ++++++++++++-------------
 swh/vault/tests/test_cache.py           |  58 ++++-----
 swh/vault/tests/test_cli.py             |  21 ++--
 swh/vault/tests/test_cookers.py         | 214 ++++++++++++++++----------------
 swh/vault/tests/test_cookers_base.py    |  29 +++--
 swh/vault/tests/test_git_bare_cooker.py |  76 ++++++++----
 swh/vault/tests/test_init_cookers.py    |   8 +-
 swh/vault/tests/test_server.py          |  15 ++-
 23 files changed, 567 insertions(+), 537 deletions(-)
 create mode 100644 swh/vault/api/serializers.py
Changes applied before test
commit 6e3a1936020237b81f194bfa073ad666be251f66
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 18 15:33:29 2021 +0200

    Rename bundle types and use SWHIDs everywhere instead of raw sha1_git

commit d9e712cf70821a80e391c444d748fa3c66d130a0
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 18 12:25:46 2021 +0200

    Add support for releases pointing to other releases or contents.

See https://jenkins.softwareheritage.org/job/DVAU/job/tests-on-diff/171/ for more details.

anlambert added a subscriber: anlambert.

Looks good but the SQL schema migration file is missing in sql/upgrades so I cannot accept the diff yet.

This revision now requires changes to proceed. Aug 19 2021, 12:28 PM

fix task arg serialization/deserialization

> Looks good but the SQL schema migration file is missing in sql/upgrades so I cannot accept the diff yet.

Actually, we're going to drop the database + objstorage and recreate it, writing a migration for this looks like too much trouble for a cache.

Build is green

Patch application report for D6112 (id=22124)

Rebasing onto c269238587...

Current branch diff-target is up to date.
Changes applied before test
commit fc7bcc02caa19d69c8cf54bf933a29f886e64e4e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 18 15:33:29 2021 +0200

    Rename bundle types and use SWHIDs everywhere instead of raw sha1_git

commit d9e712cf70821a80e391c444d748fa3c66d130a0
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 18 12:25:46 2021 +0200

    Add support for releases pointing to other releases or contents.

See https://jenkins.softwareheritage.org/job/DVAU/job/tests-on-diff/172/ for more details.

>> Looks good but the SQL schema migration file is missing in sql/upgrades so I cannot accept the diff yet.
>
> Actually, we're going to drop the database + objstorage and recreate it, writing a migration for this looks like too much trouble for a cache.

Ack, I guess you can remove the existing migration files then.

This revision is now accepted and ready to land. Aug 19 2021, 3:02 PM