
Support serialization and deserialization of ints of arbitrary length
Closed, Public

Authored by olasd on Apr 17 2020, 9:52 AM.

Details

Summary

msgpack only has built-in support for ints that fit in 64 bits. However, we
store arbitrary JSON in the archive, and JSON allows integers of arbitrary
length. Python maps these to arbitrary-precision ("long") integers, which the
msgpack encoder cannot pack.
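
To make the problem above concrete, here is a small sketch using only the standard library. The payload is a made-up example (not the integer found in the archive), and the helper name fits_msgpack_int is hypothetical. The bounds reflect the 64-bit signed and unsigned integer formats that msgpack provides.

```python
import json

# Hypothetical payload: 2**64 is one past the largest unsigned 64-bit value.
payload = '{"value": 18446744073709551616}'
value = json.loads(payload)["value"]

# msgpack's integer formats cover signed and unsigned 64-bit values.
MSGPACK_INT_MIN = -(2**63)
MSGPACK_INT_MAX = 2**64 - 1


def fits_msgpack_int(n: int) -> bool:
    """Return True if n can be packed as a native msgpack integer."""
    return MSGPACK_INT_MIN <= n <= MSGPACK_INT_MAX


# The JSON parser happily returns a Python int that is too wide for msgpack.
print(type(value).__name__, value.bit_length(), fits_msgpack_int(value))
```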

Fortunately, the encoder passes overflowing integers to its "default" hook. We
use it to generate a msgpack "extended type" with code 1 for arbitrary-length
integers, which is turned back into an int on deserialization.
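
The approach described above could look roughly like the following sketch. It is not the code from this diff. The function names and the byte layout (big-endian, signed two's complement) are assumptions chosen for illustration; only the extension code 1 comes from the summary. The msgpack import is guarded because msgpack is a third-party package.

```python
try:
    import msgpack  # third-party: pip install msgpack
except ImportError:  # keep the pure-Python helpers usable without it
    msgpack = None

BIGINT_EXT_CODE = 1  # extension code named in the summary


def encode_bigint(n: int) -> bytes:
    """Serialize an int of any size (assumed layout: big-endian, signed)."""
    length = (n.bit_length() + 8) // 8  # always leaves room for the sign bit
    return n.to_bytes(length, "big", signed=True)


def decode_bigint(data: bytes) -> int:
    """Inverse of encode_bigint."""
    return int.from_bytes(data, "big", signed=True)


def default(obj):
    """Packer hook: called for objects msgpack cannot pack natively."""
    if isinstance(obj, int):
        # An int only reaches this hook when it overflowed 64 bits.
        return msgpack.ExtType(BIGINT_EXT_CODE, encode_bigint(obj))
    raise TypeError(f"cannot serialize {type(obj).__name__}")


def ext_hook(code: int, data: bytes):
    """Unpacker hook: called for every extended type in the stream."""
    if code == BIGINT_EXT_CODE:
        return decode_bigint(data)
    return msgpack.ExtType(code, data)  # leave unknown extensions untouched


def roundtrip(obj):
    """Pack then unpack obj with the big-int hooks installed."""
    packed = msgpack.packb(obj, default=default, use_bin_type=True)
    return msgpack.unpackb(packed, ext_hook=ext_hook, raw=False)
```

With msgpack installed, roundtrip({"n": 2**100}) should give back an equal dict, while small ints keep using the native msgpack integer formats because they never reach the default hook.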

Test Plan

Extended the msgpack roundtrip test with the culprit integer noticed
in the archive.

Diff Detail

Repository
rDCORE Foundations and core functionalities
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D3026 (id=10747)

Rebasing onto ad8bf9c09f...

Current branch diff-target is up to date.
Changes applied before test
commit b7ec05dbe03c717f42b03c14a5f968e4217331eb
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Fri Apr 17 09:48:32 2020 +0200

    Support serialization and deserialization of ints of arbitrary length
    
    msgpack only has built-in support for ints that fit in 64 bits. However, we
    happen to be storing arbitrary json in the archive, which itself has support for
    integers of arbitrary length, which themselves are mapped to "long" integers in
    Python, which make the msgpack encoder blow up.
    
    Fortunately, overflowing integers are passed to the default object hook. We
    generate a msgpack "extended type" with code 1 for arbitrary integers.

See https://jenkins.softwareheritage.org/job/DCORE/job/tests-on-diff/3/ for more details.

This revision is now accepted and ready to land.
Apr 17 2020, 10:52 AM