Re-implement debian loader with package loader mechanism
Started, Work in Progress, Normal, Public

Event Timeline

ardumont triaged this task as Normal priority. Tue, Oct 1, 1:22 PM
ardumont created this task.

Current input of the Debian loader (without the optimization), as output by the lister:

sample = {
    'date': '2019-10-12T05:58:09.165557+00:00',
    'origin': {'type': 'deb', 'url': 'deb://Debian/packages/cicero'},
    'packages': {
        'stretch/contrib/0.7.2-3': {
            'files': {
                'cicero_0.7.2-3.diff.gz': {
                    'md5sum': 'a93661b6a48db48d59ba7d26796fc9ce',
                    'name': 'cicero_0.7.2-3.diff.gz',
                    'sha256': 'f039c9642fe15c75bed5254315e2a29f9f2700da0e29d9b0729b3ffc46c8971c',
                    'size': 3964,
                    'uri': 'http://deb.debian.org/debian//pool/contrib/c/cicero/cicero_0.7.2-3.diff.gz',
                },
                'cicero_0.7.2-3.dsc': {
                    'md5sum': 'd5dac83eb9cfc9bb52a15eb618b4670a',
                    'name': 'cicero_0.7.2-3.dsc',
                    'sha256': '35b7f1048010c67adfd8d70e4961aefd8800eb9a83a4d1cc68088da0009d9a03',
                    'size': 1864,
                    'uri': 'http://deb.debian.org/debian//pool/contrib/c/cicero/cicero_0.7.2-3.dsc',
                },
                'cicero_0.7.2.orig.tar.gz': {
                    'md5sum': '4353dede07c5728319ba7f5595a7230a',
                    'name': 'cicero_0.7.2.orig.tar.gz',
                    'sha256': '63f40f2436ea9f67b44e2d4bd669dbabe90e2635a204526c20e0b3c8ee957786',
                    'size': 96527,
                    'uri': 'http://deb.debian.org/debian//pool/contrib/c/cicero/cicero_0.7.2.orig.tar.gz',
                },
            },
            'id': 23,
            'name': 'cicero',
            'revision_id': None,
            'version': '0.7.2-3',
        },
    },
}

What I understand a bit better now is that the .dsc is the intrinsic metadata file (well, I knew that ;)).
Contrary to other package loaders though, we cannot look into the archive tarball (.orig.tar.gz) to find it, because it's not in there: it already sits outside the tarball and is provided as one of the entries of 'files' in the dict.
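
To make that concrete, here is a minimal sketch of picking the .dsc entry out of 'files' instead of opening the tarball. The helper name find_dsc is hypothetical (it is not part of the loader), and the package dict is a trimmed copy of the entry from the sample above with the checksums and URIs left out.

```python
from typing import Any, Mapping

# Trimmed copy of sample['packages']['stretch/contrib/0.7.2-3'];
# checksums and URIs are omitted for brevity.
package = {
    'files': {
        'cicero_0.7.2-3.diff.gz': {'name': 'cicero_0.7.2-3.diff.gz', 'size': 3964},
        'cicero_0.7.2-3.dsc': {'name': 'cicero_0.7.2-3.dsc', 'size': 1864},
        'cicero_0.7.2.orig.tar.gz': {'name': 'cicero_0.7.2.orig.tar.gz', 'size': 96527},
    },
    'name': 'cicero',
    'version': '0.7.2-3',
}


def find_dsc(files: Mapping[str, Mapping[str, Any]]) -> Mapping[str, Any]:
    """Return the .dsc entry of a 'files' mapping (hypothetical helper).

    Assumes a source package ships exactly one .dsc file and raises
    ValueError otherwise.
    """
    dscs = [entry for name, entry in files.items() if name.endswith('.dsc')]
    if len(dscs) != 1:
        raise ValueError('expected exactly one .dsc file, found %d' % len(dscs))
    return dscs[0]


dsc = find_dsc(package['files'])
print(dsc['name'])
```

The point is only that the metadata file is selected by name from the listing; no archive has to be unpacked to find it.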

tl;dr:

  • we need to adapt the current package loader to allow that ;)
  • the current Debian loader's entry point which deals with that is the process_package(package) function, where package is a Mapping[str, Mapping[str, str]]: something like, in the current sample, the value sample['packages']['stretch/contrib/0.7.2-3'].
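
A rough sketch of what that entry point receives, assuming the lister output above. The body of process_package here is a stand-in that only lists the URIs to fetch; it is not the loader's actual implementation. The annotation is loosened to Mapping[str, Any] as an assumption, since the sample entry also holds plain values such as 'id' and 'name'.

```python
from typing import Any, List, Mapping

BASE = 'http://deb.debian.org/debian//pool/contrib/c/cicero/'

# Reduced copy of the lister output (checksums and sizes omitted).
sample = {
    'origin': {'type': 'deb', 'url': 'deb://Debian/packages/cicero'},
    'packages': {
        'stretch/contrib/0.7.2-3': {
            'files': {
                'cicero_0.7.2-3.diff.gz': {
                    'name': 'cicero_0.7.2-3.diff.gz',
                    'uri': BASE + 'cicero_0.7.2-3.diff.gz',
                },
                'cicero_0.7.2-3.dsc': {
                    'name': 'cicero_0.7.2-3.dsc',
                    'uri': BASE + 'cicero_0.7.2-3.dsc',
                },
                'cicero_0.7.2.orig.tar.gz': {
                    'name': 'cicero_0.7.2.orig.tar.gz',
                    'uri': BASE + 'cicero_0.7.2.orig.tar.gz',
                },
            },
            'id': 23,
            'name': 'cicero',
            'revision_id': None,
            'version': '0.7.2-3',
        },
    },
}


def process_package(package: Mapping[str, Any]) -> List[str]:
    """Stand-in body: return the URIs of the package's files, sorted by file name."""
    files = package['files']
    return [files[name]['uri'] for name in sorted(files)]


# Each value of sample['packages'] is one `package` argument.
for key, package in sample['packages'].items():
    uris = process_package(package)
    print(key, len(uris))
```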
ardumont changed the task status from Open to Work in Progress. Sun, Oct 13, 6:37 PM