
loader-pypi: Snapshot with null branch are badly handled by loader
Closed, ResolvedPublic

Description

Errors about None references remain in the pypi loader.

Sample stacktrace for origin https://pypi.org/project/configpy/

[2018-11-29 08:38:27,300: ERROR/Worker-1903] Loading failure, updating to `partial` status
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 886, in load
    self.prepare(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/swh/loader/pypi/loader.py", line 152, in prepare
    self._prepare_state()
  File "/usr/lib/python3/dist-packages/swh/loader/pypi/loader.py", line 161, in _prepare_state
    self.known_artifacts = self._known_artifacts(last_snapshot)
  File "/usr/lib/python3/dist-packages/swh/loader/pypi/loader.py", line 115, in _known_artifacts
    for rev in last_snapshot['branches'].values()
  File "/usr/lib/python3/dist-packages/swh/loader/pypi/loader.py", line 116, in <listcomp>
    if rev['target_type'] == 'revision']
TypeError: 'NoneType' object is not subscriptable

kibana dashboard is at [1]

Its snapshot [2] feels wrong:

{'branches': {b'HEAD': {'target': b'releases/0.5', 'target_type': 'alias'},
              b'releases/0.2': None,
              b'releases/0.3': None,
              b'releases/0.4': None,
              b'releases/0.5': {'target': b'\xa5\x978\x92;S{\xf6\xcfE\xa9r'
                                          b'\xf0\xdd\x85e\xe8Z\x1d\x80',
                                'target_type': 'revision'}},
 'id': b'\xee<\xdfq\xc76\xec\xa2Dn?\x8aE\xb3\xc3\xe0I\x87H\xb0',
 'next_branch': None}

[1] http://kibana0.internal.softwareheritage.org:5601/app/kibana#/dashboard/290b3720-f30f-11e8-b8ce-cf95f437ce37

[2] Retrieving snapshot:

from pprint import pprint

from swh.storage import get_storage

# Remote storage configuration
c = {'storage': {
  'cls': 'remote',
  'args': {'url': 'http://uffizi.internal.softwareheritage.org:5002/'}
  }
}
storage = get_storage(**c['storage'])

# Look up the origin, then fetch its latest snapshot
origin = storage.origin_get({'type': 'pypi', 'url': 'https://pypi.org/project/configpy/'})
origin_id = origin['id']

snap = storage.snapshot_get_latest(origin_id)
pprint(snap)

Note:
D710 fixed a wrong behavior in snapshot resolution (most probably, few snapshots were actually used prior to this fix).
-> This led to new errors, fixed in D727 (which catches most errors raised when trying to resolve the snapshot revisions).
-> This finally revealed that some snapshots are badly formatted (I did not expect snapshot branches to target None).
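
To illustrate the failure in the stacktrace above, here is a minimal sketch of a branch filter that tolerates null branches. The function name `revision_targets` and the shortened sample snapshot are hypothetical and written only for this example; this is not the loader's actual code, just one way the list comprehension could skip `None` branches.

```python
from typing import List, Optional


def revision_targets(snapshot: Optional[dict]) -> List[bytes]:
    """Return the targets of revision branches, skipping null branches.

    Hypothetical helper for illustration; not part of swh.loader.pypi.
    """
    if not snapshot:
        return []
    return [
        branch['target']
        for branch in snapshot.get('branches', {}).values()
        # A branch may be None, as for releases/0.2 in the snapshot above.
        if branch is not None and branch['target_type'] == 'revision'
    ]


# Shortened sample shaped like the snapshot in this task (target truncated).
snapshot = {
    'branches': {
        b'HEAD': {'target': b'releases/0.5', 'target_type': 'alias'},
        b'releases/0.2': None,
        b'releases/0.5': {'target': b'\xa5\x97', 'target_type': 'revision'},
    },
}

print(revision_targets(snapshot))
```

Checking `branch is not None` before subscripting avoids the `TypeError` while leaving alias branches such as `HEAD` out of the result, as the original `target_type` test already did.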

Event Timeline

ardumont triaged this task as Normal priority. Nov 29 2018, 10:05 AM
ardumont created this task.
olasd added a subscriber: olasd. Nov 29 2018, 10:35 AM

I'd expect branches to have a null target if that release only has binary distributions, but that's not the case for configpy. This needs to be investigated further.

This looks related to source packages that lack a PKG-INFO file; see the loader's debug output below:

antoine@guggenheim:~/tmp$ python3 -m swh.loader.pypi.loader configpy
WARNING:swh.loader.pypi.PyPILoader:** DEBUG MODE ** Will not pre-clean up temp dir /tmp/swh.loader.pypi/swh.loader.pypi.bjsw73cu-4459
DEBUG:swh.loader.pypi.PyPILoader:Creating pypi origin for https://pypi.org/projects/configpy/
DEBUG:swh.loader.pypi.PyPILoader:Done creating pypi origin for https://pypi.org/projects/configpy/
DEBUG:swh.loader.pypi.PyPILoader:Creating origin_visit for origin 8 at time 2018-11-29 09:51:03.111485+00:00
DEBUG:swh.loader.pypi.PyPILoader:Done Creating origin_visit for origin 8 at time 2018-11-29 09:51:03.111485+00:00
DEBUG:root:Release version: 0.2
DEBUG:root:Artifact local path: /tmp/swh.loader.pypi/swh.loader.pypi.bjsw73cu-4459/configpy/0.2/configpy-0.2.tgz
WARNING:root:configpy 0.2: No PKG-INFO detected, skipping
DEBUG:root:Release version: 0.4
DEBUG:root:Artifact local path: /tmp/swh.loader.pypi/swh.loader.pypi.bjsw73cu-4459/configpy/0.4/configpy-0.4.tgz
WARNING:root:configpy 0.4: No PKG-INFO detected, skipping
DEBUG:root:Release version: 0.5
DEBUG:root:Artifact local path: /tmp/swh.loader.pypi/swh.loader.pypi.bjsw73cu-4459/configpy/0.5/configpy-0.5.tar.gz
DEBUG:root:Clean up uncompressed archive path /tmp/swh.loader.pypi/swh.loader.pypi.bjsw73cu-4459/configpy/0.5/uncompress
DEBUG:root:Clean up archive /tmp/swh.loader.pypi/swh.loader.pypi.bjsw73cu-4459/configpy/0.5/configpy-0.5.tar.gz
DEBUG:root:Release version: 0.3
DEBUG:root:Artifact local path: /tmp/swh.loader.pypi/swh.loader.pypi.bjsw73cu-4459/configpy/0.3/configpy-0.3.tgz
WARNING:root:configpy 0.3: No PKG-INFO detected, skipping
DEBUG:swh.loader.pypi.PyPILoader:Sending 13 contents
DEBUG:swh.loader.pypi.PyPILoader:Done sending 13 contents
DEBUG:swh.loader.pypi.PyPILoader:Sending 5 directories
DEBUG:swh.loader.pypi.PyPILoader:Done sending 5 directories
DEBUG:swh.loader.pypi.PyPILoader:Sending 1 revisions
DEBUG:swh.loader.pypi.PyPILoader:Done sending 1 revisions
DEBUG:swh.loader.pypi.PyPILoader:Updating origin_visit for origin 8 with status partial
DEBUG:swh.loader.pypi.PyPILoader:Done updating origin_visit for origin 8 with status partial
WARNING:swh.loader.pypi.PyPILoader:** DEBUG MODE ** Will not clean up temp dir /tmp/swh.loader.pypi/swh.loader.pypi.bjsw73cu-4459
olasd added a comment. Nov 29 2018, 5:17 PM

Then I think these snapshots do look as expected, and the surrounding code should be adapted :)

> Then I think these snapshots do look as expected, and the surrounding code should be adapted :)

That's good news!

ardumont renamed this task from loader-pypi: badly(?) formatted snapshot targets None releases to loader-pypi: Snapshot with null branch are badly handled by loader. Nov 29 2018, 9:27 PM
ardumont closed this task as Resolved. Nov 30 2018, 10:53 AM
ardumont claimed this task.

And deployed.