While working on the npm loader by adapting the code from the PyPI one,
I noticed that downloading tarballs was very slow.
It turned out that this was due to the use of the iter_content method
from the requests response API. By default, that method iterates
over the response content one byte at a time, hence the slow download.
Setting the chunk_size parameter of that method to None reads data
as it arrives, in whatever size the chunks are received, and greatly
speeds up the download.
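
A minimal sketch of the idea, not the loader's actual code: save_stream
and download_tarball are hypothetical names used only for illustration.
It assumes the response was requested with stream=True, which is what
lets chunk_size=None yield data as it arrives.

```python
def save_stream(response, path):
    """Write a streamed response body to `path`; return the byte count."""
    written = 0
    with open(path, "wb") as f:
        # chunk_size=None yields chunks in whatever size they are received.
        # The default (chunk_size=1) yields a single byte per iteration.
        for chunk in response.iter_content(chunk_size=None):
            f.write(chunk)
            written += len(chunk)
    return written


def download_tarball(url, path):
    """Hypothetical wrapper: fetch `url` with requests and save it to `path`."""
    import requests  # imported here so save_stream stays dependency-free

    # stream=True defers reading the body until iter_content is called.
    with requests.get(url, stream=True) as response:
        response.raise_for_status()
        return save_stream(response, path)
```

Keeping the write loop separate from the HTTP call is only a convenience
here: it lets the loop be exercised with any object exposing iter_content.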
For instance, before that fix, loading all Sphinx packages took:
    $ time python3 -m swh.loader.pypi.loader sphinx
    ...
    real 53m53,489s
    user 53m19,212s
    sys  0m11,460s
After that fix, the same process takes:
    $ time python3 -m swh.loader.pypi.loader sphinx
    ...
    real 2m21,667s
    user 0m55,900s
    sys  0m10,416s