
mirror PyPI
Closed, Migrated (Edits Locked)

Description

[ it is not yet clear whether the best way to remain up to date with PyPI is to keep a local mirror or to follow a list of changes there, but in the meantime here is some information about how to mirror PyPI, which I've tracked down for unrelated reasons ]
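For the "follow a list of changes" option, here is a minimal sketch, assuming PyPI's XML-RPC changelog methods (`changelog_last_serial` and `changelog_since_serial`, which return 5-tuples of name, version, timestamp, action, serial). The helper name `changed_packages` is made up for illustration, and the endpoint URL and method availability should be checked against the current PyPI documentation before relying on them.

```python
import xmlrpc.client

# Assumed endpoint for PyPI's XML-RPC API; verify before use.
PYPI_XMLRPC_URL = "https://pypi.org/pypi"


def changed_packages(client, since_serial):
    """Return (sorted names of packages changed after since_serial, highest serial seen).

    `client` is anything exposing changelog_since_serial(serial), for
    example xmlrpc.client.ServerProxy(PYPI_XMLRPC_URL).
    """
    events = client.changelog_since_serial(since_serial)
    names = set()
    last_serial = since_serial
    # Each event is assumed to be (name, version, timestamp, action, serial).
    for name, _version, _timestamp, _action, serial in events:
        names.add(name)
        last_serial = max(last_serial, serial)
    return sorted(names), last_serial


# Live usage (performs network calls, so it is left commented out):
#   client = xmlrpc.client.ServerProxy(PYPI_XMLRPC_URL)
#   serial = client.changelog_last_serial()
#   names, serial = changed_packages(client, serial)
```

Storing the returned serial between runs would let a poller resume where it stopped, without keeping a full local mirror.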

PyPI is easy to mirror, and a network of public mirrors already exists. The Python package bandersnatch automates the task of setting up an initial mirror and keeping it up to date.

Time and space figures for mirroring PyPI, at the time of writing:

  • a full mirror took about 1 day to retrieve
  • a subsequent update (to catch up with new packages that arrived during the mirroring day) took about 10 minutes
  • on-disk space (for compressed packages in various formats) is about 290 GB
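As a rough illustration of what setting up such a mirror involves, here is a minimal sketch that renders a bandersnatch configuration file. The option names follow bandersnatch's documented `[mirror]` section, but the required options and defaults vary between versions; the helper name `render_bandersnatch_config`, the example directory, and the chosen values are assumptions, not the settings used for the figures above.

```python
import configparser
import io


def render_bandersnatch_config(directory, workers=3):
    """Return the text of a minimal bandersnatch config (illustrative values)."""
    config = configparser.ConfigParser()
    config["mirror"] = {
        "directory": directory,       # where the mirror is stored on disk
        "master": "https://pypi.org",  # upstream index to mirror from
        "timeout": "10",               # network timeout, in seconds
        "workers": str(workers),       # parallel download workers
        "hash-index": "false",
        "stop-on-error": "false",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Example directory only; pick a volume with enough free space.
print(render_bandersnatch_config("/srv/pypi"))
```

Saving the output as `bandersnatch.conf` and running `bandersnatch -c bandersnatch.conf mirror` should perform the initial sync; re-running the same command later picks up only what changed.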

Event Timeline

A full mirror of PyPI, taken 1 day ago, is currently available on the Debsources machine under /srv/pypi. Ask @zack if you don't have access and would like to have a look at its structure.

> PyPI is easy to mirror [1], and a network of public mirrors [2] already exists. The Python package bandersnatch [3] automates the task of setting up an initial mirror and keeping it up to date.

Most doc links in this sentence are either broken ([1] returns 404) or inaccessible ([2] returns 403); only the bandersnatch link [3] still works.
Looking at the faq [4], they also (now?) recommend bandersnatch. Quoting it:

How can I run a mirror of PyPI?

    If you need to run your own mirror of PyPI, the bandersnatch [3] project is the recommended solution. 
    Note that the storage requirements for a PyPI mirror would exceed 1 terabyte—and growing!

[1] https://pypi.python.org/mirrors
[2] https://www.pypi-mirrors.org/
[3] https://pypi.org/project/bandersnatch/ (that url seems stable ;)
[4] https://pypi.org/help/#mirroring

> Looking at the faq [4], they also (now?) recommend bandersnatch. Quoting it:

not sure if that's useful to you, but just in case: the links are most likely out of date because, since I opened this task 2 years ago, PyPI has been completely revamped. See, e.g.:

better LWN link to the actual article covering this: https://lwn.net/Articles/751458/

> the out-of-dated-ness is most likely due to the fact that, since I opened this task 2 years ago...

yes, i guessed as much.

> pypi has been completely revamped. See, e.g.:...

And thanks for the links ;)

ardumont claimed this task.

As per comment [1], closing this as we will not implement mirroring.

[1] https://forge.softwareheritage.org/T422#21479