To install WSL as suggested, that documentation [1] should probably help.

Then again, I'll check the PyPI API's documentation. Hopefully, it's explained somewhere ;)

In T421#21696, @zack wrote:

Still, we should probably have a "master" branch, to ease navigation, shouldn't we? (What do we do for Debian packages on this?)

So, having one branch in the snapshot per distribution format (tar/zip/etc.) is a nice and clean way of handling this.

In T421#21693, @olasd wrote:

Unpack all the sdist formats

If all is well, the contents are identical. In that case, the revision objects end up with the same id; we can ignore that there ever were multiple formats, and just have a single branch pointing to a single revision for that version of the package in the snapshot.

If the contents are different, load both and make the snapshot have a branch pointing to each format.
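
A minimal sketch of the comparison described above, assuming each sdist format has already been unpacked into its own directory. The names `tree_digest` and `snapshot_branches` are hypothetical, and the SHA-1 over sorted paths and file contents is only a stand-in for the real revision id computation, which this thread does not spell out.

```python
import hashlib
from pathlib import Path
from typing import Dict


def tree_digest(root: Path) -> str:
    """Hash relative paths and file contents of a tree, in sorted order.

    Stand-in for a real revision id: equal trees give equal digests.
    """
    h = hashlib.sha1()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        h.update(path.relative_to(root).as_posix().encode())
        h.update(b"\0")
        h.update(path.read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def snapshot_branches(version: str, unpacked: Dict[str, Path]) -> Dict[str, str]:
    """Map branch names to digests for one package version.

    `unpacked` maps a format name (e.g. "tar.gz", "zip") to the directory
    that format was unpacked into.
    """
    digests = {fmt: tree_digest(root) for fmt, root in unpacked.items()}
    if len(set(digests.values())) == 1:
        # Identical contents: forget the formats, keep a single branch.
        return {version: next(iter(digests.values()))}
    # Contents differ: one branch per distribution format.
    return {f"{version}/{fmt}": digest for fmt, digest in digests.items()}
```

The branch naming (`version` versus `version/format`) is an illustrative choice, not something decided in this discussion.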

The Debian loader doesn't create release objects. Our data model doesn't allow attaching arbitrary structured metadata to release objects (as Git doesn't either), so we've skipped this level of indirection.

In T421#21639, @zack wrote:

The basic loader will be the tarball loader, yes. In addition to that there are two aspects to be defined:

(1) the stack of objects to be added to the DAG

(2) the metadata to extract

For (1), I think what we currently do for Debian packages is as you said, i.e., snapshot -> release -> revision -> tarball root dir. Maybe you can check for comparison (or @olasd can chime in?). We should do the same here.

There remain 3 actions to do for the current implementation to be complete:

As far as I can tell from those examples, the metadata that PyPI gives you are the most recent ones, probably the ones extracted from the most recent version, so it would be incorrect to associate them with other releases.

In T421#21642, @ardumont wrote:

The PyPI API already provides quite a lot of information (see P288 and P289 for examples).
For now, the current implementation leverages it.

capable of extracting upstream metadata that are meaningful (and specific to) PyPI.

In T422#21473, @ardumont wrote:
If your consumer is actually an organization or service that will be downloading a lot of packages from PyPI, consider using your own index mirror or cache.
That's not a sustainable approach. If we choose that path for all the forges we need to archive... that will be difficult in terms of infrastructure and maintenance.

They have multiple APIs:

a basic JSON one [1], which allows requesting information on a per-project basis (no listing) (~> I foresee using this one for the loader)
a deprecated XML-RPC one [2] (this one lists ~> that would be for the lister)
an HTML page (listing all packages)
an RSS feed (update events)
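
As a rough illustration of how a loader could use the per-project JSON API, here is a sketch. The URL pattern and the `releases`, `packagetype` and `url` field names are my assumptions about the payload layout rather than anything stated in this thread, and the function names are hypothetical.

```python
import json
from typing import Dict, List
from urllib.request import urlopen


def project_url(project: str) -> str:
    """Build the per-project JSON endpoint (assumed URL pattern)."""
    return f"https://pypi.org/pypi/{project}/json"


def fetch_project(project: str) -> dict:
    """Download and decode the JSON description of one project."""
    with urlopen(project_url(project)) as response:
        return json.load(response)


def sdists_by_version(payload: dict) -> Dict[str, List[str]]:
    """Group source distribution URLs by version.

    Versions without any sdist file are left out.
    """
    result: Dict[str, List[str]] = {}
    for version, files in payload.get("releases", {}).items():
        urls = [f["url"] for f in files if f.get("packagetype") == "sdist"]
        if urls:
            result[version] = urls
    return result
```

Calling `sdists_by_version(fetch_project("requests"))` would then give the loader one list of archives per version. Listing the projects themselves still needs one of the other APIs.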

We have no guarantee that the internal object ids are monotonic: concurrent transactions can make object_ids of objects go backwards.

Added the Kafka-related products to the licensing page. Pretty much everything in the ecosystem is Apache 2.0-licensed.
