Ingest source code archives available from the Python Package Index (PyPI)
In T4512#90697, @olasd wrote: Diverging from the layout of the original tarball may make efforts to keep the metadata needed to efficiently rebuild original tarballs (via disarchive) harder.
But clearly it's not great. I wonder if we could do something about this in swh-web instead.
I agree that the UX of switching branches from one release to another on snapshots of PyPI origins is not good.
I don't think it would be appropriate to remove that directory; we try to reproduce tarballs faithfully. And there might be other entries at the root (e.g. when loading a .jar, there would typically be only two directories at the root).
Fix has been deployed to production, closing this.
Apparently we decided not to archive them, so it's better to filter those files out as proposed in T3575.
So they're metadata specific to files that we don't archive at all because they're not source? That doesn't sound very useful to keep at all. We don't keep the binary indexes from Debian repositories, for instance.
They are metadata on the file itself (file name, checksums, has signature, upload time, file-specific comment (often empty), yank status), so they have nothing in common
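To make the list of per-file fields above concrete, here is a minimal sketch that pulls them out of one file entry shaped like those in the PyPI JSON API. The key names (`filename`, `digests`, `has_sig`, `upload_time_iso_8601`, `comment_text`, `yanked`) are my recollection of that API and should be checked against a live response; `FileMetadata`, `file_metadata_from_entry` and the sample values are hypothetical and are not taken from the loader.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileMetadata:
    """Per-file metadata, as opposed to metadata about the package itself."""

    filename: str
    sha256: Optional[str]
    has_signature: bool
    upload_time: Optional[str]
    comment: str
    yanked: bool


def file_metadata_from_entry(entry: dict) -> FileMetadata:
    """Extract the file-level fields from one file entry (assumed key names)."""
    return FileMetadata(
        filename=entry["filename"],
        sha256=entry.get("digests", {}).get("sha256"),
        has_signature=bool(entry.get("has_sig", False)),
        upload_time=entry.get("upload_time_iso_8601"),
        comment=entry.get("comment_text") or "",  # often empty or missing
        yanked=bool(entry.get("yanked", False)),
    )


# Made-up entry for illustration only; not real PyPI data.
SAMPLE_ENTRY = {
    "filename": "example-1.0.tar.gz",
    "digests": {"sha256": "0" * 64},
    "has_sig": False,
    "upload_time_iso_8601": "2020-01-01T00:00:00.000000Z",
    "comment_text": "",
    "yanked": False,
}

SAMPLE_METADATA = file_metadata_from_entry(SAMPLE_ENTRY)
```

Nothing in this record describes the contents of the package; every field is about the uploaded file itself.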
In practice, are there many meaningful differences between the wheel metadata and the sdist metadata? If not, then I think option 3 would be the most sensible.
Roh, I did not want to close it...
I just wanted to update the diff... oh, well!
And deployed.
Then I think these snapshots do look as expected, and the surrounding code should be adapted :)
Looks related to source packages without the presence of the PKG-INFO file, see debug output of the loader below:
I'd expect branches to have a null target if that release only has binary distributions, but that's not the case for configpy. This needs to be investigated further.
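As a rough way to check that expectation against the index, the sketch below lists the releases that ship no source distribution, i.e. the ones whose branches one would expect to have a null target. It assumes a mapping from version to file entries like the `releases` object of the PyPI JSON API, with `packagetype` set to `"sdist"` for source archives; `releases_without_sdist` and the sample data are hypothetical and do not come from the loader.

```python
from typing import Dict, List


def releases_without_sdist(releases: Dict[str, List[dict]]) -> List[str]:
    """Return the versions that have no file with packagetype == "sdist"."""
    return sorted(
        version
        for version, files in releases.items()
        if not any(f.get("packagetype") == "sdist" for f in files)
    )


# Made-up data for illustration only; not the real configpy releases.
SAMPLE_RELEASES = {
    "1.0": [{"filename": "pkg-1.0.tar.gz", "packagetype": "sdist"}],
    "1.1": [{"filename": "pkg-1.1-py3-none-any.whl", "packagetype": "bdist_wheel"}],
    "2.0": [],
}

BINARY_ONLY = releases_without_sdist(SAMPLE_RELEASES)
```

Comparing such a list with the null-target branches of the snapshot would show whether configpy really is an exception.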