docs/package-loader-tutorial.rst
[... 366 lines not shown ...]
it already downloaded, using :term:`extids <extid>`.
The rough idea is to find some way to uniquely identify packages before downloading
them and encode it in a short string, the ExtID.

Using checksums
+++++++++++++++

Ideally, this short string is a checksum of the archive, provided by the API
before downloading the archive itself.
This is ideal, because this ensures that we detect changes in the package's content
even if it keeps the same name and version number.
However, this is only usable when all fields used to generate release objects
(message, authors, ...) are extracted from the archive.

.. important::

   If release objects are generated from extrinsic fields (i.e. not extracted from
   the archive, such as authorship information added by the package repository),
   two different package versions with the same tarball would end up with the
   same release number, causing the loader to create incorrect snapshots.

If this is not the case for the repository you want to load from, skip to the
next subsection.
This is used for example by the PyPI loader (with a sha256sum) and the NPM loader
(with a sha1sum).
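
As a rough illustration of the checksum approach, the sketch below turns a digest published by an API into an ExtID and checks it against ExtIDs seen earlier. The names ``EXTID_TYPE``, ``extid_from_api_metadata``, ``needs_download`` and the ``"sha256"`` metadata field are assumptions made for this example, not part of any actual loader.

```python
from typing import Dict, Set, Tuple

# Hypothetical ExtID type name, chosen only for this sketch.
EXTID_TYPE = "example-archive-sha256"


def extid_from_api_metadata(metadata: Dict[str, str]) -> Tuple[str, bytes]:
    """Build an (extid_type, extid) pair from a checksum published by the API.

    ``metadata["sha256"]`` is an assumed field holding a hexadecimal digest.
    """
    return (EXTID_TYPE, bytes.fromhex(metadata["sha256"]))


def needs_download(
    metadata: Dict[str, str], known_extids: Set[Tuple[str, bytes]]
) -> bool:
    """Return True when the archive's ExtID has not been seen before."""
    return extid_from_api_metadata(metadata) not in known_extids
```

Because the ExtID is derived from the archive's checksum, a replaced archive yields a new ExtID even when its name and version number are unchanged.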
The Debian loader uses a similar scheme: as a single package is assembled from
a set of tarballs, it only uses the hash of the ``.dsc`` file, which itself contains
[... 54 lines not shown ...]
will read this template, substitute the variables based on the object's attributes,
compute the hash of the result, and return it.
Note that, as mentioned before, this is not perfect because a tarball may be replaced
with a different tarball of exactly the same length and modification time,
and we won't detect it.
But this is extremely unlikely, so we consider it to be good enough.

.. important::

   The manifest must cover all fields used to generate Release objects.

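
The template-substitute-hash steps described above can be sketched as follows. This is a minimal standalone example, not the loader's actual code: ``ExamplePackageInfo``, its fields, ``EXTID_TYPE``, ``MANIFEST_FORMAT`` and ``manifest_extid`` are hypothetical names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import string
from dataclasses import asdict, dataclass
from typing import Tuple

# Hypothetical ExtID type name and manifest template for this sketch.
EXTID_TYPE = "example-manifest-sha256"
MANIFEST_FORMAT = string.Template("$name $version $length $last_modified")


@dataclass(frozen=True)
class ExamplePackageInfo:
    """Metadata known about a package before downloading it (assumed fields)."""

    name: str
    version: str
    length: int
    last_modified: str


def manifest_extid(info: ExamplePackageInfo) -> Tuple[str, bytes]:
    """Substitute the object's attributes into the template and hash the result."""
    manifest = MANIFEST_FORMAT.substitute(
        {key: str(value) for key, value in asdict(info).items()}
    )
    return (EXTID_TYPE, hashlib.sha256(manifest.encode()).digest())
```

Two objects with identical attributes produce the same ExtID, while a change in any field covered by the template (for example ``length``) produces a different one. Fields left out of the template are invisible to this check.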
Alternatively, if this is not good enough for your loader, you can simply not implement
ExtIDs, and your loader will always load all tarballs.
This can be bandwidth-heavy for both |swh| and the origin you are loading from,
so this decision should not be taken lightly.

Choosing the ExtID type
[... 243 lines not shown ...]
ardumont: we should replace "archive" with "tarball" or something. I keep getting confused when reading the doc.