Remove description from Release message, add raw extrinsic metadata
Oct 6 2022
Make use of checksums
Oct 5 2022
In D8454#224659, @anlambert wrote: LGTM, thanks!
rebase
Switch back artifacts and crates_metadata to list
Fix documentation and remove finalize cleanup and related test now that we use tempdir
Remove useless code
Merge get_db_dump and parse_db_dump into get_and_parse_db_dump
In D8454#224502, @franckbret wrote: extra_loader_arguments "artifacts" and "crates_metadata" are now lists + some code improvements
extra_loader_arguments "artifacts" and "crates_metadata" are now lists + some code improvements
shorter code for author fallback + use packaging.version parse
Remove release description from message
Oct 4 2022
Make use of checksums after D8595 landed
Add some tests to check that extract_intrinsic_metadata works as expected
Sep 30 2022
rebase
Rebase
Manage empty description and simplify .cabal parsing
shorter code and empty description handling
In D8379#223393, @anlambert wrote: @franckbret, you updated the wrong diff (conda instead of hackage)
Restore previous state after I ran arc diff against the wrong differential number
artifacts are now list
artifacts are now list
Trust p_info.version instead of intrinsic_metadata["version"]
Sep 29 2022
Add more fixture and tests
In D8379#223065, @anlambert wrote: The problem here is that the cabal file isn't formatted as the naive parser expects. We expect each line to look like {k}: {v}\n, but in this case it is {k}: \n \t {v} \n, so we end up with an empty value.
We can leave the cabal parser in its current state for now; we will see how many packages have a cabal file with an unexpected format when testing the loader on staging.
I'm currently writing more fixtures and tests for each case and will update this patch soon.
I guess the other errors are related to tricky cases like this one, and until we have a real cabal parser that handles all cases we won't be able to avoid every exception.
Or the info is simply missing from the cabal file; using intrinsic_metadata.get("<field_name>") and handling None values should help fix those issues.
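The two failure modes discussed above (a value placed on an indented continuation line, and a field that is missing altogether) can be illustrated with a minimal sketch. This is not the loader's actual parser: parse_cabal_fields is a hypothetical helper, the sample text is made up, and real .cabal files have more syntax than this handles.

```python
from typing import Dict, Optional


def parse_cabal_fields(text: str) -> Dict[str, str]:
    """Hypothetical helper: collect top-level `key: value` fields.

    Indented lines are treated as a continuation of the previous field,
    so `version:\n\t0.1.0` yields "0.1.0" instead of an empty value.
    """
    fields: Dict[str, str] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.rstrip()
        # Skip blank lines and cabal comments (lines starting with "--").
        if not line.strip() or line.lstrip().startswith("--"):
            continue
        if raw[0] in " \t":
            # Indented line: append it to the field being read, if any.
            if current is not None:
                fields[current] = (fields[current] + " " + line.strip()).strip()
            continue
        key, sep, value = line.partition(":")
        if not sep:
            # No colon: a section header such as "library"; ignore its body.
            current = None
            continue
        current = key.strip().lower()
        fields[current] = value.strip()
    return fields


# Made-up sample: the version value sits on a continuation line,
# and there is no author field at all.
sample = "name: example-pkg\nversion:\n\t0.1.0\n"
fields = parse_cabal_fields(sample)
author = fields.get("author")  # None instead of raising KeyError
```

Reading optional fields with .get() leaves the caller to decide on a fallback, rather than failing with KeyError as in the tracebacks reported during review.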
Sorry for posting my last comment twice; I didn't see the first one after submitting it.
Ensure filename ends with {version}.tar.gz before splitting filename to get a package name
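A minimal sketch of such a check, assuming the archive is named {name}-{version}.tar.gz (the hyphen separator is an assumption, and package_name_from_filename is a hypothetical helper, not the loader's code):

```python
from typing import Optional


def package_name_from_filename(filename: str, version: str) -> Optional[str]:
    """Hypothetical helper: return the package name, or None when the
    filename does not end with the expected `-{version}.tar.gz` suffix."""
    suffix = f"-{version}.tar.gz"  # assumed naming convention
    if not filename.endswith(suffix):
        return None
    # Strip the suffix only; hyphens inside the package name are preserved.
    name = filename[: -len(suffix)]
    return name or None
```

Checking the suffix first avoids splitting on a hyphen that belongs to the package name itself, which a plain split on "-" would get wrong for names such as numeric-qq.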
Rebase
rebase
In D8379#221592, @anlambert wrote: That diff requires some changes as the api_info function got renamed (see inline comments).
Also while testing the loader in docker, I got a couple of errors on some packages, see below:

docker-swh-loader-1 | [2022-09-26 13:06:42,922: ERROR/ForkPoolWorker-8] Failed to load branch releases/0.1.0 for https://hackage.haskell.org/package/numeric-qq
docker-swh-loader-1 | Traceback (most recent call last):
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 672, in load
docker-swh-loader-1 |     res = self._load_release(p_info, origin)
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 851, in _load_release
docker-swh-loader-1 |     p_info, uncompressed_path, directory=directory.hash
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/hackage/loader.py", line 171, in build_release
docker-swh-loader-1 |     assert version == p_info.version
docker-swh-loader-1 | AssertionError

docker-swh-loader-1 | [2022-09-26 13:08:03,416: ERROR/ForkPoolWorker-11] Failed to load branch releases/1.0.0.0 for https://hackage.haskell.org/package/haskell2010
docker-swh-loader-1 | Traceback (most recent call last):
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 672, in load
docker-swh-loader-1 |     res = self._load_release(p_info, origin)
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 851, in _load_release
docker-swh-loader-1 |     p_info, uncompressed_path, directory=directory.hash
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/hackage/loader.py", line 172, in build_release
docker-swh-loader-1 |     author = Person.from_fullname(intrinsic_metadata["author"].encode())
docker-swh-loader-1 | KeyError: 'author'

docker-swh-loader-1 | [2022-09-26 13:21:31,790: ERROR/ForkPoolWorker-40] Failed to load branch releases/0.1.0.0 for https://hackage.haskell.org/package/hs-inspector
docker-swh-loader-1 | Traceback (most recent call last):
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 672, in load
docker-swh-loader-1 |     res = self._load_release(p_info, origin)
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 851, in _load_release
docker-swh-loader-1 |     p_info, uncompressed_path, directory=directory.hash
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/hackage/loader.py", line 173, in build_release
docker-swh-loader-1 |     description: str = intrinsic_metadata["synopsis"]
docker-swh-loader-1 | KeyError: 'synopsis'

docker-swh-loader-1 | Traceback (most recent call last):
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 672, in load
docker-swh-loader-1 |     res = self._load_release(p_info, origin)
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 851, in _load_release
docker-swh-loader-1 |     p_info, uncompressed_path, directory=directory.hash
docker-swh-loader-1 |   File "/src/swh-loader-core/swh/loader/package/hackage/loader.py", line 170, in build_release
docker-swh-loader-1 |     version: str = intrinsic_metadata["version"]
docker-swh-loader-1 | KeyError: 'version'
Sep 28 2022
Some improvements after review
Sep 27 2022
rebase
In D8529#222126, @vlorentz wrote: Is https://rubygems.org/versions documented somewhere?
Make use of http_request after D8520, update documentation docker section.
Explain that the lister discovers origins on other forges because NuGet packages are binaries
In D8542#221864, @anlambert wrote: @franckbret, have you considered exploiting the https://fastapi.metacpan.org/v1/release/_search endpoint of the CPAN elasticsearch?
It seems to list all CPAN releases with dates, links to tarballs and checksums. You could build a list of artifacts for each package, as in the crates loader, and pass them as loader arguments.
Thanks for the review.
rebase
rebase
Replace api_info that has been renamed to get_url_body
Sep 26 2022
Sphinx fix
Make use of self.http_request as introduced by D8520
Make use of http_request after D8520
Update docker usage documentation section and remove some useless code
Improvements after review
More complete tests and basic documentation
Make use of generic http_request method after D8520
Make use of http_retry instead of throttling_retry decorator after D8519
Make use of http_retry instead of throttling_retry decorator after D8519
Make use of http_retry instead of throttling_retry decorator after D8519