In D8386#229882, @olasd wrote: In D8386#229677, @KShivendu wrote: I noticed that https://archive.softwareheritage.org/browse/origin/directory/?origin_url=deb://Ubuntu/packages/nginx has duplicate branch names, which is very confusing. In fact, even the default branch is listed twice and I see two check marks. If we use branch names like 0.3.9-15.fc26, won't the same happen with Fedora listers? It doesn't seem to differentiate between the editions. (Or does it?)
This seems like a misfeature in the webapp:
https://archive.softwareheritage.org/api/1/snapshot/158a3f36b0bd3da461fb7458de44cfa2c94e4270/
The snapshot has multiple branches, with the same version suffix, pointing at the same objects (because the exact same version of the package is present in multiple Ubuntu suites).
I'm not 100% sure how we should be fixing that, but that bug shouldn't prevent you from giving the fedora snapshots the "semantically correct" structure.
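To make the duplication described above concrete, here is a minimal sketch that groups a snapshot's branches by the object they point at. It assumes the snapshot's branches are available as a mapping from branch name to a dict with `target_type` and `target` keys; the branch names, targets, and the helper name `group_branches_by_target` are all hypothetical, not taken from the actual snapshot.

```python
from collections import defaultdict


def group_branches_by_target(branches):
    """Return the targets that are referenced by more than one branch.

    `branches` maps a branch name to a dict with "target_type" and
    "target" keys (an assumed shape, used here for illustration only).
    """
    groups = defaultdict(list)
    for name, branch in sorted(branches.items()):
        groups[(branch["target_type"], branch["target"])].append(name)
    # Keep only the targets shared by several branches.
    return {key: names for key, names in groups.items() if len(names) > 1}


# Hypothetical data: the same package version present in two suites.
branches = {
    "releases/suite-a/1.0-1": {"target_type": "revision", "target": "aaaa"},
    "releases/suite-b/1.0-1": {"target_type": "revision", "target": "aaaa"},
    "releases/suite-b/1.1-1": {"target_type": "revision", "target": "bbbb"},
}
duplicates = group_branches_by_target(branches)
```

A UI could use such a grouping to display one entry per distinct target instead of repeating the version suffix.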
Nov 15 2022
Nov 14 2022
Nov 10 2022
@KShivendu, I added some inline comments to improve the loader output.
In T4632#98216, @vlorentz wrote: @anlambert What about non-Fedora RPM repositories? (RHEL, SUSE, Rocky Linux, ...)
@KShivendu thanks for the adaptations!
Actually for fedora, I found a better origin URL pattern: https://src.fedoraproject.org/rpms/{pkg_name}
@franckbret, as explained in my inline comment, we cannot use date filtering on the release index of the CPAN Elasticsearch.
Nov 9 2022
Fixed and deployed.
It seems the fix is to only encode the ? in an origin URL when it is provided as a URL argument.
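As an illustration of that idea, and not the actual webapp change, the sketch below escapes only the `?` character of an origin URL before it is embedded as a URL argument, leaving every other character untouched. The helper name `encode_origin_url_argument` and the example URL are hypothetical.

```python
def encode_origin_url_argument(origin_url: str) -> str:
    """Percent-encode only the "?" characters of an origin URL.

    Hypothetical helper: the rest of the URL is left as-is, so that
    the "?" is not mistaken for the start of a query string once the
    origin URL is embedded in another URL.
    """
    return origin_url.replace("?", "%3F")


encoded = encode_origin_url_argument("https://example.org/repo?name=foo")
```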
I think the issue we hit in production might be related to the Varnish cache.
Hmm, I do not hit the issue locally, which is why the tests did not catch it. This seems to happen only in production.
Actually this is not related to the replica lag; it is a regression introduced by the recent commit rDWAPPS4cc9676a54cc368394c05b7f19c92ea072f8041e.
Ah right, I noticed that behavior when fixing a recent bug in the webapp (D8820); I will fix it ASAP.
@KShivendu I forgot to mention in my review that we should also get the checksums associated with an RPM archive; the loader will then use them to check download integrity.
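A minimal sketch of such an integrity check, using only the standard `hashlib` module. It assumes the checksums are available as a mapping from algorithm name to hex digest; the function name `verify_checksums` and the sample data are hypothetical and do not reflect the loader's actual code.

```python
import hashlib
from typing import Dict


def verify_checksums(data: bytes, expected: Dict[str, str]) -> bool:
    """Return True if every expected checksum matches the downloaded bytes.

    `expected` maps a hashlib algorithm name (e.g. "sha256") to its
    hex digest; this shape is an assumption made for illustration.
    """
    for algo, hexdigest in expected.items():
        if hashlib.new(algo, data).hexdigest() != hexdigest.lower():
            return False
    return True


# Stand-in bytes for a downloaded archive.
archive = b"fake rpm archive content"
checksums = {"sha256": hashlib.sha256(archive).hexdigest()}
is_intact = verify_checksums(archive, checksums)
```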
@KShivendu, after testing the lister in the docker environment, I see room for improvement before we can accept that diff.
After reviewing and hacking on the Fedora lister, I think we should use an origin URL of the form https://packages.fedoraproject.org/pkgs/{src_pkg_name} for a Fedora source package.
Nov 8 2022
New forge webhooks are now deployed to production; see the Request archival section of the Web API endpoints.
All forge webhook receivers are working as expected, time to activate them in production.
SourceForge case: I added webhook settings for these sample repositories of mine:
BitBucket case: I added webhook settings for this sample repository of mine, https://bitbucket.org/anlambert/webhook-test.git, using the following payload URL: https://webapp.staging.swh.network/api/1/origin/save/webhook/bitbucket/.
Gitea case: I added webhook settings for this sample repository of mine, https://try.gitea.io/anlambert/webhook-test.git, using the following payload URL: https://webapp.staging.swh.network/api/1/origin/save/webhook/gitea/.
GitLab case: I added webhook settings for this sample repository of mine, https://gitlab.com/anlambert/test, using the following payload URL: https://webapp.staging.swh.network/api/1/origin/save/webhook/gitlab/.
GitHub case: I added webhook settings for this sample repository of mine, https://github.com/anlambert/webhook-test, using the following payload URL: https://webapp.staging.swh.network/api/1/origin/save/webhook/github/.
The webhooks feature has been deployed and activated on staging; now let's test it before activating it in production too.
Fixed and deployed, closing this.
This is a regression introduced by commit rDLDSVN04566a7f3616fb94063355aae491a4e5ad4a832f; I understand the issue and am working on a fix.
Rebase
Nov 7 2022
I made the desired changes both on hedgedoc and on our WordPress website for all languages.
Rebase
Nov 4 2022
Looks good to me.
Would it not be more generic to validate the listed origin URLs at the scheduler level, more precisely in the record_listed_origins method?
This way, all listers will benefit from URL validation without touching their code.
Proceeding like this, we could even compute the list of rejected URLs in the send_origins method of base lister class (to log a warning for instance).
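The suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions: the validation rule (non-empty scheme and host, no whitespace) and the helper names `is_valid_origin_url` and `split_origin_urls` are hypothetical, and the code does not reproduce the real record_listed_origins or send_origins methods.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlparse


def is_valid_origin_url(url: str) -> bool:
    """Assumed rule: a URL needs a scheme, a host and no whitespace."""
    if not url or any(ch.isspace() for ch in url):
        return False
    parsed = urlparse(url)
    return bool(parsed.scheme) and bool(parsed.netloc)


def split_origin_urls(urls: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split URLs into (accepted, rejected) so rejected ones can be logged."""
    accepted: List[str] = []
    rejected: List[str] = []
    for url in urls:
        (accepted if is_valid_origin_url(url) else rejected).append(url)
    return accepted, rejected


accepted, rejected = split_origin_urls(
    [
        "https://example.org/repo.git",
        "deb://Ubuntu/packages/nginx",
        "not a url",
        "https://",
    ]
)
```

Returning the rejected URLs alongside the accepted ones is what would let a caller log a warning listing them.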
LGTM, I guess we should deploy this as soon as possible, right?
Fix content or directory branch browsing when the branch query parameter
is passed to the directory browsing view and update tests to cover that
case.
Jeez, what a subtle bug!
Nov 3 2022
Use a more Pythonic way to concatenate lists
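The change itself is not shown here, so the following is only a generic illustration of common idiomatic ways to concatenate lists in Python; all variable names and values are hypothetical.

```python
from itertools import chain

first = ["a", "b"]
second = ["c"]
third = ["d", "e"]

# The + operator builds a new list and leaves the operands unchanged.
combined = first + second

# Unpacking works with any iterables, not only lists.
unpacked = [*first, *second]

# chain.from_iterable flattens an arbitrary number of lists.
flattened = list(chain.from_iterable([first, second, third]))
```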
Rebase