
save_origin_webhooks: Add push webhook receivers for popular forges
Closed · Public

Authored by anlambert on Oct 31 2022, 5:14 PM.

Details

Summary

Add new Web API endpoints accepting only POST requests coming from push
webhooks of the following popular forges or their instances:

  • Bitbucket
  • Gitea
  • GitHub
  • GitLab
  • SourceForge

There is one API endpoint per forge type. Each one makes it possible to
request or update the archival of a repository through the Save Code Now
service when new commits are pushed to it. Each endpoint simply processes
the webhook JSON payload sent by a forge to extract the repository URL and
the visit type, then creates a new save request for the repository.
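
The payload processing described above can be sketched as follows. This is a minimal illustration, not the code from this diff: the function name `extract_save_request_params` is hypothetical, the `repository.html_url` field is assumed from GitHub-style push payloads, and the `"git"` visit type is an assumption for a git forge.

```python
import json
from typing import Tuple


def extract_save_request_params(raw_body: bytes) -> Tuple[str, str]:
    """Return (repository URL, visit type) from a push webhook body.

    Hypothetical helper: the field names below are assumed from
    GitHub-style push payloads and may differ for other forges.
    """
    payload = json.loads(raw_body)
    # Assumption: the repository URL lives under repository.html_url.
    repo_url = (payload.get("repository") or {}).get("html_url", "")
    if not repo_url:
        raise ValueError(
            "Repository URL could not be extracted from webhook payload"
        )
    # Assumption: a git forge always maps to the "git" visit type.
    return repo_url, "git"
```

The returned pair is what a save request would need; actually creating the request through Save Code Now is left out of the sketch.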

Related to T4548

Diff Detail

Repository
rDWAPPS Web applications
Branch
save-origin-webhooks
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 32662
Build 51169: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 51168: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D8798 (id=31708)

Rebasing onto 024da72220...

Current branch diff-target is up to date.
Changes applied before test
commit e5b792a0cbe038593b75c72017ccbffed373cc95
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Oct 26 14:44:50 2022 +0200

    save_origin_webhooks: Add push webhook receivers for popular forges
    

Link to build: https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/2097/
See console output for more information: https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/2097/console

Harbormaster returned this revision to the author for changes because remote builds failed. (Oct 31 2022, 5:28 PM)
Harbormaster failed remote builds in B32657: Diff 31708!

Remove test_app.py until I find a better way to manage API URLs spread across multiple apps

Build has FAILED

Patch application report for D8798 (id=31709)

Rebasing onto 024da72220...

Current branch diff-target is up to date.
Changes applied before test
commit 969970cd30d97b7ba3c3b8edc16fe043dc327e08
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Oct 26 14:44:50 2022 +0200

    save_origin_webhooks: Add push webhook receivers for popular forges
    

Link to build: https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/2098/
See console output for more information: https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/2098/console

Harbormaster returned this revision to the author for changes because remote builds failed. (Oct 31 2022, 7:14 PM)
Harbormaster failed remote builds in B32658: Diff 31709!

Build is green

Patch application report for D8798 (id=31709)

Rebasing onto 024da72220...

Current branch diff-target is up to date.
Changes applied before test
commit 969970cd30d97b7ba3c3b8edc16fe043dc327e08
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Oct 26 14:44:50 2022 +0200

    save_origin_webhooks: Add push webhook receivers for popular forges
    

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/2099/ for more details.

Thanks, I like it.

Before going into the details, what do you think of D8800? It's a refactoring of this diff that uses an abstract OriginSaveWebhookReceiver class, with each forge defining a subclass. This replaces the calls to origin_save_webhook_receiver, which took lots of arguments and decorators. I didn't touch the tests at all.
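
For readers without access to D8800, the class-based approach could look roughly like the sketch below. Only the `OriginSaveWebhookReceiver` name comes from this thread; the method names, the `FORGE_TYPE` attribute, the `GitHubReceiver` subclass and the payload field are assumptions made for illustration, and the Django view wiring is omitted.

```python
import abc
from typing import Any, Dict, Tuple


class OriginSaveWebhookReceiver(abc.ABC):
    """Shared webhook handling; one subclass per forge type."""

    FORGE_TYPE: str  # hypothetical attribute, used in error messages

    @abc.abstractmethod
    def extract_repo_info(self, payload: Dict[str, Any]) -> Tuple[str, str]:
        """Return (repository URL, visit type) for this forge's payload."""

    def handle(self, payload: Dict[str, Any]) -> Dict[str, str]:
        # Common logic lives here once instead of in stacked decorators.
        repo_url, visit_type = self.extract_repo_info(payload)
        if not repo_url:
            raise ValueError(
                f"Repository URL could not be extracted from "
                f"{self.FORGE_TYPE} webhook payload"
            )
        return {"origin_url": repo_url, "visit_type": visit_type}


class GitHubReceiver(OriginSaveWebhookReceiver):
    FORGE_TYPE = "GitHub"

    def extract_repo_info(self, payload: Dict[str, Any]) -> Tuple[str, str]:
        # Assumption: GitHub-style payload with repository.html_url.
        repo = payload.get("repository") or {}
        return repo.get("html_url", ""), "git"
```

With this shape, supporting another forge only requires a new subclass that overrides the extraction method.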

Looks better indeed without that big stack of decorators. I will update that diff with your approach then.

Update: Simplify implementation using classes (thanks to @vlorentz)

Build is green

Patch application report for D8798 (id=31713)

Rebasing onto 024da72220...

Current branch diff-target is up to date.
Changes applied before test
commit de83a15c3c32e83aa380d915ed62d4cd8a874863
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Oct 26 14:44:50 2022 +0200

    save_origin_webhooks: Add push webhook receivers for popular forges
    

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/2102/ for more details.

I just did a proper review now, it still looks good :)

Just a few nitpicks below

swh/web/save_origin_webhooks/generic_receiver.py
41–42

It's best to make link text self-descriptive.

44–45
47–52

Shouldn't it return the same values as https://archive.softwareheritage.org/1/origin/save/doc/ for the sake of consistency?

77–78

Preterit feels more correct than present perfect here. It's also consistent with the other errors below.

anlambert added inline comments.
swh/web/save_origin_webhooks/generic_receiver.py
47–52

I do not think we should return the other fields, as most of them will be null (visit date and status, for instance), so it is quite pointless to include them imho.

anlambert marked an inline comment as done.

Address @vlorentz comments

Build is green

Patch application report for D8798 (id=31715)

Rebasing onto 024da72220...

Current branch diff-target is up to date.
Changes applied before test
commit 93696b9cc7b954e7fd898695166eb9f364bb3a28
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Oct 26 14:44:50 2022 +0200

    save_origin_webhooks: Add push webhook receivers for popular forges
    

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/2103/ for more details.

This revision is now accepted and ready to land. (Nov 3 2022, 9:20 AM)