After taking down an origin, we need it to disappear from search results, even if it is visited again afterwards (at least until an administrator has confirmed that the content may reappear). So we need to implement some sort of sticky blocklist within swh.search.
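To illustrate what "sticky" means here, the following is a minimal in-memory sketch, not the swh.search implementation. The class `StickyBlocklistIndex` and its methods `search_urls` and `unblock` are hypothetical names chosen for this example. The sketch assumes that ordinary updates (such as those produced by a new visit) can never clear the flag, and that only an explicit administrator action lifts it.

```python
from typing import Dict, Iterable, List


class StickyBlocklistIndex:
    """Toy in-memory index illustrating a sticky blocklist.

    Hypothetical sketch for illustration only; not swh.search code.
    """

    def __init__(self) -> None:
        # Maps origin URL -> stored document, which carries a "blocklisted" flag.
        self._docs: Dict[str, dict] = {}

    def origin_update(self, documents: Iterable[dict]) -> None:
        """Merge documents into the index without ever clearing the flag."""
        for doc in documents:
            url = doc["url"]
            stored = self._docs.setdefault(url, {"url": url, "blocklisted": False})
            # Sticky rule (assumption): the flag is OR-ed with its previous
            # value, so a later update from a new visit cannot unset it.
            sticky = stored["blocklisted"] or bool(doc.get("blocklisted", False))
            stored.update(doc)
            stored["blocklisted"] = sticky

    def unblock(self, url: str) -> None:
        """Explicit administrator action: the only way to lift the block."""
        if url in self._docs:
            self._docs[url]["blocklisted"] = False

    def search_urls(self, pattern: str) -> List[str]:
        """Return URLs containing `pattern`, skipping blocklisted origins."""
        return [
            doc["url"]
            for doc in self._docs.values()
            if pattern in doc["url"] and not doc["blocklisted"]
        ]
```

With this model, an origin that was blocklisted stays hidden when a later update arrives without the flag, and only reappears after `unblock` is called.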
Description
Revisions and Commits
rDSEA Archive search
D5465 | rDSEAebee5d1ba6b3 Add basic support for an origin blocklist
Status | Assigned | Task
---|---|---
Migrated | gitlab-migration | T3087 Implement support for takedown notices (infra, admin tools, workflow)
Migrated | gitlab-migration | T1099 support origin and SWHID blocklist for archive search and browse
Migrated | gitlab-migration | T3224 Implement blocklist support in swh.search
Event Timeline
This has now been deployed and tested in staging with a canary origin (github.com/olasd/Pythagore). Time to deploy in production.
And this is now available in production.
To blocklist an origin:
from swh.search import get_search
s = get_search(cls='remote', url='http://search0.internal.staging.swh.network:5010')
s.origin_update([{"url": "https://github.com/olasd/Pythagore", "blocklisted": True}])
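When several origins need to be blocklisted at once, the call above could be wrapped in a small helper. This is only a sketch: `blocklist_documents` and `blocklist_origins` are hypothetical names, not part of swh.search, and the helper assumes nothing about the search object beyond the `origin_update` method and document shape shown in the snippet above.

```python
from typing import Iterable, List


def blocklist_documents(urls: Iterable[str]) -> List[dict]:
    """Build one update document per origin URL, each with blocklisted=True."""
    return [{"url": url, "blocklisted": True} for url in urls]


def blocklist_origins(search, urls: Iterable[str]) -> int:
    """Blocklist the given origins through `search.origin_update`.

    `search` is any object exposing `origin_update(list_of_dicts)`, for
    example the object returned by `get_search(...)` above.
    Returns the number of documents sent.
    """
    docs = blocklist_documents(urls)
    if docs:
        # Send a single batched update rather than one call per origin.
        search.origin_update(docs)
    return len(docs)
```

For example, `blocklist_origins(s, ["https://github.com/olasd/Pythagore"])` would send the same document as the one-off call above.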