Now that the Web API graph endpoint is available in production, we are able to query the Graph service and resolve origin sha1s to their associated URL.
But currently, the web application in production is sending requests to belvedere (as somerset is experiencing disk usage issues).
While somerset has an index on the origin table for fast origin lookup by sha1:
15:11 $ psql service=swh-replica psql (13.0 (Debian 13.0-1.pgdg100+1)) SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off) Type "help" for help. softwareheritage=> \d origin Table "public.origin" Column | Type | Collation | Nullable | Default --------+--------+-----------+----------+------------------------------------ id | bigint | | not null | nextval('origin_id_seq'::regclass) url | text | | not null | Indexes: "origin_pkey" PRIMARY KEY, btree (id) "origin_digest_idx" btree (digest(url, 'sha1'::text)) "origin_url_idx" btree (url) "origin_url_idx1" gin (url gin_trgm_ops)
it is missing on belvedere:
12:27 $ psql service=swh psql (13.0 (Debian 13.0-1.pgdg100+1), server 12.4 (Debian 12.4-1.pgdg100+1)) SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off) Type "help" for help. softwareheritage=> \d origin Table "public.origin" Column | Type | Collation | Nullable | Default --------+--------+-----------+----------+------------------------------------ id | bigint | | not null | nextval('origin_id_seq'::regclass) url | text | | not null | Indexes: "origin_pkey" PRIMARY KEY, btree (id) "origin_url_idx" gin (url gin_trgm_ops) "origin_url_idx1" hash (url) "origin_url_idx2" UNIQUE, btree (url) Referenced by: TABLE "origin_metadata" CONSTRAINT "origin_metadata_origin_fkey" FOREIGN KEY (origin_id) REFERENCES origin(id) TABLE "origin_visit" CONSTRAINT "origin_visit_origin_fkey" FOREIGN KEY (origin) REFERENCES origin(id) TABLE "skipped_content" CONSTRAINT "skipped_content_origin_fkey" FOREIGN KEY (origin) REFERENCES origin(id) Publications: "softwareheritage"