lister gitlab: Ingest inria forge instance
Closed, Resolved (Public)

Description

As per title

Event Timeline

ardumont triaged this task as Normal priority. Oct 5 2018, 10:31 AM
ardumont created this task.

This should be good now.

data.csv:

swh-lister-gitlab-full;recurring;[{"instance": "inria", "api_baseurl": "https://gitlab.inria.fr/api/v4"}]
swh-lister-gitlab-incremental;recurring;[{"instance": "inria", "api_baseurl": "https://gitlab.inria.fr/api/v4"}]

Scheduling those tasks:

cat data.csv | python3 -m swh.scheduler.cli task schedule -c type -c policy -c args --delimiter ';' -
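
For reference, the same two rows could be generated for any GitLab instance rather than written by hand. This is only a minimal sketch: the helper names `gitlab_lister_rows` and `parse_row` are hypothetical and not part of swh.scheduler, and it assumes the args column is plain JSON that contains no ';' character.

```python
import json

# Task types taken from the data.csv rows above.
TASK_TYPES = ("swh-lister-gitlab-full", "swh-lister-gitlab-incremental")


def gitlab_lister_rows(instance, api_baseurl, policy="recurring"):
    """Return 'type;policy;args' lines in the same shape as data.csv.

    Hypothetical helper for illustration; not part of swh.scheduler.
    """
    args = json.dumps([{"instance": instance, "api_baseurl": api_baseurl}])
    return [";".join((task_type, policy, args)) for task_type in TASK_TYPES]


def parse_row(line):
    """Split one row back into (type, policy, decoded args).

    Assumes the JSON args never contain the ';' delimiter.
    """
    task_type, policy, args = line.split(";", 2)
    return task_type, policy, json.loads(args)


if __name__ == "__main__":
    for row in gitlab_lister_rows("inria", "https://gitlab.inria.fr/api/v4"):
        print(row)
```

Redirecting the printed output to data.csv should give a file that can be piped into the scheduling command above unchanged.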

The first full listing is done.

Ingestion done [1]

11:17:47 softwareheritage@db:5432=> select count(*) from origin where type='git' and url like 'https://gitlab.inria.fr%';
┌───────┐
│ count │
├───────┤
│  1006 │
└───────┘
(1 row)

Time: 16222.339 ms (00:16.222)
11:18:21 softwareheritage@db:5432=> select count(*) from origin o inner join origin_visit ov on o.id=ov.origin where type='git' and url like 'https://gitlab.inria.fr%';
┌───────┐
│ count │
├───────┤
│  1006 │
└───────┘
(1 row)

Time: 7215.516 ms (00:07.216)  ; this count was 0 on Friday (at deployment time)

[1] As with the first run for the main gitlab.com instance, I raised these tasks' priority to 'high' to make this happen.