Software Heritage

System administration
Active · Public

Details

Description

general system administration tasks, not specific to any product

Recent Activity

Today

vlorentz moved T3450: 404 error when visiting a successfully archived repository from code-review/monitoring to done on the System administration board.
Mon, Aug 2, 10:38 AM · Storage manager, System administration
vlorentz closed T3450: 404 error when visiting a successfully archived repository as Resolved.

This should be resolved now (actually, since the 31st at 16:00).

Mon, Aug 2, 10:37 AM · Storage manager, System administration
vlorentz added a comment to T3444: 26/07/2021: Unstuck infrastructure outage then post-mortem.

My understanding of the logs is that three OSDs (osd.3, osd.11, and osd.14, IIRC) suddenly all got stuck "waiting for subops". Unfortunately, although this is easy to debug while the issue is happening (with dump_ops_in_flight), I didn't think of it at the time, given the hurry to fix the issue.
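
As a rough illustration of the kind of check dump_ops_in_flight allows during an incident, here is a minimal Python sketch. It assumes the dump has already been collected as JSON (for instance from the OSD admin socket) and that it contains an `ops` list whose entries carry `age` and `description` fields; that layout, the `stuck_ops` helper, and the sample data are all illustrative assumptions, not output captured during this outage.

```python
import json


def stuck_ops(dump, needle="waiting for subops", min_age=30.0):
    """Return (age, description) pairs for in-flight ops that mention `needle`.

    `dump` is the parsed JSON of an in-flight ops dump. The exact field
    layout is an assumption; the whole op is searched as text so the
    helper does not depend on where the state string is stored.
    """
    hits = []
    for op in dump.get("ops", []):
        age = float(op.get("age", 0.0))
        if needle in json.dumps(op) and age >= min_age:
            hits.append((age, op.get("description", "")))
    # Oldest (most stuck) operations first.
    return sorted(hits, reverse=True)


# Made-up sample data, shaped like the assumed dump layout.
sample = json.loads("""
{
  "ops": [
    {"description": "osd_op(client.1 write)", "age": 95.2,
     "type_data": {"flag_point": "waiting for subops from 11,14"}},
    {"description": "osd_op(client.2 read)", "age": 0.4,
     "type_data": {"flag_point": "started"}}
  ],
  "num_ops": 2
}
""")

for age, description in stuck_ops(sample):
    print(f"{age:8.1f}s  {description}")
```

Running the same filter against each suspect OSD would show whether they are all blocked on the same peers.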

Mon, Aug 2, 10:27 AM · System administration
douardda added a comment to T3444: 26/07/2021: Unstuck infrastructure outage then post-mortem.

FTR I've tried to investigate a bit to find clues about the origin of the outage, but I did not find any obvious culprit.

Mon, Aug 2, 10:06 AM · System administration

Fri, Jul 30

ardumont added a comment to T3338: Load the archived bitbucket mercurial repositories.
13:40:06 softwareheritage@belvedere:5432=> select now(), count(distinct url) from origin o inner join origin_visit ov on o.id=ov.origin where o.url like 'https://bitbucket.org/%' and ov.type='hg';
+-------------------------------+--------+
|              now              | count  |
+-------------------------------+--------+
| 2021-07-30 11:39:37.122152+00 | 253848 |
+-------------------------------+--------+
(1 row)
Fri, Jul 30, 2:49 PM · System administration, Mercurial loader
ardumont claimed T3338: Load the archived bitbucket mercurial repositories.

(claiming i said ;)

Fri, Jul 30, 2:42 PM · System administration, Mercurial loader
ardumont placed T3338: Load the archived bitbucket mercurial repositories up for grabs.

(Claiming the task so I can find it again more easily through my activity view.)

Fri, Jul 30, 1:42 PM · System administration, Mercurial loader
ardumont moved T3338: Load the archived bitbucket mercurial repositories from in-progress to code-review/monitoring on the System administration board.
Fri, Jul 30, 1:33 PM · System administration, Mercurial loader
ardumont added a comment to T3338: Load the archived bitbucket mercurial repositories.

Started in the same tmux session as the sourceforge ingestion.

Fri, Jul 30, 1:33 PM · System administration, Mercurial loader
vlorentz added a project to T3452: Replication lag between the dbs should raise icinga alerts: Monitoring.
Fri, Jul 30, 1:32 PM · Monitoring, System administration
ardumont changed the status of T3338: Load the archived bitbucket mercurial repositories from Open to Work in Progress.
Fri, Jul 30, 1:04 PM · System administration, Mercurial loader
ardumont added a comment to T3338: Load the archived bitbucket mercurial repositories.

It would probably make sense to set up a new worker instance for this to avoid interfering with the regular loading.

Fri, Jul 30, 1:04 PM · System administration, Mercurial loader
ardumont closed T3337: Smoke test ingestion of bitbucket repositories with latest loader mercurial, a subtask of T3338: Load the archived bitbucket mercurial repositories, as Resolved.
Fri, Jul 30, 1:00 PM · System administration, Mercurial loader
ardumont closed T3337: Smoke test ingestion of bitbucket repositories with latest loader mercurial as Resolved.
Fri, Jul 30, 1:00 PM · System administration, Mercurial loader
ardumont added a comment to T3337: Smoke test ingestion of bitbucket repositories with latest loader mercurial.

Smoke test it with local bitbucket repositories

that's next.

Fri, Jul 30, 12:59 PM · System administration, Mercurial loader
ardumont claimed T3337: Smoke test ingestion of bitbucket repositories with latest loader mercurial.
Fri, Jul 30, 12:22 PM · System administration, Mercurial loader
ardumont changed the status of T3337: Smoke test ingestion of bitbucket repositories with latest loader mercurial from Open to Work in Progress.
Fri, Jul 30, 12:16 PM · System administration, Mercurial loader
ardumont changed the status of T3337: Smoke test ingestion of bitbucket repositories with latest loader mercurial, a subtask of T3338: Load the archived bitbucket mercurial repositories, from Open to Work in Progress.
Fri, Jul 30, 12:16 PM · System administration, Mercurial loader
ardumont added a comment to T3337: Smoke test ingestion of bitbucket repositories with latest loader mercurial.

Smoke test it with remote repositories.

Fri, Jul 30, 12:16 PM · System administration, Mercurial loader
ardumont moved T3444: 26/07/2021: Unstuck infrastructure outage then post-mortem from in-progress to code-review/monitoring on the System administration board.
Fri, Jul 30, 11:22 AM · System administration
ardumont moved T3446: Restart scheduling regularly origins with relevant scheduling policies from in-progress to code-review/monitoring on the System administration board.
Fri, Jul 30, 11:22 AM · System administration
ardumont moved T3374: Ingest sourceforge repositories (origins of type git, svn, hg) from in-progress to code-review/monitoring on the System administration board.
Fri, Jul 30, 11:22 AM · System administration, Archive coverage, Origin-SourceForge
ardumont moved T3450: 404 error when visiting a successfully archived repository from in-progress to code-review/monitoring on the System administration board.
Fri, Jul 30, 11:22 AM · Storage manager, System administration
ardumont added a comment to T3450: 404 error when visiting a successfully archived repository.

Thanks for the heads-up, both of you.

Fri, Jul 30, 11:14 AM · Storage manager, System administration
ardumont triaged T3452: Replication lag between the dbs should raise icinga alerts as High priority.
Fri, Jul 30, 11:13 AM · Monitoring, System administration
vlorentz added a project to T3450: 404 error when visiting a successfully archived repository: Storage manager.
Fri, Jul 30, 11:07 AM · Storage manager, System administration
vlorentz added a comment to T3450: 404 error when visiting a successfully archived repository.

@ardumont noticed the replication was blocked, but our automated monitoring didn't alert us. He unblocked the replication, so your code should appear within the next few hours.
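
To illustrate what an alert on blocked replication could look like, here is a minimal Python sketch of the decision logic for a monitoring check. It follows the usual Nagios/Icinga plugin convention of status codes 0 to 3; the `check_replication_lag` function, the thresholds, and the idea of feeding it the timestamp of the last replayed transaction are assumptions for the example, not the check actually deployed here.

```python
from datetime import datetime, timedelta, timezone

# Conventional Nagios/Icinga plugin status codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_replication_lag(last_replay, now,
                          warn=timedelta(minutes=5),
                          crit=timedelta(minutes=30)):
    """Map a replica's last replay timestamp to a (status, message) pair.

    `last_replay` is when the replica last applied a change (None if it
    could not be determined). The thresholds are hypothetical defaults.
    """
    if last_replay is None:
        return UNKNOWN, "UNKNOWN - no replay timestamp available"
    seconds = int((now - last_replay).total_seconds())
    if seconds >= crit.total_seconds():
        return CRITICAL, f"CRITICAL - replication lag {seconds}s"
    if seconds >= warn.total_seconds():
        return WARNING, f"WARNING - replication lag {seconds}s"
    return OK, f"OK - replication lag {seconds}s"


now = datetime(2021, 7, 30, 11, 0, tzinfo=timezone.utc)
status, message = check_replication_lag(now - timedelta(hours=3), now)
print(status, message)
```

A real plugin would print the message and exit with the status code, so that a replica stuck for hours shows up as CRITICAL instead of going unnoticed.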

Fri, Jul 30, 11:07 AM · Storage manager, System administration
ardumont changed the status of T3450: 404 error when visiting a successfully archived repository from Open to Work in Progress.
Fri, Jul 30, 10:40 AM · Storage manager, System administration
ardumont updated the task description for T3374: Ingest sourceforge repositories (origins of type git, svn, hg).
Fri, Jul 30, 10:25 AM · System administration, Archive coverage, Origin-SourceForge
vlorentz triaged T3451: Convert the refresh-savecodenow-statuses cron to a systemd timer as Low priority.
Fri, Jul 30, 10:25 AM · Web app, System administration
vlorentz triaged T3450: 404 error when visiting a successfully archived repository as High priority.

This may simply be lag between the loader's and the frontend's databases.

Fri, Jul 30, 10:12 AM · Storage manager, System administration
ardumont added a comment to T3446: Restart scheduling regularly origins with relevant scheduling policies.

Heads-up: this is running slightly differently now.

Fri, Jul 30, 9:59 AM · System administration

Thu, Jul 29

ardumont renamed T3337: Smoke test ingestion of bitbucket repositories with latest loader mercurial from Deploy swh.loader.mercurial 1.0 in production to Smoke test ingestion of bitbucket repositories with latest loader mercurial.
Thu, Jul 29, 6:36 PM · System administration, Mercurial loader
ardumont added a comment to T3336: Deploy swh.loader.mercurial 2.1 in staging.

For the record and for readability: this must be the v2.1 git-patched version (not a release per se).
A more recent release, tagged v2.1.0 [1], has since been built; it includes the work
completing what @olasd started on versioned extids.

Thu, Jul 29, 6:33 PM · System administration, Mercurial loader
ardumont added a subtask for T3338: Load the archived bitbucket mercurial repositories: T3418: Decide a consistent policy on having multiple archived objects for the same extid.
Thu, Jul 29, 6:30 PM · System administration, Mercurial loader
ardumont moved T3338: Load the archived bitbucket mercurial repositories from Backlog to Weekly backlog on the System administration board.

Latest mercurial loader v2.1 deployed [1] [2]
We should be able to continue with this now.

Thu, Jul 29, 6:29 PM · System administration, Mercurial loader
ardumont moved T3448: production: Deploy swh.loader.mercurial v2.1.0 from deployed/landed to done on the System administration board.
Thu, Jul 29, 6:21 PM · System administration, Storage manager, Mercurial loader
ardumont moved T3448: production: Deploy swh.loader.mercurial v2.1.0 from in-progress to deployed/landed on the System administration board.
Thu, Jul 29, 6:21 PM · System administration, Storage manager, Mercurial loader
ardumont closed T3448: production: Deploy swh.loader.mercurial v2.1.0 as Resolved.

It's working, but the check does not go green [1]. As far as I can tell, the check treats
the unsuccessful event [1] as a failure.

Thu, Jul 29, 6:20 PM · System administration, Storage manager, Mercurial loader
ardumont closed T3443: Deploy lister v1.5.0 as Resolved.
Thu, Jul 29, 5:48 PM · System administration, Lister
ardumont moved T3443: Deploy lister v1.5.0 from in-progress to deployed/landed on the System administration board.
Thu, Jul 29, 5:48 PM · System administration, Lister
ardumont added a comment to T3443: Deploy lister v1.5.0.

Still ongoing.

17:31:19 softwareheritage-scheduler@belvedere:5432=> select now(), visit_type, count(*) from listed_origins lo inner join listers l on l.id=lo.lister_id where l.name='gitlab' and l.instance_name='gitlab' group by visit_type order by count(*) desc;
+-------------------------------+------------+---------+
|              now              | visit_type |  count  |
+-------------------------------+------------+---------+
| 2021-07-29 15:46:55.979469+00 | git        | 1747880 |
+-------------------------------+------------+---------+
(1 row)
Thu, Jul 29, 5:47 PM · System administration, Lister
ardumont moved T3374: Ingest sourceforge repositories (origins of type git, svn, hg) from Backlog to in-progress on the System administration board.
Thu, Jul 29, 5:45 PM · System administration, Archive coverage, Origin-SourceForge
ardumont added a project to T3374: Ingest sourceforge repositories (origins of type git, svn, hg): System administration.
Thu, Jul 29, 5:45 PM · System administration, Archive coverage, Origin-SourceForge
ardumont changed the status of T3448: production: Deploy swh.loader.mercurial v2.1.0 from Open to Work in Progress.
Thu, Jul 29, 5:40 PM · System administration, Storage manager, Mercurial loader
ardumont added a comment to T3448: production: Deploy swh.loader.mercurial v2.1.0.

At the end of it all, though, the final end-to-end production check for mercurial origins should go green.

Thu, Jul 29, 5:40 PM · System administration, Storage manager, Mercurial loader
ardumont updated the task description for T3448: production: Deploy swh.loader.mercurial v2.1.0.
Thu, Jul 29, 5:32 PM · System administration, Storage manager, Mercurial loader
ardumont moved T3448: production: Deploy swh.loader.mercurial v2.1.0 from Backlog to Weekly backlog on the System administration board.
Thu, Jul 29, 1:24 PM · System administration, Storage manager, Mercurial loader
ardumont moved T3394: cassandra - origin url hashing encoding issue from Backlog to done on the System administration board.
Thu, Jul 29, 1:24 PM · System administration, Storage manager
ardumont moved T3397: Improve the documentation deployment to support multiple sphinx instances from Backlog to done on the System administration board.
Thu, Jul 29, 1:24 PM · System administration, Documentation