
staging: Deploy metadata loader
Closed, Migrated, Edits Locked

Description

Plan to be defined further:

  • Prepare repository with the correct flag (has-debian-branches, ...) [1]
  • Prepare CI [2] (already done)
  • Prepare the git repository hosting server (tate) so the Debian package is built from a tag [3]
  • rDLDMD7350b5bdf6943f30cda72339307757fbc517b7a6: Prepare Debian package
  • Debian package builds OK? Yes [4]
  • Create a new Sentry issue for the loader
  • Prepare the Puppet manifest for the new service swh-worker@loader_metadata
  • D7687: Adapt swh-worker@loader_git manifest to optionally install the dependency on loader metadata
  • Deploy and restart swh-worker@loader_git

[1] https://docs.softwareheritage.org/devel/tutorials/add-new-package.html

[2] https://jenkins.softwareheritage.org/job/debian/job/packages/job/DLDMD/

[3]

root@tate:/srv/phabricator# phabricator-setup-hook /srv/phabricator/repos/258 post-receive-swh-modules
Hook post-receive-swh-modules successfully installed on /srv/phabricator/repos/258:
lrwxrwxrwx 1 phabricator phabricator 39 Apr 26 12:57 post-receive -> ../../../hooks/post-receive-swh-modules
root@tate:/srv/phabricator# ls -lah repos/258/hooks/post-receive
lrwxrwxrwx 1 phabricator phabricator 39 Apr 26 12:57 repos/258/hooks/post-receive -> ../../../hooks/post-receive-swh-modules
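
The `ls -lah` check above can also be scripted. The following is only a minimal sketch: `check_hook` is a hypothetical helper name, and it assumes the `repos/<id>/hooks/post-receive` layout shown in the transcript.

```shell
#!/usr/bin/env bash
# Sketch: verify that a repository's post-receive hook is a symlink
# pointing at the expected hook script. check_hook is a hypothetical helper.

check_hook() {
    local repo_dir="$1" expected="$2"
    local hook="$repo_dir/hooks/post-receive"
    local target

    # The hook must exist and be a symbolic link.
    if [ ! -L "$hook" ]; then
        echo "missing or not a symlink: $hook"
        return 1
    fi

    # Compare only the final path component of the link target.
    target="$(readlink "$hook")"
    if [ "$(basename "$target")" != "$expected" ]; then
        echo "unexpected target: $hook -> $target"
        return 1
    fi

    echo "ok: $hook -> $target"
}
```

On tate this would be called as `check_hook /srv/phabricator/repos/258 post-receive-swh-modules`.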

[4] https://jenkins.softwareheritage.org/job/debian/job/packages/job/DLDMD/job/gbp-buildpackage/

Event Timeline

ardumont triaged this task as Normal priority. Apr 22 2022, 3:50 PM
ardumont created this task.
ardumont updated the task description.
ardumont renamed this task from "staging: Deploy metadata fetcher" to "staging: Deploy metadata loader". Apr 22 2022, 3:53 PM
ardumont updated the task description.
ardumont changed the task status from Open to Work in Progress. Apr 26 2022, 3:32 PM
ardumont updated the task description.
ardumont updated the task description.
ardumont moved this task from Weekly backlog to in-progress on the System administration board.
ardumont updated the task description.

Status update: the git loader queue is quite full, so I triggered the loading of GitHub origins directly through the CLI [1].
After that, GitHub metadata does show up in the raw_extrinsic_metadata table [2].

[1]

swhworker@worker3:~$ url=https://github.com/progval/irctest ; swh loader run git $url lister_name=github lister_instance_name="github"
WARNING:swh.core.cli:Could not load subcommand graph: DistributionNotFound(Requirement.parse('py4j'), None)
INFO:swh.loader.git.loader.GitLoader:Load origin 'https://github.com/progval/irctest' with type 'git'
WARNING:swh.lister.github.utils:No tokens set in configuration, using anonymous mode
Enumerating objects: 6351, done.
Counting objects: 100% (1686/1686), done.
Compressing objects: 100% (294/294), done.
Total 6351 (delta 1456), reused 1572 (delta 1391), pack-reused 4665
INFO:swh.loader.git.loader:Listed 172 refs for repo https://github.com/progval/irctest
/usr/lib/python3/dist-packages/swh/storage/api/client.py:45: DeprecationWarning: Call to deprecated method post. (Use _post instead) -- Deprecated since version 2.1.0.
  return self.post("content/add", {"content": content})
{'status': 'eventful'} for origin 'https://github.com/progval/irctest'
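
To repeat the manual run above for more than one origin, a small wrapper could look like the following. This is only a sketch: `load_origin`, the `DRY_RUN` variable and the `origins` array are hypothetical names, and the loader arguments are copied as-is from the command above. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: build the same `swh loader run git` invocation for a list of origins.
# DRY_RUN (hypothetical) defaults to 1, so commands are printed, not executed.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

load_origin() {
    local url="$1"
    # Same arguments as the manual run on worker3.
    local cmd=(swh loader run git "$url" lister_name=github lister_instance_name=github)

    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# Placeholder list: only the origin from the run above; extend as needed.
origins=(
    https://github.com/progval/irctest
)

for url in "${origins[@]}"; do
    load_origin "$url"
done
```

Running it with `DRY_RUN=0` on a worker would execute the loader for each listed origin.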

[2]

16:08:22 swh@db1:5432=> select target, format from raw_extrinsic_metadata where target > 'swh:1:ori:' and format='application/vnd.github.v3+json' limit 10;
+----------------------------------------------------+--------------------------------+
|                       target                       |             format             |
+----------------------------------------------------+--------------------------------+
| swh:1:ori:fbc59d47d4cc806cfdae45afc0a77a9a3dd482a6 | application/vnd.github.v3+json |
| swh:1:ori:8d179abca2faf8d58ed7a16c7b619e873b278285 | application/vnd.github.v3+json |
| swh:1:ori:a0016c9d9fc19f7dcb6a76f42bd66c76de3c12eb | application/vnd.github.v3+json |
| swh:1:ori:a0016c9d9fc19f7dcb6a76f42bd66c76de3c12eb | application/vnd.github.v3+json |
+----------------------------------------------------+--------------------------------+
(4 rows)

Time: 113.204 ms
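
To check a single origin rather than any matching row, the query can be narrowed to that origin's SWHID. The sketch below assumes that the origin identifier is the SHA-1 of the origin URL (as computed in swh.model) and that `sha1sum` is available; `origin_swhid` and `metadata_query` are hypothetical helper names. It only prints the query and does not connect to the database.

```shell
#!/usr/bin/env bash
# Sketch: derive the origin SWHID for a URL and print a query targeting it.
# Assumption: the origin identifier is the SHA-1 of the URL bytes.

origin_swhid() {
    local url="$1" digest
    # printf avoids the trailing newline that echo would add to the hashed input.
    digest="$(printf '%s' "$url" | sha1sum | cut -d' ' -f1)"
    printf 'swh:1:ori:%s\n' "$digest"
}

metadata_query() {
    local swhid
    swhid="$(origin_swhid "$1")"
    # Table, columns and format value are taken from the query above.
    printf "select target, format from raw_extrinsic_metadata where target = '%s' and format = 'application/vnd.github.v3+json';\n" "$swhid"
}

metadata_query https://github.com/progval/irctest
```

The printed statement can then be pasted into the same psql session. An exact match on `target` also avoids relying on the string comparison `target > 'swh:1:ori:'`, which, if the column is plain text, would match any value sorting after that prefix.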
ardumont claimed this task.
ardumont moved this task from deployed/landed/monitoring to done on the System administration board.