Nov 24 2020
The backfilling is done for several object types and is still in progress for revision, content, and directory:
root@journal0:/opt/kafka/bin# for topic in $(./kafka-topics.sh --bootstrap-server $SERVER --list); do echo -n "$topic : "; ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list $SERVER --topic $topic | awk -F: '{s+=$3}END{print s}'; done
__consumer_offsets : 0
swh.journal.objects.content : 927440
swh.journal.objects.directory : 213279
swh.journal.objects.metadata_authority : 0
swh.journal.objects.metadata_fetcher : 0
swh.journal.objects.origin : 62892
swh.journal.objects.origin_visit : 68368
swh.journal.objects.origin_visit_status : 136721
swh.journal.objects.raw_extrinsic_metadata : 0
swh.journal.objects.release : 3101
swh.journal.objects.revision : 155746
swh.journal.objects.skipped_content : 189
swh.journal.objects.snapshot : 36046
swh.journal.objects_privileged.release : 0
swh.journal.objects_privileged.revision : 0
I have some doubts about how to import the following object types, and whether they need to be imported at all:
- swh.journal.objects.metadata_authority
- swh.journal.objects.metadata_fetcher
- swh.journal.objects.raw_extrinsic_metadata
- swh.journal.objects_privileged.release
- swh.journal.objects_privileged.revision
The topics were created with 64 partitions and a replication factor of 1.
- the VM memory was increased from 12G to 20G (a rough, rule-of-thumb value)
- a new 500GB data disk is attached to the VM (there are currently 300G of objects on storage1.staging)
- Kafka's logdir was configured to be stored on a ZFS volume backed only by the new data disk:
root@journal0:~# apt install zfs-dkms ## reboot
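For reference, a minimal sketch of what the ZFS side of this setup could look like (the device name /dev/vdb, the pool name and the mountpoint are assumptions, not the values actually used):

# single-disk pool on the new 500GB data disk (no redundancy on staging)
zpool create data /dev/vdb
# dedicated dataset mounted where kafka expects its logdir
zfs create -o mountpoint=/srv/kafka/logdir data/kafka-logs
# then point kafka at it in server.properties: log.dirs=/srv/kafka/logdir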
Kafka is up and running on journal0.
The next steps are:
- tune the server, as there is not a lot of disk space (and memory, but only if needed):
root@journal0:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            5.9G     0  5.9G   0% /dev
tmpfs           1.2G  560K  1.2G   1% /run
/dev/vda1        32G  7.2G   24G  24% /
tmpfs           5.9G  8.0K  5.9G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           4.0M     0  4.0M   0% /sys/fs/cgroup
tmpfs           244M     0  244M   0% /run/user/1025
root@journal0:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           11Gi       6.5Gi       354Mi        11Mi       4.8Gi       4.9Gi
Swap:            0B          0B          0B
- Create the topics as explained in T2520#48682 (with a smaller number of partitions and a replication factor of 1, as we only have one staging server); see the sketch after this list
- Launch the backfill to populate kafka with the current content of the staging archive
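A possible shape for the topic-creation step, as a sketch only (the bootstrap server variable and topic list are assumptions; the 64-partition count is the one mentioned above, the authoritative list is in T2520#48682):

# assumed topic list, to be checked against T2520#48682
for topic in content directory origin origin_visit origin_visit_status release revision skipped_content snapshot; do
  ./kafka-topics.sh --create --bootstrap-server $SERVER \
    --topic swh.journal.objects.$topic \
    --partitions 64 --replication-factor 1
done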
Clarification: the tests were only done on the internal plaintext listener, without SASL authentication.
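For context, the internal/external split in kafka's server.properties looks roughly like this (a sketch; the listener names, ports and SASL mechanism are assumptions, and only the plaintext internal side was exercised here):

listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9094
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:SASL_PLAINTEXT
inter.broker.listener.name=INTERNAL
sasl.enabled.mechanisms=SCRAM-SHA-256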
Rename the zookeeper cluster to match the kafka cluster name
The configuration seems to work pretty well in vagrant:
root@journal0:/opt/kafka/bin# ./kafka-topics.sh --bootstrap-server 10.168.130.70:9092 --list | xargs -n1 -i{} ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list 10.168.130.70:9092 --topic {} | grep -v consumer
swh.journal.objects.content:0:898
swh.journal.objects.directory:0:1172
swh.journal.objects.origin:0:26
swh.journal.objects.origin_visit:0:55
swh.journal.objects.origin_visit_status:0:2
swh.journal.objects.release:0:0
swh.journal.objects.revision:0:209
swh.journal.objects.snapshot:0:1
I let the worker create the topics, so there is only one partition per topic; this will need to be adapted for the staging environment.
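Adapting this for staging would mean either creating the topics up front (as sketched above) or raising the partition count afterwards, roughly like this (sketch only; the topic name and target count are examples):

./kafka-topics.sh --alter --bootstrap-server $SERVER \
  --topic swh.journal.objects.content --partitions 64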
Fix wrong indentation on the storage service configuration
Nov 23 2020
Move the sentry dsn declarations with their friends
closed by rSPSITEed253c86d4f6
rebase
Avoid declaring the parent directories explicitly by using mkdir -p
Nov 20 2020
All the loaders were restarted on all the workers:
sudo clush -b -w @swh-workers 'apt-get update && apt-get -y upgrade -V'
sudo clush -b -w @swh-workers 'puppet agent --enable && puppet agent --test'
sudo clush -b -w @swh-workers 'systemctl default'
Automatic tasks restarted on worker01, the logs are under watch.
After upgrading the packages on worker01, the npm load was successful:
swhworker@worker01:~$ time SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_npm.yml swh loader run npm https://www.npmjs.com/package/bootstrap-vue
{'status': 'eventful', 'snapshot_id': '30d32aff7fab1a2c364dc5c61503b0aec3f9fb11'}
The problem is not reproduced in staging, even though the worker and the storage have the same package versions:
vsellier@worker0 ~ % apt list --upgradable
Listing... Done
python3-swh.deposit.client/unknown 0.6.0-1~swh1~bpo10+1 all [upgradable from: 0.5.0-1~swh1~bpo10+1]
python3-swh.deposit.loader/unknown 0.6.0-1~swh1~bpo10+1 all [upgradable from: 0.5.0-1~swh1~bpo10+1]
python3-swh.deposit/unknown 0.6.0-1~swh1~bpo10+1 all [upgradable from: 0.5.0-1~swh1~bpo10+1]
python3-swh.indexer.storage/unknown 0.5.0-2~swh1~bpo10+1 all [upgradable from: 0.4.2-1~swh1~bpo10+1]
python3-swh.indexer/unknown 0.5.0-2~swh1~bpo10+1 all [upgradable from: 0.4.2-1~swh1~bpo10+1]
python3-swh.journal/unknown 0.5.1-1~swh1~bpo10+1 all [upgradable from: 0.5.0-1~swh1~bpo10+1]
python3-swh.loader.git/unknown 0.5.0-1~swh1~bpo10+1 all [upgradable from: 0.4.1-1~swh1~bpo10+1]
python3-swh.model/unknown 0.9.0-1~swh1~bpo10+1 all [upgradable from: 0.7.3-1~swh1~bpo10+1]
python3-swh.storage/unknown 0.17.2-1~swh1~bpo10+1 all [upgradable from: 0.17.0-1~swh1~bpo10+1]
python3-swh.vault/unknown 0.3.3-1~swh1~bpo10+1 all [upgradable from: 0.3.1-1~swh1~bpo10+1]
vsellier@storage1 ~ % apt list --upgradable
Listing... Done
libpq5/buster-pgdg 13.1-1.pgdg100+1 amd64 [upgradable from: 13.0-1.pgdg100+1]
postgresql-13/buster-pgdg 13.1-1.pgdg100+1 amd64 [upgradable from: 13.0-1.pgdg100+1]
postgresql-client-13/buster-pgdg 13.1-1.pgdg100+1 amd64 [upgradable from: 13.0-1.pgdg100+1]
postgresql-client-common/buster-pgdg 223.pgdg100+1 all [upgradable from: 220.pgdg100+1]
postgresql-client/buster-pgdg 13+223.pgdg100+1 all [upgradable from: 13+220.pgdg100+1]
postgresql-common/buster-pgdg 223.pgdg100+1 all [upgradable from: 220.pgdg100+1]
postgresql/buster-pgdg 13+223.pgdg100+1 all [upgradable from: 13+220.pgdg100+1]
python3-swh.indexer.storage/unknown 0.5.0-2~swh1~bpo10+1 all [upgradable from: 0.4.2-1~swh1~bpo10+1]
python3-swh.indexer/unknown 0.5.0-2~swh1~bpo10+1 all [upgradable from: 0.4.2-1~swh1~bpo10+1]
python3-swh.journal/unknown 0.5.1-1~swh1~bpo10+1 all [upgradable from: 0.5.0-1~swh1~bpo10+1]
python3-swh.model/unknown 0.9.0-1~swh1~bpo10+1 all [upgradable from: 0.7.3-1~swh1~bpo10+1]
python3-swh.storage/unknown 0.17.2-1~swh1~bpo10+1 all [upgradable from: 0.17.0-1~swh1~bpo10+1]
- after upgrading storage1.staging, the exact same problem is also present
- After upgrading the worker, everything goes well.
- puppet applied on worker01
- task-by-task tests:
- mercurial:
swhworker@worker01:~$ SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_mercurial.yml swh loader run mercurial https://foss.heptapod.net/fluiddyn/fluidfft
INFO:swh.loader.mercurial.Bundle20Loader:Load origin 'https://foss.heptapod.net/fluiddyn/fluidfft' with type 'hg'
{'status': 'eventful'}
swhworker@worker01:~$ SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_mercurial.yml swh loader run mercurial https://hg.mozilla.org/projects/nss
INFO:swh.loader.mercurial.Bundle20Loader:Load origin 'https://hg.mozilla.org/projects/nss' with type 'hg'
WARNING:swh.loader.mercurial.Bundle20Loader:No matching revision for tag NSS_3_15_5_BETA2 (hg changeset: e5d3ec1d9a35f7cac554543d52775092de9f6a01). Skipping
WARNING:swh.loader.mercurial.Bundle20Loader:No matching revision for tag NSS_3_15_5_BETA2 (hg changeset: 0000000000000000000000000000000000000000). Skipping
WARNING:swh.loader.mercurial.Bundle20Loader:No matching revision for tag NSS_3_18_RTM (hg changeset: 0000000000000000000000000000000000000000). Skipping
WARNING:swh.loader.mercurial.Bundle20Loader:No matching revision for tag NSS_3_18_RTM (hg changeset: 0000000000000000000000000000000000000000). Skipping
WARNING:swh.loader.mercurial.Bundle20Loader:No matching revision for tag NSS_3_24_BETA3 (hg changeset: 0000000000000000000000000000000000000000). Skipping
{'status': 'eventful'}
- svn
root@worker01:~# SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_svn.yml swh loader run svn svn://svn.appwork.org/utils
INFO:swh.loader.svn.SvnLoader:Load origin 'svn://svn.appwork.org/utils' with type 'svn'
INFO:swh.loader.svn.SvnLoader:Processing revisions [3428-3436] for {'swh-origin': 'svn://svn.appwork.org/utils', 'remote_url': 'svn://svn.appwork.org/utils', 'local_url': b'/tmp/swh.loader.svn.dojsubkd-890577/utils', 'uuid': b'21714237-3853-44ef-a1f0-ef8f03a7d1fe'}
{'status': 'eventful'}
- npm:
KO (failed): https://sentry.softwareheritage.org/share/issue/363ef9d218ac4817a992b7dc9bf283a6/
root@worker01:~# SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_npm.yml swh loader run npm https://www.npmjs.com/package/bootstrap-vue
WARNING:swh.storage.retry:Retry adding a batch
WARNING:swh.storage.retry:Retry adding a batch
WARNING:swh.storage.retry:Retry adding a batch
ERROR:swh.loader.package.loader:Failed loading branch releases/2.18.0 for https://www.npmjs.com/package/bootstrap-vue
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/tenacity/__init__.py", line 333, in call
    result = fn(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/swh/storage/retry.py", line 117, in raw_extrinsic_metadata_add
    return self.storage.raw_extrinsic_metadata_add(metadata)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 181, in meth_
    return self.post(meth._endpoint_path, post_data)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 278, in post
    return self._decode_response(response)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 352, in _decode_response
    self.raise_for_status(response)
  File "/usr/lib/python3/dist-packages/swh/storage/api/client.py", line 29, in raise_for_status
    super().raise_for_status(response)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 342, in raise_for_status
    raise exception from None
swh.core.api.RemoteException: <RemoteException 500 TypeError: ["__init__() got an unexpected keyword argument 'id'"]>
root@worker01:~# SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_npm.yml swh loader run npm https://www.npmjs.com/package/vue
ERROR:swh.loader.package.loader:Failed loading branch releases/0.8.6 for https://www.npmjs.com/package/vue
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 424, in load res = self._load_revision(p_info, origin) File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 539, in _load_revision dl_artifacts = self.download_package(p_info, tmpdir) File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 277, in download_package return [download(p_info.url, dest=tmpdir, filename=p_info.filename)] File "/usr/lib/python3/dist-packages/swh/loader/package/utils.py", line 80, in download raise ValueError("Fail to query '%s'. Reason: %s" % (url, response.status_code))
ValueError: Fail to query 'https://registry.npmjs.org/vue/-/vue-0.8.6.tgz'. Reason: 404
After applying the D4359 change on saam, the load is OK:
root@worker01:/etc/softwareheritage# sudo -u swhworker SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_git.yml swh loader run git https://github.com/SoftwareHeritage/puppet-swh-site
INFO:swh.loader.git.BulkLoader:Load origin 'https://github.com/SoftwareHeritage/puppet-swh-site' with type 'git'
Enumerating objects: 537, done.
Counting objects: 100% (537/537), done.
Compressing objects: 100% (326/326), done.
Total 19066 (delta 260), reused 445 (delta 194), pack-reused 18529
INFO:swh.loader.git.BulkLoader:Listed 3 refs for repo https://github.com/SoftwareHeritage/puppet-swh-site
{'status': 'eventful'}
update the commit message
- The configuration was applied on moma
- a manual import was performed on worker01:
- the /etc/softwareheritage/loader_git.yml config was updated:
root@worker01:/etc/softwareheritage# diff -U3 /tmp/loader_git.yml loader_git.yml
--- /tmp/loader_git.yml	2020-11-20 08:43:18.682462213 +0000
+++ loader_git.yml	2020-11-20 08:44:00.150375756 +0000
@@ -13,7 +13,7 @@
   - cls: filter
   - cls: remote
     args:
-      url: http://uffizi.internal.softwareheritage.org:5002/
+      url: http://saam.internal.softwareheritage.org:5002/
       max_content_size: 104857600
       save_data: false
       save_data_path: "/srv/storage/space/data/sharded_packfiles"
- the import was run on the puppet-swh-site repository:
root@worker01:/etc/softwareheritage# sudo -u swhworker SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_git.yml swh loader run git https://github.com/SoftwareHeritage/puppet-swh-site
The first try returned this exception:
swh.core.api.RemoteException: <RemoteException 500 ValueError: ["Storage class azure-prefixed is not available: No module named 'swh.objstorage.backends.azure'"]>
rebase
rebase
In D4534#113059, @olasd wrote: We use the puppet java module in a bunch of other places, maybe it makes sense to directly import that (which would mean using include ::java)?
Use ::java instead of directly installing the jre package
Nov 19 2020
I will not land this now; it seems there is another issue with the startup of kafka when the logdir already exists but is empty (i.e. created by puppet). I need to dig further.
The systemd configuration looks good.
Looks fine to me except for the vagrant bit.
thanks, it's fixed
fix misdirected removal
up -d waits until the containers are "started" before returning control, so you are sure the execs from line 169 onwards can be executed.
You would also miss the return code of the docker-compose up command.
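A small illustration of the point (the service and command names are placeholders):

# `up -d` returns once the containers are started and keeps a usable exit code
docker-compose up -d || exit 1
# so the execs that follow (line 169 onwards in the script) can run right away
docker-compose exec some-service some-command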
Nov 18 2020
Nov 17 2020
Correction: kafka is installed on the node, but it seems the configuration is not complete.