Paste Active Pastes
  • refs/heads/master 69394173
    refs/pull/@@NUM@@/head 58297691
    HEAD 51612028
    releases/@@NUM@@.@@NUM@@.@@NUM@@ 11783124
    refs/changes/@@NUM@@/@@NUM@@/@@NUM@@ 6991689
    ...
    • May 13 2020, 7:52 PM
    • 3,613 Lines
  • tmpdir_factory = TempdirFactory(_tmppath_factory=TempPathFactory(_given_basetemp=None, _trace=<pluggy._tracing.TagTracerSub object at 0x7f3d11bfc710>, _basetemp=PosixPath('/tmp/pytest-of-tony/pytest-1')))
    @pytest.fixture(scope="session")
    def elasticsearch_session(tmpdir_factory):
    tmpdir = tmpdir_factory.mktemp("elasticsearch")
    ...
    • May 13 2020, 11:21 AM
    • 175 Lines
  • {
    "extrinsic": {
    "provider": "https://deposit.softwareheritage.org/1/private/602/meta/",
    "raw": {
    "origin": {
    ...
    • May 12 2020, 4:25 PM
    • 79 Lines
  • Staged
    modified swh/search/tests/test_cli.py
    @@ -9,6 +9,7 @@ import yaml
    import pytest
    ...
    • May 12 2020, 2:01 PM
    • 39 Lines
  • ```
    $ curl -i -u user:pass https://deposit.softwareheritage.org/1/servicedocument/
    HTTP/1.1 503 Service Unavailable
    Date: Mon, 11 May 2020 14:03:49 GMT
    Server: gunicorn/19.9.0
    ...
    • May 11 2020, 4:05 PM
    • 25 Lines
  • {
    "extrinsic": {
    "provider": "https://deposit.softwareheritage.org/1/private/596/meta/",
    "raw": {
    "origin": {
    ...
    • May 7 2020, 11:26 AM
    • 79 Lines
  • -- Crawling history of software origin visits by Software Heritage. Each
    -- visit sees its history change through new origin visit status updates
    create table origin_visit_status
    (
    origin bigint not null,
    ...
    • May 6 2020, 4:19 PM
    • 516 Lines
  • swh scheduler task list
    WARNING:swh.core.cli:Could not load subcommand search: cannot import name 'get_journal_client' from 'swh.journal.cli' (/home/ddouard/src/swh-environment/swh-journal/swh/journal/cli.py)
    INFO:swh.core.config:Loading config file /etc/softwareheritage/global.ini
    Found 3 tasks
    ...
    • May 5 2020, 12:17 PM
    • 15 Lines
  • softwareheritage-deposit=> select * from deposit_client ;
    user_ptr_id | collections | provider_url | domain
    -------------+-------------+------------------------------------------------+---------------------------------
    2 | {1} | https://hal.archives-ouvertes.fr/ | archives-ouvertes.fr
    3 | {2} | https://www.softwareheritage.org | softwareheritage.org
    ...
    • May 4 2020, 4:31 PM
    • 10 Lines
  • swh-loader_1 | Installing collected packages: swh.deposit
    swh-loader_1 | Attempting uninstall: swh.deposit
    swh-loader_1 | Found existing installation: swh.deposit 0.0.82
    swh-loader_1 | Uninstalling swh.deposit-0.0.82:
    swh-loader_1 | Successfully uninstalled swh.deposit-0.0.82
    ...
    • May 4 2020, 11:54 AM
    • 16 Lines
  • docker-compose up swh-loader
    docker_swh-scheduler-db_1 is up-to-date
    docker_swh-deposit-db_1 is up-to-date
    docker_swh-storage-db_1 is up-to-date
    Recreating docker_swh-objstorage_1 ...
    ...
    • May 4 2020, 11:20 AM
    • 54 Lines
  • docker-compose up swh-loader
    docker_swh-storage-db_1 is up-to-date
    docker_swh-objstorage_1 is up-to-date
    docker_amqp_1 is up-to-date
    docker_zookeeper_1 is up-to-date
    ...
    • May 4 2020, 11:01 AM
    • 55 Lines
  • dockeruser@desktop5 /home/dev/swh-environment/docker (master) $ docker-compose up swh-loader
    docker_amqp_1 is up-to-date
    docker_zookeeper_1 is up-to-date
    docker_swh-storage-db_1 is up-to-date
    docker_swh-objstorage_1 is up-to-date
    ...
    • Apr 30 2020, 1:06 PM
    • 62 Lines
  • version: '2'
    services:
    swh-loader:
    volumes:
    ...
    • Apr 30 2020, 1:06 PM
    • 8 Lines
  • 12:25 <+olasd> ardumont: you should even do this migration the other way around. 1/ create the new table, empty; 2/ deploy new storage that fills both tables in parallel; 3/ run the migration process for origin visits that don't have any state information
    12:26 <+olasd> 4/ switch over the read queries to the new tables; 5/ drop fields from the old table
    12:26 <+olasd> (#showerthoughts)
    12:27 <+olasd> that way, the only downtime needed is during #2 which should be minimal; it should even be doable on the fly as the RPC api is not changing at all
    12:28 <+ardumont> for 2. i need to check the code, i think we stop the writing in origin visit to only write to origin visit status
    ...
    • Apr 30 2020, 12:02 PM
    • 64 Lines
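The migration order proposed in the chat excerpt above (create the new table empty, write to both tables, backfill, then switch reads) can be sketched roughly as follows. This is a minimal illustration using an in-memory SQLite database, not the actual swh-storage code: the column layout and every function name here are assumptions made for the example, and only the table names `origin_visit` and `origin_visit_status` come from the pastes.

```python
import sqlite3


def create_schema(db):
    # Old table with the visit state stored inline (hypothetical columns).
    db.execute(
        "create table origin_visit ("
        "origin integer, visit integer, status text, "
        "primary key (origin, visit))"
    )
    # Step 1: the new table is created empty.
    db.execute(
        "create table origin_visit_status ("
        "origin integer, visit integer, status text)"
    )


def add_visit_dual_write(db, origin, visit, status):
    # Step 2: the newly deployed storage fills both tables in parallel.
    db.execute("insert into origin_visit values (?, ?, ?)", (origin, visit, status))
    db.execute(
        "insert into origin_visit_status values (?, ?, ?)", (origin, visit, status)
    )


def backfill(db):
    # Step 3: migrate only the visits that have no status row yet.
    db.execute(
        "insert into origin_visit_status (origin, visit, status) "
        "select ov.origin, ov.visit, ov.status from origin_visit ov "
        "where not exists ("
        "  select 1 from origin_visit_status ovs "
        "  where ovs.origin = ov.origin and ovs.visit = ov.visit)"
    )


def read_status(db, origin, visit):
    # Step 4: read queries are switched over to the new table.
    row = db.execute(
        "select status from origin_visit_status where origin = ? and visit = ?",
        (origin, visit),
    ).fetchone()
    return row[0] if row else None


def run_demo():
    db = sqlite3.connect(":memory:")
    create_schema(db)
    # A visit recorded before the dual-write deployment: old table only.
    db.execute("insert into origin_visit values (1, 1, 'full')")
    add_visit_dual_write(db, 1, 2, "partial")
    backfill(db)
    total = db.execute("select count(*) from origin_visit_status").fetchone()[0]
    return total, read_status(db, 1, 1), read_status(db, 1, 2)
```

Step 5 (dropping the state fields from the old table) is left out of the sketch; it would only happen once all reads go through the new table.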
  • ERROR:root:relation "origin_visit_status" does not exist
    LINE 4: INNER JOIN origin_visit_status ovs
    ^
    Traceback (most recent call last):
    File "/home/antoine/.virtualenvs/swh/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    ...
    • Apr 29 2020, 12:17 PM
    • 35 Lines
  • (swh) david@pavo:~/swh/swh-environment/swh-storage$ /usr/bin/python3 -m tox -e py3 -- -v -m 'not cassandra' -k test_rpc_serve_bwcompat -s
    GLOB sdist-make: /home/david/swh/swh-environment/swh-storage/setup.py
    py3 inst-nodeps: /home/david/swh/swh-environment/swh-storage/.tox/.tmp/package/1/swh.storage-0.0.187.post9.zip
    py3 installed: aiohttp==3.6.2,aiohttp-utils==3.1.1,apipkg==1.5,arrow==0.15.5,async-timeout==3.0.1,attrs==19.3.0,attrs-strict==0.0.6,blinker==1.4,cassandra-driver==3.23.0,certifi==2020.4.5.1,chardet==3.0.4,click==7.1.1,confluent-kafka==1.4.1,coverage==5.1,decorator==4.4.2,Deprecated==1.2.9,execnet==1.7.1,Flask==1.1.2,geomet==0.1.2,gunicorn==20.0.4,hypothesis==5.10.4,idna==2.9,importlib-metadata==1.6.0,iso8601==0.1.12,itsdangerous==1.1.0,Jinja2==2.11.2,MarkupSafe==1.1.1,mirakuru==2.2.0,more-itertools==8.2.0,msgpack==1.0.0,multidict==4.7.5,mypy==0.770,mypy-extensions==0.4.3,packaging==20.3,pkg-resources==0.0.0,pluggy==0.13.1,port-for==0.4,psutil==5.7.0,psycopg2==2.8.5,py==1.8.1,pyparsing==2.4.7,pytest==5.4.1,pytest-cov==2.8.1,pytest-forked==1.1.3,pytest-mock==3.1.0,pytest-postgresql==2.3.0,pytest-xdist==1.31.0,python-dateutil==2.8.1,python-mimeparse==1.6.0,pytz==2019.3,PyYAML==5.3.1,requests==2.23.0,sentry-sdk==0.14.3,six==1.14.0,sortedcontainers==2.1.0,sqlalchemy-stubs==0.3,swh.core==0.0.95,swh.journal==0.0.31,swh.model==0.0.68,swh.objstorage==0.0.42,swh.storage==0.0.187.post9,tenacity==6.1.0,typed-ast==1.4.1,typing-extensions==3.7.4.2,urllib3==1.25.9,vcversioner==2.16.0.0,wcwidth==0.1.9,Werkzeug==1.0.1,wrapt==1.12.1,yarl==1.4.2,zipp==3.1.0
    py3 run-test-pre: PYTHONHASHSEED='2792914223'
    ...
    • Apr 24 2020, 2:38 PM
    • 37 Lines
  • swh-loader_1 | Total 6363 (delta 4101), reused 1867 (delta 1059)
    swh-loader_1 | [2020-04-23 12:04:33,713: INFO/ForkPoolWorker-1] Listed 307 refs for repo https://forge.softwareheritage.org/source/swh-loader-core.git
    swh-loader_1 | [2020-04-23 12:05:06,623: ERROR/ForkPoolWorker-1] Loading failure, updating to `partial` status
    swh-loader_1 | Traceback (most recent call last):
    swh-loader_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/loader/core/loader.py", line 304, in load
    ...
    • Apr 23 2020, 3:03 PM
    • 25 Lines
  • _________________________________________________________________________ ERROR at setup of test_origin_save_metrics __________________________________________________________________________
    file /home/antoine/swh/swh-environment/swh-web/swh/web/tests/misc/test_metrics.py, line 34
    @pytest.mark.django_db
    def test_origin_save_metrics(client):
    file /home/antoine/.virtualenvs/swh/lib/python3.7/site-packages/pytest_flask/plugin.py, line 132
    ...
    • Apr 23 2020, 2:21 PM
    • 12 Lines
  • $ psql service=staging-swh -c "select convert_from(name, 'utf-8') as bname, rev.metadata from snapshot s inner join snapshot_branches sbs on sbs.snapshot_id=s.object_id inner join snapshot_branch sb on sbs.branch_id=sb.object_id inner join revision rev on (rev.id=sb.target and sb.target_type='revision') where s.id='\x757b6422d07b0f50f983d86b0eef192ef248ca0b'" > output.txt
    bname | metadata
    ------+---------
    http://kokkinizita.linuxaudio.org/linuxaudio/downloads/AMB-plugins-0.8.1.tar.bz2 | {"extrinsic": {"raw": {"url": "http://kokkinizita.linuxaudio.org/linuxaudio/downloads/AMB-plugins-0.8.1.tar.bz2"}, "when": "2020-03-18T11:40:20.875888+00:00", "provider": "https://nix-community.github.io/nixpkgs-swh/sources.json"}, "original_artifact": [{"length": 28988, "filename": "AMB-plugins-0.8.1.tar.bz2", "checksums": {"sha1": "fda55d11342d9a59ead64e30e037d92114637c87", "sha256": "f44a60b782948662537c0cb14befa6678d6dce790c64dc2c9058eab849a58b74"}}]}
    ...
    • Apr 22 2020, 10:47 AM
    • 12 Lines
  • diff --git a/swh/web/common/origin_visits.py b/swh/web/common/origin_visits.py
    index 4adaed89..e37eedaa 100644
    --- a/swh/web/common/origin_visits.py
    +++ b/swh/web/common/origin_visits.py
    @@ -125,9 +125,9 @@ def get_origin_visit(
    ...
    • Apr 21 2020, 11:44 AM
    • 16 Lines
    • Diff
  • subprocess.run(
    ("pv {export_path}/*/*.edges.csv.zst | "
    "tee graph.edges.csv.zst |"
    "zstdcat |"
    "tee >( wc -l > graph.edges.count.txt ) |"
    ...
    • Apr 17 2020, 4:34 PM
    • 18 Lines
    • Python
  • GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
    swh-dataset-export-seirl-test-20200323 swh.journal.objects.snapshot 234 647473 658902 11429 - - -
    swh-dataset-export-seirl-test-20200323 swh.journal.objects.release 196 98984 99633 649 - - -
    swh-dataset-export-seirl-test-20200323 swh.journal.objects.directory 229 28637831 29006612 368781 - - -
    ...
    • Apr 15 2020, 3:57 PM
    • 1,282 Lines
  • $ swh graph memcache --graph ./swh-graph --cache /dev/shm/graph-small
    $ ls -l /dev/shm/graph-small
    total 16464
    lrwxrwxrwx 1 antoine antoine 47 Apr 10 11:47 stderr -> /home/antoine/swh/graph/small/compressed/stderr
    lrwxrwxrwx 1 antoine antoine 47 Apr 10 11:47 stdout -> /home/antoine/swh/graph/small/compressed/stdout
    ...
    • Apr 10 2020, 11:49 AM
    • 26 Lines
    • Bash Scripting
  • softwareheritage=> \copy metadata_provider to stdout with (format csv, header);
    id,provider_name,provider_type,provider_url,metadata
    1,"",deposit_client,https://hal.archives-ouvertes.fr/,{}
    2,"",deposit_client,https://www.softwareheritage.org,{}
    3,hal,deposit_client,https://hal.archives-ouvertes.fr/,{}
    ...
    • Apr 9 2020, 1:09 PM
    • 181 Lines
  • diff --cc swh/storage/tests/test_converters.py
    index 72443ec4,f3ee1904..00000000
    --- a/swh/storage/tests/test_converters.py
    +++ b/swh/storage/tests/test_converters.py
    @@@ -144,7 -144,7 +144,11 @@@ def test_db_to_release()
    ...
    • Apr 9 2020, 12:05 PM
    • 48 Lines
  • psql service=staging-swh
    psql (11.7, server 11.6 (Debian 11.6-1.pgdg100+1))
    SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
    Type "help" for help.
    ...
    • Apr 9 2020, 10:10 AM
    • 15 Lines
  • 17:00 $ bin/octocatalog-diff --octocatalog-diff-args --no-truncate-details --to staging moma
    Found host moma.softwareheritage.org
    WARN -> Environment "add-keycloak-realm-and-client" contained non-word characters, correcting name to add_keycloak_realm_and_client
    WARN -> Environment "api-remove-rl-for-m" contained non-word characters, correcting name to api_remove_rl_for_m
    WARN -> Environment "change-swh-web-static-dir" contained non-word characters, correcting name to change_swh_web_static_dir
    ...
    • Apr 8 2020, 5:02 PM
    • 23 Lines
  • diff --git a/swh/storage/storage.py b/swh/storage/storage.py
    index 6b52485..44f3df9 100644
    --- a/swh/storage/storage.py
    +++ b/swh/storage/storage.py
    @@ -894,10 +894,9 @@ class Storage:
    ...
    • Apr 8 2020, 3:11 PM
    • 16 Lines
  • sbuild (Debian sbuild) 0.79.0 (05 February 2020) on mirzakhani.olasd.eu
    +==============================================================================+
    | swh-web 0.0.226-1~swh2 (amd64) Tue, 07 Apr 2020 12:37:33 +0000 |
    +==============================================================================+
    ...
    • Apr 7 2020, 2:39 PM
    • 2,871 Lines
  • Apr 07 03:33:13 worker0 python3[16392]: [2020-04-07 03:33:13,434: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL).')
    Apr 07 03:33:13 worker0 python3[16392]: Traceback (most recent call last):
    Apr 07 03:33:13 worker0 python3[16392]: File "/usr/lib/python3/dist-packages/billiard/pool.py", line 1267, in mark_as_worker_lost
    Apr 07 03:33:13 worker0 python3[16392]: human_status(exitcode)),
    Apr 07 03:33:13 worker0 python3[16392]: billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL).
    ...
    • Apr 7 2020, 9:58 AM
    • 10 Lines
  • dir_to_dir
    old: 48341950415
    new: 49543338345
    +2.48%
    ...
    • Apr 5 2020, 2:12 PM
    • 42 Lines
  • origin_visit
    old: 194970670
    new: 1009322644
    snapshot
    ...
    • Apr 4 2020, 12:20 PM
    • 19 Lines
  • $ cat foo.py
    from tenacity import retry
    @retry
    def foo(n):
    ...
    • Apr 2 2020, 6:59 PM
    • 18 Lines
  • @overload
    def _get_key(self, object_type: str, object_: Union[Revision, Release, Directory, Snapshot]) -> bytes:
    ...
    ...
    • Apr 2 2020, 4:22 PM
    • 37 Lines
    • Python
  • def _get_key(
    self,
    object_type: str,
    object_: BaseModel) -> Union[bytes, Dict]:
    if object_type in ('revision', 'release', 'directory', 'snapshot'):
    ...
    • Apr 2 2020, 4:03 PM
    • 28 Lines
    • Python
  • Info: Loading facts
    Info: Caching catalog for 9fb17b6df4b3.test
    Info: Applying configuration version '1585756962'
    Notice: /Stage[main]/Apt/File[preferences]/ensure: created
    Info: /Stage[main]/Apt/File[preferences]: Scheduling refresh of Class[Apt::Update]
    ...
    • Apr 1 2020, 6:08 PM
    • 279 Lines
  • 14:59 $ doco exec puppet-agent puppet agent -t
    Info: Using configured environment 'production'
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Retrieving locales
    ...
    • Apr 1 2020, 5:06 PM
    • 307 Lines
  • WARNING:swh.core.cli:Could not load subcommand journal: cannot import name 'HashCollision' from 'swh.storage' (/home/danseraf/swh-environment/swh-storage/swh/storage/__init__.py)
    WARNING:swh.core.cli:Could not load subcommand indexer: cannot import name 'HashCollision' from 'swh.storage' (/home/danseraf/swh-environment/swh-storage/swh/storage/__init__.py)
    WARNING:swh.core.cli:Could not load subcommand search: cannot import name 'HashCollision' from 'swh.storage' (/home/danseraf/swh-environment/swh-storage/swh/storage/__init__.py)
    • Mar 26 2020, 2:32 PM
    • 3 Lines
  • (swh) ~/swh-environment/swh-scanner (master) $ make check
    python3 -m flake8 swh
    swh/scanner/model.py:70:47: E226 missing whitespace around arithmetic operator
    swh/scanner/scanner.py:18:9: E126 continuation line over-indented for hanging indent
    swh/scanner/scanner.py:25:9: E123 closing bracket does not match indentation of opening bracket's line
    ...
    • Mar 25 2020, 11:00 AM
    • 14 Lines
  • ```
    id | type | arguments | next_run | current_interval | status | policy | retries_left | priorit$
    ---------+-----------------+------------------------------------------------------------------------------------------------------+-------------------------------+------------------+--------------------+-----------+--------------+--------$
    1186662 | load-functional | {"args": [], "kwargs": {"url": "https://nix-community.github.io/nixpkgs-swh/sources.json"}} | 2020-03-25 02:40:59.571373+00 | 1 day | disabled | recurring | 0 |
    1186665 | load-functional | {"args": [], "kwargs": {"url": "https://nix-community.github.io/nixpkgs-swh/sources-unstable.json"}} | 2020-03-25 08:30:08.666509+00 | 1 day | next_run_scheduled | recurring | 0 |
    ...
    • Mar 25 2020, 9:32 AM
    • 8 Lines
  • { config, lib, pkgs, unstable-pkgs, mypkgs, ... }:
    with lib;
    let xsession-enable = config.my.xsession.enable;
    optional-dependencies = config.my.emacs.optional-dependencies;
    ...
    • Mar 24 2020, 8:02 PM
    • 242 Lines
  • from typing import Callable
    class Foo:
    def __init__(self, f: Callable[[int], None]):
    self.f: Callable[[int], None] = f
    ...
    • Mar 24 2020, 5:15 PM
    • 8 Lines
  • from typing import Callable
    class Foo:
    f: Callable[[int], None]
    ...
    • Mar 24 2020, 5:06 PM
    • 15 Lines
  • from typing import Callable
    class Foo:
    f: Callable
    ...
    • Mar 24 2020, 5:02 PM
    • 7 Lines
  • *** rdkafka_cgrp.c:3037:rd_kafka_cgrp_op_serve: assert: rktp->rktp_assigned ***
    rd_kafka_t 0x2121b30: rdkafka#consumer-2
    producer.msg_cnt 0 (0 bytes)
    rk_rep reply queue: 2259 ops
    brokers:
    ...
    • Mar 23 2020, 9:45 PM
    • 12 Lines
  • tony@yavin4 ~ % my-hm build
    ~/repo/private/home ~
    querying info about '/nix/store/34ckk0vslnar8xj70sa0cgpiyg1r96r5-emacs-with-packages-26.3' on 'https://cache.nixos.org'...
    downloading 'https://cache.nixos.org/34ckk0vslnar8xj70sa0cgpiyg1r96r5.narinfo'...
    querying info about '/nix/store/dmjkw4hlf56x4wg2g5kgw0cj96qjqdp3-emacs-rust-mode-20191208.1654' on 'https://cache.nixos.org'...
    ...
    • Mar 23 2020, 6:28 PM
    • 80 Lines
  • {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "Foo Software",
    "author": [
    ...
    • Mar 21 2020, 12:06 PM
    • 25 Lines
    • JSON
  • {
    "total-collisions-raised-in-sentry": 9677,
    "total-collisions": 3,
    "total-falsy-collisions": 11,
    "detailed-collisions": {
    ...
    • Mar 20 2020, 3:49 PM
    • 489 Lines
  • Do you prefer:
    @attr.s(frozen=True)
    class Person(BaseModel):
    ...
    • Mar 20 2020, 12:18 PM
    • 24 Lines
  • DEBUG:swh.loader.package.loader:Number of skipped contents: 0 [58/4699]
    DEBUG:urllib3.connectionpool:http://storage0.internal.staging.swh.network:5002 "POST /content/skipped/missing HTTP/1.1" 200 1
    DEBUG:swh.loader.package.loader:Number of contents: 261
    DEBUG:urllib3.connectionpool:http://storage0.internal.staging.swh.network:5002 "POST /content/missing HTTP/1.1" 200 1
    DEBUG:swh.loader.package.loader:Number of directories: 22
    ...
    • Mar 20 2020, 11:16 AM
    • 57 Lines
  • Traceback (most recent call first):
    <built-in method acquire of _thread.lock object at remote 0x7f3c084e89b8>
    File "/usr/lib/python3.7/threading.py", line 300, in wait
    gotit = waiter.acquire(True, timeout)
    File "/usr/lib/python3.7/threading.py", line 552, in wait
    ...
    • Mar 18 2020, 4:33 PM
    • 92 Lines
  • #!/usr/bin/env bash
    # to run as swhworker
    export SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_functional.yml
    ...
    • Mar 18 2020, 2:31 PM
    • 10 Lines
  • DEBUG:swh.journal.client:Consumer settings: {'security.protocol': 'SASL_SSL', 'sasl.mechanisms': 'SCRAM-SHA-512', 'sasl.username': 'seirl', 'sasl.password': 'CENSORED', 'debug': 'all', 'bootstrap.servers': 'kafka01.euwest.azure.internal.softwareheritage.org:9094,kafka02.euwest.azure.internal.softwareheritage.org:9094,kafka03.euwest.azure.internal.softwareheritage.org:9094,kafka04.euwest.azure.internal.softwareheritage.org:9094,kafka05.euwest.azure.internal.softwareheritage.org:9094,kafka06.euwest.azure.internal.softwareheritage.org:9094', 'auto.offset.reset': 'earliest', 'group.id': 'swh-dataset-export-seirl-test-51', 'on_commit': <function _on_commit at 0x7f4bc5db85e0>, 'error_cb': <function _error_cb at 0x7f4bc5dfbe50>, 'enable.auto.commit': False, 'logger': <Logger swh.journal.client.rdkafka (DEBUG)>}
    DEBUG:swh.journal.client:Subscribing to: ['swh.journal.objects.origin_visit', 'swh.journal.objects.snapshot', 'swh.journal.objects.release', 'swh.journal.objects.revision', 'swh.journal.objects.directory']
    DEBUG:swh.journal.client.rdkafka:SASL [rdkafka#consumer-1] [thrd:app]: Selected provider SCRAM (builtin) for SASL mechanism SCRAM-SHA-512
    DEBUG:swh.journal.client.rdkafka:OPENSSL [rdkafka#consumer-1] [thrd:app]: librdkafka built with OpenSSL version 0x1000212f
    DEBUG:swh.journal.client.rdkafka:MEMBERID [rdkafka#consumer-1] [thrd:app]: Group "swh-dataset-export-seirl-test-51": updating member id "(not-set)" -> ""
    ...
    • Mar 17 2020, 6:01 PM
    • 564 Lines
  • ```
    visits: Iterable[OriginVisit] = [
    _fix_origin_visit(v) for v in objects
    if _fix_origin_visit(v) is not None
    ]
    ...
    • Mar 17 2020, 10:43 AM
    • 11 Lines
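The comprehension in the paste above calls `_fix_origin_visit` twice per object, once in the filter and once for the value. One possible way to call it only once, sketched here with a hypothetical stand-in for `_fix_origin_visit` (its real behaviour is not shown in the paste), is to convert lazily and filter afterwards. The `call_log` list exists only to make the single call per object observable.

```python
from typing import Iterable, List, Optional

# Records every conversion so the number of calls can be checked.
call_log: List[dict] = []


def _fix_origin_visit(visit: dict) -> Optional[dict]:
    """Hypothetical stand-in: reject visits without an origin, normalize the rest."""
    call_log.append(visit)
    if not visit.get("origin"):
        return None
    return {**visit, "type": visit.get("type", "git")}


def fix_origin_visits(objects: Iterable[dict]) -> List[dict]:
    # Convert each object exactly once, then drop the rejected ones.
    fixed = (_fix_origin_visit(v) for v in objects)
    return [v for v in fixed if v is not None]
```

On Python 3.8 and later an assignment expression inside the comprehension would achieve the same effect; the generator form also works on Python 3.7.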
  • > {}['foo']
    [ 'foo' ]
    > obj = {}
    {}
    > obj['foo']
    ...
    • Mar 12 2020, 5:59 PM
    • 6 Lines
    • Javascript
  • python analyze.py ~/sources.json
    There are 17599 sources in nixpkgs
    Sources are coming from 1529 different hosts
    ...
    • Mar 12 2020, 3:59 PM
    • 78 Lines
  • def attrib_typecheck(default: Any = attr.NOTHING,
    type: Optional[Type] = None,
    validator: Collection[Callable] = ()):
    "A 'partial' of attr.ib that prefills the validator with type_validator"
    return attr.attrib(
    ...
    • Mar 12 2020, 3:33 PM
    • 8 Lines
  • def test_timestamp_seconds():
    attr.validate(Timestamp(seconds=0, microseconds=0))
    with pytest.raises(AttributeTypeError):
    attr.validate(Timestamp(seconds='0', microseconds=0))
    ...
    • Mar 12 2020, 10:52 AM
    • 12 Lines
  • diff --git a/swh/model/tests/test_model.py b/swh/model/tests/test_model.py
    index 8bffa80..3f4de69 100644
    --- a/swh/model/tests/test_model.py
    +++ b/swh/model/tests/test_model.py
    @@ -298,6 +298,7 @@ def test_release_model_id_computation():
    ...
    • Mar 11 2020, 2:11 PM
    • 10 Lines
  • seirl@granet ~/swh-environment (git)-[master] % kafkacat -b kafka01.euwest.azure.softwareheritage.org:9093 -d broker -L :(
    %7|1583865197.850|BRKMAIN|rdkafka#producer-1| [thrd::0/internal]: :0/internal: Enter main broker thread
    %7|1583865197.850|BROKER|rdkafka#producer-1| [thrd:app]: kafka01.euwest.azure.softwareheritage.org:9093/bootstrap: Added new broker with NodeId -1
    %7|1583865197.850|CONNECT|rdkafka#producer-1| [thrd:app]: kafka01.euwest.azure.softwareheritage.org:9093/bootstrap: Selected for cluster connection: bootstrap servers added (broker has 0 connection attempt(s))
    %7|1583865197.850|BRKMAIN|rdkafka#producer-1| [thrd:kafka01.euwest.azure.softwareheritage.org:9093/bootstrap]: kafka01.euwest.azure.softwareheritage.org:9093/bootstrap: Enter main broker thread
    ...
    • Mar 10 2020, 7:33 PM
    • 24 Lines
  • python3 -m pytest .
    Traceback (most recent call last):
    File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
    File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    ...
    • Mar 10 2020, 5:11 PM
    • 59 Lines
  • make test
    python3 -m pytest .
    usage: pytest.py [options] [file_or_dir] [file_or_dir] [...]
    pytest.py: error: unrecognized arguments: --no-start-live-server --live-server-port .
    inifile: /home/zack/dati/projects/sw-heritage/git/swh-environment/swh-scanner/pytest.ini
    ...
    • Mar 10 2020, 5:07 PM
    • 7 Lines
  • $ sudo smartctl -a /dev/nvme0
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-4-amd64] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    === START OF INFORMATION SECTION ===
    ...
    • Mar 9 2020, 9:25 AM
    • 67 Lines
  • #!/bin/bash
    # Copyright (C) 2019 Stefano Zacchiroli <zack@upsilon.cc>
    # License: GNU General Public License (GPL), version 3 or above
    #
    ...
    • Mar 8 2020, 4:00 PM
    • 62 Lines
    • Bash Scripting
  • | Traceback (most recent call last):
    | File "/usr/bin/swh", line 11, in <module>
    | load_entry_point('swh.core==0.0.94', 'console_scripts', 'swh')()
    | File "/usr/lib/python3/dist-packages/swh/core/cli/__init__.py", line 111, in main
    | return swh(auto_envvar_prefix='SWH')
    ...
    • Mar 6 2020, 5:36 PM
    • 59 Lines
  • diff --git a/swh/storage/tests/test_storage.py b/swh/storage/tests/test_storage.py
    index c68f28c..20959fc 100644
    --- a/swh/storage/tests/test_storage.py
    +++ b/swh/storage/tests/test_storage.py
    @@ -1087,6 +1087,12 @@ class TestStorage:
    ...
    • Mar 4 2020, 3:53 PM
    • 17 Lines
    • Diff
  • 2020-03-04T14:13:06.142010302Z swh_graph-replayer-release.1.8shri9bb8q0h@mirror-replay01 | Starting the SWH mirror graph replayer
    2020-03-04T14:13:21.173425515Z swh_graph-replayer-release.1.8shri9bb8q0h@mirror-replay01 | Traceback (most recent call last):
    2020-03-04T14:13:21.173475615Z swh_graph-replayer-release.1.8shri9bb8q0h@mirror-replay01 | File "/usr/bin/swh", line 11, in <module>
    2020-03-04T14:13:21.173483615Z swh_graph-replayer-release.1.8shri9bb8q0h@mirror-replay01 | load_entry_point('swh.core==0.0.94', 'console_scripts', 'swh')()
    2020-03-04T14:13:21.173487415Z swh_graph-replayer-release.1.8shri9bb8q0h@mirror-replay01 | File "/usr/lib/python3/dist-packages/swh/core/cli/__init__.py", line 111, in main
    ...
    • Mar 4 2020, 3:20 PM
    • 48 Lines
  • 2020-03-04T12:55:32.779118726Z swh_graph-replayer-revision.1.ihrszqxqznya@mirror-replay03 | DEBUG:swh.journal.client.rdkafka:CGRPOP [rdkafka#consumer-1] [thrd:main]: Group "test-graph-replayer-b70e8cf435c0" received op PARTITION_LEAVE in state up (join state wait-unassign, v840) for swh.journal.objects.revision [0]
    2020-03-04T12:55:32.779122726Z swh_graph-replayer-revision.1.ihrszqxqznya@mirror-replay03 | DEBUG:swh.journal.client.rdkafka:PARTDEL [rdkafka#consumer-1] [thrd:main]: Group "test-graph-replayer-b70e8cf435c0": delete swh.journal.objects.revision [0]
    2020-03-04T12:55:32.779134326Z swh_graph-replayer-revision.1.ihrszqxqznya@mirror-replay03 | DEBUG:swh.journal.client.rdkafka:CGRPOP [rdkafka#consumer-1] [thrd:main]: Group "test-graph-replayer-b70e8cf435c0" received op REPLY:FETCH_STOP in state up (join state wait-unassign, v840) for swh.journal.objects.revision [0]
    2020-03-04T12:55:32.779138526Z swh_graph-replayer-revision.1.ihrszqxqznya@mirror-replay03 | DEBUG:swh.journal.client.rdkafka:UNASSIGN [rdkafka#consumer-1] [thrd:main]: Unassign not done yet (255 wait_unassign, 255 assigned, 0 wait commit, join state wait-unassign): FETCH_STOP done
    ...
    • Mar 4 2020, 2:04 PM
    • 16 Lines
  • storage:
    cls: remote
    args:
    url: http://localhost:5002/
    ...
    • Mar 3 2020, 5:10 PM
    • 17 Lines
  • -- SWH Indexer DB schema upgrade
    -- from_version: 130
    -- to_version: 131
    -- description:
    ...
    • Mar 2 2020, 11:16 AM
    • 116 Lines
  • Traceback (most recent call last):
    File "/home/zack/.virtualenvs/swh/lib/python3.7/site-packages/aiohttp/connector.py", line 955, in _create_direct_connection
    traces=traces), loop=self._loop)
    File "/home/zack/.virtualenvs/swh/lib/python3.7/site-packages/aiohttp/connector.py", line 825, in _resolve_host
    self._resolver.resolve(host, port, family=self._family)
    ...
    • Feb 28 2020, 4:08 PM
    • 61 Lines
  • with lost_task_runs as (
    select task_run.id
    from task
    inner join task_run on task.id = task_run.task
    where task.policy = 'recurring' and
    ...
    • Feb 27 2020, 10:09 AM
    • 14 Lines
    • SQL
  • import datetime
    import gzip
    import hashlib
    import multiprocessing
    import os
    ...
    • Feb 11 2020, 5:55 PM
    • 87 Lines
    • Python
  • import datetime
    import gc
    import gzip
    import hashlib
    import multiprocessing
    ...
    • Feb 11 2020, 5:54 PM
    • 108 Lines
    • Python
  • diff --git swh/journal/client.py swh/journal/client.py
    index 0d481b0..42e2b96 100644
    --- swh/journal/client.py
    +++ swh/journal/client.py
    @@ -76,7 +76,7 @@ class JournalClient:
    ...
    • Feb 11 2020, 4:44 PM
    • 39 Lines
    • Diff
  • from swh.journal.client import JournalClient
    import logging
    logging.basicConfig(level=logging.INFO)
    logging.info('Running test')
    ...
    • Feb 10 2020, 6:03 PM
    • 33 Lines
    • Python
  • [2020-02-10 17:37:13] INFO:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:GroupCoordinator]: GroupCoordinator/6: Timed out HeartbeatRequest in flight (after 10580ms, timeout #0)
    [2020-02-10 17:37:13] WARNING:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:GroupCoordinator]: GroupCoordinator/6: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
    [2020-02-10 17:37:13] INFO:swh.journal.client:Received non-fatal kafka error: KafkaError{code=_TIMED_OUT,val=-185,str="GroupCoordinator: 1 request(s) timed out: disconnect (after 29019ms in state UP)"}
    [2020-02-10 17:37:14] INFO:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:GroupCoordinator]: GroupCoordinator/6: Timed out HeartbeatRequest in flight (after 10536ms, timeout #0): possibly held back by preceeding OffsetCommitRequest with timeout in 47805ms
    [2020-02-10 17:37:14] WARNING:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:GroupCoordinator]: GroupCoordinator/6: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
    ...
    • Feb 10 2020, 5:39 PM
    • 68 Lines
  • INFO:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:kafka06.euwest.azure.internal.softwareheritage.org:9092/bootstr]: kafka06.euwest.azure.internal.softwareheritage.org:9092/6: Timed out FetchRequest in flight (after 60337ms, timeout #0)
    WARNING:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:kafka06.euwest.azure.internal.softwareheritage.org:9092/bootstr]: kafka06.euwest.azure.internal.softwareheritage.org:9092/6: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
    INFO:swh.journal.client:Received non-fatal kafka error: KafkaError{code=_TIMED_OUT,val=-185,str="kafka06.euwest.azure.internal.softwareheritage.org:9092/6: 1 request(s) timed out: disconnect (after 71045ms in state UP)"}
    INFO:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:kafka06.euwest.azure.internal.softwareheritage.org:9092/bootstr]: kafka06.euwest.azure.internal.softwareheritage.org:9092/6: Timed out FetchRequest in flight (after 60682ms, timeout #0)
    WARNING:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:kafka06.euwest.azure.internal.softwareheritage.org:9092/bootstr]: kafka06.euwest.azure.internal.softwareheritage.org:9092/6: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
    ...
    • Feb 10 2020, 4:55 PM
    • 46 Lines
  • swh/storage/tests/test_retry.py F
    ================================================================================================================================== FAILURES ===================================================================================================================================
    ___________________________________________________________________________________________________________________ test_retrying_proxy_storage_content_add ___________________________________________________________________________________________________________________
    ...
    • Feb 5 2020, 6:40 PM
    • 39 Lines
  • WARNING cassandra.cluster:cql.py:45 Downgrading core protocol version from 66 to 65 for 127.0.0.1:59387. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
    WARNING cassandra.cluster:cql.py:45 Downgrading core protocol version from 65 to 4 for 127.0.0.1:59387. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
    ERROR cassandra.cluster:thread.py:57 Exception refreshing schema in response to schema change:
    Traceback (most recent call last):
      File "cassandra/cluster.py", line 4044, in cassandra.cluster.refresh_schema_and_set_result
    ...
    • Feb 4 2020, 1:16 PM
    • 12 Lines
  • >>> pprint.pprint(s.revision_get([b'Y\xd7\xa5=\xe5\x980\x8eN\x1f\xffy\x19Z\xe8Z{#\xea\x0e']))
    [{'author': {'email': b'me@nanx.me',
                 'fullname': b'Nan Xiao <me@nanx.me>',
                 'name': b'Nan Xiao'},
      'committer': {'email': b'me@nanx.me',
    ...
    • Jan 30 2020, 3:11 PM
    • 76 Lines
  • elasticsearch_1 | OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
    elasticsearch_1 | {"type": "server", "timestamp": "2020-01-27T12:45:13,203Z", "level": "INFO", "component": "o.e.e.NodeEnvironment", "cluster.name": "docker-cluster", "node.name": "959d65b52b05", "message": "using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/nvme0n1p3)]], net usable_space [199.1gb], net total_space [438.6gb], types [ext4]" }
    elasticsearch_1 | {"type": "server", "timestamp": "2020-01-27T12:45:13,206Z", "level": "INFO", "component": "o.e.e.NodeEnvironment", "cluster.name": "docker-cluster", "node.name": "959d65b52b05", "message": "heap size [989.8mb], compressed ordinary object pointers [true]" }
    elasticsearch_1 | {"type": "server", "timestamp": "2020-01-27T12:45:13,210Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "959d65b52b05", "message": "node name [959d65b52b05], node ID [jwb72D4vRxa9uqgxR7PSbg], cluster name [docker-cluster]" }
    elasticsearch_1 | {"type": "server", "timestamp": "2020-01-27T12:45:13,210Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "959d65b52b05", "message": "version[7.5.0], pid[1], build[default/docker/e9ccaed468e2fac2275a3761849cbee64b39519f/2019-11-26T01:06:52.518245Z], OS[Linux/4.19.0-6-amd64/amd64], JVM[AdoptOpenJDK/OpenJDK 64-Bit Server VM/13.0.1/13.0.1+9]" }
    ...
    • Jan 27 2020, 1:46 PM
    • 60 Lines
  • > assert results == {cont['sha1']: cont}
    E assert {b'4\x972t\xcc\xefj\xb4\xdf\xaa\xf8e\x99y/\xa9\xc3\xfeF\x89': [{'blake2s256': b'\xd5\xfe\x199'\n b"We'\xe4"\n b',\xfdv\xa9'\n b'EZ$2'\n b'\xfe\x7fVf'\n b'\x95dW}'\n b'\xd9<B\x80'\n b'\xe7mf\x1d',\n 'length': 3,\n 'sha1': b'4\x972t'\n...
    • Jan 24 2020, 4:16 PM
    • 2 Lines
  • root@551c92280895:/# aptitude why g++
    i python3-swh.web Depends python3-pypandoc
    i A python3-pypandoc Depends python3-pip
    i A python3-pip Recommends build-essential
    i A build-essential Depends g++ (>= 4:8.3)
    • Jan 23 2020, 4:36 PM
    • 5 Lines
  • delete from origin_visit where type='cran';
    delete from origin where url like 'https://cran.r-project.org/%';
    • Jan 16 2020, 2:59 PM
    • 2 Lines
  • #!/usr/bin/env bash
    set -xe
    USER=$1
    ...
    • Jan 16 2020, 1:32 PM
    • 29 Lines
  • [testenv:xdist]
    deps =
        pytest-cov
        pytest-xdist
    commands =
    ...
    • Jan 15 2020, 3:26 PM
    • 6 Lines
  • SystemCheckError: System check identified some issues:
    ERRORS:
    ?: (urls.E007) The custom handler400 view 'swh.web.common.exc.swh_handle400' does not take the correct number of arguments (request, exception).
    ?: (urls.E007) The custom handler403 view 'swh.web.common.exc.swh_handle403' does not take the correct number of arguments (request, exception).
    ...
    • Jan 14 2020, 11:35 AM
    • 6 Lines
  • content : 22% 1544836073 / 7031830448
    directory : 9% 308855808 / 3528500560
    origin : 100% 91383460 / 91383581
    origin_visit : 98% 1088929623 / 1110533233
    release : 100% 12108512 / 12108527
    ...
    • Jan 6 2020, 1:17 PM
    • 7 Lines
  • def test_create_deposit_multipart(host):
        deposit = host.check_output(
            'swh deposit upload --format json --username test --password test '
    ...
    • Dec 20 2019, 2:12 PM
    • 32 Lines
    • Python
  • url2 = 'http://deb.debian.org/debian//pool/main/l/lxqt-config/lxqt-config_0.14.1.orig.tar.xz.asc'
    # patched download to not check anything
    In [10]: download(url2, dest='/tmp')
    Out[10]:
    ...
    • Dec 20 2019, 12:57 PM
    • 37 Lines
  • Dec 19 14:47:07 worker2 python3[30717]: [2019-12-19 14:47:07,898: ERROR/ForkPoolWorker-1] Fail to load https://softwareheritage.org/swh-ddev
    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/swh/core/tarball.py", line 72, in uncompress
        shutil.unpack_archive(tarpath, extract_dir=dest)
      File "/usr/lib/python3.7/shutil.py", line 999, in unpack_archive
    ...
    • Dec 19 2019, 3:51 PM
    • 20 Lines
  • PACKAGE_FILES3 = {
        'bullseye/main/0.10-1': {
            'files': {
                'libbarcode-datamatrix-perl_0.10-1.debian.tar.xz': {
                    'md5sum': '30bd8e44db00610333af39ccd0805110',
    ...
    • Dec 19 2019, 1:20 PM
    • 40 Lines
  • <?xml version="1.0" encoding="utf-8"?>
    <entry xmlns="http://www.w3.org/2005/Atom"
    xmlns:codemeta="https://doi.org/10.5063/SCHEMA/CODEMETA-2.0">
    <title>Je suis GPL</title>
    <client>swh</client>
    ...
    • Dec 17 2019, 2:26 PM
    • 25 Lines
  • Notice: /Stage[main]/Profile::Swh::Deploy::Webapp/Gunicorn::Instance[swh-webapp]/File[/etc/gunicorn/instances/swh-webapp.cfg]/content:
    --- /etc/gunicorn/instances/swh-webapp.cfg 2018-03-06 18:52:38.179007424 +0000
    +++ /tmp/puppet-file20191213-2435-qv2yh7 2019-12-13 14:53:05.846920036 +0000
    @@ -1,6 +1,13 @@
    # Gunicorn instance configuration.
    ...
    • Dec 13 2019, 3:56 PM
    • 97 Lines
  • swh-web_1 | Traceback (most recent call last):
    swh-web_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/django/core/handlers/exception.py", line 41, in inner
    swh-web_1 | response = get_response(request)
    swh-web_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/django/core/handlers/base.py", line 172, in _get_response
    swh-web_1 | resolver_match = resolver.resolve(request.path_info)
    ...
    • Dec 12 2019, 2:47 PM
    • 395 Lines
  • create index on task(type, status, policy);
    update task
    set arguments=jsonb_set(arguments, '{kwargs}', json_build_object('url', arguments#>>'{kwargs,package_url}')::jsonb)
    where type = 'load-npm' and
    ...
    • Dec 10 2019, 6:40 PM
    • 87 Lines
  • with swh_count_origins as (
        select value
        from object_counts
        where object_type='origin'
    ),
    ...
    • Dec 6 2019, 12:14 PM
    • 9 Lines
    • SQL
  • * scheduler
    Following module migration to their own namespace (D2395):
    #+BEGIN_SRC sh
    ...
    • Dec 5 2019, 8:38 AM
    • 16 Lines