Paste Active Pastes
  • storage:
      cls: remote
      args:
        url: http://localhost:5002/
    ...
    • Mar 3 2020, 5:10 PM
    • 17 Lines
  • -- SWH Indexer DB schema upgrade
    -- from_version: 130
    -- to_version: 131
    -- description:
    ...
    • Mar 2 2020, 11:16 AM
    • 116 Lines
  • Traceback (most recent call last):
    File "/home/zack/.virtualenvs/swh/lib/python3.7/site-packages/aiohttp/connector.py", line 955, in _create_direct_connection
    traces=traces), loop=self._loop)
    File "/home/zack/.virtualenvs/swh/lib/python3.7/site-packages/aiohttp/connector.py", line 825, in _resolve_host
    self._resolver.resolve(host, port, family=self._family)
    ...
    • Feb 28 2020, 4:08 PM
    • 61 Lines
  • with lost_task_runs as (
    select task_run.id
    from task
    inner join task_run on task.id = task_run.task
    where task.policy = 'recurring' and
    ...
    • Feb 27 2020, 10:09 AM
    • 14 Lines
    • SQL
  • import datetime
    import gzip
    import hashlib
    import multiprocessing
    import os
    ...
    • Feb 11 2020, 5:55 PM
    • 87 Lines
    • Python
  • import datetime
    import gc
    import gzip
    import hashlib
    import multiprocessing
    ...
    • Feb 11 2020, 5:54 PM
    • 108 Lines
    • Python
  • diff --git swh/journal/client.py swh/journal/client.py
    index 0d481b0..42e2b96 100644
    --- swh/journal/client.py
    +++ swh/journal/client.py
    @@ -76,7 +76,7 @@ class JournalClient:
    ...
    • Feb 11 2020, 4:44 PM
    • 39 Lines
    • Diff
  • from swh.journal.client import JournalClient
    import logging
    logging.basicConfig(level=logging.INFO)
    logging.info('Running test')
    ...
    • Feb 10 2020, 6:03 PM
    • 33 Lines
    • Python
  • [2020-02-10 17:37:13] INFO:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:GroupCoordinator]: GroupCoordinator/6: Timed out HeartbeatRequest in flight (after 10580ms, timeout #0)
    [2020-02-10 17:37:13] WARNING:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:GroupCoordinator]: GroupCoordinator/6: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
    [2020-02-10 17:37:13] INFO:swh.journal.client:Received non-fatal kafka error: KafkaError{code=_TIMED_OUT,val=-185,str="GroupCoordinator: 1 request(s) timed out: disconnect (after 29019ms in state UP)"}
    [2020-02-10 17:37:14] INFO:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:GroupCoordinator]: GroupCoordinator/6: Timed out HeartbeatRequest in flight (after 10536ms, timeout #0): possibly held back by preceeding OffsetCommitRequest with timeout in 47805ms
    [2020-02-10 17:37:14] WARNING:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:GroupCoordinator]: GroupCoordinator/6: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
    ...
    • Feb 10 2020, 5:39 PM
    • 68 Lines
  • INFO:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:kafka06.euwest.azure.internal.softwareheritage.org:9092/bootstr]: kafka06.euwest.azure.internal.softwareheritage.org:9092/6: Timed out FetchRequest in flight (after 60337ms, timeout #0)
    WARNING:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:kafka06.euwest.azure.internal.softwareheritage.org:9092/bootstr]: kafka06.euwest.azure.internal.softwareheritage.org:9092/6: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
    INFO:swh.journal.client:Received non-fatal kafka error: KafkaError{code=_TIMED_OUT,val=-185,str="kafka06.euwest.azure.internal.softwareheritage.org:9092/6: 1 request(s) timed out: disconnect (after 71045ms in state UP)"}
    INFO:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:kafka06.euwest.azure.internal.softwareheritage.org:9092/bootstr]: kafka06.euwest.azure.internal.softwareheritage.org:9092/6: Timed out FetchRequest in flight (after 60682ms, timeout #0)
    WARNING:swh.journal.client.rdkafka:REQTMOUT [rdkafka#consumer-1] [thrd:kafka06.euwest.azure.internal.softwareheritage.org:9092/bootstr]: kafka06.euwest.azure.internal.softwareheritage.org:9092/6: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
    ...
    • Feb 10 2020, 4:55 PM
    • 46 Lines
  • swh/storage/tests/test_retry.py F
    ================================================================================================================================== FAILURES ===================================================================================================================================
    ___________________________________________________________________________________________________________________ test_retrying_proxy_storage_content_add ___________________________________________________________________________________________________________________
    ...
    • Feb 5 2020, 6:40 PM
    • 39 Lines
  • WARNING cassandra.cluster:cql.py:45 Downgrading core protocol version from 66 to 65 for 127.0.0.1:59387. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
    WARNING cassandra.cluster:cql.py:45 Downgrading core protocol version from 65 to 4 for 127.0.0.1:59387. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
    ERROR cassandra.cluster:thread.py:57 Exception refreshing schema in response to schema change:
    Traceback (most recent call last):
    File "cassandra/cluster.py", line 4044, in cassandra.cluster.refresh_schema_and_set_result
    ...
    • Feb 4 2020, 1:16 PM
    • 12 Lines
  • >>> pprint.pprint(s.revision_get([b'Y\xd7\xa5=\xe5\x980\x8eN\x1f\xffy\x19Z\xe8Z{#\xea\x0e']))
    [{'author': {'email': b'me@nanx.me',
    'fullname': b'Nan Xiao <me@nanx.me>',
    'name': b'Nan Xiao'},
    'committer': {'email': b'me@nanx.me',
    ...
    • Jan 30 2020, 3:11 PM
    • 76 Lines
  • elasticsearch_1 | OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
    elasticsearch_1 | {"type": "server", "timestamp": "2020-01-27T12:45:13,203Z", "level": "INFO", "component": "o.e.e.NodeEnvironment", "cluster.name": "docker-cluster", "node.name": "959d65b52b05", "message": "using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/nvme0n1p3)]], net usable_space [199.1gb], net total_space [438.6gb], types [ext4]" }
    elasticsearch_1 | {"type": "server", "timestamp": "2020-01-27T12:45:13,206Z", "level": "INFO", "component": "o.e.e.NodeEnvironment", "cluster.name": "docker-cluster", "node.name": "959d65b52b05", "message": "heap size [989.8mb], compressed ordinary object pointers [true]" }
    elasticsearch_1 | {"type": "server", "timestamp": "2020-01-27T12:45:13,210Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "959d65b52b05", "message": "node name [959d65b52b05], node ID [jwb72D4vRxa9uqgxR7PSbg], cluster name [docker-cluster]" }
    elasticsearch_1 | {"type": "server", "timestamp": "2020-01-27T12:45:13,210Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "959d65b52b05", "message": "version[7.5.0], pid[1], build[default/docker/e9ccaed468e2fac2275a3761849cbee64b39519f/2019-11-26T01:06:52.518245Z], OS[Linux/4.19.0-6-amd64/amd64], JVM[AdoptOpenJDK/OpenJDK 64-Bit Server VM/13.0.1/13.0.1+9]" }
    ...
    • Jan 27 2020, 1:46 PM
    • 60 Lines
  • > assert results == {cont['sha1']: cont}
    E assert {b'4\x972t\xcc\xefj\xb4\xdf\xaa\xf8e\x99y/\xa9\xc3\xfeF\x89': [{'blake2s256': b'\xd5\xfe\x199'\n b"We'\xe4"\n b',\xfdv\xa9'\n b'EZ$2'\n b'\xfe\x7fVf'\n b'\x95dW}'\n b'\xd9<B\x80'\n b'\xe7mf\x1d',\n 'length': 3,\n 'sha1': b'4\x972t'\n...
    • Jan 24 2020, 4:16 PM
    • 2 Lines
  • root@551c92280895:/# aptitude why g++
    i python3-swh.web Depends python3-pypandoc
    i A python3-pypandoc Depends python3-pip
    i A python3-pip Recommends build-essential
    i A build-essential Depends g++ (>= 4:8.3)
    • Jan 23 2020, 4:36 PM
    • 5 Lines
  • delete from origin_visit where type='cran';
    delete from origin where url like 'https://cran.r-project.org/%';
    • Jan 16 2020, 2:59 PM
    • 2 Lines
  • #!/usr/bin/env bash
    set -xe
    USER=$1
    ...
    • Jan 16 2020, 1:32 PM
    • 29 Lines
  • [testenv:xdist]
    deps =
      pytest-cov
      pytest-xdist
    commands =
    ...
    • Jan 15 2020, 3:26 PM
    • 6 Lines
  • SystemCheckError: System check identified some issues:
    ERRORS:
    ?: (urls.E007) The custom handler400 view 'swh.web.common.exc.swh_handle400' does not take the correct number of arguments (request, exception).
    ?: (urls.E007) The custom handler403 view 'swh.web.common.exc.swh_handle403' does not take the correct number of arguments (request, exception).
    ...
    • Jan 14 2020, 11:35 AM
    • 6 Lines
  • content : 22% 1544836073 / 7031830448
    directory : 9% 308855808 / 3528500560
    origin : 100% 91383460 / 91383581
    origin_visit : 98% 1088929623 / 1110533233
    release : 100% 12108512 / 12108527
    ...
    • Jan 6 2020, 1:17 PM
    • 7 Lines
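The percentages in the paste above look like each count divided by its total and rounded to the nearest integer (`origin` shows 100% even though 91383460 is below 91383581). A minimal sketch that reproduces the visible rows; the `progress` dict and the `percent` helper are hypothetical names, and the rounding rule is inferred from the numbers, not taken from whatever script produced the paste:

```python
# (done, total) pairs copied from the visible rows of the paste.
progress = {
    "content": (1544836073, 7031830448),
    "directory": (308855808, 3528500560),
    "origin": (91383460, 91383581),
    "origin_visit": (1088929623, 1110533233),
    "release": (12108512, 12108527),
}


def percent(done: int, total: int) -> int:
    """Share of `done` in `total` as a whole percentage, rounded to nearest."""
    return round(100 * done / total)


for name, (done, total) in progress.items():
    print(f"{name:<13}: {percent(done, total):>3}% {done} / {total}")
```

Rounding (as opposed to truncating) is what makes the nearly complete tables read as 100%.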
  • def test_create_deposit_multipart(host):
        deposit = host.check_output(
            'swh deposit upload --format json --username test --password test '
    ...
    • Dec 20 2019, 2:12 PM
    • 32 Lines
    • Python
  • url2 = 'http://deb.debian.org/debian//pool/main/l/lxqt-config/lxqt-config_0.14.1.orig.tar.xz.asc'
    # patched download to not check anything
    In [10]: download(url2, dest='/tmp')
    Out[10]:
    ...
    • Dec 20 2019, 12:57 PM
    • 37 Lines
  • Dec 19 14:47:07 worker2 python3[30717]: [2019-12-19 14:47:07,898: ERROR/ForkPoolWorker-1] Fail to load https://softwareheritage.org/swh-ddev
    Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/swh/core/tarball.py", line 72, in uncompress
    shutil.unpack_archive(tarpath, extract_dir=dest)
    File "/usr/lib/python3.7/shutil.py", line 999, in unpack_archive
    ...
    • Dec 19 2019, 3:51 PM
    • 20 Lines
  • PACKAGE_FILES3 = {
    'bullseye/main/0.10-1': {
    'files': {
    'libbarcode-datamatrix-perl_0.10-1.debian.tar.xz': {
    'md5sum': '30bd8e44db00610333af39ccd0805110',
    ...
    • Dec 19 2019, 1:20 PM
    • 40 Lines
  • <?xml version="1.0" encoding="utf-8"?>
    <entry xmlns="http://www.w3.org/2005/Atom"
    xmlns:codemeta="https://doi.org/10.5063/SCHEMA/CODEMETA-2.0">
    <title>Je suis GPL</title>
    <client>swh</client>
    ...
    • Dec 17 2019, 2:26 PM
    • 25 Lines
  • Notice: /Stage[main]/Profile::Swh::Deploy::Webapp/Gunicorn::Instance[swh-webapp]/File[/etc/gunicorn/instances/swh-webapp.cfg]/content:
    --- /etc/gunicorn/instances/swh-webapp.cfg 2018-03-06 18:52:38.179007424 +0000
    +++ /tmp/puppet-file20191213-2435-qv2yh7 2019-12-13 14:53:05.846920036 +0000
    @@ -1,6 +1,13 @@
    # Gunicorn instance configuration.
    ...
    • Dec 13 2019, 3:56 PM
    • 97 Lines
  • swh-web_1 | Traceback (most recent call last):
    swh-web_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/django/core/handlers/exception.py", line 41, in inner
    swh-web_1 | response = get_response(request)
    swh-web_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/django/core/handlers/base.py", line 172, in _get_response
    swh-web_1 | resolver_match = resolver.resolve(request.path_info)
    ...
    • Dec 12 2019, 2:47 PM
    • 395 Lines
  • create index on task(type, status, policy);
    update task
    set arguments=jsonb_set(arguments, '{kwargs}', json_build_object('url', arguments#>>'{kwargs,package_url}')::jsonb)
    where type = 'load-npm' and
    ...
    • Dec 10 2019, 6:40 PM
    • 87 Lines
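The `update` in the paste above replaces each matching task's `kwargs` with an object holding a single `url` key, whose value comes from the former `kwargs.package_url`. A sketch of that transformation on one row's JSON, in Python; the function name and the sample row are hypothetical, and the elided remainder of the `where` clause is not reproduced:

```python
import copy


def rewrite_npm_arguments(arguments: dict) -> dict:
    """Mimic jsonb_set(arguments, '{kwargs}', {"url": arguments#>>'{kwargs,package_url}'})."""
    updated = copy.deepcopy(arguments)  # leave the input row untouched
    # A missing package_url becomes None, mirroring SQL NULL -> JSON null.
    package_url = arguments.get("kwargs", {}).get("package_url")
    # The whole kwargs object is replaced, so any other kwargs keys are dropped.
    updated["kwargs"] = {"url": package_url}
    return updated


# Hypothetical task arguments, for illustration only.
sample = {
    "args": [],
    "kwargs": {"package_name": "ja", "package_url": "https://www.npmjs.com/package/ja"},
}
print(rewrite_npm_arguments(sample))
```

Note that the replacement is not a key rename inside `kwargs`: every other key under `kwargs` disappears in the rewritten row.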
  • with swh_count_origins as (
    select value
    from object_counts
    where object_type='origin'
    ),
    ...
    • Dec 6 2019, 12:14 PM
    • 9 Lines
    • SQL
  • * scheduler
    Following module migration to their own namespace (D2395):
    #+BEGIN_SRC sh
    ...
    • Dec 5 2019, 8:38 AM
    • 16 Lines
  • self.search.origin_update([
    {'url': 'https://bitbucket.org/bitbucket0145/bitbucket_repo.git'},
    {'url': 'https://gitorious.org/railstutorial/railstutorial.git'},
    {'url': 'https://bitbucket.org/bittelc/railstutorial.git'},
    ])
    ...
    • Dec 4 2019, 4:40 PM
    • 11 Lines
    • Python
  • [b'3\xbe\x07S\xcf(~j\xc0\xc0C\xd6\xea\xe6-\x1f\xd3&7;',
    b'\xa2\x82b\xfcGdS\x82O\xe0\x00\xf9\xda{\x85!\x1b\x82\x9d\x07',
    b'\x1c\xff\x7f\xb26\xdb[\xda3\xde\x11\xe5H\xa04\x02\x12\x9b\x8c\xf4',
    b'?J\x9c]\xc1\x1c\x13|\xb9\xd3.\x0eO\xf0\x9e2\xaf\x15M\x15',
    b'-q\x82E\x03\xd1\xf6\xa0G\x14S\xf8\xa0\t\xfdu?\x9e\xb02',
    ...
    • Nov 28 2019, 4:04 PM
    • 20 Lines
  • import json
    from pprint import pprint
    import re
    import elasticsearch
    ...
    • Nov 25 2019, 2:07 PM
    • 46 Lines
    • Python
  • default: &default_settings
      memory: 200G
      java_tool_options: -XXlol
    compress:
      <<: *default_settings
    ...
    • Nov 8 2019, 2:49 PM
    • 6 Lines
  • default:
      memory: 200G
      java_tool_options: -XXlol
    compress:
      memory: 1000G
    • Nov 8 2019, 2:47 PM
    • 5 Lines
    • YAML
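The two pastes above contrast a `compress` section that pulls in the defaults through a YAML anchor and merge key (`<<: *default_settings`) with one written out by hand. As a rough sketch of what the merge key does, assuming the usual YAML 1.1 merge semantics (keys set explicitly on the mapping win over merged ones), here is the same idea with plain Python dicts. The `merge_key` helper is hypothetical, and overriding `memory` with `1000G` in the anchored variant is an assumption, since that part of the first paste is elided:

```python
# Values copied from the `default` section shown in both pastes.
default_settings = {"memory": "200G", "java_tool_options": "-XXlol"}


def merge_key(base: dict, explicit: dict) -> dict:
    """Emulate `<<: *base`: start from the anchored mapping, explicit keys override."""
    return {**base, **explicit}


# Assumed override; the first paste is truncated before any such key.
compress = merge_key(default_settings, {"memory": "1000G"})
print(compress)
```

In the second paste, written without the anchor, `compress` only carries `memory`, so `java_tool_options` is not inherited from `default`.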
  • 11:18 <+ardumont> douardda: D2237 draft to check for missing task types
    11:18 -- Notice(swhbot): D2237 (author: ardumont, Needs Review) on swh-lister: lister: Add checks on expected scheduler's output tasks <https://forge.softwareheritage.org/D2237>
    11:18 <+ardumont> i'm not sure when/where to plug that check though
    11:21 <+olasd> if we're doing that, we might just as well create the task type with the proper settings
    11:25 <+ardumont> mmm, unsure
    ...
    • Nov 8 2019, 12:00 PM
    • 33 Lines
  • swh/graph/tests/test_cli.py::TestCompress::test_pipeline
    -------------------------------------------------------------------------------------------------------------------------------- live log call --------------------------------------------------------------------------------------------------------------------------------
    webgraph.py 233 INFO starting compression
    webgraph.py 242 INFO starting compression step MPH (1/11)
    webgraph.py 153 INFO running: java it.unimi.dsi.sux4j.mph.GOVMinimalPerfectHashFunction --zipped /tmp/tmp465gdf9e.swh-graph-test/example.mph --temp-dir /tmp/tmp465gdf9e.swh-graph-test/tmp /home/antoine/swh/swh-environment/swh-graph/swh/graph/tests/dataset/example.nodes.csv.gz
    ...
    • Nov 4 2019, 12:01 PM
    • 49 Lines
  • url
    ------------------------------------------------------------------------------------------------
    https://github.com/rootpy/root_numpy
    https://github.com/stevengj/mpb
    https://github.com/barbagroup/pygbe
    ...
    • Oct 15 2019, 2:49 PM
    • 263 Lines
  • package org.softwareheritage.graph;
    import org.softwareheritage.graph.algo.Traversal;
    import java.io.OutputStream;
    ...
    • Oct 14 2019, 1:52 PM
    • 67 Lines
    • Java
  • ERR-CONDUIT-CORE: Graph cycle detected (type=5, cycle=PHID-DREV-vvlfxmyjkkrcdnbfzb5l, PHID-DREV-sz2tjc63iowtoyay6mey, PHID-DREV-ni5tcyma6542fypzaalx, PHID-DREV-q6y4zi7eciqddatexzxl, PHID-DREV-tsal5vzfklovorhkqvzn, PHID-DREV-rtmmo2kgo7wacl4warhv, PHID-DREV-5muna6xupz7kvfkz2yuo, PHID-DREV-ee7627jtcof4i73qqrfm, PHID-DREV-kf4oob54dw4sttlkv2az, PHID-DREV-vvlfxmyjkkrcdnbfzb5l).
    • Oct 14 2019, 12:28 PM
    • 1 Line
  • git pull --rebase
    remote: Enumerating objects: 21, done.
    remote: Counting objects: 100% (21/21), done.
    remote: Compressing objects: 100% (11/11), done.
    remote: Total 12 (delta 6), reused 0 (delta 0)
    ...
    • Oct 9 2019, 2:58 PM
    • 15 Lines
  • P545 Data
    {'_id': ObjectId('5ad99f9fbd95630dfc4b9a4e'),
    'graphql': {'user': {'biography': 'International educator, traveler, music '
    'lover, photographer, Californian.',
    'blocked_by_viewer': False,
    'connected_fb_page': None,
    ...
    • Oct 8 2019, 6:30 PM
    • 640 Lines
  • 17:09 <+douardda> and the pb should not happen if we put the plugin out of the swh package (i.e. at the root of swh-core directory)
    17:10 <+douardda> (I've reproduced the issue in a minimal src repo)
    17:12 <+douardda> the problem seems to be that when the plugin lives under our package's hat, it's loaded very soon, thus swh.core is loaded but from the installed location (in the tox venv here)
    17:13 <+douardda> so when the subsequent import statement for a conftest or a testfile occurs, it's looked under this package's root directory first
    17:14 <+douardda> so the simple solution is to put this plugin in a dedicated python module not under the swh's (and especially the swh.core one I think) package
    ...
    • Oct 8 2019, 11:49 AM
    • 6 Lines
  • Bad:
    Returns
    List of tarball urls and their associated metadata (time, length).
    For example:
    ...
    • Oct 8 2019, 11:33 AM
    • 21 Lines
  • ```
    $ sudo apt install r-base
    ```
    ```
    ...
    • Oct 6 2019, 11:49 AM
    • 34 Lines
  • Hello,
    It's the time of the year where I ask you (again!) for your help to better archive GNU source code in the Software Heritage archive.
    Would it be possible to change the format of the GNU file listing [1] to also include SHA256 checksums?
    ...
    • Oct 1 2019, 12:10 PM
    • 16 Lines
  • indexes:
    - swh_workers-2018.03.*
    size: 100
    from: 0
    ...
    • Sep 30 2019, 7:47 PM
    • 27 Lines
  • diff --git a/Makefile b/Makefile
    index 524175c..be63f09 100644
    --- a/Makefile
    +++ b/Makefile
    @@ -3,3 +3,12 @@
    ...
    • Sep 27 2019, 7:10 PM
    • 29 Lines
    • Diff
  • ✘ ⚙ dev@desktop5  ~/swh-environment/swh-journal   bencode-key ●  pytest
    ========================================================================================================= test session starts =========================================================================================================
    platform linux -- Python 3.5.3, pytest-5.0.1, py-1.7.0, pluggy-0.12.0
    hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/dev/swh-environment/swh-journal/.hypothesis/examples')
    rootdir: /home/dev/swh-environment/swh-journal, inifile: pytest.ini
    ...
    • Sep 23 2019, 3:00 PM
    • 51 Lines
  • -- gitlab (renamed key "api_baseurl" to "url")
    update task set arguments='{"args": [], "kwargs": {"instance": "inria", "url": "https://gitlab.inria.fr/api/v4"}}' where arguments#>>'{kwargs,instance}' = 'inria' and type in ('list-gitlab-full', 'list-gitlab-incremental');
    update task set arguments='{"args": [], "kwargs": {"instance": "framagit", "url": "https://framagit.org/api/v4"}}' where arguments#>>'{kwargs,instance}' = 'framagit' and type in ('list-gitlab-full', 'list-gitlab-incremental');
    update task set arguments='{"args": [], "kwargs": {"instance": "riseup", "url": "https://0xacab.org/api/v4"}}' where arguments#>>'{kwargs,instance}' = 'riseup' and type in ('list-gitlab-full', 'list-gitlab-incremental');
    update task set arguments='{"args": [], "kwargs": {"instance": "gitlab", "url": "https://gitlab.com/api/v4"}}' where arguments#>>'{kwargs,instance}' = 'gitlab' and type in ('list-gitlab-full', 'list-gitlab-incremental');
    ...
    • Sep 11 2019, 3:54 PM
    • 35 Lines
    • SQL
  • messages: 168300
    messages: 168400
    messages: 168500
    messages: 168600
    messages: 168700
    ...
    • Sep 11 2019, 3:18 PM
    • 23 Lines
  • swh_graph-replayer.1.khy6rzsfohdz@mirror-node-3 | INFO:swh.journal.cli:Processed 35000 messages.
    swh_graph-replayer.1.khy6rzsfohdz@mirror-node-3 | Traceback (most recent call last):
    swh_graph-replayer.1.khy6rzsfohdz@mirror-node-3 | File "/usr/bin/swh", line 11, in <module>
    swh_graph-replayer.1.khy6rzsfohdz@mirror-node-3 | load_entry_point('swh.core==0.0.67', 'console_scripts', 'swh')()
    swh_graph-replayer.1.khy6rzsfohdz@mirror-node-3 | File "/usr/lib/python3/dist-packages/swh/core/cli/__init__.py", line 56, in main
    ...
    • Sep 10 2019, 5:14 PM
    • 36 Lines
  • swh_graph-replayer.1.p18dcgg4lwq5@mirror-node-3 | INFO:swh.journal.cli:Processed 124000 messages.
    swh_graph-replayer.1.p18dcgg4lwq5@mirror-node-3 | INFO:kafka.client:Closing idle connection 1, last active 540038 ms ago
    swh_graph-replayer.1.p18dcgg4lwq5@mirror-node-3 | INFO:kafka.conn:<BrokerConnection node_id=1 host=kafka01.euwest.azure.internal.softwareheritage.org:9092 <connected> [IPv4 ('192.168.200.24', 9092)]>: Closing connection.
    swh_graph-replayer.1.p18dcgg4lwq5@mirror-node-3 | INFO:kafka.client:Closing idle connection 3, last active 540139 ms ago
    swh_graph-replayer.1.p18dcgg4lwq5@mirror-node-3 | INFO:kafka.conn:<BrokerConnection node_id=3 host=kafka03.euwest.azure.internal.softwareheritage.org:9092 <connected> [IPv4 ('192.168.200.31', 9092)]>: Closing connection.
    ...
    • Sep 10 2019, 1:50 PM
    • 9 Lines
  • Sep 10 11:59:01 desktop5 replayer-12301[2746]: INFO:kafka.conn:<BrokerConnection node_id=11 host=esnode1.internal.softwareheritage.org:9092 <connecting> [IPv4 ('192.168.100.61', 9092)]>: connecting to esnode1.internal.softwareheritage.org:9092 [('192.168.100.61', 9092) IPv4]
    Sep 10 11:59:06 desktop5 replayer-12301[2746]: INFO:kafka.conn:<BrokerConnection node_id=12 host=esnode2.internal.softwareheritage.org:9092 <connecting> [IPv4 ('192.168.100.62', 9092)]>: connecting to esnode2.internal.softwareheritage.org:9092 [('192.168.100.62', 9092) IPv4]
    Sep 10 11:59:06 desktop5 replayer-12301[2746]: INFO:kafka.conn:<BrokerConnection node_id=11 host=esnode1.internal.softwareheritage.org:9092 <connecting> [IPv4 ('192.168.100.61', 9092)]>: Connection complete.
    Sep 10 11:59:06 desktop5 replayer-12301[2746]: INFO:kafka.conn:<BrokerConnection node_id=12 host=esnode2.internal.softwareheritage.org:9092 <connecting> [IPv4 ('192.168.100.62', 9092)]>: Connection complete.
    Sep 10 11:59:19 desktop5 replayer-12301[2746]: WARNING:kafka.coordinator:Heartbeat session expired, marking coordinator dead
    ...
    • Sep 10 2019, 12:00 PM
    • 43 Lines
  • ERROR:root:An error occurred while calling o0.visit.
    : java.lang.ArrayIndexOutOfBoundsException: Index 272313807 out of bounds for length 121824
    at it.unimi.dsi.bits.LongArrayBitVector.getBoolean(LongArrayBitVector.java:374)
    at org.softwareheritage.graph.algo.Traversal.visitNodesVisitor(Traversal.java:160)
    at org.softwareheritage.graph.Entry.visit(Entry.java:28)
    ...
    • Sep 9 2019, 7:33 PM
    • 108 Lines
  • /browse/origin/26984/latest_snapshot/
    /browse/origin/34423/latest_snapshot/
    /browse/origin/21387/latest_snapshot/
    /browse/origin/48567/latest_snapshot/
    /browse/origin/29526/latest_snapshot/
    ...
    • Sep 9 2019, 2:59 PM
    • 851 Lines
  • import json
    import requests
    from pprint import pprint
    ...
    • Sep 9 2019, 2:53 PM
    • 74 Lines
    • Python
  • swh/indexer/tests/test_metadata.py .................................F
    ================================================================================================================== FAILURES ===================================================================================================================
    ___________________________________________________________________________________________________ Metadata.test_revision_metadata_indexer ___________________________________________________________________________________________________
    ...
    • Sep 5 2019, 2:55 PM
    • 92 Lines
  • # Graph compression output
    Compression script and environment used: https://forge.softwareheritage.org/source/swh-graph/browse/master/dockerfiles/
    - Direct compressed graph: `all.{graph,obl,offsets,properties}`
    ...
    • Aug 27 2019, 9:20 PM
    • 8 Lines
  • https://forge.softwareheritage.org/rDGRPHb31d2e86a80cf8b85d4bf51f30be8e463fe994e4
    https://forge.softwareheritage.org/rDGRPH0b46253799f43a25a8528926052340f93a1a911b
    https://forge.softwareheritage.org/rDGRPHc7363b064ae1ed52c271b9831b934cd196589c8e
    https://forge.softwareheritage.org/rDGRPHb6c6e1eec131a002a44e01cef17abb81ec958421
    https://forge.softwareheritage.org/rDGRPHd5dcbfcdf245777a8753ccc6ac5414e762605abe
    ...
    • Aug 27 2019, 2:02 PM
    • 186 Lines
  • Command : $ django-admin shell --settings=swh.web.settings.tests
    /home/kalpitk/.virtualenvs/swh/lib/python3.6/site-packages/swh/scheduler/__init__.py:69: DeprecationWarning: Call to deprecated class SWHRemoteAPI. (Use the RPCClient instead) -- Deprecated since version 0.0.64.
    return SchedulerBackend(**args)
    Traceback (most recent call last):
    ...
    • Aug 22 2019, 11:45 AM
    • 38 Lines
    • Bash Scripting
  • #!/usr/bin/env python3
    import sys
    import dateutil.parser
    ...
    • Aug 20 2019, 2:40 PM
    • 36 Lines
    • Python
  • On rioc:
    (base) [zacchiro@rioc graph]$ cat all+ori.*.count
    164513699014
    11683687950
    ...
    • Aug 19 2019, 10:44 AM
    • 10 Lines
  • (swh) archit@work-pc:~/swh-environment/swh-lister/swh/lister$ python jjj.py
    DEBUG:swh.lister.core.lister_base:Loading config from lister_gnu
    INFO:swh.core.config:Loading config file /home/archit/.config/swh/lister_gnu.yml
    DEBUG:swh.lister.core.lister_base:<swh.lister.gnu.lister.GNULister object at 0x7ffacb9f7a20> CONFIG={'content_size_limit': 104857600, 'log_db': 'dbname=softwareheritage-log', 'storage': {'cls': 'remote', 'args': {'url': 'http://localhost:5002/'}}, 'scheduler': {'cls': 'remote', 'args': {'url': 'http://localhost:5008/'}}, 'lister': {'cls': 'local', 'args': {'db': 'postgresql:///lister-gnu'}}, 'credentials': [], 'cache_responses': True, 'cache_dir': '/home/archit/.cache/swh/lister/gnu/'}
    DEBUG:urllib3.util.retry:Converted retries value: 3 -> Retry(total=3, connect=None, read=None, redirect=None, status=None)
    ...
    • Aug 18 2019, 2:35 PM
    • 91 Lines
  • Done on rioc:
    Used 1000 random nodes (results are in seconds):
    'git bundle' use-case
    ...
    • Aug 15 2019, 9:46 PM
    • 15 Lines
  • swh-storage=# select * from origin_visit where origin=424 ;
     origin | visit |             date              | type | status  | metadata |                  snapshot
    --------+-------+-------------------------------+------+---------+----------+--------------------------------------------
        424 |     1 | 2019-08-14 10:02:03.190532+00 | gnu  | partial |          |
        424 |     2 | 2019-08-14 10:02:46.964868+00 | gnu  | full    |          | \x1f3305edbd687c27ca005f7cafea3d6f809c38d1
    ...
    • Aug 14 2019, 1:32 PM
    • 16 Lines
  • Done on rioc:
    Used 100000 random nodes (results are in seconds):
    'ls' use-case
    ...
    • Aug 14 2019, 11:22 AM
    • 27 Lines
  • Print of `revisions` at this line
    https://forge.softwareheritage.org/source/swh-storage/browse/master/swh/storage/storage.py$692
    [{'author': {'email': 'robot@softwareheritage.org',
    'fullname': 'Software Heritage',
    ...
    • Aug 14 2019, 10:32 AM
    • 182 Lines
  • swh-storage_1 | [2019-08-13 20:10:39 +0000] [41] [DEBUG] POST /content/missing
    swh-storage_1 | [2019-08-13 20:10:39 +0000] [41] [DEBUG] POST /directory/missing
    swh-storage_1 | [2019-08-13 20:10:39 +0000] [41] [DEBUG] POST /revision/missing
    swh-storage_1 | [2019-08-13 20:10:39 +0000] [41] [DEBUG] POST /revision/add
    swh-storage_1 | ERROR:root:Object of type bytes is not JSON serializable
    ...
    • Aug 13 2019, 10:12 PM
    • 39 Lines
  • utkarsh@G3:~$ workon swh
    Usage:: command not found
    Command: command not found
    Options:: command not found
    INFO: command not found
    ...
    • Aug 12 2019, 3:26 PM
    • 12 Lines
  • (swh) utkarsh@G3:~/swh-environment$ swh scheduler task-type add list-packagist-full2 "swh.lister.packagist.tasks.PackagistListerTask" "Full PACKAGIST lister" --default-interval '1 day' --backoff-factor 1
    /home/utkarsh/.virtualenvs/swh/lib/python3.7/site-packages/swh/scheduler/__init__.py:69: DeprecationWarning: Call to deprecated class SWHRemoteAPI. (Use the RPCClient instead) -- Deprecated since version 0.0.64.
    return SchedulerBackend(**args)
    OK
    • Aug 12 2019, 2:48 PM
    • 4 Lines
  • There were two revisions for this package
    {'author': {'email': b'robot@softwareheritage.org',
    'fullname': b'Software Heritage',
    'name': b'Software Heritage'},
    ...
    • Aug 11 2019, 9:10 PM
    • 104 Lines
  • swh-loader_1 | [2019-08-11 19:02:59,149: INFO/MainProcess] Received task: swh.loader.package.tasks.LoadGNU[11fa45c7-ffc8-47ad-9e18-dcff159f1bd2]
    swh-loader_1 | [2019-08-11 19:02:59,152: INFO/ForkPoolWorker-1] Loading config file /loader.yml
    swh-loader_1 | [2019-08-11 19:02:59,169: DEBUG/ForkPoolWorker-1] Creating gnu origin for https://ftp.gnu.org/gnu/hello/
    swh-loader_1 | [2019-08-11 19:02:59,304: DEBUG/ForkPoolWorker-1] Done creating gnu origin for https://ftp.gnu.org/gnu/hello/
    swh-loader_1 | [2019-08-11 19:02:59,304: DEBUG/ForkPoolWorker-1] Creating origin_visit for origin https://ftp.gnu.org/gnu/hello/ at time 2019-08-11 19:02:59.304333+00:00
    ...
    • Aug 11 2019, 9:06 PM
    • 82 Lines
  • iled'}
    swh-loader_1 | [2019-08-11 18:50:23,779: INFO/MainProcess] Received task: swh.loader.package.tasks.LoadGNU[7a7e43e1-dd5b-4ebe-be5b-2ea10fd0bdbc]
    swh-loader_1 | [2019-08-11 18:50:23,871: INFO/ForkPoolWorker-1] Loading config file /loader.yml
    swh-loader_1 | [2019-08-11 18:50:23,915: DEBUG/ForkPoolWorker-1] Creating gnu origin for https://ftp.gnu.org/gnu/dap/
    swh-loader_1 | [2019-08-11 18:50:23,924: DEBUG/ForkPoolWorker-1] Done creating gnu origin for https://ftp.gnu.org/gnu/dap/
    ...
    • Aug 11 2019, 8:58 PM
    • 91 Lines
  • P497 logs
    wh-loader_1 | Using pip from /srv/softwareheritage/venv/bin/pip
    swh-loader_1 | Processing /src/swh-loader
    swh-loader_1 | Requirement already satisfied: vcversioner in /srv/softwareheritage/venv/lib/python3.7/site-packages (from swh.loader.core==0.0.44.post3) (2.16.0.0)
    swh-loader_1 | Requirement already satisfied: retrying in /srv/softwareheritage/venv/lib/python3.7/site-packages (from swh.loader.core==0.0.44.post3) (1.3.3)
    swh-loader_1 | Requirement already satisfied: psutil in /srv/softwareheritage/venv/lib/python3.7/site-packages (from swh.loader.core==0.0.44.post3) (5.6.3)
    ...
    • Aug 9 2019, 4:35 PM
    • 295 Lines
  • NPM Revision
    {'synthetic': True,
    'metadata': {'package_source': {'name': 'ja','version': '0.0.1', 'filename': 'ja-0.0.1.tgz', 'sha1': '31399c51d3024f6eb91c626a31a175dc30f343e5', 'date': '2014-04-07T16:02:07.453Z', 'url': 'https://registry.npmjs.org/ja/-/ja-0.0.1.tgz', 'sha256': '8101e284d5846e77f9698628da660e92f8627535c266a3aef28549b25e63a597', 'blake2s256': 'fcf5438822e12c8a6973b9ecb40df4978947edd675c4269d728d5279dc82b203'},
    'package': {'author': 'Goldeneye Solutions', 'name': 'ja', 'description': 'Compose a cross-platform application from various things.',
    ...
    • Aug 9 2019, 3:56 PM
    • 95 Lines
  •    id   |     type      |         arguments          |           next_run            | current_interval |       status       | policy  | retries_left | priority
    --------+---------------+----------------------------+-------------------------------+------------------+--------------------+---------+--------------+----------
     225784 | list-gnu-full | {"args": [], "kwargs": {}} | 2019-08-08 19:22:27.338807+00 | 90 days          | next_run_scheduled | oneshot |            0 |
     225786 | list-gnu-full | {"args": [], "kwargs": {}} | 2019-08-08 19:25:28.450559+00 | 90 days          | next_run_scheduled | oneshot |            0 |
     225787 | list-gnu-full | {"args": [], "kwargs": {}} | 2019-08-08 19:29:55.666438+00 | 90 days          | disabled           | oneshot |            0 |
    ...
    • Aug 9 2019, 7:47 AM
    • 8 Lines
  • swh-lister_1 | [2019-08-09 05:32:01,482: ERROR/ForkPoolWorker-1] Task swh.lister.gnu.tasks.GNUListerTask[e997c3cf-0dfc-423e-9460-3e8d685321b7] raised unexpected: NotNullViolation('null value in column "retries_left" violates not-null constraint\nDETAIL: Failing row contains (225791, load-tar, {"args": ["gcal", "https://ftp.gnu.org/gnu/gcal/"], "kwargs": {"..., 2019-08-09 05:32:01.13707+00, null, next_run_not_scheduled, recurring, null, null).\nCONTEXT: SQL statement "insert into task (type, arguments, next_run, status, current_interval, policy,\n retries_left, priority)\n select type, arguments, next_run, status, current_interval, policy,\n retries_left, priority\n from tmp_task t\n where not exists(select 1\n from task\n where type = t.type and\n arguments->\'args\' = t.arguments->\'args\' and\n arguments->\'kwargs\' = t.arguments->\'kwargs\' and\n policy = t.policy and\n priority is not distinct from t.priority and\n status = t.status)"\nPL/pgSQL function swh_scheduler_create_tasks_from_temp() line 12 at SQL statement\n')
    swh-lister_1 | Traceback (most recent call last):
    swh-lister_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/celery/app/trace.py", line 385, in trace_task
    swh-lister_1 | R = retval = fun(*args, **kwargs)
    swh-lister_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/scheduler/task.py", line 45, in __call__
    ...
    • Aug 9 2019, 7:35 AM
    • 41 Lines
  • Notice: /Stage[main]/Profile::Ssh::Server/Sshkey[ssh-worker1.internal.staging.swh.network-ecdsa-sha2-nistp256]/ensure: current_value absent, should be present (noop)
    Notice: /Stage[main]/Profile::Ssh::Server/Sshkey[ssh-worker0.internal.staging.swh.network-rsa]/ensure: current_value absent, should be present (noop)
    Notice: /Stage[main]/Profile::Ssh::Server/Sshkey[ssh-worker0.internal.staging.swh.network-ecdsa-sha2-nistp256]/ensure: current_value absent, should be present (noop)
    Notice: /Stage[main]/Profile::Ssh::Server/Sshkey[ssh-worker0.internal.staging.swh.network-ed25519]/ensure: current_value absent, should be present (noop)
    Notice: /Stage[main]/Profile::Ssh::Server/Sshkey[ssh-worker0.internal.staging.swh.network-dsa]/ensure: current_value absent, should be present (noop)
    ...
    • Aug 8 2019, 7:28 PM
    • 172 Lines
  • version: '2'
    services:
      swh-objstorage:
        volumes:
    ...
    • Aug 8 2019, 3:41 PM
    • 15 Lines
  • Benchmark results for content_find:
    hash_algo = sha1 (sample size=263):
    cassandra: avg = 9 ms, stdev = 2.5 ms
    postgres: avg = 14 ms, stdev = 14.9 ms
    ...
    • Aug 8 2019, 1:47 PM
    • 72 Lines
  • from collections import defaultdict
    import csv
    import itertools
    import os
    from pprint import pprint
    ...
    • Aug 8 2019, 12:21 PM
    • 207 Lines
    • Python
  • Benchmark results for content_find:
    hash_algo = sha1 (sample size=248):
    avg cassandra = 5 ms
    avg postgres = 14 ms
    hash_algo = sha1_git (sample size=242):
    ...
    • Aug 8 2019, 12:16 PM
    • 25 Lines
  • DEBUG:swh.loader.package.GNULoader:Sending 2 revisions
    DEBUG:urllib3.connectionpool:Resetting dropped connection: localhost
    DEBUG:urllib3.connectionpool:http://localhost:5002 "POST /revision/add HTTP/1.1" 400 85
    ERROR:swh.loader.package.GNULoader:Loading failure, updating to `partial` status
    Traceback (most recent call last):
    ...
    • Aug 7 2019, 1:12 PM
    • 38 Lines
  • pip install $( ./bin/pip-swh-packages --with-testing )
    Obtaining file:///home/tony/work/inria/repo/swh/swh-environment/swh-core
    Obtaining file:///home/tony/work/inria/repo/swh/swh-environment/swh-model
    Obtaining file:///home/tony/work/inria/repo/swh/swh-environment/swh-core
    Obtaining file:///home/tony/work/inria/repo/swh/swh-environment/swh-objstorage
    ...
    • Aug 2 2019, 12:03 PM
    • 52 Lines
  • variable "region" {
      type    = "string"
      default = "northeurope"
    }
    ...
    • Aug 1 2019, 5:40 PM
    • 68 Lines
  • # Get an available port number
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(('127.0.0.1', 0))
    self.port = sock.getsockname()[1]
    sock.close()
    • Aug 1 2019, 10:05 AM
    • 5 Lines
    • Python
  • Info: Using configured environment 'new_staging'
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Info: Applying configuration version '1564591379'
    ...
    • Jul 31 2019, 6:45 PM
    • 59 Lines
  • Package: DZEXPM
    Type: Package
    Title: Estimation and Prediction of Skewed Spatial Processes
    Version: 1.0
    Date: 2017-06-24
    ...
    • Jul 25 2019, 6:56 PM
    • 14 Lines
  • root@pergamon:~# puppet agent --test --noop
    Info: Using configured environment 'production'
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    ...
    • Jul 24 2019, 4:27 PM
    • 68 Lines
  • $ git commit --date '1999-01-01' -m "foo" --allow-empty | grep Date
    Date: Fri Jan 1 12:59:34 1999 +0100
    $ git commit --date '1973-03-03' -m "foo" --allow-empty | grep Date
    Date: Sat Mar 3 13:00:21 1973 +0100
    $ git commit --date '1973-03-02' -m "foo" --allow-empty | grep Date
    ...
    • Jul 23 2019, 2:02 PM
    • 10 Lines
  • ardumont@pergamon:~% cat /etc/fstab
    # /etc/fstab: static file system information.
    #
    # Use 'blkid' to print the universally unique identifier for a
    # device; this may be used with UUID= as a more robust way to name devices
    ...
    • Jul 22 2019, 11:41 AM
    • 32 Lines
  • /
    -----master----. b
    \ /
    ...
    • Jul 19 2019, 2:10 PM
    • 18 Lines
  • (swh) archit@work-pc:~/swh-environment/swh-loader-core/swh/loader/gnu$ tox .
    GLOB sdist-make: /home/archit/swh-environment/swh-loader-core/setup.py
    flake8 installed: entrypoints==0.3,flake8==3.7.8,mccabe==0.6.1,pycodestyle==2.5.0,pyflakes==2.1.1,swh.loader.core==0.0.44.post2
    flake8 run-test-pre: PYTHONHASHSEED='1804426209'
    flake8 run-test: commands[0] | /home/archit/swh-environment/swh-loader-core/.tox/flake8/bin/python -m flake8
    ...
    • Jul 19 2019, 9:48 AM
    • 65 Lines
  • swh_content-replayer.1.bgt2xzycn7wo@desktop6 | Starting the SWH mirror content replayer
    swh_content-replayer.1.bgt2xzycn7wo@desktop6 | /usr/lib/python3/dist-packages/swh/storage/api/server.py:20: DeprecationWarning: Call to deprecated class SWHServerAPIApp. (Use the RPCServerApp instead) -- Deprecated since version 0.0.64.
    swh_content-replayer.1.bgt2xzycn7wo@desktop6 | app = SWHServerAPIApp(__name__)
    swh_content-replayer.1.bgt2xzycn7wo@desktop6 | Traceback (most recent call last):
    swh_content-replayer.1.bgt2xzycn7wo@desktop6 | File "/usr/bin/swh", line 11, in <module>
    ...
    • Jul 18 2019, 3:42 PM
    • 15 Lines
  • kafka:
      image: wurstmeister/kafka
      environment:
        KAFKA_ADVERTISED_HOST_NAME: 127.0.0.1
      ports:
    ...
    • Jul 14 2019, 1:30 PM
    • 9 Lines
  • (swh) archit@work-pc:~/swh-environment$ doco logs kafka
    kafka_1 | log.cleaner.min.cleanable.ratio = 0.5
    kafka_1 | log.cleaner.min.compaction.lag.ms = 0
    ...
    • Jul 14 2019, 1:23 PM
    • 177 Lines
  • (swh) archit@work-pc:~/swh-environment$ doco logs swh-storage
    ...
    • Jul 14 2019, 1:21 PM
    • 47 Lines
  • INFO:swh.core.config:Loading config file /home/archit/.config/swh/loader/cran.yml
    /home/archit/swh-environment/swh-storage/swh/storage/api/client.py:13: DeprecationWarning: Call to deprecated class MetaRPCClient. (Use the MetaRPCClient instead) -- Deprecated since version 0.0.64.
    class RemoteStorage(SWHRemoteAPI):
    /home/archit/swh-environment/swh-storage/swh/storage/__init__.py:43: DeprecationWarning: Call to deprecated class RPCClient. (Use the RPCClient instead) -- Deprecated since version 0.0.64.
    return Storage(**args)
    ...
    • Jul 14 2019, 12:55 PM
    • 48 Lines