Paste Active Pastes
  • #!/bin/bash
    cat log_mph log_bv log_bfs log_transform > log
    rm log_*
    cat timings_mph timings_bv timings_obl timings_bfs timings_transform timings_obl2 timings_stats > timings
    ...
    • May 19 2019, 6:45 AM
    • 6 Lines
    • Bash Scripting
  • Graph compression output
    ========================
    These are the output directories from experiments running the WebGraph framework
    to compress the Software Heritage graph datasets. Each directory is the output
    ...
    • May 18 2019, 7:49 AM
    • 16 Lines
  • (swh) morane@hplaptopft0:~/Documents/code/swh-environment/swh-docs$ tox -e sphinx-dev
    GLOB sdist-make: /home/morane/Documents/code/swh-environment/swh-docs/setup.py
    sphinx-dev create: /home/morane/Documents/code/swh-environment/swh-docs/.tox/sphinx-dev
    sphinx-dev installdeps: django < 2, -rrequirements-swh-dev.txt, pifpaf
    ...
    • May 15 2019, 2:30 PM
    • 23 Lines
  • Stats
    -----
    Returns statistics on the compressed graph.
    ...
    • May 13 2019, 8:00 AM
    • 74 Lines
  • # Edge dataset
    The dataset in this folder only contains information about the **edges** of the
    Software Heritage Graph (and none of the associated metadata). This is useful
    for studying the **topology** of the graph.
    ...
    • May 12 2019, 1:56 PM
    • 39 Lines
  • #!/bin/bash
    for dataset in dir_to_dir dir_to_file dir_to_rev origin_to_snapshot \
    release_to_obj rev_to_dir rev_to_rev snapshot_to_obj; do
    mv $dataset.csv.gz $dataset.edges.csv.gz
    ...
    • May 12 2019, 4:12 AM
    • 9 Lines
    • Bash Scripting
  • /srv/ftp/pub/R/src/contrib/00Archive/A3/A3_0.9.1.tar.gz 45252 FALSE 664 2013-02-07 14:30:29 2019-05-05 09:40:28 2019-05-05 02:56:44 1001 1001 hornik cranadmin
    ...
    • May 10 2019, 7:55 PM
    • 30 Lines
  • update deposit set check_task_id='167830447', load_task_id='167830455' where id = 264;
    update deposit set check_task_id='164895648', load_task_id='167830454' where id = 263;
    update deposit set check_task_id='160036918', load_task_id='160037202' where id = 262;
    update deposit set check_task_id='159935272', load_task_id='159936134' where id = 261;
    ...
    • May 9 2019, 11:23 AM
    • 15 Lines
    • SQL
  • ~/annex/dataset/swh-graph-2019-01-28/edges $ cat *.csv.count | paste -d+ -s | bc
    164_513_703_039 # 160 B
    ~/annex/dataset/swh-graph-2019-01-28/edges $ cat *.nodes.count | paste -d+ -s | bc
    17_537_088_222 # 17 B
    • May 7 2019, 1:28 PM
    • 4 Lines
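The `cat ... | paste -d+ -s | bc` pipeline in the paste above adds up the integers stored in the per-dataset count files. A minimal Python sketch of the same idea follows, assuming each `*.count` file holds one or more whitespace-separated integers; the helper name `sum_counts` and the demo file names are hypothetical, not taken from the paste.

```python
import tempfile
from pathlib import Path


def sum_counts(directory, pattern):
    """Add up every integer found in the files matching `pattern`."""
    total = 0
    for path in sorted(Path(directory).glob(pattern)):
        # Assumption: count files contain only whitespace-separated integers.
        for token in path.read_text().split():
            total += int(token)
    return total


# Demo on throwaway files (hypothetical names and values).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "dir_to_dir.edges.csv.count").write_text("40\n")
    (Path(tmp) / "rev_to_rev.edges.csv.count").write_text("2\n")
    demo_total = sum_counts(tmp, "*.csv.count")

# The paste writes totals with "_" as a thousands separator;
# Python's format mini-language can produce the same layout.
formatted = f"{164513703039:_}"
```

The `_` grouping option keeps large totals readable, and `int()` accepts the same underscore-separated form when parsing it back.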
  • Hello,
    sorry for the late reply.
    So, as zack mentioned, we made the changes.
    ...
    • May 6 2019, 3:30 PM
    • 20 Lines
  • swh_app = <Celery celery.tests at 0x7ff6a13d0048>
    celery_session_worker = <Worker: gen658@b84cee4ca65b (running)>
    def test_ping(swh_app, celery_session_worker):
    res = swh_app.send_task(
    ...
    • Apr 28 2019, 12:02 AM
    • 69 Lines
  • swh-docker-dev_zookeeper_1 is up-to-date
    Starting swh-docker-dev_swh-objstorage_1 ...
    swh-docker-dev_swh-storage-db_1 is up-to-date
    swh-docker-dev_swh-idx-storage-db_1 is up-to-date
    Starting swh-docker-dev_swh-objstorage_1 ... done
    ...
    • Apr 23 2019, 3:08 PM
    • 11 Lines
  • Name Command State Ports
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------
    swh-docker-dev_amqp_1 docker-entrypoint.sh rabbi ... Up 15671/tcp, 15672/tcp, 25672/tcp, 4369/tcp, 5671/tcp, 0.0.0.0:5072->5672/tcp
    swh-docker-dev_grafana_1 /run.sh Up 3000/tcp
    swh-docker-dev_kafka-manager_1 /kafka-manager/bin/kafka-m ... Up 0.0.0.0:5093->9000/tcp
    ...
    • Apr 23 2019, 2:47 PM
    • 30 Lines
  • azure@desktop5 ~/ansible (master ✚) $ cat plugins/inventory/terraform.py
    import json
    import os
    from subprocess import check_output
    ...
    • Apr 17 2019, 2:33 PM
    • 61 Lines
    • Python
  • P382 test.tf
    provider "azurerm" {
    }
    data "azurerm_network_security_group" "worker-nsg" {
    name = "worker-nsg"
    ...
    • Apr 17 2019, 1:37 PM
    • 74 Lines
  • $ ipython
    Python 3.7.2+ (default, Feb 2 2019, 14:31:48)
    Type 'copyright', 'credits' or 'license' for more information
    IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.
    ...
    • Apr 16 2019, 5:06 PM
    • 23 Lines
    • Python
  • $ parallel
    Academic tradition requires you to cite works you base your article on.
    When using programs that use GNU Parallel to process data for publication
    please cite:
    ...
    • Apr 16 2019, 11:24 AM
    • 18 Lines
  • def test_content_add_collision_sha256(self):
    cont1 = self.cont
    # create (corrupted) content with same sha256 but != sha1{,_git}
    cont1b = cont1.copy()
    ...
    • Apr 2 2019, 4:14 PM
    • 13 Lines
    • Python
  • diff --git a/swh-team/swh-weekly-report b/swh-team/swh-weekly-report
    index 10b53e1..ec8f64d 100755
    --- a/swh-team/swh-weekly-report
    +++ b/swh-team/swh-weekly-report
    @@ -9,7 +9,7 @@ from dateutil.relativedelta import relativedelta
    ...
    • Apr 1 2019, 6:02 PM
    • 93 Lines
    • Diff
  • Delivered-To: antoine.romain.dumont@gmail.com
    Received: by 2002:a02:c722:0:0:0:0:0 with SMTP id h2csp1100444jao;
    Wed, 27 Mar 2019 11:20:52 -0700 (PDT)
    X-Google-Smtp-Source: APXvYqy7L3Rh5mg/8x1/ZMWHV2gDO+U7xH/Ps0EYSqvWFS2HHupe3xHr8/Y0jPrUmtAJR/vaAWu9
    X-Received: by 2002:adf:8367:: with SMTP id 94mr25834115wrd.46.1553710852555;
    ...
    • Mar 28 2019, 10:30 AM
    • 141 Lines
  • as of now, swh-environment within nix:
    ```
    $ nix-shell swh.nix
    ```
    ...
    • Mar 23 2019, 11:28 PM
    • 257 Lines
  • $ ./swh-weekly-report.py
    Tasks (subscribed):
    - T1534 | PostgreSQL replication issues between prado and somerset
    - T1276 | swh-journal: Add tests
    ...
    • Mar 20 2019, 11:07 PM
    • 18 Lines
  • <Multi_key> <d> <r> : "https://forge.softwareheritage.org/diffusion" # repositories
    <Multi_key> <d> <t> : "https://forge.softwareheritage.org/tasks"
    <Multi_key> <d> <p> : "https://forge.softwareheritage.org/paste"
    <Multi_key> <d> <g> : "https://wiki.softwareheritage.org/wiki/Git_style_guide"
    <Multi_key> <d> <d> : "https://docs.softwareheritage.org/devel/getting-started.html#getting-started"
    ...
    • Mar 20 2019, 9:55 AM
    • 7 Lines
  • (swh) archit@work-pc:~/swh-environment/swh-lister$ python
    Python 3.6.7 (default, Oct 22 2018, 11:32:17)
    [GCC 8.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import logging
    ...
    • Mar 19 2019, 10:22 PM
    • 100 Lines
  • [<FrameSummary file /usr/lib/python3.5/runpy.py, line 193 in _run_module_as_main>, <FrameSummary file /usr/lib/python3.5/runpy.py, line 85 in _run_code>, <FrameSummary file /usr/lib/python3/dist-packages/celery/__main__.py, line 20 in <module>>, <FrameSummary file /usr/lib/python3/dist-packages/celery/__main__.py, line 16 in main>, <FrameSummary file /usr/lib/python3/dist-packages/celery/bin/celery.py, line 322 in main>, <FrameSummary file /usr/lib/python3/dist-packages/celery/bin/celery.py, line 496 in execute_from_commandline>, <FrameSummary file /usr/lib/python3/dist-packages/celery/bin/base.py, line 275 in execute_from_commandline>, <FrameSummary file /usr/lib/python3/dist-packages/celery/bin/celery.py, line 488 in handle_argv>, <FrameSummary file /usr/lib/python3/dist-packages/celery/bin/celery.py, line 420 in execute>, <FrameSummary file /usr/lib/python3/dist-packages/celery/bin/worker.py, line 223 in run_from_argv>, <FrameSummary file /usr/lib/python3/dist-packages/celery/bin/base.py, line 238 in...
    • Mar 18 2019, 5:17 PM
    • 1 Line
  • Package Version Location
    --------------------------- ------------- -------------------------------------------------
    aiohttp 4.0.0a0
    alabaster 0.7.12
    amqp 2.4.2
    ...
    • Mar 18 2019, 11:19 AM
    • 150 Lines
  • From: ardumont@softwareheritage.org
    To: sysadmin@fsf.org
    Cc: swh-devel@inria.fr
    Subject: [swh] GNU listing adaptation please?
    Fcc: sent
    ...
    • Mar 12 2019, 5:52 PM
    • 38 Lines
  • Using the Software Heritage Graph Dataset
    =========================================
    This README contains instructions on how to use the different formats in which the
    *Software Heritage graph dataset* is distributed.
    ...
    • Mar 11 2019, 4:06 PM
    • 129 Lines
  • *** swh-deploy: starting test run on moma.internal.softwareheritage.org...
    Info: Using configured environment 'production'
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    ...
    • Mar 4 2019, 3:28 PM
    • 59 Lines
  • swh-web_1 | starting the swh-web server
    swh-web_1 | [2019-03-04 12:25:26 +0000] [1] [INFO] Starting gunicorn 19.9.0
    swh-web_1 | [2019-03-04 12:25:26 +0000] [1] [INFO] Listening at: http://0.0.0.0:5004 (1)
    swh-web_1 | [2019-03-04 12:25:26 +0000] [1] [INFO] Using worker: sync
    swh-web_1 | [2019-03-04 12:25:26 +0000] [13] [INFO] Booting worker with pid: 13
    ...
    • Mar 4 2019, 1:29 PM
    • 53 Lines
  • swh-web_1 | TypeError: 'NoneType' object is not iterable
    swh-web_1 | [2019-03-03 14:26:03 +0000] [18] [ERROR] Error handling request /static/img/icons/swh-logo-deposit-192x192.png
    swh-web_1 | Traceback (most recent call last):
    swh-web_1 | File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 135, in handle
    swh-web_1 | self.handle_request(listener, req, client, addr)
    ...
    • Mar 3 2019, 4:31 PM
    • 223 Lines
  • update revision_metadata
    set translated_metadata = origin_intrinsic_metadata.metadata
    from origin_intrinsic_metadata
    where revision_metadata.id=origin_intrinsic_metadata.from_revision and revision_metadata.translated_metadata='{"@context": "https://doi.org/10.5063/schema/codemeta-2.0"}' and origin_intrinsic_metadata.metadata != '{"@context": "https://doi.org/10.5063/schema/codemeta-2.0"}';
    • Mar 2 2019, 9:50 AM
    • 4 Lines
  • DELETE FROM revision_metadata
    WHERE translated_metadata = '{"@context": "https://doi.org/10.5063/schema/codemeta-2.0"}'::jsonb ;
    DELETE FROM origin_intrinsic_metadata
    WHERE metadata = '{"@context": "https://doi.org/10.5063/schema/codemeta-2.0"}'::jsonb ;
    • Mar 1 2019, 2:32 PM
    • 5 Lines
  • *** swh-deploy: starting test run on moma.internal.softwareheritage.org...
    Info: Using configured environment 'production'
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    ...
    • Feb 28 2019, 6:51 PM
    • 59 Lines
  • Old :
    CREATE TABLE public.directory (
    id public.sha1_git NOT NULL,
    dir_entries bigint[],
    ...
    • Feb 19 2019, 11:48 AM
    • 33 Lines
  • Feb 14 17:46:13 worker01 python3[2344]: [2019-02-14 17:46:13,609: ERROR/ForkPoolWorker-1] Task swh.deposit.loader.tasks.ChecksDepositTsk[e201d7a8-6f19-4248-a08c-d10874c2e6a3] raised unexpected: AttributeError("'DepositChecker' object has no attribute 'log'",)
    Feb 14 17:46:13 worker01 python3[2344]: Traceback (most recent call last):
    Feb 14 17:46:13 worker01 python3[2344]: File "/usr/lib/python3/dist-packages/swh/deposit/loader/checker.py", line 21, in check
    Feb 14 17:46:13 worker01 python3[2344]: self.client.check(deposit_check_url)
    Feb 14 17:46:13 worker01 python3[2344]: File "/usr/lib/python3/dist-packages/swh/deposit/client/__init__.py", line 208, in check
    ...
    • Feb 14 2019, 7:01 PM
    • 42 Lines
  • Feb 14 14:24:05 worker01 python3[4534]: [2019-02-14 14:24:05,077: ERROR/ForkPoolWorker-1] Task swh.lister.gitlab.tasks.IncrementalGitLabLister[f7f2d92d-bfa9-4994-a637-e3859a73f432] raised unexpected: TypeError('incremental_gitlab_lister() takes 0 positional arguments but 1 was given',)
    Feb 14 14:24:05 worker01 python3[4534]: Traceback (most recent call last):
    Feb 14 14:24:05 worker01 python3[4534]: File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 382, in trace_task
    Feb 14 14:24:05 worker01 python3[4534]: R = retval = fun(*args, **kwargs)
    Feb 14 14:24:05 worker01 python3[4534]: File "/usr/lib/python3/dist-packages/swh/scheduler/task.py", line 45, in __call__
    ...
    • Feb 14 2019, 3:39 PM
    • 18 Lines
  • rng | frq | bar
    -------------+----------+--------------------------------
    [4,73) | 22473062 | ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
    [72,142) | 4225673 | ■■■■■■
    [141,211) | 3351946 | ■■■■
    ...
    • Feb 12 2019, 4:41 PM
    • 22 Lines
  • Goal: make the stretch-swh build for the scheduler pass
    Relevant commit:
    ```
    cd swh-scheduler
    ...
    • Feb 11 2019, 4:24 PM
    • 38 Lines
  • ============================= test session starts ==============================
    platform linux -- Python 3.5.3, pytest-3.0.6, py-1.4.32, pluggy-0.4.0
    rootdir: /<<PKGBUILDDIR>>, inifile:
    plugins: postgresql-1.3.4, hypothesis-3.6.1, celery-4.2.1
    collected 42 items
    ...
    • Feb 11 2019, 11:15 AM
    • 68 Lines
  • Started by timer
    Running in Durability level: MAX_SURVIVABILITY
    Loading library swh@master
    Attempting to resolve master from remote references...
    > git --version # timeout=10
    ...
    • Feb 7 2019, 3:38 PM
    • 138 Lines
  • gemspec | codemeta | pkginfo | npm | maven | total | total_nonempty
    ---------+----------+---------+---------+--------+----------+----------------
    143688 | 139 | 3525 | 1313459 | 447697 | 17730303 | 2033490
    • Feb 4 2019, 10:55 AM
    • 3 Lines
  • #!/bin/bash
    # wrapper to run GitHub Licensee license detection tool from a git clone of its
    # repo, setting up the appropriate Ruby load path
    ...
    • Feb 2 2019, 1:35 PM
    • 8 Lines
  • Jan 31 13:04:36 storage0 python3 [2434334]: 2019-01-31 13:04:36 [2434334] [ERROR] canceling statement due to statement timeout
    Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request ()
    File "/usr/lib/python3/dist-packages/flask/app.py", line 1598, in dispatch_request
    ...
    • Jan 31 2019, 2:11 PM
    • 31 Lines
  • *** swh-deploy: deploying on moma.internal.softwareheritage.org...
    Info: Using configured environment 'production'
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    ...
    • Jan 25 2019, 2:49 PM
    • 51 Lines
  • swh-scheduler-api_1 | [INFO] werkzeug -- 172.20.0.15 - - [24/Jan/2019 10:48:57] "POST /create_tasks HTTP/1.1" 200 -
    swh-scheduler-api_1 | [INFO] werkzeug -- 172.20.0.15 - - [24/Jan/2019 10:48:57] "POST /create_tasks HTTP/1.1" 200 -
    swh-scheduler-api_1 | [INFO] werkzeug -- 172.20.0.15 - - [24/Jan/2019 10:48:57] "POST /create_tasks HTTP/1.1" 200 -
    swh-scheduler-api_1 | [ERROR] root -- relation "tmp_task" already exists
    ...
    • Jan 24 2019, 11:56 AM
    • 47 Lines
  • $ pip3 show pyld | grep Version
    Version: 1.0.3
    $ python3
    Python 3.5.3 (default, Sep 27 2018, 17:25:39)
    [GCC 6.3.0 20170516] on linux
    ...
    • Jan 16 2019, 5:57 PM
    • 29 Lines
  • >>> import pprint
    >>> import swh.indexer.storage.api.client
    >>> s = swh.indexer.storage.api.client.RemoteStorage(url='http://uffizi.internal.softwareheritage.org:5007/')
    >>> pprint.pprint(s.origin_intrinsic_metadata_search_fulltext(['James']))
    [{'from_revision': b'\xd4bM\xa6\x9eH\x06\x15\x0c\x1ap\xbc\x84~\x11\x17'
    ...
    • Jan 14 2019, 1:40 PM
    • 85 Lines
    • Python
  • import sys
    import time
    #import kafka
    ...
    • Jan 10 2019, 1:38 PM
    • 32 Lines
  • # Copyright (C) 2018 The Software Heritage developers
    # See the AUTHORS file at the top-level directory of this distribution
    # License: GNU General Public License version 3, or any later version
    # See top-level LICENSE file for more information
    ...
    • Jan 10 2019, 11:53 AM
    • 50 Lines
    • Python
  • -- DONE
    CREATE TABLE ctas_dataset_dir_to_rev
    WITH (format = 'TEXTFILE', external_location =
    's3://softwareheritage/edges_dataset/dir_to_rev/', field_delimiter = ' ')
    AS SELECT to_hex(directory.id) as source, to_hex(target) as dest
    ...
    • Jan 7 2019, 7:15 PM
    • 72 Lines
    • SQL
  • swh-environment $ pip install $( ./bin/pip-swh-packages --with-testing )
    swh-storage[schemata,listener][testing] should either be a path to a local project or a VCS url beginning with svn+, git+, hg+, or bzr+
    • Dec 20 2018, 3:31 PM
    • 2 Lines
  • ```
    $ pifpaf run postgresql -- pytest
    WARNING [pifpaf.drivers] `psutil.Popen(pid=23360, status='terminated')` is already gone, sending SIGKILL to its process group
    ERROR [pifpaf] sequence item 0: expected str instance, bytes found
    ```
    ...
    • Dec 20 2018, 3:24 PM
    • 45 Lines
  • swh-scheduler-api_1 | ERROR:root:fe_sendauth: no password supplied
    swh-scheduler-api_1 | Traceback (most recent call last):
    swh-scheduler-api_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    swh-scheduler-api_1 | rv = self.dispatch_request()
    swh-scheduler-api_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    ...
    • Dec 17 2018, 3:22 PM
    • 19 Lines
  • ✘ dev@desktop5 ~/swh-environment/swh-docs (master) $ git pull
    Already up-to-date.
    dev@desktop5 ~/swh-environment/swh-docs (master) $ tox -r -e sphinx-dev
    GLOB sdist-make: /home/dev/swh-environment/swh-docs/setup.py
    sphinx-dev recreate: /home/dev/swh-environment/swh-docs/.tox/sphinx-dev
    ...
    • Nov 29 2018, 6:29 PM
    • 46 Lines
  • When I have something like:
    ```
    if a:
    r = do_something(a)
    ...
    • Nov 22 2018, 11:46 AM
    • 21 Lines
  • Hypothesis issue with old version: https://github.com/HypothesisWorks/hypothesis/issues/290
    build output:
    ```
    ============================= test session starts ==============================
    ...
    • Nov 19 2018, 11:38 AM
    • 98 Lines
  • pytest swh/indexer/tests/storage/test_storage.py -x
    ========================================================================================================= test session starts =========================================================================================================
    platform linux -- Python 3.5.3, pytest-3.9.3, py-1.7.0, pluggy-0.8.0
    hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/dev/swh-environment/swh-indexer/.hypothesis/examples')
    rootdir: /home/dev/swh-environment/swh-indexer, inifile: pytest.ini
    ...
    • Nov 15 2018, 10:52 AM
    • 99 Lines
  • diff <(grep 'def ' swh/storage/in_memory.py | grep -v 'def _' | sed -e 's/(.*//' | sort) <(grep 'def ' swh/storage/storage.py | grep -v 'def _' | sed -e 's/(.*//' | sort)
    0a1
    > def add_to_objstorage
    3a5
    > def content_get
    ...
    • Nov 15 2018, 10:07 AM
    • 21 Lines
    • Diff
  • | context | file_name | counted | percentage | percentage | percentage on 3,424,000,000 files |
    |-------------------------------|-------------------|----------------------|------------|------------|-----------------------------------|
    | CodeMeta | CODE | 320 | 0.00% | | 8.85E-06 |
    | haskell | .cabal | 676053 | 1.27% | 0.01% | 0.01870298068 |
    | java- Maven | pom.xml | 15509125 | 29.03% | 0.43% | 0.4290593566 |
    ...
    • Nov 12 2018, 5:11 PM
    • 33 Lines
  • def _naive_sig(param_names):
    return inspect.Signature([
    inspect.Parameter(name, inspect.Parameter.POSITIONAL_OR_KEYWORD)
    for name in param_names])
    ...
    • Nov 9 2018, 1:07 PM
    • 82 Lines
    • Python
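The visible part of the `_naive_sig` paste builds an `inspect.Signature` from a list of parameter names. As a rough illustration of what such a signature can be used for, the sketch below repeats that construction under the name `naive_sig` and binds call arguments against it; the parameter names and values are made up for the example, and the rest of the original paste is not shown here.

```python
import inspect


def naive_sig(param_names):
    # Same construction as the visible lines of the paste: every name
    # becomes a positional-or-keyword parameter without a default.
    return inspect.Signature([
        inspect.Parameter(name, inspect.Parameter.POSITIONAL_OR_KEYWORD)
        for name in param_names])


sig = naive_sig(["ids", "policy_update"])

# bind() maps positional and keyword arguments onto parameter names.
bound = sig.bind([1, 2], policy_update="update-dups")
arguments = dict(bound.arguments)

# A missing argument is rejected, because no parameter has a default.
try:
    sig.bind([1, 2])
    missing_rejected = False
except TypeError:
    missing_rejected = True
```

Since none of the generated parameters has a default, `bind()` behaves like a strict argument check for a function with that parameter list.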
  • def content_mimetype_missing(self, mimetypes, db=None, cur=None):
    """Generates mimetypes missing from storage.
    Args:
    mimetypes (iterable): iterable of dict with keys:
    ...
    • Nov 8 2018, 12:03 PM
    • 13 Lines
    • Python
  • commit edebe6a4a42bae99a1819898b14bb0951cfe6b8b
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date: Mon Nov 5 14:54:06 2018 +0100
    Remove testrepo.zip.
    ...
    • Nov 5 2018, 2:52 PM
    • 64 Lines
    • Diff
  • GLOB sdist-make: /home/morane/Documents/code/swh-environment/swh-indexer/setup.py
    flake8 recreate: /home/morane/Documents/code/swh-environment/swh-indexer/.tox/flake8
    flake8 installdeps: flake8
    flake8 installed: flake8==3.6.0,mccabe==0.6.1,pkg-resources==0.0.0,pycodestyle==2.4.0,pyflakes==2.0.0
    flake8 runtests: PYTHONHASHSEED='3997505493'
    ...
    • Oct 30 2018, 4:31 PM
    • 3,664 Lines
  • dev@desktop5 ~/swh-environment/swh-indexer (master) $ git pull
    remote: Counting objects: 14, done.
    remote: Compressing objects: 100% (14/14), done.
    remote: Total 14 (delta 10), reused 0 (delta 0)
    Unpacking objects: 100% (14/14), done.
    ...
    • Oct 29 2018, 10:22 AM
    • 14 Lines
  • If it's urgent to redeploy indexers, here is what I foresee in the
    current state of affairs,
    - After review and acceptance, merge the diffs:
    ...
    • Oct 27 2018, 11:16 AM
    • 58 Lines
  • celery.worker.strategy: INFO: Received task: swh.indexer.tests.test_origin_metadata.test_revision_metadata_task[785b1155-89f2-4aec-ac50-c2d3eb34b4d3]
    celery.app.trace: ERROR: Task swh.indexer.tests.test_origin_metadata.test_revision_metadata_task[785b1155-89f2-4aec-ac50-c2d3eb34b4d3] raised unexpected: EncodeError(TypeError("b'8dbb6aeb036e7fd80664eb8bfd1507881af1ba9f' is not JSON serializable",),)
    Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/kombu/serialization.py", line 50, in _reraise_errors
    yield
    ...
    • Oct 25 2018, 3:08 PM
    • 81 Lines
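The `EncodeError` in the log above wraps a `TypeError` raised when a `bytes` value reaches a JSON serializer. The sketch below reproduces that underlying error with the standard `json` module only. Decoding the identifier to `str` before sending it is one possible workaround, shown here as an assumption rather than as the fix that was actually applied.

```python
import json

# Hex identifier taken from the log message, held as bytes.
revision_id = b"8dbb6aeb036e7fd80664eb8bfd1507881af1ba9f"

# bytes values are rejected by the JSON encoder.
try:
    json.dumps({"ids": [revision_id]})
    error = None
except TypeError as exc:
    error = str(exc)

# Possible workaround (an assumption): the value is ASCII hex,
# so it can be decoded to str before serialization.
payload = json.dumps({"ids": [revision_id.decode("ascii")]})
```

The receiving side then gets plain strings and has to convert them back to bytes itself if it needs them.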
  • ======================================================================
    FAIL: test_pipeline (swh.indexer.tests.test_origin_metadata.TestOriginMetadata)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
    ...
    • Oct 25 2018, 2:47 PM
    • 26 Lines
  • def test_compute_metadata_codemeta(self):
    """
    test that a codemeta file is not altered with translation
    """
    ...
    • Oct 25 2018, 12:06 PM
    • 50 Lines
  • pre-requisite:
    ```
    arc patch D582
    ```
    ...
    • Oct 24 2018, 11:12 AM
    • 1,748 Lines
  • KeyboardInterrupt
    Exception ignored in: <bound method AsyncResult.__del__ of <AsyncResult: 8731ccc1-5068-41ff-9b83-e0e2ef930843>>
    Traceback (most recent call last):
    File "/home/morane/.local/lib/python3.5/site-packages/celery/result.py", line 385, in __del__
    self.backend.remove_pending_result(self)
    ...
    • Oct 23 2018, 5:01 PM
    • 60 Lines
  • ---
    errors:
    hg:
    - "Failed to uncompress archive"
    - "OSError: [Errno 12] Cannot allocate memory"
    ...
    • Oct 18 2018, 11:13 AM
    • 134 Lines
  • origin-update-pypi;oneshot;["manhattan_seo", "https://pypi.org/project/manhattan_seo/"];{"project_metadata_url": "https://pypi.org/pypi/manhattan_seo/json"}
    origin-update-pypi;oneshot;["hypermax", "https://pypi.org/project/hypermax/"];{"project_metadata_url": "https://pypi.org/pypi/hypermax/json"}
    origin-update-pypi;oneshot;["Flask-Security-Bundle", "https://pypi.org/project/Flask-Security-Bundle/"];{"project_metadata_url": "https://pypi.org/pypi/Flask-Security-Bundle/json"}
    origin-update-pypi;oneshot;["collective.dms.mailcontent", "https://pypi.org/project/collective.dms.mailcontent/"];{"project_metadata_url": "https://pypi.org/pypi/collective.dms.mailcontent/json"}
    origin-update-pypi;oneshot;["certbot-dns-cloudxns", "https://pypi.org/project/certbot-dns-cloudxns/"];{"project_metadata_url": "https://pypi.org/pypi/certbot-dns-cloudxns/json"}
    ...
    • Oct 17 2018, 4:21 PM
    • 1,411 Lines
  • return self.RevisionMetadataTask().apply_async(
    kwargs={
    'ids': [res['revision_id'] for res in results],
    'policy_update': 'update-dups',
    },
    ...
    • Oct 16 2018, 2:25 PM
    • 9 Lines
    • Python
  • jq . pypi.group-output.txt | grep -v 'Reason: 404'
    {
    "googlecode": {
    "total": 3933,
    "errors": {
    ...
    • Oct 16 2018, 1:50 PM
    • 178 Lines
  • ```
    export PYTHONPATH=$SWH_ENVIRONMENT_HOME/snippets/ardumont:$PYTHONPATH
    python3 -m kibana_fetch_logs > output.txt
    cat output.txt | python3 -m group_by_exception --loader-type hg > output-group-by.txt
    ```
    ...
    • Oct 15 2018, 4:22 PM
    • 7 Lines
  • curl -XPOST 'http://esnode3.internal.softwareheritage.org:9200/swh_workers-2018.10.11,swh_workers-2018.10.12,swh_workers-2018.10.13,swh_workers-2018.10.14/_search' -d '{
    "from": 10,
    "_source": [
    "message",
    ...
    • Oct 15 2018, 2:33 PM
    • 36 Lines
  • {
    "swh_workers-2018.09.30": {
    "mappings": {
    "doc": {
    "properties": {
    ...
    • Oct 15 2018, 1:36 PM
    • 617 Lines
  • {
    "order": 0,
    "index_patterns": [
    "swh_workers-*"
    ],
    ...
    • Oct 15 2018, 1:33 PM
    • 16 Lines
  • tony (e) .venv (arcpatch-D505_1) … swh/swh-environment/swh-indexer 2 $ make test
    python3 -m nose -sv --with-doctest .
    Failure: TypeError (metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases) ... ERROR
    Failure: TypeError (metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases) ... ERROR
    Failure: TypeError (metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases) ... ERROR
    ...
    • Oct 11 2018, 10:46 AM
    • 465 Lines
  • http://kibana0.internal.softwareheritage.org:5601/app/kibana#/dashboard/32632370-c0bd-11e8-8222-07f3ec376cd5?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:%272018-08-31T22:00:00.000Z%27,mode:absolute,to:%272018-10-05T21:59:59.999Z%27))&_a=(description:%27This%20is%20a%20general%20dashboard%20listing%20the%20full%20logs%20of%20the%20swh-workers%27,filters:!((%27$state%27:(store:appState),meta:(alias:!n,disabled:!f,index:%2720de3150-c0bb-11e8-8222-07f3ec376cd5%27,key:systemd_unit,negate:!f,params:(query:%27swh-worker@swh_loader_pypi.service%27,type:phrase),type:phrase,value:%27swh-worker@swh_loader_pypi.service%27),query:(match:(systemd_unit:(query:%27swh-worker@swh_loader_pypi.service%27,type:phrase)))),(%27$state%27:(store:appState),meta:(alias:!n,disabled:!f,index:%2720de3150-c0bb-11e8-8222-07f3ec376cd5%27,key:priority,negate:!f,params:(query:%273%27,type:phrase),type:phrase,value:%273%27),query:(match:(priority:(query:%273%27,type:phrase)))),(%27$state%27:(store:appState),meta:(alias:!n...
    • Oct 5 2018, 6:29 PM
    • 1 Line
  • dev@desktop5 ~/swh-environment/swh-indexer (origin-head-indexer) $ grep content_metadata_add **/*.{py,sql}
    swh/indexer/metadata.py: self.idx_storage.content_metadata_add(
    swh/indexer/storage/__init__.py: def content_metadata_add(self, metadata, conflict_update=False, db=None,
    swh/indexer/storage/__init__.py: db.content_metadata_add_from_temp(conflict_update, cur)
    swh/indexer/storage/api/client.py: def content_metadata_add(self, metadata, conflict_update=False):
    ...
    • Oct 5 2018, 4:22 PM
    • 20 Lines
  • Traceback (most recent call last):
    File "origin_head.py", line 121, in <module>
    main()
    File "/usr/lib/python3/dist-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
    ...
    • Oct 5 2018, 2:51 PM
    • 26 Lines
  • ```
    test_revision_metadata_indexer (swh.indexer.tests.test_metadata.Metadata) ... ERROR
    ...
    ======================================================================
    ...
    • Oct 4 2018, 4:19 PM
    • 28 Lines
  • # swhpass and completion mechanism
    SWH_PASSWORD_STORE_DIR=${HOME}/work/inria/repo/swh/credentials/
    function swhpass() {
    PASSWORD_STORE_DIR=$SWH_PASSWORD_STORE_DIR pass $@
    }
    ...
    • Oct 4 2018, 3:00 PM
    • 12 Lines
  • diff --git a/setup.py b/setup.py
    index 42e154f..93d3192 100644
    --- a/setup.py
    +++ b/setup.py
    @@ -34,6 +34,7 @@ setup(
    ...
    • Oct 1 2018, 2:55 PM
    • 12 Lines
    • Diff
  • http://acsccg.googlecode.com/svn/ /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/a/acsccg/acsccg-repo.svndump.gz
    http://anarchintosh-projects.googlecode.com/svn/ /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/a/anarchintosh-projects/anarchintosh-projects-repo.svndump.gz
    http://bastian.googlecode.com/svn/ /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/b/bastian/bastian-repo.svndump.gz
    http://calculapdf.googlecode.com/svn/ /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/c/calculapdf/calculapdf-repo.svndump.gz
    http://dagamers.googlecode.com/svn/ /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/d/dagamers/dagamers-repo.svndump.gz
    ...
    • Sep 28 2018, 4:04 PM
    • 23 Lines
  • http://9i00.googlecode.com/svn/ f Traceback (most recent call last):\n File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 862, in load\n self.store_data()\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 313, in store_data\n start_from_scratch=self.start_from_scratch)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 503, in process_repository\n svnrepo, revision_start, revision_end, revision_parents)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 240, in process_swh_revisions\n raise e\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 219, in process_swh_revisions\n self.config['revision_packet_size']):\n File "/usr/lib/python3/dist-packages/swh/core/utils.py", line 40, in grouper\n for _data in itertools.zip_longest(*args, fillvalue=None):\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 163, in process_svn_revisions\n for rev, nextrev, commit, new_objects, root_directory in gen_revs:\n File "/usr/lib/python3/dist-packages/swh/loader/svn/svn.py", line 267, in swh_hash_data_per_revision\n objects = self.swhreplay.compute_hashes(rev)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/ra.py", line 374, in compute_hashes\n self.replay(rev)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/ra.py", line 359, in replay\n self.conn.replay(rev, rev+1, self.editor)\nUnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 0: invalid start byte\n
    http://9i00.googlecode.com/svn/ f Traceback (most recent call last):\n File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 742, in load\n self.store_data()\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 308, in store_data\n self.last_known_swh_revision)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 496, in process_repository\n svnrepo, revision_start, revision_end, revision_parents)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 238, in process_swh_revisions\n raise e\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 217, in process_swh_revisions\n self.config['revision_packet_size']):\n File "/usr/lib/python3/dist-packages/swh/core/utils.py", line 40, in grouper\n for _data in itertools.zip_longest(*args, fillvalue=None):\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 161, in process_svn_revisions\n for rev, nextrev, commit, new_objects, root_directory in gen_revs:\n File "/usr/lib/python3/dist-packages/swh/loader/svn/svn.py", line 266, in swh_hash_data_per_revision\n objects = self.swhreplay.compute_hashes(rev)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/ra.py", line 333, in compute_hashes\n self.replay(rev)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/ra.py", line 318, in replay\n self.conn.replay(rev, rev+1, self.editor)\nUnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 0: invalid start byte\n
    http://9i00.googlecode.com/svn/ f Traceback (most recent call last):\n File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 742, in load\n self.store_data()\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 308, in store_data\n self.last_known_swh_revision)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 496, in process_repository\n svnrepo, revision_start, revision_end, revision_parents)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 238, in process_swh_revisions\n raise e\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 217, in process_swh_revisions\n self.config['revision_packet_size']):\n File "/usr/lib/python3/dist-packages/swh/core/utils.py", line 40, in grouper\n for _data in itertools.zip_longest(*args, fillvalue=None):\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 161, in process_svn_revisions\n for rev, nextrev, commit, new_objects, root_directory in gen_revs:\n File "/usr/lib/python3/dist-packages/swh/loader/svn/svn.py", line 266, in swh_hash_data_per_revision\n objects = self.swhreplay.compute_hashes(rev)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/ra.py", line 333, in compute_hashes\n self.replay(rev)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/ra.py", line 318, in replay\n self.conn.replay(rev, rev+1, self.editor)\nUnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 0: invalid start byte\n
    http://9i00.googlecode.com/svn/ f Traceback (most recent call last):\n File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 732, in load\n self.store_data()\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 308, in store_data\n self.last_known_swh_revision)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 496, in process_repository\n svnrepo, revision_start, revision_end, revision_parents)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 238, in process_swh_revisions\n raise e\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 217, in process_swh_revisions\n self.config['revision_packet_size']):\n File "/usr/lib/python3/dist-packages/swh/core/utils.py", line 40, in grouper\n for _data in itertools.zip_longest(*args, fillvalue=None):\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 161, in process_svn_revisions\n for rev, nextrev, commit, new_objects, root_directory in gen_revs:\n File "/usr/lib/python3/dist-packages/swh/loader/svn/svn.py", line 266, in swh_hash_data_per_revision\n objects = self.swhreplay.compute_hashes(rev)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/ra.py", line 336, in compute_hashes\n self.replay(rev)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/ra.py", line 321, in replay\n self.conn.replay(rev, rev+1, self.editor)\nUnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 0: invalid start byte\n
    http://9i00.googlecode.com/svn/ f Traceback (most recent call last):\n File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 628, in load\n """Detailed visit status.\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 300, in store_data\n Note:\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 489, in process_repository\n self.log.info(msg)\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 229, in process_swh_revisions\n _id = known_swh_rev.get('id')\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 213, in process_swh_revisions\n revs = []\n File "/usr/lib/python3/dist-packages/swh/core/utils.py", line 40, in grouper\n for _data in itertools.zip_longest(*args, fillvalue=None):\n File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 155, in process_svn_revisions\n gen_revs = svnrepo.swh_hash_data_per_revision(\n File "/usr/lib/python3/dist-packages/swh/loader/svn/svn.py", line 267, in swh_hash_data_per_revision\n File "/usr/lib/python3/dist-packages/swh/loader/svn/ra.py", line 453, in compute_hashes\n File "/usr/lib/python3/dist-packages/swh/loader/svn/ra.py", line 411, in replay\nUnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 0: invalid start byte\n
    ...
    • Sep 27 2018, 3:47 PM
    • 81 Lines
  • $ python3 -m venv .venv
    $ source .venv/bin/activate
    $ pip install $( bin/pip-swh-packages )
    Obtaining file:///home/tony/work/inria/repo/swh/swh-environment/swh-core
    Obtaining file:///home/tony/work/inria/repo/swh/swh-environment/swh-model
    ...
    • Sep 25 2018, 5:35 PM
    • 68 Lines
  • root@scratch01-euwest ~ # du -sh /srv/hdd/swh-parquet/origin_visit/*
    128M /srv/hdd/swh-parquet/origin_visit/019ed7acd0d64f309dcfb3a977f11480.parquet
    113M /srv/hdd/swh-parquet/origin_visit/05160c9af2bf41379f605259a3e1cb24.parquet
    122M /srv/hdd/swh-parquet/origin_visit/07e9ce57dd544eebb539c376f7c81aaa.parquet
    136M /srv/hdd/swh-parquet/origin_visit/2ccd24604c194508909a3ae7f431d731.parquet
    ...
    • Sep 25 2018, 1:33 PM
    • 36 Lines
  • Table origin_visit...
    0%|▏ | 500000/230786000 [00:10<1:22:12, 46691.30it/s]
    Table origin...
    1%|▍ | 500000/85109300 [00:04<13:35, 103756.07it/s]
    Table occurrence_history...
    ...
    • Sep 20 2018, 2:32 PM
    • 34 Lines
  • Sep 12 14:20:04 worker12 python3[25846]: [2018-09-12 14:20:04,287: ERROR/MainProcess] Task swh.vault.cooking_tasks.SWHCookingTask[a129a228-79b1-47e1-8c37-70e913c1f544] raised unexpected: ConnectionResetError(104, 'Connection reset by peer')
    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 240, in trace_task
        R = retval = fun(*args, **kwargs)
      File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 438, in __protected_call__
    ...
    • Sep 19 2018, 1:59 PM
    • 38 Lines
  • begin;
    create or replace function swh_count_total_indexes_size()
    returns bigint
    language plpgsql
    ...
    • Sep 6 2018, 2:43 PM
    • 27 Lines
  • logstash0: Error: Could not update: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold --force-yes install logstash=6.3.2' returned 100: Reading package lists...
    logstash0: Building dependency tree...
    logstash0: Reading state information...
    logstash0: E: Version '6.3.2' for 'logstash' was not found
    logstash0: Error: /Stage[main]/Profile::Logstash/Package[logstash]/ensure: change from 1:6.3.2-1 to 6.3.2 failed: Could not update: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--$
    ...
    • Aug 31 2018, 10:10 AM
    • 31 Lines
  • #+BEGIN_SRC emacs-lisp
    (defun swh-percent (current total)
      (* 100 (/ current (* total 1.0))))
    ...
    • Aug 24 2018, 11:49 AM
    • 25 Lines
  • alpha
    Asuka
    Automiko
    appReviewToSlack
    aio2gis
    ...
    • Aug 22 2018, 4:03 PM
    • 402 Lines
  • allspark
    archiveorg
    audit
    astar
    apicheckr
    ...
    • Aug 22 2018, 2:53 PM
    • 378 Lines
  • {'info': {'author': 'Nathan Harrington',
    'author_email': 'nharrington@wasatchphotonics.com',
    'bugtrack_url': None,
    'classifiers': [],
    'description': 'UNKNOWN',
    ...
    • Aug 2 2018, 3:07 PM
    • 96 Lines
  • {
    "info": {
    "author": "bernardfrk",
    "author_email": "bernard.frk@gmail.com",
    "bugtrack_url": null,
    ...
    • Aug 2 2018, 3:03 PM
    • 95 Lines
  • python3
    Python 3.6.6 (default, Jun 27 2018, 14:44:17)
    [GCC 8.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import logging; logging.basicConfig(level=logging.DEBUG); from swh.loader.pypi.tasks import LoadPyPiTsk; LoadPyPiTsk().run('arrow', 'https://pypi.org/pypi/arrow/', project_metadata_url='https://pypi.org/pypi/arrow/json')
    ...
    • Aug 1 2018, 2:29 PM
    • 309 Lines