Paste Active Pastes
  • Case with some edge cases:
    ```
    $ opam show --color never --normalise --root $PWD -f all-versions opam-state
    2.0~alpha5 2.0.0~beta 2.0.0~beta3 2.0.0~beta3.1 2.0.0~beta5 2.0.0~rc 2.0.0~rc2 2.0.0~rc3 2.0.0 2.0.1 2.0.2 2.0.3 2.0.4 2.0.5 2.0.6 2.0.7 2.0.8 2.0.9 2.1.0~beta2 2.1.0~beta4 2.1.0~rc 2.1.0~rc2 2.1.0
    ```
    ...
    • Fri, Sep 24, 4:14 PM
    • 78 Lines
  • visit_type=hg; sleep=300; while true; do for policy in never_visited_oldest_update_first already_visited_order_by_lag ; do echo "$(date) scheduling $visit_type origins with policy ${policy}";
    SWH_CONFIG_FILENAME=/etc/softwareheritage/scheduler/listener-runner.yml swh scheduler -C /etc/softwareheritage/scheduler/listener-runner.yml origin send-to-celery --policy $policy $visit_type;
    echo "$(date) sleep $sleep" ; sleep $sleep; done; done
    • Thu, Sep 23, 11:11 AM
    • 3 Lines
  •  ddouard  (e) swh   master … 2  ~/s/s/swh-web  cat swh/web/django_db_backend/base.py
    from django.db.backends.postgresql.base import * # NOQA
    from django.db.backends.postgresql.base import DatabaseWrapper as _DatabaseWrapper
    # dirty hack to allow using a postgresql:// libpq connection URI as db name...
    ...
    • Wed, Sep 22, 6:15 PM
    • 12 Lines
  • - First run of the foss heptapod origins scheduled
    - They failed due to a dangling configuration key from the old mercurial loader (fixed)
    -> all origins have failed
    -> they now have their entry in the origin-visit-stats table
    -> scheduler metrics are updated
    ...
    • Wed, Sep 22, 11:37 AM
    • 413 Lines
  • $ pwd
    /home/tony/scratch/opam/test-opam-root
    $ opam init --reinit --bare --no-setup --root $PWD opam.ocaml.org https://opam.ocaml.org
    $ opam repository add --root $PWD satysfi-external https://github.com/gfngfn/satysfi-external-repo.git
    $ opam repository add --root $PWD opam-windows-repository https://github.com/vouillon/opam-windows-repository.git
    ...
    • Tue, Sep 21, 1:54 PM
    • 56 Lines
  • tony  yavin4  ~  %  opam init --reinit --bare --no-setup --root ~/opamtest instance-foo https://opam.ocaml.org
    [NOTE] Will configure from built-in defaults.
    Checking for available remotes: rsync and local, git, mercurial, darcs. Perfect!
    <><> Fetching repository information ><><><><><><><><><><><><><><><><><><><><><>
    ...
    • Mon, Sep 20, 12:30 PM
    • 8 Lines
  • ```
    swhworker@worker17:~$ time SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_oneshot.yml swh loader run mercurial_from_disk https://foss.heptapod.net/heptapod/heptapod
    INFO:swh.loader.mercurial.LoaderFromDisk:Load origin 'https://foss.heptapod.net/heptapod/heptapod' with type 'hg'
    applying clone bundle from https://cellar-c2.services.clever-cloud.com/heptapod-foss-clonebundles/heptapod/heptapod-2020-07-22-zstd-v2.hg
    adding changesets
    ...
    • Fri, Sep 17, 11:13 AM
    • 60 Lines
  • diff --git a/swh/lister/gitlab/lister.py b/swh/lister/gitlab/lister.py
    index 5937256..074906d 100644
    --- a/swh/lister/gitlab/lister.py
    +++ b/swh/lister/gitlab/lister.py
    @@ -203,10 +203,11 @@ class GitLabLister(Lister[GitLabListerState, PageResult]):
    ...
    • Thu, Sep 16, 5:47 PM
    • 29 Lines
  • - worker0.staging is running within a venv with D6240 (more extid filtering) and D6268 (build snapshot)
    - worker17 is running with the current latest mercurial packaged (no optim)
    But the filtering is still happening in the mercurial loader (without D6275)
    # tl; dr
    ...
    • Thu, Sep 16, 4:04 PM
    • 136 Lines
  • vagrant up test
    Bringing machine 'test' up with 'libvirt' provider...
    ==> test: Creating image (snapshot of base box volume).
    ==> test: Creating domain with the following settings...
    ==> test: -- Name: puppet-environment_test
    ...
    • Wed, Sep 15, 5:26 PM
    • 62 Lines
  • cqlsh:swh> DESCRIBE extid
    CREATE TABLE swh.extid (
    extid_type ascii,
    extid blob,
    ...
    • Wed, Sep 15, 2:23 PM
    • 28 Lines
  • seq 2 8 | parallel -t ssh root@parasilo-{} nodetool flush
    seq 2 8 | parallel -t ssh root@parasilo-{} systemctl stop cassandra
    seq 2 8 | parallel -t ssh root@parasilo-{} sync
    seq 2 8 | xargs -t -n1 -i{} ssh root@parasilo-{} 'echo 3 > /proc/sys/vm/drop_caches'
    seq 2 8 | parallel -t ssh root@parasilo-{} systemctl start cassandra
    • Wed, Sep 15, 12:04 PM
    • 5 Lines
    • Bash Scripting
  • from swh.storage import get_storage
    from swh.model.hashutil import hash_to_bytes
    import sys
    import time
    ...
    • Wed, Sep 15, 11:18 AM
    • 31 Lines
    • Python
  • 10:36 guest@softwareheritage => select type, count(*) from origin_visit_status where status='full' and snapshot is null group by type;
    type │ count
    ──────┼───────
    git │ 1
    hg │ 62959
    ...
    • Wed, Sep 15, 11:05 AM
    • 9 Lines
  • (swh) ✘-2 ~/swh/swh-environment/swh-loader-mercurial [master|✔]
    15:05 $ tox -r
    GLOB sdist-make: /home/anlambert/swh/swh-environment/swh-loader-mercurial/setup.py
    black recreate: /home/anlambert/swh/swh-environment/swh-loader-mercurial/.tox/black
    black installdeps: black==19.10b0
    ...
    • Tue, Sep 14, 3:08 PM
    • 306 Lines
  • ```
    10:51:05 softwareheritage@belvedere:5432=> select * from origin o inner join origin_visit_status ovs on o.id=ovs.origin where url = 'https://github.com/CocoaPods/Specs' order by date desc limit 10;
    +---------+------------------------------------+---------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
    | id | url | origin | visit | date | status | metadata | snapshot | type |
    ...
    • Tue, Sep 14, 10:53 AM
    • 41 Lines
  • [2] is the log used to extract the following list
    ```
    $ gzip -dc [2] | awk '{print $13" "$15 }' | sed -e 's/\\nTraceback//' | sort | uniq
    ```
    ...
    • Mon, Sep 13, 6:10 PM
    • 258 Lines
  • See [1] for the full logs filtered on the command:
    ```
    $ gzip -dc [1] | awk '{print $13" "$15 }' | sed -e 's/\\nTraceback//' | sort | uniq
    ```
    ...
    • Mon, Sep 13, 5:59 PM
    • 357 Lines
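
The `gzip -dc ... | awk ... | sed ... | sort | uniq` pipeline used in the two pastes above can be sketched in Python. This is only an illustration: the function names are made up here, the log path is whatever `[1]` or `[2]` refers to, and Python's `sorted` orders by code point, which may differ from the locale-aware order of `sort`.

```python
import gzip
from typing import Iterable, List


def extract_unique_pairs(lines: Iterable[str]) -> List[str]:
    """Mimic: awk '{print $13" "$15}' | sed 's/\\nTraceback//' | sort | uniq."""
    seen = set()
    for line in lines:
        fields = line.split()  # whitespace-separated fields, like awk's default
        # awk prints an empty string for a missing field; do the same here.
        f13 = fields[12] if len(fields) > 12 else ""
        f15 = fields[14] if len(fields) > 14 else ""
        # The sed expression removes the first literal backslash + "nTraceback".
        seen.add(f"{f13} {f15}".replace("\\nTraceback", "", 1))
    return sorted(seen)


def extract_from_gzip(path: str) -> List[str]:
    """Read a gzip-compressed log (hypothetical path) and apply the filter."""
    with gzip.open(path, "rt", errors="replace") as fh:
        return extract_unique_pairs(fh)
```

Keeping the filtering in `extract_unique_pairs` makes it possible to try the logic on a few sample lines before pointing `extract_from_gzip` at a large log file.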
  • 00959a167bd98452c98ce73382f4b42179d53d32
    00a867beb2ad8e203f242e9843d2e88de0856cda
    028e9890a9287b35851c48ca351641743542d030
    030a51a49b3239769928872be9ac6d435ab14a61
    036594a6bbec926c21fa073e2404a5f760d35a43
    ...
    • Fri, Sep 10, 6:08 PM
    • 166 Lines
  • diff --git a/swh/storage/proxies/filter.py b/swh/storage/proxies/filter.py
    index c28b3df3..0693aede 100644
    --- a/swh/storage/proxies/filter.py
    +++ b/swh/storage/proxies/filter.py
    @@ -9,7 +9,10 @@ from typing import Dict, Iterable, List, Set
    ...
    • Tue, Sep 7, 10:10 AM
    • 79 Lines
  • [2021-09-03 13:38:39,438: ERROR/ForkPoolWorker-21] Bundle cooking failed.
    Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/swh/vault/cookers/base.py", line 130, in cook
    self.prepare_bundle()
    File "/usr/lib/python3/dist-packages/swh/vault/cookers/git_bare.py", line 145, in prepare_bundle
    ...
    • Fri, Sep 3, 5:41 PM
    • 61 Lines
  • In [8]: from swh.storage import get_storage
    In [9]: s = get_storage('remote', url='http://moma.internal.softwareheritage.org:5002/')
    In [10]: r, = s.revision_get([bytes.fromhex('40b469ef825bec99054bec0fcaec224abe2e9cef')])
    ...
    • Fri, Sep 3, 5:38 PM
    • 11 Lines
    • Python
  • [2021-09-03 14:41:41,875: ERROR/ForkPoolWorker-17] Bundle cooking failed.
    Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/swh/vault/cookers/base.py", line 130, in cook
    self.prepare_bundle()
    File "/usr/lib/python3/dist-packages/swh/vault/cookers/git_bare.py", line 145, in prepare_bundle
    ...
    • Fri, Sep 3, 5:29 PM
    • 35 Lines
  • Sep 03 12:33:12 worker12 python3[2525920]: [2021-09-03 12:33:12,243: INFO/MainProcess] Received task: swh.vault.cooking_tasks.SWHCookingTask[49a1a587-113d-4a47-b2f0-8a3345f7a344]
    Sep 03 12:33:12 worker12 python3[2614585]: Initialized empty Git repository in /tmp/swh-vault-gitbare-hfnujjye/clone.git/
    Sep 03 12:33:12 worker12 python3[2525940]: [2021-09-03 12:33:12,975: WARNING/ForkPoolWorker-16] Retrying RPC call
    Sep 03 12:33:13 worker12 python3[2525940]: [2021-09-03 12:33:13,191: WARNING/ForkPoolWorker-16] Retrying RPC call
    ...
    • Fri, Sep 3, 2:41 PM
    • 48 Lines
  • 2021-04-30 10:40:44,401 __main__ ERROR Could not parse revision metadata 69b3c7915fe7301980851946f8f8d32912359443
    Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/swh/storage/migrate_extrinsic_metadata.py", line 1161, in main
    handle_row(row, storage, deposit_cur, dry_run)
    File "/usr/lib/python3/dist-packages/swh/storage/migrate_extrinsic_metadata.py", line 1076, in handle_row
    ...
    • Fri, Sep 3, 2:10 PM
    • 4,067 Lines
  • 12:31:24 softwareheritage-scheduler@belvedere:5432=> select l.name, l.instance_name, sm.visit_type,
    softwareheritage-scheduler-> extract(epoch from sm.last_update) as last_update,
    softwareheritage-scheduler-> sm.origins_known, sm.origins_enabled, sm.origins_never_visited,
    softwareheritage-scheduler-> sm.origins_with_pending_changes
    softwareheritage-scheduler-> from scheduler_metrics sm
    ...
    • Fri, Sep 3, 9:35 AM
    • 62 Lines
  • swh-debian-build-unstable
    ++ git branch
    ++ grep '*'
    ++ cut '-d ' -f2
    + CURRENT_BRANCH=debian/unstable-swh
    ...
    • Wed, Sep 1, 4:38 PM
    • 2,025 Lines
  • unstable:
    ```
    $ swh-debian-build-unstable
    ++ git branch
    ++ grep '*'
    ...
    • Wed, Sep 1, 3:18 PM
    • 1,339 Lines
  • diff --git a/setup.py b/setup.py
    index d72b153..a7100a1 100755
    --- a/setup.py
    +++ b/setup.py
    @@ -74,8 +74,10 @@ def finalize_options(self):
    ...
    • Tue, Aug 31, 2:38 PM
    • 53 Lines
    • Diff
  • .
    ├── AUTHORS
    ├── CODE_OF_CONDUCT.md
    ├── CONTRIBUTORS
    ├── debian
    ...
    • Tue, Aug 31, 2:31 PM
    • 98 Lines
  • sbuild (Debian sbuild) 0.81.2 (31 January 2021) on mirzakhani.olasd.eu
    +==============================================================================+
    | swh-search 0.11.3-2~swh1 (amd64) Tue, 31 Aug 2021 12:28:45 +0000 |
    +==============================================================================+
    ...
    • Tue, Aug 31, 2:30 PM
    • 1,593 Lines
  • origin(url) {
    url
    snapshots
    }
    ...
    • Tue, Aug 31, 10:55 AM
    • 39 Lines
    • JSON
  • ... $ cd swh-loader-core
    $ ...  master $$ arc patch D6158
    $ ...  arcpatch-D6158 $ # edit swh/loader/package/loader.py and add some breakpoint
    $ ...  arcpatch-D6158 $ grep -B25 -A2 pdb swh/loader/package/loader.py
    ...  arcpatch-D6158 $ pytest --log-level=debug --ff -x -vv -s swh/loader/package/jar/tests/test_jar.py -k test_jar_visit_with_release_artifact_no_prior_visit
    ...
    • Tue, Aug 31, 9:52 AM
    • 30 Lines
  • {
    viewer {
    login
    }
    organization(login: "rails") {
    ...
    • Mon, Aug 30, 5:18 PM
    • 12 Lines
  • {
    search(query: "org:rails", type: REPOSITORY, first: 100) {
    repositoryCount
    edges {
    node {
    ...
    • Mon, Aug 30, 5:17 PM
    • 16 Lines
    • JSON
  • (swh) boris@debian:~/swh-environment/swh-loader-core/swh/loader/package/jar$ pytest -v .
    ================================================ test session starts =================================================
    platform linux -- Python 3.7.3, pytest-6.2.4, py-1.10.0, pluggy-0.13.1 -- /home/boris/.virtualenvs/swh/bin/python3
    cachedir: .pytest_cache
    hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/boris/swh-environment/swh-loader-core/swh/loader/package/jar/.hypothesis/examples')
    ...
    • Mon, Aug 30, 10:24 AM
    • 369 Lines
    • Python
  • commit 4f000e884c3010c2bb04bbb516aaf0ba234ccf4f
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date: Fri Jun 28 14:57:57 2019 +0200
    [WIP] graphql API
    ...
    • Mon, Aug 30, 10:05 AM
    • 529 Lines
    • Diff
  • Manual installation so far.
    es_open_unfreeze_from_journalctl.sh:
    ```
    root@logstash0:~# cat /usr/local/bin/es_open_unfreeze_from_journalctl.sh
    ...
    • Aug 27 2021, 2:37 PM
    • 77 Lines
  • 11:54:44 softwareheritage-scheduler@belvedere:5432=> select status, count(*) from task where type='load-git' and priority is null group by status;
    +------------------------+-----------+
    | status | count |
    +------------------------+-----------+
    | next_run_not_scheduled | 17 |
    ...
    • Aug 26 2021, 12:31 PM
    • 19 Lines
  • diff --git a/setup.py b/setup.py
    index 3929a46..bf3108d 100755
    --- a/setup.py
    +++ b/setup.py
    @@ -132,8 +132,8 @@ def generate_parser(dest_path):
    ...
    • Aug 25 2021, 1:57 PM
    • 14 Lines
    • Diff
  • function updateVaultItemList(vaultUrl, vaultItems) {
    window.localStorage.setItem('swh-vault-cooking-tasks', JSON.stringify(vaultItems));
    return cy.visit(vaultUrl);
    }
    ...
    • Aug 24 2021, 3:42 PM
    • 11 Lines
    • Javascript
  • commit 9a4422102ff79124d637b80aab0fdecfff0169e0 (HEAD -> prepare-vault)
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date: Tue Aug 24 10:19:22 2021 +0200
    vault.spec.js: Remove vaultItems from global variables
    ...
    • Aug 24 2021, 10:52 AM
    • 64 Lines
    • Diff
  • root@belvedere:~# puppet agent --enable; puppet agent --test --noop; puppet agent --disable
    Info: Using configured environment 'production'
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Retrieving locales
    ...
    • Aug 23 2021, 5:38 PM
    • 44 Lines
  • $ swh-debian-build-stable
    ++ git branch
    ++ grep '*'
    ++ cut '-d ' -f2
    + CURRENT_BRANCH=debian/buster-swh
    ...
    • Aug 23 2021, 3:09 PM
    • 955 Lines
  • P1127 error
    Storing in /Users/vanishaagrawal/Library/Application Support/ActivityTracker-nodejs/activity
    (node:15864) UnhandledPromiseRejectionWarning: Error: spawn /snapshot/Cross-Platform-Activity-Tracker/node_modules/.pnpm/active-win@7.6.1/node_modules/active-win/main ENOENT
    at Process.ChildProcess._handle.onexit (internal/child_process.js:269:19)
    at onErrorNT (internal/child_process.js:467:16)
    at processTicksAndRejections (internal/process/task_queues.js:82:21)
    ...
    • Aug 21 2021, 2:51 PM
    • 33 Lines
  • search:
    cls: elasticsearch
    hosts:
    - http://127.0.0.1:9200/
    • Aug 20 2021, 10:59 AM
    • 4 Lines
  • Attaching to docker_swh-indexer-journal-client_1
    swh-indexer-journal-client_1 | Using pip from /srv/softwareheritage/venv/bin/pip
    swh-indexer-journal-client_1 | Installed Python packages:
    swh-indexer-journal-client_1 | Package Version
    swh-indexer-journal-client_1 | --------------------- -----------
    ...
    • Aug 20 2021, 10:49 AM
    • 339 Lines
  • #!/bin/zsh
    #set -x
    setopt shwordsplit
    cat >/tmp/script.py <<EOF
    ...
    • Aug 18 2021, 2:37 PM
    • 57 Lines
    • Bash Scripting
  • insert into task_type(
    type,
    description,
    backend_name,
    default_interval, min_interval, max_interval, backoff_factor,
    ...
    • Aug 18 2021, 11:22 AM
    • 12 Lines
    • SQL
  • import sys
    begin=0
    end=0x1000000
    #partitions=10
    ...
    • Aug 17 2021, 12:34 PM
    • 35 Lines
    • Python
  • #!/bin/zsh
    # From
    setopt shwordsplit
    ...
    • Aug 17 2021, 12:33 PM
    • 28 Lines
    • Bash Scripting
  • diff --git a/swh/web/settings/common.py b/swh/web/settings/common.py
    index ac6e4878..d4bb93e3 100644
    --- a/swh/web/settings/common.py
    +++ b/swh/web/settings/common.py
    @@ -12,6 +12,7 @@ import os
    ...
    • Aug 16 2021, 12:37 PM
    • 27 Lines
  • Aug 09 15:43:53 deposit python3[886]: 2021-08-09 15:43:53 [886] swh.deposit.auth:WARNING Error during cache token retrieval: Signature has expired.
    Aug 09 15:43:53 deposit python3[886]: 2021-08-09 15:43:53 [886] gunicorn.access:INFO 127.0.0.1 - hal-preprod [09/Aug/2021:15:43:53 +0000] "POST /1/hal-preprod/ HTTP/1.1" 201 1358 "-" "-"
    Aug 09 15:43:54 deposit python3[896]: 2021-08-09 15:43:54 [896] gunicorn.access:INFO 127.0.0.1 - hal-preprod [09/Aug/2021:15:43:54 +0000] "POST /1/hal-preprod/483/media/ HTTP/1.1" 201 1384 "-" "-"
    Aug 09 15:48:54 deposit python3[896]: 2021-08-09 15:48:54 [896] swh.deposit.auth:WARNING Error during cache token retrieval: Signature has expired.
    Aug 09 15:53:55 deposit python3[881]: 2021-08-09 15:53:55 [881] swh.deposit.auth:WARNING Error during cache token retrieval: Signature has expired.
    ...
    • Aug 10 2021, 5:05 PM
    • 29 Lines
  • diff --git a/swh/storage/cassandra/cql.py b/swh/storage/cassandra/cql.py
    index 89662459..d4941c70 100644
    --- a/swh/storage/cassandra/cql.py
    +++ b/swh/storage/cassandra/cql.py
    @@ -25,6 +25,7 @@
    ...
    • Aug 6 2021, 6:52 PM
    • 85 Lines
    • Diff
  • ```
    16:09 <labor[m]> ardumont: could you give me the full list of opam packages + versions ending up with a 404 ?
    16:32 <+ardumont> my best approximation for this would be origins whose status ended up in an empty snapshot
    ```
    ...
    • Aug 6 2021, 4:36 PM
    • 1,023 Lines
  • diff --git a/swh/provenance/postgresql/provenancedb.py b/swh/provenance/postgresql/provenancedb.py
    index 7276daf..8c626c5 100644
    --- a/swh/provenance/postgresql/provenancedb.py
    +++ b/swh/provenance/postgresql/provenancedb.py
    @@ -304,10 +304,8 @@ class ProvenanceDB:
    ...
    • Aug 6 2021, 3:21 PM
    • 38 Lines
  • --- /home/dev/.local/lib/python3.7/site-packages/dulwich/pack.py 2021-08-04 17:01:37.362133450 +0200
    +++ pack.py 2021-08-04 17:07:30.875342007 +0200
    @@ -1439,7 +1439,8 @@
    # Unlike PackData.get_object_at, there is no need to cache offsets as
    # this approach by design inflates each object exactly once.
    ...
    • Aug 4 2021, 5:02 PM
    • 12 Lines
    • Diff
  • swhwebapp@moma:~$ export DJANGO_SETTINGS_MODULE=swh.web.settings.production
    swhwebapp@moma:~$ time /usr/bin/django-admin refresh_savecodenow_statuses
    Successfully updated 4 save request(s).
    real 0m4.973s
    ...
    • Aug 3 2021, 5:52 PM
    • 49 Lines
  • (ve) swhscheduler@saatchi:~$ while true; do echo "$(date) update metrics"; time SWH_CONFIG_FILENAME=/etc/softwareheritage/scheduler/backend.yml swh scheduler --config-file /etc/softwareheritage/scheduler/backend.yml origin update-metrics | grep "real "; sleep 3600; done
    Thu Jul 29 12:36:48 UTC 2021 update metrics
    real 12m1.651s
    user 0m0.609s
    ...
    • Aug 3 2021, 2:18 PM
    • 501 Lines
  • To have real prometheus data on the vagrant vm
    - ssh -g -L 9090:192.168.100.29:9090 pergamon
    - On the vm, edit the `/etc/prometheus/prometheus.yml`
    ```
    ...
    • Aug 2 2021, 4:29 PM
    • 33 Lines
  • * Replication
    ** main db: publication
    connect to main db softwareheritage:
    ...
    • Aug 2 2021, 9:53 AM
    • 29 Lines
  • def stream_results_optional(
    f: Callable[..., Optional[PagedResult[TResult, TToken]]], *args, **kwargs
    ) -> Optional[Iterable[TResult]]:
    """Like stream_results(), but for functions ``f`` that return an Optional.
    ...
    • Jul 30 2021, 3:31 PM
    • 13 Lines
    • Python
  • def stream_results_optional(
    f: Callable[..., Optional[PagedResult[TResult, TToken]]], *args, **kwargs
    ) -> Optional[Iterable[TResult]]:
    """Like stream_results(), but for functions ``f`` that return an Optional.
    ...
    • Jul 30 2021, 3:21 PM
    • 23 Lines
    • Python
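
The two `stream_results_optional` pastes above are truncated after the docstring, so the following is only a guess at what such a helper might look like, not the code from the pastes or from any swh package. `PagedResult` is redefined locally as a stand-in with assumed `results` and `next_page_token` fields, and `fake_get` is a made-up paginated function used for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Generic, Iterable, Iterator, List, Optional, TypeVar

TResult = TypeVar("TResult")
TToken = TypeVar("TToken")


@dataclass
class PagedResult(Generic[TResult, TToken]):
    """Local stand-in: one page of results plus an opaque continuation token."""

    results: List[TResult] = field(default_factory=list)
    next_page_token: Optional[TToken] = None


def stream_results_optional(
    f: Callable[..., Optional[PagedResult[TResult, TToken]]], *args, **kwargs
) -> Optional[Iterable[TResult]]:
    """Return None if the first call to f returns None, else iterate all pages."""
    if "page_token" in kwargs:
        raise TypeError('stream_results_optional takes no "page_token" argument')
    first = f(*args, page_token=None, **kwargs)
    if first is None:
        return None  # decided eagerly, so the caller can test for None

    def pages() -> Iterator[TResult]:
        page = first
        while True:
            yield from page.results
            if page.next_page_token is None:
                return
            page = f(*args, page_token=page.next_page_token, **kwargs)
            if page is None:  # assumption: treat a vanished object as the end
                return

    return pages()


def fake_get(key: str, page_token: Optional[int] = None) -> Optional[PagedResult]:
    """Made-up paginated getter: pages of two items, None for unknown keys."""
    data = {"a": [1, 2, 3, 4, 5]}
    if key not in data:
        return None
    start = page_token or 0
    end = start + 2
    return PagedResult(data[key][start:end], end if end < len(data[key]) else None)
```

Fetching the first page before building the generator is the design choice that matters here: a plain generator function could not return `None` without the caller first iterating it.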
  • swhworker@worker16:~$ base_dir=/srv/storage/space/mirrors/boatbucket
    swhworker@worker16:~$ nb_origins=10
    swhworker@worker16:~$ head -n$nb_origins $base_dir/mapping-to-repos.txt | while read dir url; do
    > repo_dir="$base_dir/$dir"
    > visit_date=`stat -c %z $repo_dir/.hg/blackbox.log | sed -E 's/ \+0000/+0000/'`
    ...
    • Jul 30 2021, 12:32 PM
    • 50 Lines
  • swhworker@worker01:~$ SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_high_priority.yml swh --log-level DEBUG loader run mercurial_from_disk https://foss.heptapod.net/pypy/cffi
    DEBUG:swh.loader.cli:ctx: <click.core.Context object at 0x7fefbe8a3278>
    DEBUG:swh.core.config:Loading config file /etc/softwareheritage/loader_high_priority.yml
    DEBUG:swh.loader.cli:config_file: /etc/softwareheritage/loader_high_priority.yml
    DEBUG:swh.loader.cli:config:
    ...
    • Jul 30 2021, 12:15 PM
    • 111 Lines
  • def test_load_new_extid_should_be_eventful(
    swh_storage, datadir, tmp_path
    ):
    """Changing the extid version should make loaders ignore existing extids,
    and load the repo again."""
    ...
    • Jul 29 2021, 11:11 AM
    • 26 Lines
    • Python
  • def test_load_new_extid_should_be_eventful(
    swh_storage, datadir, tmp_path
    ):
    """Changing the extid version should make loaders ignore existing extids,
    and load the repo again."""
    ...
    • Jul 29 2021, 10:48 AM
    • 23 Lines
    • Python
  • root@pergamon:~# clush -b -w @swh-workers "systemctl status swh-worker@loader_git" | grep -i linux
    Jul 28 09:22:30 worker01 python3[116142]: [2021-07-28 09:22:30,504: INFO/ForkPoolWorker-400] Load origin 'https://git.launchpad.net/~kerneltoast/+git/bionic-linux-hwe' with type 'git'
    Jul 28 10:05:11 worker02 python3[54928]: [2021-07-28 10:05:11,162: INFO/ForkPoolWorker-229] Load origin 'https://git.launchpad.net/~arighi/+git/xenial-linux-kvm' with type 'git'
    dulwich.errors.GitProtocolError: unexpected http resp 401 for https://git.launchpad.net/~arighi/+git/xenial-linux-kvm/info/refs?service=git-upload-pack
    swh.loader.exception.NotFound: unexpected http resp 401 for https://git.launchpad.net/~arighi/+git/xenial-linux-kvm/info/refs?service=git-upload-pack
    ...
    • Jul 28 2021, 12:19 PM
    • 31 Lines
  • $ sha512sum $SWH_CI_ENVIRONMENT_HOME/swh-jenkins-jobs/keyrings/cassandra.asc
    e1f7a62808fe1828ffe815aa6fa1001661b0d06ba3c7d7649d8dbdf77549cdefaf83f846dd002bc2fdd68f82256d7758bf4e5a1626a29a2afecdf3575e2c6e2b /home/tony/work/inria/repo/swh/ci-environment/swh-jenkins-jobs/keyrings/cassandra.asc
    $ sha512sum /home/tony/work/inria/resources/cassandra.asc
    e1f7a62808fe1828ffe815aa6fa1001661b0d06ba3c7d7649d8dbdf77549cdefaf83f846dd002bc2fdd68f82256d7758bf4e5a1626a29a2afecdf3575e2c6e2b /home/tony/work/inria/resources/cassandra.asc
    ...
    • Jul 28 2021, 11:11 AM
    • 98 Lines
  • #BVGraph properties
    #Tue Jul 27 14:28:26 UTC 2021
    bitsforreferences=0
    avgbitsforintervals=NaN
    graphclass=it.unimi.dsi.big.webgraph.BVGraph
    ...
    • Jul 27 2021, 5:20 PM
    • 35 Lines
  • 1. https://github.com/pushrax/round660
    People bruteforcing Git commit hashes for a CTF.
    Lowest hash: 000000000001b58ef4d6727f61f4d7f8625feb72
    ...
    • Jul 26 2021, 6:40 PM
    • 80 Lines
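
For context on the "lowest hash" entry above: in a SHA-1 Git repository, an object ID is the SHA-1 of a short header (`<type> <size>` followed by a NUL byte) and then the object's bytes, so changing any byte of a commit changes its ID. The sketch below shows that computation and a naive search over a counter appended to the commit text. The `nonce:` line and both function names are illustrative assumptions, not the method used by the repository listed in the paste.

```python
import hashlib
from typing import Tuple


def git_object_id(obj_type: str, payload: bytes) -> str:
    """SHA-1 over '<type> <size>' + NUL + payload, as Git does for loose objects."""
    header = f"{obj_type} {len(payload)}".encode() + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


def lowest_commit_id(commit_body: bytes, attempts: int) -> Tuple[str, int]:
    """Try `attempts` nonces and return the smallest (hex id, nonce) found."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    best = None
    for nonce in range(attempts):
        # Hypothetical convention: append the counter to the commit message.
        candidate = commit_body + f"\nnonce: {nonce}\n".encode()
        oid = git_object_id("commit", candidate)
        if best is None or oid < best[0]:
            best = (oid, nonce)
    return best
```

Because hex digests have a fixed length, comparing them as strings gives the same order as comparing the underlying numbers.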
  • ❯ pytest .
    =============================================================== test session starts ================================================================
    platform linux -- Python 3.8.10, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
    rootdir: /home/shivendu/projects/swh/swh-environment/swh-search, configfile: pytest.ini
    ...
    • Jul 26 2021, 5:42 PM
    • 19 Lines
  • UTC+2 time
    -- Mon, 26 Jul 2021 --
    03:27 <+swhbot> icinga PROBLEM: service linux-ssh on worker03.softwareheritage.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
    03:30 <+swhbot> icinga PROBLEM: service load on worker03.softwareheritage.org is WARNING: WARNING - load average: 36.25, 23.93, 12.73
    ...
    • Jul 26 2021, 2:34 PM
    • 60 Lines
  • name | instance_name | current_state
    --------+------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    gitlab | gitlab | {"last_seen_next_link": "https://gitlab.com/api/v4/projects?id_after=2114599&membership=false&order_by=id&owned=false&page=1&pagination=keyset&per_page=20&repository_checksum_failed=false&simple=false&sort=asc&starred=false&statistics=false&wiki_checksum_failed=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false"}
    gitlab | gitlab.inria.fr | {"last_seen_next_link": "https://gitlab.inria.fr/api/v4/projects?id_after=31598&membership=false&order_by=id&owned=false&page=1&pagination=keyset&per_page=20&simple=false&sort=asc&starred=false&statistics=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false"}
    gitlab | gitlab.lip6.fr | {"last_seen_next_link": "https://gitlab.lip6.fr/api/v4/projects?id_after=967&membership=false&order_by=id&owned=false&page=1&pagination=keyset&per_page=20&simple=false&sort=asc&starred=false&statistics=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false"}
    ...
    • Jul 23 2021, 4:59 PM
    • 13 Lines
  • swh-lister_1 | [2021-07-22 11:16:22,656: DEBUG/ForkPoolWorker-1] Fetching URL https://gitlab.com/api/v4/projects?id_after=575090&membership=false&order_by=id&owned=false&page=1&pagination=keyset&per_page=20&repository_checksum_failed=false&simple=false&sort=asc&starred=false&statistics=false&wiki_checksum_failed=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false
    swh-lister_1 | [2021-07-22 11:16:23,364: WARNING/ForkPoolWorker-1] Unexpected HTTP status code 502 on https://gitlab.com/api/v4/projects?id_after=575090&membership=false&order_by=id&owned=false&page=1&pagination=keyset&per_page=20&repository_checksum_failed=false&simple=false&sort=asc&starred=false&statistics=false&wiki_checksum_failed=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false: b'<!DOCTYPE html>\n<html>\n<head>\n <meta content="width=device-width, initial-scale=1, maximum-scale=1" name="viewport">\n <title>500 Error - GitLab</title>\n <style>body{color:#666;text-align:center;font-family:Helvetica Neue,Helvetica,Arial,sans-serif;margin:auto;font-size:14px;display:flex;flex-direction:column;align-items:center;justify-content:center}hr{max-width:800px;margin:18px auto;border:0;border-top:1px solid #eee;border-bottom:1px solid #fff}img{max-width:40vw}.container{margin:auto 20px}.cferror_details{list-style-type:none}.cf-error-details h1{color:#456;font-size:20px;font-weight:400;line-height:28px}</style>\n\n\n</head>\n\n<body>\n <h1>\n <img 
src="data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMjEwIiBoZWlnaHQ9IjIxMCIgdmlld0JveD0iMCAwIDIxMCAyMTAiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyI+CiAgPHBhdGggZD0iTTEwNS4wNjE0IDIwMy42NTVsMzguNjQtMTE4LjkyMWgtNzcuMjhsMzguNjQgMTE4LjkyMXoiIGZpbGw9IiNlMjQzMjkiLz4KICA8cGF0aCBkPSJNMTA1LjA2MTQgMjAzLjY1NDhsLTM4LjY0LTExOC45MjFoLTU0LjE1M2w5Mi43OTMgMTE4LjkyMXoiIGZpbGw9IiNmYzZkMjYiLz4KICA8cGF0aCBkPSJNMTIuMjY4NSA4NC43MzQxbC0xMS43NDIgMzYuMTM5Yy0xLjA3MSAzLjI5Ni4xMDIgNi45MDcgMi45MDYgOC45NDRsMTAxLjYyOSA3My44MzgtOTIuNzkzLTExOC45MjF6IiBmaWxsPSIjZmNhMzI2Ii8+CiAgPHBhdGggZD0iTTEyLjI2ODUgODQuNzM0Mmg1NC4xNTNsLTIzLjI3My03MS42MjVjLTEuMTk3LTMuNjg2LTYuNDExLTMuNjg1LTcuNjA4IDBsLTIzLjI3MiA3MS42MjV6IiBmaWxsPSIjZTI0MzI5Ii8+CiAgPHBhdGggZD0iTTEwNS4wNjE0IDIwMy42NTQ4bDM4LjY0LTExOC45MjFoNTQuMTUzbC05Mi43OTMgMTE4LjkyMXoiIGZpbGw9IiNmYzZkMjYiLz4KICA8cGF0aCBkPSJNMTk3Ljg1NDQgODQuNzM0MWwxMS43NDIgMzYuMTM5YzEuMDcxIDMuMjk2LS4xMDIgNi45MDctMi45MDYgOC45NDRsLTEwMS42MjkgNzMuODM4IDkyLjc5My0xMTguOTIxeiIgZmlsbD0iI2ZjYTMyNiIvPgogIDxwYXRoIGQ9Ik0xOTcuODU0NCA4NC43MzQyaC01NC4xNTNsMjMuMjczLTcxLjYyNWMxLjE5Ny0zLjY4NiA2LjQxMS0zLjY4NSA3LjYwOCAwbDIzLjI3MiA3MS42MjV6IiBmaWxsPSIjZTI0MzI5Ii8+Cjwvc3ZnPgo=" alt="GitLab Logo" /><br />\n </h1>\n <div class="container">\n <div class="cf-error-details cf-error-502">\n <h1>Bad gateway</h1>\n <p>The web server reported a bad gateway error.</p>\n <ul>\n <li>Ray ID: 672c41a9985d3316</li>\n <li>Your IP address: 78.192.232.58</li>\n <li>Error reference number: 502</li>\n <li>Cloudflare Location: Paris</li>\n </ul>\n</div>\n\n <hr />\n <p>Please see our <a href="https://status.gitlab.com">status page</a> for more information.</p>\n </div>\n</body>\n</html>\n'
    swh-lister_1 | [2021-07-22 11:16:23,368: ERROR/ForkPoolWorker-1] Task swh.lister.gitlab.tasks.FullGitLabRelister[92b02cda-2ffe-4744-af4b-aac17c219899] raised unexpected: HTTPError('502 Server Error: Bad Gateway for url: https://gitlab.com/api/v4/projects?id_after=575090&membership=false&order_by=id&owned=false&page=1&pagination=keyset&per_page=20&repository_checksum_failed=false&simple=false&sort=asc&starred=false&statistics=false&wiki_checksum_failed=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false')
    swh-lister_1 | Traceback (most recent call last):
    swh-lister_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/celery/app/trace.py", line 450, in trace_task
    ...
    • Jul 22 2021, 2:05 PM
    • 33 Lines
  • anlambert@carnavalet:~/tmp$ cat es_query.sh
    #!/bin/bash
    curl -X POST http://localhost:9200/origin-production/_search?pretty -H 'Content-Type: application/json' -d '
    {
    ...
    • Jul 22 2021, 12:59 PM
    • 87 Lines
  • 2021-07-06 07:07:05,685: svn 493.8169417530298s: 'eventful'
    2021-07-06 07:07:10,109: svn 2.36457402119413s: 'eventful'
    2021-07-06 07:07:15,745: svn 5.628665724769235s: 'eventful'
    2021-07-06 07:10:36,629: svn 200.8752660569735s: 'eventful'
    2021-07-06 07:10:41,017: svn 4.378507147077471s: 'eventful'
    ...
    • Jul 19 2021, 5:29 PM
    • 7,602 Lines
  • diff --git a/package.json b/package.json
    index c39fb88..9349bcf 100644
    --- a/package.json
    +++ b/package.json
    @@ -7,7 +7,7 @@
    ...
    • Jul 16 2021, 11:20 AM
    • 38 Lines
    • Diff
  • softwareheritage-scheduler=> select * from scheduler_metrics;
    lister_id | visit_type | last_update | origins_known | origins_enabled | origins_never_visited | origins_with_pending_changes
    --------------------------------------+------------+-------------------------------+---------------+-----------------+-----------------------+------------------------------
    b678cfc3-2780-4186-9186-d78a14bd4958 | cvs | 2021-07-09 15:47:09.900077+00 | 28622 | 0 | 0 | 0
    7a775770-2b2f-4139-aacb-ad715c022b9d | git | 2021-07-09 15:47:09.900077+00 | 1375 | 1375 | 1307 | 2
    ...
    • Jul 9 2021, 6:23 PM
    • 50 Lines
  • from tree_sitter import Language, Parser
    Language.build_library(
    'build/swh-query-language.so',
    ['.']
    ...
    • Jul 9 2021, 11:19 AM
    • 45 Lines
    • Python
  • module.exports = grammar({
    name: 'swh_search_query_language',
    rules: {
    program: $ => repeat(
    ...
    • Jul 9 2021, 9:31 AM
    • 124 Lines
    • Javascript
  • P1090 swl_ql
    query ::= ( patternFilter | booleanFilter | numericFilter | boundedListFilter | unboundedListFilter | dateFilter )*
    patternFilter ::= ( patternField patternOp patternVal )
    patternField ::= ( 'url' | 'metadata' )
    ...
    • Jul 9 2021, 8:33 AM
    • 58 Lines
  • [
    400,
    "search_phase_execution_exception",
    {
    "error": {
    ...
    • Jul 5 2021, 6:26 PM
    • 56 Lines
    • JSON
  • Result of this query:
    ```
    curl -H 'Content-Type: application/json' http://localhost:9200/origin-read/_search\?pretty -d '{
    "query": {
    "bool": {
    ...
    • Jul 5 2021, 3:53 PM
    • 34 Lines
  • from the server in question:
    ```
    root@tate:~# ssh-keygen -lf /etc/ssh/ssh_host_ecdsa_key.pub | awk '{print $2}'
    SHA256:0gi9tkEMHCDIveSm/VbZhdp8T83PlmWlufyli5OKrvg
    ```
    ...
    • Jul 5 2021, 12:02 PM
    • 12 Lines
  • curl -H 'Content-Type: application/json' http://localhost:9200/origin-read/_search\?pretty -d '{
    "query": {
    "bool": {
    "must": [
    {
    ...
    • Jul 2 2021, 5:40 PM
    • 30 Lines
  • GET origin/_search
    {
    "query": {
    "bool": {
    "must": [
    ...
    • Jul 2 2021, 5:29 PM
    • 30 Lines
  • 22:07 guest@softwareheritage => select extid_type, count(distinct extid) from extid e1 where exists (select 1 from extid e2 where e1.extid = e2.extid and e1.extid_type = e2.extid_type and e1.target < e2.target) group by extid_type;
     extid_type            │ count
    ───────────────────────┼─────────
     cran-sha256           │       7
     hg-nodeid             │ 2691404
    ...
    • Jun 29 2021, 11:24 PM
    • 11 Lines
    • Jun 29 2021, 2:22 PM
    • 1 Line
  • ================================================================================================================== FAILURES ===================================================================================================================
    _________________________________________________________________________________ TestDirectoryCooker.test_directory_simple[cook_extract_directory_git_bare] __________________________________________________________________________________
    self = <swh.vault.tests.test_cookers.TestDirectoryCooker object at 0x7f26f605d748>, git_loader = <function git_loader.<locals>._create_loader at 0x7f26f5f8f9d8>
    cook_extract_directory = <function cook_extract_directory_git_bare at 0x7f26f6299510>
    ...
    • Jun 29 2021, 12:19 PM
    • 28 Lines
  • swh-web_1 | Downloading swh.storage-0.31.0-py3-none-any.whl (272 kB)
    swh-web_1 | INFO: pip is looking at multiple versions of swh-search to determine which version is compatible with other requirements. This could take a while.
    swh-web_1 | Collecting swh.search>=0.2.0
    swh-web_1 | Downloading swh.search-0.9.0-py3-none-any.whl (43 kB)
    swh-web_1 | Downloading swh.search-0.8.0-py3-none-any.whl (41 kB)
    ...
    • Jun 29 2021, 8:59 AM
    • 27 Lines
  • swh:1:dir:00016aedfa0a381d838a2a374a4618335c4ed33e
    swh:1:dir:00017cd05870c9dff67bf3a59deaf599bb52d13b
    swh:1:dir:000403d85d5a5e7fb9f172fa19b3755f6e2ba39f
    swh:1:dir:00040a338b2ad9a5bcad0de5cf9f0ecf8b997c01
    swh:1:dir:00046c7332123653985644a561f4d9406a789844
    ...
    • Jun 28 2021, 5:15 PM
    • 100 Lines
  • # Copyright (C) 2021 The Software Heritage developers
    # See the AUTHORS file at the top-level directory of this distribution
    # License: GNU General Public License version 3, or any later version
    # See top-level LICENSE file for more information
    ...
    • Jun 25 2021, 3:33 PM
    • 49 Lines
  • Jun 22 20:23:34 saatchi swh[2769803]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks load-pypi
    Jun 22 20:26:23 saatchi swh[2769803]: INFO:swh.scheduler.cli.admin.runner:Scheduled 1 tasks
    # 725005 softw...cheduler swhscheduler 192.168.100.210/ 99.7 17.9 0.00B 0.00B 424 h N N active select * from swh_scheduler_grab_ready_tasks( 'list-debian-distribution', '2021-06-06T16:31:13 # <- killed a while ago but still listed somehow
    ...
    • Jun 24 2021, 10:48 AM
    • 6 Lines
  • import os
    from bs4 import BeautifulSoup
    import requests
    ...
    • Jun 21 2021, 2:02 PM
    • 55 Lines
    • Python
  • (swh2) shivendu@swh-self-hosted:~/swh-environment/swh-search$ pytest -sv swh/search/tests/test_journal_client.py
    ================================================== test session starts ==================================================
    platform linux -- Python 3.7.5, pytest-6.2.4, py-1.10.0, pluggy-0.13.1 -- /home/shivendu/.virtualenvs/swh2/bin/python3
    cachedir: .pytest_cache
    hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/shivendu/swh-environment/swh-search/.hypothesis/examples')
    ...
    • Jun 18 2021, 4:19 PM
    • 149 Lines
  • diff --git a/swh/search/tests/test_elasticsearch.py b/swh/search/tests/test_elasticsearch.py
    index c59e173..8c5698b 100644
    --- a/swh/search/tests/test_elasticsearch.py
    +++ b/swh/search/tests/test_elasticsearch.py
    @@ -15,7 +15,7 @@
    ...
    • Jun 17 2021, 11:04 AM
    • 39 Lines
    • Diff
  • diff --git a/swh/search/tests/test_elasticsearch.py b/swh/search/tests/test_elasticsearch.py
    index c59e173..670a32a 100644
    --- a/swh/search/tests/test_elasticsearch.py
    +++ b/swh/search/tests/test_elasticsearch.py
    @@ -15,7 +15,7 @@
    ...
    • Jun 17 2021, 10:34 AM
    • 35 Lines
    • Diff
  • https://issues.sonatype.org/browse/MVNCENTRAL-6804 (registration required)
    I'm currently developing a Maven Central connector for the Software Heritage Foundation [1]. In a nutshell, SWH aims to archive all existing source code in the world and, beyond archiving, to provide useful tools (unique IDs, search, graph-related tools, ...). It's all open source, and many large forges and software systems have already been archived (GitHub, GitLab, npm, PyPI, Debian packages, CRAN, ...).
    [1] https://www.softwareheritage.org/
    ...
    • Jun 15 2021, 2:16 PM
    • 9 Lines
  • P1069 cypress
    it('should redirect to browse when origin url is searched', function() {
    cy.get('#swh-origins-url-patterns')
    .type(origin.url);
    cy.get('.swh-search-icon')
    .click();
    ...
    • Jun 14 2021, 2:35 PM
    • 12 Lines