
Jun 22 2021

anlambert added a revision to T3127: Compute and display distribution of origins by forge: D5910: journal_client: Add origins processing.
Jun 22 2021, 4:50 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task
anlambert added a revision to T3127: Compute and display distribution of origins by forge: D5907: interface: Add get_listers method.
Jun 22 2021, 2:36 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task
anlambert added a comment to T3127: Compute and display distribution of origins by forge.

Nice to see this moving forward!

These entries in the counter log look suspicious, though. They are not origins:

b'atlassian@bitbucket.org' 2
b'taylorhakes@github.com' 2
b'bunnyhero@bitbucket.org' 1
b'dtrebbien@bitbucket.org' 1
b'eldargab@github.com' 1
b'git@github.com' 1
b'schierlm@git.code.sf.net' 1
b'tomakehurst@github.com' 1
b'wenshao@github.com' 1
b'zimbra-mirror@bitbucket.org' 1
Jun 22 2021, 2:05 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task
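
One plausible source for keys such as b'git@github.com' is a counter that keys each origin by the raw netloc of its URL, which keeps any user name embedded in the URL. This is an assumption: the journal client code is not shown in this feed. A minimal Python sketch of the difference between `netloc` and `hostname` (`forge_key` is a hypothetical helper name):

```python
from urllib.parse import urlsplit


def forge_key(origin_url: str) -> str:
    """Return the host of an origin URL, without user info or port.

    Hypothetical helper: urlsplit().netloc keeps "user@host:port",
    while urlsplit().hostname keeps only the lowercased host.
    """
    return urlsplit(origin_url).hostname or ""


if __name__ == "__main__":
    for url in (
        "https://taylorhakes@github.com/some/repo",
        "ssh://git@github.com/some/repo.git",
    ):
        parts = urlsplit(url)
        print(parts.netloc, "->", forge_key(url))
```

Keying on the host alone would fold those entries into the forge they belong to instead of creating one key per embedded user name.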
rdicosmo added a comment to T3127: Compute and display distribution of origins by forge.

Nice to see this moving forward!

Jun 22 2021, 1:59 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task
anlambert added a comment to T3127: Compute and display distribution of origins by forge.

Regarding this, to ease the mapping between a lister and an instance name, we may want to rework the instance names in the scheduler
model (listers table) so that the value is actually the netloc of the origin.

Jun 22 2021, 12:18 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task
ardumont added a comment to T3127: Compute and display distribution of origins by forge.

Great work! Awesome.

Jun 22 2021, 12:16 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task
anlambert added a comment to T3127: Compute and display distribution of origins by forge.

After some analysis, the data we need to properly implement this are:

  • the set of lister names and their instance names in order to organize origins by forge types (gitlab, cgit, sourceforge, ...)
  • a precise or estimated count for the origins listed by a given lister instance
Jun 22 2021, 12:07 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task
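
The two inputs listed above can be combined as in the following sketch. It assumes that each lister instance name equals the host of the origins it lists, which is what the netloc proposal elsewhere in this task would provide. The sample lister records and the function name are hypothetical, not the scheduler's actual model.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical (lister name, instance name) pairs; the instance name is
# assumed to be the host of the forge.
LISTERS = [
    ("gitlab", "gitlab.com"),
    ("cgit", "git.kernel.org"),
    ("sourceforge", "git.code.sf.net"),
]


def count_origins_by_forge_type(origin_urls, listers):
    """Count origins per forge type by matching their host to a lister instance."""
    instance_to_type = {instance: name for name, instance in listers}
    counts = Counter()
    for url in origin_urls:
        host = urlsplit(url).hostname or ""
        counts[instance_to_type.get(host, "unknown")] += 1
    return counts


if __name__ == "__main__":
    origins = [
        "https://gitlab.com/a/b",
        "https://gitlab.com/c/d",
        "https://git.kernel.org/pub/scm/x.git",
        "https://example.org/repo.git",
    ]
    print(count_origins_by_forge_type(origins, LISTERS))
```

This gives a precise count only for the origins passed in; an estimated count for a whole lister instance would need a different data source.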
vsellier added a comment to T3357: Perform some tests of the cassandra storage on Grid5000.

A table with the possible node counts relative to the replication factor was added to the hedgedoc document: https://hedgedoc.softwareheritage.org/m2MBUViUQl2r9dwcq3-_Nw?both

Jun 22 2021, 9:47 AM · System administration, Storage manager
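
The hedgedoc table is not reproduced in this feed. As general background only, the sketch below shows the usual Cassandra QUORUM arithmetic for a given replication factor; it may or may not match what the table lists, and the function names are illustrative.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must answer a QUORUM read or write: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_replica_losses(replication_factor: int) -> int:
    """Replicas of a given token range that can be down while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (1, 2, 3, 5):
        print(rf, quorum(rf), tolerated_replica_losses(rf))
```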

Jun 18 2021

vsellier added a comment to T3357: Perform some tests of the cassandra storage on Grid5000.

Several tests were executed with Cassandra nodes on the parasilo cluster [1].
The configuration was kept identical across runs to calibrate them:

  • ZFS is used to manage the datasets
  • the commitlogs are on the 200 GB SSD drive
  • the data is on the four 600 GB HDDs configured in RAID0
  • default memory configuration (8 GB, default GC, not G1)
  • Cassandra configuration : [2]
Jun 18 2021, 4:44 PM · System administration, Storage manager
ardumont added a revision to T3388: Create FAQ in docs for users: D5898: docs: Update build to deploy the "users" sphinx instance.
Jun 18 2021, 3:49 PM · Documentation, Roadmap 2021
ardumont added a revision to T3388: Create FAQ in docs for users: D5897: docs: Update tox.ini to build the "users" sphinx instance.
Jun 18 2021, 3:49 PM · Documentation, Roadmap 2021
ardumont added a comment to T3388: Create FAQ in docs for users.

Landed, but some more changes in the build pipeline need to happen.
Currently looking into it...

Jun 18 2021, 3:45 PM · Documentation, Roadmap 2021
ardumont closed T3389: Create FAQ in docs for developers as Resolved.

Landed and deployed on the docs site [1]

Jun 18 2021, 3:44 PM · Documentation, Roadmap 2021
ardumont closed T3389: Create FAQ in docs for developers, a subtask of T3387: Create FAQ in docs, as Resolved.
Jun 18 2021, 3:44 PM · Documentation
ardumont added a subtask for T3082: Improve Save Code Now handling: T3378: Improve save code now status report so the status and browse experience stay consistent.
Jun 18 2021, 2:51 PM · Save Code Now, meta-task, Roadmap 2021, Web app

Jun 17 2021

ardumont added a revision to T3388: Create FAQ in docs for users: D5888: users faq: Define faq with categories.
Jun 17 2021, 5:59 PM · Documentation, Roadmap 2021
ardumont added a revision to T3119: FAQ: D5887: developers faq: Define faq with categories.
Jun 17 2021, 5:57 PM · Community Building
moranegg triaged T3390: Create FAQ in docs for sys-admins as Normal priority.
Jun 17 2021, 2:44 PM · Documentation
moranegg triaged T3389: Create FAQ in docs for developers as Normal priority.
Jun 17 2021, 2:43 PM · Documentation, Roadmap 2021
moranegg renamed T3388: Create FAQ in docs for users from Create FAQ for users to Create FAQ in docs for users.
Jun 17 2021, 2:42 PM · Documentation, Roadmap 2021
moranegg triaged T3388: Create FAQ in docs for users as Normal priority.
Jun 17 2021, 2:42 PM · Documentation, Roadmap 2021
moranegg renamed T3387: Create FAQ in docs from create FAQ in docs to Create FAQ in docs.
Jun 17 2021, 2:41 PM · Documentation
moranegg triaged T3387: Create FAQ in docs as Normal priority.
Jun 17 2021, 2:38 PM · Documentation

Jun 16 2021

vsellier added a comment to T3357: Perform some tests of the cassandra storage on Grid5000.

Some notes on how to perform common actions with cassandra: https://hedgedoc.softwareheritage.org/m2MBUViUQl2r9dwcq3-_Nw

Jun 16 2021, 11:09 AM · System administration, Storage manager

Jun 15 2021

vsellier added a comment to T3357: Perform some tests of the cassandra storage on Grid5000.

The environment can be stopped and rebuilt as long as the disks remain reserved on the servers.

Jun 15 2021, 10:50 AM · System administration, Storage manager
vsellier updated the task description for T3357: Perform some tests of the cassandra storage on Grid5000.
Jun 15 2021, 10:29 AM · System administration, Storage manager
vsellier updated the task description for T3357: Perform some tests of the cassandra storage on Grid5000.
Jun 15 2021, 10:29 AM · System administration, Storage manager

Jun 11 2021

moranegg triaged T3377: Add icon/button in moderation view to go to deposit in new tab as Normal priority.
Jun 11 2021, 5:18 PM · Monitoring, SWORD deposit, Web app
moranegg triaged T3376: Visualize metadata of a deposit in the admin (moderation) view as Normal priority.
Jun 11 2021, 5:13 PM · Monitoring, SWORD deposit, Web app

Jun 10 2021

vsellier added a comment to T3357: Perform some tests of the cassandra storage on Grid5000.

Some status about the automation:

  • Cassandra nodes are ok (os installation, zfs configuration according to the defined environment except a problem during the first initialization with new disks, startup, cluster configuration)
  • swh-storage node is ok (os installation, gunicorn/swh-storage installation and startup)
  • cassandra database initialization :
root@parasilo-3:~#  nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load        Tokens  Owns (effective)  Host ID                               Rack 
UN  172.16.97.3  78.85 KiB   256     31.6%             49d46dd8-4640-45eb-9d4c-b6b16fc954ab  rack1
UN  172.16.97.5  105.45 KiB  256     26.0%             47e99bb4-4846-4e03-a06c-53ea2862172d  rack1
UN  172.16.97.4  98.35 KiB   256     18.1%             e2aeff29-c89a-4c7a-9352-77aaf78e91b3  rack1
UN  172.16.97.2  78.85 KiB   256     24.3%             edd1b72b-4c35-44bd-b7e5-316f41a156c4  rack1
root@parasilo-3:~# cqlsh 172.16.97.3
Connected to swh-storage at 172.16.97.3:9042
[cqlsh 6.0.0 | Cassandra 4.0 | CQL spec 3.4.5 | Native protocol v5]
cqlsh> desc KEYSPACES
Jun 10 2021, 7:02 PM · System administration, Storage manager
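
A check like the `nodetool status` call above can be automated. The following Python sketch parses the node rows and reports whether every node is up and normal (`UN`). It assumes the column layout shown in the output above; the function names are hypothetical and not part of the automation described in this task.

```python
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
UN  172.16.97.3  78.85 KiB   256     31.6%             49d46dd8-4640-45eb-9d4c-b6b16fc954ab  rack1
UN  172.16.97.5  105.45 KiB  256     26.0%             47e99bb4-4846-4e03-a06c-53ea2862172d  rack1
UN  172.16.97.4  98.35 KiB   256     18.1%             e2aeff29-c89a-4c7a-9352-77aaf78e91b3  rack1
UN  172.16.97.2  78.85 KiB   256     24.3%             edd1b72b-4c35-44bd-b7e5-316f41a156c4  rack1
"""


def parse_nodetool_status(text):
    """Extract node rows from `nodetool status` output (layout as in SAMPLE)."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        # Node rows start with a two-letter code: U/D then N/L/J/M.
        if len(fields) < 8 or len(fields[0]) != 2:
            continue
        if fields[0][0] not in "UD" or fields[0][1] not in "NLJM":
            continue
        nodes.append(
            {
                "status": fields[0],
                "address": fields[1],
                "load": " ".join(fields[2:4]),  # value and unit, e.g. "78.85 KiB"
                "tokens": int(fields[4]),
                "owns": fields[5],
                "host_id": fields[6],
                "rack": fields[7],
            }
        )
    return nodes


def all_up_and_normal(nodes):
    """True when at least one node was parsed and all of them are UN."""
    return bool(nodes) and all(node["status"] == "UN" for node in nodes)


if __name__ == "__main__":
    print(all_up_and_normal(parse_nodetool_status(SAMPLE)))
```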

Jun 8 2021

ardumont edited projects for T3082: Improve Save Code Now handling, added: System administration; removed System administrators.
Jun 8 2021, 4:53 PM · Save Code Now, meta-task, Roadmap 2021, Web app

Jun 3 2021

vsellier updated subscribers of T3357: Perform some tests of the cassandra storage on Grid5000.

I played with grid5000 to experiment with how the jobs work and how to initialize the reserved nodes.

Jun 3 2021, 7:30 PM · System administration, Storage manager
ardumont moved T3357: Perform some tests of the cassandra storage on Grid5000 from Backlog to in-progress on the System administration board.
Jun 3 2021, 6:19 PM · System administration, Storage manager

Jun 2 2021

vsellier changed the status of T3357: Perform some tests of the cassandra storage on Grid5000 from Open to Work in Progress.
Jun 2 2021, 6:25 PM · System administration, Storage manager

May 27 2021

ardumont added a comment to T3129: Reliable monitoring of services: for users and for admins .

great ;)

May 27 2021, 11:38 AM · Roadmap 2022, Roadmap 2021, Monitoring, meta-task
vsellier added a comment to T3129: Reliable monitoring of services: for users and for admins .

The save code now queue statistics are now displayed on the status.io page [1] as an example. The data is refreshed every 5 minutes.

May 27 2021, 10:59 AM · Roadmap 2022, Roadmap 2021, Monitoring, meta-task

May 26 2021

vsellier added a revision to T3129: Reliable monitoring of services: for users and for admins : D5787: status.io: push save code now statistics.
May 26 2021, 5:07 PM · Roadmap 2022, Roadmap 2021, Monitoring, meta-task
vlorentz closed T2602: Investigate how to upgrade the schema of the Cassandra storage, a subtask of T2214: Scale-out graph and database storage in production, as Resolved.
May 26 2021, 11:25 AM · meta-task, Roadmap 2022, Roadmap 2021, Storage manager

May 25 2021

vsellier added a comment to T3129: Reliable monitoring of services: for users and for admins .

Metrics can easily be pushed to the status page.
A simple PoC for the save code now requests is available here: https://forge.softwareheritage.org/source/snippets/browse/master/sysadmin/status.io/update_metrics.py

May 25 2021, 9:17 AM · Roadmap 2022, Roadmap 2021, Monitoring, meta-task

May 20 2021

vsellier added a comment to T3129: Reliable monitoring of services: for users and for admins .

From the status.swh.org point of view, status.io provides API endpoints to push metrics. It should be possible to add some metrics (up to 10 with our plan) to expose the behavior of the platform (daily, weekly and monthly statistics).
As a first step, we could expose the number of pending save code now requests and the number of origin visits to have some live data. An example of a status page with metrics: https://status.docker.com/
I'm working on a code snippet to test the feasibility and complexity of the integration.

May 20 2021, 6:07 PM · Roadmap 2022, Roadmap 2021, Monitoring, meta-task
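
The daily, weekly and monthly series mentioned above could be prepared as in the sketch below. The status.io endpoint and payload format are not shown in this feed, so the sketch stops at aggregating samples and hands the result to a caller-supplied `push` function. That function, the bucket sizes and the dictionary keys are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

HOUR = 3600
DAY = 24 * HOUR


def bucket_average(samples, bucket_seconds):
    """Average (unix_timestamp, value) samples per fixed-size time bucket.

    Returns a list of (bucket_start, average) sorted by bucket_start.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp - timestamp % bucket_seconds].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


def refresh_metric(samples, push):
    """Build the three series and pass them to `push` (hypothetical callable)."""
    series = {
        "day": bucket_average(samples, HOUR),   # assumed: hourly points
        "week": bucket_average(samples, DAY),   # assumed: daily points
        "month": bucket_average(samples, DAY),  # assumed: daily points
    }
    push(series)
    return series


if __name__ == "__main__":
    pending_requests = [(0, 2), (600, 4), (HOUR, 6)]
    refresh_metric(pending_requests, push=print)
```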
vsellier changed the status of T3129: Reliable monitoring of services: for users and for admins from Open to Work in Progress.
May 20 2021, 12:01 PM · Roadmap 2022, Roadmap 2021, Monitoring, meta-task

May 17 2021

dachary updated the task description for T3054: Scale out object storage design.
May 17 2021, 4:17 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
May 17 2021, 4:16 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task

May 15 2021

dachary removed a subtask for T3054: Scale out object storage design: T3327: Hardware architecture for the object storage.
May 15 2021, 1:09 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a subtask for T3054: Scale out object storage design: T3327: Hardware architecture for the object storage.
May 15 2021, 1:08 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task

May 10 2021

anlambert closed T3272: Authenticated users should be able to browse their save code now requests, a subtask of T3082: Improve Save Code Now handling, as Resolved.
May 10 2021, 11:19 AM · Save Code Now, meta-task, Roadmap 2021, Web app
vlorentz changed the status of T843: Vault: Add a "git bare" tarball cooker, a subtask of T3096: Efficient and reliable download via the Vault, from Open to Work in Progress.
May 10 2021, 9:48 AM · meta-task, Roadmap 2021, Vault
vlorentz added a subtask for T3096: Efficient and reliable download via the Vault: T843: Vault: Add a "git bare" tarball cooker.
May 10 2021, 9:48 AM · meta-task, Roadmap 2021, Vault
vlorentz changed the status of T3096: Efficient and reliable download via the Vault from Open to Work in Progress.
May 10 2021, 9:47 AM · meta-task, Roadmap 2021, Vault
vlorentz moved T3096: Efficient and reliable download via the Vault from Backlog to Work in progress on the Roadmap 2021 board.
May 10 2021, 9:46 AM · meta-task, Roadmap 2021, Vault

May 8 2021

zack updated the task description for T3316: SWHID v2: determine binary-to-text encoding for checksum part.
May 8 2021, 1:18 PM · Data Model
zack triaged T3316: SWHID v2: determine binary-to-text encoding for checksum part as Normal priority.
May 8 2021, 11:43 AM · Data Model
rdicosmo moved T2194: Archive Integration (Web API) from Backlog to Work in progress on the Roadmap 2021 board.
May 8 2021, 11:14 AM · Roadmap 2021, meta-task
rdicosmo moved T3118: Documentation for users and ambassadors from Backlog to Work in progress on the Roadmap 2021 board.
May 8 2021, 11:14 AM · Documentation, Scientific Community Building, Community Building, Roadmap 2021, meta-task
rdicosmo moved T2912: Next generation archive counters from Pending validation to Done on the Roadmap 2021 board.
May 8 2021, 11:13 AM · Roadmap 2021, System administration, Monitoring, Web app
rdicosmo moved T3082: Improve Save Code Now handling from Backlog to Work in progress on the Roadmap 2021 board.
May 8 2021, 11:12 AM · Save Code Now, meta-task, Roadmap 2021, Web app

May 3 2021

dachary closed T3065: Using git to store objects, a subtask of T3054: Scale out object storage design, as Wontfix.
May 3 2021, 5:49 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary closed T3050: Using libcephsqlite to store objects, a subtask of T3054: Scale out object storage design, as Wontfix.
May 3 2021, 5:47 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
anlambert changed the status of T3272: Authenticated users should be able to browse their save code now requests, a subtask of T3082: Improve Save Code Now handling, from Open to Work in Progress.
May 3 2021, 2:09 PM · Save Code Now, meta-task, Roadmap 2021, Web app

Apr 28 2021

rdicosmo added a comment to T2912: Next generation archive counters.

> I also recall now that vincent added a graph [1] recently enough.

This is to try to compare the counter approaches with each other a bit.

So that's still using the old plumbing at least for that part.

[1] https://grafana.softwareheritage.org/goto/BlkwHorMz

Apr 28 2021, 5:23 PM · Roadmap 2021, System administration, Monitoring, Web app
ardumont added a comment to T2912: Next generation archive counters.

What about the old counter pipeline? Has it been decommissioned already?

I don't think so as I do not recall seeing diffs about clean up.

In any case, it's not part of what's currently deployed (so no risk of
data mangling, if that's part of the concern).

Apr 28 2021, 5:12 PM · Roadmap 2021, System administration, Monitoring, Web app

Apr 27 2021

moranegg updated the task description for T2624: Create strategy for documentation with a map or a full table of content.
Apr 27 2021, 3:41 PM · Roadmap 2021, meta-task, Documentation
moranegg updated the task description for T2624: Create strategy for documentation with a map or a full table of content.
Apr 27 2021, 3:11 PM · Roadmap 2021, meta-task, Documentation
moranegg updated the task description for T2624: Create strategy for documentation with a map or a full table of content.
Apr 27 2021, 3:07 PM · Roadmap 2021, meta-task, Documentation
moranegg changed the status of T3128: Improve deposit integration, management and display from Open to Work in Progress.
Apr 27 2021, 2:54 PM · meta-task, Roadmap 2021, Monitoring, SWORD deposit, Web app
moranegg claimed T3118: Documentation for users and ambassadors.
Apr 27 2021, 2:13 PM · Documentation, Scientific Community Building, Community Building, Roadmap 2021, meta-task
vlorentz triaged T3119: FAQ as Normal priority.
Apr 27 2021, 2:12 PM · Community Building

Apr 26 2021

ardumont added a comment to T2912: Next generation archive counters.

What about the old counter pipeline? Has it been decommissioned already?

Apr 26 2021, 2:29 PM · Roadmap 2021, System administration, Monitoring, Web app
rdicosmo added a comment to T2912: Next generation archive counters.

Last bits deployed on archive.s.o (including the author counters).

Apr 26 2021, 1:33 PM · Roadmap 2021, System administration, Monitoring, Web app
ardumont added a comment to T2912: Next generation archive counters.

Last bits deployed on archive.s.o (including the author counters).

Apr 26 2021, 12:00 PM · Roadmap 2021, System administration, Monitoring, Web app
ardumont added a comment to T3213: Enable save code now of software source code archives for specific users.

One or two concerns about this remain before actually acting on it.

Apr 26 2021, 11:42 AM · Save Code Now, Web app
rdicosmo moved T2912: Next generation archive counters from Work in progress to Pending validation on the Roadmap 2021 board.
Apr 26 2021, 10:50 AM · Roadmap 2021, System administration, Monitoring, Web app
zack added a comment to T3087: Implement support for takedown notices (infra, admin tools, workflow).

So what about exports of the archive available on git-annex?

Apr 26 2021, 8:34 AM · Roadmap 2022, meta-task, Roadmap 2021, Web app

Apr 24 2021

ardumont added a comment to T3213: Enable save code now of software source code archives for specific users.

If I understand correctly, url+time+length+filename+version are used in a heuristic to
avoid (down)loading over and over again something that is already ingested.

Apr 24 2021, 7:29 PM · Save Code Now, Web app
rdicosmo added a comment to T3213: Enable save code now of software source code archives for specific users.

I recall it's part of creating a primary key (of sorts) composed of all the properties mentioned
above (when the artifact does not already provide some hashes).
This is to avoid fetching again things that were already fetched.

Apr 24 2021, 3:20 PM · Save Code Now, Web app
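
The "primary key of sorts" described above can be illustrated as follows. This is a sketch of the idea only, not the loader's actual code: the key names come from the discussion, while the function names and the choice of SHA-256 over a canonical JSON dump are assumptions.

```python
import hashlib
import json

# Properties named in the discussion as identifying an artifact.
ID_KEYS = ("url", "time", "length", "filename", "version")


def artifact_identity(artifact: dict) -> str:
    """Derive a stable identifier from the identifying properties only."""
    fields = {key: artifact.get(key) for key in ID_KEYS}
    canonical = json.dumps(fields, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def needs_fetch(artifact: dict, known_identities: set) -> bool:
    """True when no artifact with the same identity was ingested before."""
    return artifact_identity(artifact) not in known_identities


if __name__ == "__main__":
    artifact = {
        "url": "https://ftp.gnu.org/old-gnu/emacs/elib-1.0.tar.gz",
        "time": "1995-12-12T08:00:00+00:00",
        "length": 58335,
        "version": "1.0",
        "filename": "elib-1.0.tar.gz",
    }
    seen = {artifact_identity(artifact)}
    print(needs_fetch(artifact, seen))
```

If any identifying property changes (for example the length), the identity changes and the artifact is fetched again.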
ardumont added a comment to T3213: Enable save code now of software source code archives for specific users.

(submitted too early)

Apr 24 2021, 10:41 AM · Save Code Now, Web app
rdicosmo added a comment to T3213: Enable save code now of software source code archives for specific users.

Currently users only provide a URL in the save code now form, while the loader expects a bit more
[1] (recall that it's the lister which actually provides those).

The loader expects to be provided with a list of artifacts (could be only 1 in our
case). Still, such artifacts are described through the following:

  • artifact url
  • time
  • length (could be derived from the url when talking to the server, but not all servers provide it...)
  • version (could be derived from the url with a heuristic as well, but that's regexp-hell-ish and error-prone)
  • filename (could be derived from the url without too much risk, I think...)

I gather the save code now ui could be enriched (and displayed according to chosen visit
type) but that becomes more involved for people in general.

Another road would be to make some of those properties optional...

Thoughts?

[1]

 "url": "https://ftp.gnu.org/old-gnu/emacs/",
 "artifacts": [{"url": "https://ftp.gnu.org/old-gnu/emacs/elib-1.0.tar.gz",
                "time": "1995-12-12T08:00:00+00:00",
                "length": 58335,
                "version": "1.0",
                "filename": "elib-1.0.tar.gz",
                },
                ...
               ]
...
Apr 24 2021, 9:53 AM · Save Code Now, Web app
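
The "derive it from the url" options discussed above could look like the sketch below. It is illustrative only: the function names and the regular expression are assumptions, and the version pattern covers only simple `name-1.2.3.tar.gz` style file names, which is the fragility the comment warns about.

```python
import re
from posixpath import basename
from urllib.parse import urlsplit

# Assumed pattern: "-<dotted digits>" right before a common archive extension.
VERSION_RE = re.compile(r"-(\d+(?:\.\d+)*)\.(?:tar\.(?:gz|bz2|xz)|tgz|zip)$")


def guess_filename(artifact_url: str) -> str:
    """Last path component of the artifact URL."""
    return basename(urlsplit(artifact_url).path)


def guess_version(filename: str):
    """Version extracted from the file name, or None when the pattern fails."""
    match = VERSION_RE.search(filename)
    return match.group(1) if match else None


def complete_artifact(artifact: dict) -> dict:
    """Fill in filename and version when the user supplied only a URL."""
    completed = dict(artifact)
    completed.setdefault("filename", guess_filename(completed["url"]))
    completed.setdefault("version", guess_version(completed["filename"]))
    return completed


if __name__ == "__main__":
    print(complete_artifact({"url": "https://ftp.gnu.org/old-gnu/emacs/elib-1.0.tar.gz"}))
```

Values supplied by the user are kept as-is; only missing properties are guessed, and a `None` version signals that the heuristic gave up.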

Apr 23 2021

vlorentz assigned T3117: Publish status of existing listers and loaders to anlambert.
Apr 23 2021, 4:53 PM · Documentation, Roadmap 2022, meta-task, Community Building, Roadmap 2021
vlorentz assigned T2234: Write use case-specific documentation to moranegg.
Apr 23 2021, 4:52 PM · Roadmap 2021, meta-task, Documentation
vlorentz assigned T1363: Have metrics in prometheus for each tracked forge to olasd.
Apr 23 2021, 4:52 PM · Roadmap 2021, Metrics/monitoring, System administration
vlorentz assigned T3127: Compute and display distribution of origins by forge to anlambert.
Apr 23 2021, 4:52 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task
vlorentz assigned T3136: Prior art detection service to zack.
Apr 23 2021, 4:51 PM · Roadmap 2022, Code scanner, Scientific Community Building, Roadmap 2021, meta-task
vlorentz assigned T3112: Provenance index for the full archive to zack.
Apr 23 2021, 4:51 PM · Roadmap 2022, Provenance database, Roadmap 2021, meta-task
vlorentz assigned T2204: Full-text search on source code (prototype) to anlambert.
Apr 23 2021, 4:51 PM · Roadmap 2021
vlorentz assigned T2194: Archive Integration (Web API) to anlambert.
Apr 23 2021, 4:50 PM · Roadmap 2021, meta-task
vlorentz assigned T2220: swh-graph in production to zack.
Apr 23 2021, 4:50 PM · Roadmap 2022, meta-task, Roadmap 2021, Compressed graph service
vlorentz assigned T3135: Improve integrity of ingested content to olasd.
Apr 23 2021, 4:50 PM · Storage manager, Roadmap 2021, meta-task
vlorentz assigned T3134: SWHID v2 to zack.
Apr 23 2021, 4:50 PM · Roadmap 2022, Roadmap 2020, Data Model, Web app, meta-task, Roadmap 2021
vlorentz assigned T3116: Roll out at least one operational mirror to douardda.
Apr 23 2021, 4:49 PM · Roadmap 2022, Mirror, Roadmap 2021, meta-task
vlorentz assigned T3113: Cold storage archive to douardda.
Apr 23 2021, 4:49 PM · Roadmap 2021, Archive content, meta-task
vlorentz assigned T3087: Implement support for takedown notices (infra, admin tools, workflow) to anlambert.
Apr 23 2021, 4:48 PM · Roadmap 2022, meta-task, Roadmap 2021, Web app
vlorentz assigned T1538: Add "forge" now to ardumont.
Apr 23 2021, 4:48 PM · Add Forge Now , Roadmap 2022, meta-task, Roadmap 2021
vlorentz assigned T3082: Improve Save Code Now handling to ardumont.
Apr 23 2021, 4:48 PM · Save Code Now, meta-task, Roadmap 2021, Web app
vlorentz moved T2214: Scale-out graph and database storage in production from Backlog to Work in progress on the Roadmap 2021 board.
Apr 23 2021, 4:47 PM · meta-task, Roadmap 2022, Roadmap 2021, Storage manager
vsellier added a revision to T2912: Next generation archive counters: D5588: Activate swh-counters on all the webapps.
Apr 23 2021, 4:26 PM · Roadmap 2021, System administration, Monitoring, Web app
vsellier claimed T3129: Reliable monitoring of services: for users and for admins .
Apr 23 2021, 3:13 PM · Roadmap 2022, Roadmap 2021, Monitoring, meta-task
vsellier closed T3251: Count authors from revisions and releases, a subtask of T2912: Next generation archive counters, as Resolved.
Apr 23 2021, 1:03 PM · Roadmap 2021, System administration, Monitoring, Web app

Apr 22 2021

anlambert added a revision to T3213: Enable save code now of software source code archives for specific users: D5578: django: Add keycloak realm roles in user permissions set.
Apr 22 2021, 6:20 PM · Save Code Now, Web app
ardumont added a comment to T3213: Enable save code now of software source code archives for specific users.

I stand by what i said regarding the scheduling logic, it's as simple as I described
earlier... But...

Apr 22 2021, 4:59 PM · Save Code Now, Web app
ardumont moved T3266: Improve save code now failed/uneventful status reporting from Pending validation to Done on the Roadmap 2021 board.
Apr 22 2021, 4:21 PM · Save Code Now, Roadmap 2021, System administrators, Web app