
Web app · Folder
Active · Public

Members

  • This project does not have any members.

Recent Activity

Fri, Jan 15

moranegg added a comment to T2254: textual search language for the Web UI.

Just saw this search page on Zenodo: https://help.zenodo.org/guides/search/

Fri, Jan 15, 11:27 AM · Archive search, Web app

Mon, Jan 11

moranegg triaged T2957: Add possibility to access full save request list after doing a "save again" on a repo as Wishlist priority.
Mon, Jan 11, 8:51 PM · Web app

Fri, Jan 8

anlambert added a revision to T2945: Migrate swh-web production database from SQLite to PostgreSQL: D4830: django: Use valid identifiers for application labels.
Fri, Jan 8, 6:24 PM · System administration, Web app
anlambert added a revision to T2934: Hide unvisited origins search option is not honored with elasticsearch backend: D4827: assets/origin-search: Process origins with no visit when required.
Fri, Jan 8, 4:29 PM · Archive search, Web app
anlambert added a comment to T2900: Public graph/ API does not handle streaming results from endpoints.

The fix is now deployed and proxied graph responses are now properly streamed \o/

Fri, Jan 8, 3:54 PM · System administration, Graph service, Web app
anlambert closed T2900: Public graph/ API does not handle streaming results from endpoints as Resolved by committing rDWAPPSe605c3fa701a: api/graph: Stream responses as in the proxied graph service.
Fri, Jan 8, 3:07 PM · System administration, Graph service, Web app
anlambert added a comment to T2891: "Save code now" may fail with "OperationalError: database is locked".

This really feels like the wrong solution to the problem... postgres is the right one

Fri, Jan 8, 2:52 PM · System administration, Web app
anlambert triaged T2945: Migrate swh-web production database from SQLite to PostgreSQL as Normal priority.
Fri, Jan 8, 2:52 PM · System administration, Web app

Thu, Jan 7

anlambert added a revision to T2900: Public graph/ API does not handle streaming results from endpoints: D4824: api/graph: Stream responses as in the proxied graph service.
Thu, Jan 7, 6:58 PM · System administration, Graph service, Web app
vlorentz triaged T2938: Create API endpoint to access raw_extrinsic_metadata as Normal priority.
Thu, Jan 7, 11:47 AM · Web app, Metadata workflow
anlambert added a comment to T2926: Failed ingestion of a GitHub repository.

Thanks Antoine. Is there any way to have this kind of error also reported in the admin dashboard for Save code now?

Thu, Jan 7, 11:44 AM · Web app, Git loader
rdicosmo added a comment to T2926: Failed ingestion of a GitHub repository.

Thanks Antoine. Is there any way to have this kind of error also reported in the admin dashboard for Save code now?

Thu, Jan 7, 11:41 AM · Web app, Git loader
anlambert added a comment to T2926: Failed ingestion of a GitHub repository.

For the record, the load failure on 2021-01-04T17:05:11Z was due to a network error (found via Kibana):

Thu, Jan 7, 11:34 AM · Web app, Git loader

Wed, Jan 6

anlambert added a project to T2934: Hide unvisited origins search option is not honored with elasticsearch backend: Archive search.
Wed, Jan 6, 6:50 PM · Archive search, Web app
anlambert added a comment to T2891: "Save code now" may fail with "OperationalError: database is locked".

Also the Save code now requests table is constantly growing and now contains more than 50000 lines, making the select requests quite slow.

Turns out the slow requests were not due to the database but rather to an implementation flaw in retrieving a page of save requests. The fix is short, but I realized that the code is not covered by tests, so there is more work to do before submitting a diff.

Wed, Jan 6, 6:49 PM · System administration, Web app
anlambert triaged T2934: Hide unvisited origins search option is not honored with elasticsearch backend as Normal priority.
Wed, Jan 6, 6:48 PM · Archive search, Web app
anlambert added a comment to T2891: "Save code now" may fail with "OperationalError: database is locked".

Also the Save code now requests table is constantly growing and now contains more than 50000 lines, making the select requests quite slow.

Wed, Jan 6, 5:08 PM · System administration, Web app
vlorentz added a comment to T2891: "Save code now" may fail with "OperationalError: database is locked".

This really feels like the wrong solution to the problem... postgres is the right one

Wed, Jan 6, 4:10 PM · System administration, Web app
anlambert added a comment to T2891: "Save code now" may fail with "OperationalError: database is locked".

Another solution: increase the timeout value for waiting on the database lock.

Wed, Jan 6, 4:10 PM · System administration, Web app
anlambert added a comment to T2891: "Save code now" may fail with "OperationalError: database is locked".

Thanks for the indexes! Regarding the OperationalError, maybe we could mitigate it with a retry process as in the storage?

Wed, Jan 6, 2:52 PM · System administration, Web app
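
The two mitigations raised in this thread (retrying on the lock error, and waiting longer for the lock) could look roughly like the sketch below. It is only an illustration using the Python standard library `sqlite3` module, not the swh-web or swh-storage code: the helper name `run_with_retry` and its parameters are hypothetical, and the in-memory database stands in for the real one.

```python
import sqlite3
import time


def run_with_retry(operation, attempts=5, base_delay=0.1):
    """Call operation(), retrying when SQLite reports a locked database.

    Hypothetical helper: name and parameters are assumptions for this sketch.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            # Only retry lock contention; re-raise any other error, and give
            # up once the last attempt has failed.
            if "locked" not in str(exc) or attempt == attempts - 1:
                raise
            # Exponential backoff between attempts.
            time.sleep(base_delay * 2 ** attempt)


# The other mitigation: let the connection wait longer for the lock.
# `timeout` is expressed in seconds; ":memory:" is a placeholder database.
conn = sqlite3.connect(":memory:", timeout=30.0)
```

Both approaches only reduce how often the error surfaces under write contention; they do not remove the single-writer limitation that motivates the PostgreSQL migration discussed in the same thread.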
vlorentz added a comment to T2891: "Save code now" may fail with "OperationalError: database is locked".

-> D4815

Wed, Jan 6, 2:29 PM · System administration, Web app
vlorentz added a comment to T2891: "Save code now" may fail with "OperationalError: database is locked".

Also the Save code now requests table is constantly growing and now contains more than 50000 lines, making the select requests quite slow.

Wed, Jan 6, 2:21 PM · System administration, Web app
anlambert added a comment to T2891: "Save code now" may fail with "OperationalError: database is locked".

I guess it is time to migrate the webapp database from SQLite to PostgreSQL.

Wed, Jan 6, 2:00 PM · System administration, Web app
anlambert added a comment to T2926: Failed ingestion of a GitHub repository.

The repository has been correctly ingested on 05 January 2021, 11:56 UTC.

Wed, Jan 6, 1:53 PM · Web app, Git loader
rdicosmo updated subscribers of T2926: Failed ingestion of a GitHub repository.
Wed, Jan 6, 12:34 PM · Web app, Git loader
zack triaged T2930: vault: use SWHIDs as identifiers shown to user in the download window as Low priority.
Wed, Jan 6, 11:14 AM · Web app

Tue, Jan 5

rdicosmo added a comment to T2912: Next generation archive counters.

It looks like you already agree, but FWIW I'd also like to have a dedicated (micro)service that keeps an up-to-date bloom filter for the entire archive, with a REST API.
It might be useful for other use cases (swh-scanner comes to mind, but I'm sure we'll find others as time passes).

Tue, Jan 5, 6:05 PM · System administration, Monitoring, Web app
zack added a comment to T2912: Next generation archive counters.
In T2912#55849, @olasd wrote:

I think we should be able to decouple these counters completely from the loaders, and have them directly updated/handled by a client of the swh-journal. This would be a "centralized" component, but which we can parallelize quite heavily thanks to basic kafka design. We can also leverage the way kafka clients do parallelism to sidestep the locking issues arising in a potentially distributed filter.

Maybe my writing was not all that clear: I also had in mind a single centralised component (the ArchiveCounter) per Bloom filter, receiving the lists (newcontents) of ids from the loaders.
Getting the feed of ids from swh-journal instead of from the loaders is really neat: we avoid touching the loader code, and we gain a better capability of monitoring the load on the ArchiveCounter, so I'm all for it :-)

Tue, Jan 5, 6:01 PM · System administration, Monitoring, Web app
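
To make the idea of a Bloom-filter-backed counter concrete, here is a minimal sketch in Python. It is not the T2912 proposal itself: the class name `BloomCounter`, the filter size, and the number of hash functions are all assumptions chosen for illustration. The counter is incremented only when an id is not already in the filter, so false positives make the count a slight underestimate.

```python
import hashlib


class BloomCounter:
    """Approximate count of distinct ids, backed by a Bloom filter.

    Hypothetical sketch: sizes and hashing scheme are illustrative only.
    """

    def __init__(self, size_bits=1 << 20, num_hashes=7):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)
        self.count = 0

    def _positions(self, object_id: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + object_id).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, object_id: bytes) -> bool:
        """Record object_id; return True if it was not seen before."""
        positions = list(self._positions(object_id))
        seen = all(self.bits[p // 8] & (1 << (p % 8)) for p in positions)
        if not seen:
            for p in positions:
                self.bits[p // 8] |= 1 << (p % 8)
            self.count += 1
        return not seen
```

A consumer of the id feed would call `add()` once per id and read `count` when a total is needed; one such instance per object type matches the "one component per Bloom filter" wording above.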
rdicosmo added a comment to T2912: Next generation archive counters.
In T2912#55849, @olasd wrote:

Thanks for sketching out this proposal! It looks quite promising (and neat!).

Tue, Jan 5, 5:00 PM · System administration, Monitoring, Web app
douardda added a comment to T2912: Next generation archive counters.

I'm also having the "full journal" approach in mind after a quick reading of this neat proposal :-)

Tue, Jan 5, 4:24 PM · System administration, Monitoring, Web app
olasd added a comment to T2912: Next generation archive counters.

Thanks for sketching out this proposal! It looks quite promising (and neat!).

Tue, Jan 5, 4:00 PM · System administration, Monitoring, Web app

Mon, Jan 4

rdicosmo triaged T2926: Failed ingestion of a GitHub repository as High priority.
Mon, Jan 4, 7:29 PM · Web app, Git loader
rdicosmo updated the task description for T2912: Next generation archive counters.
Mon, Jan 4, 6:35 PM · System administration, Monitoring, Web app
rdicosmo updated the task description for T2912: Next generation archive counters.
Mon, Jan 4, 12:04 PM · System administration, Monitoring, Web app
zack triaged T2918: missing "Authentication" link in navigation header as Low priority.
Mon, Jan 4, 9:26 AM · Easy hack, Web app

Tue, Dec 22

rdicosmo added a comment to T2912: Next generation archive counters.

Updated the proposal with your suggestions, thanks!

Tue, Dec 22, 2:59 PM · System administration, Monitoring, Web app
rdicosmo updated the task description for T2912: Next generation archive counters.
Tue, Dec 22, 2:59 PM · System administration, Monitoring, Web app
rdicosmo added a comment to T2912: Next generation archive counters.

A Python library may be an issue, as it requires a central process with a global lock. Sharding by hash may fix the issue, though.

Tue, Dec 22, 2:55 PM · System administration, Monitoring, Web app
vlorentz added a comment to T2912: Next generation archive counters.

A Python library may be an issue, as it requires a central process with a global lock. Sharding by hash may fix the issue, though.

Tue, Dec 22, 2:46 PM · System administration, Monitoring, Web app
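
As a rough illustration of "sharding by hash", the sketch below routes each id to one of several independent workers, so that each worker can own its own filter and lock instead of sharing a global one. The function name `shard_for` and the choice of SHA-1 are assumptions for this example, not part of the proposal.

```python
import hashlib


def shard_for(object_id: bytes, num_shards: int) -> int:
    """Map an id to a shard index in [0, num_shards).

    Hypothetical helper: the same id always lands on the same shard,
    so duplicates are always checked against the same filter.
    """
    digest = hashlib.sha1(object_id).digest()
    return int.from_bytes(digest[:4], "big") % num_shards
```

If the ids are already uniformly distributed hashes, the extra hashing step could be skipped and the shard taken directly from the leading bytes of the id.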
rdicosmo updated the task description for T2912: Next generation archive counters.
Tue, Dec 22, 1:29 PM · System administration, Monitoring, Web app
rdicosmo updated the task description for T2912: Next generation archive counters.
Tue, Dec 22, 1:28 PM · System administration, Monitoring, Web app
rdicosmo triaged T2912: Next generation archive counters as Normal priority.
Tue, Dec 22, 12:57 PM · System administration, Monitoring, Web app

Dec 17 2020

zack added a project to T2900: Public graph/ API does not handle streaming results from endpoints: System administration.
Dec 17 2020, 4:15 PM · System administration, Graph service, Web app
zack added a project to T2900: Public graph/ API does not handle streaming results from endpoints: Graph service.
Dec 17 2020, 4:15 PM · System administration, Graph service, Web app
haltode triaged T2900: Public graph/ API does not handle streaming results from endpoints as Normal priority.
Dec 17 2020, 4:07 PM · System administration, Graph service, Web app

Dec 15 2020

vlorentz triaged T2891: "Save code now" may fail with "OperationalError: database is locked" as Normal priority.
Dec 15 2020, 2:16 PM · System administration, Web app
vlorentz created T2891: "Save code now" may fail with "OperationalError: database is locked".
Dec 15 2020, 2:16 PM · System administration, Web app

Dec 14 2020

moranegg updated subscribers of T2781: Make it obvious that services are the staging version.
Dec 14 2020, 3:19 PM · SWORD deposit, Web app
anlambert closed T2878: SWHID mishandling in Word and Outlook as Resolved by committing rDWAPPS126e3ea6c9bf: common/archive: Handle single slash after protocol in lookup_origin.
Dec 14 2020, 1:57 PM · Web app

Dec 12 2020

rdicosmo added a comment to T2878: SWHID mishandling in Word and Outlook.

Nice, let me know when it's deployed and I will test if the problem is indeed fixed

Dec 12 2020, 8:50 AM · Web app