
Nov 26 2021

seirl requested review of D6699: Stop writing swhid2node.bin maps.
Nov 26 2021, 5:36 PM
seirl closed T3740: swh-graph: Translate node IDs on the Java side, not Python side, a subtask of T3739: swh-graph: Remove SWHID -> Node ID mapping, use MPH instead, as Resolved.
Nov 26 2021, 5:33 PM · Compressed graph service
seirl closed T3740: swh-graph: Translate node IDs on the Java side, not Python side as Resolved.
Nov 26 2021, 5:33 PM · Compressed graph service
seirl closed D6676: Move SWHID<->node ID conversion in the Java backend.
Nov 26 2021, 5:05 PM
seirl committed rDGRPH0b33cff0d228: Move SWHID<->node ID conversion in the Java backend (authored by seirl).
Move SWHID<->node ID conversion in the Java backend
Nov 26 2021, 5:05 PM
seirl updated the diff for D6676: Move SWHID<->node ID conversion in the Java backend.

Fix src/dst inversion, add regression test

Nov 26 2021, 4:33 PM
seirl added inline comments to D6676: Move SWHID<->node ID conversion in the Java backend.
Nov 26 2021, 1:47 PM

Nov 25 2021

seirl committed rDGRPH32bab89d4448: BidirectionalImmutableGraph: implement outdegrees and predecessorBigArray… (authored by seirl).
BidirectionalImmutableGraph: implement outdegrees and predecessorBigArray…
Nov 25 2021, 5:00 PM
seirl committed rDGRPH5f5ae5dcc104: Add mvn/jvm.config to fix spotless not working with OpenJDK 16+ (authored by seirl).
Add mvn/jvm.config to fix spotless not working with OpenJDK 16+
Nov 25 2021, 4:04 PM
seirl committed rDGRPHbe6c986a5238: Move bidirectional graph logic into a separate ImmutableBidirectionalGraph class (authored by seirl).
Move bidirectional graph logic into a separate ImmutableBidirectionalGraph class
Nov 25 2021, 4:04 PM
seirl committed rDGRPH3cbcf625aa24: SubdatasetSizeFunction: collect more statistics (authored by seirl).
SubdatasetSizeFunction: collect more statistics
Nov 25 2021, 4:03 PM
seirl updated the task description for T3579: Meta-task: upgrade infrastructure to Debian Bullseye.
Nov 25 2021, 12:11 PM · System administration (Component upgrades)
seirl updated the task description for T3579: Meta-task: upgrade infrastructure to Debian Bullseye.
Nov 25 2021, 12:10 PM · System administration (Component upgrades)

Nov 23 2021

seirl requested review of D6676: Move SWHID<->node ID conversion in the Java backend.
Nov 23 2021, 5:35 PM
seirl added a revision to T3740: swh-graph: Translate node IDs on the Java side, not Python side: D6676: Move SWHID<->node ID conversion in the Java backend.
Nov 23 2021, 5:29 PM · Compressed graph service

Nov 19 2021

seirl committed rDSEA150cbbca19bb: setup.py: use yarnpkg instead of yarn if present in PATH (authored by seirl).
setup.py: use yarnpkg instead of yarn if present in PATH
Nov 19 2021, 5:53 PM
seirl closed T3742: yarn called in swh-search setup.py but not present in developer setup docs as Resolved.

Fixed in rDDOC55cdfd9ee957f57cf91b0f6932cc941d2887d933

Nov 19 2021, 5:29 PM · Archive search
seirl committed rDDOC55cdfd9ee957: Add yarnpkg dependency to developer-setup (authored by seirl).
Add yarnpkg dependency to developer-setup
Nov 19 2021, 5:28 PM
seirl triaged T3742: yarn called in swh-search setup.py but not present in developer setup docs as Normal priority.
Nov 19 2021, 5:17 PM · Archive search
seirl added a subtask for T3739: swh-graph: Remove SWHID -> Node ID mapping, use MPH instead: T3740: swh-graph: Translate node IDs on the Java side, not Python side.
Nov 19 2021, 4:44 PM · Compressed graph service
seirl added a parent task for T3740: swh-graph: Translate node IDs on the Java side, not Python side: T3739: swh-graph: Remove SWHID -> Node ID mapping, use MPH instead.
Nov 19 2021, 4:44 PM · Compressed graph service
seirl triaged T3740: swh-graph: Translate node IDs on the Java side, not Python side as High priority.
Nov 19 2021, 4:42 PM · Compressed graph service
seirl triaged T3739: swh-graph: Remove SWHID -> Node ID mapping, use MPH instead as High priority.
Nov 19 2021, 4:40 PM · Compressed graph service

Nov 2 2021

seirl added a comment to T2983: graph service: allow loading in memory only one direction of the graph.

Copying my comment from a linked diff:

Nov 2 2021, 3:19 PM · Compressed graph service
seirl added a comment to D6594: Add parameter to load a single graph direction in memory..

Hey! Thanks for this initial diff.

Nov 2 2021, 3:18 PM

Oct 18 2021

seirl closed T3624: Update swh-graph from 0.3.0 to 0.5.0 on granet, a subtask of T3623: Run swh-graph with gunicorn to support multiple/parallel requests, as Resolved.
Oct 18 2021, 3:01 PM · Compressed graph service, System administration
seirl closed T3624: Update swh-graph from 0.3.0 to 0.5.0 on granet as Resolved.

Done

Oct 18 2021, 3:01 PM · Compressed graph service, System administration

Oct 14 2021

seirl updated the task description for T3639: prepare quote for "granet2", next gen swh-graph compression server.
Oct 14 2021, 1:58 PM · System administration

Aug 12 2021

seirl accepted D6072: StreamingGraphView: Buffer lines before writing.
Aug 12 2021, 1:44 AM

Jul 28 2021

seirl accepted D6038: journalprocessor: Fix deserialize_message raising EOFError on the last message of each assignment.

LGTM too

Jul 28 2021, 1:26 PM

Jul 27 2021

seirl accepted D6028: journalprocessor: Fix freeze on empty offset ranges..
Jul 27 2021, 4:26 PM

Jul 26 2021

seirl created P1101 Weird repos.
Jul 26 2021, 6:40 PM

Jul 9 2021

seirl committed rDGRPHbb1ac27436bd: LabelMapBuilder: mmap order file, use less RAM (authored by seirl).
LabelMapBuilder: mmap order file, use less RAM
Jul 9 2021, 5:19 PM
seirl committed rDGRPHde67dafebb6b: ConnectedComponents: add --by-origins (authored by seirl).
ConnectedComponents: add --by-origins
Jul 9 2021, 5:19 PM
seirl committed rDGRPHc0c0c0469d0c: Bump fastutil version (authored by seirl).
Bump fastutil version
Jul 9 2021, 5:19 PM

Jun 15 2021

seirl committed rDGRPH313878766f8a: Add lazy subgraph implementation (authored by seirl).
Add lazy subgraph implementation
Jun 15 2021, 1:42 PM
seirl committed rDGRPH00384d9a6118: topology: AveragePaths: print the temporary result regularly (authored by seirl).
topology: AveragePaths: print the temporary result regularly
Jun 15 2021, 1:42 PM
seirl committed rDGRPH02414c10d7f2: topology: add AveragePaths.java (authored by seirl).
topology: add AveragePaths.java
Jun 15 2021, 1:42 PM
seirl committed rDGRPHac7bc2c0ba90: topology: ConnectedComponents: use new lazy Subgraph (authored by seirl).
topology: ConnectedComponents: use new lazy Subgraph
Jun 15 2021, 1:42 PM
seirl committed rDGRPHd6879309ffc0: java: reformat topology files (authored by seirl).
java: reformat topology files
Jun 15 2021, 1:42 PM
seirl committed rDGRPHb40f861e6f8c: topology: ClusteringCoefficient: remove allowedNodes parameter (authored by seirl).
topology: ClusteringCoefficient: remove allowedNodes parameter
Jun 15 2021, 1:41 PM
seirl committed rDGRPH34ba6645574d: topology: ClusteringCoefficient: version with subgraph (authored by seirl).
topology: ClusteringCoefficient: version with subgraph
Jun 15 2021, 1:41 PM
seirl committed rDGRPH84a1327a0cfa: topology: ClusteringCoefficient: version without subgraph (authored by seirl).
topology: ClusteringCoefficient: version without subgraph
Jun 15 2021, 1:41 PM
seirl committed rDGRPHe960d2011fc6: topology: ClusteringCoefficient: work on undirected graph (authored by seirl).
topology: ClusteringCoefficient: work on undirected graph
Jun 15 2021, 1:41 PM
seirl committed rDGRPH6ef89157db57: InOutDegree: fix switch fallthrough bug (authored by seirl).
InOutDegree: fix switch fallthrough bug
Jun 15 2021, 1:41 PM
seirl committed rDGRPH65c76530eec7: topology: InOutDegree: compute per-layer stats (authored by seirl).
topology: InOutDegree: compute per-layer stats
Jun 15 2021, 1:41 PM
seirl committed rDGRPH6ae95b6896e9: Add ImmutableGraph load methods (authored by seirl).
Add ImmutableGraph load methods
Jun 15 2021, 1:41 PM
seirl committed rDGRPH4e6c940cfe52: Replace graph constructor by load/loadMapped calls (authored by seirl).
Replace graph constructor by load/loadMapped calls
Jun 15 2021, 1:41 PM
seirl committed rDGRPH7057bacf9245: ClusteringCoefficient: revrel.txt -> relrev.txt (authored by seirl).
ClusteringCoefficient: revrel.txt -> relrev.txt
Jun 15 2021, 1:41 PM
seirl committed rDGRPH2071723471e4: Merge branch 'topology' (authored by seirl).
Merge branch 'topology'
Jun 15 2021, 1:41 PM

May 26 2021

seirl added a comment to T3341: Move real-time discussion away from Freenode.

Super clunky, a lot of message types aren't handled, some messages get filtered out. I don't recall all the specifics but I tried it around a year ago and it was really bad.

May 26 2021, 1:53 PM · Community Building

May 8 2021

seirl committed rDGRPHf7e9ddf7f917: NodeIdMap: more String/MPH compatibility (authored by seirl).
NodeIdMap: more String/MPH compatibility
May 8 2021, 9:04 PM

May 7 2021

seirl committed rDGRPH991a96dbd08b: LabelMapBuilder: use sort by default (authored by seirl).
LabelMapBuilder: use sort by default
May 7 2021, 3:55 PM

May 5 2021

seirl accepted D5689: Make test_directory_bogus_perms/test_revision_bogus_perms actually test the cookers.

LGTM

May 5 2021, 8:10 PM
seirl created P1037 (An Untitled Masterwork).
May 5 2021, 4:15 PM

May 4 2021

seirl accepted D5665: Add a simple alternative "client" in pure-python.
May 4 2021, 3:10 PM

May 3 2021

seirl accepted D5664: Make test_visit_edges_limited less strict.
May 3 2021, 6:32 PM

Apr 27 2021

seirl raised the priority of T843: Vault: Add a "git bare" tarball cooker from Wishlist to Normal.
Apr 27 2021, 10:56 PM · Vault
seirl updated the task description for T843: Vault: Add a "git bare" tarball cooker.
Apr 27 2021, 7:28 PM · Vault

Apr 23 2021

seirl accepted D5501: add an anti-Dos limit for edges traversed as a query parameter.
Apr 23 2021, 5:49 PM

Apr 19 2021

seirl committed rDDATASETd59c00c32ca8: docs: fix SVG filename of relational schema (authored by seirl).
docs: fix SVG filename of relational schema
Apr 19 2021, 5:28 PM
seirl committed rDDATASETd08a96e35e93: update db-schema.{dot,svg} (authored by seirl).
update db-schema.{dot,svg}
Apr 19 2021, 4:56 PM
seirl committed rDDATASET54a2bbfd78e3: docs: rename db-schema -> dataset-schema (authored by seirl).
docs: rename db-schema -> dataset-schema
Apr 19 2021, 4:56 PM

Apr 16 2021

seirl committed rDDATASETdffb127c5a3b: athena: pass database name as an attribute (authored by seirl).
athena: pass database name as an attribute
Apr 16 2021, 7:43 PM
seirl closed D5540: docs: Update for new schema.
Apr 16 2021, 7:39 PM
seirl committed rDDATASET01ba5d14ef21: docs: Update for new schema (authored by seirl).
docs: Update for new schema
Apr 16 2021, 7:39 PM
seirl added a comment to D5501: add an anti-Dos limit for edges traversed as a query parameter.

It looks to me like this would be simpler if max_edges was given as a parameter to Traversal, since it's common to most methods. Would that work?

Apr 16 2021, 1:09 PM
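
As a rough illustration of the suggestion in the D5501 comment above (passing max_edges once to the traversal object rather than to each method), here is a minimal Python sketch. The Traversal class, its method names, the "0 means no limit" convention, and the choice to simply stop following edges once the budget is spent are all assumptions made for this example; they are not taken from the swh-graph code.

```python
class Traversal:
    """Hypothetical traversal helper: the edge budget is set once, at construction."""

    def __init__(self, graph, max_edges=0):
        self.graph = graph          # dict: node -> list of successor nodes
        self.max_edges = max_edges  # assumption: 0 means "no limit"
        self.edges_seen = 0         # shared counter used by every method

    def _successors(self, node):
        # Single place where the limit is enforced, so each public
        # method gets it without taking its own max_edges argument.
        for nxt in self.graph.get(node, []):
            if self.max_edges and self.edges_seen >= self.max_edges:
                return
            self.edges_seen += 1
            yield nxt

    def neighbors(self, src):
        return list(self._successors(src))

    def visit_nodes(self, src):
        # Depth-first visit returning nodes in discovery order.
        seen = [src]
        stack = [src]
        while stack:
            cur = stack.pop()
            for nxt in self._successors(cur):
                if nxt not in seen:
                    seen.append(nxt)
                    stack.append(nxt)
        return seen


toy_graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
unlimited = Traversal(toy_graph).visit_nodes("a")
limited = Traversal(toy_graph, max_edges=2).visit_nodes("a")
```

With this layout, adding a new traversal method only requires calling _successors; the limit does not need to be threaded through each signature.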

Apr 15 2021

seirl added a reviewer for D5540: docs: Update for new schema: zack.
Apr 15 2021, 5:55 PM
seirl requested review of D5540: docs: Update for new schema.
Apr 15 2021, 5:55 PM
seirl closed D5527: swh_model_data: add parents to test revision.
Apr 15 2021, 4:37 PM
seirl committed rDMOD1f6b3b9d5b41: swh_model_data: add parents to test revision (authored by seirl).
swh_model_data: add parents to test revision
Apr 15 2021, 4:37 PM
seirl updated the diff for D5527: swh_model_data: add parents to test revision.

rebase

Apr 15 2021, 4:37 PM

Apr 14 2021

seirl committed rDDATASET4636d1c146aa: Add two ORC tools (orc-merge, orc-print-contents) (authored by seirl).
Add two ORC tools (orc-merge, orc-print-contents)
Apr 14 2021, 7:07 PM
seirl committed rDDATASET33a50eac62f3: journalprocessor: disable in-partition sharding for LevelDB tests (authored by seirl).
journalprocessor: disable in-partition sharding for LevelDB tests
Apr 14 2021, 6:56 PM
seirl committed rDDATASETab6191bc712f: journalprocessor: only reassign partitions when needed (authored by seirl).
journalprocessor: only reassign partitions when needed
Apr 14 2021, 6:56 PM
seirl committed rDDATASETf5526f05d314: athena: add documentation and licensing info (authored by seirl).
athena: add documentation and licensing info
Apr 14 2021, 6:48 PM
seirl committed rDDATASETcfb3bc5510d4: ORC: export missing revision_history table (authored by seirl).
ORC: export missing revision_history table
Apr 14 2021, 6:48 PM
seirl closed D5522: Add athena subcommand to create/query AWS Athena database.
Apr 14 2021, 6:48 PM
seirl committed rDDATASETb1d76ed7a763: Add athena subcommand to create/query AWS Athena database (authored by seirl).
Add athena subcommand to create/query AWS Athena database
Apr 14 2021, 6:48 PM
seirl committed rDDATASET5459673218d1: Move ORC table schema in relational.py (authored by seirl).
Move ORC table schema in relational.py
Apr 14 2021, 6:48 PM
seirl requested review of D5527: swh_model_data: add parents to test revision.
Apr 14 2021, 6:40 PM
seirl updated the diff for D5522: Add athena subcommand to create/query AWS Athena database.

Add documentation and licensing info

Apr 14 2021, 4:48 PM
seirl added a comment to D5522: Add athena subcommand to create/query AWS Athena database.

Thanks for the review!

Apr 14 2021, 4:38 PM
seirl added a comment to T2981: Graph API: add a (node type) result filters.

I just want to write something here that maybe isn't clear from the initial task description. This filtering must happen *after* the visit, not during. We can already change *how* the graph is visited using the edges parameter; the goal of this task is to filter the result post-visit.

Apr 14 2021, 4:28 PM · Compressed graph service
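
The distinction made in the T2981 comment above can be sketched on a toy graph. This is only an illustration: the graph, the type prefixes in node names, and the function names visit and visit_filtered are hypothetical and do not reflect the actual swh-graph API.

```python
from collections import deque

# Toy graph: node -> successors. The "type:id" naming is an assumption
# made for this example only.
GRAPH = {
    "ori:1": ["snp:1"],
    "snp:1": ["rev:1"],
    "rev:1": ["dir:1"],
    "dir:1": ["cnt:1", "cnt:2"],
    "cnt:1": [],
    "cnt:2": [],
}


def node_type(node):
    return node.split(":", 1)[0]


def visit(src, allowed_edges=None):
    """BFS from src. allowed_edges changes *how* the graph is visited:
    only (src_type, dst_type) pairs in the set are followed."""
    seen = [src]
    seen_set = {src}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        for nxt in GRAPH[cur]:
            edge = (node_type(cur), node_type(nxt))
            if allowed_edges is not None and edge not in allowed_edges:
                continue
            if nxt not in seen_set:
                seen_set.add(nxt)
                seen.append(nxt)
                queue.append(nxt)
    return seen


def visit_filtered(src, return_types, allowed_edges=None):
    """Run the whole visit first, then keep only nodes of the requested types."""
    return [n for n in visit(src, allowed_edges) if node_type(n) in return_types]


# Post-visit filter: the visit still walks through ori, snp, rev and dir
# nodes, but only the cnt nodes are returned.
only_contents = visit_filtered("ori:1", {"cnt"})

# Restricting edges instead changes the visit itself: dir and cnt nodes
# are never reached.
restricted = visit("ori:1", allowed_edges={("ori", "snp"), ("snp", "rev")})
```

Filtering during the visit would prune intermediate nodes and make the content nodes unreachable, which is why the two mechanisms are kept separate in this sketch.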
seirl added a comment to T1968: existing graph endpoints should not return 404 upon missing arguments.

Right, I suppose we can close the task then?

Apr 14 2021, 4:25 PM · Easy hack, Compressed graph service
seirl updated the diff for D5522: Add athena subcommand to create/query AWS Athena database.

Remove debug print

Apr 14 2021, 2:01 PM
seirl added a reviewer for D5522: Add athena subcommand to create/query AWS Athena database: Reviewers.
Apr 14 2021, 1:58 PM
seirl updated the diff for D5522: Add athena subcommand to create/query AWS Athena database.

Rebase

Apr 14 2021, 1:57 PM
seirl retitled D5522: Add athena subcommand to create/query AWS Athena database from Move ORC table schema in relational.py to Add athena subcommand to create/query AWS Athena database.
Apr 14 2021, 1:57 PM
seirl requested review of D5522: Add athena subcommand to create/query AWS Athena database.
Apr 14 2021, 1:55 PM
seirl committed rDDATASET11b2436563e8: test_edges: fix mypy error while mocking a method (authored by seirl).
test_edges: fix mypy error while mocking a method
Apr 14 2021, 1:54 PM

Apr 13 2021

seirl updated subscribers of T1968: existing graph endpoints should not return 404 upon missing arguments.

@zack We talked about this on IRC with @vlorentz, and I think this issue is invalid. We chose to have the source and destination nodes as part of the URI in the API. Semantically, it makes sense that accessing the path without these path fragments returns a 404: it's not a missing argument but an invalid path. If we had ?src= and &dst= arguments instead, then a 400 error would make sense, but with our URI scheme a 400 would be semantically odd.

Apr 13 2021, 7:05 PM · Easy hack, Compressed graph service
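
The 404-versus-400 argument in the T1968 comment above can be made concrete with a small sketch. The route shapes below (/path/<src>/<dst> and /path?src=...&dst=...) and the function names are hypothetical, chosen only to contrast the two designs; they are not the actual swh-graph endpoints.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Hypothetical route where src and dst are path fragments.
PATH_STYLE = re.compile(r"^/path/(?P<src>[^/]+)/(?P<dst>[^/]+)$")


def route_path_style(url):
    """src and dst are part of the URI: an incomplete path matches no
    route at all, so the natural answer is 404 (no such resource)."""
    if PATH_STYLE.match(urlsplit(url).path) is None:
        return 404
    return 200


def route_query_style(url):
    """src and dst are query arguments: the route itself exists, so a
    missing argument is a malformed request and 400 fits."""
    parts = urlsplit(url)
    if parts.path != "/path":
        return 404
    args = parse_qs(parts.query)
    if "src" not in args or "dst" not in args:
        return 400
    return 200
```

In the path-style design there is no point at which the server knows an "argument" is missing; it only sees a URI that matches no route.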

Apr 9 2021

seirl committed rDGRPH4f751998c69c: NodeIdMap: add backward compatibility for loading MPH on strings (authored by seirl).
NodeIdMap: add backward compatibility for loading MPH on strings
Apr 9 2021, 4:19 PM
seirl closed D5427: NodeIdMap: use the MPH + mmapped .order to translate SWHID -> node ID.
Apr 9 2021, 4:19 PM
seirl committed rDGRPH53bbd5c65cbe: NodeIdMap: use the MPH + mmapped .order to translate SWHID -> node ID (authored by seirl).
NodeIdMap: use the MPH + mmapped .order to translate SWHID -> node ID
Apr 9 2021, 4:19 PM
seirl updated the diff for D5427: NodeIdMap: use the MPH + mmapped .order to translate SWHID -> node ID.
  • Fix reviews
  • Add backward compatibility for loading MPH on strings
Apr 9 2021, 3:49 PM

Apr 7 2021

seirl closed T3178: document how to export the graph dataset automatically, a subtask of T1847: fully automate export of the graph dataset, as Invalid.
Apr 7 2021, 3:03 PM · Compressed graph service, Datasets
seirl closed T3178: document how to export the graph dataset automatically as Invalid.

Duplicate of T2431

Apr 7 2021, 3:03 PM · Documentation, Datasets
seirl added a subtask for T1847: fully automate export of the graph dataset: T2431: Document how to export the graph edge dataset.
Apr 7 2021, 3:03 PM · Compressed graph service, Datasets
seirl added a parent task for T2431: Document how to export the graph edge dataset: T1847: fully automate export of the graph dataset.
Apr 7 2021, 3:03 PM · Documentation, Compressed graph service, Datasets