Up to the 2021 version of the dataset, the Java source of the custom code used to, e.g., find the earliest occurrence of a license blob was shipped as part of the dataset, in a java/ subdir.
This seems to be gone from the 2022 version.
Ideally we should add it back. Alternatively, we can point to the corresponding code in swh-graph, but that would make the replication package a bit less useful on its own.
Description
Event Timeline
The replication/05-earliest-revision.sh script in the replication package names the swh-graph version it uses and the fully qualified class name, so the class can be found in the swh-graph code.
"but that would make the replication package a bit less useful on its own"
The old EarliestRevision.java was not useful on its own either, because it does not compile with the current version of swh-graph.
Future versions will be generated using only code in swh-graph (bash glue code replaced by Python code, some of which shells out to bash for simplicity), so the replication package will simply be replaced by a swh-graph tag.