kill swh-storage-testdata [was: make swh-storage-testdata a python package]
Closed, Resolved. Public

Description

so that we no longer rely on these awful '../[...]/../swh-storage-testdata' paths everywhere, and to make CI simpler

Event Timeline

douardda created this task. Oct 11 2018, 5:18 PM
douardda triaged this task as High priority.

in fact, let's kill this blob repo; in there we have:

git-repos

from which only example-submodule.fast-export.xz (1.1k) seems to be used nowadays, in swh-loader-git.

So let's put this file back there.
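
A minimal sketch of how a test could read that file once it lives inside the package again, assuming it ends up in a `resources` directory next to the tests. The directory name and the helper `open_test_resource` are hypothetical, not the actual swh-loader-git layout:

```python
import lzma
from pathlib import Path


def open_test_resource(tests_dir, name):
    """Open an xz-compressed test resource shipped next to the tests.

    `tests_dir` is the directory holding the test modules; the
    `resources` subdirectory name is an assumption for illustration.
    """
    path = Path(tests_dir) / "resources" / name
    # The path stays inside the package: no '../' hops to a sibling repo.
    return lzma.open(path, "rb")
```

From a test module this would be called as `open_test_resource(Path(__file__).parent, "example-submodule.fast-export.xz")`.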

dumps

in which we have:

  • swh-archiver.{dump,sql} used by swh-archiver
  • swh-indexer: idem
  • swh-scheduler-updater and swh-scheduler for swh-scheduler
  • swh for swh-storage

also note this interesting fact:

 ls -l dumps
total 468
-rw-r--r-- 1 ddouard ddouard  12361 Oct 12 10:28 swh-archiver.dump
-rw-r--r-- 1 ddouard ddouard   8085 Oct 12 10:28 swh-archiver.sql
-rw-r--r-- 1 ddouard ddouard  56528 Oct 12 10:28 swh-indexer.dump
-rw-r--r-- 1 ddouard ddouard  43397 Oct 12 10:28 swh-indexer.sql
-rw-r--r-- 1 ddouard ddouard   8356 Oct 12 10:28 swh-scheduler-updater.dump
-rw-r--r-- 1 ddouard ddouard   5410 Oct 12 10:28 swh-scheduler-updater.sql
-rw-r--r-- 1 ddouard ddouard  42404 Oct 12 10:28 swh-scheduler.dump
-rw-r--r-- 1 ddouard ddouard  33205 Oct 12 10:28 swh-scheduler.sql
-rw-r--r-- 1 ddouard ddouard 141124 Oct 12 10:28 swh.dump
-rw-r--r-- 1 ddouard ddouard 103131 Oct 12 10:28 swh.sql

yes, .dump files are heavier than .sql ones. So let's keep only the .sql ones and put them where they belong (i.e. in their respective python packages)!
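
To put a number on it, here is a small sketch that totals the sizes from the `ls -l dumps` listing above. The figures are copied from that listing; the helper names are made up for illustration:

```python
# Sizes in bytes, copied from the `ls -l dumps` listing.
SIZES = {
    "swh-archiver": {"dump": 12361, "sql": 8085},
    "swh-indexer": {"dump": 56528, "sql": 43397},
    "swh-scheduler-updater": {"dump": 8356, "sql": 5410},
    "swh-scheduler": {"dump": 42404, "sql": 33205},
    "swh": {"dump": 141124, "sql": 103131},
}


def total(ext):
    """Sum the sizes of all files with the given extension ('dump' or 'sql')."""
    return sum(entry[ext] for entry in SIZES.values())


def saved_bytes():
    """Bytes no longer shipped if only the .sql files are kept."""
    return total("dump") - total("sql")


if __name__ == "__main__":
    print(total("dump"), total("sql"), saved_bytes())
```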

dir-folders

in which the single file, dir-folders/sample-folder.tgz (4k), is only used by swh-model.

objects

which I haven't found anything using in the current code base. Did I miss something?

douardda renamed this task from make swh-storage-testdata a python package to kill swh-storage-testdata [was: make swh-storage-testdata a python package]. Oct 12 2018, 11:09 AM

For your information, we recently moved the loader-svn and loader-tar test data out of this repository.
So yes, this sounds reasonable.

Cheers,

Its main use is the sql generation for our multiple modules using db in their internals (storage, indexer, archiver, scheduler, scheduler-updater)...
See the root Makefile of that repository.

yes, I'm aware of this logic in the Makefile. Not sure yet what's the best thing to do with it. Move it up into swh-development?

zack added a subscriber: zack. Oct 12 2018, 2:07 PM

Yeah, it was backward anyway to have the list of reverse-dependent modules listed in swh-storage-testdata/Makefile.
It should be in swh-environment, either as centralized logic or as a target in the common Makefile that the modules that need it can call.

douardda added a comment. Oct 12 2018, 2:28 PM (edited)

well, it will even be simpler than that: let's get rid of those "dumps" and fix the db test fixture to build the db on the fly from the sql files found in the swh-<name>/sql directory.
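
A rough sketch of that idea, not the actual fixture: run every `*.sql` file found in a directory against a fresh database connection. Here `sqlite3` stands in for PostgreSQL so the snippet runs without a server, and both the helper name `init_db_from_sql` and the "apply files in filename order" rule are assumptions:

```python
import sqlite3
from pathlib import Path


def init_db_from_sql(conn, sql_dir):
    """Run every *.sql file of `sql_dir` against `conn`, in filename order.

    Assumption: sorting by filename gives a valid order (schema first,
    then data). `conn` is a sqlite3 connection, used here only as a
    stand-in for the real PostgreSQL test database.
    """
    loaded = []
    for sql_file in sorted(Path(sql_dir).glob("*.sql")):
        # executescript() runs all the statements contained in the file.
        conn.executescript(sql_file.read_text())
        loaded.append(sql_file.name)
    conn.commit()
    return loaded
```

A test fixture would call this helper in its setup step on an empty database, so no pre-built dump has to be stored anywhere.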

sounds awesome ;)

there is a good chance that this series of diffs breaks things in the doc generation or in some (test) tools at the swh-environment level... These will be checked and fixed as soon as this series is considered ready to be merged.

douardda closed this task as Resolved. Oct 23 2018, 11:24 AM