The tables archives and content_archive should be initialized with, respectively, the archive servers and the content. content_archive should be populated so that content already copied to Banco is marked as 'present'.
Description
Status | Assigned | Task
---|---|---
Migrated | gitlab-migration | T239 preserve at least 2 copies of each content object
Migrated | gitlab-migration | T240 content archiver
Migrated | gitlab-migration | T482 First swh-storage-archiver run to catch up uffizi
Migrated | gitlab-migration | T412 Bootstrap archiver's database
Migrated | gitlab-migration | T484 List banco's current sha1s for injection in archiver db
Event Timeline
Running on banco:
    INSERT INTO archives(id, url)
    VALUES ('Banco', 'http://banco.softwareheritage.org:5003/');

    begin;

    -- prepare data
    CREATE TABLE content_archive_tmp (
        content_id sha1 REFERENCES content(sha1),
        PRIMARY KEY (content_id)
    );

    \copy content_archive_tmp from 'content-id-by-ctime.after-T7.txt';

    -- insert into the real production table
    insert into content_archive (content_id, archive_id, status, mtime)
    select content_id, 'Banco', 'present',
           '2016-02-04 14:19:59.000000000 +0000'::timestamptz
    from content_archive_tmp;

    -- drop temporary table
    drop table content_archive_tmp;
Note:
- the previous snippet may need changes after the \copy instruction (it has not been tested yet).
- the timestamp used for the insert is the modification time of the .txt file holding the list of sha1s we inject
- the 'archives' table should be renamed to 'archive' to match our naming convention
- archive_id is TEXT; I'm uneasy with that since it is repeated for every content we have (that's quite a lot of repetition), and a simple integer should be enough -> I haven't measured the impact on the archiver code yet, though
- status is also TEXT and could be replaced with an enum or something
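For reference, the note's timestamp can be derived programmatically from the list file; a minimal sketch in Python (the path is the list file from the snippet above and only exists on the machine holding it):

```python
import os
from datetime import datetime, timezone

def file_mtime_utc(path):
    """Return a file's modification time as an aware UTC datetime,
    i.e. the kind of value used for content_archive.mtime above."""
    return datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)

# Example (path from the snippet; adjust locally):
# file_mtime_utc('content-id-by-ctime.after-T7.txt')
```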
This is a failure for now.
After multiple attempts over the weekend, there is not enough disk space for the process to finish.
I see 2 ways to improve this:
- grow /srv/softwareheritage/postgres on prado.
- create the archiver's db on /srv/softwareheritage/postgres-hdd (like some other dbs we keep on the side).
If I understand the hardware correctly, the postgres partition is on SSD and the postgres-hdd partition is on standard (slower) spinning disk.
I'm more inclined toward option 2 because:
- I don't think a slightly slower archiver db would be much of a problem.
- if it doesn't work out, we can always migrate back to option 1
- I have multiple blocking points for option 1 (How much is it reasonable to grow the partition? Do we even have the resources to do so? Also, @olasd showed me how to grow the partition but I don't remember since I did not try...)
'status' is of type archive_status, which is already an enum. I guess Postgres does the right thing internally (storing enum values as integers).
    CREATE TYPE archive_status AS ENUM (
        'missing',
        'ongoing',
        'present'
    );
archive_id is a foreign key. I assumed Postgres would nicely do the job for us. If that's not the case, we clearly need to change it.
When contents are missing, the date is not relevant, so any value will do.
- I totally agree with the archives > archive change.
- As we said on IRC, the foreign key from content_archive.id to content.sha1 makes the creation of a single archiver db quite awkward.
'status' is of type archive_status, which is already an enum. I guess Postgres does the right thing internally (storing enum values as integers).
Yep, sorry, I did not change my remark.
I saw this this morning when I took a closer look.
Good news, then: one less change to do ^^
archive_id is a foreign key. I assumed Postgres would nicely do the job for us. If that's not the case, we clearly need to change it.
I do hope so as well ^^
When contents are missing, the date is not relevant, so any value will do.
ok
I totally agree with the archives > archive change.
ok
As we said on IRC, the foreign key from content_archive.id to content.sha1 makes the creation of a single archiver db quite awkward.
Indeed, we'll wait until we have some more space then ^^.
As we said on IRC, the foreign key from content_archive.id to content.sha1 makes the creation of a single archiver db quite awkward.
As discussed on IRC, we could, as a first approximation, drop this constraint since our identifiers are quite stable (we never delete anything).
Indeed, we'll wait until we have some more space then ^^.
Not necessarily; we need to discuss this, but we could leverage our queue system (or, in a possible future, Kafka) to notify that some new contents have been added, and then update the archiver db from those notifications.
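A rough sketch of that idea, purely illustrative (the notification shape and the function name are made up, not taken from the actual swh code): each "contents added" message from the queue would translate into 'missing' rows for every archive, which the archiver then picks up and copies.

```python
# Hypothetical sketch: turn a "contents added" notification into rows for
# the archiver db. The payload shape and names here are assumptions.

def rows_from_notification(notification, archive_ids):
    """New contents start out 'missing' on each archive, so the archiver
    knows it still has to copy them."""
    return [
        (content_id, archive_id, 'missing')
        for content_id in notification['added']
        for archive_id in archive_ids
    ]

rows = rows_from_notification({'added': ['c1', 'c2']}, ['Banco'])
# Each tuple maps to INSERT INTO content_archive(content_id, archive_id, status).
```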
Otherwise, regarding space: T486 is in progress, and T413 as well (or so I gather from @rdicosmo's email).
By moving softwareheritage-log from the SSD to the HDD (T487), we reclaimed 1.1T on the SSD (which was the initial blocking point).
So now we can try to inject the archiver's bootstrap data again and finally... run it ^^
For info, I ran another failed attempt yesterday (Saturday, the 16th of July 2016).
It stopped before finishing.
Trying the following:
    CREATE TABLE content_archive (
        content_id sha1 REFERENCES content(sha1),
        archive_id archive_id default 'Banco' REFERENCES archives(id),
        status archive_status default 'present',
        mtime timestamptz default '2016-02-04 14:19:59.000000000 +0000'::timestamptz,
        PRIMARY KEY (content_id, archive_id)
    );

    \copy content_archive(content_id) from '/srv/storage/space/lists/todb';
Note: the default values are temporary, just for the duration of the bootstrap.
and to effectively load the data:
    ardumont@prado:/srv/storage/space/lists$ mkfifo todb
    ardumont@prado:/srv/storage/space/lists$ pv content-id-by-ctime.after-T7.txt.gz -s 70g -e -a -t | pigz -dc > todb
It failed at around 500M lines and was stopped for lack of space yet again.
Looks like, for now, the only way is to store this in the other cluster (rotating HDDs, so another mount point).
That effectively means dropping the foreign key constraint on content_id.
And then improving the swh.storage.storage.content_add API function with some way of notifying that new contents were added.
Also, regarding the db: softwareheritage-archiver has been created with the following schema.
First run (TL;DR: too slow, so stopped)
As of Monday the 18th, a data-injection process was running on prado in a tmux session (under ardumont).
It was too slow, so I stopped it just now.
The idea was to use direct injection into an altered content_archive table with default values.
Done: 447,951,000 rows in ~24h.
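For scale, a back-of-the-envelope computation (the ~571M total is the planner's row estimate quoted further down in this thread):

```python
# Throughput of the stopped first run: ~447.95M rows in roughly 24 hours.
rows_done = 447_951_000
elapsed_s = 24 * 3600
rate = rows_done / elapsed_s              # ~5,185 rows per second

# Assuming ~571M rows total (the planner's estimate), the full load would
# have taken well over a day at this pace.
total_rows = 571_246_104
eta_hours = total_rows / rate / 3600      # ~30.6 hours
```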
As postgres user, using psql on softwareheritage-archiver:
    begin;

    DROP TABLE content_archive;

    CREATE TABLE content_archive (
        content_id sha1,
        archive_id archive_id default 'Banco' REFERENCES archive(id),
        status archive_status default 'present',
        mtime timestamptz default '2016-02-04 14:19:59.000000000 +0000'::timestamptz,
        PRIMARY KEY (content_id, archive_id)
    );

    COPY content_archive(content_id) from '/var/lib/postgres/todb';

    -- Alter content_archive to remove default values
    -- ... (yet to be determined)

    commit;

This must have been too slow because of the index being maintained during the load, and maybe the default-value handling...
Second run (so far so good)
As postgres user, using psql on softwareheritage-archiver:
    CREATE TABLE content_archive_tmp (content_id sha1);

    COPY content_archive_tmp(content_id) from '/var/lib/postgres/todb';

    INSERT INTO content_archive(content_id, archive_id, status, mtime)
        SELECT content_id, 'Banco', 'present',
               '2016-02-04 14:19:59.000000000 +0000'::timestamptz
        FROM content_archive_tmp;

    DROP TABLE content_archive_tmp;
And in another tmux pane:
    postgres@prado:~$ pv /srv/storage/space/lists/content-id-by-ctime.after-T7.txt.gz -s 70g -e -a -t | pigz -dc > todb
This goes way faster for now (for the pure copy at least):
    postgres@prado:~$ pv /srv/storage/space/lists/content-id-by-ctime.after-T7.txt.gz -s 70g -e -a -t | pigz -dc > todb
    0:16:12 [15.3MiB/s] ETA 1:02:04

    softwareheritage-archiver=# explain select count(*) from content_archive_tmp;
                                          QUERY PLAN
    ---------------------------------------------------------------------------------------
     Aggregate  (cost=11340915.30..11340915.31 rows=1 width=0)
       ->  Seq Scan on content_archive_tmp  (cost=0.00..9912800.04 rows=571246104 width=0)
    (2 rows)
w00t
According to the documentation, to defer a constraint, said constraint must first be declared DEFERRABLE (which is not the default).
So, first, change those constraints in the original table:
    DROP TABLE content_archive;

    -- make each constraint DEFERRABLE (they are not by default)
    CREATE TABLE content_archive (
        content_id sha1,
        archive_id archive_id REFERENCES archive(id) DEFERRABLE,
        status archive_status,
        mtime timestamptz,
        PRIMARY KEY (content_id, archive_id) DEFERRABLE
    );
Then, only inside a transaction can we adjust those constraints adequately:
    BEGIN;

    SET CONSTRAINTS content_archive_pkey DEFERRED;
    SET CONSTRAINTS content_archive_archive_id_fkey DEFERRED;

    INSERT INTO content_archive(content_id, archive_id, status, mtime)
        SELECT content_id, 'Banco', 'present',
               '2016-02-04 14:19:59.000000000 +0000'::timestamptz
        FROM content_archive_tmp;

    -- revert back to default
    SET CONSTRAINTS content_archive_pkey IMMEDIATE;
    SET CONSTRAINTS content_archive_archive_id_fkey IMMEDIATE;

    COMMIT;
Note:
The constraint names were found using '\d content_archive'.
    softwareheritage-archiver=# \d content_archive
            Table "public.content_archive"
       Column   |           Type           | Modifiers
    ------------+--------------------------+-----------
     content_id | sha1                     | not null
     archive_id | archive_id               | not null
     status     | archive_status           |
     mtime      | timestamp with time zone |
    Indexes:
        "content_archive_pkey" PRIMARY KEY, btree (content_id, archive_id) DEFERRABLE
    Foreign-key constraints:
        "content_archive_archive_id_fkey" FOREIGN KEY (archive_id) REFERENCES archive(id) DEFERRABLE
OK, so: status update. Faster, but still too slow.
About one third done in ~24h or so.
The first part of the copy was a happy moment.
But the insert part after that is not the way to go.
Of course, I did not find the right documentation; hat tip to @zack for noticing my misguided ways.
So effectively, we must:
- use COPY all the way
- drop indexes and constraints altogether
- either keep the default values in the table being populated (we chose that), or rework the inputs (what's read from the todb fifo) to add the missing values
- create the index and constraints after that (it's faster that way according to the docs)
So here it goes:
    DROP TABLE content_archive_tmp;
    DROP TABLE content_archive;

    CREATE TABLE content_archive (
        content_id sha1,
        archive_id archive_id default 'banco',  -- REFERENCES archive(id),
        status archive_status default 'present',
        mtime timestamptz default '2016-02-04 14:19:59.000000000 +0000'::timestamptz
        -- PRIMARY KEY (content_id, archive_id)
    );

    COPY content_archive(content_id) from '/var/lib/postgres/todb';
And on the side still:
    postgres@prado:~$ mkfifo todb
    postgres@prado:~$ pigz -dc /srv/storage/space/lists/content-id-by-ctime.after-T7.txt.gz | pv --progress --timer --eta --rate --average-rate --size 70g > todb
And it's done.
It took around 45 minutes; awesome.
So, next step: recreate the proper setup on the table (drop the temporary defaults, add back the index and constraints):
    \timing

    ALTER TABLE content_archive ALTER COLUMN archive_id DROP DEFAULT;
    ALTER TABLE content_archive ALTER COLUMN status DROP DEFAULT;
    ALTER TABLE content_archive ALTER COLUMN mtime DROP DEFAULT;
    ALTER TABLE content_archive ADD PRIMARY KEY(content_id, archive_id);
    ALTER TABLE content_archive ADD FOREIGN KEY(archive_id) REFERENCES archive(id);
still running...
Primary key done (it took ~7 hours).
    softwareheritage-archiver=# ALTER TABLE content_archive ALTER COLUMN archive_id DROP DEFAULT;
    ALTER TABLE
    Time: 78.389 ms
    softwareheritage-archiver=# ALTER TABLE content_archive ALTER COLUMN status DROP DEFAULT;
    ALTER TABLE
    Time: 2.648 ms
    softwareheritage-archiver=# ALTER TABLE content_archive ALTER COLUMN mtime DROP DEFAULT;
    ALTER TABLE
    Time: 4.029 ms
    softwareheritage-archiver=# alter table content_archive add primary key(content_id, archive_id);
    ALTER TABLE
    Time: 25526460.506 ms
En route for the foreign key... ^^
And we are done
    softwareheritage-archiver=# ALTER TABLE content_archive ADD FOREIGN KEY(archive_id) REFERENCES archive(id);
    ALTER TABLE
    Time: 593917.431 ms
^^
To sum up:
    softwareheritage-archiver=# \d+
                                List of relations
     Schema |      Name       | Type  |   Owner    |  Size  |      Description
    --------+-----------------+-------+------------+--------+------------------------
     public | archive         | table | postgres   | 16 kB  |
     public | content_archive | table | postgres   | 107 GB |
     public | dbversion       | table | swhstorage | 16 kB  | Schema update tracking
    (3 rows)

    softwareheritage-archiver=# \di+
                                       List of relations
     Schema |         Name         | Type  |   Owner    |      Table      | Size  | Description
    --------+----------------------+-------+------------+-----------------+-------+-------------
     public | archive_pkey         | index | postgres   | archive         | 16 kB |
     public | content_archive_pkey | index | postgres   | content_archive | 78 GB |
     public | dbversion_pkey       | index | swhstorage | dbversion       | 16 kB |
    (3 rows)

    softwareheritage-archiver=# \d+ content_archive;
                             Table "public.content_archive"
       Column   |           Type           | Modifiers | Storage  | Stats target | Description
    ------------+--------------------------+-----------+----------+--------------+-------------
     content_id | sha1                     | not null  | extended |              |
     archive_id | archive_id               | not null  | plain    |              |
     status     | archive_status           |           | plain    |              |
     mtime      | timestamp with time zone |           | plain    |              |
    Indexes:
        "content_archive_pkey" PRIMARY KEY, btree (content_id, archive_id)
    Foreign-key constraints:
        "content_archive_archive_id_fkey" FOREIGN KEY (archive_id) REFERENCES archive(id)
Status:
- contents present on uffizi and missing on banco have been added to softwareheritage-archiver.content_archive
- the index on content_archive is (still) being built.
    create unique index concurrently content_archive_pk on content_archive(content_id);
For information, I did not clean up softwareheritage.content_archive_tmp (just in case).
The list of contents present on uffizi but not on banco is stored at /srv/storage/space/lists/content-id-uffizi-not-on-banco.txt.gz