Paste P656

t2310 irc discussion

Authored by ardumont on Apr 30 2020, 12:02 PM.
12:25 <+olasd> ardumont: you should even do this migration the other way around. 1/ create the new table, empty; 2/ deploy new storage that fills both tables in parallel; 3/ run the migration process for origin visits that don't have any state information
12:26 <+olasd> 4/ switch over the read queries to the new tables; 5/ drop fields from the old table
12:26 <+olasd> (#showerthoughts)
12:27 <+olasd> that way, the only downtime needed is during #2 which should be minimal; it should even be doable on the fly as the RPC api is not changing at all
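Editor's sketch of the five-step plan above, using an in-memory SQLite database so it can be run anywhere. The column layout and all function names here are assumptions made for illustration; they are not taken from swh-storage, and the production database is PostgreSQL. Step 5 (dropping the fields from the old table) is left out.

```python
import sqlite3


def create_tables(conn):
    """Old table plus step 1: the new table, created empty (hypothetical schema)."""
    conn.executescript(
        """
        CREATE TABLE origin_visit (
            origin TEXT, visit INTEGER, date TEXT, status TEXT, snapshot TEXT,
            PRIMARY KEY (origin, visit)
        );
        CREATE TABLE origin_visit_status (
            origin TEXT, visit INTEGER, date TEXT, status TEXT, snapshot TEXT,
            PRIMARY KEY (origin, visit, date)
        );
        """
    )


def write_visit(conn, origin, visit, date, status, snapshot):
    """Step 2: the storage writes to both tables in one transaction."""
    row = (origin, visit, date, status, snapshot)
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO origin_visit VALUES (?, ?, ?, ?, ?)", row)
        conn.execute("INSERT INTO origin_visit_status VALUES (?, ?, ?, ?, ?)", row)


def backfill_missing_statuses(conn):
    """Step 3: copy visits that have no status row yet; returns rows added."""
    count_sql = "SELECT COUNT(*) FROM origin_visit_status"
    before = conn.execute(count_sql).fetchone()[0]
    with conn:
        conn.execute(
            """
            INSERT INTO origin_visit_status (origin, visit, date, status, snapshot)
            SELECT ov.origin, ov.visit, ov.date, ov.status, ov.snapshot
            FROM origin_visit AS ov
            WHERE NOT EXISTS (
                SELECT 1 FROM origin_visit_status AS ovs
                WHERE ovs.origin = ov.origin AND ovs.visit = ov.visit
            )
            """
        )
    return conn.execute(count_sql).fetchone()[0] - before


def read_latest_status(conn, origin, visit):
    """Step 4: read queries switch over to the new table."""
    return conn.execute(
        """
        SELECT status, snapshot FROM origin_visit_status
        WHERE origin = ? AND visit = ?
        ORDER BY date DESC LIMIT 1
        """,
        (origin, visit),
    ).fetchone()
```

Because the backfill only touches visits with no status row, it can be re-run safely while dual writes are already live, which is what keeps the downtime limited to the deployment in step 2.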
12:28 <+ardumont> for 2. i need to check the code, i think we stop writing to origin visit and only write to origin visit status
12:28 <+ardumont> check the code first*
12:29 <+olasd> well that's not good; I don't trust that the read queries will behave properly on the new table with all the data, so I think we need to live with both in parallel until we figure out whether the new schema is good enough
12:29 <+olasd> (performance wise, notably for origins with lots of visits like the debian / pypi / npm origins)
12:29 <+ardumont> olasd: ok, will check and update those back then (if that's indeed the case, and i do think it is)
12:29 <+ardumont> vlorentz: ^
12:30 <+ardumont> anlambert[m]: tbc, yes, pin the storage if you can to the latest tag (v0.0.188)
12:31 <vlorentz> olasd: "I don't trust that the read queries will behave properly on the new table with all the data," why not?
12:31 <+anlambert[m]> ardumont: already done, back on track
12:31 <+ardumont> thx
12:33 <+olasd> vlorentz: because it's triplicating an already large set of data
12:35 <vlorentz> so ardumont has to revert two thirds of D2938 ?
12:35 -- Notice(swhbot): D2938 (author: ardumont, Closed) on swh-storage: pg-storage: Adapt internal implementations to use origin visit status model representation <https://forge.softwareheritage.org/D2938>
12:35 <+ardumont> i was just realizing that and reflecting on it
12:35 <+olasd> well, I'm uneasy halting the world for a week while the data is being moved and we figure out whether the new stuff works at all
12:36 <vlorentz> what about staging?
12:36 <vlorentz> I mean, use it to test
12:36 <+ardumont> unfortunately the dataset on staging is way smaller
12:36 <+olasd> staging doesn't have 0.0001% of the production data
12:37 <vlorentz> hmm
12:37 <+olasd> it's a smoke test
12:37 <+olasd> not much more
12:37 <vlorentz> can we duplicate the replica on azure, and use it like a staging with all data?
12:37 <+olasd> no
12:37 <vlorentz> too expensive?
12:38 <+olasd> too expensive, and postgresql on azure is useless
12:38 <+olasd> if you're confident doing the full migration in one go, I won't block you too hard
12:38 <+olasd> but I'm dubious
12:39 <vlorentz> I'm just trying to find a better way
12:40 <+ardumont> sure, thx
12:40 <+olasd> to minimize the amount of data to move around, it should be possible to turn the current origin_visit table into origin_visit_status, and recreate a new origin_visit table from scratch
12:40 <+olasd> (rather than going the other way around)
12:41 <+olasd> no idea how that will play out with the replication
12:42 <vlorentz> sounds risky
12:43 <+olasd> how so?
12:43 <+olasd> it's just adding one new field duplicating the current date field
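Editor's sketch of the rename-in-place idea, again on SQLite for portability. It is one possible reading of the suggestion: the existing table keeps all its rows under the new name, and a slim origin_visit is rebuilt with the visit date copied over. The log does not say which table would carry the duplicated date field, so the column choice and the function name are assumptions.

```python
import sqlite3


def swap_tables(conn):
    """Rename the big table in place, then rebuild a slim origin_visit."""
    with conn:
        # The bulk of the data stays where it is; only the name changes.
        conn.execute("ALTER TABLE origin_visit RENAME TO origin_visit_status")
        # Hypothetical slim schema for the recreated table.
        conn.execute(
            """
            CREATE TABLE origin_visit (
                origin TEXT, visit INTEGER, date TEXT,
                PRIMARY KEY (origin, visit)
            )
            """
        )
        # The date ends up duplicated across both tables.
        conn.execute(
            """
            INSERT INTO origin_visit (origin, visit, date)
            SELECT origin, visit, date FROM origin_visit_status
            """
        )
```

Only the narrow (origin, visit, date) rows are copied, which is the point of going in this direction. How a rename interacts with replication is not covered here and stays an open question, as noted in the log.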
12:46 <+ardumont> for the dump idea, wondering if we can't make a dump out of origin/origin_visit from prod, restore it to staging
12:47 <+ardumont> then trigger some more load after that and check what happens in the staging webapp
12:47 <+ardumont> (it'll be missing some snapshots though)
12:47 <+olasd> do you have 200GB free in the staging postgres?
12:48 <+olasd> I mean, that's probably a decent middle ground
12:48 <+ardumont> not immediately but that can be done
12:49 <+olasd> vlorentz: the current proposed migration is risky too, in the sense that it's pretty much a one-way deal once workers start writing /only/ to the new tables
12:49 <+olasd> s/tables/table/
12:50 <+ardumont> indeed
12:50 <+ardumont> tbc it's been quite a while since i grew weary of that migration
12:50 <vlorentz> <olasd> how so?
12:50 <vlorentz> I meant renaming the table
12:50 <vlorentz> ardumont's proposal sounds good
12:51 <+ardumont> (weary and wary both :)
12:55 <+ardumont> i'll check that then
12:55 <+ardumont> increasing db0 (staging) fs
12:56 <+ardumont> dumping out of somerset some tables (origin, origin-visit)
12:56 <+ardumont> (somerset looks less loaded)
12:56 <+olasd> please use belvedere
12:56 <+ardumont> ah lol, checked the wrong machine, thx
12:56 <+olasd> it's faster
12:57 <+ardumont> even better
12:57 <+olasd> (and it's not being used by the public frontend)
12:58 <+ardumont> righty right