Normalize data sent from clients to the storage
Closed, Migrated, Edits Locked
Authored By

	olasd
	Apr 4 2019, 12:31 PM

Description

These days, the data normalization step in the graph storage happens at the point where we write the data out to PostgreSQL.

This is becoming an issue now that we're considering pushing data to the journal directly when it gets inserted, as journal clients end up needing to do the normalization before consuming the data.

This also makes the "normalized data schema" dependent on the PostgreSQL implementation, instead of having a proper specification, which is problematic when considering "post-Postgres" graph storage backends.

This is not entirely a problem *now*, as the journal consumers are fully controlled by us and either process only a few bits of the data or just write back to PostgreSQL, but it is going to be a problem in the near future.

We should:

make sure the normalized data schema is (more cleanly) specified
make loaders normalize their data before sending it to the storage backend
make the journal backfiller normalize its fetched data before sending it to the journal
(plausibly) make the storage backend check that data is normalized before accepting it for storage
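
As a rough illustration of the last three points, here is a minimal sketch of client-side normalization combined with a storage-side check. Everything in it is hypothetical: the rule "text fields are stored as UTF-8 bytes" stands in for the real normalized schema, which this task says still needs to be specified, and `normalize_record`, `is_normalized` and `InMemoryStorage` are illustrative names, not existing APIs.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def normalize_value(value: Any) -> Any:
    """Return the (hypothetical) canonical form of a single value."""
    if isinstance(value, str):
        # Assumption for this sketch: text is stored as UTF-8 bytes.
        return value.encode("utf-8")
    if isinstance(value, dict):
        return normalize_record(value)
    if isinstance(value, (list, tuple)):
        # Sequences are canonically lists of normalized values.
        return [normalize_value(item) for item in value]
    return value


def normalize_record(record: Record) -> Record:
    """Normalize every value of a record, recursing into nested data."""
    return {key: normalize_value(value) for key, value in record.items()}


def is_normalized(record: Record) -> bool:
    """A record is normalized if normalizing it again changes nothing."""
    return normalize_record(record) == record


class InMemoryStorage:
    """Stand-in for a storage backend that refuses non-normalized data."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def add(self, record: Record) -> None:
        if not is_normalized(record):
            raise ValueError("record is not normalized")
        self.records.append(record)


# A loader (or the journal backfiller) normalizes before sending.
raw = {"message": "fix typo", "author": {"name": "Jane"}}
storage = InMemoryStorage()
storage.add(normalize_record(raw))
```

Because the check is just "normalizing again is a no-op", the same function serves loaders, the backfiller and the backend, and none of it depends on PostgreSQL.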

Event Timeline

olasd triaged this task as Normal priority. Apr 4 2019, 12:31 PM

olasd created this task.

olasd updated the task description.

This task has been migrated to GitLab.
