The idea is that every write to your major entities would take the form of a "message" row, inserted into a table (you could have multiple message tables for different kinds of messages). You'd eagerly record events, append-only, and that would be all your persisted data. Then concrete things like "the current status of user accounts" would be in a materialized view over the raw message sequence (or if it turned out to be more performant, you could compute/cache this data in any number of other downstream places; services, redis, etc)
Obviously performance could be a challenge, especially as the message tables get large, which is the only problem I can really think of. But the advantages would be:
- Full history of all DB states (for auditing, recovery, etc)
- Eventual-consistency gets really simple, you just merge message histories
- Schema changes, etc become less of a headache. Your source of truth is always just "what has happened", and you can always add to or reinterpret that however you want. Data integrity should basically become a non-issue if you're never overwriting or rewriting history
I'm also wondering if you could periodically "squash" the messages as needed to preserve performance. I.e. if an account changes to one state and then back to the previous state, remove the previous messages and only keep what's needed to create the latest state. You could decide that eg. you only want the full history for the past 30 days and squash everything before that on an interval
I've definitely seen this used as a programming model, just not at the DB level. It's also roughly how git works
I think Datomic does sorta work this way, but it's unusual in a bunch of other ways too; seems like you could do this with a regular DB
Anybody tried this or heard of it being done?
He basically explores similar idea, you have a log to persist and materialised views to query.
Basically, for every table "things"
(id INT, name VARCHAR, ...)
you'd also have a "things_history"
(operation VARCHAR, id INT, name VARCHAR, ...)
This is basically a slight inverse of your model, but pretty much the same implementation: your "materialized views" are actually the target tables, and auto-maintained tables give you the source of truth for audit etc.
Pretty cool setup. Once again, not the same, but interesting