What solutions does the HN community have in dealing with this issue?
One master is picked as the final authority, and changes then flow to the other databases, replicas, etc., either via database tools or via applications.
Awareness of deadlines and of how much latency is tolerable (i.e. when the updates are needed by each downstream pipeline) is helpful.
A practical example that solves a simple case is the recent release of a tool by Netflix.
Pro tip: when you interview people for a project like this, they should already know the above. :)
Source: DBA.
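To make the "changes flow downstream" idea concrete, here is a minimal Python sketch of application-level fan-out where the most latency-sensitive targets are updated first. The downstream names and staleness budgets are invented for illustration, not taken from the comment above.

```python
import time

# Hypothetical downstream targets and the maximum staleness each one tolerates.
DOWNSTREAMS = [
    {"name": "loan_servicing_db", "max_staleness_s": 5},    # handles real money
    {"name": "public_website_db", "max_staleness_s": 300},  # can lag a few minutes
]

def apply_to_downstream(name, change):
    """Placeholder for whatever tool or application pushes the change."""
    print(f"applying {change} to {name}")

def propagate(change):
    """Push a change already committed to the master, tightest deadline first."""
    start = time.monotonic()
    for target in sorted(DOWNSTREAMS, key=lambda t: t["max_staleness_s"]):
        apply_to_downstream(target["name"], change)
        elapsed = time.monotonic() - start
        if elapsed > target["max_staleness_s"]:
            print(f"WARNING: {target['name']} updated {elapsed:.1f}s late")

propagate({"field": "mortgage_rate", "value": 3.25})
```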
Step 1) Generate a GUID to identify the change
Step 2) Send a message to all the `downstream` systems (i.e. not the Source of Truth systems) with the change description and GUID
Step 3a) Update source of truth with the new info, write the GUID to a log table in source of truth in the same transaction
Step 3b,c,d) Downstream systems poll the source of truth for existence of the GUID in the log table indicating the change has been committed to source of truth. Load latest value from source of truth and commit the change to the downstream system.
Step 4) If the GUID never shows up, the transaction was rolled back in the source of truth (or an error occurred trying to send to one of the other downstreams, aborting before the transaction) - do something relevant for your use case. Alert somebody, carry on, whatever.
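A rough sketch of that GUID handshake, with SQLite standing in for the real source of truth; the table names, polling interval, and timeout are all placeholders:

```python
import sqlite3
import time
import uuid

# SQLite stand-in for the source of truth, with a value table and a GUID log table.
source_of_truth = sqlite3.connect(":memory:")
source_of_truth.execute("CREATE TABLE rates (product TEXT PRIMARY KEY, rate REAL)")
source_of_truth.execute("CREATE TABLE change_log (guid TEXT PRIMARY KEY, applied_at REAL)")
source_of_truth.execute("INSERT INTO rates VALUES ('30yr_fixed', 3.50)")
source_of_truth.commit()

def announce_change(product, new_rate):
    """Step 2: tell the downstream systems a change is coming, identified by a GUID."""
    guid = str(uuid.uuid4())
    # In reality this payload would go over a queue or bus; here we just return it.
    return {"guid": guid, "product": product, "rate": new_rate}

def commit_to_source_of_truth(change):
    """Step 3a: update the value and write the GUID to the log in ONE transaction."""
    with source_of_truth:  # commits both statements atomically, or neither
        source_of_truth.execute(
            "UPDATE rates SET rate = ? WHERE product = ?",
            (change["rate"], change["product"]),
        )
        source_of_truth.execute(
            "INSERT INTO change_log VALUES (?, ?)", (change["guid"], time.time())
        )

def downstream_apply(change, timeout_s=30, poll_s=1):
    """Steps 3b-d / 4: poll for the GUID; if it appears, load the latest value and
    apply it locally; if it never appears, assume the change was rolled back."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        row = source_of_truth.execute(
            "SELECT 1 FROM change_log WHERE guid = ?", (change["guid"],)
        ).fetchone()
        if row:
            (rate,) = source_of_truth.execute(
                "SELECT rate FROM rates WHERE product = ?", (change["product"],)
            ).fetchone()
            print(f"downstream applying {change['product']} = {rate}")
            return True
        time.sleep(poll_s)
    print(f"GUID {change['guid']} never showed up; alerting somebody")
    return False

change = announce_change("30yr_fixed", 3.25)
commit_to_source_of_truth(change)
downstream_apply(change)
```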
Here's the high-level approach I would take:
1: Understand which databases serve what, and what the needs for accuracy are. (For example, a database serving their public-facing website can probably be a few seconds behind compared to a database handling real money.)
2: Figure out which database is the source of truth. Update that database first.
3: Update the remaining three databases in a loop: update each database, then check to make sure the update happened.
You will probably need to add some state management where you put yourself into an "updating" state, so that if the update breaks in the middle, you can go back and recover.
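Something like the following sketch, with pure-Python placeholders for the databases and a JSON file standing in for real state management:

```python
import json
import pathlib

STATE_FILE = pathlib.Path("rate_update_state.json")

def update_rate(db, new_rate):
    """Placeholder for the per-database update; returns True when applied."""
    db["rate"] = new_rate
    return True

def read_rate(db):
    """Placeholder for reading the value back to verify the update."""
    return db["rate"]

def update_all(source_of_truth, downstream_dbs, new_rate):
    # Record that we are mid-update, so a crash here can be detected and recovered.
    STATE_FILE.write_text(json.dumps({"status": "updating", "new_rate": new_rate}))

    # Step 2: update the source of truth first.
    update_rate(source_of_truth, new_rate)

    # Step 3: update the remaining databases in a loop and verify each one.
    for db in downstream_dbs:
        update_rate(db, new_rate)
        if read_rate(db) != new_rate:
            raise RuntimeError(f"update did not stick in {db['name']}")

    STATE_FILE.write_text(json.dumps({"status": "done", "new_rate": new_rate}))

dbs = [{"name": f"legacy_{i}", "rate": 3.50} for i in range(3)]
update_all({"name": "source_of_truth", "rate": 3.50}, dbs, 3.25)
```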
----
Another approach is to refactor the applications that are critical (those that handle money, not the public-facing web site) to always read from the database you decide is the source of truth.
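As a sketch, that refactor mostly amounts to routing the critical read path through a single connection; the class and stub connections below are invented for illustration:

```python
class _StubConn:
    """Stand-in for a real database connection."""
    def __init__(self, rates):
        self.rates = rates

    def fetch_rate(self, product):
        return self.rates[product]

class RateRepository:
    def __init__(self, source_of_truth_conn, legacy_conn):
        self.sot = source_of_truth_conn
        self.legacy = legacy_conn

    def current_rate(self, product):
        # Money-handling code paths read from the source of truth only.
        return self.sot.fetch_rate(product)

    def display_rate(self, product):
        # Non-critical paths (e.g. the public website) may keep reading the
        # possibly-stale legacy copy until it catches up.
        return self.legacy.fetch_rate(product)

repo = RateRepository(_StubConn({"30yr_fixed": 3.25}), _StubConn({"30yr_fixed": 3.50}))
print(repo.current_rate("30yr_fixed"), repo.display_rate("30yr_fixed"))
```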
If it happens infrequently enough, you can simply have a procedure, "change mortgage rate", that touches the 4 legacy databases. Done.
If it does need a technical solution, you have to establish the leading system / source of truth / whatever you call it. This can be either an existing system or a new one. And then all the others need to follow the leading system.
A more distributed approach (each of the 4 legacy systems can generate events that the other 3 work with) likely won't work with 4 legacy systems. It's often hard enough to get high-fidelity events/deltas out of a single system.
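A minimal sketch of "everyone follows the leading system", assuming the leader can expose an append-only change log with increasing ids (SQLite and the schema below are placeholders):

```python
import sqlite3

leader = sqlite3.connect(":memory:")
leader.execute(
    "CREATE TABLE change_log (id INTEGER PRIMARY KEY AUTOINCREMENT, product TEXT, rate REAL)"
)

def record_change(product, rate):
    """Every change to the leading system also lands in its change log."""
    with leader:
        leader.execute("INSERT INTO change_log (product, rate) VALUES (?, ?)", (product, rate))

class Follower:
    """A following system that remembers the last change id it applied."""
    def __init__(self, name):
        self.name = name
        self.last_applied_id = 0
        self.rates = {}

    def catch_up(self):
        rows = leader.execute(
            "SELECT id, product, rate FROM change_log WHERE id > ? ORDER BY id",
            (self.last_applied_id,),
        ).fetchall()
        for change_id, product, rate in rows:
            self.rates[product] = rate        # apply the delta locally
            self.last_applied_id = change_id  # remember our position in the log

record_change("30yr_fixed", 3.25)
f = Follower("legacy_1")
f.catch_up()
print(f.rates)
```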
There’s a trade-off between having a manual vs. automated process in the best of circumstances. You might think intuitively that an automated process would be easier and more reliable, but depending on local factors for each of the databases, there might be issues like availability or locking that could make an automated solution more “fragile” than simply having an employee with domain expertise doing an update on a schedule. Reliably automating a process like this would require a lot of careful QA.
So throwing legacy systems into the mix would just aggravate this further. It may be that the cost/benefit of automating this just isn’t there. If it’s cheaper in terms of salary cost to do it by hand for N years than the one-time expense to automate, it may just not make sense to automate. (Obviously over an indefinite time scale it would almost always make sense to automate)
The standard thing to do is to have a source of truth, as described in various comments here like https://news.ycombinator.com/reply?id=22690991&goto=item%3Fi...
Sometimes you'll have multiple sources of truth for (hopefully non-overlapping) subsets of the data in question.
These things are messy. You have to do the hard work of writing sync tools. You want sync tools that work off of incremental / delta feeds, but you also need (slower) tools that can make-it-right based on the totality of your data -- at least the first time you sync two systems, and often also for auditing purposes to make sure that the incremental (fast) sync is working correctly.
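For the make-it-right pass, the core is just a full comparison of the two data sets; this toy version uses dicts as stand-ins for whatever you actually dump out of each database:

```python
def reconcile(source_rows, target_rows):
    """Return the writes needed to make the target match the source."""
    fixes = []
    for key, value in source_rows.items():
        if target_rows.get(key) != value:
            fixes.append(("upsert", key, value))
    for key in target_rows.keys() - source_rows.keys():
        fixes.append(("delete", key))
    return fixes

source = {"30yr_fixed": 3.25, "15yr_fixed": 2.75}
target = {"30yr_fixed": 3.50, "5yr_arm": 2.10}
print(reconcile(source, target))
# [('upsert', '30yr_fixed', 3.25), ('upsert', '15yr_fixed', 2.75), ('delete', '5yr_arm')]
```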
We had a prime app, driven by a fifth DB. It had scripts to pull rates and a UI to adjust them, and it logged all changes.
These rates were pulled by the other systems (e.g. the AS400 pulled over FTP). Additional internal apps could query over SMB and HTTP.
The prime app updated slowly, and each of the services would poll it frequently (every 4 minutes) plus on demand when needed.
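Roughly, each consumer ends up running something like this poll-plus-on-demand loop; the fetch function and the 4-minute interval are placeholders, not the actual setup described above:

```python
import threading
import time

class RateCache:
    def __init__(self, fetch, interval_s=240):
        self.fetch = fetch          # callable that pulls the latest rates
        self.interval_s = interval_s
        self.rates = {}
        self._wake = threading.Event()

    def refresh_now(self):
        """On-demand refresh, e.g. triggered right after a known rate change."""
        self._wake.set()

    def run(self):
        while True:
            self.rates = self.fetch()
            # Sleep until the next scheduled poll, or earlier if poked.
            self._wake.wait(timeout=self.interval_s)
            self._wake.clear()

cache = RateCache(fetch=lambda: {"30yr_fixed": 3.25})
threading.Thread(target=cache.run, daemon=True).start()
cache.refresh_now()
time.sleep(0.1)
print(cache.rates)
```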
2) Then once this is standardized, write a for loop.
A little planned downtime goes a long way.
Name the bank so we know not to patronize it.