Four Ways to Prepare For Active/Active Replication
What is Active/Active?
Active/active replication is a setup in which multiple databases with the same application tables operate independently with changes synchronized between them. Geographical replication is a popular use case for active/active replication, but it may also be of interest for migrations e.g. on-premises to Cloud migrations. An active/active replication setup may only involve two active databases but is certainly not restricted two.
In a rare case active/active replication can be implemented by simply running change data capture and integration on all databases in the environment, and sending the changes to all other databases. In most cases however this strategy leads to lots of data collisions (e.g. duplicate primary keys) that can be prevented through advance planning.
1. Primary Keys
Many applications generate a primary key from a database sequence. In an active/active scenario be prepared that without any adoption this strategy generally results in data issues. To avoid data collisions, consider:
- Start the sequence at incremental values on different sites, and always increment the sequence by the number of active sites.
- Add a column e.g. site_created to every table to introduce a composite key that is always unique.
- Allocate a batch of numbers to every site and start the sequence accordingly. Of course this strategy requires a periodic review to make sure that the allocated batch is not filling up.
In all cases you may need to change the application (or introduce database views and/or triggers to avoid application changes) and beware of the possible impact on application performance because indexes change and/or column populated by a sequence may no longer just increment.
Many traditional OLTP applications use database triggers to implement business logic. Triggers should fire when the application makes changes to the database, but in most cases you probably don’t want the triggers to fire when the replication introduces changes. This may involve a change in the triggers (e.g. “if [user] <> [replication user] then [logic] end if ;”), unless triggers can be disabled at a session level.
3. Cascade Constraints
Some applications use cascade constraints. A cascade constraint will delete or update/nullify records in a child table based on the foreign key relationship to the parent table when rows on the parent table are deleted or the primary key is updated. Database replication may need to be prepared to handle potential issues with the replication, given the cascade operation automatically happens. (E.g. an on-delete cascade constraint will delete a child row if a parent row delete is deleted.)
Data replication will want to perform both deletes but when the child row was already deleted through the cascade constraint the subsequent delete of the child row deletes nothing. In many cases you can instruct the replication to handle these situations gracefully (and optionally log a warning), but most data replication tools will consider this to be a failure.
Despite the best preparation in the world fundamentally collisions cannot be avoided because databases operate independently. A collision occurs when a user updates a row in one database and at the same time another user updates (or deletes) the same row in another database. The nature of the application may naturally prevent issues like this (e.g. in a CRM application account managers only modify their own accounts) but not in all cases. For the cases when the application does not prevent collisions you may need to find a way to resolve collisions during real-time data replication. One very important consideration is that systems should remain in sync in all cases, because out-of-sync systems generally only end up going further out of sync.