Log-Based CDC: Impact on Database Processing
Worry about the impact of Log-Based CDC?
Many DBAs are – rightfully – worried about the impact log-based Change Data Capture (CDC) will have on the mission-critical transaction processing database. There are many factors that influence CDC performance including the transaction log generation rate, the transaction mix (lots of small transactions versus fewer larger ones), the percentage of the log that is of interest for replication, number tables, columns and data types, etc. Of course, the best thing to do is run a test on representative hardware with a representative workload.
In this blog post, I will focus on just the impact of reading the log
Log-based CDC reads transaction logs – and if it is behind archived/backed up transaction logs – until it reaches the tail of the current log. When it gets there the log reader goes to sleep only to wake up a split second later to find out whether there is more transaction log to be consumed and then the cycle repeats. On most modern systems this process has very little impact and should not be a major concern for DBAs because file system cache (or storage cache) will have the tail end of the log in the cache.
There are cases however when the database has been configured to bypass any file system cache. This may have been a consideration when server memory was expensive and DBAs would rather allocate a larger database buffer. In these cases the log will have to be read directly from disk, and given the database is writing to disk (right around but not at the same location), the odds for contention are a lot higher, and chances are overall read performance is a lot slower compared to systems that do take advantage of a file system (or storage) cache.
In the Oracle Database the way to control this behavior – if you are still using a regular file system to store database files and not Oracle’s Automatic Storage Manager (ASM) – is through filesystemio_options. By default Oracle does not set this value which generally means writes will be buffered in the file system cache. If however this value is set to SETALL or DIRECTIO then caches are bypassed. And that may result in (quite a bit) more impact of running CDC to capture changes from the database.
So to run log-based CDC most optimally on an Oracle Database unset filesystemio_options – and restart your database.