- August 20, 2018 at 3:30 pm #15196
Is there an HVR best practice when it comes to Channel Count for a particular source?
For example, we have sources with 600+ tables. Today, we tend to split our channels based on table size so that larger tables don’t hold up the integrates for our smaller tables. Is this a best practice, or is it recommended that each source have a single channel?
Any advice would be helpful as I can see the argument for both sides.
August 20, 2018 at 6:04 pm #15197 ggoodrich (Keymaster)
Determining the proper number of channels/jobs to manage depends on several factors, such as:
- Amount of transaction log generation (per hour) at each source location
- Number of logical change operations (per second) to be integrated
- Division of duties and ownership of data
- Replication use case
The short answer is … the fewest number of channels/jobs is best, but this may vary depending on your specific use case.
The capture job is typically extremely fast and can handle many gigabytes of redo per hour, and therefore having a single capture job is most efficient whenever possible. Since capture does not store any of the changes locally, the disk i/o is practically nil. It works in large blocks, so again, it is fast and efficient.
Integrate might need to be split into multiple jobs (locations) to help spread the load across the database/file server for faster throughput. For example, you might have a high volume of logical operations that requires a few integrate jobs to meet the latency requirements. When splitting these operations into multiple integrate jobs, you should consider keeping related data together for transactional consistency. Another factor to consider is the number of processing cores on the integration server.
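To illustrate the splitting idea, here is a rough sketch (not an HVR feature, just a hedged example) of the greedy longest-processing-time heuristic for balancing tables across N integrate jobs by estimated change volume. The table names and volumes below are hypothetical; real numbers would come from monitoring transaction log generation at the source.

```python
import heapq

def split_tables(tables, num_jobs):
    """Greedily assign (table, est_changes_per_hour) pairs to num_jobs
    integrate groups so total change volume stays roughly balanced."""
    # Min-heap of (total_load, job_index); always pop the least-loaded job.
    heap = [(0, i) for i in range(num_jobs)]
    heapq.heapify(heap)
    groups = [[] for _ in range(num_jobs)]
    # Placing the biggest tables first gives a better balance (LPT heuristic).
    for name, load in sorted(tables, key=lambda t: -t[1]):
        total, idx = heapq.heappop(heap)
        groups[idx].append(name)
        heapq.heappush(heap, (total + load, idx))
    return groups

# Hypothetical table volumes (logical changes per hour).
tables = [("ORDERS", 900), ("ORDER_LINES", 800), ("CUSTOMERS", 300),
          ("PRODUCTS", 120), ("AUDIT_LOG", 700), ("LOOKUPS", 50)]
for i, group in enumerate(split_tables(tables, 4)):
    print(f"integrate job {i}: {group}")
```

With four jobs, the three largest tables each land in their own group and the small tables share the fourth, which matches the intuition in the original question of keeping large tables from holding up the small ones.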
If different groups of business users are using HVR for various replication projects, they may elect to maintain their own channels separately as they design and test their replication. They too might have different technical or business requirements that would suggest they need to develop and deploy separate channels.
The scenario or use case might also affect the number of channels defined. For example, some replication requirements might involve different schedules for when jobs should or should not be running. Maybe a geographically dispersed active-active use case requires integration jobs to run continuously with near real-time latency requirements, while another use case might be designed for consolidated data warehousing, where running in small batches might be more optimal. And lastly, maybe one team is constantly making changes to their channels but does not want to impact other teams with the hourly maintenance work.
So, to summarize, let’s look at an example where a company is using HVR for a couple of data replication projects. They have a large volume of data and need to keep replication latency near zero for most jobs, but not all. They also have a data mart project that requires jobs to be stopped during certain hours of the day. This company has defined two channels. The first channel has a single capture job but multiple integrate jobs to keep the latency low; their integration server has eight cores, so they elected to split the data into eight target jobs. The second channel has a single capture and a single integrate job, but since those tables have unique scheduling requirements (the jobs must be stopped for a few hours every day), they were split into a separate channel.
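The scheduling side of that second channel could be handled with an ordinary cron schedule. This is only a sketch: the hub name (hvrhub) and channel name (dmchn) are made-up placeholders, and the hvrsuspend/hvrstart commands shown are the HVR 5 job-control tools whose exact options vary by version, so verify the syntax against your release’s documentation.

```shell
# Hypothetical crontab for the data-mart channel.
# Suspend the channel's jobs at 08:00 every day...
0 8 * * * hvrsuspend hvrhub dmchn
# ...and start them again at 18:00.
0 18 * * * hvrstart hvrhub dmchn
```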
You can see there are several factors that may contribute to the number of channels/jobs defined, ranging from throughput volumes to replication requirements and user needs.
August 21, 2018 at 7:39 am #15198
Thanks, Glenn, for your input.
In your example you mention “One channel has a single capture job, but it has defined multiple integrate jobs to keep the latency low.” Is it possible to have multiple integrates in a channel that point to the same location? So if I have 100 tables in my channel, can I have a single capture and 4 integrates (25 tables each) that all point to the same target location?
If so, how do you create that? Is that done by initializing each table set separately?
Jared
August 21, 2018 at 8:46 am #15199 ggoodrich (Keymaster)
HVR provides multiple ways to define one capture job with multiple integrate jobs. Details on one of those methods can be found here: https://www.hvr-software.com/support-services/customer-resources/forum/topic/define-multiple-integrate-jobs-achieve-parallelism/
Glenn
August 21, 2018 at 2:13 pm #15208