Optimizing Your Network Topology
What does “topology” mean?
The word topology is derived from the Greek word topos which means place, region, or space. The term is used across different sciences and technologies. In the context of replication, the use of the term is closest to the definition of Network Topology: “Network topology is the interconnected pattern of network elements. The network topology may be physical, mapping hardware configuration, or logical, mapping the path that the data must take in order to travel around the network.”
When talking with customers about their data replication project, we often talk about their topology first. In fact, HVR’s Start Page shows common topologies to help them get started with their replication deployment.
Common Replication Topologies
Defining Common Topologies
- One to one. This type of topology is common when customers look to offload reporting, to facilitate a system/platform migration with minimal downtime, or to set up a disaster recovery environment.
- One to many. A one-to-many topology is often used to distribute load across multiple identical systems, or, as is becoming more common these days, to capture once and deliver to multiple destinations. For example, we see several customers adopting cloud technologies who target both a file-based data lake on a distributed storage system like S3 or ADLS and a relational database for analytics like Snowflake, Redshift or Azure Data Warehouse.
- Bi-directional. A bi-directional topology, also referred to as active/active, keeps two (or more) systems in sync. This is typical in a geographically distributed setup where the data should always be local to the application, or in a high availability setup.
- Many to one. A many-to-one topology is typical for a data warehouse or data lake consolidation project, with multiple data sources feeding a single destination. It also fits the use case of multiple distributed systems, each typically containing a subset of the data (e.g. local branches), all feeding into a central database.
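The four patterns above can be told apart just by looking at the direction of the data flows. The following is a minimal sketch, not part of HVR: the function name classify_topology and the representation of a flow as a (source, target) pair are assumptions made for illustration only.

```python
def classify_topology(flows):
    """Classify a collection of (source, target) data flows.

    Hypothetical helper for illustration; location names are plain strings.
    """
    flows = set(flows)
    if not flows:
        raise ValueError("at least one data flow is required")

    sources = {source for source, _ in flows}
    targets = {target for _, target in flows}

    # Active/active: every flow has a matching flow in the opposite direction.
    if all((target, source) in flows for source, target in flows):
        return "bi-directional"
    if len(sources) == 1 and len(targets) == 1:
        return "one-to-one"
    if len(sources) == 1:
        return "one-to-many"
    if len(targets) == 1:
        return "many-to-one"
    # Anything else combines several of the basic patterns.
    return "mixed"


if __name__ == "__main__":
    print(classify_topology([("erp", "s3"), ("erp", "snowflake")]))
    print(classify_topology([("branch1", "hq"), ("branch2", "hq")]))
```

A deployment that serves several projects at once will usually fall into the "mixed" bucket, which is exactly the situation a single overview is meant to make readable.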
In practice we see many HVR customers take advantage of the real-time replication benefits the technology offers for multiple projects, with some standardizing on HVR for all organization-wide real-time data replication projects. For scenarios like these, what is the ultimate topology?
HVR now enables customers to answer that question. From Version 5.6 onward, HVR customers can see their topology chart, which visualizes all data flows (in HVR terminology, Channels) in a single overview. The topology chart packs a lot of information into a single screen, including:
- The direction of the data flows indicated with arrows, but also with optional animation.
- A relative indication of data flow volume by means of the thickness of the lines, with a wider line indicating higher volumes.
- A relative indication for the number of tables included in the replication at an endpoint or location. A larger circle indicates more tables.
- Replication down is indicated by a grey line. Because replication is divided into capture and integration, replication may be down for only one side of the data flow.
- A relative indication of latency, with a darker shade indicating higher observed latency.
- Whether a latency threshold was exceeded on a data flow, color-coded in red, based on a configurable threshold.
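To make the visual encoding in this list concrete, here is a minimal sketch of how per-flow statistics could be mapped to line attributes. It is not HVR's implementation: the FlowStats fields, the flow_style function, the width range of 1 to 8 and the color names are all assumptions chosen for illustration, and for simplicity the whole line is greyed out when either side is down.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    """Hypothetical per-flow statistics; field names are illustrative."""
    rows_per_hour: int
    latency_seconds: float
    capture_running: bool
    integrate_running: bool


def flow_style(stats, max_rows_per_hour, max_latency_seconds,
               latency_threshold_seconds):
    """Map one flow's statistics to drawing attributes for its line."""
    # Grey when capture or integration is down, red when the configurable
    # latency threshold is exceeded, otherwise the normal color.
    if not (stats.capture_running and stats.integrate_running):
        color = "grey"
    elif stats.latency_seconds > latency_threshold_seconds:
        color = "red"
    else:
        color = "normal"

    # Line width is relative to the busiest flow (assumed range 1..8).
    share = stats.rows_per_hour / max_rows_per_hour if max_rows_per_hour else 0.0
    width = 1 + round(7 * min(share, 1.0))

    # Shade is relative to the highest observed latency (0.0 light, 1.0 dark).
    if max_latency_seconds:
        shade = min(stats.latency_seconds / max_latency_seconds, 1.0)
    else:
        shade = 0.0

    return {"color": color, "width": width, "shade": shade}


if __name__ == "__main__":
    busiest = FlowStats(1000, 30.0, True, True)
    print(flow_style(busiest, max_rows_per_hour=1000,
                     max_latency_seconds=60.0, latency_threshold_seconds=120.0))
```

Because width and shade are relative, the same flow can look different on two hubs; only the grey and red states carry an absolute meaning in this sketch.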
The context for the topology is a hub, the installation of HVR that controls the replication. Because distributed agents perform the bulk of the hard work, a single hub can easily control dozens of data flows.
The topology chart is interactive. Select the data flow you would like to inspect to see arrows pop up on the flow to highlight the current state of the replication. The topology charts are also a good starting point into other parts of Insights, namely the statistics or the events (using the buttons on the vertical bar on the left).
Of course, with the topology charts running in a browser, the display is entirely driven by REST calls. At present, the topology chart can only be retrieved on a machine that runs the HVR GUI. This will change in an upcoming HVR release, when a standalone web server becomes part of the product and more of the user interaction becomes browser-based.
Topology is a feature unique to HVR. I am starting to recognize a customer’s deployment by simply seeing the chart. It is the best way to get a quick overview of data replication in a hub. Wonder what your topology looks like? Contact us for a demo.
Mark Van de Wiel is the CTO for HVR. He has a strong background in data replication as well as real-time Business Intelligence and analytics.