What is Application Downtime?
And what does it cost?
The simplified view of application availability is binary: the application is either available or it is not. In reality there is a large grey area: the application may be up but not performing as well as it should. If you consider the cost of application downtime, or the investment in application uptime, then performance problems must be taken into account. Sub-optimal application performance has direct costs, such as lost productivity and lower revenue, but also indirect costs, such as frustrated employees, dissatisfied customers, and possibly damage to credibility and brand. Of course, the cost of downtime depends heavily on how mission-critical the application is, whether it is customer-facing, the size of the organization, and so on. Each organization must then decide for itself how much to invest to ensure the application stays up and runs well.
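To make the trade-off concrete, here is a back-of-the-envelope sketch of the direct cost of an outage. All figures (revenue per hour, headcount, labor rates) are illustrative assumptions, not benchmarks, and indirect costs such as brand damage are deliberately left out:

```python
# Back-of-the-envelope downtime cost estimate.
# All input figures below are illustrative assumptions.

def downtime_cost(hours_down, revenue_per_hour, employees_idle,
                  loaded_cost_per_employee_hour):
    """Direct cost of an outage: lost revenue plus idle labor.

    Indirect costs (brand damage, customer churn) are not modeled.
    """
    lost_revenue = hours_down * revenue_per_hour
    lost_productivity = (hours_down * employees_idle
                         * loaded_cost_per_employee_hour)
    return lost_revenue + lost_productivity

# Example: a 4-hour outage of a customer-facing system
cost = downtime_cost(hours_down=4, revenue_per_hour=10_000,
                     employees_idle=50, loaded_cost_per_employee_hour=60)
print(f"Estimated direct cost: ${cost:,.0f}")  # $52,000
```

Even with modest inputs like these, a few hours of downtime can justify a meaningful investment in availability.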
In this blog I will focus on two- and three-tier database applications, because these are the ones that can benefit from real-time data replication technologies. I will also assume that access to the data is the most important aspect of application availability. For a three-tier application the middle tier must of course also be available and perform well, but that brings its own set of challenges and is out of scope here.
There are many reasons, both planned and unplanned, why a database application may be unavailable, including but not limited to power outages, network issues, hardware failures, software bugs, upgrades and migrations, and, last but not least, human error. Hardware and software vendors have implemented many features to make systems more highly available, such as dual power supplies, RAID storage configurations to survive disk failures, and clusters to survive server failures. Arguably, some of these technologies introduced so much complexity that in the end they caused more downtime, through more frequent outages for software patches or because of bugs. Most high availability technologies focus on availability within the boundaries of a server or data center; disaster recovery solutions typically address a larger scope of outages, such as regional power failures or large network outages.
The impact of poor application performance on cost or lost revenue is much harder to quantify. Perhaps a sales representative can process ten purchases per day on a system that performs well but only seven on one that performs poorly. Poor performance can have many causes, including high network latency, overloaded hardware, insufficient performance tuning, and bad application design. Always try to understand the root cause of poor performance before attempting to solve the problem with any particular technology.
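The sales-representative example above can be turned into a rough annual figure. This is a minimal sketch; the number of reps, revenue per deal, and working days are assumed values for illustration only:

```python
# Revenue impact of degraded performance, using the example of a rep
# processing ten purchases per day on a fast system vs. seven on a
# slow one. All input numbers are illustrative assumptions.

def annual_revenue_loss(deals_fast, deals_slow, reps, revenue_per_deal,
                        working_days=250):
    """Annual revenue at risk from the per-rep productivity gap."""
    lost_deals_per_day = (deals_fast - deals_slow) * reps
    return lost_deals_per_day * revenue_per_deal * working_days

loss = annual_revenue_loss(deals_fast=10, deals_slow=7, reps=20,
                           revenue_per_deal=500)
print(f"Estimated annual revenue at risk: ${loss:,.0f}")  # $7,500,000
```

A gap of three deals per rep per day looks small, but multiplied across a team and a year it dwarfs the cost of most performance fixes.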
When weighing the cost of application unavailability or poor performance, CIOs and IT directors trade off the cost of a solution against the risk of downtime and the costs associated with it. This is where real-time replication can play a role. Of course, replication technology is not free and introduces implementation and maintenance costs, but it can contribute to both high availability and disaster recovery scenarios. Many customer use cases show that:
- Using multiple mid-size servers instead of a single, much larger server can be very cost-effective.
- Offloading reporting to a different, lower-cost hardware/software stack may pay for the replication implementation.
- Migrations carry much less risk if the old environment can be kept up to date with changes from the new system.
- Chatty database applications benefit from having a database close to the application.
Think about the database applications in your environment and consider the value of uptime. Understand the cost and risk of downtime, and evaluate whether each application performs as well as it should.