Tuning Capacity Tips for SecureSphere Database Activity Monitoring
You have Imperva SecureSphere Database Activity Monitoring (DAM) up and running. You’ve deployed the system and configured your business audit policies. So, what’s next?
In a previous post I discussed the capacity management challenges of database monitoring solutions. In this post I’ll elaborate on the solutions SecureSphere has for managing and resolving those challenges. I’ll explain the importance of managing your DAM capacity over time, review how you can discover capacity issues, and share ways to mitigate any problems.
Why the Need for Tuning?
As discussed in the do’s and don’ts post on capacity estimation, it is very difficult to get an accurate estimate of the expected capacity per database. It’s better to estimate for the entire deployment.
The problem with estimates is, well, they’re estimates. Things can change or perform differently than you expect. You could find out, for instance, that your 20-core MySQL database has less activity than your 8-core MSSQL database. Why? It could be due to several reasons, such as:
- The application that uses the MySQL is much more efficient and caches data
- The DBA responsible for the MySQL doesn’t back up everything, while in MSSQL he does
- The MySQL server contains other databases as well, or hosted applications
Even if you did a great job estimating your capacity needs when you first deployed, it’s just a matter of time before that estimate loses accuracy. Whether you upgrade your database, change your applications, add more users, or add more databases to your deployment, your capacity requirements WILL change over time. These changes will affect not only your overall capacity requirements, but also the capacity required per database.
SecureSphere Database Activity Monitoring Terminology
Before I drill down into SecureSphere solutions for capacity management, there are a few terms that would be helpful to understand. The basic operation of DAM (usually) involves agents, which are installed on the database servers, and gateways, which are used to process the database activity sent from the agents. This means that every agent needs at least one connection to at least one gateway.
There are two primary methods to manage multiple gateways:
- Gateway Group – simple grouping of gateways with no scale out or capacity related logic
- Cluster – more advanced grouping of gateways with additional logic for scale out and redundancy
Every gateway has a maximum capacity measured in IPU (Imperva Performance Units), and every agent has a relative load on the gateway, which is also measured in IPU. The purpose of IPU measurement is so you can compare and estimate the capacity impact of the agent assignment on gateways.
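To make the IPU comparison concrete, here is a minimal sketch of how you might add up agent loads against a gateway's maximum. The gateway name, agent names and IPU figures are made up for illustration, and this is not SecureSphere's own API or data model.

```python
from dataclasses import dataclass


@dataclass
class Gateway:
    name: str
    max_ipu: float      # hypothetical maximum capacity of the gateway, in IPU
    agent_loads: dict   # agent name -> relative load on this gateway, in IPU


def utilization(gw: Gateway) -> float:
    """Fraction of the gateway's IPU capacity used by its assigned agents."""
    return sum(gw.agent_loads.values()) / gw.max_ipu


# Illustrative numbers only.
gw = Gateway("gw-1", 1000, {"mysql-20core": 250, "mssql-8core": 400})
print(f"{gw.name}: {utilization(gw):.0%} of capacity in use")
```

Because both sides are expressed in the same unit, you can compare what happens if an agent is moved to a different gateway before you actually move it.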
With the basic terminology down, let’s find out if you have capacity issues.
Discovering If You Have Capacity Issues
The best way to identify most capacity problems in SecureSphere DAM is through the health monitoring feature, which displays alarms on various issues. There are a few that indicate an overload problem, either at the gateway level or at the cluster level (see Figure 1):
- Gateway capacity warning – major warning, gateway at a high load state
- Gateway capacity alert – critical warning, gateway at a critical load state
- Cluster capacity warning – major warning, cluster of gateways at a high load state
- Cluster capacity alert – critical warning, cluster of gateways at a critical load state
Figure 1: SecureSphere displaying current status of SecureSphere components
There are also alarms for special scenarios. For example, if an agent load is more than your gateway can handle (depending on the gateway model) you will receive an alarm with a recommendation to scale out.
These alarms are based on real time measurements of the overall load of each gateway and the relative load of each of its corresponding agents. Each alarm contains a detailed explanation with recommendations for mitigation.
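The relationship between measured load and the warning and alert levels can be sketched as follows. The 80% and 95% thresholds are assumptions chosen for the example; the actual levels SecureSphere uses are not stated here and may differ.

```python
# Assumed thresholds, for illustration only.
HIGH_LOAD = 0.80
CRITICAL_LOAD = 0.95


def capacity_alarm(scope, load_ipu, max_ipu):
    """Return an alarm name for a gateway or cluster, or None if load is normal.

    scope is "Gateway" or "Cluster"; load_ipu is the measured load and
    max_ipu the maximum capacity, both in IPU.
    """
    ratio = load_ipu / max_ipu
    if ratio >= CRITICAL_LOAD:
        return f"{scope} capacity alert"
    if ratio >= HIGH_LOAD:
        return f"{scope} capacity warning"
    return None


print(capacity_alarm("Gateway", 850, 1000))
print(capacity_alarm("Cluster", 2900, 3000))
print(capacity_alarm("Gateway", 500, 1000))
```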
You can be more proactive by analyzing the current agent load and total gateway(s) load via the cluster management feature (Figure 2), which displays detailed information for all types of clusters and gateway groups. You can see which agents are assigned to which gateways, the capacity information, versions, status, etc.
Figure 2: SecureSphere DAM cluster management feature enabling cluster maintenance
Four Ways to Solve SecureSphere Database Activity Monitoring Capacity Issues
There are four ways to solve SecureSphere DAM capacity issues. Let’s take a look at each one.
Manual Load Balancing
You can choose to analyze your deployment and manually change the assignment of agents to gateways. It is also possible to set a threshold to prevent assigning an agent to a gateway if that gateway is overloaded according to its current assignment. It is important to note that this threshold is based on the estimates given to each agent upon initial assignment and not according to real time measurements.
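The threshold check described above can be pictured with a small sketch. The function name, the 90% threshold and the IPU values are hypothetical. The point is that the check works from the estimates recorded at assignment time, not from live measurements.

```python
def can_assign(estimated_loads, new_agent_estimate, max_ipu, threshold=0.9):
    """Decide whether a gateway may take one more agent.

    estimated_loads: IPU estimates recorded when each current agent was assigned.
    new_agent_estimate: IPU estimate for the agent being assigned.
    threshold: assumed fraction of max_ipu above which assignment is refused.
    """
    projected = sum(estimated_loads) + new_agent_estimate
    return projected <= threshold * max_ipu


print(can_assign([300, 400], 150, 1000))  # projected 850 IPU against a 900 IPU limit
print(can_assign([300, 400], 250, 1000))  # projected 950 IPU against a 900 IPU limit
```

If the estimates have drifted from reality, a gateway can pass this check and still be overloaded in practice, which is why it pays to review the real-time measurements as well.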
Automatic Load Balancing
Another way to solve a load balancing issue is to let SecureSphere do it automatically. The automatic load balancing feature (Figure 3) ensures the cluster will be optimized for the long term. Its aim is NOT to solve a momentary peak load, but to balance the cluster over time. It doesn’t change the agent assignment unless the change improves the overall load scenario.
Figure 3: Configuring automatic load balancing in SecureSphere
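The "only move an agent if it improves things" idea can be illustrated with a simple greedy step. This is a sketch under my own assumptions, not SecureSphere's actual algorithm: it looks at the most and least loaded gateways and moves one agent only if that lowers the higher utilization of the two.

```python
def rebalance_step(cluster, capacities):
    """Try one improving move; return (agent, source, target) or None.

    cluster: gateway name -> {agent name: measured load in IPU}
    capacities: gateway name -> maximum capacity in IPU
    """
    util = {g: sum(a.values()) / capacities[g] for g, a in cluster.items()}
    src = max(util, key=util.get)   # most loaded gateway
    dst = min(util, key=util.get)   # least loaded gateway
    if src == dst:
        return None

    best = None  # (agent, resulting peak utilization of src and dst)
    for agent, load in cluster[src].items():
        new_src = util[src] - load / capacities[src]
        new_dst = util[dst] + load / capacities[dst]
        new_peak = max(new_src, new_dst)
        # Accept only moves that lower the current peak; keep the best one.
        if new_peak < util[src] and (best is None or new_peak < best[1]):
            best = (agent, new_peak)

    if best is None:
        return None  # no move helps, so leave the assignment alone
    agent = best[0]
    cluster[dst][agent] = cluster[src].pop(agent)
    return agent, src, dst


# Illustrative cluster: gw-1 is at 80%, gw-2 at 10%.
cluster = {"gw-1": {"a": 500, "b": 300}, "gw-2": {"c": 100}}
capacities = {"gw-1": 1000, "gw-2": 1000}
print(rebalance_step(cluster, capacities))  # moves one agent
print(rebalance_step(cluster, capacities))  # no further improving move
```

Run over averaged measurements rather than instantaneous ones, a step like this would ignore momentary peaks and converge on a stable assignment.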
Scale Out or Scale Up
In other scenarios, you might discover that you need more gateways (scale out), or more powerful gateways (scale up) (Figure 4). This means that load balancing is not a relevant solution – you simply don’t have enough capacity. The alarms will guide you with recommendations, but in some cases it will still be beneficial to contact support to make the best decision.
Figure 4: Scale out (add more gateways) versus scale up (add more powerful gateways)
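A quick way to tell whether rebalancing can help at all is to compare the total agent load with the total gateway capacity. The sketch below assumes a hypothetical 80% headroom target and illustrative IPU values.

```python
def needs_more_capacity(agent_loads, gateway_capacities, headroom=0.8):
    """True if the total load exceeds the usable share of total capacity.

    headroom is an assumed target: the fraction of total IPU capacity
    you are willing to use before adding or upgrading gateways.
    """
    return sum(agent_loads) > headroom * sum(gateway_capacities)


# 1800 IPU of load on two 1000 IPU gateways: no assignment can fix this.
print(needs_more_capacity([700, 600, 500], [1000, 1000]))
# 700 IPU of load on the same gateways: rebalancing is enough.
print(needs_more_capacity([400, 300], [1000, 1000]))
```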
Large Server Cluster
There are a few special scenarios which can lead to capacity-related alarms. One of them is discovering that a certain agent’s required capacity is larger than a “full gateway”, and that multiple gateways are required to handle this single database. In this scenario, you will see the appropriate alarm with a recommendation to create a large server cluster. The large server cluster is used to solve the problem of monitoring very large databases (a server with 128 cores or more might be considered large, depending on various factors).
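As a rough illustration of the arithmetic, the number of gateways a single agent needs is its load divided by one gateway's capacity, rounded up. The IPU values are hypothetical, and real sizing depends on factors this sketch ignores.

```python
import math


def gateways_needed(agent_ipu, gateway_max_ipu):
    """Minimum number of gateways required to carry one agent's load."""
    return math.ceil(agent_ipu / gateway_max_ipu)


def needs_large_server_cluster(agent_ipu, gateway_max_ipu):
    """True when a single agent exceeds what one full gateway can handle."""
    return gateways_needed(agent_ipu, gateway_max_ipu) > 1


print(gateways_needed(2500, 1000))            # 3
print(needs_large_server_cluster(800, 1000))  # False
```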
As you can see, SecureSphere DAM supplies multiple tools to handle capacity management before there’s a problem and mitigate any existing ones. It is highly recommended not to wait for the next alarm to pop up, but to follow these guidelines:
- Use clusters when you can, and let them do the load balancing for you
- Be proactive – check the actual IPU measurements per gateway and per agent to fine tune your current deployment and improve your future capacity estimations
- Be attentive to all alarms and follow recommendations
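For the proactive check in the second guideline, one simple approach is to compare each agent's estimated IPU with its measured IPU and look at the relative difference. The agent names and values below are invented, and how you export the measurements from SecureSphere is outside the scope of this sketch.

```python
def estimate_drift(estimated, measured):
    """Relative difference between measured and estimated IPU per agent.

    Positive values mean the agent is heavier than estimated,
    negative values mean it is lighter. Agents without a
    measurement are skipped.
    """
    return {
        agent: (measured[agent] - est) / est
        for agent, est in estimated.items()
        if agent in measured
    }


estimated = {"mysql-20core": 500, "mssql-8core": 200}
measured = {"mysql-20core": 250, "mssql-8core": 400}
for agent, drift in estimate_drift(estimated, measured).items():
    print(f"{agent}: {drift:+.0%} versus estimate")
```

Agents with large drift are the first candidates for reassignment, and the pattern of drift tells you how to adjust your estimates for the next deployment.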
By utilizing these solutions and best practices you will improve the foundation for future growth plans and keep the capacity management overhead of your SecureSphere DAM deployment to a minimum.