Optimizing A Monitoring System: Three Methods for Effective Incident Management
Picture this: You’ve just returned from a well-deserved vacation and, upon opening your security monitoring system, you’re faced with the prospect of analyzing thousands of events.
This isn’t an imaginary scenario. The security monitoring world (and monitoring in general) is full of anomalies that trigger events. These may represent a real problem or just a slight change in someone’s day-to-day behavior.
Regardless of the cause, you are forced to sift through large numbers of incidents to figure out which are high priority and which aren’t.
In this post, we’ll highlight three effective methods that can be used to alleviate this problem, based on real-world examples.
Real Value Incidents
The biggest question in the monitoring world is which anomalies should trigger an incident. One of the challenges security operations teams face is finding relevant and meaningful incidents among too many false positives. To answer this question, we also need to ask ourselves how we define an incident, and that depends on the system domain. The decision requires in-depth knowledge of the domain and may require complex algorithms that, based on the definition, highlight what is really interesting.
For example, in the insider threat domain, a system identifies that a user has performed an action on a database for the first time. This is an anomaly since it has never occurred before, but is it a real security incident? To answer this question, we have to classify the user as well as the database and correlate the two. This provides insights you wouldn’t get at face value.
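As a rough illustration of classifying the user and the database and then correlating the two, here is a minimal Python sketch. The user types, database classes, and the EXPECTED_ACCESS policy table are hypothetical values invented for this example; a real system would derive them from domain knowledge and, most likely, from more complex algorithms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirstTimeAccess:
    """An anomaly: a user acted on a database for the first time."""
    user: str
    database: str


# Hypothetical classifications; in practice these come from domain knowledge.
USER_TYPES = {"alice": "dba", "bob": "developer"}
DATABASE_CLASSES = {"hr_prod": "sensitive", "build_cache": "regular"}

# Hypothetical policy: (user type, database class) pairs considered routine.
EXPECTED_ACCESS = {
    ("dba", "sensitive"),
    ("dba", "regular"),
    ("developer", "regular"),
}


def is_real_incident(event: FirstTimeAccess) -> bool:
    """Correlate the user's class with the database's class.

    Anything unclassified is treated as unexpected, so it is surfaced.
    """
    user_type = USER_TYPES.get(event.user, "unknown")
    db_class = DATABASE_CLASSES.get(event.database, "unknown")
    return (user_type, db_class) not in EXPECTED_ACCESS
```

Under these assumed tables, a DBA touching a sensitive database for the first time is filtered out, while a developer doing the same is raised as an incident.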
Grouping of Incidents
Once the real-value incidents are identified, one way of reducing the number of incidents that need to be managed is to group them into narratives, each describing a specific phenomenon that security engineers can handle as one. Although each individual incident is valid, when grouped together a larger, more manageable narrative appears: the whole is greater than the sum of its parts.
There are two types of grouping:
- By incident type. For example, ‘a service account was abused by multiple users’.
This implies that the service account is accessible to a community of users, which is bad practice. One way to handle this is to change the account’s permissions.
- Grouping of different types of incidents that together represent a certain narrative. For example, a user has abused a specific database account, accessed several application tables and accessed a large number of files. This implies that the user may be compromising the data of the enterprise. Handling this could mean assessing the user and their behavior.
An example from Imperva CounterBreach customer data shows how grouping reduces the number of events to deal with: the number of incidents keeps growing, whereas the growth in the number of groups slows until it levels off.
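The two types of grouping can be sketched with a single helper that buckets incidents by a caller-supplied key. This is only an illustration: the incident fields ("type", "account", "user") and the sample data are assumptions made for the example, not the format of any particular product.

```python
from collections import defaultdict


def group_incidents(incidents, key):
    """Bucket incidents by key(incident) and return a plain dict of lists."""
    groups = defaultdict(list)
    for incident in incidents:
        groups[key(incident)].append(incident)
    return dict(groups)


# Hypothetical sample incidents.
incidents = [
    {"type": "service_account_abuse", "account": "svc_report", "user": "alice"},
    {"type": "service_account_abuse", "account": "svc_report", "user": "bob"},
    {"type": "service_account_abuse", "account": "svc_report", "user": "carol"},
    {"type": "database_account_abuse", "account": "db_admin", "user": "dave"},
    {"type": "excessive_file_access", "account": "dave", "user": "dave"},
]

# Type 1: same incident type on the same account,
# e.g. "a service account was abused by multiple users".
by_type = group_incidents(incidents, key=lambda i: (i["type"], i["account"]))

# Type 2: different incident types tied to one user, forming a narrative.
by_user = group_incidents(incidents, key=lambda i: i["user"])
```

With this sample data, the three service account incidents collapse into one group, and the two different incidents involving the same user end up in one narrative.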
Incident Priority Scoring
Traditional prioritization of security incidents is usually done by classification into severities (critical, high, medium, low). This type of classification doesn’t provide a clear decision on what should be done first. Let’s say there are 10 incidents classified as critical. All of them must be treated immediately, but which should be first?
The suggested solution is to assign each incident a priority score in the range 0 to 100. Different criteria within the incident contribute to the score (different calculation methods can be used), and the priority score is the end result.
Example: The traditional severity for incident ‘Excessive Database Record Access’ is high as this implies data theft.
Two incidents of this type are raised which, at first glance, might be treated with the same urgency, but are they really the same?
Let’s now look more closely at the details:
- A human user has accessed 105000 records in a database in a production environment.
- A human user has accessed 100000 records in a regular database in a staging environment.
The details clearly indicate that the first incident should be treated before the second, as it poses a greater threat.
Using the new method:
- Incident type: excessive database record access = 70.
- Number of records accessed > 100000 – Add +5.
- Database is in a production environment – Add +10.
Based on the above, the first incident’s final priority score is 85 whereas the second incident’s final priority score is 70.
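The arithmetic above can be expressed as a small additive scoring function. This is a sketch of one possible calculation method, using only the values from the example; the function and constant names are hypothetical, and capping the result at 100 is an assumption based on the stated 0 to 100 range.

```python
# Base score per incident type (value taken from the example above).
BASE_SCORES = {"excessive_database_record_access": 70}

RECORD_THRESHOLD = 100_000   # strictly more than this adds the bonus
RECORD_BONUS = 5
PRODUCTION_BONUS = 10


def priority_score(incident_type, records_accessed, environment):
    """Add up the criteria for one incident and return a 0-100 score."""
    score = BASE_SCORES[incident_type]
    if records_accessed > RECORD_THRESHOLD:
        score += RECORD_BONUS
    if environment == "production":
        score += PRODUCTION_BONUS
    # Assumption: clamp to the 0-100 range described in the text.
    return max(0, min(score, 100))


first = priority_score("excessive_database_record_access", 105_000, "production")
second = priority_score("excessive_database_record_access", 100_000, "staging")
```

Note that the second incident receives no record-count bonus because 100000 is not greater than the 100000 threshold, which is why it stays at the base score.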
Scoring can be done on groups as well
Choosing the score criteria and values is fundamental to whether the resulting order of incidents reflects the correct prioritization. It requires in-depth knowledge of the subjects being monitored.
Applying the described methods
Each of these methods reduces the number of incidents you need to deal with; however, it is best to implement all three.
As seen in our examples, the number of incidents with real value may still be high, especially if you have broad coverage. Grouping incidents can dramatically reduce the number of issues to deal with, but you will still want to know which incidents or groups of incidents to handle first. Setting scores takes care of that.
Security monitoring systems provide a very important layer of protection. However, as the number of incidents raised increases, the system becomes harder and more time-consuming to manage. This may even lead to abandoning the system altogether.
Focusing on the important stuff (real value incidents), providing the big picture (grouping incidents) and defining a clear priority (incident priority scoring) allows a faster, more effective investigation. As such, the real value of monitoring is achieved. Imperva CounterBreach addresses all of these requirements. Get in touch and let’s see where we can help.