A Practical Example for Geolocation

[Image: the medieval entryway into Sighișoara citadel]

A post this week on Jordan Wright’s blog has generated quite a bit of discussion here at Imperva on the subject of industrial hacking. Jordan describes a large number of attempts (at least one of which, early on, was successful) to compromise a web server by exploiting a vulnerability in ElasticSearch. What really set the article apart was the inclusion of detailed logs from an ElasticSearch honeypot designed to capture as much information about the attackers as possible. It’s fairly rare for a researcher to publish this level of detail in an easily consumed format (JSON), so I decided to download the logs and take a closer look.

Based on the discussions we were having and some of Jordan’s own conclusions, I started with the geolocation data. Clearly some regions are over-represented, which is a common experience for web application defenders. As reported in the blog, the attack geolocation data is heavily skewed toward one particular country:

Number of Attacks   Percentage of Overall Attacks   Country
7400                95.16%                          China
337                 4.33%                           United States
15                  0.19%                           Hong Kong
8                   0.10%                           Netherlands
5                   0.06%                           United Kingdom
3                   0.04%                           Germany
2                   0.03%                           Spain
2                   0.03%                           Denmark
2                   0.03%                           Canada
1                   0.01%                           Russian Federation
1                   0.01%                           Romania
7776                                                TOTAL
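
For readers who want to reproduce a breakdown like this from their own logs, here is a minimal Python sketch. It assumes each log record has already been parsed into a dictionary with a "country" field; that field name and the function name country_breakdown are hypothetical and are not taken from the honeypot logs, which may need an IP-to-country lookup first.

```python
from collections import Counter


def country_breakdown(records, key="country"):
    """Return (country, count, percent) rows, largest count first.

    `records` is any iterable of dicts. The `key` field name is an
    assumption for illustration, not the honeypot's actual schema.
    """
    counts = Counter(r.get(key, "Unknown") for r in records)
    total = sum(counts.values())
    # An empty input yields an empty list, so no division by zero occurs.
    return [
        (country, n, round(100.0 * n / total, 2))
        for country, n in counts.most_common()
    ]


if __name__ == "__main__":
    # Synthetic records matching the two largest rows of the table above.
    sample = [{"country": "China"}] * 7400 + [{"country": "United States"}] * 337
    for country, n, pct in country_breakdown(sample):
        print(f"{n:>6}  {pct:>6.2f}%  {country}")
```

Note that the percentages only match the table when the full data set of 7776 records is used as the denominator.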

The data raises an obvious question, one which many of our customers consider during their WAF implementations once they see similar patterns emerging in their own alerts: should they block requests from countries which appear to be repeat offenders? One of the ThreatRadar threat intelligence services that we include with Imperva SecureSphere Web Application Firewall is a geolocation feed, i.e. a mapping of IP address space to country names. Once ThreatRadar is enabled, every alert is tagged with the geolocation of the source IP:

[Image: an alert tagged with the geolocation of its source IP]

While this is interesting information at the individual alert level, the real value of geolocation emerges when these geolocation tags are aggregated across alert data sets, such as in this report I generated at a customer site a few years ago:

[Chart: report aggregating geolocation tags across a customer's alerts]

The “blocks” referred to in the report are attacks which the WAF detected and blocked.

Running the source IP addresses in those ElasticSearch honeypot logs through our ThreatRadar feeds also produced numerous hits in the Malicious IP feed, and one for Comment Spam. Given that our feeds are updated many times each day, but these attacks occurred two months ago, some of these IP addresses are clearly long-term offenders. So why not block them? If over 95% of occurrences of a certain type of attack emanate from one geography, and some of these IP addresses are long-term offenders, why not block that whole geography at your perimeter?

Many of my customers already have the ability to put a geolocation wall, like the Jerusalem city wall in my photo below, in front of their web applications. None of them do it, because that ability is provided by devices like IPS and Next Generation Firewalls, which do not provide the granularity needed to avoid false positives. It’s an all-or-nothing proposition. An enterprise WAF which understands the layout of the web application, and can easily differentiate between a request to the main page and a request to the admin portal login, gives defenders the power of geolocation together with the flexibility they need. To continue the theme of medieval defenses, that looks more like the medieval entryway into Sighișoara citadel at the top of this blog.

[Photo: the Jerusalem city wall]

The Sighișoara entryway is a good example of a controlled entry point, where requests which pass the outer line of defense (geolocation) can still be inspected before being allowed in (those apertures above the lane weren’t for waving at the people passing below). That’s exactly how our WAF works, too.
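
To make the difference between a wall and a controlled entry point concrete, here is a small Python sketch of a path-aware geolocation decision. It is an illustration only and not SecureSphere configuration: the GeoRule class, the decide function, the "/admin" prefix and the placeholder country codes "XX" and "YY" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoRule:
    """Hypothetical rule: block listed countries on one part of the site."""
    path_prefix: str
    blocked_countries: frozenset


def decide(path, country, rules):
    """Return "block" if any rule matches, otherwise "inspect".

    "inspect" means the request passed the geolocation check and still
    goes through the remaining WAF checks; it is not an automatic allow.
    """
    for rule in rules:
        if path.startswith(rule.path_prefix) and country in rule.blocked_countries:
            return "block"
    return "inspect"


# Only the admin portal is geo-restricted; the main page stays reachable.
rules = [GeoRule(path_prefix="/admin", blocked_countries=frozenset({"XX"}))]

print(decide("/admin/login", "XX", rules))  # restricted path, listed country
print(decide("/", "XX", rules))             # main page, same country
print(decide("/admin/login", "YY", rules))  # restricted path, unlisted country
```

A perimeter device that blocks by country alone is the special case of a single rule with an empty path prefix, which is the all-or-nothing proposition described above.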

Finally, a word on attribution. Just because 95% of the ElasticSearch attacks emanated from IP addresses in Chinese address space does not necessarily mean that those attacks were generated by Chinese nationals. Industrial hacking rules: the entire Internet can be scanned in six minutes, and if you’re not able to scan it yourself there are companies that will do it for you and give you access to the results for a modest charge (sometimes for free). Our job as defenders does not always require us to identify exactly who is attacking us, but if we can significantly reduce our alert load by blocking certain types of connections from certain geographies without creating false positives, we make it easier to focus on the attacks that really matter.