Headless Chrome: DevOps Love It, So Do Hackers, Here’s Why | Imperva

Google Chrome is the most popular web browser and has been so for almost a decade. Each new version of Chrome brings new usability, security and performance features.

This article focuses on the “headless mode” feature that Google released more than a year ago. Since day one it has been very popular, not only among software engineers and testers but also among attackers.

Off with their heads!

Headless mode is a functionality that allows the execution of a full version of the latest Chrome browser while controlling it programmatically. It can be used on servers without dedicated graphics or display, meaning that it runs without its “head”, the Graphical User Interface (GUI).

In headless mode, it’s possible to run large-scale web application tests, navigate from page to page without human intervention, confirm JavaScript functionality and generate reports.

The same functionality serves malicious scenarios just as well as benign ones, whenever an attacker needs to evaluate JavaScript or emulate browser functionality.

The practice of web browser automation isn’t new. It is found in dedicated headless browsers like PhantomJS and NightmareJS, test frameworks like Capybara and Jasmine, and tools like Selenium that can automate different browsers, including Chrome.

How popular is Headless Chrome?

The chart below shows the amount of traffic generated by Headless Chrome and other major headless browsers since Headless Chrome’s release in June 2017. Headless Chrome overtook the previous leader among headless browsers and automation frameworks, PhantomJS, within a year of its release.

Automated browser trends over the last year

The data collected from our cloud WAF statistics, reinforced by data from Google Trends, shows the popularity of PhantomJS fading while Headless Chrome’s keeps climbing.

PhantomJS and Headless Chrome: Google search trends

Automated browsers driving increased traffic

Beyond Headless Chrome’s rise and the decline of outdated tools, we observed an increase in the total traffic generated by automated browsers relative to non-automated web surfing.

The chart below shows automated browser traffic as a percentage of all traffic generated by web browsers:

Traffic ratio between automated and non-automated browsers

So, why is Headless Chrome so popular?

Headless Chrome is popular for several reasons. One is its support for Chrome’s new “out of the box” features, which constantly introduce new trends in web development. Another is its support for major desktop, server, and mobile operating systems. Headless Chrome also comes with convenient development tools and many other features useful to developers.

The release of Puppeteer, a couple of months after the headless functionality itself, gave Headless Chrome’s popularity a decisive push. Puppeteer is a Node.js library developed by the Chrome team that provides a high-level API to control headless and full versions of the latest Chrome.

Enter Puppeteer

Puppeteer is a common and natural way to control Chrome. It provides full access to browser features and, most importantly, can run Chrome in fully headless mode on a remote server, which is very useful for both automation teams and attackers.

Without much difficulty, attackers can set up an infrastructure of many nodes running Headless Chrome, all orchestrated by a single component (Puppeteer).

Apart from Puppeteer, Chrome can be automated through WebDriver and automation frameworks like Selenium, or by direct access through the Command Line Interface (CLI). In these cases some Chrome functionality is limited, but they offer the flexibility to write automation in programming languages other than JavaScript on Node.js.
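
As a rough illustration of the CLI route, the sketch below drives Chrome from Python instead of Node.js. The `--headless`, `--disable-gpu` and `--dump-dom` switches are documented Chrome flags; the binary names searched on the PATH and the helper function names are assumptions made for this example and will differ between systems.

```python
import shutil
import subprocess
from typing import List, Optional

# Assumed binary names; they vary by operating system and install method.
CHROME_CANDIDATES = ["google-chrome", "chromium", "chromium-browser", "chrome"]


def find_chrome() -> Optional[str]:
    """Return the first Chrome/Chromium binary found on PATH, or None."""
    for name in CHROME_CANDIDATES:
        path = shutil.which(name)
        if path:
            return path
    return None


def headless_dump_dom_command(chrome: str, url: str) -> List[str]:
    """Build the argv for a headless run that prints the rendered DOM."""
    return [chrome, "--headless", "--disable-gpu", "--dump-dom", url]


def dump_dom(url: str, timeout: float = 30.0) -> Optional[str]:
    """Render `url` in headless Chrome and return the DOM as text.

    Returns None when no Chrome binary is found on this machine.
    """
    chrome = find_chrome()
    if chrome is None:
        return None
    result = subprocess.run(
        headless_dump_dom_command(chrome, url),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    return result.stdout
```

Calling `dump_dom("https://example.com")` returns the page markup after JavaScript has run, which is the part a plain HTTP client cannot reproduce. The one-shot CLI gives far less control than Puppeteer (no clicks, no per-request hooks), which matches the limitation noted above.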

Just how popular is it among attackers?

By analyzing malicious activity generated by automated browsers, I found that PhantomJS used to be the leader not only in the amount of traffic it produced but also in malicious activity.

Nowadays, however, Chrome occupies the top of the “attackers’ podium,” with half of the malicious traffic divided evenly between headless and non-headless execution.

Taking a closer look at malicious traffic, however, I found no specific trend indicating that attackers prefer Headless Chrome for exploiting vulnerabilities, injecting SQL or carrying out cross-site scripting (XSS) attacks. That said, occasional spikes show attempts to target specific sites with vulnerability scanners, or to exploit newly released vulnerabilities using the “spray and pray” technique.

Using a web browser for vulnerability scanning is a crafty approach, though not a new one: it can help bypass validation mechanisms that check the legitimacy of the client.

WAF events generated by Headless Chrome

Analyzing traffic from the last year, I didn’t find any DDoS attacks performed from a botnet based on Headless Chrome. There was nothing similar to the Headless Android Botnet, which was discovered two years ago and has since all but vanished.

Automated browsers in general, and Headless Chrome in particular, are not commonly used for DDoS because a browser can only generate a low request rate. Chrome receives the response from the server, evaluates it and only then performs the next request, so its rate is very low compared with a simple script that floods the server with requests and doesn’t “care” about the responses.
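
To make the rate gap concrete, here is a back-of-the-envelope sketch. All timings are hypothetical illustration values, not measurements from the traffic analyzed in this article, and the function names are invented for the example.

```python
def sequential_rate(round_trip_s: float, evaluation_s: float) -> float:
    """Requests per second for a client that waits for each response
    and evaluates it (HTML, JavaScript) before sending the next request."""
    cycle = round_trip_s + evaluation_s
    if cycle <= 0:
        raise ValueError("cycle time must be positive")
    return 1.0 / cycle


def fire_and_forget_rate(send_cost_s: float) -> float:
    """Requests per second for a script that only pays the cost of
    sending and never waits for or reads the response."""
    if send_cost_s <= 0:
        raise ValueError("send cost must be positive")
    return 1.0 / send_cost_s


# Assumed numbers: 200 ms round trip, 300 ms to evaluate the page,
# 1 ms for a flooding script to push one request onto the wire.
browser_like = sequential_rate(round_trip_s=0.2, evaluation_s=0.3)
flood_like = fire_and_forget_rate(send_cost_s=0.001)
```

Under these assumed timings the browser-like client manages about 2 requests per second, while the flooding script reaches about 1,000, a gap of roughly 500 times per connection.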

 

Having said that, every day we observe more than 10K unique IP addresses performing scraping, sniping, carding, blackhat SEO and other types of malicious activity where JavaScript evaluation is necessary to perform the attack. The chart below shows how this activity is distributed among countries of origin; 7% of the traffic comes through proxies or VPNs that hide the origin of the attack.

Geographical distribution of malicious Headless Chrome traffic

But what about legitimate services?

Headless Chrome is used not only by attackers but also by legitimate services. We observe dozens of legitimate, well-known web tools that use it to access websites.

  • Search engines use it to render the page, generate dynamic content and index data from single page web applications.
  • SEO tools use it to analyze your website and help promote it better.
  • Monitoring tools use Headless Chrome to measure performance and JavaScript execution time of web applications.
  • Online testing tools render pages and compare them to previous versions to track regression or distortion in the user interface.

Ok, so how do we make sure we’re protected?

At this point, you’re probably asking yourself whether to block Headless Chrome and other automated browsers.

The answer to this question is “yes… and no.”

Using Headless Chrome is not malicious in itself and, as stated earlier, there are legitimate scenarios and services that use this functionality to access websites. Whitelisting all legitimate services is tough work, as it requires constantly mapping such services and maintaining lists of them and their IPs.

The decision whether to block Headless Chrome requests should be based on the intent and behavior of each IP and session individually.

Unless the payload is malicious (which is strong evidence of malicious activity), it is better to pass some requests to the website, analyze the behavior and only then decide whether to block.

IP reputation and correlation, sophisticated heuristics, and machine learning algorithms can be used to make a deliberate decision, which gives better long-term results than aggressive blocking, at least in most cases.
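
The per-session approach described above can be sketched roughly as follows. This is an illustration only, not how any particular WAF implements it: the `Session` fields, the thresholds and the three verdict strings are all hypothetical. The one grounded detail is that Headless Chrome’s default User-Agent contains the token `HeadlessChrome`; since the header is client-controlled and easily overridden, it is only a first signal.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Counters kept per IP and session (field names are illustrative)."""
    ip: str
    user_agent: str
    observed_requests: int = 0
    suspicious_requests: int = 0


def is_headless_chrome(user_agent: str) -> bool:
    # Default Headless Chrome User-Agents carry a "HeadlessChrome" token.
    # Clients can spoof this header, so treat it as a hint, not proof.
    return "HeadlessChrome" in user_agent


def decide(session: Session,
           payload_is_malicious: bool,
           bad_ip_reputation: bool,
           min_observed: int = 20,
           max_suspicious_ratio: float = 0.2) -> str:
    """Return "block", "observe" or "allow" for a session's next request."""
    if payload_is_malicious:
        return "block"    # strong evidence: no need to wait for more data
    if not is_headless_chrome(session.user_agent):
        return "allow"    # other clients are out of scope for this sketch
    if session.observed_requests < max(min_observed, 1):
        return "observe"  # pass the request and keep collecting behavior
    ratio = session.suspicious_requests / session.observed_requests
    if bad_ip_reputation or ratio > max_suspicious_ratio:
        return "block"
    return "allow"
```

A new Headless Chrome session with a clean payload starts in "observe" and is only blocked or allowed once enough requests have been seen. That is the trade-off the text argues for: a short observation window in exchange for fewer false positives against legitimate services.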

 

For Imperva Incapsula (now Imperva Cloud Application Security) users, a set of IncapRules can be implemented to block Headless Chrome from accessing your website. These range from a simple rule based on client classification to sophisticated rules that combine rates, tags, and reputation.