Python and Go are 2019's Top Programming Languages used by Attackers | Imperva


Imperva Cloud WAF protects over a hundred thousand websites globally and observes around a billion attacks daily. We detect thousands of hacking tools on a daily basis and employ various measures to stop malicious requests. Here are the most dangerous tools and attacks we discovered while observing billions of attacks over the course of 2019.

We use an advanced, intelligent Client Classification mechanism which classifies various web clients. To identify the top tools used by hackers, we looked at all the attacks observed during 2019 and clustered them into security incidents. By clustering the data we were able to reduce the bias caused by large attacks and instead focus on diverse attacks on multiple sites over time.

Notably, Python, the popular programming language, continued to be the weapon of choice for most hackers, while Google’s Go language was on the rise. Next came the WinHttp library, which is mainly used by .NET and C++ programs running on Windows, followed by shell tools such as cURL, wget and others. The rest of the top tools were more programming languages and browsers.

We’ll provide more statistics on the top tools, such as attack-type and source-country distributions. We’ll also drill down briefly into some of the top tools and finish with advice on how you can protect your website from them.

Some stats from the community

We decided to take a look at some GitHub stats to understand which languages were used the most.

According to GitHut 2.0, Python and Go were among the top five languages for 2019:

We decided to focus specifically on cybersecurity projects in GitHub, assuming most attack tools are tagged as such. GitHub doesn’t classify every repository, but the Security topic in GitHub holds over 8,500 security-related repositories, which is quite a big sample.

Looking at the top five languages used in these repositories, we can see that Python comes in first by a long way, followed by Java, JavaScript, PHP and finally Go. It’s not surprising to see major web languages such as PHP and JavaScript high on the list, alongside robust and widely used languages such as Python and Java. But Go joined the top of the list during 2019 and, even more interestingly, took the place of shell-based code.

Stack Overflow

When comparing GitHub’s statistics to Stack Overflow Trends we get a similar picture. It’s hard to say why there aren’t as many questions about Go as there are pull requests from Go repositories. One amazing statistic is Python’s quick and sharp rise to power – with an average annual rise of 13%, it almost quadrupled in a decade.
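As a rough sanity check on the growth figure, the snippet below compounds a 13% annual rise over ten years. It assumes a constant rate every year, which is a simplification; the variable names are illustrative only.

```python
# Compound an assumed constant 13% annual rise over a decade.
annual_rise = 0.13
years = 10

# Each year multiplies the previous level by (1 + annual_rise).
growth_factor = (1 + annual_rise) ** years

print(f"Growth factor after {years} years: {growth_factor:.2f}x")
```

Under that assumption the factor comes out at roughly 3.4x, which is in the same range as the "almost quadrupled" figure quoted above.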

Cloud WAF Statistics

To see the spread of attacks by these tools on the sites we protect, we created a graph showing the percentage of sites attacked by each tool during 2019. The tools we observed ranked at the top both in number of incidents and in percentage of attacked sites. Most sites were hit by Python every month, while each of the other tools hit 30-50% of the sites.

We decided to look at the two most popular attacks with the largest number of variations, XSS and SQLi, and at the exploitation attempts made with them via Go and Python.

Noticeably, by the end of 2019, Go had caught up with Python in both types of attacks. It’s still early to tell if this trend will continue, but it’s easy to see that Go had become significantly more popular by the end of 2019.

We wanted to check the tool distribution for each of the main attack types we observed:

[Figure: attack type distribution]

As you can see, Python was the strongest tool in RCE/RFI, File Upload and Data Leakage, while Go was stronger in general automated attacks.

Let’s look at tool usage by source country, based on source IP:

[Figure: tool usage by source country]

Attacks originating from China used Python far more than those from any other country, while Go was the go-to tool for attacks originating from India. It’s hard to say why, but given how well known these countries are for their cyber activity, it’s possible that new hackers joining the market chose modern tools for their nefarious activities.

IPs vs Incidents

Surprisingly, there wasn’t a strong correlation between the number of IPs using a tool to attack and the number of security incidents caused by the tool. This can be explained, partially, by the type of attacks the tools were involved in. Sophisticated, automated attacks tend to be coordinated; massive, wide-scale attacks, not so much.

[Figure: IPs vs Incidents]

This hypothesis is further supported by the ratio of incidents to IPs. Tools with a low request rate that can easily be used for browser impersonation had a very low incident-to-IP ratio. In comparison, tools that can easily generate wide-scale attacks, like Go and Python, had a significantly higher incident-to-IP ratio.
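The ratio itself is simple arithmetic. Here is a minimal sketch of how it could be computed per tool; the function name, the tool labels, and all the numbers are made up for illustration and are not taken from our data.

```python
def incident_to_ip_ratio(incidents, unique_ips):
    """Return the number of incidents per attacking IP for one tool."""
    if unique_ips == 0:
        raise ValueError("unique_ips must be greater than zero")
    return incidents / unique_ips

# Hypothetical counts per tool: (security incidents, unique source IPs).
sample_counts = {
    "wide_scale_tool": (12000, 3000),
    "browser_impersonation_tool": (500, 5000),
}

ratios = {
    tool: incident_to_ip_ratio(incidents, ips)
    for tool, (incidents, ips) in sample_counts.items()
}

for tool, ratio in ratios.items():
    print(f"{tool}: {ratio:.2f} incidents per IP")
```

With these invented numbers, the wide-scale tool produces several incidents per IP, while the impersonation tool spreads few incidents over many IPs, which is the pattern described above.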

Let’s go into the bits and bytes

[Figure: popular Python libraries]

The most popular Python libraries are Requests, urllib and asyncio. Use of asyncio has grown since last year, but it is still far behind the other two. However, we expect further growth because of the capabilities it offers for writing asynchronous programs in Python.
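To illustrate why asynchronous code is attractive for any I/O-heavy client, here is a minimal sketch using only the standard library. The `fake_request` coroutine is a hypothetical stand-in that sleeps instead of touching the network, so the example shows only the concurrency model, not a real HTTP client.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Stand-in for a network call; asyncio.sleep yields control while waiting."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    start = time.perf_counter()
    # gather() schedules the three coroutines concurrently on a single thread
    # and returns their results in the order they were passed in.
    results = await asyncio.gather(
        fake_request("a", 0.2),
        fake_request("b", 0.2),
        fake_request("c", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

Because the three waits overlap, the total time should be close to the longest single delay rather than the sum of all three.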

We decided to examine some of the CVEs that were commonly exploited by hacking tools:

In first place, with around 7 million HTTP requests, was CVE-2017-9841, a remote command execution vulnerability in PHPUnit, a widely used testing framework for PHP. Drilling down into its history, we see that the CVE was published in June 2017, although the fix had already been committed on November 11, 2016.

On September 4, 2019, Drupal published a public service announcement (PSA-2019-09-04) on a vulnerability in Mailchimp, which was using PHPUnit as a third-party library. In addition, on January 7, 2020, PrestaShop (an e-commerce solution written in PHP) published a security announcement regarding the PHPUnit vulnerability being exploited by malware named XsamXadoo Bot. These announcements revived the CVE and triggered hype on Twitter, where many posts about the vulnerability were published. Vulnerability databases such as VulnDB also updated the record for this CVE and added new links, for example to a POC written back in 2017 in a blog that’s only available via the Wayback Machine. For additional info you can read the blog post “The resurrection of PHPUnit RCE Vulnerability”.

When we examined our data for a popular CVE exploited with the Go-based attack tool, we found that one of the most popular, with around 200K HTTP requests, was actually a group of CVEs (CVE-2016-5385, CVE-2016-5386, CVE-2016-5387, CVE-2016-5388, CVE-2016-1000109, and CVE-2016-1000110). All of these CVEs relate to a problem with the HTTP_PROXY environment variable, known as the ‘httpoxy’ issue.

Web servers running in a CGI or CGI-like context may assign client request Proxy header values to internal HTTP_PROXY environment variables. This vulnerability can be leveraged to conduct man-in-the-middle (MITM) attacks on internal subrequests or to direct the server to initiate connections to arbitrary hosts.
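One commonly suggested mitigation for httpoxy is to drop the client-supplied Proxy header before the application sees it. The sketch below shows the idea as a small WSGI middleware. It relies on the fact that in CGI-style environments a request header named `Proxy` appears as `HTTP_PROXY`; the class and function names are hypothetical, and in practice the header is often stripped at the web server or reverse proxy instead.

```python
class StripProxyHeader:
    """WSGI middleware that removes the client-supplied Proxy header."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # A request header "Proxy: ..." is exposed as environ["HTTP_PROXY"],
        # which HTTP client code may mistake for the proxy configuration variable.
        environ.pop("HTTP_PROXY", None)
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Tiny demo application that reports what it sees in HTTP_PROXY."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    seen = environ.get("HTTP_PROXY", "<unset>")
    return [f"HTTP_PROXY={seen}".encode()]


# Wrap the application once at startup.
wrapped_app = StripProxyHeader(demo_app)
```

A request carrying `Proxy: http://attacker.example:8080` would reach `demo_app` with no `HTTP_PROXY` entry at all, so outgoing subrequests cannot be redirected through it.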

Let’s move on to cURL, the most common shell tool used in attacks. One interesting CVE here, with around 100K HTTP requests, is a Remote Code Execution vulnerability in Apache Struts (CVE-2016-3087). Apache Struts is a free, open-source Model-View-Controller (MVC) framework for creating elegant, modern Java web applications. The Remote Code Execution can be performed via the REST plugin, using the ! operator, when Dynamic Method Invocation is enabled.

How can you protect yourself?

If you have a Client Classification mechanism, which can be as simple as looking at the User-Agent, these insights will allow you to easily defend against many common attacks. Treat these tools suspiciously – if you don’t expect your application, or part of it, to be accessed by these tools, block requests coming from them. In other cases, you might know that only specific IPs are supposed to use these tools, such as an IP that performs health monitoring, for example. You might want to restrict access to these IPs alone.
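As a minimal sketch of the simple User-Agent approach, the code below maps a few User-Agent markers to tool names and blocks them unless the source IP is on an allowlist. The marker strings are assumptions based on the default User-Agents these tools commonly send, the allowlisted address is a placeholder from a documentation range, and the function names are hypothetical. A User-Agent is trivially spoofed, so treat this as a first filter only.

```python
import ipaddress

# Assumed default User-Agent markers (lowercase); tools can change or spoof them.
TOOL_MARKERS = {
    "python-requests": "Python",
    "python-urllib": "Python",
    "go-http-client": "Go",
    "curl/": "cURL",
    "wget/": "wget",
    "winhttp": "WinHttp",
}

# Hypothetical allowlist, e.g. a health-monitoring host that legitimately uses cURL.
ALLOWED_TOOL_NETWORKS = [ipaddress.ip_network("203.0.113.10/32")]


def classify(user_agent):
    """Return the tool name for a User-Agent string, or None if no marker matches."""
    ua = (user_agent or "").lower()
    for marker, tool in TOOL_MARKERS.items():
        if marker in ua:
            return tool
    return None


def should_block(user_agent, source_ip):
    """Block known tools unless the request comes from an allowlisted network."""
    if classify(user_agent) is None:
        return False
    ip = ipaddress.ip_address(source_ip)
    return not any(ip in network for network in ALLOWED_TOOL_NETWORKS)
```

For example, `should_block("curl/7.68.0", "203.0.113.10")` lets the monitoring host through, while the same User-Agent from any other address is blocked.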

In any case, Client Classification or not, the standard recommendations remain the same – keep your system patched, develop with security in mind, and don’t do anything risky, even temporarily.

More information about client classification, specific attacks, and the global cyber threat landscape can be found in the Cyber Threat Index.