Behind the Scenes: How Attackers Extend the Lifespan of Phishing Campaigns


In a previous Imperva Hacker Intelligence Initiative (HII) report we delved into some of the financial aspects of phishing and credential theft. Obviously, one of the important factors in the cost of a campaign is the lifespan of a phishing site. With so many prying eyes of security vendors and researchers, phishing campaign operators are trying to find ways to extend the life expectancy of their pages and servers.

We recently analyzed more than 60 deployment kits retrieved from phishing sites around the world. These deployment kits, usually in the form of a compressed archive file, contain the files required for setting up and configuring a phishing site. One of the common features we found in the kits (in 33 percent of them, to be precise) is a mechanism for blocking unwanted visitors. Blocked visitors get the impression that the site is already down, which extends the site's life expectancy and increases the owner’s ROI.

Read on for a deeper look at these phishing campaign techniques.

Blocking Mechanisms Used by Attackers

The kits contained the expected HTML, JavaScript and image files for hosting the phishing pages, and the PHP file for sending the stolen data to the attacker. However, a third of the kits also contained a mechanism for denying access to the site.

Digging into the kits’ code revealed two common methods for avoiding detection by security companies: the .htaccess file (on Apache sites) and dedicated PHP modules (embedded in the phishing pages through an “include” directive or written directly into the page code). We take a look at each one.

Block via .htaccess

Attackers have been using the .htaccess file for a while now. They use this file to hide malware, to redirect search engines to their own sites, and for many other purposes. In this case the .htaccess file is being used by attackers in order to deny access from a particular IP address, or IP address ranges that belong to bots, security companies, anti-phishing engines or even to a specific ISP.

The following figures show parts of .htaccess files we found in the deployment kits:

Figure 1: .htaccess file – blocking by IP addresses and by domain names
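For readers who want to inspect such files themselves, the following is a minimal Python sketch of extracting the blocked entries from an .htaccess file. It assumes the Apache 2.2-style "deny from" syntax. The sample rules, the placeholder domain and the function names are illustrative assumptions, and the addresses come from ranges reserved for documentation, not from the kits.

```python
import ipaddress
import re

# Hypothetical sample; addresses are from ranges reserved for documentation.
SAMPLE_HTACCESS = """\
order allow,deny
deny from 192.0.2.15
deny from 198.51.100.0/24
deny from 203.0.113
deny from scanner.example.com
allow from all
"""

DENY_RE = re.compile(r"^\s*deny\s+from\s+(.+?)\s*$", re.IGNORECASE)


def parse_deny_rules(text):
    """Split 'deny from' targets into IP networks and host names."""
    networks, hostnames = [], []
    for line in text.splitlines():
        match = DENY_RE.match(line)
        if not match:
            continue
        for target in match.group(1).split():
            network = to_network(target)
            if network is not None:
                networks.append(network)
            else:
                hostnames.append(target)
    return networks, hostnames


def to_network(target):
    """Return a network for an address, CIDR block, or partial address."""
    parts = target.rstrip(".").split(".")
    # Apache treats a partial address such as "203.0.113" as a prefix match.
    if "/" not in target and 1 <= len(parts) < 4 and all(p.isdigit() for p in parts):
        padded = ".".join(parts + ["0"] * (4 - len(parts)))
        target = f"{padded}/{8 * len(parts)}"
    try:
        return ipaddress.ip_network(target, strict=False)
    except ValueError:
        return None  # not an IP form, so treat it as a host or domain name


networks, hostnames = parse_deny_rules(SAMPLE_HTACCESS)
```

Converting every entry to a network object up front makes later steps, such as merging overlapping ranges or counting addresses, straightforward.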

We also observed .htaccess files constructed to perform a conditional redirect based on the HTTP referer, remote IP address, and HTTP user agent. The following .htaccess file is an example of using the user-agent and referer headers to make the server return a 403 Forbidden status code to the client instead of the phishing page:

Figure 2: .htaccess file – return 403 Forbidden status code to blacklisted entities

Essentially the owner of the phishing site is looking to identify tools that may expose the campaign or taint the data. Tool signatures are often found in the user-agent and referer headers. We observed several .htaccess files where the owner of the phishing site set conditions based on user-agent and referer headers to deny access to spam bots, crawlers, research spiders and phishing (or malware) scanners.
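The decision that such rules encode can be sketched in Python as follows. The patterns are hypothetical placeholders rather than the signatures found in the kits, and the function only mimics the effect of the conditions; it does not reproduce Apache's rule processing.

```python
import re

# Hypothetical signature lists for illustration only.
BLOCKED_USER_AGENTS = [r"crawler", r"spider", r"scanner-example"]
BLOCKED_REFERERS = [r"antiphishing\.example\.org"]


def status_for_request(headers):
    """Return 403 when a header matches a blacklisted pattern, else 200."""
    user_agent = headers.get("User-Agent", "")
    referer = headers.get("Referer", "")
    for pattern in BLOCKED_USER_AGENTS:
        if re.search(pattern, user_agent, re.IGNORECASE):
            return 403
    for pattern in BLOCKED_REFERERS:
        if re.search(pattern, referer, re.IGNORECASE):
            return 403
    return 200


bot_status = status_for_request({"User-Agent": "ExampleSpider/1.0"})
browser_status = status_for_request({"User-Agent": "Mozilla/5.0"})
```

Because both headers are fully controlled by the client, a scanner that presents an ordinary browser user agent and no referer would pass checks of this kind.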

The following file is another example of blocking based on the HTTP referer:

Figure 3: .htaccess file – blocks access of several anti-phishing services

Block via PHP

This blocking method is relatively straightforward. It is a copy-and-paste code example that appears on the first results page of a Google search (see below). The attacker places this code snippet at the top of any PHP page to which they wish to block access:


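<?php
// List of IP addresses to block
$deny = array("", "", "");

if (in_array($_SERVER['REMOTE_ADDR'], $deny)) {
    // Deny access, e.g. redirect elsewhere or return an error page
} ?>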

The code creates a list of IP addresses designated for blocking, and then checks the incoming address against the array. If the incoming address matches any value in the list, the code denies access to the page. We have seen two types of responses: redirection to a specified URL (usually Google’s home page) and an HTTP 404 error page.

Figure 4: block.php file – redirect to

Figure 5: anti.php file – return HTTP 404 Not Found error

How Many Entities Are Blocked?

We analyzed a data set of blocked IP addresses we found in 20 blocking files collected from the deployment kits. The total number of blocked records was 3,850, where each record represents an individual IP address or range of IP addresses. Normalization and analysis of the records reveal 1,215 subnets and 155 unique IP addresses, which represent 69,831,362 IP addresses in total, forming 6.5 percent of all non-reserved Internet addresses.
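One plausible way to perform this kind of normalization is sketched below in Python, using the standard ipaddress module to merge overlapping records and count the distinct addresses they cover. This is an illustration under stated assumptions, not the procedure used for the figures above; the records are hypothetical and use documentation-reserved ranges.

```python
import ipaddress


def normalize(records):
    """Collapse overlapping records and count the distinct addresses covered."""
    networks = [ipaddress.ip_network(r, strict=False) for r in records]
    # collapse_addresses merges adjacent ranges and drops contained ones.
    collapsed = list(ipaddress.collapse_addresses(networks))
    total = sum(n.num_addresses for n in collapsed)
    return collapsed, total


# Hypothetical records: two adjacent halves of a /24, one address inside
# that /24, and one unrelated single address.
records = ["192.0.2.0/25", "192.0.2.128/25", "192.0.2.7", "198.51.100.4"]
collapsed, total = normalize(records)
```

Here the four records collapse to one /24 subnet and one individual address, so counting addresses after collapsing avoids counting the same address twice.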

For our initial analysis, we split the individual IP addresses and subnets between those that have a single appearance in our data set and those that have multiple appearances. The following graph summarizes the appearance of individual IP addresses in our data set:

Figure 6: Appearance of individual IP addresses in the data set

The following figure summarizes the appearance of subnets in the data set:

Figure 7: Appearance of subnets in the data set

It is immediately apparent that almost 25 percent of subnets and 76 percent of individual IP addresses appeared in two or more blocking files. Furthermore, 6 of the 20 blocking files we analyzed were completely identical to each other. The identical blocking files and the code similarity (see A Brief Look into Deployment Kits below) show once again that do-it-yourself (DIY) phishing and managed phishing are the main factors in the growth of phishing (see our previous HII report).
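A share of this kind can be computed by counting, for each distinct entry, the number of blocking files in which it appears. The Python sketch below shows the idea on hypothetical data; the function name and the sample entries are assumptions for illustration.

```python
from collections import Counter


def share_in_multiple_files(blocking_files):
    """Fraction of distinct entries that appear in two or more files."""
    counts = Counter()
    for entries in blocking_files:
        counts.update(set(entries))  # count each entry once per file
    if not counts:
        return 0.0
    repeated = sum(1 for c in counts.values() if c >= 2)
    return repeated / len(counts)


# Hypothetical blocking files, each given as a list of normalized entries.
files = [
    ["192.0.2.1", "198.51.100.0/24"],
    ["192.0.2.1", "203.0.113.9"],
    ["192.0.2.1", "198.51.100.0/24", "203.0.113.77"],
]
share = share_in_multiple_files(files)
```

In this toy input, two of the four distinct entries appear in more than one file, so the share is one half.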

We found that Google and Amazon are listed in most of the blocking files. This is probably to avoid detection and indexing of the phishing sites and to block any client that is unlikely to be a real victim. Interestingly, Israel’s Internet Service Providers (012 Smile, 013 NetVision and Bezeq International) are also very common in the blocking files. We assume that this is due to the abundance of cyber security companies in Israel.

Who is on the Blacklist?

Once we looked at the data, it became apparent that there are multiple organizations blacklisted by the hackers. These include Internet hosting services, security companies, anti-phishing services, and government organizations. By enriching the IP address information with “Bulk IP Address Lookup”, we were able to recognize more than 400 different organizations, more than 100 of which are cyber security companies and anti-phishing services.

The attackers’ main goal is to block any client which is unlikely to be a real victim. Therefore, some of the blocking files contain IP address ranges that cover the entire address space of a particular organization.

Furthermore, we found more than 200 tools and bots that are blacklisted by the phishing sites’ owners. These include legitimate tools such as download manager programs, good bots like Google’s web crawling bot, but also several bad bots like spam bots, spiders, crawlers, and scrapers.

We also noticed that 40 percent of blocking files contained TOR IP addresses, intended to block researchers, who frequently use the TOR network to hide their true identity. An important point to note is that once the phishing site is deployed, the owner does not expect to access it anymore; otherwise, TOR access would be allowed in order to maintain the owner's anonymity.

Figure 8: TOR network IP addresses blacklisted

Figure 9: Tor-exit nodes blacklisted
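A check of this kind can be sketched in Python by testing each known exit address against the blocked records. The sketch assumes a locally available list of exit node addresses; the function name and all addresses shown are hypothetical placeholders from documentation-reserved ranges.

```python
import ipaddress


def blocked_exit_nodes(blocked_records, exit_ips):
    """Return the exit addresses that fall inside any blocked record."""
    networks = [ipaddress.ip_network(r, strict=False) for r in blocked_records]
    hits = []
    for ip in exit_ips:
        address = ipaddress.ip_address(ip)
        if any(address in network for network in networks):
            hits.append(ip)
    return hits


# Hypothetical inputs: one blocked subnet and one blocked single address.
blocked = ["192.0.2.0/24", "198.51.100.7"]
exits = ["192.0.2.44", "198.51.100.8", "198.51.100.7"]
hits = blocked_exit_nodes(blocked, exits)
```

A blocking file would count as containing TOR addresses when the returned list is non-empty.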

We also looked at the geographic distribution of the blocked IP addresses and subnets.

Figure 10: Geographic distribution of blocked IP addresses and subnets (%)

It is interesting to see that Israel is in sixth place in the list of blocked countries.

As a part of our research, we decided to check whether we are blacklisted by phishing attackers. We found Imperva’s IP address range in four different .htaccess files. Based on our sample, we assume that a third of the phishing sites have a blocking functionality. Our IP addresses appear in a fifth (20 percent) of the blocking lists, so we estimate that about 6.6 percent of all phishing sites in the wild (one third multiplied by one fifth) block our addresses.

The following example shows the page returned by a phishing site when it is accessed from a blacklisted IP address, compared with the page returned when it is accessed from an IP address that is not blacklisted:

Figure 11: Blocking in action – 404 Not Found

To check our hypothesis, we fetched a list of more than one thousand known phishing sites using anonymous proxies and compared the resulting pages with those returned when the same sites were fetched from our company IP address.

Using an anonymous proxy improved our accessibility to the phishing sites by 7 percent, which is very close to our estimate of 6.6 percent.
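The comparison can be sketched in Python as follows. This is a simplified illustration, not the measurement procedure itself: it assumes the results of both fetch runs were already recorded as HTTP status codes per URL, treats a 200 status as "accessible", and reads the improvement as a difference in percentage points. The URLs and status codes are hypothetical.

```python
def accessibility_gain(direct, proxied):
    """Percentage-point gain in reachable sites when fetching via proxy.

    Both arguments map a URL to the HTTP status code observed for it.
    """
    urls = direct.keys() & proxied.keys()  # compare only URLs seen in both runs
    if not urls:
        return 0.0
    ok_direct = sum(1 for u in urls if direct[u] == 200)
    ok_proxied = sum(1 for u in urls if proxied[u] == 200)
    return 100.0 * (ok_proxied - ok_direct) / len(urls)


# Hypothetical results for four sites.
direct = {"site-a": 200, "site-b": 404, "site-c": 200, "site-d": 403}
proxied = {"site-a": 200, "site-b": 200, "site-c": 200, "site-d": 403}
gain = accessibility_gain(direct, proxied)
```

Status codes alone would miss sites that block by redirecting to a legitimate page, so a fuller comparison would also look at the returned content.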

Obviously, the amount of improvement will differ between organizations, depending on how often each organization's IP addresses appear in the blacklists.

A Brief Look into Deployment Kits

Since we already had a nice collection of deployment kits, we decided to run some additional analysis. One type of analysis we ran was code similarity. When we looked at the source code, we immediately noticed that a third of the kits contain precisely the same files, which are related to a Google Docs phishing campaign:

Figure 12: Files within phishing archive

Index.php – PHP code that processes credentials posted to a fake “Google Docs” page and sends them to an attacker-controlled email account.

Verification.php – PHP code that is relevant for Gmail accounts only. It presents a fake verification page and obtains the victim's sensitive information, such as a recovery email address or phone number.

geoplugin.class.php – a PHP class used by the attacker to geolocate the IP address of the victim.

The following is an example of the “index.php” file we found on 20 different phishing servers:

Figure 13: index.php

The “Index.php” file presents a Google Docs phishing page:

Figure 14: Google Docs phishing page designed to obtain email credentials

An interesting fact is that the signature ‘By NoBODY’ (in the first line of the message) appears in 17 of 20 index.php files. The other three files contain exactly the same code, but are signed by a different attacker (‘BY Miracle’) or not signed at all. The source code of the other phishing sites can be divided into smaller clusters of 2-4 sites, which contain the same or almost identical content. This demonstrates the two models we talked about in our previous HII report, Managed Phishing and Do-it-yourself (DIY) Phishing. As mentioned in that report, “The attacker can utilize existing services to execute the phishing scam to minimize the operational expenses, including the scam pages which are sold online for a variety of services and sites.”
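A simple first pass at this kind of clustering is to group kits by a hash of a given file's contents. The Python sketch below illustrates that approach under stated assumptions: the kit names and file contents are hypothetical, and the grouping is exact, so it finds only byte-identical files. Files that differ only in a signature line, or that are "almost identical", would need a fuzzy comparison that is not shown here.

```python
import hashlib
from collections import defaultdict


def cluster_identical(files):
    """Group kit names by the SHA-256 digest of a file's bytes."""
    clusters = defaultdict(list)
    for kit_name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        clusters[digest].append(kit_name)
    # Keep only groups with more than one kit, largest group first.
    groups = [sorted(names) for names in clusters.values() if len(names) > 1]
    return sorted(groups, key=len, reverse=True)


# Hypothetical kits mapped to the bytes of their index.php file.
index_files = {
    "kit_a": b"<?php /* variant 1 */ ?>",
    "kit_b": b"<?php /* variant 1 */ ?>",
    "kit_c": b"<?php /* variant 2 */ ?>",
}
groups = cluster_identical(index_files)
```

In this toy input, kit_a and kit_b form one cluster and kit_c is left out because it matches no other kit.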

Phishing Attackers Continue to Up Their Game

In this post we described some initial results of our recent analysis, in particular the attackers’ use of visitor blocking techniques to keep researchers and bots away, thus extending the life expectancy of phishing sites. This is in line with our previously reported observations regarding the importance of phishing campaign financials. Our research into phishing shows that industrial-grade platforms and infrastructure are constantly being set up and improved by specialized attackers, proving that massive phishing campaigns are still a major tool of the trade for the cyber crime industry. Credentials obtained through these seemingly basic techniques are then used to launch more sophisticated attacks against organizations.

Watch for further results of our cross analysis of the deployment kits in future blog posts and reports.