Ireland-based Datalex provides a unified Digital Commerce Platform, which combines pricing, shopping, order management, and customer insights to deliver competitive and differentiated retail experiences at every touch point in the travel journey.

Voted the “World’s Leading Travel Merchandising Solution Provider” at the 2014 World Travel Awards, the Datalex commerce platform enables a travel marketplace of over one billion shoppers covering every corner of the globe, driven by some of the world’s most innovative airline retail brands. Its customers include Aer Lingus, Abacus, Air China, Air Transat, Brussels Airlines, Copa Airlines, Delta Air Lines, Edelweiss, HP Enterprise Services, JetBlue Airways, Philippine Airlines, SITA, Swiss International Air Lines, Virgin Atlantic, Virgin Australia, West Air, and WestJet.


Bad actors were scraping Datalex customers’ sites, diminishing SEO, and luring away upsell and cross-sell opportunities

The travel industry is particularly vulnerable to bad bots and web scraping. “Our customer sites have valuable data for scrapers, who can be aggressive in how they obtain it,” said Eric Chapin, Security Administrator at Datalex, who added that the industry is fairly tolerant of scrapers in general, because scraping is how airlines share fares with other travel sites such as Travelocity and Kayak. “Some scrapers are ‘allowed’ or ‘whitelisted,’ so they can scrape the site for fares,” he explained. “But when they step out of line or go too deep into the site then we need to take action.”

Persistent scrapers can steal content from travel sites and post it on their own site, diminishing SEO impact and usurping advertising dollars. Web scraping bots are also used to monitor fare prices, so competitors can undercut with lower fare offerings.

“In the airline industry, bots can cause problems with customer contact information, flight changes and check-in procedures,” said Chapin. “They can lure away business by stealing upsell and cross-sell opportunities such as upgrades and travel insurance, as well.”

Deep-digging scrapers were increasing Global Distribution System (GDS) API pull costs and causing performance issues

One of Datalex’s airline customers was being bombarded with deep-digging attacks. “A malicious scraper was hitting the site extremely hard,” said Chapin. “The web scraper was going deeper than normal and driving up backend payment costs.” Chapin explained that when airlines do a hard API pull against a global distribution system (GDS), it’s a monetary cost for the business. “This scraper was increasing our customer’s GDS pull costs, and as their managed solution provider, we had to find a way to stop the attacks,” he said.

According to Chapin, even smaller customer sites are hit by bots multiple times a day, which can slow them down or even take them offline. In serious cases, excessive scraping eats up bandwidth and server utilization, causing instability and even downtime. Imperva has seen that roughly 23% of traffic on the average travel website is from bad bots—traffic that can cause websites to experience performance issues or “brownouts.”

“Web scrapers have always been around in the airline community, but it’s only in the last two or three years they’ve gone to levels of truly taking down sites. Modern scrapers are getting to levels that they are effectively a DDoS attempt,” said Chapin.

Identifying and policing bad bot and human website traffic with F5 Networks was burdensome and ineffective

Not all scrapers are bad actors, and distinguishing the good from the bad can be a lot of work. “We have to be sure there’s an ability to whitelist good bots, so they can promote our customers’ fares,” said Chapin.
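The policy Chapin describes, where an allowed scraper is tolerated until it goes too deep into the site, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the client name, the depth limit, and the idea of measuring "depth" by URL path segments are all hypothetical and are not taken from Datalex's or Imperva's configuration. The sketch also assumes the request has already been identified as automated traffic.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: scraper name -> deepest path level it may crawl.
ALLOWED_SCRAPERS = {"partner-fares-bot": 2}


def path_depth(url):
    """Count non-empty path segments, e.g. /fares/dublin has depth 2."""
    return len([segment for segment in urlsplit(url).path.split("/") if segment])


def decide(client_id, url):
    """Return 'allow' or 'block' for one request from an automated client."""
    max_depth = ALLOWED_SCRAPERS.get(client_id)
    if max_depth is None:
        return "block"  # automated client that is not on the allowlist
    if path_depth(url) > max_depth:
        return "block"  # allowlisted, but crawling deeper than permitted
    return "allow"


shallow = decide("partner-fares-bot", "https://example.com/fares/dublin")
too_deep = decide("partner-fares-bot", "https://example.com/booking/checkout/payment")
unknown = decide("unknown-bot", "https://example.com/fares")
```

Here the allowlisted scraper is served while it stays on fare pages and is blocked once it reaches a deeper booking path, and an automated client that is not on the list is blocked outright.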

Datalex had been using an anti-bot solution from F5 Networks, but according to Chapin, it attempted to block scrapers by identifying packets rather than pages or sessions, which made the solution more difficult to tune. “F5 could do some basic rate limiting by identifying packets, but less than 10% of our bot blocking comes through rate limiting today,” he said. “Imperva Bot Management is able to detect more because it’s a graduated path, whereas previous solutions were one and done,” he explained.

F5 Networks was useful for blocking some ordinary scrapers, but insufficient for the bots that went deep into customer sites. “To block the persistent scrapers, we knew we needed a more sophisticated solution than F5,” said Chapin. “We found Imperva Bot Management, did a POC, and as soon as Imperva Bot Management was implemented on the site, we were blocking the bad bots. In fact, once tuning was completed, Imperva Bot Management has eliminated all but allowed scrapers so far.”

Results with Imperva

Imperva complements F5 Networks to block advanced persistent bots

Chapin said that although he uses F5 Networks for other network solutions, its anti-bot solution didn’t provide the granular controls around rate limiting or the signature detection that Imperva Bot Management offers. Additionally, F5 Networks was more IP-centric: it would block the offending IP, but that doesn’t work if attackers spread their bot traffic over hundreds of IPs or come in through peer-to-peer networks or anonymous proxies. Rather than relying on basic rate limiting, Imperva Bot Management uses device fingerprinting, behavioural modelling of each domain’s web traffic, as well as automated JavaScript detection and other advanced browser validation tools.
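A small Python sketch can show why counting requests per IP misses a distributed scraper while counting per device fingerprint does not. Everything in it is an assumption made for illustration: the traffic is synthetic, the fingerprint is a plain string standing in for whatever a real product computes, and the threshold is arbitrary. It is not a description of how F5 or Imperva Bot Management works internally.

```python
from collections import Counter


def flag_heavy_keys(requests, key, limit):
    """Return the values of `key` that appear in more than `limit` requests."""
    counts = Counter(request[key] for request in requests)
    return {value for value, count in counts.items() if count > limit}


# Synthetic traffic: one scraper spreads 300 requests across 100 IPs,
# while 20 ordinary visitors each send 5 requests from their own IP.
scraper = [
    {"ip": f"10.0.0.{i % 100}", "fingerprint": "fp-scraper"}
    for i in range(300)
]
visitors = [
    {"ip": f"192.168.1.{v}", "fingerprint": f"fp-user-{v}"}
    for v in range(20)
    for _ in range(5)
]
traffic = scraper + visitors

flagged_by_ip = flag_heavy_keys(traffic, "ip", limit=10)
flagged_by_fingerprint = flag_heavy_keys(traffic, "fingerprint", limit=10)
```

With these numbers each scraper IP sends only 3 requests, so the per-IP check flags nothing, while the per-fingerprint check isolates the single scraping client and leaves the ordinary visitors alone.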

“F5 Networks would detect an initial attack, but some scrapers could quickly modify their traffic to spread across IPs and more sessions, and look like a normal user,” said Chapin. “The scraper on one of our customer’s sites tried the same distributed attack against Imperva Bot Management. Because Imperva Bot Management was more granular in its tuning and knew the attacker’s signature, he didn’t stand a chance.”

Automated bot detection relieves heavy administrative load

Chapin said Imperva Bot Management’s level of automation is unique among bot detection solutions. With Imperva Bot Management, Chapin’s team can easily set tolerance levels and limits and specify how to block bad traffic. Only a handful of NOC interactions are necessary to roll out the solution, whereas with competing solutions, setting levels and tuning metrics is tedious and can still be ineffective in stopping the more persistent bots.

“Imperva Bot Management was one of the easiest implementations we’ve ever done in our environment and once implemented, the system optimizes itself. We’ve been able to move employee time to more strategic work as opposed to chasing bots.”

Eliminated advanced scrapers, making sites more stable and secure than ever

“Eliminating unwanted hits against the site helps to maintain site stability and reduce backend infrastructure costs. Getting rid of scrapers significantly increases stability, especially with our smaller customers whose sites could be brought down by a big hit,” said Chapin.

“When we implemented Imperva Bot Management with our first customer we saw over a 50% decrease in web traffic with no impact on real human users,” said Chapin. On average, eliminating bad bots decreased traffic to Datalex customer sites by 20-30%. “If we put Imperva Bot Management on a customer’s site, we can be confident there will be no bad bots,” said Chapin.

Forced advanced persistent scrapers to “throw in the towel”

“Imperva Bot Management has completely eliminated all unruly web scraping on our customer sites so far, which is impressive considering the airline industry has some of the most persistent scrapers out there,” said Chapin. “One web scraper was doing anything he could to get airfares, and despite our initial attempts with F5 to stop him, he could still get back into the site within 24 hours. But after we implemented Imperva Bot Management, he simply gave up,” he said.

“Imperva Bot Management has helped us reduce backend infrastructure costs, increased availability, and decreased customer support time,” explained Chapin. “On the security side, Imperva Bot Management is a helpful tool to protect against brute force login attempts, vulnerability scanning, and other attacks against our sites.”