How Incapsula Protects Against Data Leaks

Some of our customers have been asking us if we are vulnerable to the same issues that Cloudflare faced. The recent incident at Cloudflare involved some circumstances where edge servers were running past the end of a buffer and returning memory that contained private information such as HTTP cookies, authentication tokens, HTTP POST bodies, and other sensitive data.

To read more about the incident, see the articles from Ars Technica, The New York Times and others.

In this post, we summarize what a buffer overrun is and how Incapsula mitigates the risk of buffer overruns and data leakage.

What’s a buffer overrun?

Say you have a sack of 20 apples and you want to take out 5 of them. Your “pseudo-program” looks like this:

while (there_are_apples_in_the_sack) {
   take_apples();
   if (apples_in_hand == 5)
      stop_taking_apples();
}

This would work well and give you 5 apples from the sack.

However, if the take_apples() function, in some cases, advances the counter from 4 directly to 6, the condition apples_in_hand == 5 is not met and the program will keep pulling apples from the sack.

In the case of a buffer overrun, such as the one found at Cloudflare, replace “sack of apples” with “memory”, and instead of taking apples, the program is parsing HTML content. When the end-of-buffer check is never met, the application continues to send whatever comes next in memory, regardless of whether that data belongs to the response.

How common was the leak?

According to Cloudflare, the leak occurred in about 1 in every 3,300,000 HTTP requests between February 13th and February 18th, potentially affecting about 0.00003% of requests. While that is statistically small, any data leak is a serious bug, and Cloudflare took the necessary steps to fix the leak immediately.

Are Incapsula customers vulnerable to the same leak?

No. Incapsula uses an entirely different technology stack than Cloudflare. According to Cloudflare, the bug that led to the buffer overrun was not a flaw in their HTML parser itself, but rather in how they used the parser. Cloudflare reports that it uses a Ragel-based parser and NGINX as its proxy server. Incapsula does not use either of these two products for processing customer website traffic.

Does Incapsula parse HTML responses?

No. Incapsula can inject content into HTML responses to enrich pages with additional capabilities, but we do not parse the entire HTML document. As a result, we are not vulnerable to the kind of malformed HTML pages that triggered the data leak.

For example, take the following HTML page:

<html>
   <head>
      <title>Hello World!</title>
   </head>
   <body>
      <p>Hello World!</p>
   </body>
</html>

Incapsula can inject HTML content either into the <head> tag or near the </body> tag. For example, Incapsula can inject the following <script> tag into the <head> tag:

<html>
   <head>
      <title>Hello World!</title>
      <script type="text/javascript" src="utils.js"></script>
   </head>
   <body>
      <p>Hello World!</p>
   </body>
</html>

Can a similar bug occur with Incapsula?

When the Cloudflare news broke, we set up a team to check whether we were vulnerable. One of the things we did was to review our HTML manipulation code for any sign of buffer overruns or related bugs. The Cloudflare blog reported the following code as the root cause of the data leak:

if ( ++p == pe )
   goto _test_eof;

p is a pointer into a memory buffer containing the HTML document and pe is a pointer to the end of that buffer. The bug occurs when p is advanced so that it jumps past pe without ever being equal to it. The condition p == pe is then never met, and the code keeps reading beyond the buffer. The correct code would be:

if ( ++p >= pe )
   goto _test_eof;

The equivalent (stripped-down) code in Incapsula uses the correct pattern. Note that the Incapsula code tests the opposite condition, that is, whether to continue consuming bytes from the buffer rather than whether to stop, so p < pe is the correct condition in that case:

for ( ; p < pe ; ++p) {
   // do something
}

How do we protect the Incapsula service from these issues?

First, our approach to software engineering is to minimize the chance that our developers do not fully understand the code running our services. That’s not to say we don’t use core software libraries like libc or OpenSSL, but we try to make our mission-critical systems, such as our HTTP/S reverse proxy, DNS, and Behemoth scrubbing servers, as transparent as possible for our developers.

Interestingly, the first version of our HTTP parser was based on Ragel, but we only used it for a few months. Our developers felt that the additional level of abstraction involved in writing the parser definition introduced an opacity they were not comfortable with. So Ragel was dropped early on in favor of much simpler code that was a better fit for our needs.

Second, because of major production issues we suffered, we started using static code analysis, which can spot bugs like buffer overruns quite easily. However, including generated code in static code analysis is uncommon in the industry, because that code is not under the developers’ control and they can’t fix the potential defects. We suspect that approach will be re-evaluated after this incident.

Finally, about one-third of the Incapsula engineering staff are test automation engineers. Their job is to make sure Incapsula services evolve safely and that any new functionality we add does not break anything our customers rely on. Our HTML manipulation features are tested daily and we continually evaluate the need for additional tests.

How would Incapsula react to a similar issue?

Like Cloudflare, we have teams working 24×7 at some sites, and we also follow a follow-the-sun model. We can disable every piece of functionality in our code base, globally or locally, to quickly turn off features if needed.

Leave us a comment if you have a question and we’ll do our best to answer.