CAPTCHA

What is CAPTCHA

CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. CAPTCHAs are tools you can use to differentiate between real users and automated users, such as bots. They present challenges that are difficult for computers to perform but relatively easy for humans, such as identifying stretched letters or numbers, or clicking in a specific area.

What are CAPTCHAs Used for

Any website that wants to restrict usage by bots can use CAPTCHAs. Specific uses include:

  • Maintaining poll accuracy—CAPTCHAs can prevent poll skewing by ensuring that each vote is entered by a human. Although this does not limit the overall number of votes that can be made, it makes the time required for each vote longer, discouraging multiple votes.
  • Limiting registration for services—services can use CAPTCHAs to prevent bots from spamming registration systems to create fake accounts. Restricting account creation prevents waste of a service’s resources and reduces opportunities for fraud.
  • Preventing ticket inflation—ticketing systems can use CAPTCHA to limit scalpers from purchasing large numbers of tickets for resale. It can also be used to prevent false registrations to free events.
  • Preventing false comments—CAPTCHAs can prevent bots from spamming message boards, contact forms, or review sites. The extra step required by a CAPTCHA can also play a role in reducing online harassment through inconvenience.

How Does CAPTCHA Work

CAPTCHAs work by presenting information that a user must interpret. Traditional CAPTCHAs showed distorted or overlapping letters and numbers, which the user then had to type into a form field. The distortion made it difficult for bots to read the text, and access was blocked until the submitted characters were verified.
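The issue-and-verify flow described above can be sketched as follows. This is a minimal illustration, not a production design: the names `SECRET_KEY`, `new_challenge`, and `verify` are hypothetical, the image rendering and distortion step is omitted, and the choice of an HMAC token (so the server does not have to store the challenge text) is an assumption made for the example.

```python
import hashlib
import hmac
import secrets
import string

# Hypothetical server-side secret; in practice it would come from configuration.
SECRET_KEY = secrets.token_bytes(32)
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length: int = 6) -> tuple[str, str]:
    """Return (text to render as a distorted image, token to embed in the form)."""
    text = "".join(secrets.choice(ALPHABET) for _ in range(length))
    token = hmac.new(SECRET_KEY, text.encode(), hashlib.sha256).hexdigest()
    return text, token


def verify(submitted: str, token: str) -> bool:
    """Check the user's answer against the token without storing the text."""
    candidate = submitted.strip().upper()
    expected = hmac.new(SECRET_KEY, candidate.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, token)
```

A real deployment would also need to bind the token to a session and an expiry time, because this sketch on its own does not stop a solved challenge from being replayed.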

This CAPTCHA type relies on a human’s ability to generalize and recognize novel patterns based on variable past experience. In contrast, simple bots can often only follow set patterns or enter random characters. This limitation makes it unlikely that such a bot will guess the correct combination.
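To put a rough number on how unlikely a random guess is, the sketch below computes the chance that a bot entering uniformly random characters matches the challenge. The function name `guess_probability` and the assumption that each attempt faces a fresh, independent challenge are illustrative choices, not part of any particular CAPTCHA product.

```python
def guess_probability(alphabet_size: int, length: int, attempts: int = 1) -> float:
    """Chance that at least one of `attempts` independent random guesses is correct."""
    # A single uniform guess matches one specific string out of alphabet_size**length.
    single = 1 / alphabet_size ** length
    # Complement of "every attempt misses".
    return 1 - (1 - single) ** attempts
```

For example, a 6-character challenge drawn from 26 uppercase letters and 10 digits has 36^6 = 2,176,782,336 possible combinations, so `guess_probability(36, 6)` is less than one in two billion per attempt.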

Since CAPTCHA was introduced, bots that use machine learning have been developed. These bots use algorithms trained in pattern recognition and are better able to solve traditional CAPTCHAs. As a result, newer CAPTCHA methods rely on more complex tests. For example, reCAPTCHA asks users to click a checkbox and analyzes their activity on the page to decide whether it resembles human behavior.

Drawbacks of Using CAPTCHA

The main benefit of CAPTCHA is that it blocks most simple bots, although bots that use machine learning can solve some challenges. However, CAPTCHA mechanisms can negatively affect the user experience on your website:

  • Disruptive and frustrating for users
  • May be difficult to understand or use for some audiences
  • Some CAPTCHA types do not support all browsers
  • Some CAPTCHA types are not accessible to users who view a website using screen readers or assistive devices

CAPTCHA Types: Examples

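Modern CAPTCHAs fall into three main categories: text-based, image-based, and audio. Other mechanisms, such as math problems, social media sign-in, and checkbox tests, are also in use.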

Text-based CAPTCHAs

Text-based CAPTCHAs are the original form of CAPTCHA. They can use known words or phrases, or random combinations of digits and letters. Some text-based CAPTCHAs also vary capitalization.

The CAPTCHA presents these characters in a distorted form that requires interpretation. Distortion can involve scaling, rotating, or warping characters. It can also involve overlaying characters with graphic elements such as color, background noise, lines, arcs, or dots. This distortion protects against bots with weak text recognition algorithms but can also make the text difficult for humans to read.

Text-based CAPTCHA patterns

Techniques for creating text-based CAPTCHAs include:

  • Gimpy—chooses an arbitrary number of words from an 850-word dictionary and provides those words in a distorted fashion.
  • EZ-Gimpy—is a variation of Gimpy that uses only one word.
  • Gimpy-r—selects random letters, then distorts and adds background noise to characters.
  • Simard’s HIP—selects random letters and numbers, then distorts characters with arcs and colors.
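The text-selection step behind the first three techniques in the list can be sketched as below. This covers only how the challenge text is chosen; the distortion and background noise are not shown. `WORDS` is a small stand-in list rather than the 850-word dictionary mentioned above, and all function names are hypothetical.

```python
import secrets
import string

# Stand-in word list for illustration; not the real Gimpy dictionary.
WORDS = ["apple", "river", "stone", "cloud", "tiger", "house", "green", "music"]


def gimpy_words(count: int) -> list[str]:
    """Gimpy-style: pick `count` words from the dictionary."""
    return [secrets.choice(WORDS) for _ in range(count)]


def ez_gimpy_word() -> str:
    """EZ-Gimpy-style: pick a single dictionary word."""
    return secrets.choice(WORDS)


def gimpy_r_letters(length: int = 4) -> str:
    """Gimpy-r-style: pick random letters instead of dictionary words."""
    return "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))
```

Dictionary-based challenges are easier for humans to read, but they also narrow the search space for a bot that knows the word list, which is one reason to prefer random letters.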

Image-based CAPTCHAs

Image-based CAPTCHAs were developed to replace text-based ones. These CAPTCHAs use recognizable graphical elements, such as photos of animals, shapes, or scenes. Typically, image-based CAPTCHAs require users to select images matching a theme or to identify images that don’t fit.

You can see an example of this type of CAPTCHA below. Note that it defines the theme using an image instead of text.

Example of image-based CAPTCHA

Image-based CAPTCHAs are typically easier for humans to interpret than text-based ones. However, they present distinct accessibility issues for visually impaired users. For bots, image-based CAPTCHAs are more difficult to interpret than text because they require both image recognition and semantic classification.
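On the server side, checking a "select all images matching a theme" challenge can be as simple as comparing the user's selection with the set of correctly labeled tiles. The sketch below assumes the server keeps a ground-truth label for each image; the `Tile` type and `check_selection` function are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    image_id: str
    label: str  # ground-truth label, kept server-side and never sent to the client


def check_selection(tiles: list[Tile], theme: str, selected_ids: set[str]) -> bool:
    """Pass only if the user selected exactly the tiles that match the theme."""
    expected = {tile.image_id for tile in tiles if tile.label == theme}
    return selected_ids == expected
```

Requiring an exact match means that selecting every tile does not pass, so a bot cannot succeed by clicking everything.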

Audio CAPTCHA

Audio CAPTCHAs were developed as an alternative that is accessible to visually impaired users. They are often used in combination with text-based or image-based CAPTCHAs. An audio CAPTCHA plays a recording of a series of letters or numbers, which the user then enters.

These CAPTCHAs rely on bots not being able to distinguish relevant characters from background noise. Like text-based CAPTCHAs, these tools can be difficult for humans to interpret as well as for bots.

Math or Word Problems

Some CAPTCHA mechanisms ask users to solve a simple mathematical problem such as “3+4” or “18-3”. The assumption is that a bot will find it difficult to identify the question and devise a response. Another variant is a word problem, which asks the user to type the missing word in a sentence or to complete a sequence of related terms. These types of problems are accessible to visually impaired users, but they may also be easier for bad bots to solve.
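A math challenge of this kind takes only a few lines to generate and check, which also hints at why it is easy for a bot to solve once the question is readable as plain text. The following is a minimal sketch; the function names and the operand range are arbitrary choices made for the example.

```python
import operator
import secrets

OPS = {"+": operator.add, "-": operator.sub}


def new_math_challenge() -> tuple[str, int]:
    """Return a question such as "18-3" together with its expected answer."""
    a = secrets.randbelow(20) + 1  # a is in 1..20
    b = secrets.randbelow(a) + 1   # b is in 1..a, so a - b is never negative
    symbol = secrets.choice(list(OPS))
    return f"{a}{symbol}{b}", OPS[symbol](a, b)


def check_answer(submitted: str, expected: int) -> bool:
    """Accept the answer only if it parses as the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False
```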

Social Media Sign In

A popular alternative to CAPTCHA is requiring users to sign in using a social profile such as Facebook, Google, or LinkedIn. The user’s details are filled in automatically using the single sign-on (SSO) functionality provided by the social media website.

This is still disruptive, but may actually be easier for the user to complete than other forms of CAPTCHA. An additional benefit is that it is a convenient registration mechanism.

No CAPTCHA reCAPTCHA

This type of CAPTCHA, known for its use by Google, is much easier for users than most other types. It provides a checkbox labeled “I am not a robot”, and selecting it is all the user needs to do. It works by tracking user movements and determining whether the click and other activity on the page resemble those of a human or a bot. If the test fails, reCAPTCHA falls back to a traditional image selection CAPTCHA, but in most cases the checkbox test is enough to validate the user.
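After the user passes the checkbox test, the website still has to confirm the result on its server. With Google's reCAPTCHA this is done by posting the token from the submitted form (the `g-recaptcha-response` field) together with the site's secret key to the `siteverify` endpoint, which replies with JSON containing a `success` flag. The sketch below uses only the standard library; the helper names are hypothetical, and the timeout value is an arbitrary choice.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def parse_verify_response(body: bytes) -> bool:
    """Return True only if the JSON reply explicitly reports success."""
    data = json.loads(body)
    return data.get("success") is True


def verify_token(secret: str, token: str, timeout: float = 5.0) -> bool:
    """POST the client's token to the verification endpoint and read the verdict."""
    payload = urllib.parse.urlencode({"secret": secret, "response": token}).encode()
    # Passing `data` makes urlopen send a POST request.
    with urllib.request.urlopen(VERIFY_URL, data=payload, timeout=timeout) as reply:
        return parse_verify_response(reply.read())
```

Keeping the parsing separate from the network call makes the decision logic easy to test without contacting the service.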

Imperva Bot Detection: CAPTCHA as a Last Line of Defense

Imperva provides a bot detection solution that is built for minimal business disruption. It offers several types of challenges which filter out bad bot traffic with minimal impact on human users—including device fingerprinting, cookie challenges and JavaScript challenges.

Imperva provides the option to deploy CAPTCHAs, but uses them as the final line of defense, only if all other bot identification mechanisms fail. This means CAPTCHAs are shown to a very small percentage of user traffic. Imperva also provides the option to manually enforce CAPTCHA for websites that need a stricter approach to advanced bot protection.

In addition to providing bad bot mitigation, Imperva provides multi-layered protection to make sure websites and applications are available, easily accessible and safe. The Imperva application security solution includes:

  • DDoS Protection—maintain uptime in all situations. Prevent any type of DDoS attack, of any size, from preventing access to your website and network infrastructure.
  • CDN—enhance website performance and reduce bandwidth costs with a CDN designed for developers. Cache static resources at the edge while accelerating APIs and dynamic websites.
  • Cloud WAF—permit legitimate traffic and prevent bad traffic. Safeguard your applications at the edge with an enterprise‑class cloud WAF.
  • Gateway WAF—keep applications and APIs inside your network safe with Imperva Gateway WAF.
  • RASP—keep your applications safe from within against known and zero‑day attacks. Fast and accurate protection with no signature or learning mode.