Several groups developed the earliest forms of CAPTCHA technology in parallel during the late 1990s and early 2000s. Each worked to combat the widespread problem of hackers who use bots for nefarious activities on the internet. For example, computer scientists working for the search engine AltaVista wanted to stop bots from adding malicious web addresses to the company's link database.

Researchers at the IT company Sanctum filed the first patent for a CAPTCHA-style system in 1997. However, the term CAPTCHA was first introduced in 2003 by a group of computer science researchers at Carnegie Mellon University led by Luis von Ahn and Manuel Blum. The team was inspired to work on the technology by a Yahoo executive who delivered a talk about the company's issues with spambots signing up for millions of fake email accounts.

To solve Yahoo's problem, von Ahn and Blum created a computer program that:

1. Generated a random string of text.
2. Generated a distorted image of that text (called a "CAPTCHA code").
3. Presented the image to the user.
4. Asked the user to enter the text into a form field and then submit the entry by clicking a check box labeled "I am not a robot."

Because the optical character recognition (OCR) technology of the time struggled to decipher such distorted text, bots could not pass the CAPTCHA challenge. A user who entered the correct string of characters was assumed to be human and was permitted to complete the account registration or web form submission.
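The generate-and-check flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the Carnegie Mellon implementation: the function names, the alphabet, and the default length are assumptions, and the image distortion step is omitted because it would require an imaging library.

```python
import hmac
import secrets
import string

# Assumed character set for challenge text (hypothetical choice).
ALPHABET = string.ascii_uppercase + string.digits


def generate_challenge_text(length: int = 6) -> str:
    """Return a random string that would be rendered as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_response(expected: str, submitted: str) -> bool:
    """Compare the user's typed answer with the stored challenge text.

    Whitespace and letter case are ignored, and the comparison runs in
    constant time so that timing does not leak partial matches.
    """
    normalized = submitted.strip().upper()
    return hmac.compare_digest(expected.encode(), normalized.encode())
```

In a real deployment the expected text would be kept server-side (for example in the user's session) and never sent to the browser, since only the distorted image is meant to reach the client.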

Yahoo implemented Carnegie Mellon's technology, requiring all users to pass a CAPTCHA test before signing up for an email address. This significantly cut down on spambot activity, and other companies proceeded to adopt CAPTCHAs to protect their web forms. Over time, however, hackers used data from completed CAPTCHA challenges to develop algorithms capable of reliably passing CAPTCHA tests. This marked the beginning of an ongoing arms race between CAPTCHA developers and cybercriminals, one that has fueled the evolution of CAPTCHA functionality.

reCAPTCHA v1

Launched by von Ahn in 2007, reCAPTCHA v1 had a dual aim: to make the text-based CAPTCHA challenge more difficult for bots to crack, and to improve the accuracy of the OCR software used at the time to digitize printed texts.

reCAPTCHA achieved the first goal by increasing the distortion of text displayed to the user, and eventually adding lines through the text.

It achieved the second goal by replacing the single image of randomly generated distorted text with two distorted images of words scanned from actual texts, each of which had been processed by two different OCR programs. The first word, the control word, was one that both OCR programs had identified correctly. The second was a word that both programs had failed to identify. If the user typed the control word correctly, reCAPTCHA assumed the user was human and let them continue their task. It also assumed the user had read the second word correctly and used that response to verify future OCR results.
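The two-word logic can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than reCAPTCHA's actual code: the function names, the in-memory tally, and the idea of requiring a minimum number of matching answers before accepting a transcription are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Optional

# Hypothetical in-memory tally of human transcriptions, keyed by an
# identifier for each word the OCR programs could not read.
transcriptions = defaultdict(Counter)


def normalize(answer: str) -> str:
    """Ignore surrounding whitespace and letter case."""
    return answer.strip().lower()


def grade_challenge(control_word: str, control_answer: str,
                    unknown_id: str, unknown_answer: str) -> bool:
    """Pass the user if the control word matches.

    Only answers from users who pass are trusted, so the guess for the
    unknown word is recorded only in that case.
    """
    passed = normalize(control_answer) == normalize(control_word)
    if passed:
        transcriptions[unknown_id][normalize(unknown_answer)] += 1
    return passed


def agreed_transcription(unknown_id: str, min_votes: int = 3) -> Optional[str]:
    """Return the most common answer once enough users agree (assumed rule)."""
    tally = transcriptions.get(unknown_id)
    if not tally:
        return None
    word, count = tally.most_common(1)[0]
    return word if count >= min_votes else None
```

The design point is that the control word carries the whole security decision, while the unknown word rides along for free: the user cannot tell which is which, so an honest attempt at both is the easiest way to pass.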

In this way, reCAPTCHA strengthened anti-bot security while improving the accuracy of texts being digitized at the Internet Archive and the New York Times. Ironically, over time it also helped improve artificial intelligence and machine learning algorithms to the point that, by 2014, they could solve even the most distorted text CAPTCHAs 99.8% of the time.

In 2009, Google acquired reCAPTCHA and began using it to digitize texts for Google Books while offering it as a service to other organizations. However, as OCR technology progressed with the help of reCAPTCHA, so did the artificial intelligence programs that could effectively solve text-based reCAPTCHAs. In response, Google introduced image recognition reCAPTCHAs in 2012, which replaced distorted text with images taken from Google Street View. Users proved their humanity by identifying real-world objects like street lights and taxicabs. In addition to sidestepping the advanced OCR now deployed by bots, these image-based reCAPTCHAs were considered more convenient for mobile app users.

Google reCAPTCHA v2: No CAPTCHA reCAPTCHA

In 2014, Google released reCAPTCHA v2, also known as No CAPTCHA reCAPTCHA, which replaced text- and image-based challenges as the default test with a simple checkbox stating "I am not a robot." As users check the box, reCAPTCHA v2 analyzes their interactions with web pages, evaluating factors like typing speed, cookies, device history, and IP address to determine whether a user is likely to be human. The checkbox is also part of how the CAPTCHA works: No CAPTCHA reCAPTCHA tracks the user's mouse movements as they click the box. A human's movements tend to be more chaotic, whereas bots' movements are more precise. If No CAPTCHA reCAPTCHA suspects a user may be a bot, it falls back to an image-based CAPTCHA challenge.

reCAPTCHA v3

reCAPTCHA v3, which debuted in 2018, does away with the checkbox and expands upon the AI-driven risk analysis of No CAPTCHA reCAPTCHA. It integrates with a web page through a JavaScript API and runs in the background, scoring a user's behavior on a scale of 0.0 (likely a bot) to 1.0 (likely a human). Website owners can set automated actions to trigger at certain moments when a user's score suggests they may be a bot. For example, blog comments from low-scoring users may be sent to a moderation queue when they click "submit," or low-scoring users may be asked to complete a multifactor authentication process when they attempt to log into an account.
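A site owner's handling of these scores might look like the following Python sketch. It covers only the decision step after a score has been obtained; the threshold of 0.5, the event names, and the action labels are hypothetical values chosen for illustration, not settings defined by reCAPTCHA.

```python
# Assumed cutoff below which a user is treated as a possible bot.
LOW_SCORE_THRESHOLD = 0.5


def choose_action(score: float, event: str) -> str:
    """Map a 0.0 to 1.0 risk score and a site event to a follow-up action."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= LOW_SCORE_THRESHOLD:
        return "allow"
    # Low-scoring users are not blocked outright; the response depends
    # on what they were trying to do (hypothetical event names).
    if event == "comment_submit":
        return "moderation_queue"
    if event == "login":
        return "require_mfa"
    return "block"
```

For example, `choose_action(0.2, "login")` routes a suspicious login to multifactor authentication, while the same score on a comment submission sends the comment to moderation instead.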

AI-based authentication methods like reCAPTCHA v3 aim to take away the training data that hackers rely on. By removing interactive challenges from the CAPTCHA verification process, they prevent hackers from using previously solved challenges to train bots to crack new CAPTCHAs. Because of this, experts believe AI-based CAPTCHAs may become the norm, completely replacing challenge-based CAPTCHAs in the next five to ten years.