OCR is sometimes referred to as text recognition. An OCR program extracts and repurposes data from scanned documents, camera images and image-only PDFs. OCR software singles out the letters in an image, assembles them into words, and then assembles the words into sentences, which makes the original content accessible and editable. It also removes the need for redundant manual data entry.
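To make the letters-to-words-to-sentences step concrete, here is a minimal Python sketch. It assumes the character recognition itself has already happened and that each recognized letter comes with a position on the page. The `Glyph` type, the function names, and the pixel thresholds `line_tolerance` and `word_gap` are all hypothetical and chosen for illustration; real OCR engines use far more robust layout analysis.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """One recognized character and its position (hypothetical structure)."""
    char: str     # the recognized letter
    x: float      # left edge, in pixels
    y: float      # top edge, in pixels
    width: float  # width, in pixels


def group_into_lines(glyphs, line_tolerance=5.0):
    """Cluster glyphs into text lines by comparing their top edges.

    A glyph joins the current line if its top edge is within
    line_tolerance pixels of that line's first glyph (assumed threshold).
    """
    lines = []
    for glyph in sorted(glyphs, key=lambda g: (g.y, g.x)):
        if lines and abs(lines[-1][0].y - glyph.y) <= line_tolerance:
            lines[-1].append(glyph)
        else:
            lines.append([glyph])
    return lines


def line_to_text(line, word_gap=4.0):
    """Read one line left to right, adding a space at each wide gap."""
    ordered = sorted(line, key=lambda g: g.x)
    text = ordered[0].char
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur.x - (prev.x + prev.width)
        if gap > word_gap:  # a wide gap is treated as a word boundary
            text += " "
        text += cur.char
    return text


def glyphs_to_text(glyphs, line_tolerance=5.0, word_gap=4.0):
    """Assemble positioned letters into words and lines of text."""
    lines = group_into_lines(glyphs, line_tolerance)
    return "\n".join(line_to_text(line, word_gap) for line in lines)


if __name__ == "__main__":
    sample = [
        Glyph("H", 0, 10, 8), Glyph("i", 9, 11, 3),
        Glyph("y", 20, 10, 8), Glyph("o", 29, 12, 8),
        Glyph("o", 0, 30, 8), Glyph("k", 9, 30, 8),
    ]
    print(glyphs_to_text(sample))
```

With the sample input, the letters on the first row are split into two words at the wide gap, and the letters on the lower row form a second line. Fixed pixel thresholds are the simplest possible choice here; a practical system would likely scale them to the font size detected on the page.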

OCR systems use a combination of hardware and software to convert physical, printed documents into machine-readable text. Hardware, such as an optical scanner or specialized circuit board, copies or reads the text; software then typically handles the more advanced processing.

OCR software can take advantage of artificial intelligence (AI) to implement more advanced methods of intelligent character recognition (ICR), such as identifying languages or handwriting. Organizations often use OCR to turn printed legal or historical documents into PDF documents so that users can edit, format and search them as if they had been created with a word processor.