Configuring OCR usage
Use optical character recognition (OCR) to obtain text from images or PDF files. This topic introduces the supported OCR providers and explains how to configure them.
OCR Providers
Google Tesseract
Google Tesseract is a free OCR API natively integrated with IBM RPA as an option to recognize scanned PDF files, images, and documents.
You don't need additional configuration to use Google Tesseract. To use this provider, select Google as the OCR provider on commands that support OCR.
ABBYY
ABBYY® FineReader® is a proprietary OCR provider integrated with IBM RPA as an option to recognize PDF files, images, and scanned documents.
You don't need additional configuration to use ABBYY® FineReader®. To use this provider, select ABBYY as the OCR provider on commands that support OCR.
Google Cloud Vision
Google® Cloud Vision® is a proprietary OCR provider that you can integrate with IBM RPA to recognize text on scanned PDF files, images, and documents.
Requirement: To use Google Cloud Vision as an OCR provider on IBM RPA, you must have a Google account with Cloud Vision set up. Refer to Google's documentation to learn how to set up Cloud Vision.
To generate the API credentials JSON file, follow the step-by-step instructions on the official Cloud Vision page.
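Before pointing IBM RPA at the credentials file, it can help to confirm that the file is intact. The following is a minimal sketch in Python, not part of IBM RPA, and it assumes the file is a standard Google service account key. The helper name `missing_credential_fields` and the list of required fields are illustrative assumptions; adjust them if your credentials file uses a different layout.

```python
import json
from pathlib import Path

# Assumption: the credentials file is a Google service account key,
# which normally carries these top-level fields.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def missing_credential_fields(path):
    """Return the required fields that are absent or empty in the JSON file at `path`.

    Raises ValueError if the file is not valid JSON or is not a JSON object.
    (json.JSONDecodeError is a subclass of ValueError.)
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("credentials file must contain a JSON object")
    # A field counts as missing when it is absent or has an empty value.
    return [field for field in REQUIRED_FIELDS if not data.get(field)]
```

An empty list means that every assumed field is present. A non-empty list names the fields to check before you configure the path in IBM RPA. The check does not contact Google, so it cannot tell whether the key is still active.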
To use Google Cloud Vision on IBM RPA Studio's tools:
- Log in to IBM RPA Studio.
- On the home screen, access the Tools tab.
- In the Options section, click the Options button.
- Click the Credentials menu.
- Click the Google Cloud Vision option.
- In the Credential field, select the path to the API credentials JSON file.
- Click the Save button.
To use Google Cloud Vision during automation on commands that support OCR:
- On OCR provider, select Google Cloud Vision.
- On API parameters, set the path to the API credentials JSON file.