Perform OCR using Google's Drive API v3
Google's Drive API can be used to perform OCR on images with text in many languages. google-drive-ocr is a Python package that lets you do this easily, right from the terminal.
Features
Perform OCR using Google's Drive API v3
Class GoogleOCRApplication() for use in projects
Highly configurable CLI
Run OCR on a single image file
Run OCR on multiple image files
Run OCR on all images in a directory
Use multiple workers (multiprocessing)
Work on a PDF document directly
Install
To install Google OCR (Drive API v3), run this command in your terminal:
pip install google-drive-ocr
Note: You must set up a Google application and download the client_secrets.json file before using google_drive_ocr.
Setup
Instructions
Usage
Using in a Project
Create a GoogleOCRApplication instance:
from google_drive_ocr import GoogleOCRApplication
app = GoogleOCRApplication('client_secret.json')
Perform OCR on a single image:
app.perform_ocr('image.png')
Perform OCR on multiple images:
app.perform_batch_ocr(['image_1.png', 'image_2.png', 'image_3.png'])
Perform OCR on multiple images using multiple workers (multiprocessing):
app.perform_batch_ocr(['image_1.png', 'image_3.png', 'image_2.png'], workers=2)
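In a project, it is often convenient to gather the image paths first and hand them to perform_batch_ocr in a single call. The following is a minimal sketch and not part of the package: collect_images and run_batch are hypothetical helper names, and app is assumed to be a GoogleOCRApplication instance created as shown earlier. Only the perform_batch_ocr call and its workers keyword come from the examples above.

```python
from pathlib import Path


def collect_images(directory, extension=".png"):
    """Return sorted paths of files in `directory` ending with `extension`."""
    folder = Path(directory)
    # Case-insensitive suffix match, so "scan.PNG" is picked up as well.
    return sorted(
        str(path)
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() == extension.lower()
    )


def run_batch(app, directory, extension=".png", workers=1):
    """Hypothetical wrapper: collect images, then call app.perform_batch_ocr."""
    images = collect_images(directory, extension)
    if not images:
        # Nothing to process; skip the call instead of sending an empty list.
        return []
    app.perform_batch_ocr(images, workers=workers)
    return images
```

Passing app in as an argument keeps the helper independent of how the application object was constructed, which also makes it easy to try out with a stand-in object before using real credentials.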
Using Command Line Interface
Typical usage with several options:
google-ocr --client-secret client_secret.json \
--upload-folder-id <google-drive-folder-id> \
--image-dir images/ --extension .jpg \
--workers 4 --no-keep
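If the CLI needs to be driven from a script, the same command can be assembled as an argument list. This is a sketch under assumptions: build_google_ocr_command and its keep parameter are hypothetical names introduced here, and only the flags shown in the command above are used.

```python
import shlex


def build_google_ocr_command(client_secret, upload_folder_id, image_dir,
                             extension=".jpg", workers=4, keep=False):
    """Build the argument list for the google-ocr call shown above."""
    command = [
        "google-ocr",
        "--client-secret", client_secret,
        "--upload-folder-id", upload_folder_id,
        "--image-dir", image_dir,
        "--extension", extension,
        "--workers", str(workers),
    ]
    if not keep:
        # Mirror the --no-keep flag from the example command.
        command.append("--no-keep")
    return command


command = build_google_ocr_command(
    "client_secret.json", "<google-drive-folder-id>", "images/"
)
# Print a shell-quoted version for inspection.
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Building a list rather than a single string avoids shell-quoting problems when paths contain spaces.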
Show help message with the full set of options:
google-ocr --help
Configuration
The default location for the configuration file is ~/.gdo.cfg.
Options written to this file do not have to be specified again on subsequent runs.
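The idea of saved options acting as defaults can be illustrated generically. This sketch does not read ~/.gdo.cfg or reproduce the package's actual file format; merge_options and the option names in the dictionaries are illustrative assumptions. It only shows the precedence one would usually expect, where values given on the command line override saved ones and saved ones fill in whatever is omitted.

```python
def merge_options(saved, cli):
    """Combine saved options with CLI options; explicit CLI values win."""
    merged = dict(saved)
    for name, value in cli.items():
        # None marks an option that was not given on the command line.
        if value is not None:
            merged[name] = value
    return merged


# Hypothetical option names, chosen only for illustration.
saved = {"client_secret": "client_secret.json", "workers": 1}
cli = {"client_secret": None, "workers": 2, "image": "image.png"}
options = merge_options(saved, cli)
print(options)
```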
Save configuration and exit:
google-ocr --client-secret client_secret.json --write-config ~/.gdo.cfg
Read configuration from a custom location (if it was written to a custom location):
google-ocr --config ~/.my_config_file ..
Performing OCR
Note: The examples below assume that the client-secret option is saved in the configuration file.
Single image file:
google-ocr -i image.png
Multiple image files:
google-ocr -b image_1.png image_2.png image_3.png
All image files from a directory with a specific extension:
google-ocr --image-dir images/ --extension .png
Multiple workers (multiprocessing):
google-ocr -b image_1.png image_2.png image_3.png --workers 2
PDF files:
google-ocr --pdf document.pdf --pages 1-3 5 7-10 13
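The --pages argument above mixes single pages and ranges. As a rough illustration of what such a specification selects, the sketch below expands it into individual page numbers. How google-ocr parses the argument internally is not shown in this post; parse_pages is a hypothetical helper, and treating ranges as inclusive is an assumption.

```python
def parse_pages(specs):
    """Expand page specs such as ["1-3", "5"] into a list of page numbers."""
    pages = []
    for spec in specs:
        if "-" in spec:
            # A range like "7-10"; both ends are assumed to be included.
            start_text, end_text = spec.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid page range: {spec!r}")
            pages.extend(range(start, end + 1))
        else:
            # A single page like "5".
            pages.append(int(spec))
    return pages


print(parse_pages(["1-3", "5", "7-10", "13"]))
# → [1, 2, 3, 5, 7, 8, 9, 10, 13]
```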