Building a Document Scanning and Barcode Recognition Application with Qt Python

A few months ago, I published a cross-platform desktop barcode reading application built with Qt Python and Dynamsoft Barcode Reader. The supported input sources include real-time camera streams, image files, and screenshots. In this article, I will use a document scanner as the input source. The SDK used for document scanning is Dynamic Web TWAIN, a cross-platform JavaScript library that supports Windows, macOS, and Linux.

Pre-requisites

SDK
- Dynamic Web TWAIN
- Dynamsoft Barcode Reader
```
pip install dbr
```
- Qt
```
pip install PyQt5
```
License

Steps to Build a Cross-platform Document Scanning and Barcode Recognition Application

We are going to create a hybrid application that combines HTML5 and Python code.

1. Create Qt application with Qt widgets

Here are the required Qt Widgets:

- QWebEngineView: loads the HTML and JavaScript code and displays the document images scanned through the Dynamic Web TWAIN API.
- QPushButton: one button for acquiring images and another for decoding barcodes.
- QTextEdit: shows the barcode results.

First, we create an empty Qt window:

from PyQt5.QtWidgets import *
from PyQt5.QtCore import *
from PyQt5.QtGui import *
from PyQt5.QtWebEngineWidgets import QWebEngineView
from PyQt5.QtWebChannel import QWebChannel

app = QApplication([])
win = QWidget()
win.setWindowTitle('Dynamic Web TWAIN and Dynamsoft Barcode Reader')
win.show()
app.exec_()

Then we create a layout and add the widgets to it:

class WebView(QWebEngineView):
    def __init__(self):
        QWebEngineView.__init__(self)

layout = QVBoxLayout()
win.setLayout(layout)

view = WebView()
bt_scan = QPushButton('Scan Barcode Documents')
bt_read = QPushButton('Read Barcode')
text_area = QTextEdit()

layout.addWidget(view)
layout.addWidget(bt_scan)
layout.addWidget(bt_read)
layout.addWidget(text_area)

Shortcut keys are convenient in a desktop application. We use R to reload the web view and Q to quit the application:

def refresh_page():
    # Reload the page currently shown in the web view
    view.reload()

def keyPressEvent(event):
    if event.key() == Qt.Key.Key_Q:
        win.close()
    elif event.key() == Qt.Key.Key_R:
        refresh_page()

win.keyPressEvent = keyPressEvent

So far, the basic UI is done. We can run the application:

python app.py

2. Load HTML and JavaScript code in Qt application

Dynamic Web TWAIN SDK provides a variety of sample code located in the Dynamic Web TWAIN SDK <version>\Samples\ folder. We copy Samples\Scan\SourceList.html and Samples\Scan\Resources to the root of our Python project.

Rename SourceList.html to index.html and then add the following Python code to load index.html and all relevant resource files:

import os

class WebView(QWebEngineView):
    def __init__(self):
        QWebEngineView.__init__(self)

        # Load web page and resource files to QWebEngineView
        file_path = os.path.abspath(os.path.join(
            os.path.dirname(__file__), "index.html"))
        local_url = QUrl.fromLocalFile(file_path)
        self.load(local_url)

After re-running the Python application, the web page is displayed in the window.
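
If the view stays blank, a likely cause is that index.html was not found next to the script. The following is a minimal sketch, using only the standard library, of a check that could run before self.load(...). The helper name local_page_url is hypothetical and not part of Qt, and the temporary directory only stands in for the project root:

```python
from pathlib import Path
import tempfile

def local_page_url(base_dir, page_name="index.html"):
    """Return a file:// URL for page_name inside base_dir, or raise if it is missing."""
    page = (Path(base_dir) / page_name).resolve()
    if not page.is_file():
        raise FileNotFoundError(f"{page} not found; copy the sample page next to app.py")
    return page.as_uri()

# Demonstration with a throwaway directory standing in for the project root
with tempfile.TemporaryDirectory() as project_root:
    (Path(project_root) / "index.html").write_text("<html></html>")
    demo_url = local_page_url(project_root)
    print(demo_url)
```

In the application, the base directory would be os.path.dirname(__file__), as in the WebView class above.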

3. Establish the communication between Python and JavaScript

The communication between Python and JavaScript is the key part of the application.

index.html already contains the AcquireImage() function for scanning documents:

function AcquireImage() {
    if (DWObject) {
        var OnAcquireImageSuccess, OnAcquireImageFailure;
        OnAcquireImageSuccess = OnAcquireImageFailure = function () {
            DWObject.CloseSource();
        };

        DWObject.SelectSourceByIndex(document.getElementById("source").selectedIndex); //Use method SelectSourceByIndex to avoid the 'Select Source' dialog
        DWObject.OpenSource();
        DWObject.IfDisableSourceAfterAcquire = true;    // Scanner source will be disabled/closed automatically after the scan.
        DWObject.AcquireImage(OnAcquireImageSuccess, OnAcquireImageFailure);
    }
}

We can execute the JavaScript function by calling runJavaScript on the Python side:

def acquire_image():
    frame = view.page()
    frame.runJavaScript('AcquireImage();')

bt_scan.clicked.connect(acquire_image)
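
runJavaScript takes the script as a plain string, so any arguments have to be embedded in that string. As a rough sketch, a small helper can build the call and encode arguments with json.dumps, which produces literals that are also valid JavaScript. The names js_call and selectSource are hypothetical and appear here only for illustration:

```python
import json

def js_call(function_name, *args):
    """Build a JavaScript call expression such as 'AcquireImage();'."""
    # json.dumps quotes and escapes strings so they are safe inside the script
    encoded_args = ", ".join(json.dumps(arg) for arg in args)
    return f"{function_name}({encoded_args});"

print(js_call("AcquireImage"))                 # AcquireImage();
print(js_call("selectSource", 1, "flatbed"))   # selectSource(1, "flatbed");
```

The returned string could then be passed to view.page().runJavaScript(...) in place of a hand-written literal.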

Once the scanning process finishes, the current image is kept in memory. We can convert it to a base64 string:

function getCurrentImage() {
    if (DWObject) {
        DWObject.ConvertToBase64(
            [DWObject.CurrentImageIndexInBuffer],
            Dynamsoft.DWT.EnumDWT_ImageType.IT_JPG,
            function (result, indices, type) {
                // TODO:
            },
            function (errorCode, errorString) {
                console.log(errorString);
            }
        );
    }
}

How can we pass the base64 string from JavaScript to Python? The answer is QWebChannel. In index.html, we include qwebchannel.js, which can be found in the Qt\Examples\Qt-5.12.11\webchannel\shared folder if you have Qt Creator installed. Then we add the following code to send the base64 string:

var backend;
new QWebChannel(qt.webChannelTransport, function (channel) {
    backend = channel.objects.backend;
});

function getCurrentImage() {
    if (DWObject) {
        DWObject.ConvertToBase64(
            [DWObject.CurrentImageIndexInBuffer],
            Dynamsoft.DWT.EnumDWT_ImageType.IT_JPG,
            function (result, indices, type) {
                backend.onDataReady(result.getData(0, result.getLength()))
            },
            function (errorCode, errorString) {
                console.log(errorString);
            }
        );
    }
}

The onDataReady function needs to be implemented on the Python side:

class Backend(QObject):
    @pyqtSlot(str)
    def onDataReady(self, base64img):
        # Placeholder: the barcode decoding logic is added in the next step
        pass

class WebView(QWebEngineView):
    def __init__(self):
        QWebEngineView.__init__(self)

        # Load web page and resource files to QWebEngineView
        file_path = os.path.abspath(os.path.join(
            os.path.dirname(__file__), "index.html"))
        local_url = QUrl.fromLocalFile(file_path)
        self.load(local_url)
        self.backend = Backend(self)
        self.channel = QWebChannel(self.page())
        self.channel.registerObject('backend', self.backend)
        self.page().setWebChannel(self.channel)

def read_barcode():
    frame = view.page()
    frame.runJavaScript('getCurrentImage();')

bt_read.clicked.connect(read_barcode)
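
Before wiring in a decoder, it can help to confirm that the string arriving in onDataReady really is a base64-encoded JPEG. The sketch below uses only the standard library. The helper name decode_jpeg_base64 is hypothetical, and the data-URI handling is an assumption in case the page ever sends a prefixed string:

```python
import base64
import binascii

JPEG_MAGIC = b"\xff\xd8\xff"  # first three bytes of every JPEG file

def decode_jpeg_base64(base64img):
    """Decode a base64 string and confirm the bytes start like a JPEG file."""
    # Tolerate a data-URI prefix in case the page sends one (an assumption)
    if base64img.startswith("data:") and "," in base64img:
        base64img = base64img.split(",", 1)[1]
    try:
        raw = base64.b64decode(base64img, validate=True)
    except binascii.Error as err:
        raise ValueError(f"not valid base64: {err}") from err
    if not raw.startswith(JPEG_MAGIC):
        raise ValueError("decoded data does not look like a JPEG image")
    return raw

# A 12-byte stand-in for real image data
sample = base64.b64encode(JPEG_MAGIC + b"\xe0" + b"\x00" * 8).decode("ascii")
print(len(decode_jpeg_base64(sample)))                              # 12
print(len(decode_jpeg_base64("data:image/jpeg;base64," + sample)))  # 12
```

Raising ValueError lets the slot report a clear message instead of handing malformed data to the barcode reader.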

4. Decode barcodes from scanned documents

Finally, we can use Dynamsoft Barcode Reader to decode barcodes from the base64 string and display the results in the text area.

from dbr import *
import base64

# Initialize Dynamsoft Barcode Reader
reader = BarcodeReader()
# Apply for a trial license https://www.dynamsoft.com/customer/license/trialLicense?product=dbr
reader.init_license('LICENSE-KEY')

class Backend(QObject):
    @pyqtSlot(str)
    def onDataReady(self, base64img):
        imgdata = base64.b64decode(base64img)

        try:
            text_results = reader.decode_file_stream(bytearray(imgdata), '')
            if text_results is not None:
                out = ''
                for text_result in text_results:
                    out += "Barcode Format : "
                    out += text_result.barcode_format_string + '\n'
                    out += "Barcode Text : "
                    out += text_result.barcode_text + '\n'
                    out += "-------------------------------------------------" + '\n'

                text_area.setText(out)
        except BarcodeReaderError as bre:
            print(bre)
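
With the code above, the text area keeps showing the previous results whenever the None check fails. One possible refinement, sketched here under assumptions, moves the formatting into a pure function that also returns a notice when nothing was decoded. FakeResult is a stand-in that models only the two attributes used above, and format_results is a hypothetical helper name:

```python
from collections import namedtuple

# Stand-in for the reader's result objects; not part of the dbr package
FakeResult = namedtuple("FakeResult", ["barcode_format_string", "barcode_text"])

def format_results(text_results):
    """Return display text for a list of results, or a notice when there are none."""
    if not text_results:
        return "No barcode found\n"
    separator = "-" * 40
    blocks = []
    for result in text_results:
        blocks.append(
            f"Barcode Format : {result.barcode_format_string}\n"
            f"Barcode Text : {result.barcode_text}\n"
            f"{separator}\n"
        )
    return "".join(blocks)

demo = format_results([FakeResult("QR_CODE", "https://www.dynamsoft.com")])
print(demo, end="")
```

Inside onDataReady, the call would become text_area.setText(format_results(text_results)), which also makes the formatting easy to test without a scanner attached.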

Source Code

https://github.com/yushulx/web-twain-document-scan-management/tree/main/examples/qt
