
iris-pero-ocr

This application is not supported by InterSystems Corporation. Please be notified that you use it at your own risk.
OCR demo for IRIS

What's new in this version

Initial Release

OCR DEMO

This is a demo of the OCR functionality of the pero-ocr library.

It is used from Python inside the IRIS application server.

Demo

This is an example of input data:

input

This is the result of the OCR:

In this example you have the following information:

  • The text is in the TextEquiv tag
  • The confidence is in the conf attribute of the TextEquiv tag
  • The coordinates of the text are in the Coords tag

  
    Pero OCR
    2022-12-13T08:47:12.207893+00:00
    2022-12-13T08:47:12.207893+00:00

    IN
    CONGRESS, JULY 4, 1776.
    Dhe unaniwons Declaratton of te Heten maiss States of TNmerica
    hen n lí loune z human venl, i kemu nematy k mpeopě toíohohhehttcal bandí uhích have connechdí tem vith ancthet, andíl
    o hi ſhwes f he eail, fie rehatal andequal flohon & ufch lhe laav  . kalut and   Aloil ped entilt ttem, a dant rafech to the ofunin o manknd tequies fhat thep
    imuiaa
    Qlver
    Vbalřew/
    17.
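The three pieces of information listed above can be read back from a result file with a few lines of Python. The following is a minimal sketch, assuming the output follows the PAGE XML layout, where each TextLine element holds a Coords element and a TextEquiv element whose text sits in a Unicode child. The sample document, its namespace, ids, coordinates and confidence value are made up for illustration, and `extract_lines` is a hypothetical helper, not part of the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the PAGE XML style; ids, coordinates and the
# confidence value are invented for illustration only.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="sample.jpg" imageWidth="100" imageHeight="50">
    <TextRegion id="r001">
      <Coords points="0,0 100,0 100,50 0,50"/>
      <TextLine id="r001-l001">
        <Coords points="10,10 90,10 90,30 10,30"/>
        <TextEquiv conf="0.875">
          <Unicode>CONGRESS, JULY 4, 1776.</Unicode>
        </TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def local_name(tag):
    """Return the tag name without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def find_child(elem, name):
    """Return the first direct child with the given local name, or None."""
    for child in elem:
        if local_name(child.tag) == name:
            return child
    return None


def extract_lines(xml_text):
    """Collect text, confidence and polygon points for every TextLine."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        equiv = find_child(elem, "TextEquiv")
        if equiv is None:
            continue
        # Assumption: the text is in a Unicode child; fall back to the
        # TextEquiv element's own text if that child is absent.
        unicode_el = find_child(equiv, "Unicode")
        source = unicode_el if unicode_el is not None else equiv
        text = (source.text or "").strip()
        conf = equiv.get("conf")
        coords = find_child(elem, "Coords")
        points = []
        if coords is not None:
            # Points are stored as "x1,y1 x2,y2 ..." (integers assumed).
            for pair in coords.get("points", "").split():
                x, y = pair.split(",")
                points.append((int(x), int(y)))
        lines.append({
            "text": text,
            "conf": float(conf) if conf is not None else None,
            "points": points,
        })
    return lines


for line in extract_lines(SAMPLE):
    print(line["conf"], line["text"], line["points"])
```

Matching on local tag names keeps the sketch independent of the exact namespace version used in the output file.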
        
      
    
  

Installation

git clone https://github.com/grongierisc/iris-pero-ocr

/!\ This demo requires the models to be installed /!\

To install the models, download them from the release page and extract them into the misc/pero-ocr-fix-computation-on-cpu folder of the project.

https://github.com/grongierisc/iris-pero-ocr/releases/download/v1.0.0/OCR_350000.pt.cpu

https://github.com/grongierisc/iris-pero-ocr/releases/download/v1.0.0/ParseNet_296000.pt.cpu

/!\ Both models are required /!\

This is the expected misc folder structure:

misc
├── config_file.ini
├── in
├── out
└── pero-ocr-fix-computation-on-cpu
    ├── OCR_350000.pt.cpu
    ├── ParseNet_296000.pt.cpu
    └── ocr_engine.json
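
Before starting the containers, the layout above can be checked with a short script. This is a minimal sketch that only tests whether each expected entry exists; `missing_entries` is a hypothetical helper that does not ship with the project, and the file names are taken from the tree above.

```python
from pathlib import Path

MODEL_DIR = "pero-ocr-fix-computation-on-cpu"

# Entries expected under misc/, relative to that folder.
EXPECTED = [
    "config_file.ini",
    "in",
    "out",
    f"{MODEL_DIR}/OCR_350000.pt.cpu",
    f"{MODEL_DIR}/ParseNet_296000.pt.cpu",
    f"{MODEL_DIR}/ocr_engine.json",
]


def missing_entries(misc_dir):
    """Return the expected entries that do not exist under misc_dir."""
    root = Path(misc_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_entries("misc")` from the project root returns an empty list when everything is in place, and otherwise the relative paths that still need to be created or downloaded.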

Then start the containers:

docker-compose up

Usage

Take any sample image from the samples folder and copy it into the misc/in folder; it will then be processed by the OCR.

The results will be in the misc/out folder.

There you will find the XML files with the results and the images with the detected text.

You can monitor the progress in the logs here http

Log in with the username _SYSTEM and the password SYS.

How it works in IRIS

The OCR is a Business Service that parses all the files in the misc/in folder and puts the results in a message queue.

The message queue is consumed by a Business Operation that writes the results to the misc/out folder.

Code is in the src/python/pero-ocr folder.
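
The real implementation lives in that folder. To picture the flow described above, here is a small standalone sketch that uses only the Python standard library rather than the IRIS interoperability API. The names `OcrMessage`, `fake_ocr`, `scan_inbox` and `write_results` are hypothetical, and `fake_ocr` merely stands in for the actual pero-ocr call.

```python
import queue
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class OcrMessage:
    """Message passed from the service side to the operation side."""
    name: str
    xml: str


def fake_ocr(image_path):
    """Placeholder for the real OCR call; returns a dummy XML string."""
    return f"<result source='{image_path.name}'/>"


def scan_inbox(in_dir, messages):
    """Service role: process every file in in_dir and enqueue the result."""
    for path in sorted(Path(in_dir).iterdir()):
        if path.is_file():
            messages.put(OcrMessage(path.stem, fake_ocr(path)))


def write_results(messages, out_dir):
    """Operation role: drain the queue and write one XML file per message."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    while not messages.empty():
        msg = messages.get()
        target = out_dir / f"{msg.name}.xml"
        target.write_text(msg.xml, encoding="utf-8")
        written.append(target.name)
    return written


def demo():
    """Run the whole flow once inside a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        in_dir = Path(tmp) / "in"
        out_dir = Path(tmp) / "out"
        in_dir.mkdir()
        (in_dir / "page1.jpg").write_bytes(b"")  # empty stand-in image
        messages = queue.Queue()
        scan_inbox(in_dir, messages)
        return write_results(messages, out_dir)


print(demo())
```

Keeping the scanning side and the writing side apart, with a queue between them, mirrors the split between the Business Service and the Business Operation.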

Version: 1.0.0 (06 Dec, 2022)
Category: Frameworks
Works with: InterSystems IRIS
First published: 06 Dec, 2022