This article gives a general overview of OCR: what OCR programs can do, which OCR software options are the most popular, and where Iris OCR fits among them.
What is OCR software?
Optical Character Recognition (OCR) systems are used to build electronic libraries and archives by converting books and documents into digital form. A scanner produces an image of a page of text in a graphic format, and the OCR software then turns that image into editable text.
Converting printed documents into electronic form with OCR can streamline workflows and, in turn, improve business profitability.
How does it work?
Modern software and hardware systems can automate the input of large amounts of information into a computer, for example by pairing a network scanner with text recognition that runs in parallel on several computers.
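As a rough illustration of parallel recognition, the sketch below spreads a batch of scanned pages across a pool of workers. It is only a minimal sketch under stated assumptions: `recognize_page` is a hypothetical placeholder for a real OCR call, the file names are invented, and threads on a single machine stand in for the multiple computers mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_page(image_name):
    """Hypothetical stand-in for a real OCR call on one scanned page."""
    return f"text of {image_name}"


def recognize_batch(image_names, workers=4):
    """Recognize many pages concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(recognize_page, image_names))


# Invented file names, used only to drive the example.
scans = [f"page_{n:03d}.tif" for n in range(1, 6)]
results = recognize_batch(scans)
print(results[0])
```

Because `map` returns results in the order the pages were submitted, the recognized text can be reassembled into the original document without extra bookkeeping.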
Most OCR programs work with a raster image obtained through a fax modem, scanner, digital camera, or another device.
The process includes the following stages:
- first, the OCR program divides the page into blocks of text, based on left and right alignment and the presence of multiple columns;
- each detected block is then divided into lines;
- the lines are divided into continuous areas of the image, which usually correspond to individual letters;
- the recognition algorithm makes assumptions about which character each of these areas corresponds to;
- finally, the most likely character is chosen for each area, so that the page is reconstructed as text characters and, as a rule, in the appropriate format.
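The line and letter splitting stages can be sketched with a simple projection approach: a row or column that contains no ink is treated as a gap. This is an illustrative assumption, not the method of any particular OCR product. The page here is a toy grid of characters where `#` marks ink, and the function names are made up for the example.

```python
def runs(flags):
    """Return (start, end) index pairs for each consecutive run of True values."""
    spans, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(flags)))
    return spans


def split_lines(page):
    """Split a page (list of equal-length strings, '#' = ink) into text lines."""
    row_has_ink = ["#" in row for row in page]
    return [page[top:bottom] for top, bottom in runs(row_has_ink)]


def split_glyphs(line):
    """Split one text line into continuous column ranges, roughly one per letter."""
    width = len(line[0])
    col_has_ink = [any(row[x] == "#" for row in line) for x in range(width)]
    return [[row[left:right] for row in line] for left, right in runs(col_has_ink)]


# Toy page: two text lines separated by an empty row.
PAGE = [
    "..........",
    ".#..##..#.",
    ".#..#...#.",
    "..........",
    ".##..#....",
    ".##..#....",
    "..........",
]

lines = split_lines(PAGE)
glyph_counts = [len(split_glyphs(line)) for line in lines]
print(len(lines), glyph_counts)  # prints: 2 [3, 2]
```

Real pages are harder than this: letters can touch, and skewed scans have no clean empty rows, which is why the areas only usually correspond to individual letters.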
For clear images set in conventional fonts, OCR systems can reach a recognition accuracy of more than 99.9%.
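To make a figure such as 99.9% concrete, one common way to score OCR output is character accuracy: one minus the edit distance between the recognized text and the correct text, divided by the length of the correct text. The sketch below assumes that definition, which may differ from how a given vendor measures accuracy, and the sample strings are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def character_accuracy(truth, recognized):
    """Share of the reference text that was recognized correctly (0.0 to 1.0)."""
    if not truth:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1 - edit_distance(truth, recognized) / len(truth))


truth = "optical character recognition"
recognized = "optical charaoter recognition"  # one misread letter
accuracy = character_accuracy(truth, recognized)
print(f"{accuracy:.4f}")
```

Under this definition, 99.9% accuracy on a page of 2,000 characters still leaves roughly two characters to correct by hand.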
OCR service alternatives
- FineReader Online
The service uses a polished algorithm for recognizing printed characters. One of its main advantages is support for a large number of languages (37 in total). You must register to use the service. Because the project is partly promotional, its text recognition capabilities are significantly limited.
A British service that uses FineReader as its text recognition engine. You can choose a different output format for each new file. The text can also be delivered by email. The results can be packed in a ZIP archive, which reduces the time needed to download the resulting file.
- Iris OCR
Iris OCR (Readiris) is one of several OCR applications that you may find on the CD that came with your scanner; Readiris can also be bought separately. Readiris includes built-in document compression, which saves the time otherwise spent compressing converted files outside the application.
- NewOCR.com
The NewOCR.com project requires no registration or additional user fees. The service has a minimalist interface, and its settings are limited to the choice of language. You can recognize images in JPEG, PNG, GIF, BMP formats, as well as multi-page TIFF files.
- Boxoft Free OCR
This free program is easy to use and can analyze multi-column text with a high degree of accuracy. Although there are concerns about how well it extracts text from handwritten notes, it works extremely well with printed hard copy.