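This comparison of optical character recognition (OCR) software covers release information, platform support, programming language, recognized languages and fonts, and output formats: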
Name | Founded year | Latest stable version | Release year | License | Online | Windows | Mac OS X | Linux | BSD | Android | iOS | Programming language | SDK? | Languages | Fonts | Output Formats | Notes | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1989 | 16 | 2022 | | | | | | | | | C/C++ | | 192[1] | All fonts | DOC, DOCX, XLS, XLSX, PPTX, RTF, PDF, HTML, CSV, TXT, ODT, DjVu, EPUB, FB2[2] | ABBYY also supplies SDKs for embedded and mobile devices. Professional, Corporate and Site License Editions for Windows, Express Edition for Mac.[3] |
1989 | VBScript | Works with structured, semi-structured, and unstructured documents. | ||||||||||||||||
Asprise OCR SDK | 1998 | 15 | 2015 | | | | | | | | | Java, C#, VB.NET, C/C++/Delphi | | 20+[4] | | Plain text, searchable PDF, XML[5] | Java, C#, VB.NET, C/C++/Delphi SDKs for OCR and barcode recognition on Windows, Linux, Mac OS X and Unix.[6] |
| | 1996 | 1.1 | 2011 | | | | | | | | | C/C++ | | 28 | Any printed font | HTML, hOCR, native, RTF, TeX, TXT[7] | Enterprise-class system, can save text formatting and recognizes complicated tables of any structure |
E-aksharayan | 2010 | 14 | RTF, TXT, BRL | |||||||||||||||
2000 | 0.52[8] | 2018 | [9] | C | 20+ | |||||||||||||
2015 | Yes | Browser | Browser | Browser | Unknown | Unknown | Yes | 200+ | All fonts | text | Google blog post[10] [11] | |||||||
Office 2007 | 2007 | Uses OmniPage | ||||||||||||||||
2011 | 2007 | |||||||||||||||||
| | 2009-03 | 0.8.5 | 2022 | | | | | | | | | Python | | | | | Features a full user interface and has a command-line tool for automatic operations. Has its own segmentation algorithm but uses system-wide OCR engines like Tesseract or Ocrad. |
0.29[12] | 2024 | C++ | Latin alphabet | Command line | ||||||||||||||
| | 2007 | 1.3.3 | 2017 | | | | | | | | | Python | | All languages using Latin script (other languages can be trained) | Normal Latin script and Fraktur (other scripts can be trained) | TXT, hOCR,[13] PDF[14] | |
| | 1970s | 19.2 | 2015 | | | | | | | | | C/C++, C#[15] | | 125[16] | Machine and handprinted fonts | DOC/DOCX, XLS/XLSX, PPTX, RTF, PDF, PDF/A, Searchable PDF, HTML, Text, XML, ePUB, MP3 | Product of Nuance Communications |
2009 | C# | 28 | Any printed font | .NET OCR SDK based on Cognitive Technologies' CuneiForm recognition engine. Wraps Puma COM server and provides simplified API for .NET applications | ||||||||||||||
14 | Scan, capture and classify business documents such as invoices, forms and purchase orders integrated with business processes. | |||||||||||||||||
For working with localized interfaces, corresponding language support is required. | ||||||||||||||||||
| | 1991 | 10.5.8 | 2015 | | | | | | | | | | | | | | For musical scores |
| | 1985 | 5.3.3 | 2023 | | | | | | | | | C++, C | | 100+[17] | Any printed font | Text, ALTO, hOCR,[18] PDF, others with different user interfaces[19] or the API | Created by Hewlett-Packard; under further development by Google[20] |

A 2016 analysis of the accuracy and reliability of four OCR packages (Google Docs OCR, Tesseract, ABBYY FineReader, and Transym), using a dataset of 1227 images from 15 different categories, concluded that Google Docs OCR and ABBYY FineReader performed better than the others.[21]