kraken is a turn-key automatic text recognition system optimized for historical and non-Latin script material.
It’s designed as a universal text recognizer for the humanities, with trainable layout analysis, reading order detection, and character recognition. The software supports a wide range of scripts and output formats, making it useful for digitizing complex historical documents.
This is free and open source software.
Key Features
- Automatic text recognition system.
- Optimized for historical and non-Latin script material.
- Fully trainable layout analysis.
- Reading order detection.
- Trainable character recognition.
- Right-to-left, bidirectional, and top-to-bottom script support.
- ALTO, PageXML, abbyyXML, and hOCR output.
- Word bounding boxes and character cuts.
- Multi-script recognition support.
- Public repository of model files.
- Variable recognition network architectures.
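The segmentation, recognition, and output-format features above are typically driven from kraken's command line. The following is a minimal sketch, not an official recipe: it only assembles a command and does not run it. The helper `build_kraken_command`, the file names, and the `FORMAT_FLAGS` mapping are hypothetical, and the flag names (`--alto`, `--pagexml`, `--hocr`, `segment -bl`, `ocr -m`) are assumptions that should be checked against `kraken --help` for the installed version.

```python
import shlex

# Assumed mapping from output format to kraken CLI flag.
# Verify these against `kraken --help` before relying on them.
FORMAT_FLAGS = {
    "alto": "--alto",
    "pagexml": "--pagexml",
    "hocr": "--hocr",
}


def build_kraken_command(image, output, model, fmt="alto"):
    """Assemble (but do not run) a kraken command line.

    The assumed pipeline is: baseline layout analysis (`segment -bl`)
    followed by recognition with a trained model (`ocr -m`).
    """
    if fmt not in FORMAT_FLAGS:
        raise ValueError(f"unsupported output format: {fmt}")
    return [
        "kraken", FORMAT_FLAGS[fmt],
        "-i", image, output,   # input image and output file
        "segment", "-bl",      # trainable baseline segmenter
        "ocr", "-m", model,    # recognition model file
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_kraken_command("page.tif", "page.xml", "model.mlmodel")
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` once kraken and a suitable model are installed.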
Website: github.com/mittagessen/kraken
Support:
Developer: mittagessen
License: Apache License 2.0
kraken is written in Python.
Related Software
| OCR System | Description |
|---|---|
| Tesseract | High quality neural net (LSTM) based OCR engine focused on line recognition |
| EasyOCR | OCR that reads natural scene text and dense text in documents |
| ocrs | Modern OCR engine |
| Surya | Multilingual document OCR toolkit with text recognition |
| ocropy | Open source document analysis and OCR system |
| Ocrad | OCR engine based on a feature extraction method |
| Cuneiform | OCR engine that converts scanned documents into editable form |
| GOCR | Reads images in many formats |

