
Excellent Utilities: Paperwork – personal document manager

Search

The key advantage of applying OCR to your scanned documents is the ability to search the text. Searching is made simple with Paperwork.

There’s advanced search functionality too. You can search by keyword(s), by label, and/or by date, and you can combine multiple search criteria, as shown in the image below. The two basic Boolean operators, AND and OR, are supported.

Paperwork-search

You can set a start date and an end date for the search, and also apply a NOT operator to any search criterion.
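
To make the combination of criteria concrete, here is a minimal sketch of how keyword, label, and date filters can be composed with AND, OR, and NOT. It is an illustration only, not Paperwork's code: the `Doc` class, the helper names, and the sample documents are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Doc:
    """Hypothetical stand-in for a scanned document (not Paperwork's data model)."""
    text: str
    day: date
    labels: set = field(default_factory=set)


# Each criterion is a predicate: a function that takes a Doc and returns a bool.
def keyword(word):
    return lambda d: word.lower() in d.text.lower()


def label(name):
    return lambda d: name in d.labels


def between(start, end):
    return lambda d: start <= d.day <= end


# Boolean combinators that build a new predicate from existing ones.
def AND(*preds):
    return lambda d: all(p(d) for p in preds)


def OR(*preds):
    return lambda d: any(p(d) for p in preds)


def NOT(pred):
    return lambda d: not pred(d)


docs = [
    Doc("Electricity invoice for March", date(2021, 3, 5), {"bills"}),
    Doc("Car insurance renewal", date(2021, 6, 1), {"insurance"}),
    Doc("Electricity contract terms", date(2020, 1, 15), {"contracts"}),
]

# Keyword "electricity", NOT labelled "contracts", dated within 2021.
query = AND(
    keyword("electricity"),
    NOT(label("contracts")),
    between(date(2021, 1, 1), date(2021, 12, 31)),
)
matches = [d.text for d in docs if query(d)]
print(matches)  # ['Electricity invoice for March']
```

Swapping `AND` for `OR` in the query widens the result to any document that satisfies at least one of the criteria.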

Labels

Paperwork-Labels

Labels offer a simple way to organize your documents.

Clicking the icon with four horizontal bars at the top right of any document brings up a dialog box.

The dialog lets you define the date of the document, set one or more user-definable labels, and specify any additional keywords.

The image above shows some example labels applied to scanned documents. The color-coded labels help you identify documents quickly.

The software automatically guesses the labels to apply to new documents. This functionality is courtesy of Simplebayes, a memory-based, optional-persistence naïve Bayesian text classifier.
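
For readers curious how a naïve Bayesian classifier can guess a label from a document's text, here is a minimal pure-Python sketch. It is not Simplebayes' implementation or API: the `TinyNaiveBayes` class, its method names, and the training sentences are hypothetical, and it assumes simple whitespace tokenization with add-one (Laplace) smoothing.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Hypothetical toy classifier; illustrates the idea, not Simplebayes itself."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training docs

    def train(self, label, text):
        """Record the words of one document known to carry `label`."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def guess(self, text):
        """Return the most probable label for `text`, or None if untrained."""
        vocab = {w for counts in self.word_counts.values() for w in counts}
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior: how common this label is among the training documents.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for w in text.lower().split():
                # Log likelihood of each word, with add-one smoothing so that
                # unseen words do not zero out the whole score.
                score += math.log((counts[w] + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = TinyNaiveBayes()
clf.train("bills", "electricity invoice amount due payment")
clf.train("bills", "water invoice payment due")
clf.train("insurance", "car insurance policy renewal premium")

print(clf.guess("invoice for gas payment"))  # bills
print(clf.guess("policy renewal"))           # insurance
```

The more documents you label by hand, the more word counts a classifier of this kind has to draw on, which is why automatic guesses tend to improve over time.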

The additional keywords option can be useful if character recognition doesn’t work.

Labels are effective, quick to apply, and work well. You can use the search functionality to filter documents by label: either matching a particular label or, with the NOT operator, excluding it.



Complete list of articles in this series:

Excellent Utilities
tmux: A terminal multiplexer that offers a massive boost to your workflow
lnav: Advanced log file viewer for the small-scale; great for troubleshooting
Paperwork: Designed to simplify the management of your paperwork
Abricotine: Markdown editor with inline preview functionality
mdless: Formatted and highlighted view of Markdown files
fkill: Kill processes quick and easy
Tusk: An unofficial Evernote client with bags of potential
Ulauncher: Sublime application launcher
McFly: Navigate through your bash shell history
LanguageTool: Style and grammar checker for 30+ languages
peco: Simple interactive filtering tool that's remarkably useful
Liquid Prompt: Adaptive prompt for Bash & Zsh
Ananicy: Shell daemon created to manage processes’ IO and CPU priorities
cheat.sh: Community driven unified cheat sheet
ripgrep: Recursively search directories for a regex pattern
exa: A turbo-charged alternative to the venerable ls command
OCRmyPDF: Add OCR text layer to scanned PDFs
Watson: Track the time spent on projects
fontpreview: Quickly search and preview fonts
fd: Wonderful alternative to the venerable find
scrcpy: Display and control Android devices
duf: Disk usage utility with more polished presentation than the classic df
tldr: Simplified and community-driven man pages
lsd: Like exa, lsd is a turbo-charged alternative to ls
broot: Next gen tree explorer and customizable launcher
Deskreen: Live streaming your desktop to a web browser

5 comments

  1. It seems like such a good idea, but on my Ryzen 2700X with GeForce GTX 1080 Ti, it is impossibly slow on documents of a few hundred pages. I can’t get the cut and paste to work either.

  2. Smooth GUI, but not intuitive and most of the time it is not clear what it is doing. I cannot tell when OCR was successful, little to no progress indication on most actions. It has a lot of potential, but also a lot of potential for improvement.
