Excellent Utilities: Paperwork – personal document manager

In Operation

Paperwork is designed to be as simple to install and use as possible.

The interface has two panels: the left panel lists your documents sorted by date, and the right panel shows the pages of the currently selected document.

To scan a new document, simply load your document into your scanner’s feeder or place it on the scanner bed, and click scan. The scanning function is performed by SANE (Scanner Access Now Easy), together with a Python library written by Paperwork’s developer.

You can also import documents with the following formats: PDF, PDF folder, Image folder, BMP, GIF, JPEG, PNG, and TIFF. Multiple files can be imported at the same time.
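As a rough illustration of the file types listed above, here is a minimal Python sketch that sorts a batch of files by format before importing them. It is not Paperwork's own code: the extension mapping, the `group_importable` helper, and the file names are assumptions made for this example.

```python
from pathlib import Path

# Assumed mapping from file extension to the formats named in the article.
# This is illustrative only and is not taken from Paperwork's source.
IMPORTABLE = {
    ".pdf": "PDF",
    ".bmp": "BMP",
    ".gif": "GIF",
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
    ".png": "PNG",
    ".tif": "TIFF",
    ".tiff": "TIFF",
}


def group_importable(paths):
    """Group file paths by format; unknown types go under 'unsupported'."""
    groups = {}
    for p in paths:
        # Compare extensions case-insensitively so "invoice.PDF" still matches.
        fmt = IMPORTABLE.get(Path(p).suffix.lower(), "unsupported")
        groups.setdefault(fmt, []).append(str(p))
    return groups


if __name__ == "__main__":
    # Hypothetical file names, used only to show the grouping.
    print(group_importable(["invoice.PDF", "receipt.jpg", "notes.odt"]))
```

A check like this can be handy when pointing Paperwork at a folder that mixes scans with other files.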

Next, you’ll want to apply optical character recognition (OCR). Here’s an image of Paperwork after applying OCR.

[Image: Paperwork after applying OCR]

There are 3 icons near the top right corner of the scanned image. These icons let you copy the selected OCR text to the clipboard, edit the scan (more on that later), or delete the scanned page.

Paperwork uses Tesseract for OCR, so any comments about recognition accuracy should be directed to the Tesseract project. Tesseract is a capable OCR engine with good accuracy, although this depends on the quality of the image. Here’s our rating of Tesseract and other OCR engines.

[Image: OCR systems, best free software ratings]
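To judge Tesseract's accuracy on a scan independently of Paperwork, you can run the engine directly. The sketch below is one way to do that from Python using the `tesseract` command-line tool. It assumes Tesseract and the relevant language data are installed, and it is not how Paperwork itself invokes the engine. The helper names and the file name `page.png` are hypothetical.

```python
import shutil
import subprocess


def tesseract_command(image_path, lang="eng"):
    """Build a Tesseract CLI call that prints the recognised text to stdout."""
    # "stdout" as the output base tells Tesseract to print instead of writing a file.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run Tesseract on one image; return None if the binary is not installed."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract reports a failure
    )
    return result.stdout


if __name__ == "__main__":
    # "page.png" is a placeholder; substitute one of your own scans.
    print(tesseract_command("page.png"))
```

Running the same page through Tesseract at different scan resolutions is a quick way to see how much image quality affects the result.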


5 comments

  1. It seems like such a good idea, but on my Ryzen 2700X with GeForce GTX 1080 Ti, it is impossibly slow on documents of a few hundred pages. I can’t get the cut and paste to work either.

  2. Smooth GUI, but not intuitive and most of the time it is not clear what it is doing. I cannot tell when OCR was successful, little to no progress indication on most actions. It has a lot of potential, but also a lot of potential for improvement.
