A desktop search tool is a software application that searches the contents of computer files rather than the internet. Its purpose is to help users locate information on their own computer. Typically, this data includes emails, chat logs, documents, contact lists, graphics files, and multimedia files such as video and audio.

Searching a hard disk directly can be painfully slow, especially given the large storage capacities of modern disks. To perform considerably better, desktop search engines build and maintain an index database. Populating this database is a system-intensive activity, so desktop search engines usually carry out indexing when the computer is not being used.

One of the key benefits of this type of software is that it lets the user locate data stored on their hard disk almost instantaneously. These tools are designed to be fast, and they run as standalone programs rather than being integrated into another application, such as a file manager.

For this week, I’m looking at a marvellous desktop search tool. It’s called Recoll. Recoll uses the Xapian information retrieval library as its storage and retrieval engine.

Recoll

There’s a package available in the Raspberry Pi OS repositories, but it only gives you version 1.24.3. The current Recoll version is 1.27.2. As we’d be missing out on significant program development (more than two years’ worth), I recommend compiling the source code. Fortunately, the process is quite straightforward.
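
As a quick illustration of the version gap, here is a minimal sketch that compares the two version strings quoted above. The function name version_lt is hypothetical, and the sketch assumes GNU sort with its -V (version sort) option, which ships with coreutils.

```shell
#!/bin/sh
# version_lt A B: succeeds (exit 0) when version A sorts strictly before B.
# Assumes GNU `sort -V` (version sort) is available.
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

packaged="1.24.3"   # repository version quoted in the article
upstream="1.27.2"   # current release quoted in the article

if version_lt "$packaged" "$upstream"; then
    echo "Repository package ($packaged) is older than upstream ($upstream)"
fi
```

The version numbers are hard-coded here. On your own system, apt-cache policy recoll should report the version the repositories currently offer.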

First, let’s install a few necessary packages:

$ sudo apt install libchm-dev xapian-tools libxapian-dev libxslt1-dev

Next, download the file recoll-1.27.2.tar.gz from the project’s website. We can then proceed to uncompress and extract that file with the following tar command:

$ tar zxvf recoll-1.27.2.tar.gz
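
If you’d like to peek inside the archive before unpacking it, tar’s t mode lists the contents without extracting anything. The sketch below is one way to do that, with a hypothetical helper name (top_level_entries). It assumes the tarball sits in the current directory.

```shell
#!/bin/sh
# top_level_entries ARCHIVE: print the distinct top-level names stored in a
# gzip-compressed tarball. `tar t` only lists entries; nothing is extracted.
top_level_entries() {
    tar tzf "$1" | cut -d/ -f1 | sort -u
}

archive="recoll-1.27.2.tar.gz"   # assumed to be in the current directory
if [ -f "$archive" ]; then
    top_level_entries "$archive"
else
    echo "$archive not found in the current directory"
fi
```

A well-behaved source tarball normally unpacks into a single directory named after the release.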

We then need to change into the extracted directory and run the project’s configure script. This script prepares the build for your specific system: it checks that all of the dependencies for the rest of the build and install process are available, and finds out whatever it needs to know to use them.

Having run the configure script, we can proceed to compile the source code with the make command. Don’t forget the -j4 flag: it tells make to run four jobs in parallel, one for each of the RPI4’s four cores, which speeds up the compilation significantly.

$ cd recoll-1.27.2

$ ./configure

$ make -j4

$ sudo make install
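
The same steps can be wrapped in a small script that picks the job count from the machine rather than hard-coding -j4. This is only a sketch: job_count and build_recoll are hypothetical names, and it assumes nproc (from GNU coreutils) is available, falling back to a single job if it isn’t.

```shell
#!/bin/sh
# job_count: print the number of available CPU cores, or 1 if `nproc`
# is not installed.
job_count() {
    nproc 2>/dev/null || echo 1
}

# build_recoll DIR: configure, compile and install from an extracted source
# tree. Runs in a subshell so the caller's working directory is unchanged,
# and stops at the first step that fails.
build_recoll() {
    (
        cd "$1" &&
        ./configure &&
        make -j"$(job_count)" &&
        sudo make install
    )
}

# Nothing is built automatically; call it yourself, for example:
#   build_recoll recoll-1.27.2
echo "make would run with -j$(job_count)"
```

Chaining the steps with && means a failed configure won’t be followed by a pointless compile.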

We’re then ready to run the program. Bear in mind that indexing can take a long time to complete on the first run.

In my case, this is primarily because my home directory is jam-packed with software and files. That’s one downside of running the RPI4 from an external SSD with a large capacity.

Recoll indexing is normally incremental: documents will only be processed if they have been modified since the last run.
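
To get a feel for what incremental processing means, here is a rough sketch that uses a timestamp file and find’s -newer test. This is not how Recoll does it internally; it only illustrates the general idea. The function name and the stamp file are made up for the example, and the stamp file should live outside the directory being scanned.

```shell
#!/bin/sh
# Illustration only: NOT Recoll's actual mechanism.
# changed_since_last_run DIR STAMP: print the files under DIR modified after
# STAMP was last touched (every file on the first run), then refresh STAMP.
changed_since_last_run() {
    dir="$1"
    stamp="$2"
    if [ -f "$stamp" ]; then
        find "$dir" -type f -newer "$stamp"
    else
        find "$dir" -type f
    fi
    touch "$stamp"
}

# Example call (paths are placeholders):
#   changed_since_last_run "$HOME/Documents" "$HOME/.last-scan"
```

The first call lists everything. Later calls list only what changed in between, which is why repeat runs finish so much faster than the first one.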

Once the indexing is complete, we’re ready to rumble.

Recoll processes plain text, HTML, OpenDocument (Open/LibreOffice), email formats, and a few others internally.

Other file types (such as PDF, PostScript, MS Word, RTF) need external applications for preprocessing.
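
A quick way to see which external helpers are present is to look for them on the PATH. The helper names below are my assumption of the usual converters for these formats (pdftotext, pstotext, antiword, unrtf). Check the Recoll documentation for the definitive list. The function name missing_helpers is made up for this sketch.

```shell
#!/bin/sh
# missing_helpers NAME...: print each named program that is not on the PATH.
missing_helpers() {
    for prog in "$@"; do
        command -v "$prog" >/dev/null 2>&1 || echo "$prog"
    done
}

# Assumed helper programs: pdftotext (PDF), pstotext (PostScript),
# antiword (MS Word), unrtf (RTF).
missing=$(missing_helpers pdftotext pstotext antiword unrtf)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All listed helpers are available"
fi
```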

The image to the right shows the output of a very simple search. There are five different modes to help you locate what you’re looking for. With the Advanced Search mode, you can build complex queries.

Recoll works admirably on the RPI4. Memory usage is very light, around 62 MB of RAM, so you can leave it running all the time whichever model of the RPI4 you’re using. Another success.
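
If you want to check memory usage on your own machine, the sketch below reads the resident set size from /proc, so it is Linux-specific. The function name rss_mb is hypothetical. It also assumes the running process is called recoll and that pgrep (from procps) is installed.

```shell
#!/bin/sh
# rss_mb PID: print a process's resident memory in MB, rounded to the
# nearest integer. Reads the VmRSS field (in kB) from /proc (Linux only).
rss_mb() {
    kb=$(awk '/^VmRSS:/ { print $2 }' "/proc/$1/status" 2>/dev/null)
    [ -n "$kb" ] || return 1
    echo $(( ($kb + 512) / 1024 ))
}

# Look up a running process named "recoll" (assumed name), if there is one.
pid=$(pgrep -x recoll 2>/dev/null | head -n 1)
if [ -n "$pid" ]; then
    echo "recoll is using about $(rss_mb "$pid") MB"
else
    echo "recoll is not running"
fi
```

Resident set size is only one way of measuring memory, so the figure may differ a little from what other tools report.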

Just make sure you avoid the package, and compile the program yourself. It’s not hard (in this instance), and you get all the benefits of the latest version. What the RPI4 really needs is a community-driven repository, similar to the Arch User Repository. This would provide package descriptions that allow users to compile a package from source, sorting out issues specific to the RPI4. Given the huge volume of sales of the RPI4, I’m surprised we’re still so reliant on the official repositories stuffed full of mostly outdated software.

