Wednesday, January 19 2005 @ 08:24 PM EST Contributed by: glosser
Linux.com takes a look at the voice recognition capabilities of Linux desktop environments, with special focus on KDE.
Voice control is the next step in human interaction with computers. Voice recognition and its flip side, speech synthesis, can help you streamline your day-to-day work and better organize your Linux desktop.
To begin conversing with your Linux desktop, download the Sphinx-2 speech recognition engine and the Festival text-to-speech application. Although the CMU Sphinx Group provides several versions of Sphinx (Sphinx-2, -3, and -4), I use only Sphinx-2 because it is the fastest. It is not as accurate as Sphinx-3 or Sphinx-4, but it runs in real time and therefore works well with live applications.
Installing Sphinx-2 and Festival should be trivial; most distributions already have binaries, and even compiling from source should not be difficult. Debian users might find Festival a little tricky to install if they own an onboard sound card with AC97 codecs. (The symptom is speech that sounds twice as fast as it should, no matter what speed you set; unfortunately, I couldn't find any solution other than changing the sound card.)
Happily, the normal desktop user will not have to learn Festival's command-line interface, as great applications such as KDE Text-to-Speech System (KTTS) and Perlbox Voice fill this gap. KDE 3.4 will talk to you via Festival, Festival Lite (flite), or FreeTTS (another free speech synthesizer, written in Java), in a multitude of languages and accents. If you want to use KTTS with your present KDE desktop, sources as well as binaries for Debian, SUSE, and Mandrake are available at KTTS's home page.
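For readers curious about what such front ends hide, here is a minimal Python sketch of driving a synthesizer from a script. It is an illustration only, not how KTTS or Perlbox Voice work internally. The function names tts_command and speak are hypothetical; the sketch assumes that the festival binary reads the text to speak from standard input when started with --tts, and that flite accepts the text after its -t option.

```python
import shutil
import subprocess


def tts_command(text, engine="festival"):
    """Return (argv, stdin_text) for speaking `text` with the chosen engine.

    Hypothetical helper for illustration; the engine names are the binary names.
    """
    if engine == "festival":
        # Assumption: `festival --tts` speaks the text it reads from stdin.
        return ["festival", "--tts"], text
    if engine == "flite":
        # Assumption: `flite -t TEXT` speaks the text given on the command line.
        return ["flite", "-t", text], None
    raise ValueError(f"unsupported engine: {engine}")


def speak(text, engine="festival"):
    """Speak `text` aloud; return False if the engine binary is not on PATH."""
    argv, stdin_text = tts_command(text, engine)
    if shutil.which(argv[0]) is None:
        return False
    # check=True raises CalledProcessError if the synthesizer exits with an error.
    subprocess.run(argv, input=stdin_text, text=True, check=True)
    return True
```

Calling speak("Hello from the Linux desktop") would then speak the sentence if Festival is installed, and return False otherwise, which lets a script fall back to flite.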