Top : Software : Utilities : Text Utilities

  • grep
    searches one or more input files for lines containing a match to a specified pattern. By default, grep prints the matching lines
  • pdfgrep
    pdfgrep is a command-line utility to search text in PDF files. pdfgrep tries to be compatible with GNU Grep, where it makes sense. Many of your favorite grep options are supported (such as -r, -i, -n or -c).
  • sift
    sift is a fast and powerful alternative to grep. It comes as a single executable with no external dependencies. The additional features include gitignore support, conditions (e.g. match A only when preceded by B within X lines), full multi-core support and multiline matching.
  • Stanford CoreNLP
    Stanford CoreNLP is an extensible annotation-based NLP pipeline that provides core natural language analysis. This open source toolkit is quite widely used, both in the research NLP community and also among commercial and government users of open source NLP technology.
  • acoc
    Arbitrary Command Output Colourer: a regular expression based colour formatter for programs that display output on the command-line. It works as a wrapper around the target program, executing it and capturing the stdout stream
  • Align
    Align is a general-purpose text filter tool that helps vertically align columns in string-separated tables of input text.
  • Ansible-cmdb
    Ansible-cmdb takes the output of Ansible's fact gathering and converts it into a static HTML overview page containing system configuration information.
  • Ansifilter
    Ansifilter handles text files containing ANSI terminal escape codes.
  • antiword
    a Microsoft® Word reader for Linux and RISC OS
  • apachegrep
    apachegrep is a Perl program (which does not require any non-standard Perl modules) to help webmasters (or anyone, really) go through their Apache common/combined logs and try to pull out various bits of information.
  • apropos2
    Apropos2 is a replacement for the GNU apropos command that winnows down its responses when given more than one search term. The apropos in man repeatedly prints everything that matches each search term. (The apparent similarity between apropos and whatis led to them being the same script.) Apropos2 is equivalent to "apropos word1 | grep -i word2 | ... | grep -i wordn", but with better error messages.
  • ASCII art printer
    a Perl script that rewrites its input, 2.5 times larger, as ASCII art
  • ascii2pdf
    ascii2pdf translates simple text documents to PDF format. It has options for changing font, font size, and landscape vs. portrait mode.
  • asmview
    AsmView is a small file viewer. It runs in a terminal and accepts input from the command line, a pipe, or interactive prompts. It scrolls in all directions and meets the SFF guidelines.
  • Aspell
    Aspell is a free and Open Source spell checker designed to eventually replace Ispell. It can either be used as a library or as an independent spell checker. Its main feature is that it does a much better job of coming up with possible suggestions than just about any other spell checker out there for the English language, including Ispell and Microsoft Word.
  • Atox
    a fully customizable Python library and command-line tool for converting plain text into XML
  • AutoConvert
    AutoConvert consists of three parts: A converter from Chinese HZ encoding to GB encoding, an auto-converter from HZ/GB/BIG5 encoding to GB/BIG5 encoding, and a working procmail example to auto-convert incoming mail.
  • Autodocbook
    Autodocbook is a simple Perl script that runs through C code looking for specially formatted comment blocks and turns them into DocBook SGML files, which can then be used to create man pages, info pages and HTML documentation.
  • awk
    interprets a special-purpose programming language that makes it possible to handle simple data-reformatting jobs with just a few lines of code
  • Base-64
    Base64 is a command line tool that implements an RFC 3548-compliant base 64 encoder and decoder. When encoding it can wrap encoded lines to a specified column, and when decoding can optionally ignore non-alphabet characters.
  • bbe
    bbe is a sed-like editor for binary files. bbe performs basic byte operations on blocks of an input stream. It is a command-line tool developed in a GNU/Linux environment. Features include: a non-interactive command-line interface; reading the input stream in arbitrary blocks rather than in lines as sed does; and input blocks that can be defined by offset and length, by length alone, or by start and stop strings.
  • Bfr
    Bfr is a general-purpose command-line pipe buffer. It buffers data from stdin and sends it to stdout, adjusting to best fit the pace stdout can handle. It can solve problems on either end of a pipe.
  • BibTeX2HTML
    BibTeX2HTML is a set of LaTeX and Perl scripts that automatically generate web pages from a BibTeX database.
  • bkmrkconv
    converts a Netscape "bookmarks.html" file into a series of pages of links which are more easily browsable
  • booksync
    a tool used for synchronizing different bookmark files and types. booksync preserves current bookmark structures, sorts new bookmarks correctly into existing directories, and creates new directories if necessary
  • Boustrophedon Text Reader
    displays text files in Boustrophedon--a writing style created by the ancient Greeks that alternates direction every line
  • boxes
    can draw all kinds of boxes around its input text, ranging from a C comment box to complex ASCII art
  • catdoc
    catdoc extracts the content of Microsoft Word (.doc) files as readable ASCII text and prints it to stdout. Optionally catdoc can output TeX escape sequences instead of certain characters.
  • catdvi
    translates TeX Device Independent (DVI) files into readable plain text. The program aims to be a superior replacement for the non-free dvi2tty program
  • ccostring
    ccostring is a text utility to compare lines from different files.
  • Cedilla
    Cedilla is a simple text printer that uses Unicode internally. When a suitable glyph is not readily available, Cedilla makes heroic efforts to find or create one.
  • Chaperon
    a lexical scanner, a parser generator, a parser, a tree builder and an XML generator all in one package. Chaperon can parse structured text using a grammar and then generate an XML representation of the parsed text, so it is easy to use Chaperon as a converter for text files
  • chcase
    chcase is a Perl script that will rename files to either all upper or all lower case letters.
  • chcsv
    chcsv is a text conversion tool that exports Oracle data to CSV files.
  • ChkTeX
    ChkTeX is a LaTeX semantic checker. It is _not_ a replacement for the built-in checker in LaTeX; however it catches some typographic errors LaTeX overlooks. In other words, it is Lint for LaTeX. Filters are also provided for checking the LaTeX parts of CWEB documents.
  • ChmSee
    ChmSee is a Compiled HTML Help (CHM) file viewer written in GTK.
  • cledit
    cledit is a change log editor that uses the default editor. It converts text change logs to colorized HTML and checks spelling using aspell.
  • Clipboard Modifier
    Clipboard Modifier is a flexible system to modify the text in a clipboard in a variety of ways. It can copy a spreadsheet and change the clipboard so that it can be pasted into a wiki, with vertical bars (|) instead of tabs. It can modify multi-line clipboard text so that it can be pasted into Java or Python as strings. A URL in the clipboard pointing to Amazon can be modified so that it has your Associate ID in it. It can pipe the clipboard to a shell command and retrieve the output from it. A clipboard can be forced to text, removing things like formatting. A complicated URL can be converted into its Python equivalent, using urlencode.
  • code2html
    converts a program source code to syntax highlighted HTML. It may be called as a CGI script. It can also handle include commands in HTML files
  • Colortail
    Colortail is a 'tail' program that can color highlight the output.
  • Cook
    Cook is a tool for constructing files, and maintaining referential integrity between files. It is given a set of files to create, and recipes of how to create and maintain them. In any non-trivial program there will be prerequisites, such as include files, to performing the actions necessary to create any file. Cook provides a mechanism to define these.
  • cpp2latex
    cpp2latex converts C++ into LaTeX either for including into existing LaTeX documents or as standalone documents.
  • CSpotRun
    CSpotRun is a free reader for documents in the popular Pilot DOC format.
  • cz2cz tools
    cz2cz is software for converting text files between various character encodings (ISO-8859-2, Win-1250, UTF-8, ...). Its main feature is autodetection of the charset used in a text file. Available only in Czech (and useful for Czech users only).
  • DadaDodo
    analyses texts for word probabilities, and then generates random sentences based on that. Sometimes these sentences are nonsense; but sometimes they cut right through to the heart of the matter, and reveal hidden meanings
  • dbacl
    The dbacl project consists of a set of lightweight UNIX/POSIX utilities which can be used, either directly or in shell scripts, to classify text documents automatically, according to Bayesian statistical principles. dbacl is also the name of the core utility.
  • desift
    a data-driven, template-based text filter implemented in Perl, ideal for generating scripts, content and reports. It provides powerful formatting capabilities for delimited text/input streams, especially when combined with other shell programs
  • diff2html
    generates a valid HTML page to display the output of the well-known diff(1) utility. Using Cascading Style Sheets, the user can fully personalize the appearance of the web page (you might find the default styles too colorful). diff2html is written in Python and is licensed under the GNU GPL
  • Diogenes
    Diogenes is a free, non-commercial, open-source tool for searching the TLG and PHI databases, written in Perl.
  • doc2xml
    doc2xml converts Microsoft Word files to XML.
  • DocFrac
    a document converter that can convert between Rich Text Format (rtf), HyperText Markup Language (html) and plain text (txt). Supports: converting to/from rtf/text/html; colour; font attributes; and most European languages
  • doclifter
    doclifter translates documents written in troff macros to DocBook. Lifting documents from presentation level to semantic level is hard, and a really good job requires human polishing. This tool aims to do everything that can be mechanized, and to preserve any troff-level information that might have structural implications in SGML/XML comments.
  • Docvert
    Docvert is Web service software that takes multiple word processor files (typically .doc) and converts them to Oasis OpenDocument v1.0 format, and then optionally to any XML/HTML format. The results are returned in a .zip file.
  • dtd2xs
    dtd2xs translates a Document Type Definition (DTD) into an XML Schema (REC-xmlschema-1-20010502). The translator can map meaningful DTD entities onto XML Schema constructs
  • duff
    Duff is a Unix command-line utility for quickly finding duplicates in a given set of files. Duff is written in C and should compile on most modern Unices.
  • dvipdfm
    a DVI to PDF translator. Its features include TeX specials that approximate the functionality of the PostScript pdfmarks used by Adobe Acrobat Distiller, the ability to include PDF files and JPEG files as embedded images, support for both Type1 and PK fonts, support for arbitrary linear graphics transformations, a color stack accessible via specials, partial font embedding and stream compression for reduced output file size, native, portable graphics via TPIC specials, and balanced page and destination trees for improved reader access on very large document files
  • dwdiff
    dwdiff is a front-end for the diff program that operates at the word level instead of the line level. It is different from wdiff in that it allows the user to specify what should be considered whitespace, and in that it takes an optional list of characters that should be considered delimiters.
  • easy-ebook-viewer
    easy-ebook-viewer is a modern GTK Python app to easily read ePub files.
  • eev.el
    lets you place hyperlinks and shell/tcl/TeX/etc code inside plain text files
  • Elex
    Elex generates a scanner (lexer) from a specification oriented around regular expressions.
  • eolfix
    eolfix is a command line utility for querying and correcting end-of-line (EOL) characters in ASCII text files. It can convert line endings between DOS, Unix, and Mac formats and handles "mixed" and binary formats. It converts only as needed and features a report-only mode.
  • epsmerge
    epsmerge is a program for merging EPS (Encapsulated Postscript) files.
  • epssplit
    epssplit is a Perl program for splitting an EPS (encapsulated postscript) file into several smaller EPS files.
  • EtText
    a simple plain-text format which allows conversion to and from HTML. Instead of editing HTML directly, it provides an easy-to-edit, easy-to-read and intuitive way to write HTML
  • euc2html
    euc2html is a simple application that reads in EUC encoded double-byte characters and translates them to HTML 4.0 Unicode encoded entities.
  • EVP dirdiff
    EVP dirdiff recursively compares two directory trees using message digest (hash), e.g. MD5.
  • Ezvu
    converts the given set of C files into HTML files with all the user defined function calls converted to hyper links so that the user can click the link to view that function definition
  • FavNuts
    converts IE favorite files to a Netscape bookmark file
  • fccu-docprop
    fccu-docprop is a command line utility that tries to print properties of MS OLE files. MS OLE files are mainly MS Office DOC and XLS files. This software uses the libgsf library to get this metadata. It can be used for forensic purposes.
  • fileblasphemy
    fileblasphemy digs tags out of a filename and allows you to use those tags within the execution of a program.
  • fixDos
    crlf converts files from/to DOS and UNIX text file formats, tolower converts filename(s) case to lower/upper case, untab converts TABs in files to spaces, and time_t returns values for time handling
  • fk_html
    fk_html is a simple Perl script to convert HTML mail to plain text. It converts your mail while you're downloading it, running as a fake POP3 server that redirects your mail client connections.
  • fsplit and fmerge
    fsplit and fmerge are utilities to split a large binary file into smaller pieces and merge them together on another machine.
  • gClipColl
    gClipColl provides a drag-and-drop repository for text snippets. Any text dragged to it is stored in a list for dragging to another application.
  • gelapas
    a tool to extract information from files. The default settings (and the shorthand options) are useful to extract information such as the title or meta tags from HTML files but it could also be used for other kinds of documents
  • Generic Colouriser
    Generic Colouriser acts as a filter, i.e. taking standard input, colourising it and writing to standard output.
  • genparse
    a command-line parser generator. Creates a C or C++ file containing command line parsing routines for your program based on a simple configuration file
  • GnoCHM
    a CHM file viewer for Gnome2. It uses PyCHM, a set of Python wrappers around the C library libchm
  • GNU Talk Filters
    The GNU Talk Filters are filter programs that convert ordinary English text into text that mimics a stereotyped or otherwise humorous dialect. These filters have been in the public domain for many years, but now for the first time they are provided as a single integrated package. The filters include austro, b1ff, brooklyn, chef, cockney, drawl, dubya, fudd, funetak, jethro, jive, kraut, pansy, pirate, postmodern, redneck, valspeak, and warez. Each program reads from standard input and writes to standard output. This version of the package also provides the filters as a C library, so they can be easily embedded in other programs.
  • Gnutran
    Gnutran is a simple, Emacs-based front-end to a number of machine translation engines available on the web.
  • gocr
    optical character recognition software. It converts PGM files into ASC files
  • gozer
    gozer is a command-line text rendering utility for creating images from arbitrary text in antialiased TrueType fonts using optional font styles, word wrapping and layout control.
  • Grutatxt
    a plain text to HTML converter. It successfully converts subtle text markup for lists, bold, italics, tables and headings to the corresponding HTML tags without having to write unreadable source text files
  • GutenMark
    a tool for automatically creating high-quality HTML markup from Project Gutenberg etexts. In combination with freely-available HTML-to-Postscript conversion tools, GutenMark can convert Project Gutenberg etexts into publication-quality Postscript, for print-on-demand applications
  • hd2u
    hd2u is a filter used to convert plain texts from DOS (CR/LF) format to UNIX format (LF) and vice versa.
  • help2info
    help2info is a bash script that generates a simple info page from the output of the --help argument of the specified program.
  • help2man
    a Perl script that converts the --help and --version output from a program into a simple manual page
  • hierarchy
    Tools to manipulate hierarchical text outlines (i.e. text trees), including a generator and a spiffy pager.
  • Highlight
    Highlight is a universal source code converter for Linux and Windows, which transforms code to HTML, XHTML, RTF, LaTeX or TeX files with syntax highlighting.
  • histring
    highlights strings using ANSI terminal escape codes
  • HistView
    HistView takes an ASCII changelog as input and outputs a formatted HTML page, optionally containing links to download releases.
  • Html Code Convert
    Html Code Convert helps speed up the conversion of HTML code into different formats including JavaScript, JavaServer Pages, Microsoft ASP, PHP, Perl, and the UNIX Shell. It is particularly useful in CGI scripting.
  • HTML to LaTeX
    HTML to LaTeX converts a web site to a LaTeX document which can be used to generate postscript, pdf, and other formats.
  • HTML2DB
    HTML2DB is a tool to assist with the task of converting well-behaved HTML into DocBook SGML.
  • html2fo
    a converter from HTML to XSL-FO. The HTML code could be written with StarOffice or other WYSIWYM editors and need not be 100% valid HTML code
  • html2latex
    a small Perl script designed to convert a properly formatted HTML file into a properly formatted LaTeX file
  • Html2perl
    a simple console based utility for converting HTML text streams (or any ASCII based text stream for that matter) into a series of perl print statements for inclusion in a Perl script
  • html2text
    html2text converts HTML documents into plain text. html2text reads HTML documents from standard input or a (local or remote) URI, and formats them into a stream of plain text characters that is written to standard output or into an output-file. The program is able to preserve the original positions of table fields, allows you to set the screen width, and accepts also syntactically incorrect input. The rendering is largely customizable through an RC file.
  • html_parse
    html_parse is a tool for stripping HTML tags from a document. It is also capable of adding the resulting plain text to a database driven by MySQL.
  • HTMLDOC
    HTMLDOC converts HTML files to PDF or PostScript, generates a table-of-contents for books and generates indexed HTML files.
  • htmlrecode
    htmlrecode recodes the HTML file using a new character set, while losing no characters at all.
  • IDReplace
    replaces key tags read from a template file with the data read from a data file and generates an output file
  • info_to_html
    converts info files to HTML. Features include links to other info files, the ability to read compressed files and a prettier layout
  • IPDF
    creates indexed pdf documents from text files. Designed to aid creating an electronic distribution method for legacy system reports, since many mainframe type print spools are plain text
  • Isearch
    software for indexing and searching text documents. It supports full text and field based search, relevance ranked results, Boolean queries, and heterogeneous databases. Isearch can parse many kinds of documents "out of the box," including HTML, mail folders, list digests, SGML-style tagged data, and USMARC
  • JadeTeX
    a TeX macro package for processing the output from Jade/OpenJade in TeX (-t) mode
  • jbofihe
    jbofihe is a parser for checking the grammatical correctness of Lojban text. It also provides approximate translations of Lojban into English. (Lojban is a constructed human language with the interesting property that its grammar can be cast into a form that Bison can parse).
  • Jcode
    a perl module to do japanese character conversion
  • jpg2html
    thumbnails for jpg images; XawTV snaps and Mavica cameras
  • jq
    jq is a lightweight and flexible command-line JSON processor. jq is like sed for JSON data - you can use it to slice and filter and map and transform structured data with the same ease that sed, awk, grep and friends let you play with text.
  • ktail
    a KDE tail program that monitors multiple files and/or command output in one window
  • KTextDecode
    KTextDecode is a Cyrillic text conversion utility for KDE 1.1.x.
  • latex2slides
    Latex2slides is a simple graphical program that produces a set of HTML/JPEG slides from a TeX or LaTeX source. Alternatively, the source can be a multipage PostScript, DVI or PDF file, and the image format for the slides can be set to PNG.
  • LaTeXDB
    brings together LaTeX and an SQL database. By using LaTeXDB you can use SQL queries in your LaTeX document and loop over the result sets, creating tables, serial letters and other stuff
  • lazyread
    Lazyread auto-scrolls files or command output to the screen. Change scroll modes, scroll-speed, colors, pause, search, etc. Render text, HTML, PDF, gzip, tar, zip, ar, bzip2, MS-Word, nroff, binary, directories, .deb, .so, .rpm, piped output and more.
  • lcra
    renames uppercase filenames to lowercase
  • less
    a pager. A pager is a program that displays text files
  • lft
    lft lists files by file type (directory, regular file, symbolic link, etc...)
  • LigaTeX
    LigaTeX removes unnecessary ligatures from TeX files. The program currently only works with texts written in German.
  • logtool
    a command line program that will parse syslog (and syslog-like) logfiles into a more palatable format. It will take anything resembling a standard syslog file (this includes syslog-ng, and probably most of the other variants out there), and crunch it into one of the following formats for your viewing
  • logtools
    logtools is a collection of tools to merge, sort, split, and mangle CLF-format Web logs. Some tools for generic log file manipulation are included.
  • loook
    a simple Python tool that searches for text strings in OpenOffice.org (and StarOffice 6.0 or later) files. It works under Linux, Windows and Macintosh
  • Lout
    Lout is a document formatting system that reads a high-level description of a document similar in style to LaTeX and produces a PostScript file which can be printed on most laser printers and graphic display devices. Plain text output is also available; PDF output is limited but working (e.g. no graphics).
  • lq-text
    a console based text retrieval package that is for indexing text/HTML documents
  • Lucidor
    Lucidor is a program for reading and handling e-books. It supports e-books in the EPUB file format and catalogs in the OPDS format.
  • lyx2html
    a very simple LyX to HTML converter. As the name suggests, it takes a ".lyx" document as input and generates an HTML file following a few simple rules. "lyx2html" can be very useful for generating documentation. This is a beta release
  • makefaq
    creates an HTML Frequently-Asked Questions page from a text file in which each category, question, and answer are on a single line in the file
  • man2web
    converts man (manual) pages to html via CGI or on the command line. man2web also allows for keyword (apropos) searching and generation of section indexes
  • mll2html
    mll2html is a GNU program which reformats a mailing-list text file into an HTML mailing-list file.
  • modifile
    Modifile is an application to modify the contents of text files, using Perl substitution expressions. Optionally, it can run interactively, with the user confirming substitutions to be made.
  • mozilla2ps
    mozilla2ps is a quick and gross hack to convert html to postscript pages in an unattended manner.
  • NFO Viewer
    NFO Viewer is a simple viewer for NFO files, which are "ASCII" art in the CP437 codepage. The advantages of using NFO Viewer instead of a text editor are preset font and encoding settings, automatic window size, and clickable hyperlinks.
  • ngp
    ngp is a grep tool that lets you look for a pattern in your source code directory and display results in ncurses.
  • odt2txt
    odt2txt extracts the text out of OpenDocument Texts. It is small, fast and supports multiple output encodings.
  • OpenBerg Reader
    OpenBerg Reader is an open-standards-based, multi-platform eBook reader. It is comparable to Adobe Acrobat in its purpose, and is based on Mozilla technologies.
  • otl
    otl is a text processor for generating custom markup from plain text supplemented with lightweight markup. Much of both the input and output formats can be customized.
  • otxt2html
    a tool for converting text files to html content, licensed under GPL
  • out2html
    out2html converts program output such as that produced by "git log --color" to colorized HTML.
  • Panconvert
    Panconvert is a markup document converter. It allows selecting files, or ad-hoc conversion of entered/pasted markup. Pandoc needs to be installed, and handles MarkDown/CommonMark, LaTeX, OPML, ODT and EPUB formats.
  • Par
    A paragraph reformatter, vaguely similar to fmt, but better. Par is a filter which copies its input to its output, changing all white characters (except newlines) to spaces, and reformatting each paragraph.
  • Pardiff
    takes the output of diff and displays it in a parallel (side-by-side) format, emulating the /PARALLEL option on the VMS version of diff
  • PDF Split and Merge
    PDF Split and Merge (pdfsam) is an easy-to-use tool that provides functions to split and merge PDF files or subsections of them.
  • pdf2html
    pdf2html takes one PDF and generates a series of HTML pages and PNG images. Each HTML page contains an image of one page of the PDF document.
  • PDFindex
    a very flexible and powerful PERL5 program. It's the simplest way to create a PDF index from your PDF archive
  • PDFKreator
    an easy to use KDE tool for creating PDF documents out of a bunch of image files. It heavily uses ImageMagick's convert tool, tiff2ps and ps2pdf
  • pdfSplit
    (commercial) pdfSplit is an application that lets you separate and/or rearrange the pages of an Adobe Acrobat file.
  • pdftk
    pdftk is a simple, command line tool for doing everyday things with PDF documents. Use it to merge PDF documents, split PDF pages into a new document, decrypt input as necessary (password required), encrypt output as desired, fill PDF forms with FDF data and/or flatten forms, apply a background watermark, report on PDF metrics, update PDF metadata, attach files to PDF pages or the PDF document, unpack PDF attachments, burst a PDF document into single pages, decompress and re-compress page streams, and repair corrupted PDF files (where possible).
  • pdftohtml
    an open-source PDF-to-HTML converter
  • Penetrator
    a tool for indexing big trees of text files, such as your local HTML documentation or home directory
  • phtx
    phtx is a command line tool that extracts data from tables in HTML-encoded files.
  • pmpp
    a Poor Man's Pre-Processor, implemented in Python. It can be used to preprocess text such as complicated HTML and forms
  • printconvert
    printconvert is a command-line tool to convert between DOS-style newlines (CR-LF) and Unix-style newlines (LF).
  • psbind
    psbind examines the margins in a PostScript document and rearranges the pages to fit them onto paper efficiently.
  • Psiconv
    Psiconv is a PSION 5 Word conversion utility released under the GPL.
  • Pspell
    the goal of the Pspell library is to provide a generic interface to Spell checker libraries installed on the system
  • pyRenamer
    With pyRenamer you can change the name of several files at the same time easily. You can rename files using patterns or search and replace or common substitutions. You can manually rename selected files. You can rename image and music files using their metadata.
  • Pyrite Publisher
    Pyrite Publisher is a set of powerful tools for building e-texts in the de facto standard Doc format used on the Palm Computing platform. It currently includes tools for converting HTML and ASCII text to Doc databases.
  • QDMerge
    QDMerge is a modular and extensible engine for merging data files with various templates to create documents. Useful for small to medium sized web sites, but not limited to X/HTML output.
  • randtype
    randtype is a small utility to read either standard input or text files and display the output, character-by-character or line-by-line, at random intervals.
  • rbpar
    Rbpar is a program and an accompanying library suite designed for formatting text paragraphs. In this sense, it greatly resembles the venerable Unix programs fmt and par. The difference is that rbpar sports a more modern design: it is written completely in Ruby and offers an internal API for several paragraph formatting tasks.
  • recode
    converts files between various character sets and usages. When exact transliterations are not possible, as is often the case, the program may get rid of the offending characters or fall back on approximations
  • Redet
    Regular Expression Development and Execution Tool: allows the user to construct regular expressions and test them against input data by executing any of a variety of search programs, editors, and programming languages that make use of regular expressions
  • regex-markup
    regex-markup performs regular expression-based text markup according to user-defined rules. This can be used to color syslog files as well as the output of programs such as ping, traceroute, gcc etc.
  • regextract
    regextract applies a regexp to a file and prints all matches.
  • Region Oriented Ascii Processor
    scans a text file, extracts regions that match specified patterns from it, and processes them with specified executables sequentially
  • Rel
    a suite of programs and tools for building wide area full text information retrieval systems over the Internet. The search mechanisms are capable of sorting documents by relevance to keyword search criteria. Boolean operations (and, or, not, and grouping operators) on multiple keywords are fully supported and the programs are capable of phonetic keyword search. The programs also find application in enterprise wide area information retrieval systems.
  • replace
    provides a much easier way than sed of replacing one or more strings with others in one or more text or binary files or from standard input
  • RIV2ASCII Conversion
    RIV2ASCII Conversion is a simple tool that finds the meaning of letter codes spotted on freight trains and writes them to the screen.
  • rlwrap
    rlwrap is a 'readline wrapper', i.e. a small utility that uses the GNU readline library to allow the editing of keyboard input for any other command.
  • rotfl
    rotfl is a simple text-formatting language. It's similar in function to TeX, HTML, nroff/groff, and PostScript.
  • rpl
    rpl is a UN*X text replacement utility. It will replace strings with new strings in multiple text files.
    a tool to convert RTF documents (from Microsoft Word, Word Perfect, Frame Maker...) into documents for the WWW
  • Russian Anywhere
    Russian Anywhere is a utility to convert Cyrillic files between different codepages.
  • safecat
    safecat implements Dan Bernstein's maildir algorithm, copying standard input safely to a specified directory. With safecat, the user is offered two assurances. First, if safecat returns successfully, then all data is guaranteed to be saved in the destination directory. Second, if a file exists in the destination directory, placed there by safecat, then the file is guaranteed to be complete.
  • Sar2html
    Sar2html converts sar binary data to a graphical HTML format. It has a command line tool, Web interface, and data collection script.
  • Sarep
    SAREP is a command line search and replace tool written in Perl. It supports regular expressions, multiple file search-and-replace, wildcards, writing out to a new file (rather than overwriting the modified file), and the code is well commented so you can make changes very easily.
  • Seetxt
    Seetxt is a lightweight text file and man page viewer for X windows.
  • Selathco
    an acronym for Simple Extensible LaTeX To HTML Converter. It is a program which reads a LaTeX source file and converts all known (i.e. built-in or user-created) commands to the appropriate HTML tags
  • SGMLtools-Lite
    the successor of the now-obsolete SGMLtools project. It consists of an easy-to-use front-end, a large number of processing backends, and some custom stylesheets
  • Sgrep
    a tool for searching text files and filtering text streams using structural criteria
  • signature
    a free, open-source producer of dynamic signatures for livening up your e-mail and news postings. It will allow you to sign your messages with a different sig every time
  • similarity-utils
    programs to give a quantitative measure of how similar two files are. similarity_by_diff measures the number of difference lines reported by diff(1), while similarity_by_zlib tries compressing the two files separately and togethe
  • SimpleFont
    takes words as input and renders them as large variable-width letters made of asterisks
  • splitpea
    splitpea is a command-line tool written in Python that can split a file into multiple fixed-size pieces and join those pieces to form the original file.
  • Stenciltools
    a collection of various small utilities, written for text generation and text manipulation. They include: count - Generate sequenced strings, csvconv - Converter for CSV files, rot13 - Rot13 encoder/decoder, memory - File based hashtables and linkget - Extract links from HTML documents
  • super sed
    super sed is an enhanced version of sed.
  • t2t
    t2t is a Perl script that converts standard ASCII text to HTML 4.0 tables. Any text with the delimiter embedded in it is converted to a table. The user can specify any Perl regular expression as a delimiter; the default delimiter is the tab.
  • Tabfmt
    Tabfmt is a command line utility to format tabular data. The program reads lines from one or more files or from standard input, breaks the lines into fields given a set of input field delimiters, and prints a table with constant-width columns to standard output or a specified file. Minimum and maximum field widths, left and right padding, as well as the characters used for filling, padding and delimiting the fields can be specified.
  • Table
    Table is a small set of programs that treats HTML tables like database tables.
  • Tag-types
    Utilities for manipulating tagged files
  • tbook XML
    a system that transforms XML to LaTeX, HTML, XHTML+MathML and DocBook
  • TEItools
    TEItools is a coupled set of scripts, written in Tcl, which does various SGML transformations. Currently they include the following converters: from TEI Lite to HTML, RTF, TeX, DVI, PS, PDF; from HTML to TEI Lite, Linuxdoc, TeX, DVI, PS, PDF; from Linuxdoc to HTML, TEI Lite, DocBook, TeX, DVI, PS, PDF; and from DocBook to TEI Lite.
  • tesh
    tesh is a simple shell which allows you to create and manage text documents by applying tags to them.
  • TiMBL
    The Tilburg Memory Based Learner, TiMBL, is an open source tool for NLP research, and for many other domains where classification tasks are learned from examples.
  • timestamp
    timestamp is a text filtering pipe that marks each line with a timestamp. The time is set when the first character of the line is received, and the util is capable of coping with CR repeats fairly well (won't over-write or update the timestamp).
  • TkDiff
    a graphical front end to the diff program. It provides a side-by-side view of the differences between two files, along with several innovative features such as diff bookmarks and a graphical map of differences for quick navigation
  • TkSplit
    a simple file splitter/joiner written in Tcl/Tk. It can currently only be used to join previously-split files such as those downloaded from newsgroups with .001 .002 etc... extensions, but unlike many splitters, the user can opt to skip missing files
  • tlgu
    tlgu is a utility for converting an input file in Thesaurus Linguae Graecae (TLG) or Packard Humanities Institute (PHI) representation (beta code text and citation information) into Unicode (UTF-8). A companion Hellenic Polytonic HOWTO is also included in the tlgu site.
  • tlve
    tlve is a command-line tool to parse different tlv (tag-length-value) structures and for printing them in different text-based formats.
  • todo2html
    todo2html generates pretty HTML from a standard text TODO file. The formatting is configurable with style sheets, and there is an easy-to-read built-in style or a style-free option.
  • Treescan
    finds all the files in a directory tree and prints their names, using an optimised disk access strategy. It is similar to `find -print'. The added feature is that Treescan optimises the I/O in various ways. It is sometimes much faster than the naive strategy used by `find'
  • trowser
    Trowser is a browser for large line-oriented text files (such as debug traces). It's meant as an alternative to "less". Compared to less, trowser adds color highlighting, a persistent search history, graphical bookmarking, separate search result windows, and flexible skipping of input from pipes to STDIN. Trowser has a graphical interface, but is designed to allow browsing via the keyboard at least to the same extent as less. Key bindings and the cursor positioning concept are derived from vim.
  • Turma
    search (and replace) for text blocks in multiple files & (sub)directories
  • Txr
    Txr implements a sophisticated query language that matches data across one or more text files or Unix pipes. Queries match entire files or sections of files.
  • txt2graph
    reads a text document from stdin, removes all non-alphabetic characters and generates an array (list) of words. It then converts German umlauts, because graphviz can only handle clean ASCII as node descriptions, and outputs a dot file for a directed or an undirected graph
  • txt2html
    a Perl program that converts plain text to HTML. It uses the HTML::TextToHTML perl module to do so
  • txt2man
    converts flat ASCII text to man page format. It is a shell script using GNU awk that should run on any Unix-like system
  • txt2pdf
    a powerful Perl 5 script to convert text files to PDF format
  • txt2pdf PRO
    (commercial) a very flexible and powerful Perl program that converts files from text to PDF format. It extends txt2pdf, with features such as the ability to add form feeds, to skip the first form feed, to not print the file name in the first line, to set the top and left margins, and to set all the text to bold, italic, or bold italic
  • txt2regex$
    a Regular Expression "wizard", written entirely with bash2 builtins, that converts human sentences to RegExs. With a simple interface, you just answer questions and build your own RegEx for a large variety of programs, like awk, ed, emacs, grep, perl, php, python, sed, tcl and vim
  • txt2tags
    a generic text converter. From simple text files, it generates HTML, SGML, man, Magic Point (mgp), MoinMoin and Adobe PageMaker documents
  • txtbdf2ps
    txtbdf2ps is a Perl script that can generate compact, DSC-compliant PostScript out of a plain text file and a BDF font.
  • unac
    unac is a C library and command that removes accents from a string.
  • Unicode Data Browser
    UnicodeDataBrowser is a browser for the UnicodeData.txt file, which contains much useful information but is not easily read by humans. It creates a scrollable table in which columns represent properties. The table may be sorted on any column. Abbreviations are expanded and characters cross-referenced in decomposition and casing fields are named. Regular expression search restricted to a selected column is available. The set of characters for which information is displayed may be restricted to those characters matching a regular expression on a specified property.
  • UnRTF
    UnRTF is a command-line converter from RTF (Rich Text) to HTML, LaTeX, PostScript, plain text, and text with VT100 codes. When converting to HTML, it supports tables, fonts, embedded images, hyperlinks, paragraph alignment, and more.
  • unsort
    Unsort unsorts a textfile. In other words: it randomizes the order of the lines in a file.
  • utf2any
    utf2any translates a file encoded in UTF-7 or UTF-8 (Unicode) into any 7- or 8-bit text format.
  • vcomment
    vcomment is a very small and simple tool that attempts to show every line with a comment (both C++ style double slash, and original C 'slash-star-star-slash' matched quotes) inside one or more files.
  • Vilistextum
    An HTML to ASCII or UTF-8 converter specifically programmed to output text suitable for reading.
  • Word Unmunger
    a small Python program which removes much of the HTML cruft produced by Microsoft Word 2002 (Word version 10), making the resulting files much easier to hand-edit
  • WordFlashReader
    WordFlashReader is a Rapid Serial Visual Presentation (RSVP) program useful for anyone who has an electronic text or book they wish to read. It flashes each word of the text sequentially and pauses for punctuation. Opens *.txt, *.html, and *.pdf files.
  • WPP
    a small Perl 5 script that allows preprocessing of HTML files
  • wv
    a library (formerly known as mswordview) which allows access to Microsoft Word files. It can load and parse the Word 2000, 97, 95 and 6 file formats. These are the file formats known internally as Word 9, 8, 7 and 6. wv compiles and works under most operating systems, particularly Linux, Solaris, AIX and OSF1
  • wvDecryptTest
    a Microsoft Word 97 password validator and almost decrypter
  • wvWare
    the continuation of Caolan McNamara's wv, the MSWord library. Efforts are underway to make this library more correct and robust, and to turn it into a Word97 exporter.
  • xml2
    these tools are used to convert XML and HTML to and from a line-oriented format more amenable to processing by classic Unix pipeline processing tools, like grep, sed, awk, cut, shell scripts, and so forth
  • xroottext
    xroottext renders stdin onto the root window with line wrap and scrolling.
  • xtranslate
    xtranslate is a tool to convert the text in the X11 selection buffer from ASCII to arbitrary UTF-8 characters. This is particularly useful when you've accidentally typed some text while the keyboard was in the wrong language mode.
  • y2l
    Yacc to LaTeX: takes any yacc source file, and derives an Extended Backus-Naur Form (EBNF) description from it. This EBNF is written out as LaTeX source
  • Yahp
    builds HTML pages out of "templates". It is especially useful for building web sites where the look and feel of all pages should be the same

    Copyright 2009 All rights reserved.