General Architecture for Text Engineering (GATE) is an open source full-lifecycle solution for a broad range of Natural Language Processing tasks. GATE excels at text analysis of all shapes and sizes.
(commercial) a file compare/merge (diff-like) tool with a GUI implemented in Java
DiffJ is a commandline application that compares Java files based on content, not whitespace, comments, or reordering of types, methods, or fields.
a collection of XML diff and patch utilities which operate on the hierarchical structure of XML documents
(shareware) a Java XML parser. Features: high-performance XML parser, SAX Level 1 and 2 compliant, DOM Level 1 and 2 compliant
File2XLIFF4j is a Java-based library for converting files to the XLIFF standard. Additional file type converters can be added.
Holodeck10 is a Java project that lets the user select and lay out images on a Letter-size sheet of paper for printing.
Jarmor (Java ASCII Armor) is a tiny collection of stream filters which implement ASCII armors. They can be used to convert streams of binary data into text and vice-versa. For example, it can be used to transfer or store data using text-only means like mail or XML. The supported encodings are Base64, Base32, Base16, UUCP, and ASCII85.
Java ReStructuredText is a parser and converter. It can parse ReStructuredText and generate XHTML, xdoc, and DocBook, or you can use your own XSL file.
JavaStats77 is a tool for generating source code statistics in HTML format and converting .java files into HTML. It is integrated with Java2Html.
a Java application that validates hyperlinks in web sites. It includes no native code, so it should run on any Java 1.1.7 virtual machine
JCols parses text files by applying a user specified expression to each line.
jfor converts XML documents conforming to the XSL-FO specification to RTF format, the goal being to use the same XSL-FO documents (as often generated using XSLT transforms) to generate PDF (using FOP or similar) and RTF (using jfor) documents.
an application for adding reading and translation annotations to words in a Japanese text document
Jitac is an image to ASCII converter written in Java.
JReferences is a tool to store and retrieve bibliographic references from a file or MySQL database. It reads BibTeXML, DocBook XML, and RIS references, and can output these formats as well as BibTeX.
an interesting way to produce RTF files from Java. The files are produced from an XML style sheet that defines how they should look
LargeFileViewer can display very large text files.
RegexSearch is an application that performs find and find-and-replace searches for regular expressions on multiple text files. It can search for literal text or regular expressions. The search can be performed on a single file, on a directory (with optional recursion) or on the files and directories listed in a text file. Files can be included in and excluded from the search by means of filters.
Scraper is an easier way to extract data from HTML documents. It lets you describe the information you want to extract without XPath, DOM traversal, or regular expressions
stroy is a smart diff tool. For now it specializes in directories of files. Its differentiating feature is the ability to match files that have different names, locations, and content.
designed to add syntax coloring to web pages that display source code or to add color syntax highlighting ability to any text editor written in Java
TDTxE (Tagged Document Transformer Engine) transforms the markup of a text, based on customizable tables, such as tags from HTML 3.2 to HTML+CSS (or vice versa).
TransfoDocbook is an application that converts DocBook XML documents into PDF and into single or multiple HTML files.
a Java/Windows conversion utility. It converts text or HTML documents in Vietnamese legacy encodings (VNI, VPS, VISCII, TCVN, and VIQR/Vietnet) to Unicode UTF-8