TALL (Text Analysis for ALL) is an R Shiny application for exploring, modelling, and visualizing textual data. It’s aimed at researchers who need a graphical environment for natural language processing tasks, covering the workflow from data import and cleaning through to statistical analysis, interpretation, and reporting.
The application is designed for work with collections such as research articles, social media posts, survey responses, customer reviews, legal documents, and literary texts. It brings together text mining, linguistic annotation, topic modelling, sentiment analysis, and interactive visual exploration in a reproducible R-based workflow.
This is free and open source software.
Key Features
- Imports plain text, CSV, Excel, PDF, and Biblioshiny export files.
- Offers tokenization, lemmatization, part-of-speech tagging, dependency parsing, and special entity detection.
- Includes corpus statistics, lexical richness measures, TF-IDF rankings, Zipf’s law plots, and word clouds.
- Provides keyness analysis, KWIC concordance, correspondence analysis, co-occurrence networks, thematic maps, and word embeddings.
- Supports LDA, CTM, and STM topic modelling with model selection metrics and diagnostics.
- Includes lexicon-based polarity detection, emotion analysis, syntactic complexity metrics, and SVO triplet extraction.
- Exports analyses to Excel workbooks and plots to high-resolution PNG images.
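
To give a feel for one of the measures listed above, here is a minimal sketch of how a TF-IDF ranking can be computed. It is written in plain Python for illustration only and is not TALL's own code (TALL is an R application). The function name `tf_idf`, the toy documents, and the specific weighting (relative term frequency multiplied by the natural log of inverse document frequency) are assumptions; TALL may use a different variant.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per tokenized document (illustrative variant)."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Relative term frequency times log inverse document frequency.
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical toy corpus, already lowercased and split into tokens.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
scores = tf_idf(docs)
# Highest-scoring term in the first document.
top = max(scores[0], key=scores[0].get)
print(top)
```

A term that appears in every document (here "the") scores zero, while a term unique to one document rises to the top of that document's ranking.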
Website: github.com/massimoaria/tall
Support:
Developer: Massimo Aria and contributors
License: MIT License
Related Software
| R Natural Language Processing Tools | |
|---|---|
| tidytext | Text mining using dplyr, ggplot2, and other tidy tools |
| quanteda | R package for Quantitative Analysis of Textual Data |
| text2vec | Framework with API for text analysis and natural language processing |
| wordcloud | Create attractive word clouds |
| tm | Text Mining Infrastructure in R |
| stringi | Fast and portable character string processing in R |
| stringr | String manipulation in R |
| UDPipe | Tokenization, Tagging, Lemmatization and Dependency Parsing |
| tokenizers | Convert natural language text into tokens |
| spacyr | R wrapper around the Python spaCy package |
| Word Vectors | Build and explore embedding models |
| syuzhet | Extraction of sentiment and sentiment-based plot arcs from text |
| textTinyR | Text processing for small or big data |
| sentimentr | Dictionary based sentiment analysis |
| textclean | Collection of tools to clean and normalize text |
| TALL | Explore, model, and visualize textual data |
| corpustools | Various tools for analyzing text corpora |
| topicmodels | Interface to LDA and CTM models |
| text | Analyzing natural language with transformers-based large language models |
| RTextTools | Automatic text classification via supervised learning |
Read our verdict in the software roundup.

