stringi is an R package for very fast, portable, correct, consistent, and convenient string/text processing in any locale or character encoding.
stringi provides numerous functions related to data cleansing, information extraction, and natural language processing.
This is free and open source software.
Key Features
- String concatenation, padding, wrapping.
- Substring extraction.
- Pattern searching (e.g., with Java-like regular expressions).
- Collation and sorting.
- Random string generation.
- Case mapping and folding.
- String transliteration.
- Unicode normalisation.
- Date-time formatting and parsing.
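As a rough illustration of a few of the features listed above, the following minimal sketch calls some of stringi's `stri_*` functions. It assumes the package is already installed from CRAN; the sample strings and variable names are made up for illustration and do not come from the package documentation.

```r
# Illustrative sketch only; assumes stringi is installed: install.packages("stringi")
library(stringi)

# Concatenation and padding
joined <- stri_join("string", "i")                  # "stringi"
padded <- stri_pad_left("7", 3, pad = "0")          # "007"

# Substring extraction (1-based, inclusive positions)
prefix <- stri_sub("stringi", 1, 3)                 # "str"

# Pattern searching with an ICU (Java-like) regular expression
has_digits <- stri_detect_regex("abc 123", "\\d+")  # TRUE

# Locale-aware sorting (the "en_US" locale is an arbitrary choice here)
sorted <- stri_sort(c("b", "c", "a"), locale = "en_US")

# Case mapping
upper <- stri_trans_toupper("stringi")              # "STRINGI"

# Transliteration: strip diacritics using the ICU "Latin-ASCII" transform
ascii <- stri_trans_general("G\u0105golewski", "Latin-ASCII")

# Unicode normalisation: compose "a" + combining ogonek into one code point
nfc <- stri_trans_nfc("a\u0328")

print(list(joined, padded, prefix, has_digits, sorted, upper, ascii, stri_length(nfc)))
```

Most functions are vectorised, so the same calls work unchanged on whole character vectors rather than single strings.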
Website: stringi.gagolewski.com
Support: GitHub Code Repository
Developer: Marek Gagolewski and contributors
License: 3-clause BSD License
stringi is written in C++ and C.
Related Software
| R Natural Language Processing Tool | Description |
|---|---|
| tidytext | Text mining using dplyr, ggplot2, and other tidy tools |
| quanteda | R package for Quantitative Analysis of Textual Data |
| text2vec | Framework with API for text analysis and natural language processing |
| wordcloud | Create attractive word clouds |
| tm | Text Mining Infrastructure in R |
| stringi | Fast and portable character string processing in R |
| stringr | String manipulation in R |
| UDPipe | Tokenization, Tagging, Lemmatization and Dependency Parsing |
| tokenizers | Convert natural language text into tokens |
| spacyr | R wrapper around the Python spaCy package |
| Word Vectors | Build and explore embedding models |
| syuzhet | Extraction of sentiment and sentiment-based plot arcs from text |
| textTinyR | Text processing for small or big data |
| sentimentr | Dictionary based sentiment analysis |
| textclean | Collection of tools to clean and normalize text |
| TALL | Explore, model, and visualize textual data |
| corpustools | Various tools for analyzing text corpora |
| topicmodels | Interface to LDA and CTM models |
| text | Analyzing natural language with transformers-based large language models |
| RTextTools | Automatic text classification via supervised learning |

