MeTA is a modern C++ data sciences toolkit. It bundles tools for natural language processing, classification, information retrieval, data mining, and other text-processing applications.
MeTA emphasizes the tight integration of search capabilities (indeed, text access capabilities in general) with text analysis functions, so that it can fully support building a powerful text analysis application.
MeTA is also designed to facilitate education and research experiments with various algorithms. In this respect, it is similar to Indri/Lemur in its emphasis on modularity and extensibility, achieved through object-oriented design. A selected subset of modules can be configured flexibly, which makes it easy to design course assignments or to experiment with a few selected algorithms in focused research projects.
Key Features
- Text tokenization, including deep semantic features like parse trees.
- Inverted and forward indexes with compression and various caching strategies.
- A collection of ranking functions for searching the indexes.
- Topic models.
- Classification algorithms.
- Graph algorithms.
- Language models.
- CRF implementation (POS-tagging, shallow parsing).
- Wrappers for liblinear and libsvm (including libsvm dataset parsers).
- UTF-8 support for analysis of text in various languages.
- Multithreaded algorithms.
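
To make the indexing and ranking features above more concrete, here is a minimal sketch of an in-memory inverted index with a naive term-frequency ranker. It is not MeTA's API: the names `tokenize`, `build_index`, and `search` are hypothetical, and the sketch assumes whitespace tokenization and ASCII lowercasing, with no compression or caching.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only: none of these names come from MeTA.
using Postings = std::map<std::size_t, int>;          // document id -> term frequency
using InvertedIndex = std::map<std::string, Postings>; // term -> postings
using Hit = std::pair<std::size_t, int>;               // (document id, score)

// Split on whitespace and lowercase ASCII letters.
// A real tokenizer would also handle punctuation, Unicode, stemming, etc.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        std::transform(word.begin(), word.end(), word.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
        tokens.push_back(word);
    }
    return tokens;
}

// Build the term -> postings map; a document's id is its position in `docs`.
InvertedIndex build_index(const std::vector<std::string>& docs) {
    InvertedIndex index;
    for (std::size_t id = 0; id < docs.size(); ++id) {
        for (const std::string& tok : tokenize(docs[id])) {
            ++index[tok][id];
        }
    }
    return index;
}

// Score each document by the summed term frequency of the query terms
// and return the matches with the highest score first.
std::vector<Hit> search(const InvertedIndex& index, const std::string& query) {
    std::map<std::size_t, int> scores;
    for (const std::string& tok : tokenize(query)) {
        InvertedIndex::const_iterator it = index.find(tok);
        if (it == index.end()) {
            continue; // term does not occur in any document
        }
        for (const Hit& posting : it->second) {
            scores[posting.first] += posting.second;
        }
    }
    std::vector<Hit> ranked(scores.begin(), scores.end());
    // stable_sort keeps ascending document id order among equal scores.
    std::stable_sort(ranked.begin(), ranked.end(),
                     [](const Hit& a, const Hit& b) {
                         return a.second > b.second;
                     });
    return ranked;
}
```

Calling `build_index` on a vector of document strings and then `search(index, "some query")` returns (document id, score) pairs, best match first. Raw term frequency is used here only to keep the sketch short; a production ranking function would typically also account for document length and how rare each term is.
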
Website: github.com/meta-toolkit/meta
Developer: MeTA Team – University of Illinois at Urbana-Champaign
License: Open source
MeTA is written in C++.
Related Software
| C++ Natural Language Processing Tools | |
|---|---|
| text2vec | Framework with API for text analysis and natural language processing |
| Moses | Statistical machine translation system |
| TiMBL | Tilburg Memory-Based Learner |
| MITIE | MIT Information Extraction |
| MeTA | Modern C++ data sciences toolkit |
| Colibri Core | Efficient n-gram & skipgram modelling on text corpora |
| CRF++ | Yet Another CRF toolkit |
| BLLIP Parser | Statistical natural language parser |

