BLLIP Parser is a statistical natural language parser with two stages: a generative constituent parser (first stage) and a discriminative maximum entropy reranker (second stage). It’s also known as the Charniak-Johnson parser or the Brown reranking parser.
Both parsing models and reranker models are available.
Python and Java interfaces are available.
The best parsing model depends on the kind of text you’d like to parse. Here are the current recommendations:
- News text: WSJ+Gigaword-v2.
- Web text: SANCL2012-Uniform.
- Biomedical (PubMed) text: GENIA+PubMed.
- WSJ section 23 evaluations to replicate papers: For purely supervised parser or parser/reranker results, use either WSJ-PTB3 (for Penn Treebank WSJ) or OntoNotes-WSJ (for the OntoNotes version of WSJ). Use WSJ+Gigaword to replicate self-training results, though WSJ+Gigaword-v2 performs slightly better.
- Everything else: In general, it’s probably best to use SANCL2012-Uniform or WSJ+Gigaword-v2 depending on how well-formed your text is (SANCL2012-Uniform for more informal web/email text).
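The recommendations above can be sketched as a small lookup in Python. This is a minimal illustration, not part of BLLIP itself: the `recommend_model` and `load_recommended_parser` helpers, the text-type keys, and the `well_formed` flag are hypothetical names introduced here. The loader assumes the `bllipparser` Python package is installed and that its `RerankingParser.fetch_and_load` call accepts these model names.

```python
# Model names are taken from the recommendations list; the keys are
# hypothetical labels chosen for this sketch.
RECOMMENDED_MODELS = {
    "news": "WSJ+Gigaword-v2",
    "web": "SANCL2012-Uniform",
    "biomedical": "GENIA+PubMed",
}


def recommend_model(text_type, well_formed=True):
    """Return a model name for a text type (hypothetical helper)."""
    key = text_type.lower()
    if key in RECOMMENDED_MODELS:
        return RECOMMENDED_MODELS[key]
    # "Everything else": choose by how well-formed the text is.
    return "WSJ+Gigaword-v2" if well_formed else "SANCL2012-Uniform"


def load_recommended_parser(text_type, well_formed=True):
    """Download and load the recommended model.

    Assumes the bllipparser package is installed; the import is deferred
    so that recommend_model() works without it.
    """
    from bllipparser import RerankingParser

    model_name = recommend_model(text_type, well_formed)
    return RerankingParser.fetch_and_load(model_name, verbose=True)
```

For example, `recommend_model("biomedical")` selects GENIA+PubMed, while an unlisted type such as informal email text with `well_formed=False` falls back to SANCL2012-Uniform.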
NLTK provides an interface to the BLLIP reranking parser (aka Charniak-Johnson parser, Charniak parser, Brown reranking parser).
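As a rough sketch of what using the NLTK interface might look like, the hypothetical helper below wraps `BllipParser` from `nltk.parse.bllip`. It assumes that both `nltk` and the `bllipparser` package are installed and that `model_dir` points to an already downloaded unified model directory; the helper name and its error handling are illustrative only.

```python
import os


def parse_with_nltk(model_dir, tokens):
    """Parse a tokenized sentence via NLTK's BLLIP interface (sketch).

    model_dir: path to a unified BLLIP model directory (assumed to exist).
    tokens: list of word strings making up one sentence.
    """
    if not os.path.isdir(model_dir):
        raise FileNotFoundError("BLLIP model directory not found: " + model_dir)

    # Deferred import: needs nltk plus the bllipparser package.
    from nltk.parse.bllip import BllipParser

    parser = BllipParser.from_unified_model_dir(model_dir)
    # parse_one returns the top-ranked parse as an nltk Tree.
    return parser.parse_one(tokens)
```

A call such as `parse_with_nltk(model_dir, "British left waffles on Falklands .".split())` would return the best-scoring tree, provided the model directory is valid.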
Website: bllip.cs.brown.edu
Support: GitHub Code Repository
Developer: Mark Johnson, Eugene Charniak
License: Apache License 2.0
Related Software
| C++ Natural Language Processing Tools | |
|---|---|
| text2vec | Framework with API for text analysis and natural language processing |
| Moses | Statistical machine translation system |
| TiMBL | Tilburg Memory-Based Learner |
| MITIE | MIT Information Extraction |
| MeTA | Modern C++ data sciences toolkit |
| Colibri Core | Efficient n-gram & skipgram modelling on text corpora |
| CRF++ | Yet Another CRF toolkit |
| BLLIP Parser | Statistical natural language parser |

