doccano is a data labelling and text annotation platform for machine learning practitioners.
It’s designed for building datasets used in natural language processing workflows, including text classification, sequence labelling, sequence-to-sequence tasks, sentiment analysis, named entity recognition, and text summarisation. The software provides a browser-based environment where users can create projects, upload data, annotate records, manage labelling work, and export completed datasets for further analysis or model training.
This is free and open source software.
Key Features
- Collaborative annotation workflow for teams working on labelling projects.
- RESTful API for integrating annotation workflows with scripts and machine learning models.
- Supports multiple languages for data annotation tasks.
- Mobile-friendly interface for working across different devices.
- Emoji support for annotation projects that need expressive labels or content handling.
- Dark theme for a more comfortable interface in low-light environments.
- Can be installed with pip, Docker, or Docker Compose.
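As a rough illustration of how an exported dataset might feed a script, the sketch below reads sequence labelling records and counts the labels. It assumes a JSON Lines export in which each record has a `text` field and a `label` field holding `[start, end, name]` triples. That layout is an assumption made for this example, as are the helper names and the sample records; the actual export format depends on the project type and the doccano version.

```python
import json
from collections import Counter


def parse_jsonl(lines):
    """Parse JSON Lines input into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def extract_spans(record):
    """Return (span_text, label_name) pairs for one record.

    Assumes record["label"] is a list of [start, end, name] triples,
    with character offsets into record["text"] (an assumed layout).
    """
    text = record["text"]
    return [(text[start:end], name) for start, end, name in record.get("label", [])]


def label_counts(records):
    """Count how often each label name occurs across all records."""
    counts = Counter()
    for record in records:
        for _, name in extract_spans(record):
            counts[name] += 1
    return counts


# Hypothetical sample standing in for the lines of an exported file.
sample = [
    '{"text": "Hiroki works in Tokyo.", "label": [[0, 6, "PERSON"], [16, 21, "LOC"]]}',
    '{"text": "doccano is written in Python.", "label": [[22, 28, "LANG"]]}',
]

records = parse_jsonl(sample)
for record in records:
    print(extract_spans(record))
print(label_counts(records))
```

To run this against a real export, pass an open file handle to `parse_jsonl` in place of `sample`, and adjust the field names to match what the project actually produces.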
Website: github.com/doccano/doccano
Support:
Developer: Hiroki Nakayama and contributors
License: MIT License
doccano is written in Python.
Related Software
| Python Natural Language Processing Tools | |
|---|---|
| PyTorch-Transformers | Library of state-of-the-art pre-trained models for NLP |
| NLTK | Natural Language Toolkit |
| spaCy | Industrial strength natural language processing |
| scikit-learn | Machine learning library |
| Gensim | Vector space modeling and topic modeling toolkit |
| flair | Simple framework for state-of-the-art NLP |
| TextBlob | Python (2 and 3) library for processing textual data |
| textacy | Python library for performing NLP tasks |
| polyglot | Multilingual text (NLP) processing toolkit |
| AllenNLP | Apache 2.0 NLP research library |
| Snips NLU | Natural Language Understanding Python library |
| PyNLPl | Various modules useful for common, and less common, NLP tasks |
| nlpnet | Natural Language Processing with neural networks |
| Pattern | Web mining module |
| GluonNLP | Deep Learning for NLP |
| PyTorch-NLP | Neural network layers, text processing modules and datasets |
| NLP Architect | Deep Learning NLP/NLU library |

