OpenSeq2Seq is a toolkit for distributed and mixed precision training of sequence-to-sequence models:
- Machine translation (GNMT, Transformer, ConvS2S, …). These models are trained with a byte-pair encoding (BPE) vocabulary for text tokenization.
- Speech recognition (DeepSpeech2, Wave2Letter, Jasper, …). Automatic speech recognition (ASR) systems can be built in a number of ways, depending on the input data type, intermediate representation, model type, and output post-processing. OpenSeq2Seq currently focuses on end-to-end CTC-based models (like the original DeepSpeech model).
- Speech commands (RN-50, Jasper).
- Speech synthesis (Tacotron2, Tacotron2 GST, WaveNet, Centaur, …).
- Language model (LSTM with WikiText-2, LSTM with WikiText-103).
- Sentiment analysis (SST, IMDB, …).
- Image classification, a mixed-precision replica of TensorFlow ResNet-50.
OpenSeq2Seq's main goal is to let researchers explore various sequence-to-sequence models as effectively as possible. This efficiency comes from full support for distributed and mixed-precision training.
OpenSeq2Seq is built using TensorFlow and provides all the necessary building blocks for training encoder-decoder models for neural machine translation, automatic speech recognition, speech synthesis, and language modeling.
The speech-to-text workflow uses some parts of the Mozilla DeepSpeech project.
This is a research project, not an official NVIDIA product.
Key Features
- Models for:
  - Neural Machine Translation.
  - Automatic Speech Recognition.
  - Speech Synthesis.
  - Language Modeling.
  - NLP tasks (sentiment analysis).
- Data-parallel distributed training:
  - Multi-GPU.
  - Multi-node.
- Mixed precision training for NVIDIA Volta/Turing GPUs.
- Supports two modes for parallel training: simple multi-tower approach and Horovod-based approach.
- Supports two new optimizers: Layer-wise Adaptive Rate Control (LARC) and NovoGrad. NovoGrad is a first-order SGD-based algorithm, which computes second moments per layer instead of per weight as in Adam.
- Mixed precision with existing models.
- Interactive infer – a mode that makes it easy to demo trained models.
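
To make the NovoGrad point in the feature list more concrete, here is a minimal pure-Python sketch of what "second moments per layer instead of per weight" can look like. It is an illustration under stated assumptions, not OpenSeq2Seq's implementation: the function names, the flat-list representation of a layer, and the hyperparameter values are all hypothetical, and weight decay is left out for brevity.

```python
import math

def adam_second_moment(v, grad, beta2=0.999):
    """Adam-style statistic: one second-moment entry per weight."""
    return [beta2 * vi + (1.0 - beta2) * g * g for vi, g in zip(v, grad)]

def novograd_layer_step(weights, grad, m, v,
                        lr=0.01, beta1=0.95, beta2=0.98, eps=1e-8):
    """One NovoGrad-style update for a single layer (hypothetical sketch).

    weights, grad, m: flat lists of floats for this layer.
    v: a single float for the whole layer, or None before the first step.
    """
    # Per-layer second moment: a scalar built from the squared gradient norm.
    g_sq = sum(g * g for g in grad)
    v = g_sq if v is None else beta2 * v + (1.0 - beta2) * g_sq

    # Normalize the whole layer's gradient by the same scalar.
    denom = math.sqrt(v) + eps
    m = [beta1 * mi + g / denom for mi, g in zip(m, grad)]

    # Plain first-order step along the accumulated momentum.
    weights = [w - lr * mi for w, mi in zip(weights, m)]
    return weights, m, v

# Usage: a two-weight "layer" with gradient norm 5.
w, m, v = novograd_layer_step([1.0, -2.0], [3.0, 4.0], [0.0, 0.0], None)
per_weight = adam_second_moment([0.0, 0.0], [3.0, 4.0])
```

In this sketch the optimizer state for `v` is one float per layer, whereas the Adam-style helper keeps a list as long as the layer itself.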
Website: nvidia.github.io/OpenSeq2Seq
Support: GitHub Code Repository
Developer: NVIDIA
License: Apache License 2.0
OpenSeq2Seq is written in Python.
Related Software
| Speech Recognition Tools | |
|---|---|
| Whisper | Automatic speech recognition (ASR) system trained on 680,000 hours of data |
| Flashlight | Fast, flexible machine learning library written entirely in C++. |
| Coqui STT | Deep-learning toolkit for training and deploying speech-to-text models |
| Kaldi | C++ toolkit designed for speech recognition researchers. |
| SpeechBrain | All-in-one conversational AI toolkit based on PyTorch |
| Handy | Offline speech-to-text application |
| ESPnet | End-to-End speech processing toolkit |
| deepspeech.pytorch | Implementation of DeepSpeech2 using Baidu Warp-CTC. |
| Whispering | Transcription application with global speech-to-text functionality |
| Julius | Two-pass large vocabulary continuous speech recognition engine |
| CMUSphinx | Speech recognition system for mobile and server applications |
| Simon | Flexible speech recognition software |
| hyprwhspr | Native speech-to-text designed for Arch / Omarchy |
| ostt | Open Speech-to-Text |
| DeepSpeech | TensorFlow implementation of Baidu's DeepSpeech architecture. |
| OpenSeq2Seq | TensorFlow-based toolkit for sequence-to-sequence models |
| Eesen | End-to-End Speech Recognition |

