Biopython is a set of freely available tools for biological computation written by an international team of developers.
Biopython's features include parsers for various bioinformatics file formats (BLAST, ClustalW, FASTA, GenBank, …), access to online services (NCBI, ExPASy, …), interfaces to common and not-so-common programs (ClustalW, DSSP, MSMS, …), a standard sequence class, various clustering modules, a KD tree data structure, and documentation.
Many of Biopython’s modules contain command-line wrappers for commonly used tools, so that these tools can be run from within Biopython. The wrapped tools include BLAST, Clustal, PhyML, EMBOSS and SAMtools.
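
As a rough illustration of the parsing support described above, the following sketch reads FASTA records record by record and then through a dictionary. It assumes Biopython is installed; the two sequences and their IDs (`seq1`, `seq2`) are made up for the example and held in memory so that no input file is needed.

```python
from io import StringIO

from Bio import SeqIO

# Two made-up FASTA records (hypothetical data, not from any database).
FASTA_TEXT = """>seq1 hypothetical example record
ATGGCCATTGTAATGGGCCGCTGA
>seq2 another hypothetical record
ATGAAATAG
"""

# Iterate record by record; each item is a SeqRecord object.
records = list(SeqIO.parse(StringIO(FASTA_TEXT), "fasta"))
for record in records:
    print(record.id, len(record.seq))

# Load the same records into a dictionary keyed by record id.
by_id = SeqIO.to_dict(SeqIO.parse(StringIO(FASTA_TEXT), "fasta"))
print(by_id["seq2"].seq)
```

`SeqIO.to_dict` keeps every record in memory, which is fine for small inputs. For large files on disk, `SeqIO.index` provides a similar dictionary-like interface without loading all records at once.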
Key Features
- The ability to parse bioinformatics files into data structures usable from Python, including support for the following formats:
  - BLAST output – both from standalone and WWW BLAST.
  - ClustalW.
  - FASTA.
  - GenBank.
  - PubMed and Medline.
  - ExPASy files, like Enzyme and Prosite.
  - SCOP, including ‘dom’ and ‘lin’ files.
  - UniGene.
  - SwissProt.
- Files in the supported formats can be iterated over record by record or indexed and accessed via a Dictionary interface.
- Code to access popular online bioinformatics services such as:
  - NCBI – BLAST, Entrez and PubMed services.
  - ExPASy – Swiss-Prot and Prosite entries, as well as Prosite searches.
- Interfaces to common bioinformatics programs such as:
  - Standalone BLAST from NCBI.
  - ClustalW alignment program.
  - EMBOSS command line tools.
- A standard sequence class that handles sequences, sequence IDs, and sequence features.
- Tools for performing common operations on sequences, such as translation, transcription and weight calculations.
- Code to perform classification of data using k Nearest Neighbors, Naive Bayes or Support Vector Machines.
- Code for dealing with alignments, including a standard way to create and deal with substitution matrices.
- Code making it easy to split up parallelizable tasks into separate processes.
- GUI-based programs to do basic sequence manipulations, translations, BLASTing, etc.
- Extensive documentation and help with using the modules, including online wiki documentation, the website, and the mailing list.
- Integration with BioSQL, a sequence database schema also supported by the BioPerl and BioJava projects.
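
Several of the features listed above (the sequence class, common sequence operations, and alignments with substitution matrices) can be sketched in a few lines. This is a minimal example rather than a reference for the API: it assumes a reasonably recent Biopython release that ships `Bio.Align.PairwiseAligner` and `Bio.Align.substitution_matrices`, and the DNA sequence and gap scores are arbitrary values chosen for illustration.

```python
from Bio.Align import PairwiseAligner, substitution_matrices
from Bio.Seq import Seq
from Bio.SeqUtils import molecular_weight

# An arbitrary coding sequence used only for illustration.
coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGA")

# Transcription replaces T with U; translation stops at the first stop codon.
mrna = coding_dna.transcribe()
protein = coding_dna.translate(to_stop=True)
print(mrna)
print(protein)

# Molecular weight of the DNA sequence (the function treats input as DNA by default).
weight = molecular_weight(coding_dna)
print(round(weight, 1))

# Score a global protein alignment using the BLOSUM62 substitution matrix.
aligner = PairwiseAligner()
aligner.substitution_matrix = substitution_matrices.load("BLOSUM62")
aligner.open_gap_score = -10     # illustrative gap penalties, not recommendations
aligner.extend_gap_score = -0.5
score = aligner.score(str(protein), str(protein))
print(score)
```

Aligning the translated protein against itself is only a sanity check: with a substitution matrix in place, the score is the sum of the matrix's diagonal entries for each residue. In practice the second argument would be a different sequence.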
Website: biopython.org
Support: Documentation, Mailing Lists, GitHub Code Repository
Developer: Biopython Project Team
License: Biopython License. Some files are explicitly dual licensed under your choice of the Biopython License Agreement or the BSD 3-Clause License.
Biopython is written in Python.
Related Software
| Bioinformatics Tool | Description |
|---|---|
| Bioconductor | Analysis and comprehension of high-throughput genomic data |
| Biopython | Tools for biological computation written in Python |
| UGENE | Set of integrated bioinformatics software |
| BioPerl | Perl tools for computational molecular biology |
| GROMACS | Versatile package to perform molecular dynamics |
| IGV | High-performance visualization genome browser tool |
| GATK | Genomic analysis toolkit focused on variant discovery |
| BioJava | Provides Java tools for processing biological data |
| InterMine | Integrate biological data sources |
| bedtools | Powerful toolset for genome arithmetic |
| EMBOSS | The European Molecular Biology Open Software Suite |
| BLAST | Algorithm for comparing primary biological sequence information |
| Galaxy | Web-based platform for data-intensive computational research |
| minimap2 | Versatile sequence alignment program |
| Jalview | Multiple sequence alignment editing, visualisation and analysis |
| samtools | Manipulate next-generation sequencing data |
| BCFtools | Variant calling and manipulating files in the Variant Call Format |
| FastQC | Quality control tool for high throughput sequence data |
| SPAdes | Versatile toolkit for assembling and analysing sequencing data |
| GenomeTools | Collection of bioinformatics tools |
| AliView | Alignment viewer and editor |
| mothur | Analyze microbial communities |
| Bandage | Visualising de novo assembly graphs |
| cramino | BAM/CRAM quality evaluation |
| abPOA | Adaptive banded Partial Order Alignment |
| Taverna Workbench | For designing and executing bioinformatics workflows |
| geWorkbench | Software platform for integrated genomic data analysis |
| Bioclipse | Rich-client platform chemistry and biology workbench |

