
Biopython – tools for biological computation written in Python

Biopython is a set of freely available tools for biological computation written by an international team of developers.

Biopython's features include parsers for various bioinformatics file formats (BLAST, ClustalW, FASTA, GenBank, …), access to online services (NCBI, ExPASy, …), interfaces to common and not-so-common programs (ClustalW, DSSP, MSMS, …), a standard sequence class, various clustering modules, a KD tree data structure, and documentation.

Many of Biopython’s modules contain command line wrappers for commonly used tools, allowing these tools to be used from within Biopython. These wrappers include BLAST, Clustal, PhyML, EMBOSS and SAMtools.
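As a small illustration of the standard sequence class mentioned above, the following sketch builds a `Seq` object and applies a few common operations. It assumes Biopython is installed (for example with `pip install biopython`); the DNA string is an arbitrary example sequence, not data from any real project.

```python
from Bio.Seq import Seq

# An arbitrary example coding sequence (assumption: chosen for illustration only).
dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")

# Transcription replaces T with U to give the corresponding RNA.
rna = dna.transcribe()

# Translation with the standard genetic code, stopping at the first stop codon.
protein = dna.translate(to_stop=True)

# Reverse complement of the DNA strand.
revcomp = dna.reverse_complement()

print("RNA:               ", rna)
print("Protein:           ", protein)
print("Reverse complement:", revcomp)
```

`Seq` objects behave much like Python strings, so slicing, `len()` and `str()` work as expected.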

Key Features

  • The ability to parse bioinformatics files into data structures usable from Python, including support for the following formats:
    • BLAST output – from both standalone and WWW BLAST.
    • ClustalW.
    • FASTA.
    • GenBank.
    • PubMed and Medline.
    • ExPASy files, like Enzyme and Prosite.
    • SCOP, including ‘dom’ and ‘lin’ files.
    • UniGene.
    • SwissProt.
  • Files in the supported formats can be iterated over record by record, or indexed and accessed via a dictionary interface.
  • Code to deal with popular on-line bioinformatics destinations such as:
    • NCBI – BLAST, Entrez and PubMed services.
    • ExPASy – Swiss-Prot and Prosite entries, as well as Prosite searches.
  • Interfaces to common bioinformatics programs such as:
    • Standalone BLAST from NCBI.
    • ClustalW alignment program.
    • EMBOSS command line tools.
  • A standard sequence class that deals with sequences, ids on sequences, and sequence features.
  • Tools for performing common operations on sequences, such as translation, transcription and weight calculations.
  • Code to perform classification of data using k Nearest Neighbors, Naive Bayes or Support Vector Machines.
  • Code for dealing with alignments, including a standard way to create and deal with substitution matrices.
  • Code making it easy to split up parallelizable tasks into separate processes.
  • GUI-based programs to do basic sequence manipulations, translations, BLASTing, etc.
  • Extensive documentation and help with using the modules, including online wiki documentation, the website, and the mailing list.
  • Integration with BioSQL, a sequence database schema also supported by the BioPerl and BioJava projects.
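
The two access patterns described in the list above (iterating record by record, and dictionary-style lookup) can be sketched with `Bio.SeqIO`. This is a minimal example, assuming Biopython is installed; the FASTA text and the record ids `seq1` and `seq2` are made up for illustration, and an in-memory `StringIO` handle stands in for a real file.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical FASTA content; in practice this would be a file on disk.
fasta_text = """>seq1 example record one
ATGGCCATTGTAATGGGCCGC
>seq2 example record two
TTGACCGGTAAC
"""

# Pattern 1: iterate over the records one at a time.
for record in SeqIO.parse(StringIO(fasta_text), "fasta"):
    print(record.id, len(record.seq))

# Pattern 2: load the records into a dictionary keyed by record id.
records_by_id = SeqIO.to_dict(SeqIO.parse(StringIO(fasta_text), "fasta"))
print(records_by_id["seq2"].seq)
```

`SeqIO.to_dict` holds every record in memory, which is fine for small inputs. For large files on disk, `SeqIO.index` provides a similar dictionary-like view while reading records only on demand.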

Website: biopython.org
Support: Documentation, Mailing Lists, GitHub Code Repository
Developer: Biopython Project Team
License: Biopython License. Some files are explicitly dual licensed under your choice of the Biopython License Agreement or the BSD 3-Clause License.

Biopython is written in Python. Learn Python with our recommended free books and free tutorials.


Related Software

Bioinformatics Tools
  • Bioconductor – Analysis and comprehension of high-throughput genomic data
  • Biopython – Tools for biological computation written in Python
  • UGENE – Set of integrated bioinformatics software
  • BioPerl – Perl tools for computational molecular biology
  • GROMACS – Versatile package to perform molecular dynamics
  • IGV – High-performance visualization genome browser tool
  • GATK – Genomic analysis toolkit focused on variant discovery
  • BioJava – Provides Java tools for processing biological data
  • InterMine – Integrate biological data sources
  • bedtools – Powerful toolset for genome arithmetic
  • EMBOSS – The European Molecular Biology Open Software Suite
  • BLAST – Algorithm for comparing primary biological sequence information
  • Galaxy – Web-based platform for data-intensive computational research
  • minimap2 – Versatile sequence alignment program
  • Jalview – Multiple sequence alignment editing, visualisation and analysis
  • samtools – Manipulate next-generation sequencing data
  • BCFtools – Variant calling and manipulating files in the Variant Call Format
  • FastQC – Quality control tool for high throughput sequence data
  • SPAdes – Versatile toolkit for assembling and analysing sequencing data
  • GenomeTools – Collection of bioinformatics tools
  • AliView – Alignment viewer and editor
  • mothur – Analyze microbial communities
  • Bandage – Visualising de novo assembly graphs
  • cramino – BAM/CRAM quality evaluation
  • abPOA – Adaptive banded Partial Order Alignment
  • Taverna Workbench – For designing and executing bioinformatics workflows
  • geWorkbench – Software platform for integrated genomic data analysis
  • Bioclipse – Rich-client platform chemistry and biology workbench

Read our verdict in the software roundup.

