Datashader – Data Rasterization Pipeline

Datashader is an open source library that rasterizes large datasets into accurate, attractive images, automating the creation of meaningful visual representations.

Datashader breaks the creation of images into a series of explicit steps that allow computations to be done on intermediate representations. This approach allows accurate and effective visualizations to be produced automatically, and also makes it simple for data scientists to focus on particular data and relationships of interest in a principled way.

Datashader is designed for working with large datasets, for cases where it is essential to faithfully represent the distribution of your data. It generates a fixed-size data structure (regardless of the original number of records) that gets transferred to your local browser for display.

Features include:

  • Fast server-side engine designed for dynamic data aggregation.
  • Handles interactive visualization of very large datasets, including those with billions of rows.
  • Aggregates data on the server and sends pixels to the client instead of the raw data.
  • A compute layer that works with Bokeh.
  • Provides a flexible series of processing stages that map from raw data into viewable images:
      • Projection – each record is projected into zero or more bins of a nominal plotting grid shape, based on a specified glyph.
      • Aggregation – reductions are computed for each bin, compressing the potentially large dataset into a much smaller aggregate array.
      • Transformation – these aggregates are then further processed, eventually creating an image.
  • Supports both Pandas and Dask data frames for Points, Lines, and Graphs, and xarray arrays for Raster data.
  • Provides Point, Line, and Raster glyphs, specified at the canvas/scene level.
  • Directly supported by HoloViews, with interactive exploration supported for its Bokeh extension, and static plots supported for its Matplotlib extension.
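
The projection, aggregation, and transformation stages listed above can be illustrated with a small pure-Python sketch. This is not Datashader's implementation or API: the function names `project`, `aggregate`, and `shade`, the count reduction, and the log-scaled grey levels are all assumptions chosen for illustration, and the point glyph here maps each record to at most one bin.

```python
import math
import random


def project(x, y, x_range, y_range, width, height):
    """Projection: map one (x, y) record to a (row, col) bin, or None if out of range."""
    (x0, x1), (y0, y1) = x_range, y_range
    if not (x0 <= x <= x1 and y0 <= y <= y1):
        return None  # the record falls into zero bins
    # Clamp so that values exactly on the upper edge land in the last bin.
    col = min(int((x - x0) / (x1 - x0) * width), width - 1)
    row = min(int((y - y0) / (y1 - y0) * height), height - 1)
    return row, col


def aggregate(points, x_range, y_range, width, height):
    """Aggregation: reduce all records to a fixed-size grid of per-bin counts."""
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        cell = project(x, y, x_range, y_range, width, height)
        if cell is not None:
            row, col = cell
            grid[row][col] += 1
    return grid


def shade(grid):
    """Transformation: turn counts into 0-255 grey levels with log scaling."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0] * len(row) for row in grid]
    scale = 255 / math.log1p(peak)
    return [[round(math.log1p(count) * scale) for count in row] for row in grid]


# Hypothetical input: 100,000 normally distributed points.
random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100_000)]

counts = aggregate(points, (-3, 3), (-3, 3), width=40, height=30)
image = shade(counts)
print(len(image), len(image[0]))  # 30 40
```

The aggregate is always a 30 by 40 grid here, whether the input holds ten records or a hundred thousand, which is why only a small fixed-size structure needs to reach the browser.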

Support: FAQ, User Guide, GitHub, Twitter
Developer: Continuum Analytics, Inc. and contributors
License: BSD License


Datashader is written in Python.
