Datashader – Data Rasterization Pipeline

Datashader is an open source Python library that rasterizes large datasets into attractive, accurate images, automating the creation of meaningful visual representations.

Datashader breaks the creation of images into a series of explicit steps that allow computations to be done on intermediate representations. This approach allows accurate and effective visualizations to be produced automatically, and also makes it simple for data scientists to focus on particular data and relationships of interest in a principled way.

Datashader is designed for working with large datasets, for cases where it is essential to faithfully represent the distribution of your data. It generates a fixed-size data structure (regardless of the original number of records) that gets transferred to your local browser for display.

Features include:

  • Fast server-side engine designed for dynamic data aggregation.
  • Handles interactive visualizations of very large datasets, including those with billions of rows.
  • Aggregates data and sends pixels, rather than raw data, to the client.
  • A compute layer that works with Bokeh.
  • Provides a flexible series of processing stages that map from raw data into viewable images:
      ◦ Projection – each record is projected into zero or more bins of a nominal plotting grid shape, based on a specified glyph.
      ◦ Aggregation – reductions are computed for each bin, compressing the potentially large dataset into a much smaller aggregate array.
      ◦ Transformation – these aggregates are then further processed, eventually creating an image.
  • Supports both Pandas and Dask data frames for Points, Lines, and Graphs, and xarray arrays for Raster data.
  • Provides Point, Line, and Raster glyphs, specified at the canvas/scene level.
  • Directly supported by HoloViews, with interactive exploration supported for its Bokeh extension, and static plots supported for its Matplotlib extension.
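
The projection, aggregation, and transformation stages listed above can be illustrated with a minimal, standard-library-only Python sketch. This is not Datashader's implementation or API: the function names (project, aggregate, shade), the point-only glyph, the count reduction, and the log-scaled grayscale mapping are all assumptions chosen for illustration. It also shows why the output size depends only on the grid shape and not on the number of records.

```python
import math
import random


def project(x, y, x_range, y_range, width, height):
    """Projection (hypothetical helper): map one point to a (row, col) bin.

    Returns None when the point lies outside the plotting ranges, so a
    record can land in zero bins.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    if not (x0 <= x < x1 and y0 <= y < y1):
        return None
    col = min(int((x - x0) / (x1 - x0) * width), width - 1)
    row = min(int((y - y0) / (y1 - y0) * height), height - 1)
    return row, col


def aggregate(records, x_range, y_range, width, height):
    """Aggregation: reduce all records to a per-bin count on a fixed grid."""
    grid = [[0] * width for _ in range(height)]
    for x, y in records:
        cell = project(x, y, x_range, y_range, width, height)
        if cell is not None:
            row, col = cell
            grid[row][col] += 1
    return grid


def shade(grid):
    """Transformation: map counts to 0-255 gray levels on a log scale."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0] * len(row) for row in grid]
    scale = 255 / math.log1p(peak)
    return [[round(math.log1p(count) * scale) for count in row] for row in grid]


random.seed(0)
small = [(random.random(), random.random()) for _ in range(1_000)]
large = [(random.random(), random.random()) for _ in range(100_000)]

# The image has the same shape whether there are 1,000 or 100,000 records.
for records in (small, large):
    agg = aggregate(records, (0.0, 1.0), (0.0, 1.0), width=8, height=4)
    img = shade(agg)
    print(len(records), "records ->", len(img), "x", len(img[0]), "image")
```

Because each stage returns an ordinary intermediate value, the aggregate grid can be inspected or recomputed with a different reduction before it is turned into pixels, which is the property the staged design is meant to provide.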

Support: FAQ, User Guide, GitHub, Twitter
Developer: Continuum Analytics, Inc. and contributors
License: BSD License


