The Spark Notebook is an open source notebook aimed at enterprise environments, providing data scientists and data engineers with an interactive web-based editor combining Scala code, SQL queries, Markup and JavaScript in a collaborative manner to explore, analyse and learn from massive data sets.
The notebook offers an easy way to prototype Scala/Spark code with SQL queries to analyse data.
Spark is a general-purpose framework for cluster computing.
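As a rough illustration of the kind of Scala/Spark prototyping with SQL described above, the sketch below registers a small in-memory dataset as a temporary view and aggregates it with a SQL query. It is not taken from the Spark Notebook project: the object name `SqlPrototypeSketch`, the view name `events`, the sample rows and the `local[*]` master are all assumptions made for illustration, and the Spark SQL library is assumed to be on the classpath. Inside a notebook, the session would normally be the one belonging to that notebook rather than one built by hand.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of Spark Notebook itself.
object SqlPrototypeSketch {

  // Sum the `value` column per `key` by running a SQL query over a temporary view.
  def totalsByKey(spark: SparkSession, rows: Seq[(String, Int)]): Map[String, Long] = {
    import spark.implicits._ // enables .toDF on local Scala collections

    val df = rows.toDF("key", "value")
    df.createOrReplaceTempView("events") // illustrative view name

    spark.sql("SELECT key, SUM(value) AS total FROM events GROUP BY key")
      .collect()
      .map(row => row.getString(0) -> row.getLong(1)) // SUM over an int column yields a long
      .toMap
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only; a real deployment would point at a cluster.
    val spark = SparkSession.builder()
      .appName("sql-prototype-sketch")
      .master("local[*]")
      .getOrCreate()

    try {
      val totals = totalsByKey(spark, Seq(("a", 1), ("b", 2), ("a", 3)))
      totals.toSeq.sorted.foreach { case (key, total) => println(s"$key -> $total") }
    } finally {
      spark.stop()
    }
  }
}
```

Keeping the query in a small function that takes the session as a parameter makes it easy to move the same logic from an exploratory cell into a tested job later on.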
Key Features
- Multiple Spark Context Support – each notebook spawns a new JVM with its own SparkSession instance.
- Perform reproducible analysis with Scala, Apache Spark, and the Big Data ecosystem.
- Visualize the output of SQL queries directly as tables and charts.
- Focused on Scala, SQL and Apache Spark.
- Supports exclusively the Scala programming language, the Unpredicted Lingua Franca for Data Science, and extensively exploits the JVM ecosystem of libraries to drive a smooth evolution of data-driven software from exploration to production.
- Components in the Spark Notebook are dynamic and reactive.
- Run Spark Notebook from a secured YARN cluster, Amazon EMR, and Mesosphere Data Center Operating System (DCOS).
- Various widgets to enrich the interaction with the data:
  - HTML widgets – for displaying and interacting with data elements. They are useful for showing samples of large datasets, choosing specific data points, or showing continuous updates of streaming data.
  - Graphical widgets – customizable charts ranging from simple line and bar charts to advanced pivot tables and geo-charts.
- Metadata-driven configuration. The metadata in each notebook lets you configure many runtime aspects of the Spark Notebook: the cluster it connects to, the specific library dependencies it needs, JVM parameters, and variables that describe your environment.
This software requires Java 7 or higher.
Website: spark-notebook.io
Support: Quick Start Guide, GitHub Code Repository, Gitter, Mailing List
Developer: Andy Petrella and contributors
License: Apache License 2.0

Spark Notebook is written in JavaScript. Learn JavaScript with our recommended free books and free tutorials.
Related Software
| Notebook software | |
|---|---|
| JupyterLab | The next generation user interface for Project Jupyter |
| RStudio | Integrated development environment (IDE) for R |
| Jupyter Notebook | Web-based notebook environment for interactive computing |
| Positron | Next-generation data science IDE |
| marimo | Reactive Python notebook |
| Apache Zeppelin | Multi-purpose notebook |
| IPython | Rich architecture for interactive computing |
| Polynote | Experimental polyglot notebook environment |
| nteract | Notebooks on your Desktop |
| Pluto | Simple reactive notebooks for Julia |
| Pretzel | Billed as a modern replacement for Jupyter Notebooks |
| Spark Notebook | Interactive and reactive data science using Scala and Spark |
| BeakerX | Kernels and extensions to the Jupyter interactive computing environment |
Read our verdict in the software roundup.

