Data Science

Spark Notebook – Interactive and Reactive Data Science using Scala and Spark

The Spark Notebook is an open source notebook aimed at enterprise environments, providing data scientists and data engineers with an interactive web-based editor combining Scala code, SQL queries, Markup and JavaScript in a collaborative manner to explore, analyse and learn from massive data sets.

The notebook offers an easy way to prototype Scala/Spark code with SQL queries to analyse data.

Spark is a general-purpose framework for cluster computing.
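
To give a feel for prototyping Scala/Spark code alongside SQL, here is a minimal sketch of what such a snippet might look like. It is not taken from the Spark Notebook project: the `Sale` case class, the `sales` view name, and the sample rows are hypothetical, and it builds its own local `SparkSession`, whereas a notebook would normally supply one.

```scala
import org.apache.spark.sql.SparkSession

object SqlPrototype {
  // Hypothetical record type, used only for this illustration.
  final case class Sale(region: String, amount: Double)

  // Sums the sample amounts per region with a SQL query.
  def totalsByRegion(spark: SparkSession): Map[String, Double] = {
    import spark.implicits._

    // Small in-memory dataset standing in for a real data source.
    val sales = Seq(
      Sale("north", 120.0),
      Sale("south", 80.0),
      Sale("north", 30.0)
    ).toDS()

    // Register the dataset as a temporary view so SQL can reference it.
    sales.createOrReplaceTempView("sales")

    spark
      .sql("SELECT region, SUM(amount) AS total FROM sales GROUP BY region")
      .as[(String, Double)]
      .collect()
      .toMap
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (an assumption of this sketch).
    val spark = SparkSession.builder()
      .appName("sql-prototype")
      .master("local[*]")
      .getOrCreate()

    totalsByRegion(spark).foreach { case (region, total) =>
      println(s"$region: $total")
    }

    spark.stop()
  }
}
```

The same pattern (build a Dataset in Scala, expose it as a view, query it with SQL) applies when the data comes from files or a cluster rather than an in-memory sequence.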

Key Features

  • Multiple Spark Context Support – each notebook spawns a new JVM with its own SparkSession instance.
  • Perform reproducible analysis with Scala, Apache Spark, and the Big Data ecosystem.
  • Visualize the output of SQL queries directly as tables and charts.
  • Focused on Scala, SQL and Apache Spark.
  • Supports the Scala programming language exclusively, the Unpredicted Lingua Franca for Data Science, and extensibly exploits the JVM ecosystem of libraries to drive a smooth evolution of data-driven software from exploration to production.
  • Components in the Spark Notebook are dynamic and reactive.
  • Run Spark Notebook from a secured YARN cluster, Amazon EMR, and Mesosphere Data Center Operating System (DCOS).
  • Various widgets to enrich the interaction with the data.
    • HTML widgets – dedicated to displaying and interacting with data elements. They can be useful for showing samples of large datasets, choosing specific data points, or showing continuous updates of streaming data.
    • Graphical widgets – customizable charts ranging from simple line and bar charts to advanced pivot tables and geo-charts.
  • Metadata-driven configuration. The metadata in the notebook lets you configure many runtime aspects of the Spark Notebook: the cluster it connects to, the library dependencies it needs, JVM parameters, and the variables that describe your environment.

This software requires Java 7 or higher.

Website: spark-notebook.io
Support: Quick Start Guide, GitHub Code Repository, Gitter, Mailing List
Developer: Andy Petrella and contributors
License: Apache License 2.0

Spark Notebook is written in JavaScript. Learn JavaScript with our recommended free books and free tutorials.


Related Software

Notebook software
JupyterLab – The next generation user interface for Project Jupyter
RStudio – Integrated development environment (IDE) for R
Jupyter Notebook – Web-based notebook environment for interactive computing
Positron – Next-generation data science IDE
marimo – Reactive Python notebook
Apache Zeppelin – Multi-purpose notebook
IPython – Rich architecture for interactive computing
Polynote – Experimental polyglot notebook environment
nteract – Notebooks on your Desktop
Pluto – Simple reactive notebooks for Julia
Pretzel – Billed as a modern replacement for Jupyter Notebooks
Spark Notebook – Interactive and reactive data science using Scala and Spark
BeakerX – Kernels and extensions to the Jupyter interactive computing environment

Read our verdict in the software roundup.

