
Apache Flink – framework and distributed processing engine

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams.

Flink is designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale.

Flink is free and open source software.

Key Features

  • A streaming-first runtime that supports both batch processing and data streaming programs.
  • Elegant and fluent APIs in Java and Scala.
  • A runtime that supports very high throughput and low event latency at the same time.
  • Support for event time and out-of-order processing in the DataStream API, based on the Dataflow Model.
  • Flexible windowing (time, count, sessions, custom triggers) across different time semantics (event time, processing time).
  • Fault tolerance with exactly-once processing guarantees.
  • Natural back-pressure in streaming programs.
  • Libraries for Graph processing (batch), Machine Learning (batch), and Complex Event Processing (streaming).
  • Built-in support for iterative programs (bulk synchronous parallel, BSP) in the DataSet (batch) API.
  • Custom memory management for efficient and robust switching between in-memory and out-of-core data processing algorithms.
  • Compatibility layers for Apache Hadoop MapReduce.
  • Integration with YARN, HDFS, HBase, and other components of the Apache Hadoop ecosystem.
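
The event-time and windowing features listed above can be illustrated with a small, library-free sketch. It is written in plain Python and does not use Flink's API; the function name `tumbling_event_time_counts` and the parameters `window_size` and `max_out_of_orderness` are hypothetical names chosen for this illustration. The sketch assumes a simple watermark (the largest event time seen so far minus an allowed lateness) and fixed-size tumbling windows that emit their counts once the watermark passes the window end.

```python
from collections import defaultdict


def tumbling_event_time_counts(events, window_size, max_out_of_orderness):
    """Count events per key in tumbling event-time windows.

    events: iterable of (event_time, key) pairs in ARRIVAL order, which
            may differ from event-time order.
    Returns (results, late): results holds (window_start, key, count)
    tuples in the order the windows fire; late holds events that arrived
    after their window had already fired.
    """
    open_windows = defaultdict(lambda: defaultdict(int))  # start -> key -> count
    results, late = [], []
    watermark = float("-inf")

    def fire_ready_windows():
        # A window [start, start + window_size) fires once the watermark
        # reaches its end, i.e. no earlier events are expected any more.
        for start in sorted(open_windows):
            if start + window_size <= watermark:
                for key, count in sorted(open_windows.pop(start).items()):
                    results.append((start, key, count))

    for event_time, key in events:
        start = event_time - (event_time % window_size)
        if start + window_size <= watermark:
            late.append((event_time, key))  # window already fired
            continue
        open_windows[start][key] += 1
        # Assumed watermark rule: trail the largest event time seen so far.
        watermark = max(watermark, event_time - max_out_of_orderness)
        fire_ready_windows()

    # Bounded input has ended: flush every window that is still open.
    watermark = float("inf")
    fire_ready_windows()
    return results, late


# Events arrive out of order: (3, "a") and (9, "a") come after (12, "b").
arrivals = [(1, "a"), (4, "a"), (12, "b"), (3, "a"), (9, "a"), (25, "b"), (8, "a")]
results, late = tumbling_event_time_counts(arrivals, window_size=10, max_out_of_orderness=5)
print(results)  # [(0, 'a', 4), (10, 'b', 1), (20, 'b', 1)]
print(late)     # [(8, 'a')]
```

The out-of-order events with times 3 and 9 are still counted in the first window because the watermark had not yet passed its end, while the event with time 8 arrives after that window has fired and is reported as late. A real stream processor handles the same trade-off between completeness and latency with considerably more machinery (state backends, checkpoints, parallel operators).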

Website: flink.apache.org
Support: GitHub Code Repository
Developer: Apache Software Foundation
License: Apache License 2.0

Apache Flink is written in Java.


Related Software

Data Analysis Tools
  • Hadoop: Distributed processing of large data sets across clusters of computers
  • Storm: Distributed and fault-tolerant realtime computation
  • Drill: Distributed system for interactive analysis of large-scale datasets
  • Flink: Framework and distributed processing engine
  • Spark: Unified analytics engine for large-scale data processing
  • Pentaho: Enterprise reporting, analysis, dashboard, data mining, workflow and more
  • HPCC Systems: Designed for the enterprise to resolve Big Data challenges
  • Rapid Miner: Knowledge discovery in databases, machine learning, and data mining


