Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams.
Flink has been designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale.
Flink is free and open source software.
Key Features
- A streaming-first runtime that supports both batch processing and data streaming programs.
- Elegant and fluent APIs in Java and Scala.
- A runtime that supports very high throughput and low event latency at the same time.
- Support for event time and out-of-order processing in the DataStream API, based on the Dataflow Model.
- Flexible windowing (time, count, sessions, custom triggers) across different time semantics (event time, processing time).
- Fault tolerance with exactly-once processing guarantees.
- Natural back-pressure in streaming programs.
- Libraries for Graph processing (batch), Machine Learning (batch), and Complex Event Processing (streaming).
- Built-in support for iterative programs (BSP) in the DataSet (batch) API.
- Custom memory management for efficient and robust switching between in-memory and out-of-core data processing algorithms.
- Compatibility layers for Apache Hadoop MapReduce.
- Integration with YARN, HDFS, HBase, and other components of the Apache Hadoop ecosystem.
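The event-time and windowing features listed above can be easier to picture with a small example. The sketch below is plain Python, not Flink's API, and only approximates the idea: events carry their own timestamps, may arrive out of order, and are counted in tumbling event-time windows that close once a watermark passes the window end. The names `count_per_window`, `WINDOW_SIZE_MS`, and `MAX_OUT_OF_ORDERNESS_MS` are hypothetical, and the watermark rule (largest timestamp seen minus a fixed bound) is a simplifying assumption rather than a description of Flink's exact behavior.

```python
from collections import defaultdict

# Hypothetical settings chosen for illustration only.
WINDOW_SIZE_MS = 10_000          # 10-second tumbling windows
MAX_OUT_OF_ORDERNESS_MS = 3_000  # assumed bound on how late an event may arrive


def window_start(event_time_ms):
    """Return the start of the tumbling window containing the timestamp."""
    return event_time_ms - (event_time_ms % WINDOW_SIZE_MS)


def count_per_window(events):
    """Count events per (key, window) using event time and a simple watermark.

    `events` is an iterable of (key, event_time_ms) pairs in arrival order,
    which may differ from event-time order.
    Returns (results, dropped): `results` lists (key, window_start, count)
    in the order windows were emitted; `dropped` lists events that arrived
    after their window had already closed.
    """
    open_windows = defaultdict(int)  # (key, window_start) -> running count
    results = []
    dropped = []
    max_seen = None
    watermark = float("-inf")

    for key, ts in events:
        start = window_start(ts)
        if start + WINDOW_SIZE_MS <= watermark:
            # The window for this event has already been emitted.
            dropped.append((key, ts))
        else:
            open_windows[(key, start)] += 1

        # Advance the watermark: we assume no event older than this arrives.
        max_seen = ts if max_seen is None else max(max_seen, ts)
        watermark = max_seen - MAX_OUT_OF_ORDERNESS_MS

        # Emit every open window whose end the watermark has passed.
        closed = sorted(
            (w for w in open_windows if w[1] + WINDOW_SIZE_MS <= watermark),
            key=lambda w: (w[1], w[0]),
        )
        for k, s in closed:
            results.append((k, s, open_windows.pop((k, s))))

    # Bounded input has ended: flush whatever is still open.
    for k, s in sorted(open_windows, key=lambda w: (w[1], w[0])):
        results.append((k, s, open_windows[(k, s)]))
    return results, dropped
```

With the arrival sequence `[("a", 1000), ("a", 4000), ("a", 12000), ("a", 9000), ("a", 14000), ("a", 2000), ("b", 21000)]`, the event at 9000 arrives after the one at 12000 but is still counted in the first window, because the watermark has not yet passed 10000. The event at 2000 arrives after that window has closed and is reported as dropped. A real Flink job would express the same idea through the DataStream API's watermark strategies and window assigners rather than a hand-written loop.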
Website: flink.apache.org
Support: GitHub Code Repository
Developer: Apache Software Foundation
License: Apache License 2.0
Apache Flink is written in Java.
Related Software
| Data Analysis Tool | Description |
|---|---|
| Hadoop | Distributed processing of large data sets across clusters of computers |
| Storm | Distributed and fault-tolerant realtime computation |
| Drill | Distributed system for interactive analysis of large-scale datasets |
| Flink | Framework and distributed processing engine |
| Spark | Unified analytics engine for large-scale data processing |
| Pentaho | Enterprise reporting, analysis, dashboard, data mining, workflow and more |
| HPCC Systems | Designed for the enterprise to resolve Big Data challenges |
| RapidMiner | Knowledge discovery in databases, machine learning, and data mining |

