Kudu is a distributed data storage engine

Apache Kudu is a distributed data storage engine that makes fast analytics on fast and changing data easy.

Kudu runs on commodity hardware, is horizontally scalable, and supports highly available operation.

This is free and open source software.

Key Features

Fast processing of OLAP workloads.
Strong but flexible consistency model, allowing you to choose consistency requirements on a per-request basis, including the option for strict-serializable consistency.
Structured data model.
Strong performance for running sequential and random workloads simultaneously.
Tight integration with Apache Impala, making it a good, mutable alternative to using HDFS with Apache Parquet.
Integration with Apache NiFi and Apache Spark.
Integration with Hive Metastore (HMS) and Apache Ranger to provide fine-grained authorization and access control.
Authenticated and encrypted RPC communication.
High availability: Tablet Servers and Masters use the Raft Consensus Algorithm, which ensures that as long as more than half the total number of tablet replicas is available, the tablet is available for reads and writes. For instance, if 2 out of 3 replicas (or 3 out of 5 replicas, etc.) are available, the tablet is available. Reads can be serviced by read-only follower tablet replicas, even in the event of a leader replica’s failure.
Automatic fault detection and self-healing: to keep data highly available, the system detects failed tablet replicas and re-replicates data from available ones, so failed replicas are automatically replaced when enough Tablet Servers are available in the cluster.
Location awareness (a.k.a. rack awareness), which keeps the system available in case of correlated failures and allows Kudu clusters to span multiple availability zones.
Logical backup (full and incremental) and restore.
Multi-row transactions (only for INSERT/INSERT_IGNORE operations as of the Kudu 1.15 release).
Easy to administer and manage.
Cross-platform support – runs under Linux and macOS.
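
The majority rule in the high availability feature above comes down to simple arithmetic. The following is a minimal sketch of that arithmetic only. The function names are hypothetical and are not part of any Kudu API, and the sketch does not model Raft itself (leader election, log replication and so on).

```python
def majority(total_replicas: int) -> int:
    """Smallest replica count that is more than half of the total."""
    return total_replicas // 2 + 1


def tablet_available(total_replicas: int, healthy_replicas: int) -> bool:
    """True when the healthy replicas still form a majority."""
    return healthy_replicas >= majority(total_replicas)


def tolerated_failures(total_replicas: int) -> int:
    """Number of replicas that can fail while a majority remains."""
    return total_replicas - majority(total_replicas)


if __name__ == "__main__":
    # Hypothetical replication factors, chosen only for illustration.
    for n in (3, 5, 7):
        print(f"replicas={n} majority={majority(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

Under these assumptions, an odd replica count gives the best ratio of tolerated failures to stored copies: going from 3 to 4 replicas raises the required majority from 2 to 3 without tolerating any additional failure.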

Website: kudu.apache.org
Developer: The Apache Software Foundation
License: Apache License 2.0

Kudu is written in C++.

Related Software

Column-Oriented Databases
MariaDB ColumnStore	Uses a massively parallel distributed data architecture
DuckDB	In-process SQL OLAP database management system
Druid	High performance, real-time analytics database
Databend	Cloud data warehouse
ClickHouse	Real-time analytics database management system
InfluxDB Core	Scalable datastore for metrics, events, and real-time analytics
Doris	Modern data warehouse for real-time analytics
VictoriaMetrics	Scalable solution for monitoring and managing time series data
StarRocks	High-performance analytical database
MonetDB	High performance relational database system for analytics
Kudu	Distributed data storage engine
QuestDB	High-performance time-series database
Pinot	Real-time analytics platform
IoTDB	High-performance time-series database
GreptimeDB	Cloud-native database
CrateDB	Distributed SQL database management
