Log Analyzers

Apache Flume – log data aggregation and more

Apache Flume is an open source, scalable, distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms.

The system is centrally managed and allows for intelligent dynamic management. It uses a simple extensible data model that allows for online analytic applications.

The main goal of Apache Flume is to deliver data from applications to Apache Hadoop’s HDFS.
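
As a rough illustration of the data model and streaming flow described above, the following Python sketch models an event as a byte payload plus optional string headers, and streams events from a source to a sink one at a time. It is not Flume's API (Flume itself is a Java service): the names `Event`, `tail_source`, and `collecting_sink` are hypothetical and exist only for this example, and the sink keeps events in a list where a real deployment would write to HDFS.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List


@dataclass
class Event:
    """Illustrative event: an opaque byte payload plus string headers."""
    body: bytes
    headers: Dict[str, str] = field(default_factory=dict)


def tail_source(lines: Iterable[str], host: str) -> Iterator[Event]:
    """Hypothetical source: wraps each log line in an Event as it arrives."""
    for line in lines:
        yield Event(body=line.rstrip("\n").encode("utf-8"), headers={"host": host})


def collecting_sink(events: Iterable[Event]) -> List[Event]:
    """Hypothetical sink: stands in for a destination such as HDFS."""
    delivered = []
    for event in events:
        delivered.append(event)
    return delivered


# Events are produced lazily by the generator, so the sink consumes a stream.
log_lines = ["GET /index.html 200\n", "GET /missing 404\n"]
delivered = collecting_sink(tail_source(log_lines, host="web01"))
for event in delivered:
    print(event.headers["host"], event.body.decode("utf-8"))
```

Because the headers are free-form key/value pairs, extra metadata can be attached to an event without touching its payload, which is one plausible reading of "extensible" here.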

Key Features

  • Complex flows:
    • build multi-hop flows where events travel through multiple agents before reaching the final destination.
    • fan-in and fan-out flows.
    • contextual routing.
    • backup routes (fail-over) for failed hops.
  • Channel-based transactions to guarantee reliable message delivery.
  • Supports a durable file channel which is backed by the local file system. Events are staged in the channel, which manages recovery from failure.
  • High performance persistent channel – the File Channel.
  • ElasticSearch Sink.
  • Spooling Directory Source and Client.
  • Regex Extractor Interceptor.
  • Load Balancing RPC client.
  • Hive Sink based on the new Hive Streaming support.
  • End to End authentication in Flume.
  • Simple regex search-and-replace interceptor.
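
The channel-based transactions mentioned in the list above can be pictured with a small Python sketch: events staged in a channel are removed only once the next hop has accepted them, and are put back if delivery fails. This is a simplified in-memory model under stated assumptions, not Flume's Java API or its file channel; `SketchChannel`, `put_batch`, and `take_batch` are hypothetical names chosen for the example.

```python
from collections import deque
from typing import Callable, Deque, List


class SketchChannel:
    """Hypothetical in-memory channel; a durable channel would persist to disk."""

    def __init__(self) -> None:
        self._queue: Deque[bytes] = deque()

    def put_batch(self, events: List[bytes]) -> None:
        # Stage a batch of events in the channel.
        self._queue.extend(events)

    def take_batch(self, size: int, deliver: Callable[[List[bytes]], None]) -> bool:
        """Take up to `size` events and hand them to `deliver`.

        The events leave the channel only if `deliver` returns without
        raising (commit). On failure they are put back in their original
        order (rollback) so a later attempt can retry them.
        """
        batch = [self._queue.popleft() for _ in range(min(size, len(self._queue)))]
        try:
            deliver(batch)
        except Exception:
            # extendleft reverses its input, so reverse first to keep order.
            self._queue.extendleft(reversed(batch))
            return False
        return True

    def __len__(self) -> int:
        return len(self._queue)


channel = SketchChannel()
channel.put_batch([b"event-1", b"event-2", b"event-3"])

stored: List[bytes] = []


def failing_sink(batch: List[bytes]) -> None:
    raise IOError("next hop unavailable")


def working_sink(batch: List[bytes]) -> None:
    stored.extend(batch)


first_try = channel.take_batch(2, failing_sink)   # rolled back
remaining_after_failure = len(channel)            # nothing was lost
second_try = channel.take_batch(2, working_sink)  # committed
print(first_try, remaining_after_failure, second_try, stored)
```

The point of the design is that a failed hop never loses events: they stay staged in the channel until some delivery attempt succeeds.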

Website: flume.apache.org
Support: User Guide
Developer: The Apache Software Foundation
License: Apache License 2.0

Apache Flume is written in Java.


Related Software

Log Analyzers
  • Kibana – Browser based interface for logstash and ElasticSearch
  • logstash – Log processing, search, and analytics
  • OpenObserve – Cloud-native observability platform
  • GoAccess – Real-time web log analyzer and interactive viewer
  • Fluentd – Data collector for unified logging layer
  • Loki – Horizontally-scalable, highly-available, multi-tenant log aggregation system
  • Graylog2 – Log management solution implementation storing logs in ElasticSearch
  • Graphite – Enterprise scalable realtime graphing
  • SigNoz – Monitor your applications and troubleshoot problems
  • Apache Flume – Delivers data from applications to Apache Hadoop's HDFS
  • OpenTSDB – Scalable, distributed Time Series Database
  • VictoriaLogs – High-performance log database designed to ingest, store, and query log data
  • Scribe – Server for aggregating log data that is streamed in real time from clients
  • LogoRRR – Cross-platform log analysis tool
  • Chukwa – Hadoop sub-project devoted to large-scale log collection and analysis
