Apache Flume
Apache Flume is an open source, scalable, distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS. Flume has a simple and flexible architecture based on streaming data flows; it is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple, extensible data model that allows for online analytic applications, and the system is centrally managed, allowing for intelligent dynamic management.
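As an illustration of such a streaming data flow, a minimal agent configuration wires a source to HDFS through a durable channel. This is a sketch in Flume's standard properties format; the agent name, host, port, and paths below are hypothetical:

```properties
# Name the components of agent a1
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listen for newline-separated events on a TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: durable File Channel backed by the local file system
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data

# Sink: write events into HDFS, bucketed by date
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true

# Bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

An agent with this configuration would be started with the `flume-ng agent` command, naming the agent (`a1`) and the configuration file on the command line.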
Features include:
- Complex flows:
  - multi-hop flows, where events travel through multiple agents before reaching the final destination
  - fan-in and fan-out flows
  - contextual routing
  - backup routes (failover) for failed hops
- Channel-based transactions to guarantee reliable message delivery
- A high-performance, durable File Channel backed by the local file system; events are staged in the channel, which manages recovery from failure
- ElasticSearch Sink
- SpoolDirectory Source and client
- Regex Extractor Interceptor
- Load Balancing RPC client
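The contextual routing and backup-route features above are expressed in the same configuration format as the rest of an agent. As a sketch, a multiplexing channel selector can fan events out to different channels based on an event header, and a failover sink group can define a backup route; the header name and the component names below are illustrative:

```properties
# Contextual routing: choose the channel from the "datacenter" header
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = datacenter
a1.sources.r1.selector.mapping.us-east = c1
a1.sources.r1.selector.mapping.eu-west = c2
a1.sources.r1.selector.default = c1
a1.sources.r1.channels = c1 c2

# Backup route: a failover sink group prefers k1 and
# falls back to k2 if k1 fails
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 5
```

Replacing the `failover` processor type with `load_balance` distributes events across the sinks in the group instead of preferring one.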
Last Updated Sunday, December 23 2012 @ 02:29 AM EST