9 Key Value Stores for Big Data
Big Data is an all-inclusive term that refers to data sets so
large and complex that they need to be processed by specially designed
hardware and software tools. The data sets typically range from
terabytes to exabytes in size. They are created from a diverse
range of sources, including sensors that gather climate information and publicly
available publications such as magazines, newspapers and articles. Other
sources of big data include purchase transaction
records, web logs, medical records, military surveillance, video and
image archives, and large-scale e-commerce.
In the past decade, the world of computing has been
transformed. Oceans of data are now not only found in large companies;
even some small companies accumulate terabytes of data. Organisations
of all sizes therefore have an increased need to handle large amounts
of data, and relational databases are stretched to their limits in
terms of scalability. What is needed is a solution that scales
well and offers higher availability.
Serving systems cannot bulk load massive
immutable data sets without affecting serving performance. Performance
is impaired because index creation and modification consume
CPU and memory resources that are shared with
request serving.
One solution is a key value store. This is one of the
non-relational database models, alongside others
such as graph and document-oriented databases. Key value stores
allow the application to store its data in a
schema-less way. The data can be stored as a datatype of a
programming language or as an object, which removes the need for a
fixed data model. The term key value store refers to a general concept of
database in which entities (values) are indexed using a unique key.
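To make the idea concrete, here is a minimal sketch of the concept in Python. It is an illustration only and does not reflect how any of the stores listed in this article is implemented; the class name TinyKeyValueStore, its methods and the example keys are all hypothetical. Values of different shapes are stored side by side, each one reachable only through its unique key.

```python
import json
from typing import Any, Dict, Optional


class TinyKeyValueStore:
    """Hypothetical in-memory store: each value is indexed by one unique key."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: Any) -> None:
        # The value is serialised as-is; the store imposes no schema on it.
        self._data[key] = json.dumps(value).encode("utf-8")

    def get(self, key: str) -> Optional[Any]:
        # Lookup is by key only; there is no query over the values themselves.
        raw = self._data.get(key)
        return None if raw is None else json.loads(raw.decode("utf-8"))

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed.
        return self._data.pop(key, None) is not None


store = TinyKeyValueStore()
# Two values with entirely different structures live in the same store.
store.put("user:1", {"name": "Ada", "langs": ["en", "fr"]})
store.put("page:home:hits", 1024)
```

Real key value stores add persistence, replication and partitioning on top of this basic put/get/delete interface, but the access pattern stays the same: the application supplies the key, and the store treats the value as opaque.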
This is the fourth article in our series identifying the
finest open source software for Big Data. The earlier articles covered
Search Engines, Data Analysis tools and File Systems. This feature
highlights the finest open source key value stores. Hopefully, there will be
something of interest for anyone who needs to store millions of data
records, to help in statistical or real-time analysis.
Now, let's explore the 9 key value stores at hand. For
each title we have compiled its own portal page, a full description
with an in-depth analysis of its features, together with links to
relevant resources and reviews.
| Key Value Store | Description |
|---|---|
| Aerospike CE | Real-time NoSQL database and key-value store |
| LevelDB | Fast and lightweight key/value database library by Google |
| Scalaris | Distributed transactional key-value store |
| Project Voldemort | Distributed data store that is designed as a key-value store used by LinkedIn |
| HyperDex | Distributed, searchable, and consistent key-value store |
| Berkeley DB | Family of open source, embeddable databases |
| Apache Accumulo | Based on Google's BigTable design |
| Redis | Advanced key-value store in a similar vein to memcache |
| Apache Cassandra | Distributed database management system |
Last Updated Sunday, April 14 2013 @ 11:24 AM EDT