6 Best File Systems for Big Data
Big Data is an all-inclusive term that refers to data sets so
large and complex that they need to be processed by specially designed
hardware and software tools. The data sets typically range from
terabytes to exabytes in size. They are created from a diverse
range of sources, such as sensors that gather climate information and
publicly available material such as magazines, newspapers, and articles.
Other sources of big data include purchase transaction
records, web logs, medical records, military surveillance, video and
image archives, and large-scale e-commerce.
There is a heightened interest in Big Data. Oceans of digital
data are being created from the interaction between individuals,
businesses, and government agencies. There are enormous benefits
open to organisations, provided they effectively identify, access,
filter, analyze, and select parts of this data.
Big Data demands the storage of a massive amount
of data. This makes advanced storage infrastructure a necessity:
the storage solution must be designed to scale
out across multiple servers.
This is the third article in our series identifying the finest
open source software for Big Data. This feature highlights
open source file systems designed to cope with the demands imposed by
Big Data.
Hopefully, there will be something of interest for
anyone who needs to support high-performance data access and offer
consistent access to a common set of data from multiple servers.
Now, let's explore the 6 file systems at hand. For
each title we have compiled its own portal page, a full description
with an in-depth analysis of its features, together with links to
relevant resources and reviews.
| File System | Description |
|---|---|
| Quantcast File System | High-performance, fault-tolerant, distributed file system |
| HDFS | Distributed file system providing high-throughput access |
| Ceph | Unified, distributed storage system |
| Lustre | File system for computer clusters |
| GlusterFS | Scale-out NAS file system |
| PVFS | Designed to scale to petabytes of storage |
Last Updated Friday, April 12 2013 @ 08:41 PM EDT