Big Data is an all-inclusive term for data sets so large and complex that they need to be processed by specially designed hardware and software tools. These data sets typically range from terabytes to exabytes in size. They are created from a diverse range of sources, such as sensors that gather climate information and publicly available material such as magazines, newspapers, and articles. Other sources of big data include purchase transaction records, web logs, medical records, military surveillance, video and image archives, and large-scale e-commerce.
In the past decade, the world of computing has been transformed. Oceans of data are no longer found only in large companies; even some small companies accumulate terabytes of data. Organisations of all sizes therefore have an increased need to handle large amounts of data, and relational databases are stretched to their limits in terms of scalability. Where Big Data is concerned, we need a platform that is scalable and optimized for storing, managing, and querying unstructured data.
An XML database allows data to be stored in the Extensible Markup Language (XML) format. XML is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. Using a native XML database offers the ability to store data and documents without requiring a database schema. Since XML can be used to describe any type of data, it offers a common format for representing both structured and unstructured data. Further, because documents are stored as XML directly, there is no need for a mapping layer to translate them into another storage model. Unlike relational databases, XML databases allow database structures to be modified quickly to meet changing information requirements.
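To make the schema-free idea concrete, here is a minimal sketch in Python. It does not use any of the databases covered in this article; it relies only on the standard library's `xml.etree.ElementTree` module to show two differently shaped documents sitting in one collection and being queried with XPath-style expressions. The sample records and the helper names `build_collection`, `titles`, and `records_with` are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records with different shapes; no schema is declared anywhere.
DOCS = [
    "<record id='1'><title>Climate sensor log</title>"
    "<reading unit='C'>21.5</reading></record>",
    "<record id='2'><title>Press article</title>"
    "<body>Free text with <em>inline</em> markup.</body></record>",
]


def build_collection(docs):
    """Parse each XML string and attach it to one in-memory root element."""
    root = ET.Element("collection")
    for doc in docs:
        root.append(ET.fromstring(doc))
    return root


def titles(collection):
    """Return the title of every record, whatever else the record contains."""
    return [t.text for t in collection.findall("./record/title")]


def records_with(collection, child_tag):
    """Return the ids of records that have a direct child named child_tag."""
    return [r.get("id") for r in collection.findall(f"./record[{child_tag}]")]


collection = build_collection(DOCS)
print(titles(collection))                   # titles of both records
print(records_with(collection, "reading"))  # only the sensor record
```

A native XML database applies the same principle at scale, typically adding persistent storage, indexing, and a full XQuery processor, none of which this in-memory sketch attempts.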
This article highlights the finest native XML databases that are ideal for storing large amounts of textual or binary data and documents. Only free and open source software is eligible for inclusion here.
The table below summarises each database.
|Native XML Databases|
|eXist-db|Feature rich native XML database|
|BaseX|XML Database engine and XPath/XQuery 3.0 Processor|
|Berkeley DB|Family of open source, embeddable databases|
|Sedna|Provides a full range of core database services|
This article has been revamped in line with our recent announcement.