LinuxLinks.com
Apache Drill

Apache Drill is an open source distributed system for interactive analysis of large-scale datasets.

Drill is similar to Google’s Dremel, with the additional flexibility needed to support a broader range of query languages, data formats and data sources. It is designed to process nested data efficiently. A design goal is to scale to 10,000 servers or more and to process petabytes of data and trillions of records in seconds.

Many organizations need to run data-intensive applications, including batch processing, stream processing and interactive analysis.

Price: Free to download
Size: 1.4MB
License: Apache License 2.0
Developer: Apache Software Foundation
Website: incubator.apache.org/drill
Support: Wiki, Mailing Lists, Apache Drill User, Crunching Big Data with Google BigQuery + Introducing Apache Drill
Selected Reviews: eWeek, Wikibon, blogspot

Features include:

  • Consists of four key components/layers:
      • Query languages: This layer is responsible for parsing the user's query and constructing an execution plan. The initial goal is to support DrQL, the SQL-like language used by Dremel. However, Drill is designed to support other languages and programming models, such as the Mongo Query Language, Cascading and Plume.
      • Low-latency distributed execution engine: This layer is responsible for executing the physical plan. It provides the scalability and fault tolerance needed to efficiently query petabytes of data on 10,000 servers. Drill's execution engine is based on research in distributed execution engines (e.g. Dremel, Dryad, Hyracks, CIEL, Stratosphere) and columnar storage, and can be extended with additional operators and connectors.
      • Nested data formats: This layer is responsible for supporting various data formats. The initial goal is to support the column-based format used by Dremel. Drill is designed to support schema-based formats such as Protocol Buffers/Dremel, Avro/AVRO-806/Trevni and CSV, and schema-less formats such as JSON, BSON or YAML. In addition, it is designed to support column-based formats such as Dremel, AVRO-806/Trevni and RCFile, and row-based formats such as Protocol Buffers, Avro, JSON, BSON and CSV. A particular distinction of Drill is that the execution engine is flexible enough to support both column-based and row-based processing. This matters because column-based processing can be much more efficient when the data is stored in a column-based format, but many large data assets are stored in a row-based format that would otherwise require conversion before use.
      • Scalable data sources: This layer is responsible for supporting various data sources.
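
The difference between row-based and column-based processing described above can be illustrated with a small sketch. This is not Drill code and does not use any Drill API; the sample records and the helper names (ROWS, to_columns, sum_rows, sum_column) are hypothetical and exist only for illustration. It assumes every record carries the same fields.

```python
from typing import Any

# Hypothetical sample records in a row-based layout (not taken from Drill).
ROWS = [
    {"name": "a", "country": "US", "clicks": 3},
    {"name": "b", "country": "DE", "clicks": 5},
    {"name": "c", "country": "US", "clicks": 7},
]


def to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-based records into a column-based layout (one list per field).

    Assumes every record has the same set of fields.
    """
    columns: dict[str, list[Any]] = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def sum_rows(rows: list[dict[str, Any]], field: str) -> int:
    """Row-based scan: every whole record is visited to read a single field."""
    return sum(row[field] for row in rows)


def sum_column(columns: dict[str, list[Any]], field: str) -> int:
    """Column-based scan: only the requested field's values are touched."""
    return sum(columns[field])


COLUMNS = to_columns(ROWS)
print(sum_rows(ROWS, "clicks"), sum_column(COLUMNS, "clicks"))
```

Both scans return the same total, but the column-based one reads only the "clicks" values. The to_columns step stands in for the conversion cost that row-stored data would otherwise incur before column-based processing.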


Last Updated Monday, April 20 2015 @ 02:25 PM EDT


    Copyright 2009 LinuxLinks.com - All rights reserved