Learning Cloudera Impala

Learning Cloudera Impala, 9781783281275 (1783281278), Packt Publishing, 2013

Perform interactive, real-time, in-memory analytics on large amounts of data using Cloudera Impala, a massively parallel processing (MPP) engine

Overview

Step-by-step guidance to get you started with Impala on your Hadoop cluster
Manipulate your data rapidly by writing proper SQL statements
Explore the concepts of Impala security, administration, and troubleshooting in detail to maintain your Impala cluster

In Detail

If you have always wanted to crunch billions of rows of raw data on Hadoop in a couple of seconds, then Cloudera Impala is the number one choice for you. Cloudera Impala provides fast, interactive SQL queries directly on your Apache Hadoop data stored in HDFS or HBase. In addition to using the same unified storage platform, Impala also uses the same metadata, SQL syntax (Hive SQL), ODBC driver, and user interface (Hue Beeswax) as Apache Hive. This provides a familiar and unified platform for batch-oriented or real-time queries.

In this practical, example-oriented book, you will learn everything you need to know about Cloudera Impala so that you can get started on your very own project. The book covers Cloudera Impala from installation, administration, and query processing all the way to connectivity with third-party applications. With this book in your hand, you will find yourself empowered to play with your data in Hadoop.

As a reader of this book, you will learn about the origin of Impala and the technology behind it that allows it to run on thousands of machines. You will learn how to install, run, manage, and troubleshoot Impala in your own Hadoop cluster using the step-by-step guidance provided in the book. The book covers the fundamentals of data processing, such as loading data stored in Hadoop into Impala tables and querying data using Impala SQL statements, all with various code illustrations and a real-world example.

The book is written to get you started with Impala by providing rich information so you can understand what Impala is, what it can do for you, and finally how you can use it to achieve your objective.

What you will learn from this book

Understand the various ways of installing Impala in your Hadoop cluster
Use the Impala shell API to interact with Impala components
Utilize Impala Query Language and built-in functions to play with data
Administer and fine-tune Impala for high availability
Identify and troubleshoot problems in a variety of ways
Get acquainted with various input data formats in Hadoop and how to use them with Impala
Comprehend how third-party applications can connect with Impala to provide data visualization and various other enhancements

Approach

This book is an easy-to-follow, step-by-step tutorial where each chapter takes your knowledge to the next level. The book covers practical knowledge with tips to implement this knowledge in real-world scenarios. A chapter with a real-life example is included to help you understand the concepts in full.

Who this book is written for

This book is for those who want to take full advantage of their Hadoop cluster by processing extremely large amounts of raw data in Hadoop at real-time speed. Prior knowledge of Hadoop and some exposure to Hive and MapReduce are expected.
