Pro Apache Hadoop, Second Edition brings you up to speed on Hadoop – the framework of big data. Revised for Hadoop 2.0, the book covers the very latest developments such as YARN (aka MapReduce 2.0), new HDFS high-availability features, and increased scalability in the form of HDFS Federation. All the old content has been revised too, giving the latest on the ins and outs of MapReduce, cluster design, the Hadoop Distributed File System, and more.
This book covers everything you need to build your first Hadoop cluster and begin analyzing and deriving value from your business and scientific data. Learn to solve big-data problems the MapReduce way, by breaking a big problem into chunks and creating small-scale solutions that can be distributed across thousands of nodes to analyze large data volumes in a short amount of wall-clock time. Learn how to let Hadoop handle distributing and parallelizing your software, so that you can focus on the code.
Covers all that is new in Hadoop 2.0
Written by a professional involved in Hadoop since day one
Takes you quickly to the seasoned pro level on the hottest cloud-computing framework
Over the past few years, there has been a fundamental shift in data storage, management, and processing. Companies are storing more data from more sources in more formats than ever before. This isn’t just about being a “data packrat” but rather building products, features, and intelligence predicated on knowing more about...
Learning Spark: Lightning-Fast Big Data Analysis
Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java,...
Apache Hive Essentials
Immerse yourself in a fantastic journey to discover the attributes of big data by using Hive
About This Book
Discover how Hive can coexist and work with other tools in the Hadoop ecosystem to create big data solutions
Grasp the skills needed, learn the best practices, and avoid the...
Splunk Operational Intelligence Cookbook
Over 70 practical recipes to gain operational data intelligence with Splunk Enterprise
About This Book
Learn how to use Splunk to effectively gather, analyze, and report on the operational data across your environment
Expedite your operational intelligence reporting, be empowered to...