Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset

ISBN 9781484200957 (1484200950), Apress, 2014

Many corporations are finding that the size of their data sets is outgrowing the capability of their systems to store and process them. The data is becoming too big to manage and use with traditional tools. The solution: implementing a big data system.

As Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset shows, Apache Hadoop offers a scalable, fault-tolerant system for storing and processing data in parallel. It has a rich toolset that allows for storage (Hadoop), configuration (YARN and ZooKeeper), collection (Nutch and Solr), processing (Storm, Pig, and MapReduce), scheduling (Oozie), moving (Sqoop and Avro), monitoring (Chukwa, Ambari, and Hue), testing (Bigtop), and analysis (Hive).

The problem is that the Internet offers IT pros wading into big data many versions of the truth and some outright falsehoods born of ignorance. What is needed is a book just like this one: a wide-ranging but easily understood set of instructions to explain where to get Hadoop tools, what they can do, how to install them, how to configure them, how to integrate them, and how to use them successfully. And you need an expert who has worked in this area for a decade—someone just like author and big data expert Mike Frampton.

Big Data Made Easy approaches the problem of managing massive data sets from a systems perspective. It explains the roles on each project (such as architect and tester) and shows how the Hadoop toolset can be used at each system stage. Through numerous examples, it explains in an easily understood manner how to use each tool. The book also explains the sliding scale of tools available depending on data size, and when and how to use them. Big Data Made Easy shows developers and architects, as well as testers and project managers, how to:

Store big data
Configure big data
Process big data
Schedule processes
Move data among SQL and NoSQL systems
Monitor data
Perform big data analytics
Report on big data processes and projects
Test big data systems

Big Data Made Easy also explains the best part: this toolset is free. Anyone can download it and, with the help of this book, start to use it within a day. With the skills this book teaches, you will add immediate value to your company or client, and to your career.

What you'll learn

How to install and employ Hadoop
How to install and use Hadoop-related tools like Hive, Storm, Pig, Solr, Oozie, Ambari, and many others
How to set up and test a big data system
How to scale the system for the amount of data at hand and the data you expect to accumulate
How those who have spent their careers in the SQL database world can apply their skills to building big data systems

Who this book is for

This book is for developers, architects, IT project managers, database administrators, and others charged with developing or supporting a big data system. It is also for a general IT audience, anyone interested in Hadoop or big data, and those experiencing problems with data size. It’s also for anyone who would like to further their career in this area by adding big data skills.

Amazing Books

A Practical Guide to Data Structures and Algorithms using Java

CRC Press, 2007

Although traditional texts present isolated algorithms and data structures, they do not provide a unifying structure and offer little guidance on how to appropriately select among them. Furthermore, these texts furnish little, if any, source code and leave many of the more difficult aspects of the implementation as exercises. A fresh...

Information Trapping: Real-Time Research on the Web

New Riders Publishing, 2006

For a long time—and especially around 1994, when the World Wide Web was just getting its jumpstart—the Internet appeared to many as a vast pool of information just sitting in cyberspace. People who used the Internet for research “cast their nets” by entering queries into a search engine, and then pulled...

An Introduction to Programming Using Visual Basic 2005 (6th Edition)

Prentice Hall, 2006

Based on the newest version of Microsoft's VB.NET, this revision of Schneider's best-selling text is designed for students with no prior computer programming experience. The author uses Visual Basic .NET to explore the fundamentals of programming, building a strong foundation that will give students a sustainable understanding...

Macromedia Flash Professional 8 Unleashed

Sams Publishing, 2005

Macromedia Flash 8 is the latest in the Flash family of software. Flash was originally just a vector animation tool but is now one of the most advanced programs for creating rich Internet applications to provide powerful user experiences. Not only is the player that plays Flash content one of the most downloaded pieces of...

Healing Logics: Culture and Medicine in Modern Health Belief Systems

Utah State University Press, 2001

Scholars in folklore and anthropology are more directly involved in various aspects of medicine—such as medical education, clinical pastoral care, and negotiation of transcultural issues—than ever before. Old models of investigation that artificially isolated "folk medicine," "complementary and alternative...

Beginning PHP 4 (Programmer to Programmer)

Peer Information Inc, 2000

PHP is a rapidly growing Web technology which enables web designers to build dynamic, interactive web applications, incorporating information from a host of databases, and including features such as e-mail integration and dynamically generated images. PHP4 added tons of features to make web application development even easier, and this book will...
