This guide is an ideal learning tool and reference for Apache Pig, the open-source engine for executing parallel data flows on Hadoop. With Pig, you can batch-process data without having to create a full-fledged application, which makes it easy to experiment with new datasets.
Programming Pig introduces new users to Pig and gives experienced users comprehensive coverage of key features such as the Pig Latin scripting language, the Grunt shell, and user-defined functions (UDFs) for extending Pig. If you need to analyze terabytes of data, this book shows you how to do it efficiently with Pig.
Delve into Pig’s data model, including scalar and complex data types
Write Pig Latin scripts to sort, group, join, project, and filter your data
Use Grunt to work with the Hadoop Distributed File System (HDFS)
Build complex data processing pipelines with Pig’s macros and modularity features
Embed Pig Latin in Python for iterative processing and other advanced tasks
Create your own load and store functions to handle data formats and storage mechanisms
Get performance tips for running scripts on Hadoop clusters in less time
Java™ 2: A Beginner's Guide, Second Edition
Essential Skills--Made Easy!
Learn the fundamentals of Java 2 programming from master programmer and best-selling author Herb Schildt. Fully updated to cover Java 2 version 1.4, this step-by-step guide will have you programming in no time. You'll start at the beginning, learning why Java is the preeminent language of the Internet, how it...
Beginning OS X Lion Apps Development (Beginning Apress)
Mac OS X offers an amazing development environment for scores of technologies. It seems that developers from numerous camps are migrating to Mac en masse. Scan the room at any Ruby or Rails conference, for example, and you’ll see programmers coding on Macs almost exclusively. As developers move to Mac, almost inevitably they...
Ethics of Big Data: Balancing Risk and Innovation
What are your organization’s policies for generating and using huge datasets full of personal information? This book examines ethical questions raised by the big data phenomenon, and explains why enterprises need to reconsider business decisions concerning privacy and identity. Authors Kord Davis and Doug Patterson provide...
Expert web developer...