Taming Text: How to Find, Organize, and Manipulate It

Summary

Taming Text, winner of the 2013 Jolt Awards for Productivity, is a hands-on, example-driven guide to working with unstructured text in the context of real-world applications. This book explores how to automatically organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. The book guides you through examples illustrating each of these topics, as well as the foundations upon which they are built.

About this Book

There is so much text in our lives, we are practically drowning in it. Fortunately, there are innovative tools and techniques for managing unstructured information that can throw the smart developer a much-needed lifeline. You'll find them in this book.

Taming Text is a practical, example-driven guide to working with text in real applications. This book introduces you to useful techniques like full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. You'll explore real use cases as you systematically absorb the foundations upon which they are built. Written in a clear and concise style, this book avoids jargon, explaining the subject in terms you can understand without a background in statistics or natural language processing. Examples are in Java, but the concepts can be applied in any language.

Written for Java developers, the book requires no prior knowledge of GWT.

Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

Winner of 2013 Jolt Awards: The Best Books—one of five notable books every serious programmer should read.

What's Inside

  • When to use text-taming techniques
  • Important open-source libraries like Solr and Mahout
  • How to build text-processing applications

About the Authors

Grant Ingersoll is an engineer, speaker, and trainer, a Lucene committer, and a cofounder of the Mahout machine-learning project. Thomas Morton is the primary developer of OpenNLP and Maximum Entropy. Drew Farris is a technology consultant, software developer, and contributor to Mahout, Lucene, and Solr.

"Takes the mystery out of very complex processes."—From the Foreword by Liz Liddy, Dean, iSchool, Syracuse University

Table of Contents

  1. Getting started taming text
  2. Foundations of taming text
  3. Searching
  4. Fuzzy string matching
  5. Identifying people, places, and things
  6. Clustering text
  7. Classification, categorization, and tagging
  8. Building an example question answering system
  9. Untamed text: exploring the next frontier

SOA for Profit, A Manager's Guide to Success with Service Oriented Architecture

Service-Oriented Architecture is becoming the leading architecture for IT, and it is changing the way organisations work. IT is slowly but steadily gaining maturity, and becoming the flexible yet stable and reliable support for business it should be. At the same time, IT is regaining its potential to create real business innovation. SOA will...

Athletic Scholarships For Dummies

Get insider tips on navigating the recruitment process

Find the right school, the right program, the right coach, and the most money

You're prepared for challenges on the athletic field. But are you prepared for the challenges of winning an athletic scholarship? Let this friendly guide be your coach. It
...

Learning PHP & MySQL: Step-by-Step Guide to Creating Database-Driven Web Sites
PHP and MySQL are quickly becoming the de facto standard for rapid development of dynamic, database-driven web sites. This book is perfect for newcomers to programming as well as hobbyists who are intimidated by harder-to-follow books. With concepts explained in plain English, the new edition starts with the basics of the PHP language, and explains...

Introduction to Google Analytics: A Guide for Absolute Beginners

Develop your digital/online marketing skills and learn web analytics to understand the performance of websites and ad campaigns. Approaches covered will be immediately useful for business or nonprofit organizations. If you are completely new to Google Analytics and you want to learn the basics, this guide will introduce you to the...

Morphing: A Guide to Mathematical Transformations for Architects and Designers

Cylinders, spheres and cubes are a small handful of shapes that can be defined by a single word. However, most shapes cannot be found in a dictionary. They belong to an alternative plastic world defined by trigonometry: a mathematical world where all shapes can be described under one systematic language and where any shape can transform into...

Maran Illustrated Office 2003
Produced by the award-winning maranGraphics Group, Maran Illustrated Microsoft® Office 2003 is a valuable resource for all readers, regardless of experience. Clear, step-by-step instructions walk you through each operation from beginning to end. Helpful topic introductions and useful tips provide additional information and advice to enhance...