Taming Text: How to Find, Organize, and Manipulate It

Summary

Taming Text, winner of the 2013 Jolt Awards for Productivity, is a hands-on, example-driven guide to working with unstructured text in the context of real-world applications. This book explores how to automatically organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. The book guides you through examples illustrating each of these topics, as well as the foundations upon which they are built.

About this Book

There is so much text in our lives, we are practically drowning in it. Fortunately, there are innovative tools and techniques for managing unstructured information that can throw the smart developer a much-needed lifeline. You'll find them in this book.

Taming Text is a practical, example-driven guide to working with text in real applications. This book introduces you to useful techniques like full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. You'll explore real use cases as you systematically absorb the foundations upon which they are built. Written in a clear and concise style, this book avoids jargon, explaining the subject in terms you can understand without a background in statistics or natural language processing. Examples are in Java, but the concepts can be applied in any language.

Written for Java developers, the book requires no prior knowledge of GWT.

Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

Winner of 2013 Jolt Awards: The Best Books—one of five notable books every serious programmer should read.

What's Inside

  • When to use text-taming techniques
  • Important open-source libraries like Solr and Mahout
  • How to build text-processing applications

About the Authors

Grant Ingersoll is an engineer, speaker, and trainer, a Lucene committer, and a cofounder of the Mahout machine-learning project. Thomas Morton is the primary developer of OpenNLP and Maximum Entropy. Drew Farris is a technology consultant, software developer, and contributor to Mahout, Lucene, and Solr.

"Takes the mystery out of very complex processes."—From the Foreword by Liz Liddy, Dean, iSchool, Syracuse University

Table of Contents

  1. Getting started taming text
  2. Foundations of taming text
  3. Searching
  4. Fuzzy string matching
  5. Identifying people, places, and things
  6. Clustering text
  7. Classification, categorization, and tagging
  8. Building an example question answering system
  9. Untamed text: exploring the next frontier

Ontological Engineering: with examples from the areas of Knowledge Management, e-Commerce and the Semantic Web. First Edition
Ontologies provide a common vocabulary of an area and define, with different levels of formality, the meaning of the terms and the relationships between them. Ontologies may be reused and shared across applications and groups. Concepts in the ontology are usually organized in taxonomies and relations between concepts, properties of concepts, and...
Microsoft Windows Server(TM) 2003 PKI and Certificate Security
No need to buy or outsource costly PKI services when you can use the robust PKI and certificate-based security services already built into Microsoft Windows Server 2003! This in-depth reference teaches you how to design and implement even the most demanding certificate-based security solutions for wireless networking, smart card authentication,...
Natural Language Processing and Text Mining
From the reviews:

"The papers in this book describe a range of natural language processing (NLP) techniques and applications, all originating from an ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) panel discussion. … Overall, the contributions are well balanced with respect to the different approaches...


Handbook of Natural Language Processing, Second Edition

As the title of this book suggests, it is an update of the first edition of the Handbook of Natural Language Processing which was edited by Robert Dale, Hermann Moisl, and Harold Somers and published in the year 2000. The vigorous growth of new methods in Natural Language Processing (henceforth, NLP) since then, strongly suggested that...

PKI Uncovered: Certificate-Based Security Solutions for Next-Generation Networks (Networking Technology: Security)

With the increasing focus on IT Security comes a higher demand for identity management in the modern business. This requires a flexible, scalable, and secure authentication method. Identity control is made mandatory by many public standards, such as PCI, and PKI is an essential component to set up authentication in many technologies,...

Big Data Forensics: Learning Hadoop Investigations

Perform forensic investigations on Hadoop clusters with cutting-edge tools and techniques

About This Book

  • Identify, collect, and analyze Hadoop evidence forensically
  • Learn about Hadoop's internals and Big Data file storage concepts
  • A step-by-step guide to help you...