Getting Started With SAS Enterprise Miner 5.2

Getting Started With SAS Enterprise Miner 5.2, 9781599940021 (1599940027), SAS Institute, 2006

SAS defines data mining as the process of uncovering hidden patterns in large amounts of data. Many industries use data mining to address business problems and opportunities such as fraud detection, risk and affinity analyses, database marketing, householding, customer churn, bankruptcy prediction, and portfolio analysis. The SAS data mining process is summarized in the acronym SEMMA, which stands for sampling, exploring, modifying, modeling, and assessing data.

Sample the data by creating one or more data tables. The sample should be large enough to contain the significant information, yet small enough to process.
Explore the data by searching for anticipated relationships, unanticipated trends, and anomalies in order to gain understanding and ideas.
Modify the data by creating, selecting, and transforming the variables to focus the model selection process.
Model the data by using the analytical tools to search for a combination of the data that reliably predicts a desired outcome.
Assess the data by evaluating the usefulness and reliability of the findings from the data mining process.
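
The five SEMMA steps above can be illustrated with a small sketch. This is plain Python, not SAS Enterprise Miner code, and every name in it (`semma_sketch`, the `(x, y)` row layout, the 70/30 split, the single-threshold "model") is an assumption made for illustration only. It shows the order of the steps, not how the product implements them.

```python
import random
import statistics


def semma_sketch(rows, sample_size, seed=0):
    """Toy SEMMA pass over rows of (x, y): numeric x, binary y (0 or 1).

    Hypothetical helper for illustration; not part of any SAS product.
    """
    rng = random.Random(seed)

    # Sample: draw a random subset small enough to process quickly.
    sample = rng.sample(rows, min(sample_size, len(rows)))
    if len(sample) < 4:
        raise ValueError("need at least 4 sampled rows")

    # Explore: summarize the input variable.
    xs = [x for x, _ in sample]
    mean_x = statistics.mean(xs)
    sd_x = statistics.pstdev(xs)

    # Modify: standardize x so the model works on a common scale.
    scale = sd_x if sd_x > 0 else 1.0
    modified = [((x - mean_x) / scale, y) for x, y in sample]

    # Hold out 30% of the sampled rows for assessment.
    cut = len(modified) * 7 // 10
    train, valid = modified[:cut], modified[cut:]

    def accuracy(data, threshold):
        # Predict 1 when the standardized value exceeds the threshold.
        hits = sum((z > threshold) == bool(y) for z, y in data)
        return hits / len(data)

    # Model: choose the threshold with the best training accuracy.
    candidates = sorted(z for z, _ in train)
    threshold = max(candidates, key=lambda t: accuracy(train, t))

    # Assess: report performance on the held-out rows.
    return {
        "mean_x": mean_x,
        "threshold": threshold,
        "training_accuracy": accuracy(train, threshold),
        "validation_accuracy": accuracy(valid, threshold),
    }


rows = [(x, int(x > 50)) for x in range(100)]
result = semma_sketch(rows, sample_size=100)
print(result["training_accuracy"], result["validation_accuracy"])
```

In a real project each step would be far richer (several candidate models, many variables, proper diagnostics), but the flow from sampling through assessment is the same.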

You might not include all of these steps in your analysis, and it might be necessary to repeat one or more of the steps several times before you are satisfied with the results. After you have completed the assessment phase of the SEMMA process, you apply the scoring formula from one or more champion models to new data that might or might not contain the target. The goal of most data mining tasks is to apply models that are constructed using training and validation data in order to make accurate predictions about observations of new, raw data.
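
As a rough illustration of the scoring step described above, the following Python sketch picks a champion among candidate models by validation accuracy and then applies it to new, unlabeled observations. The function names (`pick_champion`, `score`) and the idea of representing a model as a plain function are assumptions for this example; SAS Enterprise Miner generates its own score code, which is not shown here.

```python
def pick_champion(models, validation):
    """Return (name, model) for the model with the best validation accuracy.

    models: dict mapping a name to a function x -> predicted label.
    validation: list of (x, y) pairs with known targets.
    Both are hypothetical structures used only for this sketch.
    """
    def accuracy(model):
        return sum(model(x) == y for x, y in validation) / len(validation)

    return max(models.items(), key=lambda item: accuracy(item[1]))


def score(model, new_rows):
    """Apply the champion's scoring rule to observations with no target."""
    return [model(x) for x in new_rows]


# Two candidate rules that differ only in their cutoff.
models = {
    "low_cutoff": lambda x: int(x > 2),
    "high_cutoff": lambda x: int(x > 5),
}
validation = [(1, 0), (3, 0), (6, 1), (8, 1)]

name, champion = pick_champion(models, validation)
predictions = score(champion, [4, 7])
```

Here `high_cutoff` classifies all four validation rows correctly, so it becomes the champion and is the rule applied to the new values.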

The SEMMA data mining process is driven by a process flow diagram, which you can modify and save. The GUI is designed in such a way that the business analyst who has little statistical expertise can navigate through the data mining methodology, while the quantitative expert can go “behind the scenes” to fine-tune the analytical process.

SAS Enterprise Miner 5.2 contains a collection of sophisticated analysis tools that have a common user-friendly interface that you can use to create and compare multiple models. Statistical tools include clustering, self-organizing maps / Kohonen, variable selection, trees, linear and logistic regression, and neural networking. Data preparation tools include outlier detection, variable transformations, data imputation, random sampling, and the partitioning of data sets (into train, test, and validate data sets). Advanced visualization tools enable you to quickly and easily examine large amounts of data in multidimensional histograms and to graphically compare modeling results.
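
The partitioning of a data set into training, validation, and test sets mentioned above can be sketched in a few lines of Python. This is a generic random split, not the product's partitioning tool; the function name `partition`, the 40/30/30 default fractions, and the seed value are all assumptions chosen for the example.

```python
import random


def partition(rows, fractions=(0.4, 0.3, 0.3), seed=12345):
    """Randomly split rows into train, validate, and test lists.

    fractions gives the (train, validate, test) shares and must sum to 1.
    Hypothetical helper for illustration only.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")

    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable

    n = len(shuffled)
    n_train = round(fractions[0] * n)
    n_valid = round(fractions[1] * n)

    return {
        "train": shuffled[:n_train],
        "validate": shuffled[n_train:n_train + n_valid],
        "test": shuffled[n_train + n_valid:],  # whatever remains
    }


parts = partition(range(100))
print({name: len(subset) for name, subset in parts.items()})
```

Models are fitted on the training rows, tuned or compared on the validation rows, and given a final, untouched check on the test rows.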

Amazing Books

Making, Breaking Codes: Introduction to Cryptology

Prentice Hall, 2001

This book is an introduction to modern ideas in cryptology and how to employ these ideas. It includes the relevant material on number theory, probability, and abstract algebra, in addition to descriptions of ideas about algorithms and complexity theory. Three somewhat different terms appear in the discussion of secure communications...

Jakarta Pitfalls: Time-Saving Solutions for Struts, Ant, JUnit, and Cactus

John Wiley & Sons, 2003

Escape from common coding pitfalls with this detailed book of proven Jakarta missteps and solutions

The dangers of Jakarta pitfalls are everywhere and countless developers have already been trapped. These mistakes have delayed schedules, allowed major bugs to get into the users’ hands, or led to numerous rewrites in maintenance. Luckily,...

Computer-mediated Relationships and Trust: Managerial and Organizational Effects (Premier Reference Source)

Idea Group Publishing, 2007

The recent, rapid emergence of the virtual organization has added new dynamics and challenges to the context of relationships between organizational managers and their employees, customers, and other constituents.

Computer-Mediated Relationships and Trust: Managerial and Organizational Effects provides an exhaustive collection...

Mobile Internet: Enabling Technologies and Services

CRC Press, 2004

The migration of the most common Internet services to a mobile environment has long been an evolving demand of both business and consumer markets. The ability to be connected to the Internet while on the go and to benefit from using such applications as e-mail, instant messaging, audio and video streaming, Web browsing, and...

Euro-Par 2007 Workshops: Parallel Processing: HPPC 2007, UNICORE Summit 2007, and VHPC 2007

Springer, 2008

Parallel and distributed processing, although within the focus of computer science research for a long time, is gaining more and more importance in a wide spectrum of applications. These proceedings aim to demonstrate the use of parallel and distributed processing concepts in different application fields, and attempt to spark interest in novel...

Fundamentals of RF Circuit Design: with Low Noise Oscillators

John Wiley & Sons, 2001

The art of RF circuit design made simple.....

Radio Frequency circuits are the fundamental building blocks in a vast array of consumer electronics and wireless communication devices. Jeremy Everard's unique combination of theory and practice provides insight into the principles of operation, together with invaluable guidance to...