Level: Introductory
Anja Nicolussi (nicolussi@de.ibm.com), Business Intelligence Consultant, IBM
Sandra Grontzki (grontzki@de.ibm.com), Internship in Software Development, IBM
17 Jun 2005
Swim around at the surface of data mining concepts, strategies, and integration possibilities and dive deep into data mining techniques and implementation exercises.
Introduction
Have you ever wondered what Data Mining is about and how to get started integrating Data Mining intelligence into your business applications?
Use this comprehensive self-learning guide to IBM® DB2® Intelligent Miner™ V8.2 to swim around at the surface of Data Mining concepts, strategies, and integration possibilities and to dive deep into Data Mining techniques and implementation exercises.
This tutorial will be your pool attendant and dive master: its presentation conveys the theoretical knowledge, and its practical exercises give you hands-on experience. The exercises deal with two simplified business scenarios: one with the risk analysis of bank customers and the other with a marketing campaign for bank products. The exercises are accompanied by screenshots and explanations.
Motivation
How can you gain new business insights? How can you detect highly profitable customers? How can you predict fraud cases or quality problems? These are only some of the questions that Data Mining can answer by discovering valuable business insights in your enterprise data.
What is Data Mining about? Data Mining is about discovering previously unknown patterns and relationships in data records from large databases. This differs from statistics, which is based on hypotheses. Uncovering unknown patterns and relationships helps you identify market niches better and more quickly and gain insight into the behavior of your customers, so that customer reactions can be predicted more reliably. This, in turn, helps you set up appropriate actions and plans.
Where can Data Mining be applied? Let's see some examples:
- In the proteomics area, Data Mining helps identify and classify protein-protein interactions.
- In clinical informatics, Data Mining supports the identification of effective medical treatment based on the evaluation of patient data and clinical information.
- In the manufacturing area, Data Mining helps gather information about how to improve the quality of products and helps predict delays and problems in the production cycle.
- In insurance and credit card companies, Data Mining helps detect fraud and analyze risk, for example by predicting the degree of risk associated with new customers.
- In the retail business, Data Mining helps determine which products' sales are influenced by the sale of other products.
DB2 Intelligent Miner gives you a toolbox that suits many different situations.
This self-learning tutorial is derived from a two-day, onsite class, which has been held successfully several times for Business Partners who want to start Data Mining projects. The presentation introduces the Data Mining concept and process, discusses the DB2 Intelligent Miner product family, and covers the DB2 Intelligent Miner products in a technical way, with implementation examples.
As mentioned before, the hands-on exercises deal with two simplified business scenarios (one with the risk analysis of bank customers and the other with a marketing campaign for bank products).
Tutorial highlights
- Application scenarios for Data Mining
- Technical presentation of DB2 Intelligent Miner Modeling, DB2 Intelligent Miner Scoring, and DB2 Intelligent Miner Visualization with application scenarios
- DB2 Intelligent Miner Modeling delivers DB2 Extenders for modeling operations, DB2 Intelligent Miner Scoring provides scoring technology as database extensions, and DB2 Intelligent Miner Visualization provides Java™ visualizers to interact with and graphically present Data Mining models
- A new dimension of Data Mining: The Easy Mining Procedures
- Installation of DB2 Intelligent Miner Modeling, DB2 Intelligent Miner Scoring, and DB2 Intelligent Miner Visualization
- Mining algorithms with examples taken from business scenarios
- Hands-on exercises to help you understand how Data Mining answers business questions
When you have completed this tutorial, you will have:
- An understanding of the Data Mining concept and process
- An understanding of how Data Mining can be applied using IBM DB2 Intelligent Miner products
- An overview of the IBM Business Intelligence strategy
- The experience to integrate the Data Mining techniques that you have just studied into your own corporate application
Audience
The tutorial is intended for anyone who is interested in learning how to set up a Data Mining project and wants practice in integrating Data Mining with IBM DB2 Intelligent Miner.
Prerequisites: Participants should have a good knowledge of relational database technologies, as well as basic knowledge of SQL, XML, and Web technologies.
Duration
To complete this tutorial, you need about seven to eight hours, on average.
However, you might want to step through this tutorial more than once when translating the data mining techniques into your own business environment.
Navigating through the tutorial
The tutorial consists of a presentation part and an exercise part, which complement one another. Navigate through the tutorial as described in the following table. While studying the tutorial, you switch between the presentation and the exercises. The table shows where to start, which steps are included, and how long each step is expected to take.
Table 1. How to navigate through the tutorial
| Presentation: Title | Presentation: Slides | Presentation: Duration | Exercise: Title | Exercise: Pages | Exercise: Duration |
|---|---|---|---|---|---|
| Introduction to IBM DB2 Intelligent Miner V8.2 | 1 - 6 | 2 mins | - | - | - |
| Business Intelligence Overview | 7 - 16 | 10 mins | - | - | - |
| Introduction to Data Mining | 17 - 36 | 10 mins | - | - | - |
| The Data Mining Process | 37 - 54 | 15 mins | - | - | - |
| Data Mining Techniques | 55 - 95 | 20 mins | - | - | - |
| Intelligent Miner Product Family | 96 - 114 | 15 mins | - | - | - |
| Easy Mining Procedures | 115 - 134 | 20 mins | - | - | - |
| Intelligent Miner Visualization V8.2 | 135 - 173 | 15 mins | - | - | - |
| - | - | - | Introduction | 1 - 5 | 5 mins |
| - | - | - | Preparation for the Exercise | 6 - 8 | 20 mins |
| - | - | - | The Easy Mining Procedures Exercise | 9 - 31 | 2.5 hrs |
| Intelligent Miner Modeling V8.2 | 173 - 211 | 25 mins | - | - | - |
| - | - | - | The IM V8.2 Modeling Exercise | 32 - 69 | 1.5 hrs |
| Intelligent Miner Scoring V8.2 | 112 - 229 | 25 mins | - | - | - |
| - | - | - | The IM V8.2 Scoring Exercise | 60 - 74 | 1 hr |
| Installation, configuration and administration of IM Modeling, IM Scoring and IM Visualization | 230 - 251 | 10 mins | - | - | - |
| Excursus: Data Warehouse and its components | 252 - 275 | 10 mins | - | - | - |
Required Software
To complete this tutorial, the following software must be installed on your computer:
- IBM DB2 Universal Database™, Version 8.1 or higher
(See Resources to download a trial version.)
- IBM DB2 Intelligent Miner Modeling and Scoring, Version 8.2 or higher containing the Easy Mining Procedures (See Resources to download a trial version for Windows.)
- IBM DB2 Intelligent Miner Visualization, Version 8.2 or higher (See Resources for more information.)
- Adobe Reader 5.0 or higher (See Resources to download a free version.)
A Windows operating system is also required to run the tutorial.
Preparing your computer
Retrieving the required files: The data for the exercise is stored in the scripts-data.zip file. Extract the scripts-data.zip file to the C:\ directory so that the scripts run properly. The extracted data resides in the C:\Intelligent Miner directory. (It is assumed that you have installed and configured DB2 Intelligent Miner Modeling, Scoring, and Visualization according to the installation instructions.)
Remarks regarding the Tutorial
Please keep in mind that the problem definition phase and the data exploration phase are the time-consuming parts of a Data Mining project. These complicated steps have already been handled for you, so the exercise concentrates on the mining steps.
This tutorial presents two of the many mining functions in the Intelligent Miner product portfolio. You first work through the Clustering mining function and then through the Association mining function, in order to carry out a risk analysis of bank customers and a marketing campaign for bank products. Both functions are applied to the data records of a bank.
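To make the idea behind an Association mining function more concrete, here is a minimal Python sketch of how support and confidence can be computed for rules between pairs of bank products. It is only an illustration under stated assumptions: the customer data, the function name `association_rules`, and the thresholds are hypothetical, and the code does not use or reproduce the DB2 Intelligent Miner API or its algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set lists the products held by one bank customer.
customers = [
    {"checking", "savings", "credit_card"},
    {"checking", "credit_card"},
    {"checking", "savings"},
    {"savings", "mortgage"},
    {"checking", "savings", "credit_card"},
]


def association_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return sorted (antecedent, consequent, support, confidence) tuples
    for rules between pairs of items that meet both thresholds."""
    n = len(baskets)
    # How many customers hold each single product.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many customers hold each pair of products together.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of all customers holding both products
        if support < min_support:
            continue
        # Check the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            # Share of customers holding lhs who also hold rhs.
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = association_rules(customers)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data set, every customer with a credit card also has a checking account, so the rule from `credit_card` to `checking` reaches a confidence of 1.0. A rule of this kind is the sort of finding a marketing campaign for bank products could build on.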
The exercise guides you through the characteristic Data Mining phases (Modeling, Evaluation, and Deployment), with each step illustrated by typical Intelligent Miner functionality. It takes four to five hours to complete all the exercises. The exercises are accompanied by screenshots and explanations of the applied scripts.
Downloads
| Description | Name | Size | Download method |
|---|---|---|---|
| Comprehensive Guide -- Presentation | presentation.pdf | 40 MB | FTP, HTTP |
| DB2 Intelligent Miner V8.2 Exercise booklet | exercisebooklet.pdf | 4 MB | FTP, HTTP |
| Exercise scripts and data files | scripts-data.zip | 900 KB | FTP, HTTP |
Resources
About the authors
Anja Nicolussi graduated in applied mathematics from the University of Paderborn in 1999. Since then, she has been working at the IBM Lab in Germany with IBM Information Management products and technologies, focusing on Business Intelligence and Data Warehouse solutions. With a background in intensive software development for Data Mining projects, she currently works as a technical consultant for DB2 Business Intelligence solutions. As a team lead, she is also responsible for the EMEA-wide Business Partner Technical Enablement in the areas of Content Management and Business Intelligence. Anja Nicolussi is an IBM Certified Solution Designer (Business Intelligence).
Dipl. Wirt.-Inf. (FH) Sandra Grontzki is a student of Multimedia Engineering at the University of Wismar, University of Technology, Business and Design, Germany. She successfully completed her degree in Business Informatics. She developed the tutorial exercises during an internship at the EMEA Business Partner Technical Enablement department at the IBM development lab in Böblingen, Germany.