DB2 Intelligent Miner: Comprehensive Guide to IBM DB2 Intelligent Miner, Version 8.2

Tutorial with hands-on exercises to get started with Data Mining using DB2 Intelligent Miner


Level: Introductory

Anja Nicolussi (nicolussi@de.ibm.com), Business Intelligence Consultant, IBM
Sandra Grontzki (grontzki@de.ibm.com), Internship in Software Development, IBM

17 Jun 2005

Swim around at the surface of data mining concepts, strategies, and integration possibilities and dive deep into data mining techniques and implementation exercises.

Introduction

Have you ever wondered what Data Mining is about and how to get started integrating Data Mining intelligence into your business applications?

Use this comprehensive self-learning guide to IBM® DB2® Intelligent Miner™ V8.2 to swim around at the surface of Data Mining concepts, strategies, and integration possibilities and to dive deep into Data Mining techniques and implementation exercises.

This tutorial will be your pool attendant and dive master: its presentation conveys the theoretical knowledge, and its practical exercises give you hands-on experience. The exercises deal with two simplified business scenarios: a risk analysis of bank customers and a marketing campaign for bank products. They are accompanied by screenshots and explanations.



Motivation

How can you gain new business insights? How can you detect highly profitable customers? How can you predict fraud cases or quality problems? These are only some of the questions that Data Mining can answer by discovering valuable business insights in your enterprise data.

What is Data Mining about? Data Mining is about discovering previously unknown patterns and relationships in data records from large databases. This differs from statistics, which is based on hypotheses. Uncovering unknown patterns and relationships helps you identify market niches better and more quickly and gain insight into the behavior of your customers, so that customer reactions can be better predicted. This, in turn, helps you set up appropriate actions and plans.

Where can Data Mining be applied? Let's see some examples:

  • In the proteomics area, Data Mining helps identify and classify protein-protein interactions.
  • In clinical informatics, Data Mining supports the identification of effective medical treatment based on the evaluation of patient data and clinical information.
  • In the manufacturing area, Data Mining helps gather information about how to improve the quality of products and helps predict delays and problems in the production cycle.
  • In insurance and credit card companies, Data Mining helps detect fraud and analyze risk, which aids in predicting the degree of risk posed by new customers.
  • In the retail business, Data Mining helps determine which product sales are influenced by the sale of other products.

DB2 Intelligent Miner gives you a toolbox that suits many different situations.

This self-learning tutorial is derived from a two-day, onsite class, which has been held successfully several times for Business Partners who want to start Data Mining projects.

The presentation introduces the Data Mining concept and process, discusses the DB2 Intelligent Miner product family, and covers the DB2 Intelligent Miner products in a technical way, with implementation examples.

As mentioned before, the hands-on exercises deal with two simplified business scenarios (one with the risk analysis of bank customers and the other with a marketing campaign for bank products).



Tutorial highlights

  • Application scenarios for Data Mining
  • Technical presentation of DB2 Intelligent Miner Modeling, DB2 Intelligent Miner Scoring, and DB2 Intelligent Miner Visualization with application scenarios
    • DB2 Intelligent Miner Modeling delivers DB2 Extenders for modeling operations, DB2 Intelligent Miner Scoring provides scoring technology as database extensions, and DB2 Intelligent Miner Visualization provides Java™ visualizers for interacting with and graphically presenting Data Mining models
  • A new dimension of Data Mining: The Easy Mining Procedures
  • Installation of DB2 Intelligent Miner Modeling, DB2 Intelligent Miner Scoring, and DB2 Intelligent Miner Visualization
  • Mining algorithms with examples taken from business scenarios
  • Hands-on exercises to help you understand how Data Mining answers business questions

When you have completed this tutorial, you will have:

  • An understanding of the Data Mining concept and process
  • An understanding of how Data Mining can be applied using IBM DB2 Intelligent Miner products
  • An overview of the IBM Business Intelligence strategy
  • The experience to integrate the Data Mining techniques that you have just studied into your own corporate applications



Audience

This tutorial is for anyone who wants to learn how to set up a Data Mining project and wants practice in integrating Data Mining with IBM DB2 Intelligent Miner.

Prerequisites: Participants should have a good knowledge of relational database technologies, as well as basic knowledge of SQL, XML, and Web technologies.



Duration

To complete this tutorial, you need about seven to eight hours, on average. However, you might want to step through this tutorial more than once when translating the data mining techniques into your own business environment.



Navigating through the tutorial

The tutorial consists of a presentation part and an exercise part that complement one another. Navigate through the tutorial as described in the following table; while studying, you switch between the presentation and the exercises. The table shows where to start, which steps are included, and how long each step could take to complete.

Table 1. How to navigate through the tutorial

| Presentation: Title | Presentation: Slides | Presentation: Duration | Exercise: Title | Exercise: Pages | Exercise: Duration |
|---|---|---|---|---|---|
| Introduction to IBM DB2 Intelligent Miner V8.2 | 1 - 6 | 2 mins | - | - | - |
| Business Intelligence Overview | 7 - 16 | 10 mins | - | - | - |
| Introduction to Data Mining | 17 - 36 | 10 mins | - | - | - |
| The Data Mining Process | 37 - 54 | 15 mins | - | - | - |
| Data Mining Techniques | 55 - 95 | 20 mins | - | - | - |
| Intelligent Miner Product Family | 96 - 114 | 15 mins | - | - | - |
| Easy Mining Procedures | 115 - 134 | 20 mins | - | - | - |
| Intelligent Miner Visualization V8.2 | 135 - 173 | 15 mins | - | - | - |
| - | - | - | Introduction | 1 - 5 | 5 mins |
| - | - | - | Preparation for the Exercise | 6 - 8 | 20 mins |
| - | - | - | The Easy Mining Procedures Exercise | 9 - 31 | 2.5 hrs |
| Intelligent Miner Modeling V8.2 | 173 - 211 | 25 mins | - | - | - |
| - | - | - | The IM V8.2 Modeling Exercise | 32 - 69 | 1.5 hrs |
| Intelligent Miner Scoring V8.2 | 112 - 229 | 25 mins | - | - | - |
| - | - | - | The IM V8.2 Scoring Exercise | 60 - 74 | 1 hr |
| Installation, configuration and administration of IM Modeling, IM Scoring and IM Visualization | 230 - 251 | 10 mins | - | - | - |
| Excursus: Data Warehouse and its components | 252 - 275 | 10 mins | - | - | - |


Required Software

To complete this tutorial, the following software must be installed on your computer:

  • IBM DB2 Universal Database™, Version 8.1 or higher (See Resources to download a trial version.)
  • IBM DB2 Intelligent Miner Modeling and Scoring, Version 8.2 or higher containing the Easy Mining Procedures (See Resources to download a trial version for Windows.)
  • IBM DB2 Intelligent Miner Visualization, Version 8.2 or higher (See Resources for more information.)
  • Adobe Reader 5.0 or higher (See Resources to download a free version.)

A Windows operating system is also required to run the tutorial.



Preparing your computer

Retrieving the required files: The data for the exercises is stored in the scripts-data.zip file. Extract scripts-data.zip to the C:\ directory so that the scripts run properly. The extracted data then resides in the C:\Intelligent Miner directory. (It is assumed that you have installed and configured DB2 Intelligent Miner Modeling, Scoring, and Visualization according to the installation instructions.)
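
If you prefer to script the extraction rather than use a graphical unzip tool, the step can be sketched in Python as follows. This is only a minimal sketch and not part of the tutorial materials: the function name extract_exercise_files is made up for illustration, and it assumes that scripts-data.zip has already been downloaded to the current directory.

```python
import zipfile
from pathlib import Path


def extract_exercise_files(archive, target_dir):
    """Extract the archive into target_dir and return the sorted
    top-level names it contained (hypothetical helper, not an IBM tool)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        names = zf.namelist()
    # Zip member names always use "/" as the separator.
    return sorted({name.split("/")[0] for name in names if name})


# Usage with the paths named in the tutorial (adjust to your download location):
# extract_exercise_files("scripts-data.zip", "C:\\")
```

After the call, the returned list should show the top-level folder that was extracted, which lets you confirm that the data ended up in the directory the scripts expect.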



Remarks regarding the Tutorial

Please keep in mind that the problem definition phase and the data exploration phase are the time-consuming parts of a Data Mining project. These complicated steps have already been handled for you, so the exercises concentrate on the mining steps.

This tutorial presents two of the many mining functions in the Intelligent Miner product portfolio. First, you go through the Clustering mining function and then through the Association mining function, in order to undertake a risk analysis of bank customers and a marketing campaign for bank products. Both are applied to the data records of a bank.
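
To give a feel for what an association analysis measures, the following Python sketch computes the support and confidence of a single rule over a handful of customer records. It is a generic illustration only: it does not use the DB2 Intelligent Miner API or algorithms, the function name rule_metrics is hypothetical, and the customer data is invented rather than taken from the tutorial files.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(transactions)
    # Records that contain every item of the antecedent.
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    # Records that contain both the antecedent and the consequent.
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Invented product holdings of five bank customers (not the tutorial data).
customers = [
    {"checking", "savings"},
    {"checking", "credit card"},
    {"checking", "savings", "credit card"},
    {"savings"},
    {"checking", "savings", "loan"},
]

support, confidence = rule_metrics(customers, {"checking"}, {"savings"})
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

In this made-up data, three of the five customers hold both products and three of the four checking-account holders also have a savings account. A rule like this is the kind of finding that could feed a marketing campaign for bank products.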

The exercises direct you through the characteristic Data Mining phases (Modeling, Evaluation, and Deployment), each step exemplified by typical Intelligent Miner functionality. It takes four to five hours to complete all the exercises. The exercises are accompanied by screenshots and explanations of the applied scripts.




Downloads

| Description | Name | Size | Download method |
|---|---|---|---|
| Comprehensive Guide -- Presentation | presentation.pdf | 40 MB | FTP, HTTP |
| DB2 Intelligent Miner V8.2 Exercise booklet | exercisebooklet.pdf | 4 MB | FTP, HTTP |
| Exercise scripts and data files | scripts-data.zip | 900 KB | FTP, HTTP |


Resources



About the authors

Anja Nicolussi

Anja Nicolussi graduated in applied mathematics at the University of Paderborn in 1999. Since then, she has been working at the IBM Lab in Germany with IBM Information Management products and technologies, focusing on Business Intelligence and Data Warehouse solutions.
With a background in intensive software development for Data Mining projects, she currently works as a technical consultant for DB2 Business Intelligence solutions. As a team lead, she is also responsible for the EMEA-wide Business Partner Technical Enablement in the areas of Content Management and Business Intelligence.
Anja Nicolussi is an IBM Certified Solution Designer (Business Intelligence).


Sandra Grontzki

Dipl. Wirt.-Inf. (FH) Sandra Grontzki is a student of Multimedia Engineering at the University of Wismar, University of Technology, Business and Design, Germany. She has successfully completed her degree in Business Informatics.
She developed the tutorial exercises during an internship at the EMEA Business Partner Technical Enablement department at the IBM development lab in Böblingen, Germany.



