... Discovery and Data Mining
Contents
Preface
Chapter 1. Overview of Knowledge Discovery and Data Mining
1.1 What is Knowledge Discovery and Data Mining?
1.2 The KDD Process
1.3 KDD and ... Line
[Figure: axes labeled Income and Debt]
Chapter 1
Overview of Knowledge Discovery and Data Mining
1.1 What is Knowledge Discovery and D...
...
codes. The standard-form model is a data presentation that is uniform and effective
across a wide spectrum of data mining methods and supplementary data-reduction
techniques. Its model of data makes ... most data mining
methods in searching for good solutions.
2.2 Data Transformations
A central objective of data preparation for data mining is to transfor...
...
Day Outlook Temperature Humidity Wind PlayTennis?
D1
D2
D3
D4
D5
D6
D7
D8
D9
D10
D11
D12
D13
D14
Sunny
Sunny
Overcast
... positive ($p_{\oplus} = 1$), then $p_{\ominus}$ is 0, and
$Entropy(S) = -1 \cdot \log_2(1) - 0 \cdot \log_2 0 = -1 \cdot 0 - 0 \cdot \log_2 0 = 0$.
Note the entropy is 1 when the collection contains an
equal number of posit...
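
The two cases described above (all examples positive, and an even split) can be checked numerically. The following is a minimal sketch, assuming a boolean collection summarized by its proportion of positive examples; the function name `entropy` and its parameter `p_pos` are illustrative and do not come from the text. It applies the usual convention that 0 log2 0 is treated as 0.

```python
import math

def entropy(p_pos: float) -> float:
    """Entropy in bits of a boolean collection with proportion p_pos of positives."""
    total = 0.0
    for p in (p_pos, 1.0 - p_pos):
        if p > 0:  # convention: the term 0 * log2(0) is taken to be 0
            total -= p * math.log2(p)
    return total

print(entropy(1.0))  # every example is positive
print(entropy(0.5))  # equal numbers of positive and negative examples
```

With every example positive the function returns 0, and with an even split it returns 1, matching the two cases in the text.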
...
Items           OJ  Window Cleaner  Milk  Soda  Detergent
OJ               4               1     1     2          1
Window Cleaner   1               2     1     1          0
Milk             1               1     1     0          0
Soda             2               1     0     3          1
Detergent        1               0     0     1          2
Table 4.2: Co-occurrence of products
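
One way to read a co-occurrence table such as Table 4.2 is to turn its counts into rule strengths. The sketch below copies the counts from the table and assumes that each diagonal entry is the number of transactions containing that item. Under that assumption it estimates the confidence of a rule "A implies B" as count(A and B) / count(A), the standard association-rule measure. The names `ITEMS`, `COUNTS`, `cooccurrence`, and `confidence` are illustrative.

```python
# Counts copied from Table 4.2. Reading the diagonal as "number of
# transactions containing the item" is an assumption of this sketch.
ITEMS = ["OJ", "Window Cleaner", "Milk", "Soda", "Detergent"]
COUNTS = [
    [4, 1, 1, 2, 1],
    [1, 2, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [2, 1, 0, 3, 1],
    [1, 0, 0, 1, 2],
]

def cooccurrence(a: str, b: str) -> int:
    """Number of transactions in which items a and b appear together."""
    return COUNTS[ITEMS.index(a)][ITEMS.index(b)]

def confidence(antecedent: str, consequent: str) -> float:
    """Estimated confidence of the rule antecedent -> consequent."""
    return cooccurrence(antecedent, consequent) / cooccurrence(antecedent, antecedent)

def is_symmetric(matrix) -> bool:
    """A co-occurrence matrix should equal its own transpose."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))

print(is_symmetric(COUNTS))
print(confidence("Soda", "OJ"))  # 2 of the 3 soda transactions also contain OJ
print(confidence("OJ", "Soda"))  # 2 of the 4 OJ transactions also contain soda
```

The two confidence values differ because the rule is directional: the shared count is the same, but it is divided by a different item total.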
Since both the transactions ... handle variable-length data without the need for summarization. Other
techniques tend to require records in a fixed format, which is not a natural way to rep-
K...
... each data point forming its own
know by how much. If X, Y, and Z are ranked 1, 2, and 3, we know that X > Y > Z,
but not whether (X-Y) > (Y- ... perfect sense, however, to
say that a 50-year-old is twice as old as a 25-year-old or that a 10-pound bag of sugar
is twice as heavy as a 5-pound one. Age, weight, length, a...
... Typically, the feedback is limited to either
used variant. Its two primary virtues are that it is simple and easy to understand, and
it works for a wide range ... can be used for classification,
modeling, and time-series forecasting. For classification problems, the in-
6.2 Ne...
... random partition can
be misleading for small or moderately-sized samples, and multiple train-and-test
experiments can do better.
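
The idea of repeating the train-and-test experiment can be sketched as follows. This is an illustration under stated assumptions, not the procedure defined in the text: the data set is made up, the "learner" is a hypothetical stand-in that always predicts the majority training label, and the names `holdout_error` and `repeated_holdout`, the two-thirds training fraction, and the 20 repetitions are all choices made for the example.

```python
import random

def holdout_error(data, train_fraction, rng):
    """One random train/test partition; returns the error rate on the test part."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    train, test = shuffled[:cut], shuffled[cut:]
    # Hypothetical stand-in learner: always predict the majority training label.
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)
    errors = sum(1 for _, label in test if label != majority)
    return errors / len(test)

def repeated_holdout(data, repetitions=20, train_fraction=2 / 3, seed=0):
    """Repeat the random partition several times and average the error estimates."""
    rng = random.Random(seed)
    errors = [holdout_error(data, train_fraction, rng) for _ in range(repetitions)]
    return errors, sum(errors) / len(errors)

# Made-up sample of 30 labeled cases, one third of them positive.
data = [(i, i % 3 == 0) for i in range(30)]
errors, mean_error = repeated_holdout(data)
print(min(errors), max(errors), mean_error)
```

The gap between the smallest and largest single-split estimates shows how much one random partition can vary on a small sample, while the mean over the repetitions is a steadier figure.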
Both e0 and ... measures.