... substructures.
9. Metadata mining. Metadata are data about data. Metadata provide semi-structured
data about unstructured data, ranging from text and Web data to multimedia databases.
It is useful for data ... itemset
stream mining; the Hoeffding tree, VFDT, and CVFDT algorithms for stream data
classification; and the STREAM and CluStream algorithms for stream data...
... 675
11.4.2 Data Mining, Privacy, and Data Security 678
11.5 Trends in Data Mining 681
11.6 Summary 684
Exercises 685
Bibliographic Notes 687
Appendix An Introduction to Microsoft’s OLE DB for
Data Mining ... Statistical Data Mining 666
11.3.3 Visual and Audio Data Mining 667
11.3.4 Data Mining and Collaborative Filtering 670
11.4 Social Impacts of Data...
... include data cube–based data aggregation and attribute-oriented
induction.
From a data analysis point of view, data generalization is a form of descriptive data
mining. Descriptive data mining ... by Cai, Cercone, and Han [CCH91] and
further extended by Han, Cai, and Cercone [HCC93], Han and Fu [HF96], Carter and
Hamilton [CH98], and Han, Nishio, Kawano, and...
... 97
2.7 Summary
Data preprocessing is an important issue for both data warehousing and data mining,
as real-world data tend to be incomplete, noisy, and inconsistent. Data preprocessing
includes data cleaning, ... approximation of the original data.
PCA is computationally inexpensive, can be applied to ordered and unordered
attributes, and can handle sparse data and...
... processing, and data
mining. We also introduce on-line analytical mining (OLAM), a powerful paradigm that
integrates OLAP with data mining technology.
3.5.1 Data Warehouse Usage
Data warehouses and data ... Warehouse and OLAP Technology: An Overview
3.5 From Data Warehousing to Data Mining
“How do data warehousing and OLAP relate to data mining?” In this sec...
... to as training tuples and are selected from the database under analysis. In the
context of classification, data tuples can be referred to as samples, examples, instances,
data points, or objects.
Because ... scalability.
While both SLIQ and SPRINT handle disk-resident data sets that are too large to fit into
memory, the scalability of SLIQ is limited by the use of its memory-resident data stru...
... functions
(Hanson and Burr [HB88]), dynamic adjustment of the network topology (Mézard
and Nadal [MN89], Fahlman and Lebiere [FL90], Le Cun, Denker, and Solla [LDS90],
and Harp, Samad, and Guha [HSG90]), and ... learning rate
and momentum parameters (Jacobs [Jac88]). Other variations are discussed in Chauvin
and Rumelhart [CR95]. Books on neural networks include Rumelh...
... efficiently.
8 Mining Stream, Time-Series, and Sequence Data
Our previous chapters introduced the basic concepts and techniques of data mining. The techniques
studied, however, were for simple and structured ... time-series streams, spatiotemporal data
streams, and video and audio data streams.
8.2 Mining Time-Series Data
“What is a time-series database?” A t...
... multimedia data mining focuses on image data mining.
Mining text data and mining the World Wide Web are studied in the two subsequent
638 Chapter 10 Mining Object, Spatial, Multimedia, Text, and Web Data
where ... closely linked to image
analysis and scientific data mining, and thus many image analysis techniques and scientific
data analysis methods can be...
... constraint-based
mining), the integration of data mining with data warehousing and database systems,
the standardization of data mining languages, visualization methods, and new methods
for handling ... such as Quinlan and Rivest [QR89] and Chakrabarti, Sarawagi, and Dom
[CSD98]. The pattern discovery point of view of data mining is addressed in numerous
machine...