... efficiently.
8
Mining Stream, Time-Series,
and Sequence Data
Our previous chapters introduced the basic concepts and techniques of data mining. The techniques
studied, however, were for simple and structured ... structured data sets, such as data in relational
databases, transactional databases, and data warehouses. The growth of data in various
complex forms (e.g....
... for the
2.7
Summary
Data preprocessing is an important issue for both data warehousing and data mining,
as real-world data tend to be incomplete, noisy, and inconsistent. Data preprocessing
includes ... approximation of the original data.
PCA is computationally inexpensive, can be applied to ordered and unordered
attributes, and can handle sparse dat...
... Data Mining 666
11.3.3 Visual and Audio Data Mining 667
11.3.4 Data Mining and Collaborative Filtering 670
11.4 Social Impacts of Data Mining 675
11.4.1 Ubiquitous and Invisible Data Mining 675
11.4.2 ... 61
2.3.2 Noisy Data 62
2.3.3 Data Cleaning as a Process 65
2.4 Data Integration and Transformation 67
2.4.1 Data Integration 67
2.4.2 Data Transf...
... processing, and data
mining. We also introduce on-line analytical mining (OLAM), a powerful paradigm that
integrates OLAP with data mining technology.
3.5.1 Data Warehouse Usage
Data warehouses and data ... Warehouse and OLAP Technology: An Overview
3.5
From Data Warehousing to Data Mining
“How do data warehousing and OLAP relate to data mining?” In this sec...
... include data cube–based data aggregation and attribute-
oriented induction.
From a data analysis point of view, data generalization is a form of descriptive data
mining. Descriptive data mining ... mining describes data in a concise and summarative manner
and presents interesting general properties of the data. This is different from predictive
data mining, whic...
... for age with a Gini index of 0.375; the attributes {student}
and {credit rating} are both binary, with Gini index values of 0.367 and 0.429, respectively.
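The Gini index values quoted for the two binary attributes can be checked with a short calculation. The following Python sketch is illustrative only: the per-partition class counts are assumptions that do not appear in this excerpt (they were chosen to be consistent with the quoted values), and the function names gini and gini_split are hypothetical helpers, not part of any library.

```python
def gini(class_counts):
    """Gini impurity of one partition: 1 minus the sum of squared class proportions."""
    total = sum(class_counts)
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


def gini_split(partitions):
    """Weighted Gini index of a split, where each partition is a tuple of class counts."""
    n = sum(sum(p) for p in partitions)
    return sum(sum(p) / n * gini(p) for p in partitions)


# ASSUMED class counts (positive, negative) for each attribute value.
# These figures are not given in the text above; they are illustrative
# and happen to reproduce the quoted Gini index values.
student_split = [(6, 1), (3, 4)]   # student = yes, student = no
credit_split = [(6, 2), (3, 3)]    # credit rating = fair, credit rating = excellent

gini_student = gini_split(student_split)
gini_credit = gini_split(credit_split)

print(round(gini_student, 3), round(gini_credit, 3))  # prints: 0.367 0.429
```

Under these assumed counts, the split on student yields the lower Gini index of the two binary attributes, so it leaves the purer partitions of the two.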
The attribute income and splitting subset {medium, ... scalability.
While both SLIQ and SPRINT handle disk-resident data sets that are too large to fit into
memory, the scalability of SLIQ is limited by the use of its memory-resident data structure...
... functions
(Hanson and Burr [HB88]), dynamic adjustment of the network topology (Mézard
and Nadal [MN89], Fahlman and Lebiere [FL90], Le Cun, Denker, and Solla [LDS90],
and Harp, Samad, and Guha [HSG90]), and ... such as Duda et al. [DHS01] and James [Jam85], as well as articles by Cover and
Hart [CH67] and Fukunaga and Hummels [FH87]. Their integration with attribut...
... substructures.
9. Metadata mining. Metadata are data about data. Metadata provide semi-structured
data about unstructured data, ranging from text and Web data to multimedia
databases. It is useful for data ... itemset
stream mining; the Hoeffding tree, VFDT, and CVFDT algorithms for stream data
classification; and the STREAM and CluStream algorithms for stream data...
... multimedia data mining focuses on image data mining.
Mining text data and mining the World Wide Web are studied in the two subsequent
638 Chapter 10 Mining Object, Spatial, Multimedia, Text, and Web Data
where ... closely linked to image
analysis and scientific data mining, and thus many image analysis techniques and scientific
data analysis methods can be ap...
... constraint-based
mining), the integration of data mining with data warehousing and database systems,
the standardization of data mining languages, visualization methods, and new methods
for handling ... Data (SIGMOD’99), pages 467–478,
Philadelphia, PA, June 1999.
[EKS97] M. Ester, H.-P. Kriegel, and J. Sander. Spatial data mining: A database approach. In Proc.
19 9...