Document: Data Streams Models and Algorithms - P7 ppt

Document information

... over data streams. Other problems include those of finding significant network differences over data streams [19] and finding quantiles [46,50] over data streams. Another interesting application is that of significant differences between data streams [32,33], which has applications in numerous change detection scenarios. Another recent application to sketches has been to XML and tree-structured data [82,83,87] ...

... elegant, it is computationally intensive, and it is therefore not suitable for the data stream case. We also note that the coefficient is defined according to the wavelet coefficient definition, i.e. half the difference between the left hand and right hand side of the time series. While this choice ...
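The "half the difference" definition mentioned in the excerpt above can be illustrated with a small Haar-style decomposition. This is a minimal sketch, not the book's own code: the function names `haar_decompose` and `haar_reconstruct` and the sample series are assumptions made for illustration, and the series length is assumed to be a power of two.

```python
def haar_decompose(series):
    """Return (overall_average, detail_coefficients) for a series whose
    length is a power of two. Each detail coefficient is half the
    difference between the average of a left half and the average of
    the corresponding right half. Coarsest coefficient comes first."""
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    averages = [float(x) for x in series]
    details = []
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        level = [(left - right) / 2 for left, right in pairs]
        averages = [(left + right) / 2 for left, right in pairs]
        details = level + details  # coarser levels go in front
    return averages[0], details


def haar_reconstruct(average, details):
    """Invert haar_decompose: rebuild the series level by level."""
    values = [average]
    pos = 0
    while pos < len(details):
        level = details[pos:pos + len(values)]
        pos += len(values)
        # Each average splits into (average + detail, average - detail).
        values = [x for avg, d in zip(values, level) for x in (avg + d, avg - d)]
    return values


if __name__ == "__main__":
    # Illustrative series of length 8 (hypothetical data).
    avg, coeffs = haar_decompose([8, 6, 2, 3, 4, 6, 6, 5])
    print(avg)     # 5.0
    print(coeffs)  # [-0.25, 2.25, -0.25, 1.0, -0.5, -1.0, 0.5]
```

For a series of length 8 this yields 7 detail coefficients plus the overall average. The coarsest coefficient, -0.25, is half the difference between the average of the left half (4.75) and the average of the right half (5.25).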
4.3 Sketches and their applications in Data Streams

In the previous sections we discussed the application of sketches to the problem of massive time series. Some of the methods, such as fixed window sketch computation, are inherently offline. This does not suffice in many scenarios in which it is desirable to continuously compute the sketch over the data stream ...

... coefficients. The technique in [16] reduces the time and space efficiency for both updates and queries. The method of sketches can be effectively used for second moment and join estimation. First, we discuss the problem of second moment estimation [6] and illustrate how it can be used for ...

... dimensionality k by picking k random vectors of dimensionality d and calculating the dot product of the data point with each of these random vectors. Each component of the k random vectors is drawn from the normal distribution with zero mean and unit variance. In addition, the random vector is normalized to one unit in magnitude. It has been shown in [64] that proportional L2 distances between the data points are approximately ...

... disk based index structures may be used to index and update frequency counts. We argue that many applications in the sketch based literature which attempt to find specific properties of the frequency counts (e.g. second moments, join size estimation, heavy hitters) may in fact be implemented trivially by using simple main memory data structures, and the ability ...
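The random projection step described in the excerpts above (k random vectors of dimensionality d, normally distributed components, unit normalization, one dot product per vector) might look roughly like the following. This is a minimal sketch under stated assumptions: the names `make_random_vectors` and `sketch`, the `seed` parameter, and the use of the standard library's `random.Random.gauss` are illustrative choices, not taken from the source.

```python
import math
import random


def make_random_vectors(d, k, seed=0):
    """Draw k random vectors of dimensionality d. Each component comes
    from a normal distribution with zero mean and unit variance, and
    each vector is then scaled to unit magnitude."""
    rng = random.Random(seed)  # seed is an assumption, for repeatability
    vectors = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        vectors.append([x / norm for x in v])
    return vectors


def sketch(point, vectors):
    """Reduce a d-dimensional point to k dimensions: one dot product
    per random vector."""
    return [sum(p * r for p, r in zip(point, vec)) for vec in vectors]


if __name__ == "__main__":
    d, k = 100, 10
    vectors = make_random_vectors(d, k, seed=42)
    a = [float(i % 5) for i in range(d)]  # hypothetical data points
    b = [float(i % 3) for i in range(d)]
    sa, sb = sketch(a, vectors), sketch(b, vectors)
    true_dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    sketch_dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(sa, sb)))
    # With unit-length random vectors the expected squared sketch
    # distance is (k / d) times the squared original distance, so the
    # sketch distance is rescaled here before comparing (approximate).
    print(true_dist, sketch_dist * math.sqrt(d / k))
```

Because every point is projected onto the same k vectors, the sketch is linear: the sketch of a difference equals the difference of the sketches, which is why distances carry over up to a common scale factor.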
... of basis vectors in Figure 9.1 (in the same order as the corresponding wavelets illustrated) are as follows: The most detailed coefficients have only one +1 and one -1, whereas the most coarse coefficient has t/2 +1 and -1 entries. Thus, in this case, we need 2^3 - 1 = 7 wavelet vectors. In addition, ...

... since different coordinates will render larger coefficients across different measures. The technique in [25] uses a dynamic programming method to determine the optimal extended wavelet decomposition. However, this method is not time and space efficient. A method in [52] provides a fast algorithm whose space requirement is linear in the size of the synopsis and logarithmic ...

... ⌈ln(1/δ)⌉ pairwise independent hash functions, each of which maps onto uniformly random integers in the range [0, e/ε], where e is the base of the natural logarithm. Thus, we maintain a total of ⌈ln(1/δ)⌉ hash tables, and there are a total of O(ln(1/δ)/ε) hash cells. This apparently provides a ...

... binary representation of that integer will have length L. The position (least significant and rightmost bit is counted as 0) of the rightmost 1-bit of the binary representation of that integer is tracked, and the largest such value is retained. This value is logarithmically related to the number ...
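The hash-table layout in the excerpts above (⌈ln(1/δ)⌉ hash functions, each mapping into roughly e/ε cells) appears to describe what is commonly called a count-min sketch. The following is a minimal sketch under that assumption. The class name `CountMinSketch`, the `seed` parameter, and the hash family `((a*x + b) mod p) mod width` are illustrative choices rather than details from the source, and items are assumed to be non-negative integers smaller than `P`.

```python
import math
import random


class CountMinSketch:
    """Approximate frequency counts using depth x width counters."""

    P = (1 << 61) - 1  # a large prime for the assumed hash family

    def __init__(self, epsilon, delta, seed=0):
        self.width = math.ceil(math.e / epsilon)       # cells per table
        self.depth = math.ceil(math.log(1.0 / delta))  # number of tables
        rng = random.Random(seed)
        # One (a, b) pair per table; ((a*x + b) mod P) mod width is a
        # standard approximately pairwise independent family (assumed).
        self.coeffs = [(rng.randrange(1, self.P), rng.randrange(0, self.P))
                       for _ in range(self.depth)]
        self.table = [[0] * self.width for _ in range(self.depth)]

    def _cell(self, row, item):
        a, b = self.coeffs[row]
        return ((a * item + b) % self.P) % self.width

    def update(self, item, count=1):
        """Add count occurrences of item (count assumed non-negative)."""
        for row in range(self.depth):
            self.table[row][self._cell(row, item)] += count

    def estimate(self, item):
        """Return the smallest counter the item maps to. With
        non-negative updates this never underestimates the true count."""
        return min(self.table[row][self._cell(row, item)]
                   for row in range(self.depth))


if __name__ == "__main__":
    cms = CountMinSketch(epsilon=0.01, delta=0.01)
    for item in [4, 4, 4, 9, 9, 15]:  # hypothetical stream
        cms.update(item)
    print(cms.depth, cms.width)  # 5 272
    print(cms.estimate(4))       # at least 3
```

Taking the minimum across tables is the design choice that limits overestimation: collisions can only inflate a counter, so the least inflated counter is the best available estimate.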

Date posted: 15/12/2013, 13:15
