... biomolecular target’s chemical data analysis.
In recent years, the trend has been to integrate chemical data with protein
and genetic data (bioinformatics data) and analyze the problem over multiple
proteins ... Graph Data Mining 601
industry has generated a wealth of protein-ligand activity data for large compound
libraries against many biomolecular targets. The data has been systematically
collected and ... Classification, 40
XML Clustering, 35, 291
XML Indexing, 4, 17
602 MANAGING AND MINING GRAPH DATA
sent interactions between drugs and targets, and then used kernel regression to
the relationship among...
... 1
2. Graph Management and Mining Applications 3
3. Summary 8
References 9
2
Graph Data Management and Mining: A Survey of Algorithms and Applications
13
Charu C. Aggarwal and Haixun Wang
1. Introduction ... Conclusions and Future Research 55
References 55
3
Graph Mining: Laws and Generators
69
Deepayan Chakrabarti, Christos Faloutsos and Mary McGlohon
1. Introduction 70
2. Graph Patterns 71
... Beijing
6. Vector Space Embeddings of Graphs via Graph Matching 235
7. Conclusions 239
References 240
8
A Survey of Algorithms for Keyword Search on Graph Data
249
Haixun...
... sizes of the second and third-largest
connected components (CC2 and CC3) stabilize. We focus
on these next-largest connected components in (c).
84
17.1 An unreduced ... Eqs.
(2.5) and (2.6) are 0.7810 and 0.5217, respectively.
492
16.3 A toy example (reproduced from 61) 496
16.4 Equivalence for Social Position 500
7.3 Graph ... superlinearly-more money it
donates, and similarly, the more donations a candidate
gets, the more average amount-per-donation is received.
Inset plots on (c) and (d) show 𝑖𝑤 and 𝑜𝑤 versus time.
Note they...
... LLC 2010
C.C. Aggarwal and H. Wang (eds.), Managing and Mining Graph Data,
Advances in Database Systems 40, DOI 10.1007/978-1-4419-6045-0_1,
In the second case, ... the web and social networks are defined on massive graphs
Natural Properties of Real Graphs and Generators. In order to understand
the various management and mining ... in the case of structured data than in the case of multi-dimensional
data. The problem of managing graph data is related to the widely studied
field of managing XML data. Where possible, we will...
... both the database and the IR communities.
A graph is a general structure that can be used to model a variety of complex
data, including relational data and XML data. Because the underlying data
assumes ... is to build a
[94], random walk kernels [81] and diffusion kernels [119]. In random walk
kernels [81], we attempt to determine the number of random walks between
the ... nodes in the graph independently and
perform random walks starting from these nodes. These random walks can be
used in...
... transition any web
page in the collection uniformly at random.
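As an illustration of a surfer who may jump to any page uniformly at random, the following is a minimal sketch of a PageRank-style power iteration. It assumes that the passage describes such a teleportation step. The function name `pagerank`, the teleport probability of 0.15, and the dictionary-of-lists representation are hypothetical choices made for this example and are not taken from the text.

```python
def pagerank(out_links, teleport=0.15, iters=100):
    """Power iteration for a random surfer with uniform teleportation.

    out_links maps each page to the list of pages it links to; every
    linked page is assumed to appear as a key (an assumption of this sketch).
    """
    pages = list(out_links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # With probability `teleport`, jump to any page uniformly at random.
        new = {p: teleport / n for p in pages}
        for p in pages:
            # A page without out-links is treated as linking to every page.
            targets = out_links[p] or pages
            share = (1.0 - teleport) * rank[p] / len(targets)
            for q in targets:
                new[q] += share
        rank = new
    return rank
```

On a directed cycle every page receives the same score, since the walk is symmetric under rotation of the cycle.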
examine the problem of community detection and change detection in a single
framework. This provides ... relationship (SAR) princi-
Let $A$ be the set of edges in the graph. Let $\pi_i$ denote the steady-state
probability of node $i$ in a random walk, and let $P = [p_{ij}]$ denote ... dissemination in the underlying
Densification: Most real networks such as the web and social networks
continue to become...
... methods, procedures and functions in the program are
nodes, and the relationships between the different methods are defined
as edges. It is also possible to define nodes for data elements and model
relationships ... graphs are created during program execution, and they
represent the invocation structure. For example, a call from one pro-
[10] R. Agrawal, A. Borgida, H.V. Jagadish. ... of simple methods.
[75] M. Fiedler, C. Borgelt. Support computation for mining frequent subgraphs
in a single graph. Workshop on Mining and Learning with Graphs
(MLG’07),...
... Query Language and Access Methods for Graph
Databases, appears as a chapter in Managing and Mining Graph Data, ed.
Charu Aggarwal, Springer, 2010.
[97] H. He, Querying and mining graph databases. ...
[175] H. Tong, C. Faloutsos, J.-Y. Pan. Fast random walk with restart and its
applications. In ICDM, pages 613–622, 2006.
[176] S. Trißl, U. Leser. Fast and practical indexing and querying ... fields and harmonic functions. ICML Conference, pages 912–919, 2003.
[159] P. R. Rao, B. Moon. PRIX: Indexing and querying...
... of the WWW, Web “clickstream” data, sales data
in retail chains, file size distributions, and phone usage data.
2.2 Small Diameters
Informal description: Travers and Milgram [80] conducted a famous ... in the graph, and sum the results to find the total
sented as a table with the schema Graph(fromnode, tonode), the code for
calculating in-degree and out-degree ... [43] conjecture that for many graphs, the neighborhood size $N_h$
graphs to random failures, and correlations found in the joint degree
distributions of the graphs....
... generators, we provide citations and a summary.
3.1 Random Graph Models
Random graphs are generated by picking nodes under some random probability
distribution and then connecting them by edges. ... Rényi in the 1960s [40, 41]. Their random graph
model was the first and the simplest model for generating a graph.
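A minimal sketch of this model in its common G(N, p) form is given below. It assumes that each unordered pair of distinct nodes is joined independently with a fixed probability `p`. The function name `erdos_renyi` and the `seed` argument are hypothetical and are introduced only for this example.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Return the edge list of an undirected G(n, p) random graph."""
    rng = random.Random(seed)
    # Visit every unordered pair of distinct nodes once and keep it
    # as an edge with probability p, independently of all other pairs.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]
```

With `p = 0` the graph has no edges, and with `p = 1` it is the complete graph on `n` nodes with `n * (n - 1) / 2` edges.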
Description and Properties. We start with 𝑁 nodes, and for every pair of
nodes, an ... point represents a node and the 𝑥 and 𝑦 coordinates are
its degree and total weight, respectively. To achieve a good fit, we bucketize
the 𝑥 axis with logarithmic binning [64], and, for each bin, we...
... random and preferential attachment: Instead of pure preferential
attachment, the endpoints of new edges are chosen according to
a linear combination of preferential attachment and uniform random ... at time
where 𝑘(𝑖) is the degree of node 𝑖. Note that since the generated network is
undirected, we do not need to distinguish between out-degrees and in-degrees.
The ... these edges is given by
$$P(\text{edge to existing vertex } v) = \frac{k(v)}{\sum_i k(i)} \tag{3.14}$$
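The selection rule in Eq. (3.14) can be sketched as degree-proportional sampling. The snippet below is an illustration only: the helper name `pick_by_degree` and the dictionary `degree`, which maps each existing vertex to its current degree $k(v)$, are assumptions of this example.

```python
import random


def pick_by_degree(degree, rng=random):
    """Pick an existing vertex v with probability k(v) / sum_i k(i)."""
    vertices = list(degree)
    weights = [degree[v] for v in vertices]
    # random.choices normalizes the weights, so each vertex is drawn
    # with probability proportional to its degree.
    return rng.choices(vertices, weights=weights, k=1)[0]
```

A vertex of degree zero is never selected under this rule, which is one reason some variants mix in a uniform random choice.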
𝑡, and 𝛼 ∈ [0, 1] is a free parameter. To rephrase the equation, in order
to choose...
...
where $d_{ij}$ is the distance between nodes $i$ and $j$, $h_j$ is some measure of the
“centrality” of node $j$, and $\alpha$ is a constant that controls ... devastating.
The recursive nature of the partitions means that we automatically
get sub-communities within existing communities (say, “RedHat” and
“Mandrake” enthusiasts ... parameters as possible.
There should be a fast parameter-fitting algorithm.
Description and properties: As an example, suppose we have a forest
which is prone...
... value of
customers. In Conference of the ACM Special Interest Group on Knowledge
Discovery and Data Mining, New York, NY, 2001. ACM Press.
[35] Sergey N. Dorogovtsev and José Fernando Mendes. ... de Wet, and Yuri Goegebeur. A goodness-of-fit
statistic for Pareto-type behaviour. Journal of Computational and Applied
Mathematics, 186(1):99–116, 2005.
Small ... it only to differentiate between exponential and sub-exponential
growth
[42] Alex Fabrikant, Elias Koutsoupias, and Christos H. Papadimitriou.
Heuristically...
... 2010
C.C. Aggarwal and H. Wang (eds.), Managing and Mining Graph Data,
Advances in Database Systems 40, DOI 10.1007/978-1-4419-6045-0_4,
125
[Figure: a graph pattern P and a graph G; node labels A, B, C, A1, B1, C1, B2]
... V1.vid = E1.vid1 AND V1.vid = E3.vid1
AND V2.vid = E1.vid2 AND V2.vid = E2.vid1
AND V3.vid = E2.vid2 AND V3.vid = E3.vid2
AND V1.vid <> V2.vid AND V1.vid <> V3.vid
AND V2.vid <> ... of terminals and nonterminals,
and a finite set of production rules. A production rule consists of a
[67] Mark E. J. Newman, Stephanie Forrest, and Justin Balthrop....