DSpace at VNU: Efficient strategies for parallel mining class association rules

Expert Systems with Applications 41 (2014) 4716–4729

Efficient strategies for parallel mining class association rules

Dang Nguyen (a), Bay Vo (b,*), Bac Le (c)

(a) University of Information Technology, Vietnam National University, Ho Chi Minh, Viet Nam
(b) Information Technology Department, Ton Duc Thang University, Ho Chi Minh, Viet Nam
(c) Department of Computer Science, University of Science, Vietnam National University, Ho Chi Minh, Viet Nam

Keywords: Associative classification; Class association rule mining; Parallel computing; Data mining; Multi-core processor

Abstract

Mining class association rules (CARs) is an essential but time-intensive task in Associative Classification (AC). A number of algorithms have been proposed to speed up the mining process. However, sequential algorithms are not efficient for mining CARs in large datasets, while existing parallel algorithms require communication and collaboration among computing nodes, which introduces a high synchronization cost. This paper addresses these drawbacks by proposing three efficient approaches for mining CARs in large datasets using parallel computing. To date, this is the first study that implements an algorithm for parallel mining of CARs on a computer with a multi-core processor architecture. The proposed parallel algorithm is theoretically proven to be faster than existing parallel algorithms. The experimental results also show that the proposed parallel algorithm outperforms a recent sequential algorithm in mining time.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Classification is a common topic in machine learning, pattern recognition, statistics, and data mining. Therefore, numerous approaches based on different strategies have been proposed for building classification models. Among these strategies, Associative Classification (AC), which uses
the associations between itemsets and class labels (called class association rules), has proven to be more accurate than traditional methods such as C4.5 (Quinlan, 1993) and ILA (Tolun & Abu-Soud, 1998; Tolun, Sever, Uludag, & Abu-Soud, 1999). The problem of classification based on class association rules is to find the complete set of CARs that satisfy the user-defined minimum support and minimum confidence thresholds from the training dataset. A subset of CARs is then selected to form the classifier. Since the problem was first introduced by Liu, Hsu, and Ma (1998), many approaches have been proposed to solve it. Examples include classification based on multiple association rules (Li, Han, & Pei, 2001), the classification model based on predictive association rules (Yin & Han, 2003), classification based on the maximum entropy (Thabtah, Cowling, & Peng, 2005), classification based on the information gain measure (Chen, Liu, Yu, Wei, & Zhang, 2006), the lazy approach to classification (Baralis, Chiusano, & Garza, 2008), the use of an equivalence class rule tree (Vo & Le, 2009), the classifier based on Galois connections between objects and rules (Liu, Liu, & Zhang, 2011), the lattice-based approach for classification (Nguyen, Vo, Hong, & Thanh, 2012), and the integration of taxonomy information into classifier construction (Cagliero & Garza, 2013).

* Corresponding author. Tel.: +84 083974186.
E-mail addresses: nguyenphamhaidang@outlook.com (D. Nguyen), vdbay@it tdt.edu.vn (B. Vo), lhbac@fit.hcmus.edu.vn (B. Le).
http://dx.doi.org/10.1016/j.eswa.2014.01.038

However, most existing algorithms for associative classification have concentrated on building an efficient and accurate classifier and have not carefully considered the runtime performance of discovering CARs in the first phase. In fact, finding all CARs is a challenging and time-consuming problem for two reasons. First, it
may be hard to find all CARs in dense datasets since a huge number of rules are generated. For example, in our experiments, some datasets induce more than 4,000,000 rules. Second, the number of candidate rules to check is very large. Assuming there are d items and k class labels in the dataset, there can be up to k × (2^d − 1) rules to consider. Very few studies, for instance (Nguyen, Vo, Hong, & Thanh, 2013; Nguyen et al., 2012; Vo & Le, 2009; Zhao, Cheng, & He, 2009), have discussed the execution time efficiency of the CAR mining process. Nevertheless, all of these algorithms have been implemented with sequential strategies. Consequently, their runtime performance has not been satisfactory on large datasets, especially recently emerged dense datasets.

Researchers have begun switching to parallel and distributed computing techniques to accelerate the computation. Two parallel algorithms for mining CARs were recently proposed on distributed memory systems (Mokeddem & Belbachir, 2010; Thakur & Ramesh, 2008). With the advent of computers with multi-core processors, more memory and computing power are available, so larger datasets can be handled in main memory at lower cost than with distributed or mainframe systems. Therefore, the present study proposes three efficient strategies for parallel mining of CARs on multi-core processor computers. The proposed approaches overcome two disadvantages of existing methods for parallel mining of CARs. They eliminate communication and collaboration among computing nodes, which introduce synchronization overhead. They also avoid data replication and do not require data transfer among processing units. As a result, the proposals significantly improve the response time compared to the sequential counterpart and existing parallel methods. The proposed parallel algorithm is theoretically proven to be more
efficient than existing parallel algorithms. The experimental results also show that the proposed parallel algorithm can achieve up to a 2.1× speedup compared to a recent sequential CAR mining algorithm.

The rest of this paper is organized as follows. In Section 2, some preliminary concepts of the class association rule problem and the multi-core processor architecture are briefly given. The benefits of parallel mining on multi-core processor computers are also discussed in that section. Work related to sequential and parallel mining of class association rules is reviewed in Section 3. Our previous sequential CAR mining algorithm is summarized in Section 4 because it forms the basic framework of our proposed parallel algorithm. The primary contributions are presented in Section 5, which describes three proposed strategies for efficiently mining classification rules in a high-performance parallel computing context. The time complexity of the proposed algorithm is analyzed in Section 6. Section 7 presents the experimental results, while conclusions and future work are discussed in Section 8.

2. Preliminary concepts

This section provides some preliminary concepts of the class association rule problem and the multi-core processor architecture. It also discusses the benefits of parallel mining on the multi-core processor architecture.

2.1. Class association rule

One of the main goals of data mining is to discover important relationships among items such that the presence of some items in a transaction is associated with the presence of other items. To achieve this purpose, Agrawal and his colleagues proposed the Apriori algorithm to find association rules in a transactional dataset (Agrawal & Srikant, 1994). An association rule has the form X →
Y, where X and Y are frequent itemsets and X ∩ Y = ∅. The problem of mining association rules is to find all association rules in a dataset having support and confidence no less than user-defined minimum support and minimum confidence thresholds. A class association rule is a special case of an association rule in which only the class attribute is considered in the rule's right-hand side (consequent). Mining class association rules is to find the set of rules which satisfy the minimum support and minimum confidence thresholds specified by end-users. Let us define the CAR problem as follows.

Let D be a dataset with n attributes {A1, A2, ..., An} and |D| records (objects), where each record has an object identifier (OID). Let C = {c1, c2, ..., ck} be a list of class labels. A specific value of an attribute Ai and a class in C are denoted by the lower-case letters aim and cj, respectively.

Definition 1. An item is described as an attribute and a specific value for that attribute, denoted by ⟨(Ai, aim)⟩, and an itemset is a set of items.

Definition 2. Let I = {⟨(A1, a11)⟩, ..., ⟨(A1, a1m1)⟩, ⟨(A2, a21)⟩, ..., ⟨(A2, a2m2)⟩, ..., ⟨(An, an1)⟩, ..., ⟨(An, anmn)⟩} be a finite set of items. Dataset D is a finite set of objects, D = {OID1, OID2, ..., OID|D|}, in which each object OIDx has the form OIDx = attr(OIDx) ∧ class(OIDx) (1 ≤ x ≤ |D|) with attr(OIDx) ⊆ I and class(OIDx) ∈ C. For example, OID1 for the dataset shown in Table 1 is {⟨(A, a1)⟩, ⟨(B, b1)⟩, ⟨(C, c1)⟩} ∧ {1}.

Definition 3. A class association rule R has the form itemset →
cj, where cj ∈ C is a class label.

Definition 4. The actual occurrence ActOcc(R) of rule R in D is the number of objects of D that match R's antecedent, i.e., ActOcc(R) = |{OID | OID ∈ D ∧ itemset ⊆ attr(OID)}|.

Definition 5. The support of rule R, denoted by Supp(R), is the number of objects of D that match R's antecedent and are labeled with R's class. Supp(R) is defined as:

\[ \mathrm{Supp}(R) = \left| \{ OID \mid OID \in D \wedge itemset \subseteq attr(OID) \wedge c_j = class(OID) \} \right| \]

Definition 6. The confidence of rule R, denoted by Conf(R), is defined as:

\[ \mathrm{Conf}(R) = \frac{\mathrm{Supp}(R)}{\mathrm{ActOcc}(R)} \]

A sample dataset is shown in Table 1. It contains three objects, three attributes (A, B, and C), and two classes (1 and 2). Consider rule R: ⟨(A, a1)⟩ → 1. We have ActOcc(R) = 2 and Supp(R) = 1, since there are two objects with A = a1, of which one (object 1) also has class 1. We also have Conf(R) = Supp(R) / ActOcc(R) = 1/2.

2.2. Multi-core processor architecture

A multi-core processor (shown in Fig. 1) is a single computing component with two or more independent central processing units (cores) in the same physical package (Andrew, 2008). Processors were originally designed with only one core. However, multi-core processors became mainstream when Intel and AMD introduced their commercial multi-core chips in 2008 (Casali & Ernst, 2013). A multi-core processor computer has different specifications from either a computer cluster (Fig. 2) or an SMP (Symmetric Multi-processor) system (Fig. 3): the memory is not distributed as in a cluster but is shared. In this respect it is similar to the SMP architecture. Many SMP systems, however, have the NUMA (Non-Uniform Memory Access) architecture: there are several memory blocks that are accessed at different speeds from each processor, depending on the distance between the memory block and the processor. In contrast, multi-core processors usually have the UMA (Uniform Memory Access) architecture. There is one

Table 1. Example of a dataset.

OID   A    B    C    Class
1     a1   b1   c1   1
2     a1   b1   c1   2
3     a2   b1   c1   2
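As a minimal sketch of the definitions of actual occurrence, support, and confidence, the following Python snippet recomputes the values of the worked example on the three-object sample dataset (Table 1). The names DATASET, matches, act_occ, supp, and conf are illustrative assumptions and do not come from the paper; the dataset is scanned directly here, whereas the algorithms discussed later use Obidset intersections instead.

```python
from fractions import Fraction

# Sample dataset (Table 1): OID -> (attribute values, class label).
DATASET = {
    1: ({"A": "a1", "B": "b1", "C": "c1"}, 1),
    2: ({"A": "a1", "B": "b1", "C": "c1"}, 2),
    3: ({"A": "a2", "B": "b1", "C": "c1"}, 2),
}


def matches(itemset, attrs):
    """True if every (attribute, value) item of the itemset occurs in attrs."""
    return all(attrs.get(attribute) == value for attribute, value in itemset)


def act_occ(itemset, dataset=DATASET):
    """ActOcc: number of objects that match the rule's antecedent."""
    return sum(1 for attrs, _ in dataset.values() if matches(itemset, attrs))


def supp(itemset, label, dataset=DATASET):
    """Supp: number of objects that match the antecedent and carry the class."""
    return sum(
        1
        for attrs, cls in dataset.values()
        if cls == label and matches(itemset, attrs)
    )


def conf(itemset, label, dataset=DATASET):
    """Conf = Supp / ActOcc; assumes the antecedent occurs at least once."""
    return Fraction(supp(itemset, label, dataset), act_occ(itemset, dataset))


# Rule R: <(A, a1)> -> 1 from the worked example.
rule_itemset = {("A", "a1")}
print(act_occ(rule_itemset), supp(rule_itemset, 1), conf(rule_itemset, 1))
# prints: 2 1 1/2
```

Exact fractions are used so that the confidence can be compared against the 1/2 of the worked example without floating-point rounding.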
memory block only, so all cores have an equal access time to the memory (Laurent, Négrevergne, Sicard, & Termier, 2012).

Fig. 1. Multi-core processor: one chip, two cores, two threads (Source: http://software.intel.com/en-us/articles/multi-core-processor-architecture-explained).

Fig. 2. Computer cluster (Source: http://en.wikipedia.org/wiki/Distributed_computing).

Fig. 3. Symmetric multi-processor system (Source: http://en.wikipedia.org/wiki/Symmetric_multiprocessing).

2.3. Parallel mining on the multi-core processor architecture

The multi-core processor architecture has many desirable properties: for example, each core has direct and equal access to all the system's memory, and the multi-core chip also allows higher performance at lower energy and cost. Therefore, numerous researchers have developed parallel algorithms on the multi-core processor architecture in the data mining literature. One of the first algorithms targeting multi-core processor computers was FP-array, proposed by Liu and his colleagues in 2007 (Liu, Li, Zhang, & Tang, 2007). The authors proposed two techniques, namely a cache-conscious FP-array and a lock-free dataset tiling parallelism mechanism, for parallel discovery of frequent itemsets on multi-core processor machines. Yu and Wu (2011) proposed an efficient load balancing strategy in order to reduce the massive number of duplicate generated candidates. Their main contribution was to enhance candidate generation in the Apriori algorithm on multi-core processor computers. Schlegel, Karnagel, Kiefer, and Lehner (2013) recently adapted the well-known Eclat algorithm to a highly parallel version which runs on a multi-core processor system. They proposed three parallel approaches for Eclat: independent class, shared class, and shared itemset. Parallel mining has also been widely adopted in many other research fields, such as closed frequent itemset mining (Negrevergne, Termier, Méhaut, & Uno, 2010), gradual pattern mining (Laurent et al., 2012), correlated pattern mining (Casali & Ernst, 2013), generic pattern mining (Negrevergne, Termier, Rousset, & Méhaut, 2013), and tree-structured data mining (Tatikonda & Parthasarathy, 2009). While much research has been devoted to developing parallel pattern mining and association rule mining algorithms for the multi-core processor architecture, no studies have been published on the parallel class association rule mining problem. Thus, this paper proposes the first algorithm for parallel mining of CARs which can be executed efficiently on the multi-core processor architecture.

3. Related work

This section begins with an overview of some sequential CAR mining algorithms and then provides details about two parallel versions.

3.1. Sequential CAR mining algorithms

The first algorithm for mining CARs was proposed by Liu et al. (1998) based on the Apriori algorithm (Agrawal & Srikant, 1994). After its introduction, several other algorithms adopted its approach, including CAAR (Xu, Han, & Min, 2004) and PCAR (Chen, Hsu, & Hsu, 2012). However, these methods are time-consuming because they generate a lot of candidates and scan the dataset several times. Another approach for mining CARs is to build the frequent pattern tree (FP-tree) (Han, Pei, & Yin, 2000) to discover rules, as in algorithms such as CMAR (Li et al., 2001) and L3 (Baralis, Chiusano, & Garza, 2004). The mining process used by the FP-tree does not generate candidate rules. However, its significant weakness is that the FP-tree does not always fit in main memory. Several algorithms, MMAC (Thabtah, Cowling, & Peng, 2004), MCAR (Thabtah et al., 2005), and MCAR (Zhao et al.,
2009), utilized the vertical layout of the dataset to improve the efficiency of the rule discovery phase by employing a method that extends the tidset intersection method of Zaki, Parthasarathy, Ogihara, and Li (1997). Vo and Le proposed another method for mining CARs using an equivalence class rule tree (ECR-tree) (Vo & Le, 2009). An efficient algorithm, called ECR-CARM, was also proposed in their paper. Two strong features of ECR-CARM are that it scans the dataset only once and that it uses the intersection of object identifiers to determine the support of itemsets quickly. However, it needs to generate and test a huge number of candidates because each node in the tree contains all values of a set of attributes. Nguyen et al. (2013) modified the ECR-tree structure to speed up the mining process. In their enhanced tree, named MECR-tree, each node contains only one value instead of the whole group. They also provided theorems to identify the support of child nodes and prune unnecessary nodes quickly. Based on the MECR-tree and these theorems, they presented the CAR-Miner algorithm for effectively mining CARs.

Many sequential CAR mining algorithms have thus been developed, but very few parallel versions have been proposed. The next section reviews two parallel CAR mining algorithms that have been mentioned in the associative classification literature.

3.2. Parallel CAR mining algorithms

One of the primary weaknesses of sequential CAR mining algorithms is that they do not scale in terms of data dimension, size, or runtime performance on large datasets. Consequently, some researchers have recently tried to apply parallelism to current sequential CAR mining algorithms to relieve the sequential bottleneck and improve the response time. Thakur and Ramesh (2008) proposed a parallel version of the CBA algorithm (Liu et al., 1998). Their algorithm was implemented on a distributed memory system and based on
data parallelism. The parallel CAR mining phase is an adaptation of the CD approach, which was originally proposed for parallel mining of frequent itemsets (Agrawal & Shafer, 1996). The training dataset was partitioned into P parts, which were processed on P processors. Each processor worked on its local data to mine CARs with the same global minimum support and minimum confidence. However, this algorithm has three major weaknesses. First, it uses static load balancing, which partitions work among processors by using a heuristic cost function. This causes a high load imbalance. Second, heavy synchronization happens at the end of each step. Finally, each site must keep a copy of the entire set of candidates. Additionally, the authors did not provide any experiments to illustrate the performance of the proposed algorithm.

Mokeddem and Belbachir (2010) proposed a distributed version of FP-Growth (Han et al., 2000) to discover CARs. Their algorithm was also deployed on a distributed memory system and based on data parallelism. Data were partitioned into P parts, which were processed on P processors to discover subsets of classification rules in parallel. Inter-processor communication was established to make global decisions. Consequently, their approach suffers from heavy synchronization among nodes. In addition, the authors did not conduct any experiments to compare their algorithm with others.

The two existing parallel algorithms for mining CARs, both deployed on distributed memory systems, have two significant problems: heavy synchronization among nodes and data replication. In this paper, a parallel CAR mining algorithm based on the multi-core processor architecture is therefore proposed to solve these problems.

4. A sequential class association rule mining algorithm

In this section, we briefly summarize our previous sequential CAR mining algorithm, as it forms the basic framework of our proposed parallel algorithm. In (Nguyen & Vo, 2014), we proposed a
tree structure to mine CARs quickly and directly. Each node in the tree contains one itemset along with:

(1) (Obidset1, Obidset2, ..., Obidsetk) – A list of Obidsets in which each Obidseti is the set of identifiers of objects that contain both the itemset and class ci. Note that k is the number of classes in the dataset.
(2) pos – A positive integer storing the position of the class with the maximum cardinality of Obidseti, i.e., pos = argmax_{i∈[1,k]} {|Obidseti|}.
(3) total – A positive integer which stores the sum of the cardinalities of all Obidseti, i.e., total = Σ_{i=1}^{k} |Obidseti|.

However, the itemset is converted to the form att × values for ease of programming, where

(1) att – A positive integer representing a list of attributes.
(2) values – A list of values, each of which belongs to one attribute in att.

For example, itemset X = {⟨(B, b1)⟩, ⟨(C, c1)⟩} is denoted as X = 6 × b1c1. A bit representation is used to store itemset attributes in order to save memory. Attributes BC can be represented as 110 in bit representation, so the value of these attributes is 6. Bitwise operations are then used to quickly join itemsets.

In Table 1, itemset X = {⟨(B, b1)⟩, ⟨(C, c1)⟩} is contained in objects 1, 2, and 3. Thus, the node which contains itemset X has the form 6 × b1c1(1, 23), in which Obidset1 = {1} (or Obidset1 = 1 for short) (i.e., object 1 contains both itemset X and class 1), Obidset2 = {2, 3} (or Obidset2 = 23 for short) (i.e., objects 2 and 3 contain both itemset X and class 2), pos = 2 (denoted by a line under Obidset2, i.e., 23), and total = 3. pos is 2 because the cardinality of Obidset2 for class 2 is the maximum (2 versus 1).

Obtaining the support and confidence of a rule then amounts to computing |Obidset_pos| and |Obidset_pos| / total, respectively. For example, node 6 × b1c1(1, 23) generates rule {⟨(B, b1)⟩, ⟨(C, c1)⟩} →
2 (i.e., if B = b1 and C = c1, then Class = 2) with Supp = |Obidset2| = |23| = 2 and Conf = 2/3.

Based on the tree structure, we also proposed a sequential algorithm for mining CARs, called Sequential-CAR-Mining, as shown in Fig. 4. Firstly, we find all frequent 1-itemsets and add them to the root node of the tree (Line 1). Secondly, we recursively discover other frequent k-itemsets based on the depth-first search strategy (procedure Sequential-CAR-Mining). Thirdly, while traversing nodes in the tree, we also generate rules which satisfy the minimum confidence threshold (procedure Generate-Rule). The pseudo code of the algorithm is shown in Fig. 4. Fig. 5 shows the tree structure generated by the sequential CAR mining algorithm for the dataset shown in Table 1. For details on the tree generation, please refer to the study by Nguyen and Vo (2014).

Input: Dataset D, minSup and minConf
Output: All CARs satisfying minSup and minConf
Procedure:
1   Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.
Sequential-CAR-Mining(Lr, minSup, minConf)
2   CARs = ∅;
3   for all lx ∈ Lr.children do
4       Generate-Rule(lx, minConf);
5       Pi = ∅;
6       for all ly ∈ Lr.children, with y > x do
7           if ly.att ≠ lx.att then // two nodes are combined only if their attributes are different
8               O.att = lx.att | ly.att; // using bitwise operation
9               O.values = lx.values ∪ ly.values;
10              O.Obidseti = lx.Obidseti ∩ ly.Obidseti; // ∀i ∈ [1, k]
11              O.pos = argmax_{i∈[1,k]} {|O.Obidseti|};
12              O.total = Σ_{i=1}^{k} |O.Obidseti|;
13              if |O.Obidset_{O.pos}| ≥ minSup then // node O satisfies minSup
14                  Pi = Pi ∪ O;
15      Sequential-CAR-Mining(Pi, minSup, minConf);
Generate-Rule(l, minConf)
16  conf = |l.Obidset_{l.pos}| / l.total;
17  if conf ≥ minConf then
18      CARs = CARs ∪ {l.itemset → c_pos (|l.Obidset_{l.pos}|, conf)};

Fig. 4. Sequential algorithm for mining CARs.

{}
    1 × a1(1, 2)
        3 × a1b1(1, 2)
            7 × a1b1c1(1, 2)
        5 × a1c1(1, 2)
    1 × a2(∅, 3)
        3 × a2b1(∅, 3)
            7 × a2b1c1(∅, 3)
        5 × a2c1(∅, 3)
    2 × b1(1, 23)
        6 × b1c1(1, 23)
    4 × c1(1, 23)

Fig. 5. Tree generated by Sequential-CAR-Mining for the dataset in Table 1 (indentation denotes the parent-child relation).

5. The proposed parallel class association rule mining algorithm

Although Sequential-CAR-Mining is an efficient algorithm for mining all CARs, its runtime performance degrades significantly on large datasets due to the computational complexity. As a result, we have tried to apply parallel computing techniques to the sequential algorithm to speed up the mining process.

Schlegel et al. (2013) recently adapted the well-known Eclat algorithm to a highly parallel version which runs on a multi-core processor system. They proposed three parallel approaches for Eclat: independent class, shared class, and shared itemset. In the "independent class" strategy, each equivalence class is distributed to a single thread, which mines its assigned class independently of other threads. This approach has an important advantage in that the synchronization cost is low. It, however, consumes much more memory than the sequential counterpart because all threads hold their entire tidsets at the same time. Additionally, this strategy often causes high load imbalance when a large number of threads are used: threads mining light classes often finish sooner than threads mining heavier classes. In the "shared class" strategy, a single class is assigned to multiple threads. This can reduce memory consumption but increases the cost of synchronization, since one thread has to communicate with others to obtain their tidsets. In the final strategy, "shared itemset", multiple threads concurrently perform the intersection of two tidsets for a new itemset. In this strategy, threads have to synchronize with each other at a high cost.

Input: Dataset D, minSup and minConf
Output: All CARs satisfying minSup and minConf
Procedure:
1   Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.
PMCAR(Lr, minSup, minConf)
2   totalCARs = CARs = ∅;
3   for all lx ∈ Lr.children do
4       Generate-Rule(CARs, lx, minConf);
5       Pi = ∅;
6       for all ly ∈ Lr.children, with y > x do
7           if ly.att ≠ lx.att then // two nodes are combined only if their attributes are different
8               O.att = lx.att | ly.att; // using bitwise operation
9               O.values = lx.values ∪ ly.values;
10              O.Obidseti = lx.Obidseti ∩ ly.Obidseti; // ∀i ∈ [1, k]
11              O.pos = argmax_{i∈[1,k]} {|O.Obidseti|};
12              O.total = Σ_{i=1}^{k} |O.Obidseti|;
13              if |O.Obidset_{O.pos}| ≥ minSup then // node O satisfies minSup
14                  Pi = Pi ∪ O;
15      Task ti = new Task(() => { Sub-PMCAR(tCARs, Pi, minSup, minConf); });
16  for each task in the list of created tasks do
17      collect the set of rules (tCARs) returned by each task;
18      totalCARs = totalCARs ∪ tCARs;
19  totalCARs = totalCARs ∪ CARs;
Sub-PMCAR(tCARs, Lr, minSup, minConf)
20  for all lx ∈ Lr.children do
21      Generate-Rule(tCARs, lx, minConf);
22      Pi = ∅;
23      for all ly ∈ Lr.children, with y > x do
24          if ly.att ≠ lx.att then // two nodes are combined only if their attributes are different
25              O.att = lx.att | ly.att; // using bitwise operation
26              O.values = lx.values ∪ ly.values;
27              O.Obidseti = lx.Obidseti ∩ ly.Obidseti; // ∀i ∈ [1, k]
28              O.pos = argmax_{i∈[1,k]} {|O.Obidseti|};
29              O.total = Σ_{i=1}^{k} |O.Obidseti|;
30              if |O.Obidset_{O.pos}| ≥ minSup then // node O satisfies minSup
31                  Pi = Pi ∪ O;
32      Sub-PMCAR(tCARs, Pi, minSup, minConf);

Fig. 6. PMCAR with independent branch strategy.

Fig. 7. Illustration of the independent branch strategy (the tree of Fig. 5 with tasks t1, t2, and t3 marked on its branches).

Basically, the proposed algorithm, Parallel Mining Class Association Rules (PMCAR), is a combination of Sequential-CAR-Mining and the parallel ideas of Schlegel et al. (2013). It has the same core steps as Sequential-CAR-Mining, in that it scans the dataset once to
obtain all frequent 1-itemsets along with their Obidsets, and it then starts recursively mining. It also adopts two parallel strategies, "independent class" and "shared class". However, PMCAR differs in the following ways. PMCAR is a parallel algorithm for mining class association rules, while the work of Schlegel et al. focuses on mining frequent itemsets only. Additionally, we propose a third parallel strategy, shared Obidset, for PMCAR. PMCAR runs on a single system with a multi-core processor, where the main memory is shared and equally accessible by all cores. Hence, PMCAR does not require synchronization among computing nodes, unlike other parallel CAR mining algorithms deployed on distributed memory systems. The main differences between PMCAR and Sequential-CAR-Mining in terms of parallel CAR mining strategies are discussed in the following sections.

5.1. Independent branch strategy

The first strategy, independent branch, distributes each branch of the tree to a single task, which mines its assigned branch independently of all other tasks to generate CARs. Generally speaking, this strategy is similar to the "independent class" strategy of Schlegel et al. (2013), except that PMCAR uses a different tree structure for the purpose of CAR mining and is implemented using tasks instead of threads. As mentioned above, this strategy has some limitations, such as high load imbalance and high memory consumption. However, its primary advantage is that each task is executed independently of other tasks without any synchronization. Our implementation is based on the parallelism model in .NET Framework 4.0. Instead of using threads, our algorithm uses tasks, which have several advantages over threads. First, a task consumes less memory than a thread. Second, while a single thread runs on a single core, tasks are designed to be aware of the multi-core processor and multiple
tasks can be executed on a single core. Finally, using threads takes considerable time because the operating system must allocate the threads' data structures, initialize and destroy them, and also perform context switches between threads. Consequently, our implementation can alleviate two problems: high memory consumption and high load imbalance.

The pseudo code of PMCAR with the independent branch strategy is shown in Fig. 6. We apply the algorithm to the sample dataset shown in Table 1 to illustrate its basic ideas. First, PMCAR finds all frequent 1-itemsets as done in Sequential-CAR-Mining (Line 1). After this step, we have Lr = {1 × a1(1, 2), 1 × a2(∅, 3), 2 × b1(1, 23), 4 × c1(1, 23)}. Second, PMCAR calls procedure PMCAR to generate frequent 2-itemsets (Lines 3–14). For example, consider node 1 × a1(1, 2). This node combines with two nodes 2 × b1(1, 23) and 4 × c1(1, 23) to generate two new nodes 3 × a1b1(1, 2) and 5 × a1c1(1, 2). Note that node 1 × a1(1, 2) does not combine with node 1 × a2(∅, 3) since they have the same attribute (attribute A), which makes the support of the new node zero according to a theorem in (Nguyen & Vo, 2014). After these steps, we have Pi = {3 × a1b1(1, 2), 5 × a1c1(1, 2)}. Then, PMCAR creates a new task ti and calls procedure Sub-PMCAR inside that task with four parameters: tCARs, Pi, minSup, and minConf. The first parameter, tCARs, stores the set of rules returned by Sub-PMCAR in a task (Line 15). For instance, task t1 is created and procedure Sub-PMCAR is executed inside t1. Procedure Sub-PMCAR is called recursively inside a task to mine all CARs (Lines 20–32). For example, task t1 also generates node 7 × a1b1c1(1, 2) and its rule. Finally, after all created tasks have mined their assigned branches, their results are collected to form the complete set of rules (Lines 16–19). In Fig. 7, three tasks t1, t2, and t3, represented by solid blocks, mine three branches a1, a2, and b1 independently and in parallel.

5.2. Shared branch strategy

The second strategy, shared branch, adopts the same ideas as
the "shared class" strategy of Schlegel et al. (2013). In this strategy, each branch is mined in parallel by multiple tasks. The pseudo code of PMCAR with the shared branch strategy is shown in Fig. 8. First, the algorithm initializes the root node Lr (Line 1). Then, the procedure PMCAR is called recursively to generate CARs. When node lx combines with node ly, the algorithm creates a new task ti and performs the combination code inside that task (Lines 7–17). Note that because multiple tasks concurrently mine the same branch, synchronization is needed to collect the necessary information for the new node (Line 18). Additionally, to avoid a data race (i.e., two or more tasks performing operations that update a shared piece of data) (Netzer & Miller, 1989), we use a lock object to coordinate the tasks' access to the shared data Pi (Lines 15 and 16).

We also apply the algorithm to the dataset in Table 1 to demonstrate how it works. As an example, consider node 1 × a1(1, 2). The algorithm creates task t1 to combine node 1 × a1(1, 2) with node 2 × b1(1, 23) to generate node 3 × a1b1(1, 2); in parallel, it creates task t2 to combine node 1 × a1(1, 2) with node 4 × c1(1, 23) to generate node 5 × a1c1(1, 2). However, before the algorithm can create task t3 to generate node 7 × a1b1c1(1, 2), it has to wait until tasks t1 and t2 finish their work. Therefore, this strategy is slower than the first one in execution time. In Fig. 9, three tasks t1, t2, and t3 mine the same branch a1 in parallel.

Input: Dataset D, minSup and minConf
Output: All CARs satisfying minSup and minConf
Procedure:
1   Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.
PMCAR(Lr, minSup, minConf)
2   CARs = ∅;
3   for all lx ∈ Lr.children do
4       Generate-Rule(lx, minConf);
5       Pi = ∅;
6       for all ly ∈ Lr.children, with y > x do
7           Task ti = new Task(() => {
8               if ly.att ≠ lx.att then
9                   O.att = lx.att | ly.att; // using bitwise operation
10                  O.values = lx.values ∪ ly.values;
11                  O.Obidseti = lx.Obidseti ∩ ly.Obidseti; // ∀i ∈ [1, k]
12                  O.pos = argmax_{i∈[1,k]} {|O.Obidseti|};
13                  O.total = Σ_{i=1}^{k} |O.Obidseti|;
14                  if |O.Obidset_{O.pos}| ≥ minSup then // node O satisfies minSup
15                      lock(lockObject)
16                          Pi = Pi ∪ O;
17          });
18      Task.WaitAll(ti);
19      PMCAR(Pi, minSup, minConf);

Fig. 8. PMCAR with shared branch strategy.

Fig. 9. Illustration of the shared branch strategy (the tree of Fig. 5 with tasks t1, t2, and t3 marked on branch a1).

5.3. Shared Obidset strategy

The third strategy, shared Obidset, is different from the "shared itemset" strategy discussed in Schlegel et al. (2013). Each task has a different branch assigned, and its child tasks together process a node in the branch. The pseudo code of PMCAR with the shared Obidset strategy is shown in Fig. 10. The algorithm first finds all frequent 1-itemsets and adds them to the root node (Line 1). It then calls procedure PMCAR to generate frequent 2-itemsets (Lines 2–14). For each branch of the tree, it creates a task and calls procedure Sub-PMCAR inside that task (Line 15). Sub-PMCAR is called recursively to generate frequent k-itemsets (k > 2) and their rules (Lines 20–34). The functions of procedures PMCAR and Sub-PMCAR resemble those in PMCAR with the independent branch strategy. However, this algorithm provides a more complicated parallel strategy. In Sub-PMCAR, the algorithm creates a list of child tasks to intersect the Obidseti of two nodes in parallel (Lines 27–28). This makes the work distribution as fine-grained as possible. Nevertheless, all child tasks have to finish their work before the two properties pos and total are calculated for the new node (Lines 29–31). Consequently, there is a high cost of synchronization among child tasks and between child tasks and their parent task. Let us illustrate the basic ideas of the shared Obidset strategy with Fig. 11. Branch a1 is assigned
to task t1 In procedure Sub-PMCAR, tasks t2 and t3 which are child tasks of t1 process together node Â a1b1(1, 2), i.e., tasks t2 and t3 parallel intersect Obidset1 and Obidset2 of two nodes Â a1b1(1, 2) and Â a1c1(1, 2), respectively However, task t2 must wait till task t3 finishing the intersection of two Obidset2 to obtain Obidset1 and Obidset2 of the new node Â a1b1c1(1, 2) Additionally, parent task t1 represented by the solid block must wait till all tasks t2, t3, and other child tasks finishing their work Time complexity analysis In this section, we analyze the time complexities of both sequential and proposed parallel CAR mining algorithms We then derive the speedup of the parallel algorithm We also compare the time complexity of our parallel algorithm with those of existing parallel algorithms 4724 D Nguyen et al / Expert Systems with Applications 41 (2014) 4716–4729 Input: Dataset D, minSup and minConf Output: All CARs satisfying minSup and minConf Procedure: Let Lr be the root node of the tree Lr includes a set of nodes in which each node contains a frequent 1-itemset PMCAR( Lr , minSup, minConf) totalCARs=CARs= ∅ ; for all lx ∈ Lr children Generate-Rule(CARs, lx , minConf); Pi = ∅ ; for all l y ∈ Lr children , with y > x if l y att ≠ lx att then // two nodes are combined only if their attributes are different O.att = l x att | l y att ; // using bitwise operation O.values = l x values ∪ l y values ; 10 O.Obidseti = l x Obidseti ∩ l y Obidseti ; // ∀i ∈ [1, k ] 11 O pos = argmax i∈[1,k ] { O.Obidseti } ; 12 O.total = ∑ O.Obidseti ; k i =1 if O.ObidsetO pos ≥ minSup then // node O satisfies minSup 13 Pi = Pi ∪ O ; 14 15 Task ti = new Task(() => { Sub-PMCAR(tCARs, Pi , minSup, minConf); }); 16 for each task in the list of created tasks 17 collect the set of rules ( tCARs ) returned by each task; 18 totalCARs = totalCARs ∪ tCARs ; 19 totalCARs = totalCARs ∪ CARs ; Sub-PMCAR(tCARs, Lr , minSup, minConf) 20 for all lx ∈ Lr children 21 Generate-Rule(tCARs, lx , 
minConf); 22 Pi = ∅ ; 23 for all l y ∈ Lr children , with y > x 24 if l y att ≠ lx att then // two nodes are combined only if their attributes are different 25 O.att = l x att | l y att ; // using bitwise operation 26 O.values = l x values ∪ l y values ; 27 for i = to k // k is the number of classes 28 Task childi = new Task(() => { O.Obidseti = l x Obidseti ∩ l y Obidseti ; }); 29 Task.WaitAll( childi ); 30 O pos = argmax i∈[1,k ] { O.Obidseti } ; 31 O.total = ∑ O.Obidseti ; 32 if O.ObidsetO pos ≥ minSup then // node O satisfies minSup k i =1 33 34 Pi = Pi ∪ O ; Sub-PMCAR(tCARs, Pi , minSup, minConf); Fig 10 PMCAR with shared Obidset strategy We can see that the sequential CAR mining algorithm described in Section scans the dataset once and uses a main loop to mine all CARs Based on the cost model in Skillicorn (1999), the time complexity of this algorithm is: 4725 D Nguyen et al / Expert Systems with Applications 41 (2014) 4716–4729 {} 1× a1(1, ) 1× a ( ∅,3) × b1(1, 23) t2,t3 × a1b1(1, ) × a1c1(1, ) × a 2b1( ∅,3) × a 2c1( ∅,3) × b1c1(1, 23) × a1b1c1(1, ) × a 2b1c1( ∅,3) t1 × c1(1, 23) Fig 11 Illustration of shared Obidset strategy T S ¼ kS Â m ỵ a where TS is the execution time of the sequential CAR mining algorithm, kS is the number of iterations in the main loop, m is the execution time of generating nodes and rules in each iteration, and a is the execution time of accessing dataset The proposed parallel algorithm distributes node and rule generations to multiple tasks executed on multi-cores Thus, the exem cution time of generating nodes and rules in each iteration is tÂc , where t is the number of tasks and c is the number of cores The time complexity of the parallel algorithm is: m T P ẳ kP ỵa tc where TP is the execution time of the proposed parallel CAR mining algorithm, kP is the number of iterations in the main loop The speedup is thus: Sp ¼ TS kS Â m ỵ a ẳ m T P kP tc þa In our experiments, the execution time of the sequential code (for example, the 
code to scan the dataset) is very small In addition, the number of iterations in the main loop in both sequential and parallel algorithms is similar Therefore, the speedup equation can be simplified as follows: Sp ẳ kS m ỵ a kS m m % m m % m ¼ tÂc kP Â tÂc þ a kP Â tÂc tÂc Thus, we can achieve up to a t Â c speedup over the sequential algorithm Now we analyze the time complexity of the parallel CBA algorithm proposed in Thakur and Ramesh (2008) Since this algorithm is based on the Apriori algorithm, it must scan the dataset many times Additionally, this algorithm was employed on a distributed memory system which means that it needs an additional computation time for communication and information exchange among nodes Consequently, the time complexity of this algorithm is: T C ẳ kC m ỵaỵd p where TC is the execution time of the parallel CBA algorithm, kC is the number of iterations required by the parallel CBA algorithm, p is the number of processors, and d is the execution time for communication and data exchange among computing nodes Assume that kP % kC and t Â c % p We have: m T C ¼ kC ỵ a ỵ kC 1ị a þ kC Â d p where TF is the execution time of the parallel FP-Growth algorithm, kF is the number of iterations required by the parallel FP-Growth algorithm The parallel FP-Growth scans the dataset once and then partitions it into P parts regarding the number of processors Each processor scans its local data partition to count the local support of each item Therefore, the execution time of accessing the dataset in this algorithm is only a However, computing nodes need to broadcast the local support of each item across the group so that each processor can calculate the global count Thus, this algorithm also needs an additional computation time d for data transfer Assume that kP % kF and t Â c % p We have: TF ¼ kF m ỵ a ỵ kF d % T P ỵ kF d p It can conclude that our proposed parallel algorithm is also faster than the parallel FP-Growth algorithm in theory and TP < TF < TC 
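The cost model of this section can be checked numerically. The following Python sketch evaluates the four execution-time expressions, taking T_C = k_C × (m/p + a + d) for the parallel CBA algorithm and T_F = k_F × (m/p + d) + a for the parallel FP-Growth algorithm. The function names and every numeric input are illustrative assumptions chosen for this example; they are not measurements from the experiments, and Python is used only for brevity (the authors' implementation is in C#).

```python
# Numeric sketch of the cost model; all inputs are assumed, illustrative values.

def t_sequential(k_s, m, a):
    """T_S = k_S * m + a: one dataset scan plus k_S iterations of cost m."""
    return k_s * m + a


def t_parallel(k_p, m, a, t, c):
    """T_P = k_P * m / (t * c) + a: per-iteration work spread over t tasks on c cores."""
    return k_p * m / (t * c) + a


def t_parallel_cba(k_c, m, a, d, p):
    """T_C = k_C * (m / p + a + d): dataset access and communication in every iteration."""
    return k_c * (m / p + a + d)


def t_parallel_fpgrowth(k_f, m, a, d, p):
    """T_F = k_F * (m / p + d) + a: a single dataset scan, communication in every iteration."""
    return k_f * (m / p + d) + a


# Assumed inputs: 10 iterations, m = 100 time units per iteration,
# a = 1 unit for dataset access, d = 0.5 units of communication per iteration.
k, m, a, d = 10, 100.0, 1.0, 0.5
t, c = 2, 4          # assumed number of tasks and cores
p = t * c            # the analysis assumes t * c is approximately p

ts = t_sequential(k, m, a)              # 1001.0
tp = t_parallel(k, m, a, t, c)          # 126.0
tc = t_parallel_cba(k, m, a, d, p)      # 140.0
tf = t_parallel_fpgrowth(k, m, a, d, p) # 131.0
speedup = ts / tp                       # about 7.94, just under t * c = 8

print(f"T_S={ts}, T_P={tp}, T_F={tf}, T_C={tc}, speedup={speedup:.2f}")
```

With these assumed values the speedup stays just below t × c because the dataset access time a is not parallelized, and the ordering T_P < T_F < T_C holds because the two distributed algorithms pay the communication cost d in every iteration.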
Under the assumptions $k_P \approx k_C$ and $t \times c \approx p$, the execution time of the parallel CBA algorithm satisfies $T_C \approx T_P + (k_C - 1) \times a + k_C \times d$. Obviously, $T_P < T_C$, which implies that our proposed algorithm is faster than the parallel version of CBA in theory. Similarly, the time complexity of the parallel FP-Growth algorithm proposed in Mokeddem and Belbachir (2010) is as follows:

$$T_F = k_F \times \left(\frac{m}{p} + d\right) + a$$

7 Experimental results

This section provides the results of our experiments, including the testing environment, the results of the scalability experiments of the three proposed parallel strategies, and the performance of the proposed parallel algorithm as the number of objects and attributes varies. It finally compares the execution time of PMCAR with that of the recent sequential CAR mining algorithm, CAR-Miner (Nguyen et al., 2013).

7.1 Testing environment

All experiments were conducted on a multi-core computer with one Intel i7-2600 processor. The processor is multi-core, has an L3 cache, runs at a core frequency of 3.4 GHz, and supports Hyper-Threading. The computer runs Windows Enterprise (64-bit) SP1. The algorithms were coded in C# using MS Visual Studio .NET 2010 Express. The parallel algorithm was implemented on the parallelism model supported in Microsoft .NET Framework 4.0 (version 4.0.30319).

The experimental datasets were obtained from the University of California Irvine (UCI) Machine Learning Repository (http://mlearn.ics.uci.edu) and the Frequent Itemset Mining (FIM) Dataset Repository (http://fimi.ua.ac.be/data/). The four datasets used in the experiments are Poker-hand, Chess, Connect-4, and Pumsb, with the characteristics shown in the following table. The table shows the number of attributes (including the class attribute), the number of class labels, the number of distinctive values (i.e., the total number of distinct values in all attributes), and the number of objects (or records) in each dataset. The Chess, Connect-4, and Pumsb datasets are dense and have many attributes, whereas the Poker-hand dataset is sparse and has few attributes.

Table: Characteristics of the experimental datasets.
Dataset | # Attributes | # Classes | # Distinctive values | # Objects
Poker-hand | 11 | 10 | 95 | 1,000,000
Chess | 37 | | 76 | 3196
Connect-4 | 43 | | 130 | 67,557
Pumsb | 74 | | 2113 | 49,046

Table: Characteristics of the synthetic datasets.
Dataset | # Attributes | # Classes | Density (%) | # Objects | File size (KB)
C50R100KD55 | 50 | 2 | 55 | 100,000 | 9961
C50R200KD55 | 50 | 2 | 55 | 200,000 | 19,992
C50R300KD55 | 50 | 2 | 55 | 300,000 | 29,883
C50R400KD55 | 50 | 2 | 55 | 400,000 | 39,844
C50R500KD55 | 50 | 2 | 55 | 500,000 | 49,805
C10R500KD55 | 10 | 2 | 55 | 500,000 | 10,743
C20R500KD55 | 20 | 2 | 55 | 500,000 | 20,508
C30R500KD55 | 30 | 2 | 55 | 500,000 | 30,247
C40R500KD55 | 40 | 2 | 55 | 500,000 | 40,040

Fig. 12. Speedup performance of PMCAR with two parallel strategies (speedup versus number of cores for CAR-Miner, PMCAR-Shared Branch, and PMCAR-Independent Branch): (a) Poker-hand dataset (minSup = 0.01%); (b) Chess dataset (minSup = 30%); (c) Connect-4 dataset (minSup = 80%); (d) Pumsb dataset (minSup = 70%).

Fig. 13. Performance comparison between PMCAR and CAR-Miner with variation on the number of objects (runtime in seconds and # CARs). Other parameters are set to: # Attributes = 50, Density = 55%, and minSup = 70%.

Fig. 14. Performance comparison between PMCAR and CAR-Miner with variation on the number of attributes (runtime in seconds and # CARs). Other parameters are set to: # Objects = 500 K, Density = 55%, and minSup = 50%.

7.2 Scalability experiments

We evaluated the scalability of PMCAR by running it on the computer that had been configured to utilize a different number of cores. The configuration was adjusted in the BIOS Setup. The number of enabled cores was changed in turn, starting from 1 and 2 core(s). The performance of PMCAR and CAR-Miner was compared. We observed that the performance of CAR-Miner was nearly identical regardless of the number of cores the computer utilized. It can be said that sequential algorithms cannot take advantage of the multi-core processor architecture. On the contrary, PMCAR scaled much better than CAR-Miner when the number of running cores was increased. In the experiments, we used the runtime of CAR-Miner as the baseline for obtaining the speedups. Fig. 12(a)–(d) illustrates the speedup performance of PMCAR with two parallel strategies for the Poker-hand, Chess, Connect-4, and Pumsb datasets, respectively. Note that minConf = 50% was used for all experiments.

Fig. 15. Comparative results between PMCAR and CAR-Miner for the Poker-hand dataset with various minSup values: (a) # CARs produced; (b) runtime (in seconds) for PMCAR and CAR-Miner.

Obviously, PMCAR is slower than the sequential algorithm CAR-Miner when they are executed on a single core because task
is processor-intensive. This situation is known as processor oversubscription. However, when the number of cores in use is increased, PMCAR is much faster than CAR-Miner. As shown in Fig. 12(c) for the Connect-4 dataset, PMCAR with the independent branch and shared branch strategies achieves speedups of up to 2.1× and 1.4×, respectively. Interestingly, the shared branch strategy is not beneficial for the Chess dataset. Fig. 12(b) shows that PMCAR with shared branch is always slower than the sequential CAR-Miner. As discussed before, the shared branch strategy incurs a high synchronization cost between tasks. As a result, the huge number of tasks (4,253,728 tasks) generated for the Chess dataset with minSup = 30% significantly reduces the runtime performance. We also conducted the scalability experiments for the shared Obidset strategy. It did not, however, obtain good scalability results because of the high cost of synchronization among child tasks and between child tasks and their parent task. Therefore, we did not show its performance on the charts.

7.3 Influence of the number of dimensions and the size of dataset

To obtain a clear understanding of how PMCAR is affected by the dataset dimension and size, we conducted experiments on synthetic datasets with varying numbers of attributes and objects. Based on the ideas from Coenen (2007), we developed a tool for generating a synthetic dataset. Firstly, we fixed the other parameters of the dataset generator as follows: (1) the number of attributes is 50; (2) the density is 55%. We then generated test datasets with the number of objects ranging between 100,000 and 500,000. Secondly, we fixed the number of objects and the density at 500,000 and 55%, respectively. We then generated datasets with the number of attributes ranging between 10 and 40. The details of the synthetic datasets are listed in the table of synthetic datasets.

Fig. 13 illustrates the performance results with respect to the number of objects in the dataset. As shown, PMCAR achieves a good result with respect to the dataset size compared to CAR-Miner. For example, when the dataset size reaches 500 K, PMCAR with the shared branch strategy is up to 1.6× faster than CAR-Miner. However, the two other strategies, independent branch and shared Obidset, failed to execute at the dataset size of 500 K due to the memory leak. This problem happens because each task in these strategies holds an entire branch of the tree, which consumes a very large amount of memory on dense datasets.

Fig. 14 demonstrates the performance results with respect to the number of attributes in the dataset. Again, PMCAR achieves a good result with respect to the dataset dimension compared to the sequential algorithm. For instance, the execution time of PMCAR with shared branch was only 1,003.694 s while that of CAR-Miner was 1,572.754 s when the number of dimensions was 40. However, the two strategies independent branch and shared Obidset failed to execute at the dataset dimension of 40 due to the memory leak.

7.4 Comparison with sequential algorithms

In this section, we compare the execution time of PMCAR with that of the sequential algorithm CAR-Miner. These experiments aim to show that PMCAR is competitive with the existing algorithm. Figs. 15–18 show the number of generated CARs and the execution times of PMCAR and CAR-Miner for the Poker-hand, Chess, Connect-4, and Pumsb datasets with various minSup values on the computer configured to utilize multiple cores with Hyper-Threading enabled. It can be observed that CAR-Miner performs badly except for the Chess dataset. It is slower than PMCAR because it cannot utilize the computing power of the multi-core processor. On the contrary, PMCAR is optimized for mining the dataset in parallel; thus its performance is superior to that of CAR-Miner. PMCAR with the independent branch strategy is always the fastest of all tested algorithms. For example, consider the Connect-4 dataset with minSup = 65%. Independent branch consumed only 1,776.892 s to finish its work while shared Obidset, shared branch, and CAR-Miner
consumed 1,924.081 s, 2,477.279 s, and 2,772.470 s, respectively. The runtime performance of shared Obidset was the worst on the Poker-hand, Chess, and Pumsb datasets. Thus, we did not show it on the charts.

Note to Section 7.4: Executable files of the CAR-Miner and PMCAR algorithms and the experimental datasets can be downloaded from http://goo.gl/hIrDtl

Fig. 16. Comparative results between PMCAR and CAR-Miner for the Chess dataset with various minSup values: (a) # CARs produced; (b) runtime (in seconds) for PMCAR and CAR-Miner.

Fig. 17. Comparative results between PMCAR and CAR-Miner for the Connect-4 dataset with various minSup values: (a) # CARs produced; (b) runtime (in seconds) for PMCAR and CAR-Miner.

Fig. 18. Comparative results between PMCAR and CAR-Miner for the Pumsb dataset with various minSup values: (a) # CARs produced; (b) runtime (in seconds) for PMCAR and CAR-Miner.

8 Conclusions and future work

In this paper, we have proposed three strategies for parallel mining of class association rules on the multi-core processor architecture. Unlike sequential CAR mining algorithms, our parallel algorithm distributes the process of generating frequent itemsets and rules to multiple tasks executed on multiple cores. The framework of the proposed method is based on our previous sequential CAR mining method and three parallel strategies: independent branch, shared branch, and shared Obidset. The time complexities of both the sequential and parallel CAR mining algorithms have been analyzed, with results showing the good effect of the proposed algorithm. In theory, a speedup of up to t × c can be achieved. We have also shown theoretically that our parallel CAR mining algorithm is faster than existing parallel CAR mining algorithms. Additionally, a series of experiments has been conducted on both real and synthetic datasets. The experimental results have also shown that the three proposed parallel methods are competitive with the sequential CAR mining method. However, the first and third strategies currently consume more memory than the sequential counterpart, which makes them unable to cope with very dense datasets. Thus, we will research how to reduce the memory consumption of these strategies in the future. We will also investigate the applicability of the proposed methods on other platforms such as multiple graphic processors or clouds.

Acknowledgements

This work was funded by Vietnam's National Foundation for Science and Technology Development (NAFOSTED) under Grant No. 102.01-2012.17.

References

Agrawal, R., & Shafer, J. (1996). Parallel mining of association rules. IEEE Transactions on Knowledge and Data Engineering, 8, 962–969.
Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules in large databases. In The 20th International Conference on Very Large Data Bases (pp. 487–499). Morgan Kaufmann Publishers Inc.
Andrew, B. (2008). Multi-Core Processor Architecture Explained. In http://software.intel.com/en-us/articles/multi-core-processor-architectureexplained: Intel.
Baralis, E., Chiusano, S., & Garza, P. (2004). On support thresholds in associative classification. In The 2004 ACM Symposium on Applied Computing (pp. 553–558). ACM.
Baralis, E., Chiusano, S., & Garza, P. (2008). A lazy approach to associative classification. IEEE Transactions on Knowledge and Data Engineering, 20, 156–171.
Cagliero, L., & Garza, P. (2013). Improving classification models with taxonomy information. Data & Knowledge Engineering, 86, 85–101.
Casali, A., & Ernst, C. (2013). Extracting correlated patterns on multicore architectures. Availability, Reliability, and Security in Information Systems and HCI (Vol. 8127, pp. 118–133). Springer.
Chen, W.-C., Hsu, C.-C., & Hsu, J.-N. (2012). Adjusting and generalizing CBA algorithm to handling class imbalance. Expert Systems with Applications, 39, 5907–5919.
Chen, G., Liu, H., Yu, L., Wei, Q., & Zhang, X. (2006). A new approach to classification based on association rule mining. Decision Support Systems, 42, 674–689.
Coenen, F. (2007). Test set generator (version 3.2). In http://cgi.csc.liv.ac.uk/~frans/KDD/Software/LUCS-KDD-DataGen/generator.html
Han, J., Pei, J., & Yin, Y. (2000). Mining frequent patterns without candidate generation. ACM SIGMOD Record (Vol. 29, pp. 1–12). ACM.
Laurent, A., Négrevergne, B., Sicard, N., & Termier, A. (2012). Efficient parallel mining of gradual patterns on multicore processors. Advances in Knowledge Discovery and Management (Vol. 398, pp. 137–151). Springer.
Li, W., Han, J., & Pei, J. (2001). CMAR: Accurate and efficient classification based on multiple class-association rules. In IEEE International Conference on Data Mining (ICDM 2001) (pp. 369–376). IEEE.
Liu, B., Hsu, W., & Ma, Y. (1998). Integrating classification and association rule mining. In The 4th International Conference on Knowledge Discovery and Data Mining (KDD 1998) (pp. 80–86).
Liu, L., Li, E., Zhang, Y., & Tang, Z. (2007). Optimization of frequent itemset mining on multiple-core processor. In The 33rd International Conference on Very Large Data Bases (pp. 1275–1285). VLDB Endowment.
Liu, H., Liu, L., & Zhang, H. (2011). A fast pruning redundant rule method using Galois connection. Applied Soft Computing, 11, 130–137.
Mokeddem, D., & Belbachir, H. (2010). A distributed associative classification algorithm. Intelligent Distributed Computing IV (Vol. 315, pp. 109–118). Springer.
Negrevergne, B., Termier, A., Méhaut, J.-F., & Uno, T. (2010). Discovering closed frequent itemsets on multicore: Parallelizing computations and optimizing memory accesses. In International Conference on High Performance Computing and Simulation (HPCS 2010) (pp. 521–528). IEEE.
Negrevergne, B., Termier, A., Rousset, M.-C., & Méhaut, J.-F. (2013). Para Miner: a generic pattern mining algorithm for multi-core architectures. Data Mining and Knowledge Discovery, 1–41.
Netzer, R., & Miller, B. (1989). Detecting data races in parallel program executions. University of Wisconsin-Madison.
Nguyen, D., & Vo, B. (2014). Mining class-association rules with constraints. Knowledge and Systems Engineering (Vol. 245, pp. 307–318). Springer.
Nguyen, L. T., Vo, B., Hong, T.-P., & Thanh, H. C. (2012). Classification based on association rules: A lattice-based approach. Expert Systems with Applications, 39, 11357–11366.
Nguyen, L. T., Vo, B., Hong, T.-P., & Thanh, H. C. (2013). CAR-Miner: An efficient algorithm for mining class-association rules. Expert Systems with Applications, 40, 2305–2311.
Quinlan, J. R. (1993). C4.5: programs for machine learning. Morgan Kaufmann Publishers Inc.
Schlegel, B., Karnagel, T., Kiefer, T., & Lehner, W. (2013). Scalable frequent itemset mining on many-core processors. In The 9th International Workshop on Data Management on New Hardware. ACM. Article No.
Skillicorn, D. (1999). Strategies for parallel data mining. IEEE Concurrency, 7, 26–35.
Tatikonda, S., & Parthasarathy, S. (2009). Mining tree-structured data on multicore systems. Proceedings of the VLDB Endowment, 2, 694–705.
Thabtah, F., Cowling, P., & Peng, Y. (2004). MMAC: A new multi-class, multi-label associative classification approach. In The 4th IEEE International Conference on Data Mining (ICDM 2004) (pp. 217–224). IEEE.
Thabtah, F., Cowling, P., & Peng, Y. (2005). MCAR: multi-class classification based on association rule. In The 3rd ACS/IEEE international conference on computer systems and applications (pp. 33–39). IEEE.
Thakur, G., & Ramesh, C. J. (2008). A framework for fast classification algorithms. International Journal Information Theories and Applications, 15, 363–369.
Tolun, M., & Abu-Soud, S. (1998). ILA: An inductive learning algorithm for rule extraction. Expert Systems with Applications, 14, 361–370.
Tolun, M., Sever, H., Uludag, M., & Abu-Soud, S. (1999). ILA-2: An inductive learning algorithm for knowledge discovery. Cybernetics & Systems, 30, 609–628.
Vo, B., & Le, B. (2009). A novel classification algorithm based on association rules mining. Knowledge Acquisition: Approaches, Algorithms and Applications (Vol. 5465, pp. 61–75). Springer.
Xu, X., Han, G., & Min, H. (2004). A novel algorithm for associative classification of image blocks. In The 4th International Conference on Computer and Information Technology (CIT 2004) (pp. 46–51). IEEE.
Yin, X., & Han, J. (2003). CPAR: Classification based on predictive association rules. The 3rd SIAM International Conference on Data Mining (SDM 2003) (Vol. 3, pp. 331–335). SIAM.
Yu, K.-M., & Wu, S.-H. (2011). An efficient load balancing multi-core frequent patterns mining algorithm. In The IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom 2011) (pp. 1408–1412). IEEE.
Zaki, M., Parthasarathy, S., Ogihara, M., & Li, W. (1997). New algorithms for fast discovery of association rules. In The 3rd international conference on knowledge discovery and data mining (Vol. 20, pp. 283–286).
Zhao, M., Cheng, X., & He, Q. (2009). An algorithm of mining class association rules. Advances in Computation and Intelligence (Vol. 5821, pp. 269–275). Springer.

Table of contents

Efficient strategies for parallel mining class association rules

1 Introduction

2 Preliminary concepts

2.1 Class association rule

2.2 Multi-core processor architecture

2.3 Parallel mining on the multi-core processor architecture

3 Related work

3.1 Sequential CAR mining algorithms

3.2 Parallel CAR mining algorithms

4 A sequential class association rule mining algorithm

5 The proposed parallel class association rule mining algorithm

5.1 Independent branch strategy

5.2 Shared branch strategy

5.3 Shared Obidset strategy

6 Time complexity analysis

7 Experimental results

7.1 Testing environment

7.2 Scalability experiments

7.3 Influence of the number of dimensions and the size of dataset

7.4 Comparison with sequential algorithms

8 Conclusions and future work

Acknowledgements

References
