An Incremental Learning Algorithm Based on Support Vector Domain Classifier

Yinggang Zhao, Qinming He
College of Computer Science, Zhejiang University, Hangzhou 310027, China
Email: ygzl29@163.com

Proc. 5th IEEE Int. Conf. on Cognitive Informatics (ICCI'06), Y.Y. Yao, Z.Z. Shi, Y. Wang, and W. Kinsner (Eds.), 1-4244-0475-4/06/$20.00 (c) 2006 IEEE.

Abstract

Incremental learning techniques are commonly used to solve large-scale problems. We first present a modified support vector machine (SVM) classification method, the support vector domain classifier (SVDC), and then propose an incremental learning algorithm based on SVDC. The basic idea of the incremental algorithm is to obtain initial target concepts using SVDC during the training procedure and then to update these concepts through an updating model. Unlike existing incremental learning approaches, the model updating procedure in our algorithm amounts to solving a quadratic programming (QP) problem, and the updated model retains the sparsity of the solution. Moreover, compared with other existing incremental learning algorithms, the inverse procedure of our algorithm (i.e., decremental learning) is easy to carry out without extra computation. Experimental results show that our algorithm is effective and feasible.

Keywords: Support Vector Machines, Support Vector Domain Classifier, Incremental Learning, Classification.

1. INTRODUCTION

With large amounts of data available to the machine learning community, the need to design techniques that scale well is more critical than ever. Because some data are collected over long periods, there is also a continuous need to incorporate new data into a previously learned concept. Incremental learning techniques satisfy the need for both scalability and incremental updating.

The support vector machine (SVM) is based on statistical learning theory, which has developed over the last three decades [1,2]. It has proven very successful in many applications [3,4,5,6]. SVM is a supervised binary classifier, so the categories of the training samples must be known. In many cases, however, it is rare to obtain data whose categories are known; in other words, the categories of most of the collected data are unknown. In this situation the traditional SVM is not appropriate.

Tax et al. proposed a method for data domain description called support vector domain description (SVDD) [7], which is used to describe a data domain and remove outliers. The key idea of SVDD is to describe one class of data by finding the sphere of minimum volume that contains that class.

The SVDD algorithm suggests the following: when classifying a binary-class dataset in which only part of the samples are labeled (for example, the samples with category label $y_i = +1$) and the categories of the remaining samples are unknown, we can design a new type of classifier based on SVDD, which we name the support vector domain classifier (SVDC). This new classifier only needs to describe the data of known category, thereby obtaining the description boundary of that class; the unlabeled binary-class data can then be classified according to the obtained boundary.

The incremental learning algorithm in this paper is based on SVDC and is motivated by the way people learn. When learning a complicated concept, a person usually forms an initial concept from part of the available information and then updates it as new information arrives. Accordingly, our incremental algorithm first uses part of the data (as much as memory permits) to obtain a concept, namely the parameters of the decision hypersurface, via the SVDC learning algorithm; in each subsequent incremental step it updates the parameters of the decision hypersurface obtained in the previous step with a specialized updating model, i.e., it updates the known concept.

Our algorithm has the following characteristics:
1) The incremental updating model has a mathematical form similar to that of the standard SVDC, so any algorithm used to solve the standard SVDC can also be used to solve the updating model;
2) The inverse procedure of the algorithm, i.e., the decremental learning procedure, is easy to implement: when we observe that generalization performance has dropped during incremental learning, we can easily return to the previous step without extra computation.
The experimental results show that the learning performance of this algorithm approaches that of batch training, and that it performs well on large-scale datasets compared with other SVDC incremental learning algorithms.

The rest of this paper is organized as follows. Section 2 gives an introduction to SVDC, and Section 3 presents our incremental algorithm. Experiments and results concerning the proposed algorithm are reported in Section 4. Section 5 collects the main conclusions.

2. Support Vector Domain Classifier

2.1 Support Vector Domain Description [7]

Given a data set containing $N$ data objects, $\{x_i,\; i = 1, \ldots, N\}$, a description is required. We try to find a closed and compact sphere region $\Omega$ of minimum volume that contains all (or most of) the target objects, with the outliers falling outside $\Omega$. Figure 1 shows a sketch of support vector domain description (SVDD).

[Fig. 1. SVDD classifier: the target objects are enclosed by the classification boundary, the support vectors lie on the boundary, and the outliers fall outside.]

Such a description is very sensitive to the most outlying object in the target set: when one or a few very remote objects are present in the training set, a very large sphere is obtained that does not represent the data well. Therefore, we allow some data points to lie outside the sphere by introducing slack variables $\xi_i$. For the sphere described by center $a$ and radius $R$, we minimize the radius:

$$\min\; R^2 + C \sum_i \xi_i \qquad (1)$$

where $C$ is a penalty constant that gives the trade-off between simplicity (the volume of the sphere) and the number of errors (the number of target objects rejected). This has to be minimized under the constraints

$$(x_i - a)^T (x_i - a) \le R^2 + \xi_i, \quad \xi_i \ge 0, \;\; \forall i. \qquad (2)$$

Incorporating these constraints into (1), we construct the Lagrangian

$$L(R, a, \xi) = R^2 + C \sum_i \xi_i - \sum_i \alpha_i \{ R^2 + \xi_i - (x_i \cdot x_i - 2 a \cdot x_i + a \cdot a) \} - \sum_i \beta_i \xi_i \qquad (3)$$

with Lagrange multipliers $\alpha_i \ge 0$, $\beta_i \ge 0$. Finding the minimal solution of (3) can be transformed into finding the maximal solution of its dual problem

$$L(\alpha) = \sum_i \alpha_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) \qquad (4)$$

with constraints $\sum_i \alpha_i = 1$ and $0 \le \alpha_i \le C$. Here the inner product has been replaced with a kernel function $K(\cdot, \cdot)$ satisfying the Mercer condition; a popular choice is the Gaussian kernel $K(x, z) = \exp(-\|x - z\|^2 / 2\sigma^2)$, $\sigma > 0$.

To determine whether a test point $z$ is within the sphere, the distance from $z$ to the center of the sphere has to be calculated. A test object $z$ is accepted when this distance is smaller than the radius, i.e., when $(z - a)^T (z - a) \le R^2$. Expressing the center of the sphere in terms of the support vectors, we accept objects when

$$\|z - a\|^2 = K(z, z) - 2 \sum_i \alpha_i K(x_i, z) + \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) \le R^2. \qquad (5)$$
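As a concrete illustration of (1)-(5), the sketch below solves the SVDD dual (4) as a standard QP and implements the acceptance test (5). This is not the authors' implementation (their experiments used Gunn's MATLAB toolbox [10]); it is a minimal Python sketch assuming NumPy and the cvxopt QP solver, and the helper names rbf_kernel, svdd_fit, and svdd_accept are hypothetical.

    import numpy as np
    from cvxopt import matrix, solvers

    solvers.options['show_progress'] = False

    def rbf_kernel(X, Z, sigma):
        # Gaussian kernel K(x, z) = exp(-||x - z||^2 / (2 sigma^2))
        d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    def svdd_fit(X, C, sigma):
        # Dual (4): maximize sum_i a_i K_ii - sum_ij a_i a_j K_ij
        # subject to sum_i a_i = 1 and 0 <= a_i <= C.
        # cvxopt minimizes (1/2) a'Pa + q'a, hence P = 2K and q = -diag(K).
        n = len(X)
        K = rbf_kernel(X, X, sigma)
        P = matrix(2.0 * (K + 1e-10 * np.eye(n)))      # tiny ridge for numerical safety
        q = matrix(-np.diag(K))
        G = matrix(np.vstack([-np.eye(n), np.eye(n)]))  # -a_i <= 0 and a_i <= C
        h = matrix(np.hstack([np.zeros(n), C * np.ones(n)]))
        A = matrix(np.ones((1, n)))
        b = matrix(1.0)
        alpha = np.array(solvers.qp(P, q, G, h, A, b)['x']).ravel()
        # R^2 from a support vector on the sphere (0 < a_i < C), using (5) with equality.
        on_sphere = np.flatnonzero((alpha > 1e-6) & (alpha < C - 1e-6))
        i = on_sphere[0] if on_sphere.size else int(np.argmax(alpha))
        R2 = K[i, i] - 2.0 * K[i] @ alpha + alpha @ K @ alpha
        return alpha, R2

    def svdd_accept(z, X, alpha, R2, sigma):
        # Accept z when ||z - a||^2 <= R^2, evaluated through kernels as in (5).
        Kzz = rbf_kernel(z[None, :], z[None, :], sigma)[0, 0]
        Kxz = rbf_kernel(X, z[None, :], sigma).ravel()
        const = alpha @ rbf_kernel(X, X, sigma) @ alpha
        return Kzz - 2.0 * Kxz @ alpha + const <= R2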
2.2 Support Vector Domain Classifier

Inspired by SVDD, in this section we extend SVDD to the classification setting. Consider a training set of instance-label pairs $(x_i, y_i)$, $i = 1, 2, \ldots, N$, where $x_i \in R^n$ and $y_i \in \{+1, -1\}$. We construct a hypersphere for the samples with $y_i = +1$, while the samples with $y_i = -1$ are not considered; this yields the following quadratic optimization problem:

$$\min\; R^2 + C \sum_{i=1}^{l} \xi_i \quad \text{s.t.} \quad y_i \big( R^2 - (x_i - a)^T (x_i - a) \big) \ge -\xi_i, \;\; \xi_i \ge 0, \; y_i = 1 \qquad (6)$$

where $C$ is a penalty constant. Similarly, using multipliers $\alpha_i \ge 0$, $\beta_i \ge 0$, we introduce the Lagrangian

$$L(R, a, \xi) = R^2 + C \sum_i \xi_i - \sum_i \alpha_i y_i \big\{ R^2 - (x_i - a)^T (x_i - a) \big\} - \sum_i \beta_i \xi_i \qquad (7)$$

Setting the derivatives of (7) with respect to the primal variables $R, a, \xi$ equal to zero and re-substituting the obtained results into (7) yields

$$W(\alpha) = \sum_i \alpha_i y_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{s.t.} \quad \sum_i \alpha_i y_i = 1, \;\; 0 \le \alpha_i \le C \qquad (8)$$

We can then design the binary-class sphere-structured SVM classifier

$$f(x) = \mathrm{sgn}\Big( R^2 - K(x, x) + 2 \sum_i \alpha_i y_i K(x, x_i) - \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \Big) \qquad (9)$$

If $f(x) > 0$, the tested sample is contained in the sphere, and we regard the samples enclosed in the sphere as belonging to the target class. Otherwise the sample is rejected, and we regard it as belonging to the opposite class.
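Read concretely, (6)-(9) say that only the $y_i = +1$ samples enter the sphere construction, so training an SVDC amounts to fitting an SVDD on the target class and classifying by the sign of the kernel distance to the sphere. A minimal sketch under the same assumptions as the SVDD code above (the names svdc_fit and svdc_predict are again hypothetical):

    def svdc_fit(X, y, C, sigma):
        # Only the y_i = +1 samples enter the optimization problem (6);
        # the negative samples play no role during training.
        X_pos = X[y == 1]
        alpha, R2 = svdd_fit(X_pos, C, sigma)
        return X_pos, alpha, R2

    def svdc_predict(z, X_pos, alpha, R2, sigma):
        # Decision function (9): +1 if z lies inside the learned sphere.
        return 1 if svdd_accept(z, X_pos, alpha, R2, sigma) else -1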
3. SVDC Incremental Learning Algorithm

Following formula (6), suppose the initial parameter (the sphere radius) learned from the initial training set is $R_0$ and the corresponding set of support vectors is $SV_0$. The parameter becomes $R_k$ at the $k$-th incremental step, the set of support vectors becomes $SV_k$, and the new dataset arriving at the $k$-th step is $D_k = \{(x_i^k, y_i^k)\}_{i=1}^{l_k}$. From the derivation of (7), the center of the sphere is expressed in terms of the support vectors:

$$a = \sum_{k=1}^{s} \alpha_k y_k x_k \qquad (10)$$

where in formula (10) $x_k$ represents a support vector and $s$ is the number of support vectors.

Assume $R_{k-1}$ is known. We update the current model using $SV_{k-1}$ and the new dataset $\{(x_i^k, y_i^k)\}_{i=1}^{l_k}$ by solving the following quadratic programming (QP) problem:

$$\min\; g(R_k) = \|R_k - R_{k-1}\|^2 \quad \text{s.t.} \quad y_i^k \big( R_k^2 - (x_i^k - a)^T (x_i^k - a) \big) \ge -\xi_i^k, \;\; x_i^k \in D_k \qquad (11)$$

where $R_{k-1}$ is the radius obtained from the previous optimization problem; when $k = 1$, $R_0$ is the radius of the standard SVDC. Obviously, when $R_{k-1} = 0$ the incremental SVDC has the same form as the standard SVDC. We will find that the model updated by the incremental SVDC also owns the solution-sparsity property of the standard SVDC.

In order to solve (11), we transform it into its dual problem and introduce the Lagrangian

$$L = \|R_k - R_{k-1}\|^2 - \sum_i \alpha_i^k \big\{ y_i^k \big( R_k^2 - (x_i^k - a)^T (x_i^k - a) \big) + \xi_i^k \big\} - \sum_i \beta_i^k \xi_i^k \qquad (12)$$

where $\alpha_i^k, \beta_i^k \ge 0$ $(i = 1, \ldots, l_k)$ are Lagrange multipliers. According to the optimality conditions, we can further get the following equation:

$$R_k = R_{k-1} + \sum_{x_i \in SV_k} \alpha_i y_i x_i \qquad (13)$$

Finally we obtain the following decision function:

$$\begin{aligned}
f_k(x) &= \mathrm{sgn}\Big\{ R_k^2 - K(x, x) + 2 \sum_{x_i \in SV_k} \alpha_i y_i K(x, x_i) - \sum_{x_i, x_j \in SV_k} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \Big\} \\
&= \mathrm{sgn}\Big\{ R_{k-1}^2 + 2 R_{k-1} \sum_{x_i \in SV_k} \alpha_i y_i x_i + \Big( \sum_{x_i \in SV_k} \alpha_i y_i x_i \Big)^2 - K(x, x) + 2 \sum_{x_i \in SV_k} \alpha_i y_i K(x, x_i) - \sum_{x_i, x_j \in SV_k} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \Big\} \\
&= \mathrm{sgn}\Big\{ f_{k-1}(x) + 2 R_{k-1} \sum_{x_i \in SV_k} \alpha_i y_i x_i + \Big( \sum_{x_i \in SV_k} \alpha_i y_i x_i \Big)^2 \Big\} \qquad (14)
\end{aligned}$$

From equation (14) we can see that it is easy to return to the last step of incremental learning without extra computation. From the above analysis we can also see that only a trifling modification of the standard SVDC is needed for it to solve the updated model in the incremental learning procedure.

Now we summarize our algorithm as follows (a runnable sketch of this loop is given after the steps):

Step 1. Learning the initial concept: train SVDC with the initial dataset $TS_0$; the parameter $R_0$ is then obtained;
Step 2. Updating the current concept: when new data are available, use them to solve the QP problem (11) and obtain the new concept;
Step 3. Repeat Step 2 until the incremental learning is over.
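The following sketch illustrates the loop of Steps 1-3. The paper's updating model (11) additionally anchors the new radius to $R_{k-1}$; as a simplification, this sketch re-solves the standard problem on the retained support vectors $SV_{k-1}$ plus the new batch $D_k$, which stands in for QP (11) and is an assumption of the sketch rather than the exact updating model. It reuses the hypothetical svdd_fit helper from Section 2.

    def incremental_svdc(batches, C, sigma, alpha_tol=1e-6):
        # batches: a list of (X, y) chunks arriving over time.
        # Step 1: learn the initial concept R0 from the first batch.
        X0, y0 = batches[0]
        X_work = X0[y0 == 1]                        # only the target class enters (6)
        alpha, R2 = svdd_fit(X_work, C, sigma)
        history = [(X_work, alpha, R2)]
        # Steps 2-3: update the concept from SV_{k-1} plus each new batch D_k.
        for X_new, y_new in batches[1:]:
            keep = alpha > alpha_tol                # support vectors of step k-1
            X_work = np.vstack([X_work[keep], X_new[y_new == 1]])
            alpha, R2 = svdd_fit(X_work, C, sigma)  # stands in for QP (11)
            history.append((X_work, alpha, R2))
        return history

Because every step is kept in history, the decremental learning procedure is exactly as cheap as the paper claims: if performance drops after step k, discard the last entry of history and continue from step k-1 with no recomputation.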
4. Experiments and Results

In order to evaluate the learning performance offered by our incremental algorithm, we conducted experiments on six different datasets taken from the UCI Machine Learning Repository: Banana, Diabetes, Flare-Solar, Heart, Breast-Cancer, and German. Note that some of them are not binary classification problems, but we transformed them into binary problems with a special technique. The datasets and experiment parameters are shown in Table 1. For notational simplicity, our algorithm is abbreviated as Our ISVM in Figure 2.

In addition to conducting experiments with our algorithm, we also implemented and tested another popular and effective incremental learning algorithm, ISVM [8][9], on the same datasets so as to compare learning performance. In our experiments we chose the RBF kernel $K(x, y) = \exp(-\|x - y\|^2 / 2\sigma^2)$, and the kernel width $\sigma$ is not fixed. The MATLAB SVM toolbox contributed by Gunn [10] was used, and the experimental environment was an Intel P4 PC (1.4 GHz CPU, 256 MB memory) running the Windows XP operating system.

Table 1. Datasets and experiment parameters

    Dataset        #TRS   #TES   #ATT   C           sigma
    Banana         400    4900   2      3.162e+02   1.000e+00
    Breast-Cancer  200    77     9      1.519e+01   5.000e+01
    Diabetes       468    300    8      1.000e+01   2.000e+01
    Flare-Solar    666    400    9      1.023e+01   3.000e+00
    German         700    300    20     3.162e+00   5.500e+01
    Heart          170    100    13     3.162e+00   1.200e+02

In Table 1, #TRS is the number of training samples, #TES the number of testing samples, and #ATT the number of attributes; C is the penalty constant and sigma is the kernel width.

Literature [8] points out that an efficient incremental learning algorithm should satisfy the following three criteria:

A. Stability: after each step of incremental learning is over, the prediction accuracy on the test set should not vary too markedly;
B. Improvement: as the incremental learning proceeds, the prediction accuracy of the algorithm should improve gradually;
C. Recoverability: the algorithm should own the ability of performance recoverability; that is, when its learning performance drops after a certain step, it can recover and even surpass its former performance in the later learning procedure.

Figure 2 shows the experimental results of the two incremental learning algorithms.

[Fig. 2. Prediction accuracy versus incremental learning step for ISVM and Our ISVM on the six datasets: Banana, Breast-Cancer, Diabetes, Flare-Solar, German, and Heart.]

From Figure 2 we can see that after each step of incremental training, the prediction accuracy on the test set varies little, which satisfies the stability requirement; that the accuracy improves gradually; and that the algorithm owns the ability of performance recoverability. So the incremental algorithm proposed in this paper meets the demands of incremental learning.

The experimental results show that our algorithm has learning performance similar to that of the popular ISVM algorithm presented in [9]. Another discovery in our experiments is that, as the incremental learning proceeds, the improvement in learning performance becomes smaller and smaller, until at last the learning performance no longer improves. This indicates that we can use this characteristic to estimate the number of samples needed for problem description.
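The plateau observation above suggests a simple stopping rule for estimating how many samples a problem description needs: run the incremental loop and stop when the step-to-step accuracy gain falls below a threshold. A hedged sketch using the hypothetical helpers defined earlier (the threshold eps is an assumed parameter, not taken from the paper):

    def estimate_enough_samples(history, X_test, y_test, sigma, eps=0.005):
        # Track test accuracy across incremental steps and report the step at
        # which the accuracy gain first drops below eps (learning has plateaued).
        prev = None
        for step, (X_pos, alpha, R2) in enumerate(history):
            preds = np.array([svdc_predict(z, X_pos, alpha, R2, sigma)
                              for z in X_test])
            acc = float(np.mean(preds == y_test))
            if prev is not None and acc - prev < eps:
                return step, acc
            prev = acc
        return len(history) - 1, prev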
5. Conclusion

In this paper we proposed an incremental learning algorithm based on the support vector domain classifier (SVDC). Its key idea is to obtain the initial concept using the standard SVDC and then to update it with the technique presented in this paper, which in fact amounts to solving a QP problem similar to the one solved in the standard SVDC algorithm. Experiments show that our algorithm is effective and promising.

Other characteristics of this algorithm include: the updating model has a mathematical form similar to that of the standard SVDC, and the sparsity of its solution is preserved; the algorithm can return to the last step without extra computation; and, furthermore, it can be used to estimate the number of samples needed for problem description.

REFERENCES

[1] C. Cortes, V. N. Vapnik. Support vector networks. Machine Learning, 20 (1995), pp. 273-297.
[2] V. N. Vapnik. Statistical Learning Theory. Wiley, New York, 1998.
[3] T. Joachims. Text categorization with support vector machines: learning with many relevant features. Proceedings of the European Conference on Machine Learning, Springer, Berlin, 1998, pp. 137-142.
[4] S. Tong, E. Chang. Support vector machine active learning for image retrieval. Proceedings of the ACM International Conference on Multimedia, 2000, pp. 107-118.
[5] Yang Deng et al. A New Method in Data Mining: Support Vector Machines. Beijing: Science Press, 2004.
[6] L. Baoqing. Distance-based selection of potential support vectors by kernel matrix. International Symposium on Neural Networks 2004, LNCS 3173, pp. 468-473, 2004.
[7] D. Tax. One-class classification. PhD thesis, Delft University of Technology, http://www.ph.tn.tudelft.nl/~davidt/thesis.pdf (2001).
[8] N. A. Syed, H. Liu, K. Sung. From incremental learning to model independent instance selection: a support vector machine approach. Technical Report TRA9/99, National University of Singapore, 1999.
[9] L. Yangguang, C. Qi, T. Yongchuan, et al. Incremental updating method for support vector machine. APWeb 2004, LNCS 3007, pp. 426-435, 2004.
[10] S. R. Gunn. Support vector machines for classification and regression. Technical Report, Image Speech and Intelligent Systems Research Group, University of Southampton, 1997.