Classification, Parameter Estimation & State Estimation: An Engineering Approach using MATLAB

Classification, Parameter Estimation and State Estimation
An Engineering Approach using MATLAB®

F. van der Heijden
Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, The Netherlands

R.P.W. Duin
Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, The Netherlands

D. de Ridder
Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, The Netherlands

D.M.J. Tax
Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, The Netherlands

Copyright © 2004 John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England
Telephone (+44) 1243 779777
Email (for orders and customer service enquiries): cs-books@wiley.co.uk
Visit our Home Page on www.wileyeurope.com or www.wiley.com

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher. Requests to the Publisher should be addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, or emailed to permreq@wiley.co.uk, or faxed to (+44) 1243 770620.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The Publisher is not associated with any product or vendor mentioned in this book.

This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Other Wiley Editorial Offices

John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA
Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA
Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany
John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia
John Wiley & Sons (Asia) Pte Ltd, Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809
John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Cataloging in Publication Data
Classification, parameter estimation and state estimation : an engineering approach using MATLAB / F. van der Heijden ... [et al.].
   p. cm.
Includes bibliographical references and index.
ISBN 0-470-09013-8 (cloth : alk. paper)
1. Engineering mathematics—Data processing. 2. MATLAB. 3. Mensuration—Data processing. 4. Estimation theory—Data processing. I. Heijden, Ferdinand van der.
TA331.C53 2004
681'.2—dc22
2004011561

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library.
ISBN 0-470-09013-8

Typeset in 10.5/13pt Sabon by Integra Software Services Pvt. Ltd, Pondicherry, India.
Printed and bound in Great Britain by TJ International Ltd, Padstow, Cornwall.
This book is printed on acid-free paper responsibly manufactured from sustainable forestry in which at least two trees are planted for each one used for paper production.

Contents

Preface
Foreword

1 Introduction
  1.1 The scope of the book
    1.1.1 Classification
    1.1.2 Parameter estimation
    1.1.3 State estimation
    1.1.4 Relations between the subjects
  1.2 Engineering
  1.3 The organization of the book
  1.4 References

2 Detection and Classification
  2.1 Bayesian classification
    2.1.1 Uniform cost function and minimum error rate
    2.1.2 Normal distributed measurements; linear and quadratic classifiers
  2.2 Rejection
    2.2.1 Minimum error rate classification with reject option
  2.3 Detection: the two-class case
  2.4 Selected bibliography
  2.5 Exercises

3 Parameter Estimation
  3.1 Bayesian estimation
    3.1.1 MMSE estimation
    3.1.2 MAP estimation
    3.1.3 The Gaussian case with linear sensors
    3.1.4 Maximum likelihood estimation
    3.1.5 Unbiased linear MMSE estimation
  3.2 Performance of estimators
    3.2.1 Bias and covariance
    3.2.2 The error covariance of the unbiased linear MMSE estimator
  3.3 Data fitting
    3.3.1 Least squares fitting
    3.3.2 Fitting using a robust error norm
    3.3.3 Regression
  3.4 Overview of the family of estimators
  3.5 Selected bibliography
  3.6 Exercises

4 State Estimation
  4.1 A general framework for online estimation
    4.1.1 Models
    4.1.2 Optimal online estimation
  4.2 Continuous state variables
    4.2.1 Optimal online estimation in linear-Gaussian systems
    4.2.2 Suboptimal solutions for nonlinear systems
    4.2.3 Other filters for nonlinear systems
  4.3 Discrete state variables
    4.3.1 Hidden Markov models
    4.3.2 Online state estimation
    4.3.3 Offline state estimation
  4.4 Mixed states and the particle filter
    4.4.1 Importance sampling
    4.4.2 Resampling by selection
    4.4.3 The condensation algorithm
  4.5 Selected bibliography
  4.6 Exercises

5 Supervised Learning
  5.1 Training sets
  5.2 Parametric learning
    5.2.1 Gaussian distribution, mean unknown
    5.2.2 Gaussian distribution, covariance matrix unknown
    5.2.3 Gaussian distribution, mean and covariance matrix both unknown
    5.2.4 Estimation of the prior probabilities
    5.2.5 Binary measurements
  5.3 Nonparametric learning
    5.3.1 Parzen estimation and histogramming
    5.3.2 Nearest neighbour classification
    5.3.3 Linear discriminant functions
    5.3.4 The support vector classifier
    5.3.5 The feed-forward neural network
  5.4 Empirical evaluation
  5.5 References
  5.6 Exercises

6 Feature Extraction and Selection
  6.1 Criteria for selection and extraction
    6.1.1 Inter/intra class distance
    6.1.2 Chernoff–Bhattacharyya distance
    6.1.3 Other criteria
  6.2 Feature selection
    6.2.1 Branch-and-bound
    6.2.2 Suboptimal search
    6.2.3 Implementation issues
  6.3 Linear feature extraction
    6.3.1 Feature extraction based on the Bhattacharyya distance with Gaussian distributions
    6.3.2 Feature extraction based on inter/intra class distance
  6.4 References
  6.5 Exercises

7 Unsupervised Learning
  7.1 Feature reduction
    7.1.1 Principal component analysis
    7.1.2 Multi-dimensional scaling
  7.2 Clustering
    7.2.1 Hierarchical clustering
    7.2.2 K-means clustering
    7.2.3 Mixture of Gaussians
    7.2.4 Mixture of probabilistic PCA
    7.2.5 Self-organizing maps
    7.2.6 Generative topographic mapping
  7.3 References
  7.4 Exercises

8 State Estimation in Practice
  8.1 System identification
    8.1.1 Structuring
    8.1.2 Experiment design
    8.1.3 Parameter estimation
    8.1.4 Evaluation and model selection
    8.1.5 Identification of linear systems with a random input
  8.2 Observability, controllability and stability
    8.2.1 Observability
    8.2.2 Controllability
    8.2.3 Dynamic stability and steady state solutions
  8.3 Computational issues
    8.3.1 The linear-Gaussian MMSE form
    8.3.2 Sequential processing of the measurements
    8.3.3 The information filter
    8.3.4 Square root filtering
    8.3.5 Comparison
  8.4 Consistency checks
    8.4.1 Orthogonality properties
    8.4.2 Normalized errors
    8.4.3 Consistency checks
    8.4.4 Fudging
  8.5 Extensions of the Kalman filter
    8.5.1 Autocorrelated noise
    8.5.2 Cross-correlated noise
    8.5.3 Smoothing
  8.6 References
  8.7 Exercises

9 Worked Out Examples
  9.1 Boston Housing classification problem
    9.1.1 Data set description
    9.1.2 Simple classification methods
    9.1.3 Feature extraction
    9.1.4 Feature selection
    9.1.5 Complex classifiers
    9.1.6 Conclusions
  9.2 Time-of-flight estimation of an acoustic tone burst
    9.2.1 Models of the observed waveform
    9.2.2 Heuristic methods for determining the ToF
    9.2.3 Curve fitting
    9.2.4 Matched filtering
    9.2.5 ML estimation using covariance models for the reflections
    9.2.6 Optimization and evaluation
  9.3 Online level estimation in an hydraulic system
    9.3.1 Linearized Kalman filtering
    9.3.2 Extended Kalman filtering
    9.3.3 Particle filtering
    9.3.4 Discussion
  9.4 References

Appendix A Topics Selected from Functional Analysis
  A.1 Linear spaces
    A.1.1 Normed linear spaces
    A.1.2 Euclidean spaces or inner product spaces
  A.2 Metric spaces
  A.3 Orthonormal systems and Fourier series
  A.4 Linear operators
  A.5 References

Appendix B Topics Selected from Linear Algebra and Matrix Theory
  B.1 Vectors and matrices
  B.2 Convolution
  B.3 Trace and determinant
  B.4 Differentiation of vector and matrix functions
  B.5 Diagonalization of self-adjoint matrices
  B.6 Singular value decomposition (SVD)
  B.7 References

Appendix C Probability Theory
  C.1 Probability theory and random variables
    C.1.1 Moments


A.4 LINEAR OPERATORS

Given two orthonormal bases {a_k} and {b_k} of a space R, a vector f has Fourier coefficients $\alpha_k = (f, a_k)$ with respect to the first basis and, with respect to the second basis,

    $\beta_k = (f, b_k) \qquad\qquad f = \sum_k \beta_k b_k$

Since both Fourier series represent the same vector we conclude that:

    $f = \sum_k \alpha_k a_k = \sum_k \beta_k b_k$

The relationship between the Fourier coefficients $\alpha_k$ and $\beta_k$ can be made explicit by the calculation of the inner product:

    $(f, b_n) = \sum_k \alpha_k (a_k, b_n) = \sum_k \beta_k (b_k, b_n) = \beta_n$        (a.34)

The Fourier coefficients $\alpha_k$ and $\beta_k$ can be arranged as vectors $a = (\alpha_0, \alpha_1, \cdots)$ and $b = (\beta_0, \beta_1, \cdots)$ in $R^N$ or $C^N$ (if the dimension of R is finite), or in $R^\infty$ and $C^\infty$ (if the dimension of R is infinite).
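
As a quick numerical illustration (a sketch added here, not part of the book's text), the MATLAB lines below generate two random orthonormal bases of $R^N$, compute both sets of Fourier coefficients of the same vector f, and check relation (a.34). Only standard MATLAB functions are used; the matrix U collecting the inner products $(a_k, b_n)$ anticipates the operator introduced in (a.35) below.

N = 5;
A = orth(randn(N));             % columns a_k: a random orthonormal basis of R^N
B = orth(randn(N));             % columns b_k: a second orthonormal basis
f = randn(N, 1);
alpha = A' * f;                 % alpha_k = (f, a_k)
beta  = B' * f;                 % beta_k  = (f, b_k)
U = B' * A;                     % element (n,k) equals the inner product (a_k, b_n)
disp(norm(beta - U * alpha))    % (a.34): beta_n = sum_k alpha_k (a_k, b_n), so ~0
disp(norm(U' * U - eye(N)))     % the coefficient transform is orthonormal (unitary)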

In one of these spaces equation (a.34) defines a linear operator U:

    $b = U a$        (a.35)

The inner product in (a.34) could equally well be accomplished with respect to a vector $a_n$. This reveals that an operator $U^*$ exists for which:

    $a = U^* b$        (a.36)

Clearly, from (a.33):

    $U^* = U^{-1}$        (a.37)

Suppose we have two vectors $f_1$ and $f_2$, represented in $S_a$ by $a_1$ and $a_2$, and in $S_b$ by $b_1$ and $b_2$. Since the inner product $(f_1, f_2)$ must be independent of the representation, we conclude that $(f_1, f_2) = (a_1, a_2) = (b_1, b_2)$. Therefore:

    $(a_1, U^{-1} b_2) = (U a_1, b_2)$        (a.38)

Each operator that satisfies (a.38) is called a unitary operator. A corollary of (a.38) is that any unitary operator preserves the Euclidean norm.

The adjoint $A^*$ of an operator A is an operator that satisfies:

    $(A f, g) = (f, A^* g)$        (a.39)

From this definition, and from (a.38), it follows that an operator U for which its adjoint $U^*$ equals its inverse $U^{-1}$ is a unitary operator. This is in accordance with the notation used in (a.37). An operator A is called self-adjoint if $A^* = A$.

Suppose that A is a linear operator in a space R. A vector $e_k$ that satisfies:

    $A e_k = \lambda_k e_k \qquad e_k \neq 0$        (a.40)

with $\lambda_k$ a real or complex number is called an eigenvector of A. The number $\lambda_k$ is the eigenvalue. The eigenvectors and eigenvalues of an operator are found by solving the equation $(A - \lambda_k I) e_k = 0$. Note that if $e_k$ is a solution of this equation, then so is $\alpha e_k$ with $\alpha$ any real or complex number. If a unique solution is required, we should constrain the length of the eigenvector to unit, i.e. $v_k = e_k / \|e_k\|$, yielding the so-called normalized eigenvector. However, since $+e_k/\|e_k\|$ and $-e_k/\|e_k\|$ are both valid eigenvectors, we still have to select one out of the two possible solutions. From now on, the phrase 'the normalized eigenvector' will denote both solutions.

Operators that are self-adjoint have – under mild conditions – some nice properties related to their eigenvectors and eigenvalues. The properties relevant in our case are:

1. All eigenvalues are real.
2. With each eigenvalue at least one normalized eigenvector is associated. However, an eigenvalue can also have multiple normalized eigenvectors. These eigenvectors span a linear subspace.
3. There is an orthonormal basis $V = \{ v_0, v_1, \cdots \}$ formed by the normalized eigenvectors. Due to possible multiplicities of the eigenvalues (see above) this basis may not be unique.

A corollary of the properties is that any vector $f \in R$ can be represented by a Fourier series with respect to V, and that in this representation the operation becomes simply a linear combination, that is:

    $f = \sum_k \phi_k v_k \qquad$ with: $\phi_k = (f, v_k)$        (a.41)

    $A f = \sum_k \lambda_k \phi_k v_k$        (a.42)

The connotation of this decomposition of the operation is depicted in Figure A.2. The set of eigenvalues is called the spectrum of the operator.

Figure A.2  Eigenvalue decomposition of a self-adjoint operator: f -> calculation of Fourier coefficients -> $\phi_k$ -> multiplication by $\lambda_k$ -> $\lambda_k \phi_k$ -> Fourier series expansion -> Af.
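
The decomposition of Figure A.2 is easy to reproduce numerically. The following MATLAB sketch (not taken from the book) represents a self-adjoint operator on $R^N$ by a symmetric matrix, lets eig supply the orthonormal eigenvector basis V, and checks that applying the operator amounts to scaling the Fourier coefficients by the eigenvalues, cf. (a.41) and (a.42).

N = 4;
A = randn(N); A = (A + A') / 2;    % a symmetric (self-adjoint) matrix
[V, D] = eig(A);                   % columns of V: orthonormal eigenvectors v_k
lambda = diag(D);                  % eigenvalues lambda_k (real, since A = A')
f = randn(N, 1);
phi = V' * f;                      % Fourier coefficients phi_k = (f, v_k), cf. (a.41)
Af = V * (lambda .* phi);          % sum_k lambda_k phi_k v_k, cf. (a.42)
disp(norm(Af - A * f))             % should be (close to) zero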

A.5 REFERENCES

Kolmogorov, A.N. and Fomin, S.V., Introductory Real Analysis, Dover Publications, New York, 1970.
Pugachev, V.S. and Sinitsyn, I.N., Lectures on Functional Analysis and Applications, World Scientific, 1999.


Appendix B  Topics Selected from Linear Algebra and Matrix Theory

Whereas Appendix A deals with general linear spaces and linear operators, the current appendix restricts the attention to linear spaces with finite dimension, i.e. $R^N$ and $C^N$. With that, all that has been said in Appendix A also holds true for the topics of this appendix.

B.1 VECTORS AND MATRICES

Vectors in $R^N$ and $C^N$ are denoted by bold-faced letters, e.g. f, g. The elements in a vector are arranged either vertically (a column vector) or horizontally (a row vector). For example:

    $f = \begin{bmatrix} f_0 \\ f_1 \\ \vdots \\ f_{N-1} \end{bmatrix}$  or:  $f^T = \begin{bmatrix} f_0 & f_1 & \cdots & f_{N-1} \end{bmatrix}$        (b.1)

The superscript T is used to convert column vectors to row vectors. Vector addition and scalar multiplication are defined as in Section A.1.

A matrix H with dimension $N \times M$ is an arrangement of NM numbers $h_{n,m}$ (the elements) on an orthogonal grid of N rows and M columns:

    $H = \begin{bmatrix} h_{0,0} & h_{0,1} & \cdots & h_{0,M-1} \\ h_{1,0} & h_{1,1} & \cdots & h_{1,M-1} \\ h_{2,0} & & & \vdots \\ \vdots & & & \\ h_{N-1,0} & \cdots & \cdots & h_{N-1,M-1} \end{bmatrix}$        (b.2)

The elements are real or complex. Vectors can be regarded as $N \times 1$ matrices (column vectors) or $1 \times M$ matrices (row vectors). A matrix can be regarded as a horizontal arrangement of M column vectors with dimension N, for example:

    $H = \begin{bmatrix} h_0 & h_1 & \cdots & h_{M-1} \end{bmatrix}$        (b.3)

Of course, a matrix can also be regarded as a vertical arrangement of N row vectors.

The scalar–matrix multiplication $\alpha H$ replaces each element in H with $\alpha h_{n,m}$. The matrix–addition $H = A + B$ is only defined if the two matrices A and B have equal size $N \times M$. The result H is an $N \times M$ matrix with elements $h_{n,m} = a_{n,m} + b_{n,m}$. These two operations satisfy the axioms of a linear space (Section A.1). Therefore, the set of all $N \times M$ matrices is another example of a linear space.

The matrix–matrix product $H = AB$ is defined only when the number of columns of A equals the number of rows of B. Suppose that A is an $N \times P$ matrix, and that B is a $P \times M$ matrix, then the product $H = AB$ is an $N \times M$ matrix with elements:

    $h_{n,m} = \sum_{p=0}^{P-1} a_{n,p} b_{p,m}$        (b.4)

Since a vector can be regarded as an $N \times 1$ matrix, this also defines the matrix–vector product $g = Hf$ with f an M-dimensional column vector, H an $N \times M$ matrix and g an N-dimensional column vector. In accordance with these definitions, the inner product between two real N-dimensional vectors introduced in Section A.1.2 can be written as:

    $(f, g) = \sum_{n=0}^{N-1} f_n g_n = f^T g$        (b.5)

It is easy to show that a matrix–vector product $g = Hf$ defines a linear operator from $R^M$ into $R^N$ and from $C^M$ into $C^N$. Therefore, all definitions and properties related to linear operators (Section A.4) also apply to matrices.

Some special matrices are:

- The null matrix O. This is a matrix fully filled with zero. It corresponds to the null operator: Of = 0.
- The unit matrix I. This matrix is square (N = M), fully filled with zero, except for the diagonal elements, which are unit:

    $I = \begin{bmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{bmatrix}$

  This matrix corresponds to the unit operator: If = f.
- A diagonal matrix $\Lambda$ is a square matrix, fully filled with zero, except for its diagonal elements $\lambda_{n,n}$:

    $\Lambda = \begin{bmatrix} \lambda_{0,0} & & 0 \\ & \ddots & \\ 0 & & \lambda_{N-1,N-1} \end{bmatrix}$

  Often, diagonal matrices are denoted by upper case Greek symbols.
- The transposed matrix $H^T$ of an $N \times M$ matrix H is an $M \times N$ matrix; its elements are given by $h^T_{m,n} = h_{n,m}$.
- A symmetric matrix is a square matrix for which $H^T = H$.
- The conjugated of a matrix H is a matrix $\bar{H}$ the elements of which are the complex conjugated of the ones of H.
- The adjoint of a matrix H is a matrix $H^*$ which is the conjugated and the transposed of H, that is: $H^* = \bar{H}^T$. A matrix H is self-adjoint or Hermitian if $H^* = H$. This is the case only if H is square and $h_{n,m} = \bar{h}_{m,n}$.
- The inverse of a square matrix H is the matrix $H^{-1}$ that satisfies $H^{-1} H = I$. If it exists, it is unique. In that case the matrix H is called regular. If $H^{-1}$ doesn't exist, H is called singular.
- A unitary matrix U is a square matrix that satisfies $U^{-1} = U^*$. A real unitary matrix is called orthonormal. These matrices satisfy $U^{-1} = U^T$.
- A square matrix H is Toeplitz if its elements satisfy $h_{n,m} = g_{(n-m)}$ in which $g_n$ is a sequence of 2N - 1 numbers.
- A square matrix H is circulant if its elements satisfy $h_{n,m} = g_{(n-m)\%N}$. Here, $(n-m)\%N$ is the remainder of $(n-m)/N$.
- A matrix H is separable if it can be written as the product of two vectors: $H = f g^T$.

Some properties with respect to the matrices mentioned above:

    $(H^*)^* = H$        (b.6)
    $(AB)^* = B^* A^*$        (b.7)
    $(H^{-1})^* = (H^*)^{-1}$        (b.8)
    $(AB)^{-1} = B^{-1} A^{-1}$        (b.9)
    $(A^{-1} + H^T B^{-1} H)^{-1} = A - A H^T (H A H^T + B)^{-1} H A$        (b.10)

The relations hold if the sizes of the matrices are compatible and the inverses exist. Property (b.10) is known as the matrix inversion lemma.
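
The matrix inversion lemma is easy to verify numerically. The MATLAB fragment below (a sketch added here, not from the book) checks (b.10) for randomly chosen positive definite matrices A and B of compatible sizes.

N = 4; M = 3;
A = randn(N); A = A * A' + N * eye(N);    % positive definite, hence regular
B = randn(M); B = B * B' + M * eye(M);
H = randn(M, N);
lhs = inv(inv(A) + H' * inv(B) * H);
rhs = A - A * H' * inv(H * A * H' + B) * H * A;
disp(norm(lhs - rhs))                     % should be (close to) zero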

B.2 CONVOLUTION

Defined in a finite interval, the discrete convolution between a sequence $f_k$ and a sequence $h_k$:

    $g_n = \sum_{k=0}^{N-1} h_{n-k} f_k \qquad$ with: $n = 0, 1, \ldots, N-1$        (b.11)

can be written economically as a matrix–vector product $g = Hf$. The matrix H is a Toeplitz matrix:

    $H = \begin{bmatrix} h_0 & h_{-1} & h_{-2} & \cdots & h_{1-N} \\ h_1 & h_0 & h_{-1} & & h_{2-N} \\ h_2 & h_1 & h_0 & & \vdots \\ \vdots & & & \ddots & h_{-1} \\ h_{N-1} & h_{N-2} & \cdots & h_1 & h_0 \end{bmatrix}$        (b.12)
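
As an illustration of (b.11) and (b.12) (again a sketch added here, not from the book), the MATLAB lines below build the Toeplitz matrix with the standard toeplitz function and check that the matrix–vector product reproduces the convolution sum, cross-checked against conv applied to the full sequence of 2N - 1 numbers $h_{-(N-1)}, \ldots, h_{N-1}$.

N = 6;
h = randn(2*N - 1, 1);               % h_{-(N-1)}, ..., h_0, ..., h_{N-1}
f = randn(N, 1);
col = h(N : 2*N - 1);                % first column of H: h_0, h_1, ..., h_{N-1}
row = h(N : -1 : 1);                 % first row of H:    h_0, h_{-1}, ..., h_{1-N}
H = toeplitz(col, row);              % H(n,m) = h_{n-m}, cf. (b.12)
g = H * f;                           % the convolution (b.11) as a matrix-vector product
c = conv(h, f);                      % full convolution of the two sequences
disp(norm(g - c(N : 2*N - 1)))       % the samples for n = 0, ..., N-1 coincide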
