Fundamentals of Statistical Signal Processing: Estimation Theory (Steven M. Kay)

PRENTICE HALL SIGNAL PROCESSING SERIES
Alan V. Oppenheim, Series Editor

ANDREWS AND HUNT  Digital Image Restoration
BRIGHAM  The Fast Fourier Transform
BRIGHAM  The Fast Fourier Transform and Its Applications
BURDIC  Underwater Acoustic System Analysis, 2/E
CASTLEMAN  Digital Image Processing
COWAN AND GRANT  Adaptive Filters
CROCHIERE AND RABINER  Multirate Digital Signal Processing
DUDGEON AND MERSEREAU  Multidimensional Digital Signal Processing
HAMMING  Digital Filters, 3/E
HAYKIN, ED.  Advances in Spectrum Analysis and Array Processing, Vols. I & II
HAYKIN, ED.  Array Signal Processing
JAYANT AND NOLL  Digital Coding of Waveforms
JOHNSON AND DUDGEON  Array Signal Processing: Concepts and Techniques
KAY  Fundamentals of Statistical Signal Processing: Estimation Theory
KAY  Modern Spectral Estimation
KINO  Acoustic Waves: Devices, Imaging, and Analog Signal Processing
LEA, ED.  Trends in Speech Recognition
LIM  Two-Dimensional Signal and Image Processing
LIM, ED.  Speech Enhancement
LIM AND OPPENHEIM, EDS.  Advanced Topics in Signal Processing
MARPLE  Digital Spectral Analysis with Applications
MCCLELLAN AND RADER  Number Theory in Digital Signal Processing
MENDEL  Lessons in Digital Estimation Theory
OPPENHEIM, ED.  Applications of Digital Signal Processing
OPPENHEIM AND NAWAB, EDS.  Symbolic and Knowledge-Based Signal Processing
OPPENHEIM, WILLSKY, WITH YOUNG  Signals and Systems
OPPENHEIM AND SCHAFER  Digital Signal Processing
OPPENHEIM AND SCHAFER  Discrete-Time Signal Processing
QUACKENBUSH ET AL.  Objective Measures of Speech Quality
RABINER AND GOLD  Theory and Applications of Digital Signal Processing
RABINER AND SCHAFER  Digital Processing of Speech Signals
ROBINSON AND TREITEL  Geophysical Signal Analysis
STEARNS AND DAVID  Signal Processing Algorithms
STEARNS AND HUSH  Digital Signal Analysis, 2/E
TRIBOLET  Seismic Applications of Homomorphic Signal Processing
VAIDYANATHAN  Multirate Systems and Filter Banks
WIDROW AND STEARNS  Adaptive Signal Processing

Fundamentals of Statistical Signal Processing: Estimation Theory

Steven M. Kay
University of Rhode Island

For book and bookstore information visit http://www.prenhall.com or gopher to gopher.prenhall.com

Prentice Hall, Upper Saddle River, NJ 07458

Contents

Preface

1  Introduction
   1.1 Estimation in Signal Processing
   1.2 The Mathematical Estimation Problem
   1.3 Assessing Estimator Performance
   1.4 Some Notes to the Reader

2  Minimum Variance Unbiased Estimation
   2.1 Introduction
   2.2 Summary
   2.3 Unbiased Estimators
   2.4 Minimum Variance Criterion
   2.5 Existence of the Minimum Variance Unbiased Estimator
   2.6 Finding the Minimum Variance Unbiased Estimator
   2.7 Extension to a Vector Parameter

3  Cramer-Rao Lower Bound
   3.1 Introduction
   3.2 Summary
   3.3 Estimator Accuracy Considerations
   3.4 Cramer-Rao Lower Bound
   3.5 General CRLB for Signals in White Gaussian Noise
   3.6 Transformation of Parameters
   3.7 Extension to a Vector Parameter
   3.8 Vector Parameter CRLB for Transformations
   3.9 CRLB for the General Gaussian Case
   3.10 Asymptotic CRLB for WSS Gaussian Random Processes
   3.11 Signal Processing Examples
   3A Derivation of Scalar Parameter CRLB
   3B Derivation of Vector Parameter CRLB
   3C Derivation of General Gaussian CRLB
   3D Derivation of Asymptotic CRLB

4  Linear Models
   4.1 Introduction
   4.2 Summary
   4.3 Definition and Properties
   4.4 Linear Model Examples
   4.5 Extension to the Linear Model

5  General Minimum Variance Unbiased Estimation
   5.1 Introduction
   5.2 Summary
   5.3 Sufficient Statistics
   5.4 Finding Sufficient Statistics
   5.5 Using Sufficiency to Find the MVU Estimator
   5.6 Extension to a Vector Parameter
   5A Proof of Neyman-Fisher Factorization Theorem (Scalar Parameter)
   5B Proof of Rao-Blackwell-Lehmann-Scheffe Theorem (Scalar Parameter)

6  Best Linear Unbiased Estimators
   6.1 Introduction
   6.2 Summary
   6.3 Definition of the BLUE
   6.4 Finding the BLUE
   6.5 Extension to a Vector Parameter
   6.6 Signal Processing Example
   6A Derivation of Scalar BLUE
   6B Derivation of Vector BLUE

7  Maximum Likelihood Estimation
   7.1 Introduction
   7.2 Summary
   7.3 An Example
   7.4 Finding the MLE
   7.5 Properties of the MLE
   7.6 MLE for Transformed Parameters
   7.7 Numerical Determination of the MLE
   7.8 Extension to a Vector Parameter
   7.9 Asymptotic MLE
   7.10 Signal Processing Examples
   7A Monte Carlo Methods
   7B Asymptotic PDF of MLE for a Scalar Parameter
   7C Derivation of Conditional Log-Likelihood for EM Algorithm Example

8  Least Squares
   8.1 Introduction
   8.2 Summary
   8.3 The Least Squares Approach
   8.4 Linear Least Squares
   8.5 Geometrical Interpretations
   8.6 Order-Recursive Least Squares
   8.7 Sequential Least Squares
   8.8 Constrained Least Squares
   8.9 Nonlinear Least Squares
   8.10 Signal Processing Examples
   8A Derivation of Order-Recursive Least Squares
   8B Derivation of Recursive Projection Matrix
   8C Derivation of Sequential Least Squares

9  Method of Moments
   9.1 Introduction
   9.2 Summary
   9.3 Method of Moments
   9.4 Extension to a Vector Parameter
   9.5 Statistical Evaluation of Estimators
   9.6 Signal Processing Example

10  The Bayesian Philosophy
   10.1 Introduction
   10.2 Summary
   10.3 Prior Knowledge and Estimation
   10.4 Choosing a Prior PDF
   10.5 Properties of the Gaussian PDF
   10.6 Bayesian Linear Model
   10.7 Nuisance Parameters
   10.8 Bayesian Estimation for Deterministic Parameters
   10A Derivation of Conditional Gaussian PDF

11  General Bayesian Estimators
   11.1 Introduction
   11.2 Summary
   11.3 Risk Functions
   11.4 Minimum Mean Square Error Estimators
   11.5 Maximum A Posteriori Estimators
   11.6 Performance Description
   11.7 Signal Processing Example
   11A Conversion of Continuous-Time System to Discrete-Time System

12  Linear Bayesian Estimators
   12.1 Introduction
   12.2 Summary
   12.3 Linear MMSE Estimation
   12.4 Geometrical Interpretations
   12.5 The Vector LMMSE Estimator
   12.6 Sequential LMMSE Estimation
   12.7 Signal Processing Examples - Wiener Filtering
   12A Derivation of Sequential LMMSE Estimator

13  Kalman Filters
   13.1 Introduction
   13.2 Summary
   13.3 Dynamical Signal Models
   13.4 Scalar Kalman Filter
   13.5 Kalman Versus Wiener Filters
   13.6 Vector Kalman Filter
   13.7 Extended Kalman Filter
   13.8 Signal Processing Examples
   13A Vector Kalman Filter Derivation
   13B Extended Kalman Filter Derivation

14  Summary of Estimators
   14.1 Introduction
   14.2 Estimation Approaches
   14.3 Linear Model
   14.4 Choosing an Estimator

15  Extensions for Complex Data and Parameters
   15.1 Introduction
   15.2 Summary
   15.3 Complex Data and Parameters
   15.4 Complex Random Variables and PDFs
   15.5 Complex WSS Random Processes
   15.6 Derivatives, Gradients, and Optimization
   15.7 Classical Estimation with Complex Data
   15.8 Bayesian Estimation
   15.9 Asymptotic Complex Gaussian PDF
   15.10 Signal Processing Examples
   15A Derivation of Properties of Complex Covariance Matrices
   15B Derivation of Properties of Complex Gaussian PDF
   15C Derivation of CRLB and MLE Formulas

A1  Review of Important Concepts
   A1.1 Linear and Matrix Algebra
   A1.2 Probability, Random Processes, and Time Series Models

A2  Glossary of Symbols and Abbreviations

Index

Preface

Parameter estimation is a subject that is standard fare in the many books available on statistics. These books range from the highly theoretical expositions written by statisticians to the more practical treatments contributed by the many users of applied statistics. This text is an attempt to strike a balance between these two extremes. The particular audience we have in mind is the community involved in the design and implementation of signal processing algorithms. As such, the primary focus is on obtaining optimal estimation algorithms that may be implemented on a digital computer. The data sets are therefore assumed to be samples of a continuous-time waveform or a sequence of data points. The choice of topics reflects what we believe to be the important approaches to obtaining an optimal estimator and analyzing its performance. As a consequence, some of the deeper theoretical issues have been omitted with references given instead.

It is the author's opinion that the best way to assimilate the material on parameter estimation is by exposure to and working with good examples. Consequently, there are numerous examples that illustrate the theory and others that apply the theory to actual signal processing problems of current interest. Additionally, an abundance of homework problems have been included. They range from simple applications of the theory to extensions of the basic concepts. A solutions manual is available from the publisher.

To aid the reader, summary sections have been provided at the beginning of each chapter. Also, an overview of all the principal estimation approaches and the rationale for choosing a particular estimator can be found in Chapter 14. Classical estimation is first discussed in Chapters 2-9, followed by Bayesian estimation in Chapters 10-13. This delineation will, hopefully, help to clarify the basic differences between these two principal approaches. Finally, again in the interest of clarity, we present the estimation principles for scalar parameters first, followed by their vector extensions. This is because the matrix algebra required for the vector estimators can sometimes obscure the main concepts.

This book is an outgrowth of a one-semester graduate level course on estimation theory given at the University of Rhode Island. It includes somewhat more material than can actually be covered in one semester. We typically cover most of Chapters 1-12, leaving the subjects of Kalman filtering and complex data/parameter extensions to the student. The necessary background that has been assumed is an exposure to the basic theory of digital signal processing, probability and random processes, and linear and matrix algebra. This book can also be used for self-study and so should be useful to the practicing engineer as well as the student.

The author would like to acknowledge the contributions of the many people who over the years have provided stimulating discussions of research problems, opportunities to apply the results of that research, and support for conducting research. Thanks are due to my colleagues L. Jackson, R. Kumaresan, L. Pakula, and D. Tufts of the University of Rhode Island, and L. Scharf of the University of Colorado.
Exposure to practical problems, leading to new research directions, has been provided by H. Woodsum of Sonetech, Bedford, New Hampshire, and by D. Mook, S. Lang, C. Myers, and D. Morgan of Lockheed-Sanders, Nashua, New Hampshire. The opportunity to apply estimation theory to sonar and the research support of J. Kelly of the Naval Undersea Warfare Center, Newport, Rhode Island, J. Salisbury of Analysis and Technology, Middletown, Rhode Island (formerly of the Naval Undersea Warfare Center), and D. Sheldon of the Naval Undersea Warfare Center, New London, Connecticut, are also greatly appreciated. Thanks are due to J. Sjogren of the Air Force Office of Scientific Research, whose continued support has allowed the author to investigate the field of statistical estimation.

A debt of gratitude is owed to all my current and former graduate students. They have contributed to the final manuscript through many hours of pedagogical and research discussions as well as by their specific comments and questions. In particular, P. Djuric of the State University of New York proofread much of the manuscript, and V. Nagesha of the University of Rhode Island proofread the manuscript and helped with the problem solutions.

Steven M. Kay
University of Rhode Island
Kingston, RI 02881

Chapter 1

Introduction

1.1 Estimation in Signal Processing

Modern estimation theory can be found at the heart of many electronic signal processing systems designed to extract information. These systems include

   Radar
   Sonar
   Speech
   Image analysis
   Biomedicine
   Communications
   Control
   Seismology,

and all share the common problem of needing to estimate the values of a group of parameters. We briefly describe the first three of these systems.

In radar we are interested in determining the position of an aircraft, as for example, in airport surveillance radar [Skolnik 1980]. To determine the range R we transmit an electromagnetic pulse that is reflected by the aircraft, causing an echo to be received by the antenna $\tau_0$ seconds later, as shown in Figure 1.1a. The range is determined by the equation $\tau_0 = 2R/c$, where $c$ is the speed of electromagnetic propagation. Clearly, if the round trip delay $\tau_0$ can be measured, then so can the range. A typical transmit pulse and received waveform are shown in Figure 1.1b. The received echo is decreased in amplitude due to propagation losses and hence may be obscured by environmental noise. Its onset may also be perturbed by time delays introduced by the electronics of the receiver. Determination of the round trip delay can therefore require more than just a means of detecting a jump in the power level at the receiver. It is important to note that a typical modern radar system will input the received continuous-time waveform into a digital computer by taking samples via an analog-to-digital convertor. Once the waveform has been sampled, the data compose a time series. (See also Examples 3.13 and 7.15 for a more detailed description of this problem and optimal estimation procedures.)

[Figure 1.1 Radar system: (a) radar processing system with transmit/receive antenna; (b) transmit pulse and received waveform versus time, with round trip delay $\tau_0$.]
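The range equation is easy to exercise numerically. The short Python sketch below converts a round trip delay into a range via $R = c\tau_0/2$; the delay value is an arbitrary assumption chosen only for illustration, and a real system would first have to estimate $\tau_0$ from the noisy received waveform.

```python
# Speed of electromagnetic propagation (m/s).
c = 3e8

def range_from_delay(tau0):
    """Range R implied by a round trip delay tau0, from tau0 = 2R/c."""
    return c * tau0 / 2.0

# Hypothetical measured delay of 200 microseconds (assumed value, for illustration only).
tau0_measured = 200e-6
print(range_from_delay(tau0_measured))   # 30000.0 m, i.e., a target at 30 km

# Sensitivity: a 1 microsecond error in the delay estimate maps to a 150 m range error.
print(range_from_delay(1e-6))            # 150.0 m
```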
Another common application is in sonar, in which we are also interested in the position of a target, such as a submarine [Knight et al. 1981, Burdic 1984]. A typical passive sonar is shown in Figure 1.2a. The target radiates noise due to machinery on board, propellor action, etc. This noise, which is actually the signal of interest, propagates through the water and is received by an array of sensors. The sensor outputs are then transmitted to a tow ship for input to a digital computer. Because of the positions of the sensors relative to the arrival angle of the target signal, we receive the signals shown in Figure 1.2b. By measuring $\tau_0$, the delay between sensors, we can determine the bearing $\beta$.

[Figure 1.2 Passive sonar system: (a) passive sonar geometry with towed array, sea surface, and sea bottom; (b) received signals at the array sensors versus time.]

Appendix 1

Review of Important Concepts

A1.1 Linear and Matrix Algebra

Important results from linear and matrix algebra theory are reviewed in this section. In the discussions to follow it is assumed that the reader already has some familiarity with these topics. The specific concepts to be described are used heavily throughout the book. For a more comprehensive treatment the reader is referred to the books [Noble and Daniel 1977] and [Graybill 1969]. All matrices and vectors are assumed to be real.

A1.1.1 Definitions

Consider an $m \times n$ matrix $\mathbf{A}$ with elements $a_{ij}$, $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$. A shorthand notation for describing $\mathbf{A}$ is
$$[\mathbf{A}]_{ij} = a_{ij}.$$
The transpose of $\mathbf{A}$, which is denoted by $\mathbf{A}^T$, is defined as the $n \times m$ matrix with elements $a_{ji}$ or
$$[\mathbf{A}^T]_{ij} = a_{ji}.$$
A square matrix is one for which $m = n$. A square matrix is symmetric if $\mathbf{A}^T = \mathbf{A}$. The rank of a matrix is the number of linearly independent rows or columns, whichever is less. The inverse of a square $n \times n$ matrix is the square $n \times n$ matrix $\mathbf{A}^{-1}$ for which
$$\mathbf{A}^{-1}\mathbf{A} = \mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$$
where $\mathbf{I}$ is the $n \times n$ identity matrix. The inverse will exist if and only if the rank of $\mathbf{A}$ is $n$. If the inverse does not exist, then $\mathbf{A}$ is singular.
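As a concrete illustration of these definitions, the following NumPy sketch (the numerical values are arbitrary assumptions) checks symmetry, rank, and invertibility for a small matrix and shows a singular matrix whose rank is less than its dimension.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])              # symmetric, since A equals its transpose

print(np.allclose(A, A.T))              # True: A is symmetric
print(np.linalg.matrix_rank(A))         # 2: full rank, so the inverse exists
A_inv = np.linalg.inv(A)
print(np.allclose(A_inv @ A, np.eye(2)))  # True: A^{-1} A = I

B = np.array([[1.0, 2.0],
              [2.0, 4.0]])              # second row is twice the first
print(np.linalg.matrix_rank(B))         # 1 < 2, so B is singular (no inverse)
```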
The determinant of a square $n \times n$ matrix is denoted by $\det(\mathbf{A})$. It is computed as
$$\det(\mathbf{A}) = \sum_{j=1}^n a_{ij} C_{ij}$$
where $C_{ij} = (-1)^{i+j} M_{ij}$ is the cofactor of $a_{ij}$ and $M_{ij}$, the minor of $a_{ij}$, is obtained by deleting the $i$th row and $j$th column of $\mathbf{A}$.

The quadratic form $Q$ is defined as
$$Q = \sum_{i=1}^n \sum_{j=1}^n a_{ij} x_i x_j.$$
In defining the quadratic form it is assumed that $a_{ji} = a_{ij}$. This entails no loss in generality since any quadratic function may be expressed in this manner. $Q$ may also be expressed as
$$Q = \mathbf{x}^T \mathbf{A} \mathbf{x}$$
where $\mathbf{x} = [x_1\ x_2\ \cdots\ x_n]^T$ and $\mathbf{A}$ is a square $n \times n$ matrix with $a_{ji} = a_{ij}$, or $\mathbf{A}$ is a symmetric matrix.

A square $n \times n$ matrix $\mathbf{A}$ is positive semidefinite if $\mathbf{A}$ is symmetric and
$$\mathbf{x}^T \mathbf{A} \mathbf{x} \ge 0$$
for all $\mathbf{x} \ne \mathbf{0}$. If the quadratic form is strictly positive, then $\mathbf{A}$ is positive definite. When referring to a matrix as positive definite or positive semidefinite, it is always assumed that the matrix is symmetric.

The trace of a square $n \times n$ matrix is the sum of its diagonal elements or
$$\mathrm{tr}(\mathbf{A}) = \sum_{i=1}^n a_{ii}.$$

A square $n \times n$ matrix is orthogonal if
$$\mathbf{A}^{-1} = \mathbf{A}^T.$$
For a matrix to be orthogonal the columns (and rows) must be orthonormal or, if
$$\mathbf{A} = [\mathbf{a}_1\ \mathbf{a}_2\ \cdots\ \mathbf{a}_n],$$
where $\mathbf{a}_i$ denotes the $i$th column, the conditions
$$\mathbf{a}_i^T \mathbf{a}_j = \delta_{ij}$$
must be satisfied.

A partitioned $m \times n$ matrix $\mathbf{A}$ is one that is expressed in terms of its submatrices. An example is the $2 \times 2$ partitioning
$$\mathbf{A} = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix}.$$
Each "element" $\mathbf{A}_{ij}$ is a submatrix of $\mathbf{A}$. The dimensions of the partitions are given as
$$\begin{bmatrix} k \times l & k \times (n-l) \\ (m-k) \times l & (m-k) \times (n-l) \end{bmatrix}.$$

A1.1.2 Special Matrices

A diagonal matrix is a square $n \times n$ matrix with all elements off the principal diagonal equal to zero. A diagonal matrix appears as
$$\mathbf{A} = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix}.$$
A block diagonal matrix is a square $n \times n$ matrix of the form
$$\mathbf{A} = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_{22} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{A}_{kk} \end{bmatrix}$$
in which all submatrices $\mathbf{A}_{ii}$ are square and the other submatrices are identically zero. The dimensions of the submatrices need not be identical. For instance, if $k = 2$, $\mathbf{A}_{11}$ and $\mathbf{A}_{22}$ may have different dimensions, and $\mathbf{A}_{22}$ might even be a scalar. If all $\mathbf{A}_{ii}$ are nonsingular, then the inverse is easily found as
$$\mathbf{A}^{-1} = \begin{bmatrix} \mathbf{A}_{11}^{-1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_{22}^{-1} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{A}_{kk}^{-1} \end{bmatrix}.$$
Also, the determinant is
$$\det(\mathbf{A}) = \prod_{i=1}^k \det(\mathbf{A}_{ii}).$$

An important example of an orthogonal matrix arises in modeling of data by a sum of harmonically related sinusoids or by a discrete Fourier series. As an example, for $n$ even
$$\mathbf{A} = \sqrt{\frac{2}{n}}\begin{bmatrix}
\frac{1}{\sqrt{2}} & 1 & 0 & \cdots & \frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} & \cos\frac{2\pi}{n} & \sin\frac{2\pi}{n} & \cdots & \frac{1}{\sqrt{2}}\cos\frac{2\pi(\frac{n}{2})}{n} \\
\vdots & \vdots & \vdots & & \vdots \\
\frac{1}{\sqrt{2}} & \cos\frac{2\pi(n-1)}{n} & \sin\frac{2\pi(n-1)}{n} & \cdots & \frac{1}{\sqrt{2}}\cos\frac{2\pi(\frac{n}{2})(n-1)}{n}
\end{bmatrix}$$
is an orthogonal matrix. This follows from the orthogonality relationships, valid for $i, j = 0, 1, \ldots, n/2$,
$$\frac{1}{n}\sum_{k=0}^{n-1} \cos\frac{2\pi k i}{n}\cos\frac{2\pi k j}{n} =
\begin{cases} 0 & i \ne j \\ \frac{1}{2} & i = j = 1, 2, \ldots, \frac{n}{2}-1 \\ 1 & i = j = 0, \frac{n}{2} \end{cases}$$
and for $i, j = 1, 2, \ldots, n/2 - 1$
$$\frac{1}{n}\sum_{k=0}^{n-1} \sin\frac{2\pi k i}{n}\sin\frac{2\pi k j}{n} = \frac{1}{2}\delta_{ij}$$
and finally for $i = 0, 1, \ldots, n/2$; $j = 1, 2, \ldots, n/2 - 1$
$$\frac{1}{n}\sum_{k=0}^{n-1} \cos\frac{2\pi k i}{n}\sin\frac{2\pi k j}{n} = 0.$$
These orthogonality relationships may be proven by expressing the sines and cosines in terms of complex exponentials and using the result [Oppenheim and Schafer 1975]
$$\sum_{k=0}^{n-1} \exp\left(j\frac{2\pi}{n}lk\right) = n\,\delta_{l0} \qquad \text{for } l = 0, 1, \ldots, n-1.$$
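The structure of this matrix can be made concrete with a few lines of NumPy. The value n = 8 below is an arbitrary assumed even order, and the column-by-column construction mirrors the display above; the check confirms that $\mathbf{A}^T\mathbf{A} = \mathbf{I}$, i.e., $\mathbf{A}^{-1} = \mathbf{A}^T$.

```python
import numpy as np

def harmonic_matrix(n):
    """Orthogonal matrix of sampled sinusoids at harmonic frequencies (n even)."""
    t = np.arange(n)
    cols = [np.ones(n) / np.sqrt(2)]                      # constant column
    for k in range(1, n // 2):
        cols.append(np.cos(2 * np.pi * k * t / n))        # cosine at frequency k/n
        cols.append(np.sin(2 * np.pi * k * t / n))        # sine at frequency k/n
    cols.append(np.cos(2 * np.pi * (n // 2) * t / n) / np.sqrt(2))  # alternating column
    return np.sqrt(2.0 / n) * np.column_stack(cols)

A = harmonic_matrix(8)                                    # n = 8 chosen arbitrarily
print(np.allclose(A.T @ A, np.eye(8)))                    # True: columns are orthonormal
print(np.allclose(np.linalg.inv(A), A.T))                 # True: A^{-1} = A^T
```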
defined by where H is an m x n full rank matrix with m > n A square n x n Toeplit,(matrix is defined as where Mij is the minor of aij obtained by deleting the ith row and ith column of A Another formula which is quite useful is the matrix inversion lemma - or where it is assumed that A is n x n, B is n x m, C is m x m, and D is m x n and that £Iie1'ilciicated inverses eXIst A special case known as Woodbury's identityJ'esults for B an n x column vecto~C a scalar of unity, and D a x n row vector u T Then, '" '" a_(n_l) a_(n_2) (ALl) A= (A+UUT)-l =A- ao ~ent along a northwest-southeast diagonal is the same If in addition, a ak, then A is symmetric Toeplitz k P~itioned A-1uuTA-l, _ 1+uT A- 1u ' matrices may be manipulated according to the usual rules of matrix ~gebra:by conSidering each submatrix as an element For multiplication of partitioned 572 APPENDIX REVIEW OF IMPORTANT CONCEPTS matrices the submatrices which are multiplied together must be conformable As an illustration, for x partitioned matrices AB :~~ ] = AllB12 A21B12 + A12B22 + A22B22 573 A1.l LINEAR AND MATRIX ALGEBRA h the principal minors are all positive (The ith principal minor is the determinant of the submatrix formed by deleting all rows and columns wIth an mdex greater than i.) If A can be written as in (A1.2), but C is not full rank or the prmclpal minors are only nonnegjative, then A is positive semidefinite If A is positive definite, then the inverse exists and may be found from (A1.2) as A = (C 1)"(C 1) ( ] The transposition of a artitioned matrix is formed b trans os in the submatrices of the matrix an applying T to each submatrix For a x partitioned matrix Let A be positive definite If B is an m x n matrix of full rank with m < n, then BABT is also positive definite If A is positive definite (positive semidefinite) then a the diagonal elements are positive (nonnegative,) The extension of these properties to arbitrary partitioning is straightforward Determination of the inverses and determinants of partitioned matrices is facilitated b emp oymg t et A e a square n x n matrix partitioned as A =[ All A21 A12] _ [ kx k A22 (n - k) x k (All - A12A221 A 21 )-1 -(A22 - A21AIIIAI2)-IA21Al/ det(A) ~ere -(All - A12A221 A21)-tAI2A221 (A22 - A21Al/ A I2 )-1 det(A 22 ) det(All - A12A221 A21 ) det(All) det(A 22 - A21A 1II A 12 ) the inverses of Au and A22 are assumed to exist A1.1.4 Eigendecompostion of Matrices (Al.3) AV=AV ~ ~ =[ A1.1.5 for some scalar A, which ma com lex A is the eigenvaluec1f A corresponding to t e eigenvector v li is assumed that the eigenvector is normalized to have unit length or Vi v = If A is symmetric, then one can alwa s find n linear! 
inde endeirt"eigenvectors, althoug they wi not in general be unique An example is the identity matrix for which any vector IS an eigenvector with eigenvalue If A is symmetric then the eigenvectors corresponding to distinct ei envalues are orthonormal or v[Vj = 8i /and the eigenv ues are real If, furthermore, the matrix is positive definite (positive semidefinite), then the eigenvalues are ositive (nonnegative For a positive semI e m e matrIx e ran IS e ual to the number of nonzero ei envalues e defining relatIOn (A1.3) can also be written as Theorenm Some important theorems used throughout the text are summarized in this section or AV=VA (Al.4) where (nonnegativ~) An eigenvector of a square n x n matrix A is an n x vector v satisfyiJ;!g k x (n - k) ] (n - k) x (n - k) Then, A-I h the determinant of A, which is a principal minor, is positive ~9.uare v = n x n matrix A is positive definite if and only if where C is also n x n and is full rank and hence invertible, ~ V2 • Vn A = diag(Ab A2"'" An) a it can be written as A=CCT [VI (A1.2) 574 APPENDIX REVIEW OF IMPORTANT CONCEPTS A1.2 PROBABILITY, RANDOM PROCESSES, TIME SERIES MODELS becomes The extension to a set of random variables or a random vector x mean E(x) = I-'x n E>'ivivr.·; 575 = [Xl X2 •• xn]T with and covariance matrix i=l Also, the inverse is easily determined as A -1 ~ is the multivariate Gaussian PDF y T - A- l y - l p(x) YA- 1yT, ~ ~ i=l Ni, (211")2 det' (C x ) exp [ (x - I-'x)T C;l (x - I-'x)] (A1.6) Note that C x is an n x n symmetric matrix with [Cx]ij = E{[Xi - E(Xi)][Xj - E(xj)]} = Xj) and is assumed to be positive definite so that C x is invertible If C x is a diagonal matrix, then the random variables are uncorrelated In this case (A1.6) factors into the product of N univariate Gaussian PDFs of the form of (A1.5), and hence the random variables are also independent If x is zero mean, then the higher order joint moments are easily computed In particUlar the fourth·order moment is T :f ViVi ' COV(Xi' t A final useful relationship follows from (AlA) as det(A) = = det(Y) det(A) det(y-1f:> det(A) - = II\·' n i=l J If x is linearly transformed as y=Ax+b A1.2 Probability, Random Processes, and Time Series Models where A is m x nand b is m x with m < n and A full rank (so that C y is nonsingular), tfien y is also distributed according to a multivariate Gaussian distribution with A1.2.1 and )211"a; 2a; y 00 < x < 00 i=l ~ere X~ denotes a X random variable with n degrees of freedom The PDF is given (A1.5) p(y) = The shorthand notation x '" N (/1-x, a;) is often used, where '" means "is distributed ~ccording to." 
If x '" N(O, then the moments of x are an, E( k) _ { 1·3··· (k - l)a~ x = Ex; '" x~ as - = C y = ACxAT n A probability density function PDF which is frequently used to model the statistical behavlOr a ran om vana e is the Gaussian distribution A random variable x with mean /1-x and variance a~ is distributed according to a Gaussian or normal distribution if the PDF is given by = _1_ exp [ l_(x _ /1- x )2] E [(y -l-'y)(Y -I-'yfl Another useful PDF is the distribution, which is derived from the Gaussian distribution x is composed of independent and identically distributed random variables with Xi '" N(O, 1), i = 1,2, , n, then Useful Probability Density Functions p(X) = I-'y = AI-'x + b E(y) An assumption is made that the reader already has some familiarity with probability theory and basic random process theory This chapter serves as a review of these topics For those readers needing a more extensive treatment the text by Papoulis [1965] on probability and random processes is recommended For a discussion of time series modeling see [Kay 1988] { ~y~-l exp(-h) for y ;::: o for y < r(.) where r(u) is the gamma integral The mean and variance of yare E(y) var(y) k even k odd J = n 2n y 576 A1.2.2 APPENDIX REVIEW OF IMPORTANT CONCEPTS A1.2 PROBABILITY RANDOM PROCESSES, TIME SERIES MODELS Random Process Characterization 00 A discrete random process x[n] is a sequence of random variables defined for every integer n If the discrete random process is wide sense stationary (WSS), then it has a mean E(x[n]) 577 = J.Lx (A1.9) k=-oo erty Tyx[k] It also follows from the definition of the cross-P t at = Txy[-k] which does not depend on n, and an autocorrelation function (ACFl Txx[k] = E(x[n]x[n + k]) (Al.7) which depends only on the lag k between the two samples and not their absolute positions Also, the autocovariance function is defined as cxx[k] = E [(x[n] - J.Lx)(x[n + k] - J.Lx)] = Txx[k] - J.L; In a similar manner, two jointly WSS random processes x[n] and y n have a crosscorrelation function (CCF Txy[k] = E(x[n]y[n + k]) where 8[k] is the discrete impulse function This says that each sample is uncorrelated with all the others Using (A1.8), the PSD becomes and a cross-covariance function cxy[k] = E [(x[n] - J.Lx)(y[n + k] -/Ly)] = Txy[k] - J.LxJ.Ly.: Pxx(J) = and is observed to be completely flat with fre uenc Alternatively, ;yhite noise is compose egUl-power contributions from all freguencies For a linear shift invariant (LSI) system with impulse response h[n] and with a WSS random process input, various I!lationships between the correlations and spect.ral density fUnctions of the input process x[n] and output process yrn] hold The correlatIOn relationships are Some useful properties of the ACF and CCF are Txx[O] Txx[-k] Txy[-k] (}"2 > ITxx[kJl Txx[k] Tyx[k] 00 Note that Txx[O] is positive, which follows from (A1.7) The z transforms of the ACF and CCF defined as h[k] *Txx[k] L = h[l]Txx[k -I] 1=-00 00 L Pxx(z) 00 Txx[k]z-k h[-k] *Txx[k] = k=-oo L h[-I]Txx[k -I] 1=-00 00 Pxy(z) = L Txy[k]z-k 00 L k=-co l~ad to the definition of the pOwer spectral density (PSD) When evaluated on the unit circle, Prx(z) and Pxy(z) become the auto-PSD, Pxx(J) = Prr(exp[j27rfJ), and cross-PSD, PXy(J) = PXy(exp[j27rf]), or explicitly co Pxx(J) = L k=-co Txx[k]exp(-j21l"fk) (A1.8) m=-oo 00 h[k - m] L h[-I]Txx[m -I] [=-00 where * denotes convolution Denoting the system function by 1i( z) = 2:~- 00 h[n]z-n, the following relationships fort"he PSDs follow from these correlation propertIes Pxy(z) Pyx(z) Pyy(z) 1i(z)Pxx (z) 
1i(l/z)Pxx (z) 1i(z)1i(l/ z)Pxx(z) 578 In particular, letting H(f) 579 A1.2 PROBABILITY, RANDOM PROCESSES, TIME SERIES MODELS APPENDIX REVIEW OF IMPORTANT CONCEPTS The first time series model is termed an autoregressive (AR) process, which has the time domain representation = 1£[exp(j27rJ)] be the frequency response of the LSI system results in p Pxy(f) Pyx (f) Pyy(f) = H(f)Pxx(f) H* (f)Pxx (f) IH(fW Pxx(f) x[n] A1.2.3 = p + La[k]exp(-j27r!k) H(f) This will form the basis of the time series models to be described k=1 the AR PSD is Gaussian Random Process Pxx(f) A Gaussian random process is one for which an sam les x no ,x n1],' ,x[nN -1] are Jomt y 1stributed accor mg to a multivariate Gaussian PDF If the samp es are a en at successive times to generate the vector x = [x [0] x[I] x[N -l]f, then assuming a zero mean WSS random process, the covariance matrix takes on the form l rxx[O] rxx[l] rx~[O] p - La[l]rxx[k -l] rxx [ k ] m = p :: In matrix form this becomes for k l rxx[O] rxx[I] - ~a[l]rxx[l] + (]'~ k=O -l]]l a[I]] a[2] = 1,2, , P rxx[p rxx[p - 2] · rxx[O] a[p] ·· rxx[p - 1] rxx[p - 2] and also k2:1 1=1 { -oo

Ngày đăng: 10/04/2020, 11:58

Từ khóa liên quan

Mục lục

  • Fundamentals of Statistical Signal Processing: Estimation Theory

Tài liệu cùng người dùng

  • Đang cập nhật ...

Tài liệu liên quan