Research and development of indoor positioning techniques using Wi-Fi signals (summary)

MINISTRY OF EDUCATION AND TRAINING
MINISTRY OF SCIENCE AND TECHNOLOGY
NATIONAL CENTER FOR TECHNOLOGICAL PROGRESS

VU TRUNG KIEN

RESEARCH AND DEVELOPMENT FOR WI-FI BASED INDOOR POSITIONING TECHNIQUE

SUMMARY OF DOCTORAL THESIS

Field of study: Electronics Engineering
Code: 9520203

HA NOI - 2019

The thesis was completed at: National Center for Technological Progress
Supervisor: Prof., Dr Le Hung Lan
Reviewer 1: Assoc Prof., Dr Thai Quang Vinh
Reviewer 2: Assoc Prof., Dr Ha Hai Nam
Reviewer 3: Assoc Prof., Dr Hoang Van Phuc

The thesis shall be defended in front of the Thesis Committee at Academy Level at National Center for Technological Progress
At hour date month year 2019
The thesis can be found at: The Library of National Center for Technological Progress; The National Library

LIST OF PUBLISHED WORKS RELATED TO THE THESIS
[CT1] Hoang Manh Kha, Duong Thi Hang, Vu Trung Kien, Trinh Anh Vu (2017), "Enhancing WiFi based Indoor Positioning by Modeling measurement Data with GMM", IEEE International Conference on Advanced Technologies for Communications, IEEE, Quy Nhon, Vietnam, pp. 325-328.
[CT2] Vu, T.K., Hoang, M.K., and Le, H.L. (2018), "WLAN Fingerprinting based Indoor Positioning in the Precence of Dropped Mixture Data", Journal of Military Science and Technology 57A(3), pp. 25-34, https://drive.google.com/file/d/1jv2U3tmJq1vUEez6nt6Cq8DzJWEWZu6-/view
[CT3] Vu, Trung Kien and Le, Hung Lan (2018), "Gaussian Mixture Modeling for Wi-Fi Fingerprinting based Indoor Positioning in the Presence of Censored Data", Vietnam Journal of Science, Technology and Engineering 61(1), pp. 3-8, DOI: https://doi.org/10.31276/VJSTE.61(1).03-08
[CT4] (ISI-Q2) Vu, Trung Kien, Hoang, Manh Kha, and Le, Hung Lan (2019), "An EM algorithm for GMM parameter estimation in the presence of censored and dropped data with potential application for indoor positioning", ICT Express, 5(2), pp. 120-123, DOI: 10.1016/j.icte.2018.08.001
Accepted paper:
[CT5] (ISI-Q3) Vu, Trung Kien, Hoang, Manh Kha, and Le, Hung Lan (2019), "Performance Enhancement of Wi-Fi Fingerprinting based IPS by Accurate Parameter Estimation of Censored and Dropped Data", Radioengineering, ISSN: 1805-9600. Submission: 06/04/2019, Reviews Opened: 27/05/2019, Accepted: 03/09/2019.

INTRODUCTION

The necessity of the thesis
Satellite based positioning systems such as the GPS (Global Positioning System) can accurately locate objects in outdoor environments. In indoor environments, however, satellite signals do not reach the positioning device directly, so the accuracy of these systems is greatly reduced. At the same time, the demand for indoor navigation keeps growing: guiding smartphone users through terminals, airports and commercial centers; locating goods in stock; locating cars in parking lots. For these reasons, indoor positioning systems (IPS) have attracted considerable research and development interest in recent years.
Among the current indoor positioning technologies, Wi-Fi based positioning in the WLAN (Wireless Local Area Network) is the most commonly used, because Wi-Fi is available in most areas and popular mobile devices such as phones and computers are equipped with Wi-Fi transceivers.
For the above reasons, the author has chosen the topic "Research and development for Wi-Fi based indoor positioning techniques", which focuses on RSSIF-IPT (Received Signal Strength Indication Fingerprinting based Indoor Positioning Technique).

Scope of the study
The thesis studies techniques for positioning static objects in 2-dimensional space in indoor environments. The positioning technique under study is RSSIF-IPT. The issues studied include: the characteristics of Wi-Fi RSSI; modelling the distribution of Wi-Fi RSSI; algorithms to estimate and optimize the parameters of the model used for the distribution of Wi-Fi RSSI; and the online positioning algorithm.

Research objectives of the topic
Researching and developing
the Wi-Fi RSSI fingerprinting based indoor positioning technique in order to minimize positioning errors and optimize positioning time. The detailed research objectives are as follows:
+ Developing algorithms to estimate the parameters and the number of Gaussian components in a GMM (Gaussian Mixture Model) in the presence of unobservable data;
+ Developing a positioning algorithm that minimizes positioning errors and optimizes positioning time.

Methods
Statistical methods for characterizing the collected data (Wi-Fi RSSI); analytical methods for developing the parameter estimation and positioning algorithms; the Monte Carlo method for evaluating the proposed algorithms; empirical methods on both simulated and real data to verify the effectiveness of the proposals when applied to an IPS.

New findings of the doctoral dissertation
- The parameter estimation algorithm for GMM in the presence of censored and dropped mixture data [CT2-CT4];
- The model selection algorithm for GMM from incomplete data [CT5];
- The positioning procedure in the presence of unobservable data [CT5].

Organization of dissertation
The thesis is divided into four chapters:
Chapter 1: Overview of Wi-Fi based IPS
Chapter 2: GMM parameter estimation in the presence of censored and dropped data
Chapter 3: GMM model selection in the presence of censored and dropped data
Chapter 4: Positioning algorithm and experimental results

CHAPTER 1 OVERVIEW OF WI-FI BASED IPS

1.1 Wi-Fi based indoor positioning techniques
Wi-Fi based indoor positioning techniques (IPT) can be divided into two main groups:
- Time and Space Attributes of Received Signal (TSARS) based IPTs. TSARS can be the Time of Arrival (ToA), the Time Difference of Arrival (TDoA) or the Angle of Arrival (AoA).
- RSSI based IPTs. This group includes the proximity positioning technique, the Path Loss Model (PLM) based positioning technique and RSSIF-IPT.
RSSIF-IPT consists of two phases: the offline training phase and the online positioning phase. During the
training phase, RSSIs are collected at the reference points (RP) to build the database. In the online positioning phase, the RSSIs collected by the object (OB) are compared with the database, and the position of the OB is estimated from the location of one or several RPs.
Among the positioning techniques, RSSIF-IPT has the most advantages. RSSIF-IPT can use a deterministic method (D-RSSIF-IPT) or a probabilistic method (P-RSSIF-IPT). Compared with D-RSSIF-IPT, P-RSSIF-IPT has a lower positioning error because its database can cover the variation of the RSSI. P-RSSIF-IPT can use a non-parametric model (e.g. a histogram) or a parametric model (e.g. a single Gaussian, a GMM) to model the distribution of Wi-Fi RSSIs. P-RSSIF-IPT with a parametric model has lower positioning errors, and its database has to store fewer parameters than P-RSSIF-IPT with a non-parametric model.

1.2 Theoretical studies about the available RSSIF-IPT
The distribution of Wi-Fi RSSIs can be fitted by a single Gaussian distribution, or by a GMM if the data were collected under changing conditions (e.g. doors opening or closing, people moving). Compared to a single Gaussian, a GMM can therefore model the Wi-Fi RSSI distribution more accurately. However, some data samples may not be observable for either of the following reasons:
- Censoring, i.e., clipping. Sensors are unable to measure RSSI values below some threshold, such as -100 dBm.
- Dropping. Occasionally the RSSI measurements of an access point are not available, although their value is clearly above the censoring threshold.
While censoring is caused by the limited sensitivity of the Wi-Fi sensors on portable devices, dropping comes from limitations of the sensor drivers and the operation of the WLAN system.
According to our data investigation, the data set (Wi-Fi RSSIs) collected at an RP from an AP corresponds to one of the following eight cases:
(1) The data can be drawn from one Gaussian component, and the whole data set is observable;
(2) The data can be drawn from one Gaussian component, and part of the data set is unobservable due to censoring;
(3) The data can be drawn from one Gaussian component, and part of the data set is unobservable due to dropping;
(4) The data can be drawn from one Gaussian component, and part of the data set is unobservable due to censoring and dropping;
(5) The data can be drawn from more than one Gaussian component, and the whole data set is observable;
(6) The data can be drawn from more than one Gaussian component, and part of the data set is unobservable due to censoring (figure 1.10a);
(7) The data can be drawn from more than one Gaussian component, and part of the data set is unobservable due to dropping (figure 1.10b);
(8) The data can be drawn from more than one Gaussian component, and part of the data set is unobservable due to censoring and dropping (figure 1.10c).

Figure 1.10 Histogram of Wi-Fi RSSIs: (a) censored data, (b) dropped data, (c) censored and dropped data

Published works have handled data sets with the characteristics of cases (1) - (5). However, no study has handled data sets with the characteristics of cases (6) - (8). For this reason, the thesis focuses on developing RSSIF-IPT so that it solves the censoring, dropping and multi-component problems simultaneously (cases (6) - (8)).

1.3 Conclusion of chapter
This chapter presents the available Wi-Fi based indoor positioning techniques. It also summarizes and analyzes related works on RSSIF-IPT. Based on the related works and the issues that remain unsolved for RSSIF-IPT, the thesis sets out its research goals.

CHAPTER 2 GMM PARAMETER ESTIMATION IN THE PRESENCE OF CENSORED AND DROPPED DATA

2.1 Motivation
In an indoor environment, the data set (Wi-Fi RSSIs) collected at an RP from an AP can be modeled by
a GMM with $J$ Gaussian components ($J$ is a finite number). Let $y_n$ be the RSSI value gathered at the $n$th time ($y_n \in \mathbb{R}$, $n = 1 \ldots N$), where $N$ is the number of measurements. The $y_n$ are independent and identically distributed random variables. In a GMM, the PDF (Probability Density Function) of an observation $y_n$ is:
$$p(y_n;\Theta) = \sum_{j=1}^{J} w_j\,\mathcal{N}(y_n;\theta_j) \tag{2.1}$$
where $\Theta$ is the set of parameters of the GMM, and $w_j$ and $\theta_j$ are the mixing weight and the parameters of the $j$th Gaussian component.
Let $\mathbf{y} = (y_1, y_2, \ldots, y_N)$ be the set of unobservable, non-censored, non-dropped data (complete data); let $c$ be the threshold below which a portable device (e.g., a smartphone) does not report the signal strength; and let $\mathbf{x} = (x_1, x_2, \ldots, x_N)$ be the set of observable, censored and possibly dropped data (incomplete data). The censoring problem can be presented as follows:
$$x_n = \begin{cases} y_n & \text{if } y_n > c \\ c & \text{if } y_n \le c \end{cases}, \quad n = 1 \ldots N \tag{2.4}$$
Let $\mathbf{d} = (d_1, d_2, \ldots, d_N)$ be the set of hidden binary variables indicating whether an observation $y_n$ is dropped ($d_n = 1$) or not ($d_n = 0$). The dropping problem can be presented as follows:
$$x_n = \begin{cases} y_n & \text{if } d_n = 0 \\ c & \text{if } d_n = 1 \end{cases}, \quad n = 1 \ldots N \tag{2.5}$$
If data are unobservable owing to both the censoring and the dropping problems, then:
$$x_n = \begin{cases} y_n & \text{if } y_n > c \text{ and } d_n = 0 \\ c & \text{if } y_n \le c \text{ or } d_n = 1 \end{cases}, \quad n = 1 \ldots N \tag{2.6}$$
The motivation of this chapter is to estimate the GMM parameters from the incomplete data ($\mathbf{x}$).

2.2 Introduction to the EM algorithm
The EM (Expectation Maximization) algorithm is an iterative method for ML (Maximum Likelihood) estimation of the parameters of statistical models in the presence of hidden variables. It can be used to estimate the parameters of a GMM and consists of two steps:
- E-step: builds the expectation of the log-likelihood, evaluated using the current estimate of the parameters.
- M-step: computes the parameters that maximize the expected log-likelihood found in the E-step.

2.3 GMM parameter estimation in the presence of censored data
The EM algorithm for GMM parameters estimation in the presence of censored
data (EM-C-GMM) [CT3] is developed as follows:
Let $\Delta_{nj}$ ($n = 1 \ldots N$, $j = 1 \ldots J$) be the latent variables: $\Delta_{nj} = 1$ if $y_n$ belongs to the $j$th Gaussian component and $\Delta_{nj} = 0$ otherwise. The expectation of the log-likelihood function (LLF) of $\mathbf{y}$, given the observations $\mathbf{x}$ and the old parameter estimates, is calculated in the E-step:
$$Q(\Theta;\Theta^{(k)}) = E\left[\ln L(\Theta;\mathbf{y},\Delta) \mid \mathbf{x};\Theta^{(k)}\right] = \sum_{n=1}^{N}\sum_{j=1}^{J}\int \Delta_{nj}\,\ln\left[w_j\,p(y_n;\theta_j)\right]\,p\left(y_n,\Delta_{nj} \mid x_n;\Theta^{(k)}\right)dy_n \tag{2.17}$$

When part of the data is unobservable because of dropping [CT2], with $d_n$ as defined in (2.5), the E-step is:
$$Q(\Theta;\Theta^{(k)}) = \sum_{n=1}^{N}\sum_{j=1}^{J} d_n\,w_j^{(k)}\left[\ln w_j + \ln\pi\right] + \sum_{n=1}^{N}\sum_{j=1}^{J}(1-d_n)\,\gamma(x_n;\theta_j^{(k)})\left[\ln w_j + \ln(1-\pi) + \ln\mathcal{N}(x_n;\theta_j)\right] \tag{2.30}$$
In equation (2.30), $\pi = P(d_n = 1)$ is the dropping probability and $\gamma(x_n;\theta_j^{(k)})$ is the posterior probability, under the current estimates, that the observed sample $x_n$ belongs to the $j$th component.
M-step:
$$\mu_j^{(k+1)} = \frac{\sum_{n=1}^{N}(1-d_n)\,\gamma(x_n;\theta_j^{(k)})\,x_n}{\sum_{n=1}^{N}(1-d_n)\,\gamma(x_n;\theta_j^{(k)})} \tag{2.31}$$
$$\left(\sigma_j^{2}\right)^{(k+1)} = \frac{\sum_{n=1}^{N}(1-d_n)\,\gamma(x_n;\theta_j^{(k)})\left(x_n-\mu_j^{(k)}\right)^2}{\sum_{n=1}^{N}(1-d_n)\,\gamma(x_n;\theta_j^{(k)})} \tag{2.32}$$
$$w_j^{(k+1)} = \frac{\sum_{n=1}^{N}(1-d_n)\,\gamma(x_n;\theta_j^{(k)}) + \sum_{n=1}^{N} d_n\,w_j^{(k)}}{N} \tag{2.33}$$
$$\pi^{(k+1)} = \frac{\sum_{n=1}^{N} d_n}{N} \tag{2.34}$$

2.5 GMM parameter estimation in the presence of censored and dropped data
The EM algorithm for GMM parameters estimation in the presence of censored and dropped data (EM-CD-GMM) [CT4] is developed as follows:
E-step:
$$\begin{aligned} Q(\Theta;\Theta^{(k)}) ={}& \sum_{n=1}^{N}\sum_{j=1}^{J}(1-v_n)\,\gamma(x_n;\theta_j^{(k)})\left[\ln w_j + \ln(1-\pi) + \ln\mathcal{N}(x_n;\theta_j)\right] \\ &+ \sum_{n=1}^{N}\sum_{j=1}^{J} v_n\,\beta(\theta_j^{(k)})\,\alpha(\Theta^{(k)},\pi^{(k)})\int_{-\infty}^{c}\left[\ln w_j + \ln(1-\pi) + \ln\mathcal{N}(y_n;\theta_j)\right]\frac{\mathcal{N}(y_n;\theta_j^{(k)})}{I_0(\theta_j^{(k)})}\,dy_n \\ &+ \sum_{n=1}^{N}\sum_{j=1}^{J} v_n\,w_j^{(k)}\left[1-\alpha(\Theta^{(k)},\pi^{(k)})\right]\ln\pi \end{aligned} \tag{2.52}$$
In equation (2.52), $v_n$ ($n = 1 \ldots N$) are binary variables indicating whether $y_n$ is unobservable ($v_n = 1$, $x_n = c$) or observable ($v_n = 0$, $x_n = y_n$); $I_0(\theta_j) = \int_{-\infty}^{c}\mathcal{N}(y;\theta_j)\,dy$ is the probability mass of the $j$th component at or below the threshold; $\beta(\theta_j^{(k)})$ is the posterior probability that a censored sample belongs to the $j$th component; and
$$\alpha(\Theta^{(k)},\pi^{(k)}) = \frac{(1-\pi^{(k)})\sum_{j=1}^{J} w_j^{(k)}\,I_0(\theta_j^{(k)})}{(1-\pi^{(k)})\sum_{j=1}^{J} w_j^{(k)}\,I_0(\theta_j^{(k)}) + \pi^{(k)}}$$
is the probability that an unobservable sample was censored rather than dropped. Writing $\alpha^{(k)} = \alpha(\Theta^{(k)},\pi^{(k)})$ and $\beta_j^{(k)} = \beta(\theta_j^{(k)})$, and letting $I_1(\theta_j^{(k)})$ and $I_2(\theta_j^{(k)})$ denote the first and second moments of the $j$th component conditional on $y \le c$, the M-step is:
$$\mu_j^{(k+1)} = \frac{\sum_{n=1}^{N}(1-v_n)\,\gamma(x_n;\theta_j^{(k)})\,x_n + \beta_j^{(k)}\,\alpha^{(k)}\,I_1(\theta_j^{(k)})\sum_{n=1}^{N} v_n}{\sum_{n=1}^{N}(1-v_n)\,\gamma(x_n;\theta_j^{(k)}) + \beta_j^{(k)}\,\alpha^{(k)}\sum_{n=1}^{N} v_n} \tag{2.53}$$
$$\left(\sigma_j^{2}\right)^{(k+1)} = \frac{\sum_{n=1}^{N}(1-v_n)\,\gamma(x_n;\theta_j^{(k)})\left(x_n-\mu_j^{(k)}\right)^2 + \beta_j^{(k)}\,\alpha^{(k)}\left[I_2(\theta_j^{(k)}) - 2\mu_j^{(k)} I_1(\theta_j^{(k)}) + \left(\mu_j^{(k)}\right)^2\right]\sum_{n=1}^{N} v_n}{\sum_{n=1}^{N}(1-v_n)\,\gamma(x_n;\theta_j^{(k)}) + \beta_j^{(k)}\,\alpha^{(k)}\sum_{n=1}^{N} v_n} \tag{2.54}$$
$$w_j^{(k+1)} = \frac{\sum_{n=1}^{N}(1-v_n)\,\gamma(x_n;\theta_j^{(k)}) + \beta_j^{(k)}\,\alpha^{(k)}\sum_{n=1}^{N} v_n}{N - \left(1-\alpha^{(k)}\right)\sum_{n=1}^{N} v_n} \tag{2.55}$$
$$\pi^{(k+1)} = \frac{\left(1-\alpha^{(k)}\right)\sum_{n=1}^{N} v_n}{N} \tag{2.56}$$
As can be seen in equations (2.53) - (2.56), all collected data, including observable, censored and dropped samples, contribute to the estimates simultaneously. The proposed EM algorithm can therefore deal with all of the phenomena present in the collected data.

2.6 Evaluation of the EM-CD-GMM
In this section, the proposed EM-CD-GMM is evaluated and compared to other EM algorithms by using the Kullback-Leibler Divergence (KLD). After 1000 experiments, the mean of the KLD ($\overline{KLD}$) is shown in table 2.1 and the standard deviation of the KLD ($\sigma_{KLD}$) is shown in table 2.2 (for $c = -90$ dBm).

Table 2.1 Mean KLD of the EM algorithms after 1000 experiments (c = -90 dBm)
pi       EM-GMM    EM-CD-G   EM-CD-GMM
0        3.1491    0.0798    0.0098
0.075    3.2325    0.0864    0.0111
0.15     3.3142    0.1096    0.0229
0.225    3.5054    0.1329    0.0334
0.3      6.1253    0.1998    0.0364

Table 2.2 Standard deviation of the KLD of the EM algorithms after 1000 experiments (c = -90 dBm)
pi       EM-GMM    EM-CD-G   EM-CD-GMM
0        0.0351    0.1199    0.0227
0.075    0.3535    0.1364    0.0601
0.15     1.7911    0.1535    0.0857
0.225    2.202     0.1963    0.1005
0.3      2.4937    0.296     0.1302

As can be seen in table 2.1 and table 2.2:
- When $\pi = 0$ and $c = -96$ dBm, almost all data are observable. The EM-GMM and the EM-CD-GMM give the same results. The EM-CD-G has a larger error because this
algorithm assumes that the data follow a single Gaussian distribution.
- In the other cases, $\overline{KLD}$ and $\sigma_{KLD}$ of the EM-CD-GMM are always the smallest.
Hence, EM-CD-GMM is the most effective of the compared algorithms for GMM parameter estimation in the presence of censored and dropped data.

2.7 Conclusion of chapter
In chapter 2, the author proposed three algorithms to estimate the parameters of a GMM when part of the data set cannot be observed due to censoring, due to dropping, or due to both censoring and dropping. Experimental results demonstrated the effectiveness of the EM-CD-GMM algorithm compared to EM-GMM and EM-CD-G.

CHAPTER 3 GMM MODEL SELECTION IN THE PRESENCE OF CENSORED AND DROPPED DATA

3.1 Motivation
In complex indoor environments, the collected Wi-Fi RSSIs can be drawn from one or more Gaussian components. A GMM with $J$ Gaussian components has $N_{Ps} = 3J - 1$ parameters (a mean, a variance and a weight per component, with the weights summing to one). The number of parameters to store in the database and the computational cost of the positioning algorithm are therefore proportional to the number of Gaussian components used to describe the distribution of Wi-Fi RSSIs. It is thus necessary to estimate the number of Gaussian components in the GMM in order to optimize the database and reduce the complexity of the calculations in the positioning algorithm of the IPS.

3.2 Methods for GMM model selection
3.2.1 Penalty Function (PF) based methods
Let $\mathbf{x}$ be the mixture and observable data set; $N$ is the number of samples in $\mathbf{x}$; $\hat{\Theta}_J$ is the set of parameters of the GMM with $J$ Gaussian components; $N_{Ps}$ is the number of parameters of the GMM; $L(\hat{\Theta}_J \mid \mathbf{x})$ is the likelihood function. The PFs of the Akaike Information Criterion (AIC), AIC3 and the Bayesian Information Criterion (BIC) are defined as follows:
$$PF_{AIC}(\hat{\Theta}_J) = -2\ln\left[L(\hat{\Theta}_J \mid \mathbf{x})\right] + 2N_{Ps} \tag{3.3}$$
$$PF_{AIC3}(\hat{\Theta}_J) = -2\ln\left[L(\hat{\Theta}_J \mid \mathbf{x})\right] + 3N_{Ps} \tag{3.4}$$
$$PF_{BIC}(\hat{\Theta}_J) = -2\ln\left[L(\hat{\Theta}_J \mid \mathbf{x})\right] + N_{Ps}\ln(N) \tag{3.5}$$

3.2.2 Characteristic Function (CF) based methods
The CF based
method uses the convergence of the Sum of Weighted Real parts of all Log-Characteristic Functions (SWRLCF) to determine the number of Gaussian components, as follows:
$$SWRLCF(J) = \sum_{j=1}^{J}\hat{w}_j\,\hat{\varphi}_j \tag{3.6}$$
where $\hat{\varphi}_j$ denotes the real part of the log-characteristic function of the $j$th estimated component.

3.3 GMM model selection in the presence of censored and dropped data [CT5]
The term $\ln[L(\hat{\Theta}_J \mid \mathbf{x})]$ of $PF_{BIC}$ in equation (3.5) can be calculated as follows:
$$\ln L(\hat{\Theta}_J,\hat{\pi} \mid \mathbf{x}) = \sum_{n=1}^{N}(1-v_n)\ln\left[(1-\hat{\pi})\sum_{j=1}^{J}\hat{w}_j\,\mathcal{N}(x_n;\hat{\theta}_j)\right] + \sum_{n=1}^{N} v_n\ln\left[(1-\hat{\pi})\sum_{j=1}^{J}\hat{w}_j\,I_0(\hat{\theta}_j) + \hat{\pi}\right] \tag{3.7}$$
Let $PF_{BIC\text{-}CD}(\hat{\Theta}_J,\hat{\pi})$ be the PF of BIC in the presence of censored and dropped data. We have:
$$PF_{BIC\text{-}CD}(\hat{\Theta}_J,\hat{\pi}) = -2\sum_{n=1}^{N}(1-v_n)\ln\left[(1-\hat{\pi})\sum_{j=1}^{J}\hat{w}_j\,\mathcal{N}(x_n;\hat{\theta}_j)\right] - 2\sum_{n=1}^{N} v_n\ln\left[(1-\hat{\pi})\sum_{j=1}^{J}\hat{w}_j\,I_0(\hat{\theta}_j) + \hat{\pi}\right] + 3J\ln(N) \tag{3.12}$$
The algorithm for GMM parameter estimation and model selection in the presence of censored and dropped data (EM-CD-GMM-PFBIC-CD) is as follows (figure 3.4):
Input: a set of incomplete data ($\mathbf{x}$), the convergence threshold of the EM algorithm for CD-GMM ($\varepsilon_{EM}$) and the maximum number of Gaussian components ($J_{max}$) for calculating the PFs.
Output: the estimated number of Gaussian components ($\hat{J}$) and the estimated parameters ($\hat{\Theta}_{\hat{J}},\hat{\pi}$) of the CD-GMM used to model the distribution of $\mathbf{x}$.

Begin
1. Set $J = 1$.
2. Start the EM algorithm: set $k = 1$ and initialize $\theta_j^{(k)}$ ($j = 1 \ldots J$) and $\pi^{(k)}$.
3. E-step: compute $\gamma(x_n;\theta_j^{(k)})$, $I_0(\theta_j^{(k)})$, $I_1(\theta_j^{(k)})$, $I_2(\theta_j^{(k)})$, $\beta(\theta_j^{(k)})$ and $\alpha(\Theta^{(k)},\pi^{(k)})$ at the $k$th iteration ($j = 1 \ldots J$). According to equation (3.7), compute $\ln L(\Theta_J^{(k)},\pi^{(k)} \mid \mathbf{x})$ at the $k$th iteration.
4. M-step: according to equations (2.53) - (2.56), compute $\Theta_j^{(k+1)} = (\mu_j^{(k+1)}, \sigma_j^{(k+1)}, w_j^{(k+1)})$ and $\pi^{(k+1)}$ at the $(k+1)$th iteration ($j = 1 \ldots J$). According to equation (3.7), compute $\ln L(\Theta_J^{(k+1)},\pi^{(k+1)} \mid \mathbf{x})$ at the $(k+1)$th iteration.
5. Test $\left|\ln L(\Theta_J^{(k+1)},\pi^{(k+1)} \mid \mathbf{x}) - \ln L(\Theta_J^{(k)},\pi^{(k)} \mid \mathbf{x})\right| \le \varepsilon_{EM}$. If false, set $k = k + 1$ and return to step 3.
6. If true, output the estimated parameters of the CD-GMM with $J$ Gaussian components, $\hat{\Theta}_J = \Theta_J^{(k+1)}$ and $\hat{\pi} = \pi^{(k+1)}$, and compute $PF_{BIC\text{-}CD}(\hat{\Theta}_J,\hat{\pi})$ according to equation (3.12).
7. Test $J = J_{max}$. If false, set $J = J + 1$ and return to step 2.
8. If true, select the smallest $PF_{BIC\text{-}CD}$ among the $J_{max}$ penalty functions: $PF_{BIC\text{-}CD}(\hat{\Theta}_{\hat{J}},\hat{\pi}) = \min\left\{PF_{BIC\text{-}CD}(\hat{\Theta}_{J=1},\hat{\pi}), \ldots, PF_{BIC\text{-}CD}(\hat{\Theta}_{J=J_{max}},\hat{\pi})\right\}$.
9. Output the estimated number of Gaussian components $\hat{J}$ and the estimated parameters $\hat{\Theta}_{\hat{J}}$, $\hat{\pi}$.
End
Figure 3.4 The EM-CD-GMM-PFBIC-CD algorithm

3.4 Evaluation of GMM model selection algorithms
In this section, the following GMM model selection algorithms are evaluated through various experiments with artificial data:
- The GMM model selection algorithm using the EM-GMM and $PF_{AIC}$ (EM-GMM-PFAIC), initialized with $\varepsilon_{EM} = 10^{-6}$ and a maximum of $J_{max}$ components;
- The GMM model selection algorithm using the EM-GMM and $PF_{BIC}$ (EM-GMM-PFBIC), initialized with $\varepsilon_{EM} = 10^{-6}$ and a maximum of $J_{max}$ components;
- The GMM model selection algorithm using the EM-GMM and SWRLCF (EM-GMM-SWRLCF), initialized with $\varepsilon_{EM} = 10^{-6}$ and $\varepsilon_{CF} = 0.02$;
- The proposed algorithm (EM-CD-GMM-PFBIC-CD), initialized with $\varepsilon_{EM} = 10^{-6}$ and a maximum of $J_{max}$ components.
After 1000 experiments, the differences between the true number ($J$) and the estimated number ($\hat{J}$) of Gaussian components were recorded in table 3.2. As can be seen in table 3.2, the proposed method gives far better results than the other approaches, especially when the data suffer from censoring, dropping or both. This can be explained as follows. The proposed method uses the extended version of the EM algorithm, in which both observable data ($x_n = y_n$) and unobservable data ($x_n = c$) contribute to the estimates. When data are unobservable owing to the censoring and dropping problems, this algorithm produces much better results than the standard EM algorithm. Moreover, in the PF of AIC, the PF of BIC and SWRLCF, unobservable
data make almost no practical contribution, whereas they do contribute to the likelihood in the PF of our proposal, as described in sub-section 3.3.

Table 3.2 Differences between J and Ĵ for the four approaches (c = -92 dBm)
Method                 Probability        pi = 0   pi = 0.1   pi = 0.2
EM-GMM-PFAIC           P(J = Ĵ)           0.01     0.01       0.01
                       P(|J - Ĵ| = 1)     0.31     0.27       0.22
                       P(|J - Ĵ| >= 2)    0.68     0.72       0.78
EM-GMM-PFBIC           P(J = Ĵ)           0.01     0.01       0.01
                       P(|J - Ĵ| = 1)     0.39     0.37       0.3
                       P(|J - Ĵ| >= 2)    0.6      0.62       0.69
EM-GMM-SWRLCF          P(J = Ĵ)           0.52     0.02       0.01
                       P(|J - Ĵ| = 1)     0.39     0.78       0.77
                       P(|J - Ĵ| >= 2)    0.09     0.2        0.22
EM-CD-GMM-PFBIC-CD     P(J = Ĵ)           0.82     0.8        0.79
                       P(|J - Ĵ| = 1)     0.16     0.18       0.2
                       P(|J - Ĵ| >= 2)    0.02     0.02       0.01

3.5 Conclusion of chapter
When a portion of the data is not observed due to dropping, censoring or both, the other GMM model selection algorithms have a large error because they ignore the unobserved data samples. In chapter 3, the PF of BIC is calculated on both the observed and the unobserved data samples. This is the new finding of the proposed GMM model selection method compared to the others.

CHAPTER 4 POSITIONING ALGORITHM AND EXPERIMENTAL RESULTS

4.1 Motivation
P-RSSIF-IPT includes an offline training phase and an online positioning phase. In the offline training phase, let $N_{RP}$ be the number of RPs and $N_{AP}$ the number of APs; $\mathbf{x}_{q,i}$ ($q = 1 \ldots N_{RP}$, $i = 1 \ldots N_{AP}$) is the data set collected at the $q$th RP from the $i$th AP. The database built in the offline training stage of an IPS using P-RSSIF-IPT is therefore:
$$R = \left\{\hat{\Theta}_{q,i};\ q = 1 \ldots N_{RP},\ i = 1 \ldots N_{AP}\right\} \tag{4.1}$$
where $\hat{\Theta}_{q,i}$ is the set of parameters of the GMM used to model the distribution of $\mathbf{x}_{q,i}$, estimated by the EM-CD-GMM-PFBIC-CD algorithm.
During the online positioning phase, let $\mathbf{x}^{on} = (x_1^{on}, \ldots, x_{N_{AP}}^{on})$ be the data set collected by the OB. The positioning problem can be formulated as a classification problem, where the classes are the positions at which RSSI measurements were taken during the offline training phase (the RPs). To estimate the target's position, a MAP (maximum a posteriori) based classification rule is
developed in this chapter. The censoring and dropping problems are also taken into account in this proposal.

4.2 Optimal classification rule for censored and dropped mixture data [CT5]
Let $\ell_q$ be the position of the $q$th RP and $\mathbf{x}^{on} = [x_1^{on}, x_2^{on}, \ldots, x_{N_{AP}}^{on}]$ the data set gathered by the OB. The posterior probability is determined as follows:
$$p(\ell_q \mid \mathbf{x}^{on}) = \frac{\prod_{i=1}^{N_{AP}} p(x_i^{on} \mid \ell_q)\,P(\ell_q)}{\sum_{q'=1}^{N_{RP}}\prod_{i=1}^{N_{AP}} p(x_i^{on} \mid \ell_{q'})\,P(\ell_{q'})} \tag{4.2}$$
In equation (4.2), $P(\ell_q)$ is the marginal probability; considering that the RPs are independent of each other, $P(\ell_q) = 1/N_{RP}$. The term $\sum_{q'=1}^{N_{RP}}\prod_{i=1}^{N_{AP}} p(x_i^{on} \mid \ell_{q'})\,P(\ell_{q'})$ is the normalizing constant, and $p(x_i^{on} \mid \ell_q)$ is the likelihood. With the estimated parameters, the posterior can be calculated as follows:
$$p(\ell_q \mid \mathbf{x}^{on}) = \frac{\prod_{i=1}^{N_{AP}} p(x_i^{on} \mid \ell_q)}{\sum_{q'=1}^{N_{RP}}\prod_{i=1}^{N_{AP}} p(x_i^{on} \mid \ell_{q'})}, \qquad p(x_i^{on} \mid \ell_q) = \begin{cases} (1-\hat{\pi}_{q,i})\sum_{j=1}^{\hat{J}_{q,i}}\hat{w}_{q,i,j}\,\mathcal{N}(x_i^{on};\hat{\theta}_{q,i,j}) & \text{if } x_i^{on} > c \\ (1-\hat{\pi}_{q,i})\sum_{j=1}^{\hat{J}_{q,i}}\hat{w}_{q,i,j}\,I_0(\hat{\theta}_{q,i,j}) + \hat{\pi}_{q,i} & \text{if } x_i^{on} = c \end{cases} \tag{4.9}$$
Using the set $K_{NN}$ of nearest neighbors, chosen among the offline locations by taking those with the largest posteriors, the final location estimate is then obtained as the weighted average:
$$\hat{\ell}(\mathbf{x}^{on}) = \frac{\sum_{q \in K_{NN}} \ell_q\,p(\ell_q \mid \mathbf{x}^{on})}{\sum_{q \in K_{NN}} p(\ell_q \mid \mathbf{x}^{on})} \tag{4.10}$$

4.3 Experimental results
4.3.1 Positioning accuracy
To evaluate the positioning accuracy of the proposed method against other state-of-the-art approaches, the author conducted experiments with both simulation data and real field data.

4.3.1.1 Simulation results
To evaluate the effectiveness of the proposed approach, a floor plan with an overall size of 45 m by 45 m, 100 RPs and 10 APs was generated. The training data were collected as follows:
(1) Collect data at each RP from each AP according to the PLM:
$$y_n[\mathrm{dBm}] = RSSI_0[\mathrm{dBm}] - 10\,\eta\,\log_{10}\left(\frac{r}{r_0}\right) \tag{4.11}$$
where $r$ is the distance between the RP and the AP, $r_0$ is the reference distance, $RSSI_0$ is the RSSI at $r_0$ and $\eta$ is the path-loss exponent.
(2) Round $y_n$.
(3) Generate censored and dropped data with $\pi = 0.15$ and $c = -100$ dBm.
In the training phase, 400 measurements were collected at each RP from each AP. Data collected at 50% of the RPs follow a single Gaussian distribution; the remaining RPs, in three groups of 17%, 17% and 16%, follow GMMs with more than one Gaussian component, with a different number of components in each group.
During the online positioning phase, 1000 measurements were collected at the 100 locations of the 100 RPs. At each location, 10 samples were collected under the same scenarios as the training data. Table 4.1 shows the mean and variance of the positioning error of the IPS for the following approaches:
- Histogram.
- EM-GMM-AIC-MaP. In the training stage, a GMM is used to model the distribution of the data, and the EM-GMM algorithm combined with the AIC is used to estimate the parameters of the GMM. In the online positioning stage, the MAP based positioning algorithm is applied.
- EM-CD-G-MaP. This approach uses a single Gaussian to model the distribution of the data and the EM-CD-G algorithm to estimate the parameters in the training phase. The MAP based positioning algorithm is also applied in the online positioning stage.
- EM-CD-GMM-BIC-MaP, the method proposed by the author. In the training phase, a GMM is used to describe the distribution of the data and the EM-CD-GMM-PFBIC-CD algorithm is used to estimate the parameters of the GMM, initialized with $\varepsilon_{EM} = 10^{-6}$ and a maximum of $J_{max}$ components. In the online positioning stage, the MAP based positioning algorithm (section 4.2) is used with a fixed number $K_{NN}$ of nearest neighbours.
After 1000 experiments, the positioning results were calculated and reported in table 4.1.

4.3.1.2 Experimental results
In order to evaluate the positioning
accuracy of the proposed method against the three state-of-the-art approaches on real data, the author conducted experiments on a floor with an overall area of 360 m².
In the training phase, RSSI values were taken at 25 RPs (25 free positions, without walls or furniture), roughly evenly distributed, resulting in an average distance of 2.7 m between two locations. At each RP, 400 measurements were collected from each available AP. Training measurements were gathered at four different times of the day, namely morning, noon, afternoon and evening (100 samples per session). The direction of the collecting equipment (a smartphone) was also changed during the measurement collection: 25 measurements were collected in each of the directions 0°, 90°, 180° and 270°. Of the 26 APs detected in the collected training data, only a subset is available at all 25 RP positions. The strongest-AP selection strategy was applied to select the APs with the largest mean RSSI values, and these were used to build the radio map with the EM-CD-GMM-PFBIC-CD algorithm. The convergence threshold of the EM algorithm was set to $\varepsilon_{EM} = 10^{-6}$ and the maximum number of Gaussian components for calculating the PFs was set to $J_{max} = 6$.
In the online phase, 100 sets ($\mathbf{x}^{on}$) of Wi-Fi RSSI measurements were gathered at the positions of the 25 RPs (4 sets per RP) under the same scenarios as the training data. The MAP method presented in sub-section 4.2 was applied to estimate the target's position, with $K_{NN} = 3$ nearest neighbors. After 100 experiments, the positioning results were calculated and reported in table 4.2.

Table 4.1, 4.2 Mean and variance of positioning errors
                       Artificial data               Real data
Approach               mean DE [m]   var DE [m²]     mean DE [m]   var DE [m²]
Histogram              2.8           5.1             2.0           4.7
EM-GMM-AIC-MaP         2.2           4.9             1.3           4.6
EM-CD-G-MaP            1.6           4.4             0.8           3.5
EM-CD-GMM-BIC-MaP      1.0           3.0             0.5           2.3

4.3.2 The computational cost
In order to evaluate the computational cost, the author performed four experiments with the
same collected data as in sub-section 4.3.1.2, but with different numbers of Gaussian components. In the first experiment, both the number of components and the parameters of the CD-GMM were estimated with the EM-CD-GMM-PFBIC-CD algorithm ($J = \hat{J}$). In experiments 2, 3 and 4, the number of components was fixed at $J = 2$, $3$ and $4$ respectively, and the parameters were estimated with the EM-CD-GMM algorithm (section 2.5). After 100 experiments, the mean time spent on estimating the target's position ($t_{ETP}$) and the mean and variance of the distance error were recorded in table 4.3.

Table 4.3 Mean of t_ETP, mean and variance of distance error of the four experiments
Experiment   t_ETP [ms]   mean DE [m]   var DE [m²]
J = Ĵ        257.577      1.0686        2.9862
J = 2        340.931      1.1401        3.0388
J = 3        369.604      1.0372        2.9685
J = 4        400.335      1.0058        2.9527

4.4 Conclusion of chapter
In chapter 4, the positioning algorithm was proposed and verified with both artificial and real field data. Experimental results on real data showed that, under the same experimental conditions, an IPS applying the results of [CT4] and [CT5] produced positioning errors at least 0.6 m lower than an IPS using state-of-the-art approaches. In addition, when the EM-CD-GMM-PFBIC-CD algorithm [CT5] was applied, the average positioning time was reduced by at least 25% compared to applying EM-CD-GMM [CT4] only.

CONCLUSION

A The main results of the thesis
- Proposed using a GMM to model the distribution of Wi-Fi RSSIs during the offline training of an IPS based on P-RSSIF-IPT. Simulation results showed that with a GMM, the average positioning error decreased by 0.652 m compared to using a single Gaussian [CT1].
- Developed three GMM parameter estimation algorithms for three cases: part of the data is not observed due to censoring, part of the data is not observed due to dropping, and part of the data is not observed due to both censoring and dropping [CT2-CT4].
- Developed a model
selection method for GMM in the presence of censored and dropped data [CT5].
- Developed a positioning algorithm and conducted experiments on an indoor floor with an overall area of 360 m² [CT5]. The results showed that the average positioning error decreased by at least 0.6 m compared to other published works, and the average positioning time decreased by 25%.

B Future works
- Researching solutions to estimate the limited sensitivity of the Wi-Fi chipset in the offline training and online positioning phases.
- Researching solutions for updating the database automatically.
- Combining the proposed P-RSSIF-IPT with the dead reckoning positioning technique in order to reduce positioning errors and improve the real-time behaviour of the IPS.
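
As an illustration of the ideas summarized in chapters 2 and 3, the following minimal Python sketch shows the observation model for censored and dropped data (in the form of equation 2.6), the log-likelihood of incomplete data (in the form of equation 3.7) and the BIC penalty for censored and dropped data (in the form of equation 3.12). It is a sketch under stated assumptions, not the implementation used in the thesis: the function names (`observe`, `gauss_pdf`, `i0`, `log_likelihood_cd`, `bic_cd`), the argument name `pi_drop` for the dropping probability and all numeric values in the demo are hypothetical, and the component parameters are supplied by the caller rather than estimated by the EM algorithm.

```python
import math


def observe(y, dropped, c):
    """Observation model in the form of (2.6): report y only if it is above
    the threshold c and was not dropped; otherwise report c."""
    return y if (y > c and not dropped) else c


def gauss_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def i0(c, mu, sigma):
    """Probability mass of N(mu, sigma^2) at or below c (the I_0 term)."""
    return 0.5 * (1.0 + math.erf((c - mu) / (sigma * math.sqrt(2.0))))


def log_likelihood_cd(x, c, weights, mus, sigmas, pi_drop):
    """Log-likelihood of incomplete data x, following the structure of (3.7)."""
    comps = list(zip(weights, mus, sigmas))
    # Probability that a sample is reported as c:
    # either dropped, or not dropped but censored.
    p_unobserved = pi_drop + (1.0 - pi_drop) * sum(
        w * i0(c, m, s) for w, m, s in comps
    )
    total = 0.0
    for xn in x:
        if xn > c:  # observable sample (v_n = 0)
            mix = sum(w * gauss_pdf(xn, m, s) for w, m, s in comps)
            total += math.log((1.0 - pi_drop) * mix)
        else:  # unobservable sample (v_n = 1)
            total += math.log(p_unobserved)
    return total


def bic_cd(x, c, weights, mus, sigmas, pi_drop):
    """Penalty function in the form of (3.12): -2 ln L + 3J ln N."""
    n_components = len(weights)
    ll = log_likelihood_cd(x, c, weights, mus, sigmas, pi_drop)
    return -2.0 * ll + 3 * n_components * math.log(len(x))


if __name__ == "__main__":
    c = -100.0                                        # censoring threshold [dBm]
    y = [-60.0, -62.0, -85.0, -104.0, -61.0, -84.0]   # hypothetical complete data
    d = [False, True, False, False, False, False]     # hypothetical dropping flags
    x = [observe(yn, dn, c) for yn, dn in zip(y, d)]
    print("incomplete data:", x)
    # Compare a one-component and a two-component candidate (hypothetical parameters).
    print("J = 1:", bic_cd(x, c, [1.0], [-75.0], [12.0], 0.15))
    print("J = 2:", bic_cd(x, c, [0.5, 0.5], [-61.0, -85.0], [2.0, 3.0], 0.15))
```

In a model selection loop of the kind described in section 3.3, `bic_cd` would be evaluated once per candidate number of components, using the parameters returned by the EM algorithm for that candidate, and the candidate with the smallest value would be kept.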
