3-D Object Detection and Recognition: Assisting Visually Impaired People in Daily Activities

HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

LE VAN HUNG

3-D OBJECT DETECTIONS AND RECOGNITIONS: ASSISTING VISUALLY IMPAIRED PEOPLE

Major: Computer Science
Code: 9480101

DOCTORAL DISSERTATION OF COMPUTER SCIENCE

SUPERVISORS:
Dr. Vu Hai
Assoc. Prof. Dr. Nguyen Thi Thuy

Hanoi - 2018

DECLARATION OF AUTHORSHIP

I, Le Van Hung, declare that this dissertation titled "3-D Object Detections and Recognitions: Assisting Visually Impaired People in Daily Activities", and the works presented in it, are my own. I confirm that:

- This work was done wholly or mainly while in candidature for a Ph.D. research degree at Hanoi University of Science and Technology.
- Where any part of this thesis has previously been submitted for a degree or any other qualification at Hanoi University of Science and Technology or any other institution, this has been clearly stated.
- Where I have consulted the published work of others, this is always clearly attributed.
- Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this dissertation is entirely my own work.
- I have acknowledged all main sources of help.
- Where the dissertation is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Hanoi, November 2018
Ph.D. Student
Le Van Hung

SUPERVISORS
Dr. Vu Hai
Assoc. Prof. Dr. Nguyen Thi Thuy

ACKNOWLEDGEMENT

This dissertation was written during my doctoral course at the International Research Institute Multimedia, Information, Communication and Applications (MICA), Hanoi University of Science and Technology (HUST). It is my great pleasure to thank all the people who supported me in completing this work. First, I would like to
express my sincere gratitude to my advisors, Dr. Hai Vu and Assoc. Prof. Dr. Thi Thuy Nguyen, for their continuous support, patience, motivation, and immense knowledge. Their guidance helped me throughout the research and the writing of this dissertation. I could not imagine better advisors and mentors for my Ph.D. study.

Besides my advisors, I would like to thank Assoc. Prof. Dr. Thi-Lan Le, Assoc. Prof. Dr. Thanh-Hai Tran, and the members of the Computer Vision Department at MICA Institute. These colleagues assisted me a great deal in my research and co-authored the published papers. Moreover, attending scientific conferences has always been a great experience for me, as I received many useful comments there.

During my Ph.D. course, I received much support from the Management Board of MICA Institute. My sincere thanks go to Prof. Yen Ngoc Pham, Prof. Eric Castelli, and Dr. Son Viet Nguyen, who gave me the opportunity to join research works and permission to join the laboratory at MICA Institute. Without their precious support, it would have been impossible to conduct this research.

As a Ph.D. student of the 911 program, I would like to thank this program for its financial support. I also gratefully acknowledge the financial support for attending conferences from the Nafosted-FWO project (FWO.102.2013.08) and the VLIR project (ZEIN2012RIP19). I would like to thank the College of Statistics for its support over the years, both in my professional work and outside of it.

Special thanks to my family, particularly to my mother and father, for all of the sacrifices they have made on my behalf. I would also like to thank my beloved wife for all her support.

Hanoi, November 2018
Ph.D. Student
Le Van Hung

CONTENTS

DECLARATION OF AUTHORSHIP
ACKNOWLEDGEMENT
CONTENTS
SYMBOLS
LIST OF TABLES
LIST OF FIGURES

1 LITERATURE REVIEW
  1.1 Aided-systems for supporting visually impaired people
    1.1.1 Aided-systems for navigation services
    1.1.2 Aided-systems for obstacle detection
    1.1.3 Aided-systems for locating the interested objects in scenes
    1.1.4 Discussions
  1.2 3-D object detection, recognition from a point cloud data
    1.2.1 Appearance-based methods
      1.2.1.1 Discussion
    1.2.2 Geometry-based methods
    1.2.3 Datasets for 3-D object recognition
    1.2.4 Discussions
  1.3 Fitting primitive shapes
    1.3.1 Linear fitting algorithms
    1.3.2 Robust estimation algorithms
    1.3.3 RANdom SAmple Consensus (RANSAC) and its variations
    1.3.4 Discussions

2 POINT CLOUD REPRESENTATION AND THE PROPOSED METHOD FOR TABLE PLANE DETECTION
  2.1 Point cloud representations
    2.1.1 Capturing data by a Microsoft Kinect sensor
    2.1.2 Point cloud representation
  2.2 The proposed method for table plane detection
    2.2.1 Introduction
    2.2.2 Related Work
    2.2.3 The proposed method
      2.2.3.1 The proposed framework
      2.2.3.2 Plane segmentation
      2.2.3.3 Table plane detection and extraction
    2.2.4 Experimental results
      2.2.4.1 Experimental setup and dataset collection
      2.2.4.2 Table plane detection evaluation method
      2.2.4.3 Results
  2.3 Separating the interested objects on the table plane
    2.3.1 Coordinate system transformation
    2.3.2 Separating table plane and the interested objects
    2.3.3 Discussions

3 PRIMITIVE SHAPES ESTIMATION BY A NEW ROBUST ESTIMATOR USING GEOMETRICAL CONSTRAINTS
  3.1 Fitting primitive shapes by GCSAC
    3.1.1 Introduction
    3.1.2 Related work
    3.1.3 The proposed new robust estimator
      3.1.3.1 Overview of the proposed robust estimator (GCSAC)
      3.1.3.2 Geometrical analyses and constraints for qualifying good samples
    3.1.4 Experimental results of robust estimator
      3.1.4.1 Evaluation datasets of robust estimator
      3.1.4.2 Evaluation measurements of robust estimator
      3.1.4.3 Evaluation results of a new robust estimator
    3.1.5 Discussions
  3.2 Fitting objects using the context and geometrical constraints
    3.2.1 The proposed method of finding objects using the context and geometrical constraints
      3.2.1.1 Model verification using contextual constraints
    3.2.2 Experimental results of finding objects using the context and geometrical constraints
      3.2.2.1 Descriptions of the datasets for evaluation
      3.2.2.2 Evaluation measurements
      3.2.2.3 Results of finding objects using the context and geometrical constraints
    3.2.3 Discussions

4 DETECTION AND ESTIMATION OF A 3-D OBJECT MODEL FOR A REAL APPLICATION
  4.1 A Comparative study on 3-D object detection
    4.1.1 Introduction
    4.1.2 Related Work
    4.1.3 Three different approaches for 3-D objects detection in a complex scene
      4.1.3.1 Geometry-based method for Primitive Shape detection Method (PSM)
      4.1.3.2 Combination of Clustering objects and Viewpoint Features Histogram, GCSAC for estimating 3-D full object models (CVFGS)
      4.1.3.3 Combination of Deep Learning based and GCSAC for estimating 3-D full object models (DLGS)
    4.1.4 Experiments
      4.1.4.1 Data collection
      4.1.4.2 Evaluation method
      4.1.4.3 Setup parameters in the evaluations
      4.1.4.4 Evaluation results
    4.1.5 Discussions
  4.2 Deploying an aided-system for visually impaired people
    4.2.1 Environment and material setup for the evaluation
    4.2.2 Pre-built script
    4.2.3 Performances of the real system
      4.2.3.1 Evaluation of finding 3-D objects
    4.2.4 Evaluation of usability and discussion

5 CONCLUSION AND FUTURE WORKS
  5.1 Conclusion
  5.2 Future works

Bibliography
PUBLICATIONS

ABBREVIATIONS

Abbreviation   Meaning
API            Application Programming Interface
CNN            Convolutional Neural Network
CPU            Central Processing Unit
CVFH           Clustered Viewpoint Feature Histogram
FN             False Negative
FP             False Positive
FPFH           Fast Point Feature Histogram
fps            frames per second
GCSAC          Geometrical Constraint SAmple Consensus
GPS            Global Positioning System
GT             Ground Truth
HT             Hough Transform
ICP            Iterative Closest Point
ISS            Intrinsic Shape Signatures
JI             Jaccard Index
KDES           Kernel DEScriptors
KNN            K Nearest Neighbors
LBP            Local Binary Patterns
LMNN           Large Margin Nearest Neighbor
LMS            Least Mean of Squares
LO-RANSAC      Locally Optimized RANSAC
LRF            Local Receptive Fields
LSM            Least Squares Method
MAPSAC         Maximum A Posteriori SAmple Consensus
MLESAC         Maximum Likelihood Estimation SAmple Consensus
MS             MicroSoft
MSAC           M-estimator SAmple Consensus
MSI            Modified Plessey
MSS            Minimal Sample Set
NAPSAC         N-Adjacent Points SAmple Consensus
NARF           Normal Aligned Radial Features
NN             Nearest Neighbor
NNDR           Nearest Neighbor Distance Ratio
OCR            Optical Character Recognition
OPENCV         OPEN source Computer Vision Library
PC             Personal Computer
PCA            Principal Component Analysis
PCL            Point Cloud Library
PROSAC         PROgressive SAmple Consensus
QR code        Quick Response Code
RAM            Random Access Memory
RANSAC         RANdom SAmple Consensus
RFID           Radio-Frequency IDentification
R-RANSAC       Recursive RANdom SAmple Consensus
SDK            Software Development Kit
SHOT           Signature of Histograms of OrienTations
SIFT           Scale-Invariant Feature Transform
SQ             SuperQuadric
SURF           Speeded Up Robust Features
SVM            Support Vector Machine
TN             True Negative
TP             True Positive
TTS            Text To Speech
UPC            Universal Product Code
URL            Uniform Resource Locator
USAC           A Universal Framework for Random SAmple Consensus
VFH            Viewpoint Feature Histogram
VIP            Visually Impaired Person
VIPs           Visually Impaired People

LIST OF TABLES

Table 2.1 The number of frames of each scene
Table 2.2 The average result of detected table plane on our own dataset (%)
Table 2.3 The average result of detected table plane on the dataset [117] (%)
Table 2.4 The average result of detected table plane of our method with different down sampling factors on our dataset
Table 3.1 The characteristics of the generated cylinder, sphere, cone dataset (synthesized dataset)
Table 3.2 The average evaluation results of synthesized datasets. The experiments on the synthesized datasets were repeated 50 times for statistically representative results.
Table 3.3 Experimental results on the 'second cylinder' dataset. The experiments were repeated 20 times, and the errors were averaged.
Table 3.4 The average evaluation results on the 'second sphere' and 'second cone' datasets. The experiments on the real datasets were repeated 20 times for statistically representative results.
Table 3.5 Average results of the evaluation measurements using GCSAC and MLESAC on three datasets. The fitting procedures were repeated 50 times for statistical evaluations.
Table 4.1 The average results of detecting spherical objects in the two stages
Table 4.2 The average results of detecting the cylindrical objects at the first stage in both the first and second datasets
Table 4.3 The average results of detecting the cylindrical objects at the second stage in both the first and second datasets
Table 4.4 The average processing time of detecting cylindrical objects in both the first and second datasets
Table 4.5 The average results of 3-D queried objects detection

ometric Histograms. In European Conference on Computer Vision, pp. 674–686.
[13] Badino H. (2011). Least squares estimation of a plane surface in disparity image space. Technical report, Carnegie Mellon University, Pittsburgh.
[14] Badino H. (2011). Least squares estimation of a plane surface in disparity image space. Carnegie Mellon University, Pittsburgh, PA 15217, USA.
[15] Bentley J.L. (1975). Multidimensional binary search trees used for associative searching. Communications of the ACM, 18: pp. 509–517.
[16] Berkson J. (2017). Estimation by least squares and by maximum likelihood. digitalassets.lib.berkeley.edu/math/ucb/text/math_s3_v1_article-01.pdf [Online; accessed 18-September-2017].
[17] Berthold K.P.H. (1987). Closed-form solution of absolute orientation using unit quaternions. Journal of the Optical Society of America, 4(4): pp. 629–642.
[18] Bertram D., Markus U., Nassir N., and Slobodan I.
(2017). Model globally, match locally: Efficient and robust 3d object recognition. In IEEE Conference on Computer Vision and Pattern Recognition. DOI: 10.1109/CVPR.2010.5540108.
[19] Besl P. and McKay N.D. (1992). A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2): pp. 239–256.
[20] Bhattacharya P., Liu H., Rosenfeld A., and Thompson S. (2000). Hough-transform detection of lines in 3-D space. Pattern Recognition Letters, 21(9): pp. 843–849.
[21] Borrmann D., Elseberg J., Lingemann K., and Nuchter A. (2011). The 3d hough transform for plane detection in point clouds: A review and a new accumulator design. 3D Research, 2(2).
[22] Brown R.A. (2015). Building a Balanced Kd-Tree in O(kn log n) Time. Journal of Computer Graphics Techniques, pp. 50–68.
[23] Caldini A., B M.F., and Colombo C. (2015). Smartphone-based obstacle detection for the visually impaired. In International Conference on Image Analysis and Processing, pp. 480–488.
[24] Chau C.p. and Siu W.c. (2004). Generalized Hough Transform Using Regions with Homogeneous Color. International Journal of Computer Vision, 59(2): pp. 183–199.
[25] Chen C.S., Hung Y.P., and Cheng J.B. (1999). RANSAC-based DARCES: a new approach to fast automatic registration of partially overlapping range images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(11): pp. 1229–1234.
[26] Choi S., Kim T., and Yu W. (2009). Performance evaluation of ransac family. In Proceedings of the British Machine Vision Conference, pp. 1–12. British Machine Vision Association.
[27] Chum O. and Matas J. (2005). Matching with prosac - progressive sample consensus. In Proceedings of the Computer Vision and Pattern Recognition, pp. 220–226.
[28] Chum O., Matas J., and Kittler J. (2003). Locally optimized ransac. In DAGM-Symposium, volume 2781 of Lecture Notes in Computer Science, pp. 236–243. Springer.
[29] Chum O., Matas J., and Kittler J. (2003). Locally optimized ransac. In DAGM-Symposium, volume 2781 of Lecture Notes in Computer Science, pp. 236–243. Springer.
[30] Derpanis K.G. (2005). Overview of the ransac algorithm.
[31] Deschaud J.E. and Goulette F. (2010). A fast and accurate plane detection algorithm for large noisy point clouds using filtered normals and voxel growing. In Proceedings of the 5th International Symposium on 3D Data Processing (3DPVT).
[32] Diniz P. (2013). The Least-Mean-Square (LMS) Algorithm. In: Adaptive Filtering. Springer.
[33] Dirk Holz S., Rusu R.B., and Behnke S. (2011). Real-Time Plane Segmentation Using RGB-D Cameras. In LNCS (7416): RoboCup 2011 - Robot Soccer World Cup XV, pp. 306–317.
[34] Dong Z., Chen W., Bao H., Zhang H., and Peng Q. (2004). Real-time voxelization for complex polygonal models. In 12th Pacific Conference on the Computer Graphics and Applications, pp. 43–50. Washington, DC, USA. ISBN 0-76952234-3.
[35] Duda R.O. and Hart P.E. Use of the hough transformation to detect lines and curves in pictures. Comm. ACM, 15: pp. 11–15.
[36] Duncan K., Sarkar S., Alqasemi R., and Dubey R. (2013). Multiscale superquadric fitting for efficient shape and pose recovery of unknown objects. In Proceedings of the International Conference on Robotics and Automation (ICRA'2013).
[37] Boston Dynamics (2018). SpotMini. https://www.bostondynamics.com/spot-mini [Online; accessed 20-September-2017].
[38] E T and J M C (2010). A mobile phone application enabling visually impaired users to find and read product barcodes. In Proceedings of the 12th international conference on Computers helping people with special needs, pp. 290–295.
[39] Eberly D. Least Squares Fitting of Data.
[40] Eberly D. (2017). Fitting 3D Data with a Cylinder. https://geometrictools.com/Documentation/CylinderFitting.pdf [Online; accessed 18-September-2017].
[41] Emanuele R., Andrea A., Filippo B., and Andrea T. (2005). A Scale Independent Selection Process for 3D Object Recognition in Cluttered Scenes. International Journal of Computer Vision, 102(1–3): pp. 129–145.
[42] Everingham M., Gool L.V., Williams C.K.I., and Winn J. (2010). The PASCAL Visual Object Classes (VOC) Challenge. International Journal of Computer Vision, 88(2): pp. 303–338.
[43] Faber P. and Fisher R.B. (2001). A Buyer's Guide to Euclidean Elliptical Cylindrical and Conical Surface Fitting. In Proceedings of the British Machine Vision Conference 2001, 1, pp. 54.1–54.10.
[44] Feng C. and Hung Y. (2003). A robust method for estimating the fundamental matrix. In Proceedings of the 7th Digital Image Computing: Techniques and Applications, pp. 633–642.
[45] Feng C., Taguchi Y., and Kamat V. (2014). Fast plane extraction in organized point clouds using agglomerative hierarchical clustering. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 6218–6225.
[46] Fischler M.A. and Bolles R. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6): pp. 381–395.
[47] Garcia S. (2009). Fitting primitive shapes to point clouds for robotic grasping. Master Thesis in Computer Science (30 ECTS credits) at the School of Electrical Engineering, Royal Institute of Technology.
[48] Geiger A., Lenz P., and Urtasun R. (2012). Are we ready for autonomous driving?
the kitti vision benchmark suite. In Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Girshick R. (2015). Fast R-CNN. In International Conference on Computer Vision.
[50] Girshick R., Donahue J., Darrell T., and Malik J. (2014). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Computer Vision and Pattern Recognition.
[51] Glent A., Lilita B., and Dirk Kraft K. (2017). Rotational subgroup voting and pose clustering for robust 3d object recognition. In International Conference on Computer Vision.
[52] Greenacre M. and Ayhan H.O. (2017). Identifying inliers. https://econpapers.upf.edu/papers/1423.pdf [Online; accessed 18-September-2017].
[53] Guo Y., Bennamoun M., Sohel F., Lu M., and Wan J. (2014). 3D Object Recognition in Cluttered Scenes with Local Surface Features: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11): pp. 2270–2287.
[54] Hachiuma R., Ozasa Y., and Saito H. (2017). Primitive shape recognition via superquadric representation using large margin nearest neighbor classifier. In International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications.
[55] Hartley R. and Zisserman A. (2003). Multiple View Geometry in Computer Vision. Cambridge University Press, New York. ISBN 0521540518.
[56] Hough P. (1959). Machine Analysis of Bubble Chamber Pictures. In Proc. Int. Conf. High Energy Accelerators and Instrumentation.
[57] Huang H.c., Hsieh C.t., and Yeh C.h. (2015). An indoor obstacle detection system using depth information and region growth. Sensors, pp. 27116–27141.
[58] Huang T., Yang G., and Tang G. (1979). A fast two-dimensional median filtering algorithm. IEEE Trans. Acoust., Speech, Signal Processing, 27(1): pp. 13–18.
[59] Huy-Hieu P., Thi-Lan L., and Nicolas V. (2015). Real-time obstacle detection system in indoor environment for the visually impaired using microsoft kinect sensor. Journal of Sensors.
[60] IGI G. (2018). What is point cloud. https://www.igi-global.com/dictionary/point-cloud/36879 [Online; accessed 10-January-2018].
[61] ImageNet. ImageNet Object Detection Challenge. https://www.kaggle.com/c/imagenet-object-detection-challenge [Online; accessed 18-September-2017].
[62] Jaccard P. (1912). The distribution of the flora in the alpine zone. New Phytologist, 11(2): pp. 37–50.
[63] Jafri R., Ali S.A., and Arabnia H.R. (2014). Computer vision-based object recognition for the visually impaired using visual tags. The Visual Computer: International Journal of Computer Graphics, 30(11): pp. 1197–1222.
[64] Jagadeesan N. and Parvathi R. (2014). An efficient image downsampling technique using genetic algorithm and discrete wavelet transform. Journal of Theoretical and Applied Information Technology, 61(3): pp. 506–514.
[65] Jain D. (2014). Path-guided indoor navigation for the visually impaired using minimal building retrofitting. In Proceedings of the 16th international ACM SIGACCESS conference on Computers accessibility, pp. 225–232.
[66] Jean-Yves B. (2018). Camera calibration toolbox for matlab. http://www.vision.caltech.edu/bouguetj/calib_doc/ [Online; accessed 10-January-2018].
[67] Johnson A. and Hebert M. (1999). Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(5): pp. 433–449.
[68] Kevin L., Liefeng B., and Dieter F. (2014). Unsupervised feature learning for 3d scene labeling. In Robotics and Automation (ICRA).
[69] Khaled A., Mohammed E., and Sherif B. (2014). 3d object recognition based on image features: A survey. International Journal of Computer and Information Technology.
[70] Knopp J., Prasad M., Willems G., and Timofte R. (2010). Hough Transforms and 3D SURF for robust three dimensional classification. In European Conference on Computer Vision, pp. 589–602.
[71] Kohei M., Yusuke U., Shigeyuki S., and Sato S. (2016). Geometric verification using semi-2d constraints for 3d object retrieval. In Proceedings of the International
Conference on Pattern Recognition (ICPR) 2012, pp. 2339–2344.
[72] Kramer J., Burrus N., Echtler F., Daniel H.C., and Parker M. (2012). Hacking the Kinect. Apress.
[73] Kwon S.W., Liapi K.A., Haas C.T., and Bosche F. (2003). Algorithms for fitting cylindrical objects to sparse range point clouds for rapid workspace modeling. In Proceedings of the 20th ISARC, pp. 173–178.
[74] Lab M.M. (2012). FingerReader: A wearable interface for reading on-the-go. http://fluid.media.mit.edu/projects/fingerreader [Online; accessed 18-September-2017].
[75] Lai K., Bo L., Ren X., and Fox D. (2011). A large-scale hierarchical multi-view RGB-D object dataset. In IEEE International Conference on Robotics and Automation (ICRA), pp. 1817–1824.
[76] Lai K., Liefeng B., Ren X., and Fox D. (2012). Detection-based object labeling in 3d scenes. In 2012 IEEE International Conference on Robotics and Automation, pp. 1330–1337. ISSN 1050-4729. IEEE.
[77] Lam J. and Greenspan M. (2013). 3d object recognition by surface registration of interest segments. In International Conference on 3D Vision. DOI: 10.1109/3DV.2013.34.
[78] Lanigan P.E., Paulos A.M., Williams A.W., Rossi D., and Narasimhan P. (2006). Trinetra: Assistive technologies for grocery shopping for the blind. In 10th IEEE International Symposium on Wearable Computers, pp. 147–148.
[79] Lawson C.L. and Hanson R.J. (1974). Solving Least Squares Problems. Englewood Cliffs, NJ: Prentice-Hall. ISBN 0-13-822585-0.
[80] Lebeda K., Matas J., and Chum O. (2012). Fixing the locally optimized ransac. In Proceedings of the British Machine Vision Conference 2012, pp. 3–7.
[81] Liefeng B., Kevin L., Xiaofeng R., and Dieter F. (2011). Depth kernel descriptors for object recognition. In IEEE/RSJ International Conference on Intelligent Robots and Systems.
[82] Liefeng B., Kevin L., Xiaofeng R., and Dieter F. (2011). Object recognition with hierarchical kernel descriptors. In Conference on Computer Vision and Pattern Recognition, pp. 581–599.
[83] Liefeng B., Xiaofeng R., and Dieter F. (2010). Kernel descriptors for visual recognition. In Advances in Neural Information Processing Systems 23, pp. 244–252.
[84] Lin T., Maire M., Belongie S.J., Bourdev L.D., Girshick R.B., Hays J., Perona P., Ramanan D., Dollár P., and Zitnick C.L. (2014). Microsoft COCO: common objects in context. CoRR, abs/1405.0312.
[85] Lowe D.G. (2004). Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2): pp. 91–110.
[86] Mair E., Gregory D.H., Burschka D., Michael S., and Gerhard H.
[87] Marco C., Roberto V., and Rita C. (2014). 3d hough transform for sphere recognition on point clouds. Machine Vision and Applications, pp. 1877–1891.
[88] Matas J. and Chum O. (2005). Randomized ransac with sequential probability ratio test. In Proceedings of the 10th IEEE International Conference on Computer Vision.
[89] Matthew T., Rafal M., and Wolfgang H. (2011). Blur-aware image downsampling. EUROGRAPHICS, 30(2).
[90] Microsoft (2018). Kinect for Windows SDK v1.8. https://www.microsoft.com/en-us/download/details.aspx?id=40278 [Online; accessed 10-January-2018].
[91] Mikolajczyk K. and Schmid C. (2005). A Performance Evaluation of Local Descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10): pp. 1615–1630.
[92] Monther A.S., Mustahsan M., Abdullah M.A., and Ahmed M.A. (2014). An obstacle detection and guidance system for mobility of visually impaired in unfamiliar indoor environments. International Journal of Computer and Electrical Engineering. DOI: 10.7763/IJCEE.2014.V6.849.
[93] Mueller C.A. and Birk A. (2016). Hierarchical graph-based discovery of non-primitive-shaped objects in unstructured environments. In International Conference on Robotics and Automation.
[94] Myatt D., Torr P., Nasuto S., Bishop J., and Craddock R. (2002). Napsac: high noise, high dimensional robust estimation. In Proceedings of the British Machine Vision Conference (BMVC'02), pp. 458–467.
[95] Naeemabadi M., Dinesen B., Andersen O.K., Najafi S., and Hansen J. (2018). Evaluating
accuracy and usability of microsoft kinect sensors and wearable sensor for tele knee rehabilitation after knee operation. In Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIODEVICES, pp. 128–135. INSTICC, SciTePress. ISBN 978-989-758-277-6. doi:10.5220/0006578201280135.
[96] Nguyen B.H. (2012). Scientist Shines Light For Visually Impaired. http://greetingvietnam.com/technology/scientist-shines-lightfor-visually-impaired.html [Online; accessed 20-October-2017].
[97] Nguyen Q.H., Vu H., Tran T.H., Nguyen Q.H., Veelaert P., and Philips W. (Sept. 2014). A visual slam system on mobile robot supporting localization services to visually impaired people. In Proceedings of the 2nd Workshop on Assistive Computer Vision and Robotics, in conjunction with ECCV 2014.
[98] Nicholson J., Kulyukin V., and Coster D. (2009). Shoptalk: independent blind shopping through verbal route directions and barcode scans. The Open Rehabilitation Journal, 2: pp. 11–23.
[99] Nicolas B. (2018). Calibrating the depth and color camera. http://nicolas.burrus.name/index.php/Research/KinectCalibration [Online; accessed 10-January-2018].
[100] Nieuwenhuisen M., Stuckler J., Berner A., Klein R., and Behnke S. (2012). Shape-primitive based object recognition and grasping shape primitive detection and object recognition. In The 7th German conference on Robotics, May.
[101] Nieuwenhuisen M., Stueckler J., Berner A., Klein R., and Behnke S. (2012). Shape-primitive based object recognition and grasping. In Proc. of ROBOTIK. VDE-Verlag.
[102] Nikolakis G., Tzovaras D., and Strintzis M.G. Object recognition for the blind. (30): pp. 1–4.
[103] OpenCV (2018). Opencv library. https://opencv.org/ [Online; accessed 10-January-2018].
[104] Osselman G., Gorte B., Sithole G., and Rabbani T. (2004). Recognising structure in laser scanner point clouds. In International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, pp. 33–38.
[105] Pang G. and Neumann U. (2016). 3D Point Cloud Object Detection with Multi-View Convolutional Neural Network. In 23rd International Conference on Pattern Recognition.
[106] (PCL) P.C.L. (2013). Point cloud library (pcl) 1.7.0. http://docs.pointclouds.org/1.7.0/mlesac_8hpp_source.html
[107] (PCL) P.C.L. (2014). How to use random sample consensus model. http://pointclouds.org/documentation/tutorials/random_sample_consensus.php
[108] Polewski P., Yao W., Heurich M., Krzystek P., and Stilla U. (2017). A voting-based statistical cylinder detection framework applied to fallen tree mapping in terrestrial laser scanning point clouds. ISPRS Journal of Photogrammetry and Remote Sensing, 129: pp. 118–130.
[109] Press W., Teukolsky S., Vetterling W.T., and Flannery B.P. (2007). Numerical recipes: The art of scientific computing. Cambridge University Press, pp. 1099–1110.
[110] Qingming Z., Yubin L., and Yinghui X. (2009). Color-based segmentation of point clouds. Laser scanning 2009, IAPRS.
[111] Radu B., Nico B., and Michael B. (2009). Fast point feature histograms (fpfh) for 3d registration. In IEEE International Conference on Robotics and Automation, pp. 3212–3217. DOI: 10.1109/ROBOT.2009.5152473.
[112] Raguram R., Chum O., Pollefeys M., Matas J., and Frahm J.M. (Aug. 2013). Usac: A universal framework for random sample consensus. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8): pp. 2022–2038.
[113] Raguram R., Frahm J.M., and Pollefeys M. (2008). A comparative analysis of ransac techniques leading to adaptive real-time random sample consensus. In Proceedings of the European Conference on Computer Vision (ECCV'08), pp. 500–513.
[114] Redmon J., Divvala S., Girshick R., and Farhadi A. (2016). You Only Look Once: Unified, Real-Time Object Detection. In Computer Vision and Pattern Recognition.
[115] Redmon J. and Farhadi A. (2017). YOLO9000: Better, Faster, Stronger. In Computer Vision and Pattern Recognition.
[116] Ren S., He K., Girshick R., and Sun J. (2015). Faster r-cnn: Towards real-time object detection
with region proposal networks. In Advances in Neural Information Processing Systems 28, pp. 91–99.
[117] Richtsfeld A., Morwald T., Prankl J., Zillich M., and Vincze M. (2012). Segmentation of unknown objects in indoor environments. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 4791–4796.
[118] Ridwan M., Choudhury E., Poon B., Amin M.A., and Yan H. (2014). A navigational aid system for visually impaired using microsoft kinect. In International MultiConference of Engineers and Computer Scientists, volume I.
[119] Rimon S., Peter B., Julian S., Benjamin H.G., Christine F.M., Eva D., Joerg F., and Bjoern M.E. (2016). Blind path obstacle detector using smartphone camera and line laser emitter. In International Conference on Technology and Innovation in Sports, Health and Wellbeing (TISHW 2016).
[120] Robert C., Emmanuel K.N., and Ratko G. (2016). Survey of state-of-the-art point cloud segmentation methods. Technical Report, Josip Juraj Strossmayer University of Osijek.
[121] Rusu B. Cluster recognition and 6dof pose estimation using vfh descriptors. http://pointclouds.org/documentation/tutorials/vfh_recognition.php [Online; accessed 20-January-2018].
[122] Rusu B. Euclidean cluster extraction. http://www.pointclouds.org/documentation/tutorials/cluster_extraction.php [Online; accessed 20-January-2018].
[123] Rusu B. Fast point feature histograms (fpfh) descriptors. http://pointclouds.org/documentation/tutorials/fpfh_estimation.php#fpfh-estimation [Online; accessed 20-January-2018].
[124] Rusu B. Fast point feature histograms (fpfh) descriptors. http://pointclouds.org/documentation/tutorials/pfh_estimation.php#pfh-estimation [Online; accessed 20-January-2018].
[125] Rusu B., Bradski G., Thibaux R., and Hsu J. (2010). Fast 3d recognition and pose using the viewpoint feature histogram. In 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2155–2162.
[126] Saad B. (2015). Hough Transform and Thresholding. http://me.umn.edu/courses/me5286/vision/Notes/2015/ME5286-Lecture9.pdf [Online; accessed 18-September-2017].
[127] Saffoury R., Blank P., Sessner J., Groh B.H., Martindale C.F., and Dorschky E. (2016). Blind path obstacle detector using smartphone camera and line laser emitter. In Proceedings of 1st International Conference on Technology and Innovation in Sports, Health and Wellbeing, TISHW.
[128] Saval-Calvo M., Azorin-Lopez J., Guillo A.F., and Rodriguez J.G. (2017). Three-dimensional planar model estimation using multi-constraint knowledge based on k-means and RANSAC. CoRR, abs/1708.01143.
[129] Scharstein D. and Szeliski R. (2003). High-Accuracy Stereo Depth Maps Using Structured Light. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1(June): pp. 195–202.
[130] Schauerte B., Martinez M., and Constantinescu A. (2012). An Assistive Vision System for the Blind that Helps Find Lost Things. In International Conference on Computers for Handicapped Persons, volume 2011, pp. 566–572.
[131] Schnabel R., Wahl R., and Klein R. (2007). Efficient ransac for point-cloud shape detection. Computer Graphics Forum, 26(2): pp. 214–226.
[132] Silberman N. and Fergus R. (2011). Indoor scene segmentation using a structured light sensor. In Proceedings of the International Conference on Computer Vision - Workshop on 3D Representation and Recognition.
[133] Silberman N., Hoiem D., Kohli P., and Fergus R. (2012). Indoor segmentation and support inference from rgbd images. In European Conference on Computer Vision, pp. 746–760.
[134] Steder B., Rusu R.B., Konolige K., and Burgard W. (October 8, 2010). Narf: 3d range image features for object recognition. In Workshop on Defining and Solving Realistic Perception Problems in Personal Robotics at the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS). Taipei, Taiwan.
[135] Stein F. and Medioni G. (1992). Structural indexing: Efficient 3D object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2): pp. 125–145.
[136] Su Y.T., Hua S., and Bethel J.S (2017) Estimation of cylinder orientation in three-dimensional point cloud using angular distance-based optimization Optical Engineering, Volume 56(Issue 5) [137] Subaihi A.A (2016) Orthogonal Least Squares Fitting with Cylinders International Journal of Computer Mathematics, 7160(February) [138] Sudhakar K., Saxena P., and Soni S (2012) Obstacle detection gadget for visually impaired peoples International Journal of Emerging Technology and Advanced Engineering, 2(12):pp 409–413 [139] Sujith B and Safeeda V (2014) Computer vision-based aid for the visually impaired persons- a survey and proposing International Journal of Innovative Research in Computer and Communication Engineering, pp 365–370 [140] Tombari F., SaltiLuigi S., and Stefano D (2010) Unique Signatures of Histograms for Local Surface Description In European Conference on Computer Vision, pp pp 356–369 136 [141] Tombari F and Stefano L.D (2012) Hough voting for 3d object recognition under occlusion and clutter IPSJ Transactions on Computer Vision and Applications, 4:pp 20–29 [142] Torr P.H.S and Murray D (1997) The development and comparison of robust methods for estimating the fundamental matrix International Journal of Computer Vision, 24(3):p 271–300 [143] Torr P.H.S and Zisserman A (2000) Mlesac: A new robust estimator with application to estimating image geometry Computer Vision and Image Understanding, 78(1):pp 138–156 [144] Trung-Thien T., Van-Toan C., and Denis L (2015) Extraction of cylinders and estimation of their parameters from point clouds Computers and Graphics, 46:pp 345–357 [145] Trung-Thien T., Van-Toan C., and Denis L (2015) Extraction of reliable primitives from unorganized point clouds 3D Research, 6:44 [146] Trung-Thien T., Van-Toan C., and Denis L (2016) esphere: extracting spheres from unorganized point clouds The Visual Computer , Volume 32(No.10):p pp 1205–1222 [147] Van Hamme D.and Veelaert P and Philips W (2011) Robust visual odometry 
using uncertainty models In Advanced Concepts for Intelligent Vision Systems ACIVS 2011 Lecture Notes in Computer Science, vol 6915 Springer, Berlin, Heidelberg, pp 1–12 ISBN 978-3-642-23686-0 doi:10.1007/978-3-642-23687-7 [148] Virgil T., Popescu S., Bogdanov I., and Caleanu C (2008) Obstacles detection system for visually impaired guidance department of applied electronics In 2th WSEAS International Conference on SYSTEMS , September 2017 [149] Wang H Mirota D.I.M and Hager G (2008) Robust motion estimation and structure recovery from endoscopic image sequences with an adaptive scale kernel consensus estimator [150] Wang H and Suter D (2004) Robust adaptive-scale parametric model estimation for computer vision IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.26(No.11):p pp.1459–1474 [151] Wattal A., Ojha A., and Kumar M (2016) Obstacle detection for visually impaired using raspberry pi and ultrasonic sensors In National Conference on Product Design, July, pp 1–5 137 [152] Wittrowski J., Ziegler L., and Swadzba A (2013) 3d implicit shape models using ray based hough voting for furniture recognition In International Conference on 3D Vision - 3DV [153] Xiang Y., Kim W., Chen W., Ji J., Choy C., Su H., Mottaghi R., Guibas L., and Savarese S (2016) ObjectNet3D : A Large Scale Database for 3D Object Recognition In European Conference on Computer Vision, pp pp 160–176 [154] Xiang Y., Mottaghi R., and Savarese S (2014) Beyond pascal: A benchmark for 3d object detection in the wild In IEEE Winter Conference on Applications of Computer Vision (WACV) [155] Yang M.Y and Forstner W (2010) Plane detection in point cloud data Technical report Nr.1 of Department of Photogrammetry, Institute of Geodesy and Geoinformation, University of Bonn [156] Yang S.w., Wang C.c., and Chang C.h (2010) RANSAC Matching : Simultaneous Registration and Segmentation In IEEE International Conference on Robotics and Automation [157] Yi C., Flores R.W., Chincha R., and Tian Y (2014) 
PUBLICATIONS OF DISSERTATION
[1] Van-Hung Le, Hai Vu, Thuy Thi Nguyen, Thi Lan Le, and Thanh Hai Tran (2015). Table plane detection using geometrical constraints on depth image, The 8th Vietnamese Conference on Fundamental and Applied IT Research (FAIR), Hanoi, Vietnam, ISBN: 978-604-913-397-8, pp. 647-657.
[2] Van-Hung Le, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, Thi-Thanh-Hai Tran, Michiel Vlaminck, Wilfried Philips, and Peter Veelaert (2015). 3D Object Finding Using Geometrical Constraints on Depth Images, The 7th International Conference on Knowledge and Systems Engineering, HCM city, Vietnam, ISBN: 978-1-4673-8013-3, pp. 389-395.
[3] Van-Hung Le, Thi-Lan Le, Hai Vu, Thuy Thi Nguyen, Thanh-Hai Tran, Tran-Chung Dao, and Hong-Quan Nguyen (2016). Geometry-based 3-D Object Fitting and Localization in Grasping Aid for Visually Impaired People, The 6th International Conference on Communications and Electronics (IEEE-ICCE), HaLong, Vietnam, ISBN: 978-1-5090-1802-4, pp. 597-603.
[4] Van-Hung Le, Michiel Vlaminck, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, Thanh-Hai Tran, Quang-Hiep Luong, Peter Veelaert, and Wilfried Philips (2016). Real-time table plane detection using accelerometer and organized point cloud data from Kinect sensor, Journal of Computer Science and Cybernetics, Vol. 32, N.3, ISSN: 1813-9663, pp. 243-258.
[5] Van-Hung Le, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, and Thanh-Hai Tran (2017). Fitting Spherical Objects in 3-D Point Cloud Using the Geometrical Constraints, Journal of Science and Technology, Section in Information Technology and Communications, Number 11, 12/2017, ISSN: 1859-0209, pp. 5-17.
[6] Van-Hung Le, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, and Thanh-Hai Tran (2018). Acquiring qualified samples for RANSAC using geometrical constraints, Pattern Recognition Letters, Vol. 102, ISSN: 0167-8655, pp. 58-66 (ISI).
[7] Van-Hung Le, Hai Vu, and Thuy Thi Nguyen (2018). A Comparative Study on Detection and Estimation of a 3-D Object Model in a Complex Scene, 10th International Conference on Knowledge and Systems Engineering (KSE 2018), pp. 203-208.
[8] Van-Hung Le, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, and Thanh-Hai Tran (2018). GCSAC: geometrical constraint sample consensus for primitive shapes estimation in 3D point cloud, International Journal of Computational Vision and Robotics, accepted (SCOPUS).
[9] Van-Hung Le, Hai Vu, and Thuy Thi Nguyen (2018). A Framework Assisting the Visually Impaired People: Common Object Detection and Pose Estimation in Surrounding Environment, 5th NAFOSTED Conference on Information and Computer Science (NICS 2018), pp. 218-223.
[10] Hai Vu, Van-Hung Le, Thuy Thi Nguyen, Thi-Lan Le, and Thanh-Hai Tran (2019). Fitting Cylindrical Objects in 3-D Point Cloud Using the Context and Geometrical Constraints, Journal of Information Science and Engineering, ISSN: 1016-2364, Vol. 35, N1 (ISI).

... used for 3-D object detection. In Chapter 4, we revisit 3-D object detection, in which relevant ones are presented in more detail. 1.2 3-D object detection, recognition from a point cloud data 3-D ...
using the proposed methods for detecting 3-D primitive shape objects in a lab-based environment. The system combined the table plane detection technique and the proposed method of 3-D object detection ... tasks including: (1) separating the queried objects from a table plane; (2) detecting candidates of the objects of interest using appearance features; and (3) estimating a model of the queried object


Table of contents

DECLARATION OF AUTHORSHIP

ACKNOWLEDGEMENT

CONTENTS

SYMBOLS

LIST OF TABLES

LIST OF FIGURES

LITERATURE REVIEW

Aided-systems for supporting visually impaired people

Aided-systems for navigation services

Aided-systems for obstacle detection

Aided-systems for locating objects of interest in scenes

Discussions

3-D object detection and recognition from point cloud data

Appearance-based methods

Discussion

Geometry-based methods

Datasets for 3-D object recognition

Discussions

Fitting primitive shapes

Linear fitting algorithms

Robust estimation algorithms

RANdom SAmple Consensus (RANSAC) and its variations

Discussions

POINT CLOUD REPRESENTATION AND THE PROPOSED METHOD FOR TABLE PLANE DETECTION

Point cloud representations

Capturing data by a Microsoft Kinect sensor

Point cloud representation
