Master thesis: Motion analysis from encoded video bitstream


This thesis proposes a new method for detecting moving objects by applying motion estimation techniques in the compressed video domain. The method is also used to build an application that supports identifying and searching for motion in home surveillance videos. The videos in the thesis are encoded with H.264 (MPEG-4 Part 10), a widely used video compression standard. The contribution of the thesis is a new method for detecting moving objects in H.264-encoded surveillance video using motion vectors and macroblock sizes.

VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

NGUYEN MINH HOA

MOTION ANALYSIS FROM ENCODED VIDEO BITSTREAM

Major: Computer Science

MASTER'S THESIS

Supervisor: Dr. Do Van Nguyen
Co-Supervisor: Dr. Tran Quoc Long

HA NOI - 2018

AUTHORSHIP

“I hereby declare that the work contained in this thesis is my own and that I have not submitted this thesis to any other institution in order to obtain a degree. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person other than those listed in the bibliography and identified as references.”

Signature: ………………………………………………

SUPERVISOR'S APPROVAL

“I hereby approve that the thesis in its current form is ready for committee examination as a requirement for the Master of Computer Science degree at the University of Engineering and Technology.”

Signature: ………………………………………………
Signature: ………………………………………………

ACKNOWLEDGMENTS

First of all, I would like to express special gratitude to my supervisors, Dr. Do Van Nguyen and Dr. Tran Quoc Long, for their enthusiastic instruction, technical explanations, and advice during this project. I also want to sincerely thank Assoc. Prof. Dr. Ha Le Thanh and Assoc. Prof. Dr. Nguyen Thi Thuy for their instruction and for the background knowledge behind this thesis. I would also like to thank my teachers and my friends in the Human Machine Interaction Lab for their support. I thank my friends and colleagues in the project "Nghiên Cứu Công Nghệ Tóm Tắt Video" and the project “Multimedia application tools for intangible cultural heritage conservation and promotion”, project number ĐTDL.CN-34/16, for their work and support. Last but not least, I want to thank my family and all of my friends for their motivation and support. They stand by and inspire
me whenever I face tough times.

TABLE OF CONTENTS

AUTHORSHIP
SUPERVISOR'S APPROVAL
ACKNOWLEDGMENTS
TABLE OF CONTENTS
ABBREVIATIONS
List of Figures
List of Tables
INTRODUCTION
CHAPTER 1. LITERATURE REVIEW
  1.1 Moving object detection in the pixel domain
  1.2 Moving object detection in the compressed domain
    1.2.1 Motion vector approaches
    1.2.2 Size of Macroblock approaches
  1.3 Chapter Summarization
CHAPTER 2. METHODOLOGY
  2.1 Video compression standard H264
    2.1.1 H264 file structure
    2.1.2 Macroblock
    2.1.3 Motion vector
  2.2 Proposed method
    2.2.1 Process video bitstream
    2.2.2 Macroblock-based Segmentation
    2.2.3 Object-based Segmentation
    2.2.4 Object Refinement
  2.3 Chapter Summarization
CHAPTER 3. RESULTS
  3.1 The moving object detection application
    3.1.1 The process of application
    3.1.2 The motion information
    3.1.3 Synthesizing movement information
    3.1.4 Storing Movement Information
  3.2 Experiments
    3.2.1 Dataset
    3.2.2 Evaluation methods
    3.2.3 Implementations
    3.2.4 Experimental results
  3.3 Chapter Summarization
CONCLUSIONS
List of author's publications related to thesis
REFERENCES

ABBREVIATIONS

MB     Macroblock
MV     Motion vector
NALU   Network Abstraction Layer Unit
RBSP   Raw Byte Sequence Payload
SODB   String Of Data Bits

List of Figures

Figure 1.1 The process of moving object detection with data in the pixel domain
Figure 1.2 The process of moving object detection with data in the compressed domain
Figure 2.1 The structure of an H264 file
Figure 2.2 RBSP structure
Figure 2.3 Slice structure
Figure 2.4 Macroblock structure
Figure 2.5 The motion vector of a Macroblock
Figure 2.6 The process of the moving object detection method
Figure 2.7 Skipped Macroblock
Figure 2.8 (a) Outdoor and indoor frames, (b) the "size-map" of the frames, (c) the "motion-map" of the frames
Figure 2.9 Example of the “consistency” of motion vectors
Figure 3.1 The implementation process of the approach
Figure 3.2 Data structure for storing motion information
Figure 3.3 Example frames of test videos
Figure 3.4 Example frames and their ground truth
Figure 3.5 An example frame of Pedestrians (a) and ground truth image (b)

List of Tables

Table 2.1 NALU types
Table 2.2 Slice types
Table 3.1 The information of test videos
Table 3.2 The information of test sequences in group 1
Table 3.3 The performance of two approaches with Pedestrians, PETS2006, Highway, and Office
Table 3.4 The experimental result of Poppe’s approach on 2nd group
Table 3.5 The experimental result of proposed method on 2nd group

... depending on the frequency and appearance of the motion to obtain the motion information. The motion description information obtained from the above steps is then reshaped and stored in a convenient data structure for later retrieval and use in the Storing Movement Information step (4). Steps (3) and (4) are described in detail below.

3.1.2 The motion information

The motion information in this thesis is a value representing the level of motion of an object in the video. To obtain information describing motion, we first classify the motion in the video into real motion (caused by objects such as human beings, vehicles, etc.)
and motion due to interference. The types of interference that can be observed are:
• Noise due to camera shake: this noise is characterized by large, cyclic motion across the entire frame.
• Noise due to camera quality: this is caused by low light intensity; the noise is usually small and non-cyclic but fairly evenly distributed.
• Noise due to light: blinking lights (cyclic noise), tube lights, etc. These types of interference are cyclic, large, and hard to determine.
• Noise due to weather factors such as rain, clouds, etc.

Real motion can be divided into two types: normal movement and meaningful movement. What counts as normal or meaningful depends on the circumstances of the video. For example, in home video, shaking curtains cause visible movement, but the meaningful movement is human movement in the scene. For motion on the road, the types of motion are more difficult to define. In general, motion can be divided as follows:
• Movement of cyclic equipment (such as rotor blades, rotating wheels).
• Motion caused by the wind (leaves, curtain fabric). These movements are usually large and can be cyclic.
• Movements of foreign objects such as sunshine and lights (motorcycle lights, automobile lights from afar). These movements are often difficult to determine; however, they usually appear in night-time video.
• Lastly, real motion such as moving people and moving vehicles in the observation area.

3.1.3 Synthesizing movement information

The synthesis and classification of motion begins by calculating the motion weight for each position in the frame (each position corresponds to one MB) over the time interval T. For a position, the motion of the MB at each time (frame) during T is weighted as follows:
• If the MB is moving at the time under review, the motion weight at that moment equals the count of preceding consecutive moments of motion.
• Otherwise, if the MB at the time under review has no motion, the weight is zero.

The moving weight of each position in the composite frame after the time T is then the sum of the weights at all times in the period T.

After calculating the moving weight, we evaluate the motion level to classify the motion at each position in the composite frame after the time T, based on the weight calculated in the previous step. The motion level is divided into four levels, denoted by binary symbols: no movement (00), little movement or noise (01), movement (10), and a lot of movement (11). The movement level values are then stored in a two-dimensional array.

Figure 3.2 Data structure for storing motion information

3.1.4 Storing Movement Information

This step stores the movement information obtained from the synthesis step described above. The movement information is stored according to the spatial and temporal hierarchy of the video. The structure that stores the motion description information is depicted in Figure 3.2, where:
• Level 1 is a folder that contains the aggregate data for each video, stored over time.
• Level 2 is the folder that contains the files holding the information data for each horizontal row of the frame in the temporal dimension.
• Level 3 is the files that contain the movement information of the blocks in the columns of the frame in the temporal dimension.
• Level 4 is the contents of the files in level 3. These files contain binary values from 00 to 11. Each value is the level of motion of the block during a time T (T may be a number of seconds, 10 seconds, etc.).
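The weighting and four-level classification of Section 3.1.3 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the thesis implementation (which is written in C++): the function names are hypothetical, the run count is assumed to include the current frame, and the two thresholds that separate the levels are placeholder values, since the text does not give them.

```python
from typing import List, Sequence


def motion_weights(moving: Sequence[bool]) -> List[int]:
    """Per-frame weights for one MB position over the interval T.

    A moving frame gets the length of the current run of consecutive
    moving frames (assumption: the run includes the current frame);
    a still frame gets 0.
    """
    weights: List[int] = []
    run = 0
    for is_moving in moving:
        run = run + 1 if is_moving else 0
        weights.append(run)
    return weights


def total_weight(moving: Sequence[bool]) -> int:
    """Moving weight of the position: sum of the per-frame weights in T."""
    return sum(motion_weights(moving))


def motion_level(weight: int, noise_max: int = 3, motion_max: int = 30) -> int:
    """Map a total weight to the 2-bit level (00, 01, 10, 11).

    noise_max and motion_max are hypothetical thresholds, not values
    taken from the thesis.
    """
    if weight == 0:
        return 0b00  # no movement
    if weight <= noise_max:
        return 0b01  # little movement or noise
    if weight <= motion_max:
        return 0b10  # movement
    return 0b11  # a lot of movement


def level_map(history: Sequence[Sequence[Sequence[bool]]]) -> List[List[int]]:
    """Build the two-dimensional array of levels for one interval T.

    history[t][row][col] is True when the MB at (row, col) moves in frame t.
    """
    rows, cols = len(history[0]), len(history[0][0])
    return [
        [motion_level(total_weight([frame[r][c] for frame in history]))
         for c in range(cols)]
        for r in range(rows)
    ]
```

Because the per-frame weight grows with the length of each run, six isolated moving frames give a total of 6, while six consecutive ones give 1 + 2 + ... + 6 = 21, so sustained motion is favored over flicker.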
The user can modify T through a parameter. The advantage of this data structure is that, when you want to search for the moments when movement happens, you can choose an area (corresponding to some MBs). The search time is then shorter because the application only searches the files that correspond to the chosen MBs. Moreover, predefining the search region (region of interest) makes the result more accurate than searching the full frame.

3.2 Experiments

3.2.1 Dataset

The proposed method is designed to operate with a fixed, downward-facing camera. The maximum video resolution is 1920x1080 pixels. The program can be installed directly on a device attached to the camera, such as a Raspberry Pi running the Linux operating system, and guarantees real-time processing. The experimental data was provided by the VP9 Vietnam company and processed by the HMI laboratory, University of Engineering and Technology. The data set includes 43 videos with resolutions of 1280x720 and 1920x1080. In addition, the method uses live data from more than 100 cameras installed in Hanoi and Da Nang, provided by VP9, covering both indoor and outdoor scenes. The videos have various lighting and environmental conditions, including outdoor light (sunlight, low sunshine), artificial light (tube, LED), wind, rain, etc. The data set therefore covers a range of situations and environments for the moving object detection problem.

Figure 3.3 Example frames of test videos

To gather statistics for the report, I made the ground truth for videos with resolutions of 1280x720 and 1920x1080 and used these videos for the experiments. Table 3.1 describes the videos used for the experimental results. Figure 3.3 shows some example frames of the test videos (Figure 3.3a is a frame of TrongNha_02, Figure 3.3b is a frame of DNG8_1708, Figure 3.3c is a frame of NEM1_131, Figure 3.3d is a frame of
HMI_WetRoad, Figure 3.3e is a frame of CuaHang_01 and Figure 3.3f is a frame of HMI_OutDoor). These videos were captured in different environments and circumstances for the experiments. Figure 3.4 depicts some of their frames and the corresponding ground truth.

Table 3.1 The information of test videos

Video         Resolution    Place
HMI_WetRoad   1920 × 1080   Outdoor
HMI_OutDoor   1280 × 720    Outdoor
GVO2_0308     1280 × 720    Outdoor
NEM1_131      1920 × 1080   In-door
DNG8_1708     1920 × 1080   Outdoor
CuaHang_01    1280 × 720    In-door
TrongNha_02   1280 × 720    In-door

In addition, to compare with the approach of Poppe [24], on which our macroblock-based segmentation phase is based, we use a second dataset from the IEEE Change Detection Workshop 2014 [30]. The experiments are therefore carried out on two datasets comprising 11 test sequences, divided into two groups. The first group consists of four test sequences: PETS2006, Pedestrians, Highway and Office, from the baseline profile of the IEEE Change Detection Workshop 2014. Both the video frames and the motion ground truth can be downloaded from the Changedetection homepage. We use ffmpeg [31] to create compressed video from the given frames with all encoding parameters set to default. Figure 3.5 shows an example frame of the Pedestrians test sequence (a) and its motion ground truth (b). Table 3.2 gives the information of the four videos: the first column is the video name, and the next three columns are the resolution, frame rate, and quantization parameter (qp) of each video. The videos in the 1st group have different resolutions, but they are all low-resolution videos. The frame rate of all videos is 25 fps, and the qp value depends on the video. These videos are quite similar to the videos in Poppe’s experiment.

Figure 3.4 Example frames and their ground truth

Table 3.2 The information of test sequences in group 1

Video         Resolution   fps   qp
pedestrians   360 × 240    25    25
PETS2006      720 × 576    25    27
Highway       320 × 240    25    23
Office        360 × 240    25    23

The videos in the 2nd group are the videos mentioned above. They come from actual indoor and outdoor surveillance cameras, without scripting or prior arrangement. We made the motion ground truth ourselves by inspecting the videos frame by frame. They are all high spatial resolution videos.

Figure 3.5 An example frame of Pedestrians (a) and ground truth image (b)

3.2.2 Evaluation methods

The efficiency of the method is evaluated by the recall, the precision, and the F1 score. The precision is calculated by:

$$\mathrm{Precision} = \frac{\mathrm{TruePositive}}{\mathrm{TruePositive} + \mathrm{FalsePositive}}$$

the recall is calculated by:

$$\mathrm{Recall} = \frac{\mathrm{TruePositive}}{\mathrm{TruePositive} + \mathrm{FalseNegative}}$$

and the F1 score is calculated by:

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

where:
• TruePositive: the total number of Macroblocks correctly detected as a moving object.
• FalsePositive: the total number of Macroblocks that are background but are detected as a moving object.
• FalseNegative: the total number of Macroblocks that are a moving object but are not detected.

High precision means that few background Macroblocks are wrongly detected as moving objects. High recall means that the percentage of missed moving objects is low. A perfect system would have both precision and recall at 100%; however, this is not achievable in practice. Normally, tuning the system to favor precision reduces recall, and vice versa. In that case, we can use the F1 score, which balances precision and recall.

3.2.3 Implementations

The proposed method in this thesis is implemented in C++. Our experiments were run on a Windows PC with an Intel Core i5-3337U at 1.8 GHz and … GB of RAM. Based on observation, Ts should be chosen empirically for each test video. The other parameters should be Tc = 90%, TA = 10°, TL = 20, and Tdensity = 80%.

3.2.4 Experimental results

The experiments on the videos in the 1st group were run many times, and the best result was selected. Table 3.3 compares the results of the two approaches on these
videos. With the proposed method, the average precision over the four videos is 80%, the average recall is 84%, and the F1 score is 81.95122. With Poppe’s method, the average precision is 81%, the average recall is 83%, and the F1 score is 81.9878. The performance of our method is thus equivalent to that of Poppe’s method on low-resolution video.

Table 3.3 The performance of two approaches with Pedestrians, PETS2006, Highway, and Office

              Our approach                          Poppe’s approach
Video         Precision (%)  Recall (%)  F1         Precision (%)  Recall (%)  F1
pedestrians   84             95          89.16201   80             90          84.70588
PETS2006      87             80          83.35329   88             78          82.6988
Highway       77             81          78.94937   78             80          78.98734
Office        72             82          76.67532   75             83          78.79747
Average       80             84          81.95122   81             83          81.9878

For the 2nd video group, the high-resolution videos, the proposed method was run many times with different Ts parameters, and the best results were selected. Table 3.4 gives the experimental results of Poppe’s approach and Table 3.5 gives the experimental results of the proposed method on these videos. The results show that the recall values of Poppe’s approach are usually smaller than those of the proposed method, meaning that Poppe’s approach misses more moving objects than the proposed method. This happens because there are many “skip_mode” MBs in a frame of a high-resolution video.

Table 3.4 The experimental result of Poppe’s approach on 2nd group

Video         Precision   Recall   F1
HMI_WetRoad   0.4954      0.8943   0.6376
HMI_OutDoor   0.5145      0.7711   0.6172
GVO2_0308     0.6821      0.6016   0.6393
NEM1_131      0.6055      0.7602   0.6741
DNG8_1708     0.8777      0.7489   0.8082
CuaHang_01    0.7468      0.8339   0.788
TrongNha_02   0.8341      0.7247   0.7756

In addition, the experimental results in Table 3.5 show that the videos with good results are those with less noise and a clear distinction between the background and the moving objects. The results do not depend on whether the videos are captured by outdoor or indoor
cameras. As the results table shows, the best result is for the TrongNha_02 video (Figure 3.3a), with F1 score = 0.8771. This video was recorded in a working room (namely a police station), under good environmental conditions with low noise. The moving object is a person who is clearly distinguishable from the floor. The shirt of the moving object has only one color but is not uniform because of many wrinkles. The worst video is NEM1_131 (Figure 3.3c), with F1 score = 0.6235. Although this video is recorded indoors, it has an outward-facing view. The entrance of the room is made of glass, which easily reflects the moving objects. The video is recorded in the evening, so the light outside the room easily causes noise.

Table 3.5 The experimental result of proposed method on 2nd group

Video         Ts    Precision   Recall   F1
HMI_WetRoad   90    0.7409      0.8644   0.7979
              100   0.734       0.8935   0.8059
              110   0.736       0.8197   0.7756
              120   0.7461      0.9453   0.834
HMI_OutDoor   70    0.6916      0.8681   0.7699
              80    0.641       0.8656   0.7366
              90    0.7055      0.8962   0.7895
              100   0.7195      0.9151   0.8056
GVO2_0308     70    0.5926      0.8018   0.6815
              80    0.577       0.8653   0.6923
              90    0.5376      0.836    0.6543
              100   0.5821      0.916    0.7118
NEM1_131      90    0.4762      0.8183   0.602
              100   0.4655      0.9333   0.6211
              110   0.4847      0.8737   0.6235
              120   0.4855      0.8702   0.6233
DNG8_1708     60    0.7612      0.8164   0.7878
              65    0.7889      0.9217   0.8501
              70    0.7843      0.9157   0.8449
              75    0.777       0.8789   0.8248
CuaHang_01    75    0.7498      0.8796   0.8095
              80    0.7676      0.9302   0.8411
              85    0.7372      0.8598   0.7938
              90    0.6828      0.9339   0.7889
TrongNha_02   50    0.8283      0.9319   0.8771
              55    0.8139      0.9095   0.859
              60    0.8248      0.9261   0.8725
              65    0.8254      0.9247   0.8722

The experimental results also show that choosing the threshold Ts is quite difficult. This is also a limitation of the proposed method. Normally, a video with less noise needs a smaller Ts than a video with more noise. Under the system conditions described above, the processing speed is between 17 and 23 fps. If the program is installed on a Raspberry Pi 2 device, the processing speed is between 22 and 27 fps, depending on the
amount of motion in each frame of the video. This speed fully meets the real-time requirements of the problem.

3.3 Chapter Summarization

This chapter presents the experimental results of the thesis. The experimental datasets are taken from the database of the Change Detection Workshop 2014 and from more than 100 actual surveillance cameras installed in Hanoi and Da Nang, provided by VP9, covering indoor and outdoor scenes. These videos were captured without scripting or prior arrangement. The results show that the proposed method can accurately detect moving objects in the benchmark videos of the Change Detection Workshop 2014. In addition, on high-resolution videos, the proposed method runs in real time and performs better than the related works. This may be due to the many “skip_mode” MBs in a frame of a high-resolution video. The proposed method has also been used to build a moving object detection application for industrial use.

CONCLUSIONS

The thesis proposes a new moving object detection approach in the H264/AVC compressed domain for high-resolution video surveillance that exploits not only the size of MBs but also the characteristics of the MV fields of moving objects to identify the moving objects of interest. The method can quickly detect most regions that contain moving objects, even objects of uniform color. The thesis is the result of a real company project, so its practical applicability is very high. An application using the proposed method can help people search for and detect the moments when movement happens more effectively, saving a lot of time and effort. However, the proposed method still needs empirical thresholds in order to accurately detect the moving objects of interest. In some scenes, noise motion such as swaying tree branches cannot be removed because the motion value of the tree branches is high. For future work, we will focus on making the system tune the thresholds itself by using machine learning to get the best
results.

List of author’s publications related to thesis

Minh Hoa Nguyen, Tung Long Vuong, Dinh Nam Nguyen, Do Van Nguyen, Thanh Ha Le and Thi Thuy Nguyen, “Moving Object Detection in Compressed Domain for High Resolution Videos,” SoICT ’17, pp. 364-369, 2017.

Nguyễn Đình Nam, Nguyễn Thị Thủy, Nguyễn Đỗ Văn, Nguyễn Minh Hòa, Vương Tùng Long, Lê Thanh Hà, "Phương pháp phân tích lưu trữ thông tin mô tả chuyển động nội dung viđeo phương tiện lưu trữ liệu tổng hợp mô tả chuyển động nội dung viđeo", pending patent, applied 03/05/2017.

REFERENCES

[1] S Aslam, "Omnicore," Omnicore Group, 18 2018. [Online]. Available: https://www.omnicoreagency.com/youtube-statistics/
[2] M Piccardi, "Background subtraction techniques: a review," IEEE International Conference on Systems, Man and Cybernetics, pp. 3099-3104, 2004.
[3] A A T D a A C Wren, "Pfinder: real-time tracking of the human body," IEEE Trans. on Pattern Anal. and Machine Intell., vol. 19, pp. 780-785, 1997.
[4] J T J G B a S D.Koller, "Towards Robust Automatic Traffic Scene Analysis in Realtime," Proc. ICPR’94, pp. 126-131, 1994.
[5] B a S.A.Velastin, "Automatic congestion detection system for underground platforms," Proc. ISIMP2001, pp. 158-161, 2001.
[6] C M a A R.Cucchiara, "Detecting moving objects, ghosts, and shadows in video streams," IEEE Trans. on Pattern Anal. and Machine Intell., vol. 25, pp. 1337-1442, 2003.
[7] C a W.E.L.Grimson, "Adaptive background mixture models for real-time tracking," Proc. IEEE CVPR 1999, pp. 246-252, 1999.
[8] P P a J.A.Schoonees, "Understanding background mixture models for foreground segmentation," Proc. of IVCNZ 2002, pp. 267-271, 2002.
[9] M T a P R Venkatesh Babu, "A survey on compressed domain video analysis techniques," Multimedia Tools and Applications, vol. 75, pp. 1043–1078, 2016.
[10] G G a G T.Wiegand, "Overview of the H.264/AVC video coding standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, pp. 560-576, 2003.
[11] D J G a H Q ZengW, "Robust moving object segmentation on H.264/AVC compressed video using the block-based MRF model," Real-Time Imaging, vol. 11, pp. 36-44, 2009.
[12] Y L a Z Z Zhi Liu, "Real-time spatiotemporal segmentation of video objects in the H.264 compressed domain," Journal of Visual Communication and Image Representation, vol. 18, pp. 275–290, 2007.
[13] F.-E G R.-B L M.-G J a J.-L L C Solana-Cipres, "Real-time moving object segmentation in H.264 compressed domain based on approximate reasoning," International Journal of Approximate Reasoning, vol. 51, pp. 99–114, 2009.
[14] C.-M M a W.-K Cham, "Real-time video object segmentation in H.264 compressed domain," IET Image Processing, vol. 3, pp. 272–285, 2009.
[15] P C V S L P a V D W R S De Bruyne, "Estimating motion reliability to improve moving object detection in the H.264/AVC domain," IEEE International Conference on Multimedia and Expo, pp. 290–299, 2009.
[16] Z y W a R m H Shi zheng Wang, "Surveillance video synopsis in the compressed domain for fast video browsing," Journal of Visual Communication and Image Representation, vol. 24, pp. 1431–1442, 2003.
[17] P A A H a A K Marcus Laumer, "Compressed Domain Moving Object Detection by Spatio-Temporal Analysis of H.264/AVC Syntax Elements," Picture Coding Symposium (PCS), pp. 282–286, 2015.
[18] R V B a R G P Manu Tom, "Compressed domain human action recognition in H.264/AVC video streams," Multimedia Tools and Applications, vol. 74, no. 21, pp. 9323–9338, 2015.
[19] B R Biswas S, "Real-time anomaly detection in H.264 compressed videos," National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics, pp. 1-4, 2013.
[20] B R Biswas S, "Anomaly detection in compressed H.264/AVC video," Multimedia Tools and Applications, pp. 1–17, 2014.
[21] C D C Vimal Thilak, "Tracking of extended size targets in H.264 compressed video using the probabilistic data association filter," European Signal Processing Conference 12th, pp. 281–284, 2004.
[22] S M K M You W, "Moving object tracking in H.264/AVC bitstream," Multimedia Content Analysis and Mining, pp. 483-492, 2007.
[23] H N Christian Käs, "An Approach to Trajectory Estimation of Moving Objects in the H.264 Compressed Domain," Advances in Image and Video Technology, pp. 318-329, 2009.
[24] B S P T L P a d W R C Poppe, "Moving object detection in the H.264/AVC compressed domain for video surveillance applications," Journal of Visual Communication and Image Representation, vol. 20, pp. 428–437, 2009.
[25] L R S M C P a R v d W Antoine Vacavant, "Adaptive background subtraction in H.264/AVC bitstreams based on macroblock sizes," Computer Vision Theory and Application (VISAPP), pp. 51–58, 2011.
[26] K A P H S Ajay Divakaran, "Method for summarizing a video using motion and color descriptors," US Patent US09634364, 09 08 2000.
[27] K Ratakonda, "Method for hierarchical summarization and browsing of digital video," US Patent US5956026A, 19 12 1997.
[28] K C L H T O Lipin Liu, "Intelligent, dynamic, long-term digital surveilance media storage system," US Patent US7751632B2, 15 02 2005.
[29] J T C I J 1, "ISO/IEC 14496-10," ISO and IEC, 2014. [Online]. Available: https://www.iso.org/obp/ui/#iso:std:iso-iec:14496:-10:ed-8:v1:en
[30] D M., "Gentle Logic," 16 11 2011. [Online]. Available: http://gentlelogic.blogspot.com/2011/11/exploring-h264-part-2-h264-bitstream.html
[31] R Finlayson, "LIVE555.COM," Live Networks, Inc. [Online]. Available: http://www.live555.com/
[32] Karsten.Suehring, "Fraunhofer," Fraunhofer Heinrich Hertz Institute. [Online]. Available: http://iphome.hhi.de/suehring/
[33] V L a K Wong, "Design & Reuse," Ocean Logic Pty Ltd. [Online]. Available: https://www.design-reuse.com/articles/12849/designing-a-real-time-hdtv-1080pbaseline-h-264-avc-encoder-core.html

Date posted: 17/01/2020, 08:00
