|
Abstract:
|
Traditional computer vision algorithms depend on information captured by visible-light cameras. This data source has inherent limitations, however: it is sensitive to illumination changes, occlusions, and background clutter. Range sensors provide 3D structural information about the scene that is robust to changes in color and illumination. In this thesis, we present a series of approaches that use depth information from the Kinect to address human detection and action recognition.
Given the depth information, the basic problem we consider is detecting humans in the scene. We propose a model-based approach comprising a 2D head contour detector and a 3D head surface detector. We then propose a segmentation scheme that separates the human from the surroundings, starting from the detection point, and extracts the whole body of the subject. We also explore a tracking algorithm based on our detection results. The methods are tested on a dataset we collected and achieve superior results over existing algorithms.
Building on the detection results, we further study the recognition of human actions. We present a novel approach to human action recognition that uses histograms of 3D joint locations (HOJ3D) as a compact representation of postures. We extract the 3D skeletal joint locations from Kinect depth maps using Shotton et al.’s method. The HOJ3D vectors computed from the action depth sequences are reprojected using linear discriminant analysis (LDA) and then clustered into k posture visual words, which represent the prototypical poses of actions. The temporal evolution of these visual words is modeled by discrete hidden Markov models (HMMs). In addition, owing to the design of our spherical coordinate system and the robust 3D skeleton estimation from Kinect, our method demonstrates significant view invariance on our 3D action dataset. Our dataset is composed of 200 3D sequences of 10 indoor activities performed by 10 individuals in varied views. Our method runs in real time and achieves superior results on this challenging 3D action dataset. We also tested our algorithm on the MSR Action3D dataset, where it outperforms existing algorithms in most cases. |
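
The posture-encoding step described above can be illustrated with a rough sketch. This is not the thesis's implementation, only a simplified illustration under stated assumptions: joints are binned by direction in a spherical coordinate system centered on a reference joint, each frame's histogram is flattened into a feature vector, and each vector is assigned to its nearest posture visual word. The function names `joint_histogram` and `to_visual_words`, the bin counts (12 azimuth, 7 inclination), and the hard (non-weighted) binning are all assumptions made for illustration. The LDA projection and the HMM stage are omitted.

```python
import numpy as np

def joint_histogram(joints, center, n_azimuth=12, n_inclination=7):
    """Hard-binned spherical histogram of 3D joints around a reference joint.

    Bin counts and hard binning are illustrative assumptions.
    """
    rel = np.asarray(joints, dtype=float) - np.asarray(center, dtype=float)
    r = np.linalg.norm(rel, axis=1)
    keep = r > 0                      # drop joints that coincide with the center
    rel, r = rel[keep], r[keep]
    # Azimuth in [0, 2*pi), inclination (angle from +z) in [0, pi].
    azimuth = np.mod(np.arctan2(rel[:, 1], rel[:, 0]), 2 * np.pi)
    inclination = np.arccos(np.clip(rel[:, 2] / r, -1.0, 1.0))
    a_bin = np.minimum((azimuth / (2 * np.pi) * n_azimuth).astype(int), n_azimuth - 1)
    i_bin = np.minimum((inclination / np.pi * n_inclination).astype(int), n_inclination - 1)
    hist = np.zeros((n_azimuth, n_inclination))
    np.add.at(hist, (a_bin, i_bin), 1.0)   # accumulate one vote per joint
    total = hist.sum()
    return (hist / total).ravel() if total > 0 else hist.ravel()

def to_visual_words(features, codebook):
    """Assign each feature vector to the index of its nearest codeword."""
    features = np.asarray(features, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)
```

Because only the direction of each joint relative to the center is binned, the histogram in this sketch does not change when the skeleton is uniformly scaled. The resulting sequence of word indices is the kind of discrete observation sequence a discrete HMM would take as input.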