Summary
The authors seek to recognize dynamic hand gestures in which the hand's shape changes as well as its motion. They use principal components analysis (PCA) to get an eigenspace representation of the objects they wish to track (hands). Within the eigenspace, particle filtering is used to predict where the eigen-hands (hand images projected into eigenspace) will appear next. Skin color and motion cues are used to initialize the system automatically.
The EigenTracker is also used to segment the hand motions into epochs. As the second paragraph of section 3 puts it, "a drastic change in the appearance of the gesticulating hand, caused by the change in the hand shape, results in a large reconstruction error. This forces an epoch change, indicating an new shape of the gesticulating hand." The segments are used to create shape/motion pairs for the gesture. Trajectories are modeled with linear regression (least-squares linear approximation).
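To make the trajectory step concrete, here is a minimal sketch of a least-squares line fit over the hand positions in one epoch. This is not the authors' code: the function name `fit_line` and the sample coordinates are made up for illustration, and I am assuming a trajectory is just a list of (x, y) hand-centroid positions.

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept to (x, y) points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("x does not vary; fit x as a function of y instead")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical hand-centroid positions within a single epoch
trajectory = [(0.0, 1.0), (1.0, 3.1), (2.0, 4.9), (3.0, 7.0)]
slope, intercept = fit_line(trajectory)
```

The slope and intercept (or the direction they imply) would then stand in for the "motion" half of a shape/motion pair.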
The tracked hand gestures are modeled as sequences of shape/movement pairs. The models are trained to get a mean gesture and covariance (Gaussian models), and the model with the smallest Mahalanobis distance to our training set is chosen as the classification label.
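Here is a rough sketch of what nearest-model classification by Mahalanobis distance looks like, using NumPy. The gesture labels, the two-dimensional feature vectors, and the small ridge term added to the covariance are all my own assumptions for illustration; the paper's actual features are the shape/movement pairs described above.

```python
import numpy as np


def fit_gaussian(samples):
    """Return (mean, inverse covariance) for an array of row-vector samples."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    # Assumption: a small ridge keeps the covariance invertible
    cov = cov + 1e-6 * np.eye(cov.shape[0])
    return mean, np.linalg.inv(cov)


def mahalanobis(x, mean, cov_inv):
    """Mahalanobis distance from x to a Gaussian with the given parameters."""
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))


def classify(x, models):
    """Pick the label whose Gaussian model is closest to x."""
    return min(models, key=lambda label: mahalanobis(x, *models[label]))


# Hypothetical training features for two made-up gesture classes
rng = np.random.default_rng(0)
training = {
    "circle": rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2)),
    "wave": rng.normal(loc=[5.0, 5.0], scale=1.0, size=(50, 2)),
}
models = {label: fit_gaussian(samples) for label, samples in training.items()}
label = classify(np.array([4.8, 5.1]), models)
```

With classes this well separated, almost any sample lands in the right model, which is exactly the complaint in the discussion below.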
Five eigenvectors are used in PCA to capture 90% of the variance. Each gesture is split into 2 epochs. Using Mahalanobis distance, they get 100% classification accuracy.
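For a feel of how the eigenvector count and the reconstruction error fit together, here is a small NumPy sketch. It keeps the fewest principal components that reach a variance target and then measures how badly a new sample is reconstructed from them. The data is synthetic, and the function names and the 10-dimensional "images" are assumptions of mine, not anything from the paper.

```python
import numpy as np


def fit_pca(X, variance_to_keep=0.90):
    """Return (mean, basis) with the fewest components reaching the target."""
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of vt are the principal directions, ordered by singular value
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    k = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    k = min(k, len(s))
    return mean, vt[:k]


def reconstruction_error(x, mean, basis):
    """Distance between x and its projection onto the eigenspace."""
    coeffs = basis @ (x - mean)
    reconstructed = mean + basis.T @ coeffs
    return float(np.linalg.norm(x - reconstructed))


# Synthetic stand-in for flattened hand images: 2 strong directions in 10-D
rng = np.random.default_rng(0)
directions = np.linalg.qr(rng.normal(size=(10, 2)))[0].T  # shape (2, 10)
latent = rng.normal(size=(200, 2)) * np.array([5.0, 3.0])
X = latent @ directions + 0.01 * rng.normal(size=(200, 10))

mean, basis = fit_pca(X, variance_to_keep=0.90)

# A sample resembling the training data reconstructs well
err_same_shape = reconstruction_error(X[0], mean, basis)

# A sample pushed off the eigenspace (a "new shape") reconstructs poorly
v = rng.normal(size=10)
v = v - basis.T @ (basis @ v)  # remove the part lying inside the eigenspace
v = v / np.linalg.norm(v)
err_new_shape = reconstruction_error(mean + 10.0 * v, mean, basis)
```

A jump in error like the one between `err_same_shape` and `err_new_shape` is the kind of signal the summary says forces an epoch change, though the threshold the authors use is not something I can reproduce here.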
Discussion
They test with their training data, so this is crap. Also, their dataset is extremely simple, with very distinct, well-defined shape/trajectory patterns. And their backgrounds and image tracking are very clean (not a lot of noise) and too easy as well. They say they kept their data easy in order to establish an upper bound on classification accuracy...which turns out to be 100%. So, um, no duh? I'm going to make something impossible to classify and prove the lower bound is 0% (or at most 1/n, a random guess), sound good?
That said, I do like the way they use PCA to simplify the data and particle filtering to both track the hand and segment epochs. It's just their data sets that leave me feeling unimpressed.
2 comments:
The comment about the authors testing their system in a controlled environment with trivial gestures sorely disappointed me. The paper spent a lot of time explaining what I thought was a complex approach, and such a weak evaluation makes the whole process moot. All in all, it was nice to get different viewpoints on this paper, with your negative opinion, Pankaj's positive opinion, and Brandon's neutral opinion.
The upper bound on random guessing is also 100%. Boo ya.