Monday, February 4, 2008

Chen - Dynamic Gesture Interface w/ HMMs

Chen, Qing.... "A Dynamic Gesture Interface for Virtual Environments Based on Hidden Markov Models." HAVE 2005.

Summary



Chen et al. use hidden Markov models (HMMs) to classify gestures, focusing on a simple domain of three gestures. Their algorithm computes the standard deviation of the bend sensor values on a glove, on the argument that by working with standard deviations they don't have to worry about segmentation. They feed the standard deviation data into the HMMs and classify from there. The three gestures are very simple and are used to control the three axes of rotation of a virtual 3D cube.
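
For reference, the usual way to classify with HMMs is to train one model per gesture and pick the model that gives the observation sequence the highest likelihood. The sketch below is my own illustration of that idea, not the paper's implementation: it assumes the standard deviation values have been quantized to two symbols (0 = low, 1 = high), and the gesture names and all probabilities are made up.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm."""
    n = len(start)
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


def classify(obs, models):
    """Return the name of the model that assigns obs the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Hypothetical two-state, left-to-right models: (start, transition, emission).
# Symbols are an assumed quantization: 0 = low std, 1 = high std.
trans = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "rotate_x": ([1.0, 0.0], trans, [[0.9, 0.1], [0.1, 0.9]]),  # still, then moving
    "rotate_y": ([1.0, 0.0], trans, [[0.1, 0.9], [0.9, 0.1]]),  # moving, then still
}

label = classify([0, 0, 1, 1], models)
```

Note that this unscaled forward pass will underflow on long sequences; a real implementation would work in log space or rescale alpha at each step.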

They give no recognition results.

Discussion



I'm not sure these guys are too well versed in machine learning. This paper is pretty weak. I'll just make a laundry list instead of trying to tie all my complaints together in prose.


  • They mention other approaches (Kalman filters, dynamic time warping, FSMs) that have been used, but state they have "very strict assumptions." Okay, like what? Kalman filters and hidden Markov models are close cousins: both infer a hidden state from noisy observations, the Kalman filter over a continuous state with linear-Gaussian dynamics and the HMM over a discrete one. So why will HMMs do better than Kalman filters?

  • They say (page 2, first par.) that gestures are noisy, and that even if a person performs a gesture the same way twice, the data will still differ. Duh. Too bad. Measurements and data are noisy, just like everything in machine learning. Otherwise, you'd just look it up in a hash table and save yourself a lot of trouble.

  • It's the Expectation-MAXIMIZATION algorithm, not -Modification.

  • They claim to avoid the need for segmentation. Okay, then what are you computing the standard deviation of? You have to have some sort of window of points to do the calculations on. I suppose their assumption is that each window happens to contain a whole gesture, not half of one, and things happen by magic.
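
To make the window complaint concrete, here is a minimal sketch of a per-sensor standard deviation computed over a sliding window. This is my guess at what such a feature would look like, not the paper's code: the function name, the window length, and the two-sensor data are all assumptions.

```python
from statistics import pstdev


def windowed_std(samples, window):
    """Per-sensor standard deviation over each sliding window of `window` frames.

    samples: list of frames, each frame a list of bend sensor readings.
    Returns one feature vector (per-sensor std values) per window position.
    """
    if window < 1 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    features = []
    for start in range(len(samples) - window + 1):
        chunk = samples[start:start + window]
        # zip(*chunk) regroups the frames into one sequence per sensor
        features.append([pstdev(sensor) for sensor in zip(*chunk)])
    return features


# Hypothetical recording: sensor 0 bends halfway through, sensor 1 never moves.
frames = [[0.0, 1.0]] * 4 + [[1.0, 1.0]] * 4
feats = windowed_std(frames, window=4)
```

With these made-up frames, the windows that sit entirely before or entirely after the bend give a standard deviation of zero, and only the windows that straddle it are nonzero. So what the classifier sees depends on where the window falls relative to the gesture, which is a segmentation decision whether or not you call it one.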



Weak paper. Do not want. Would not buy from seller again.

2 comments:

Brandon said...

yeah i agree - i didn't understand this paper at all. their whole thesis of using standard deviation to avoid segmentation made no sense to me at all. apparently no one else in class seemed to get it either...

Grandmaster Mash said...

I like the idea of looking up gestures in a hash table. You should write a paper on it.