Summary
Westeyn et al. present a toolkit to simplify the recognition of hand gestures using hidden Markov models (HMMs). Their system, dubbed GT2k, runs on top of HTK, the HMM toolkit used for speech recognition. It abstracts away the complexity of HMMs and of their application to speech recognition (it's been shown that speech models do well at recognizing gestures, too). You provide feature vectors, a grammar defining the classification targets and how they are related, and training examples. The system trains the models, which can then be used later for classification.
They give several example applications that use GT2k. The first, a driving application, recognizes postures performed between a camera and an array of IR LEDs, and achieves 99.2% accuracy for 8 classes. They also give examples of blink-prints, mobile sign language recognition (90%) and workshop activity recognition.
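To make the train-then-classify idea concrete, here is a minimal sketch of HMM-based classification: one discrete HMM per gesture class, with a new observation sequence assigned to whichever model gives it the highest likelihood. This is not GT2k's or HTK's API. The gesture names, the two-symbol observation alphabet, and the hand-set probabilities (standing in for trained parameters) are all assumptions made up for illustration.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm)."""
    n = len(start)
    # Initialise with the start distribution and the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

def classify(obs, models):
    """Pick the class whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Hypothetical two-state, left-to-right models. Observation symbols:
# 0 = hand low in the frame, 1 = hand high in the frame.
start = [1.0, 0.0]
trans = [[0.6, 0.4], [0.0, 1.0]]
models = {
    "swipe_up": (start, trans, [[0.9, 0.1], [0.1, 0.9]]),
    "swipe_down": (start, trans, [[0.1, 0.9], [0.9, 0.1]]),
}

print(classify([0, 0, 1, 1], models))
print(classify([1, 1, 0, 0], models))
```

In a real system the probabilities would come from training on examples (e.g. Baum-Welch) rather than being typed in, and the observations would be continuous feature vectors rather than two symbols.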
Discussion
So first off, it's neat that there is a little toolkit we can use to do hand gesture recognition. Being built on top of an HMM kit for speech recognition isn't too scary, since HMMs pretty much pwn speech recognition. It also makes HMMs more available to the masses.
That being said, I don't feel like the authors really applied their toolkit to any example that is truly worthy of the power of an HMM. The driving thing could be handled by a simple neural network, and would be crazy easy even with template matching. The blink-print thing, besides being dumb, is just short/long sequence identification, which template matching or nearest neighbor would handle. Telesign... their grammar looks like you'd have to specify all possible orderings of words (UGH!). I think GT2k has promise in this area, however. Workshop activity recognition... apart from the fact that the activities can be classified from the sensor data, which is neat, this application is absurd.
Again, though, I'd like to be clear that GT2k is a great idea, and I'd like to use it more, hopefully on more worthy applications.
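As a rough illustration of the blink-print point, here is a sketch of nearest-neighbor template matching on short/long sequences. The blink durations (in seconds), the template labels, and the choice of Euclidean distance are all assumptions of mine, not anything taken from the paper.

```python
import math

def distance(a, b):
    """Euclidean distance between two blink-duration sequences.

    Sequences with a different number of blinks are treated as
    infinitely far apart.
    """
    if len(a) != len(b):
        return float("inf")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_template(sample, templates):
    """Return the label of the stored template closest to the sample."""
    return min(templates, key=lambda label: distance(sample, templates[label]))

# Hypothetical templates: each entry is one blink duration in seconds.
templates = {
    "short-short-long": [0.2, 0.2, 0.8],
    "long-short-long": [0.8, 0.2, 0.8],
    "short-long": [0.2, 0.8],
}

print(nearest_template([0.25, 0.15, 0.9], templates))
```

No state sequence or transition model is needed here, which is why this task feels like a poor showcase for an HMM.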
BibTeX
@inproceedings{958452,
author = {Tracy Westeyn and Helene Brashear and Amin Atrash and Thad Starner},
title = {Georgia tech gesture toolkit: supporting experiments in gesture recognition},
booktitle = {ICMI '03: Proceedings of the 5th international conference on Multimodal interfaces},
year = {2003},
isbn = {1-58113-621-8},
pages = {85--92},
location = {Vancouver, British Columbia, Canada},
doi = {http://doi.acm.org/10.1145/958432.958452},
publisher = {ACM},
address = {New York, NY, USA},
}