Friday, February 22, 2008

Lichtenauer - 3D Visual recog. of NGT sign production

J.F. Lichtenauer, G.A. ten Holt, E.A. Hendriks, M.J.T. Reinders. "3D Visual Detection of Correct NGT Sign Production." Thirteenth Annual Conference of the Advanced School for Computing and Imaging, Heijen, The Netherlands, June 13-15 2007.

Summary



Lichtenauer et al. discuss a system for recognizing Dutch sign language (NGT) gestures using two cameras and an image processing approach. The system is initialized semi-automatically to sample the skin color of the face. The hands are then located in the image by finding the points that have the same color as the face. Gesture segmentation is manually enforced, with the hands at rest on the table between gestures. The gestures are turned into feature vectors of movement/angle through space (blob tracking) and time, and compared to a reference gesture per class using dynamic time warping (DTW). The features are classified independently of one another, and the results per class per feature are summed, giving one average probability per class (across the features). If the probability is above a certain threshold, the gesture is labeled as that class. They report a 95% true positive rate and a 5% false positive rate.
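To make the dynamic time warping step concrete, here is a minimal sketch of the textbook DTW cost between a reference sequence and an observed one. It assumes a single 1-D feature and an absolute-difference step cost; the function name and the sample sequences are made up for illustration, and the paper's actual features and distance measure may well differ.

```python
import math

def dtw_distance(seq_a, seq_b):
    """Textbook DTW cost between two 1-D sequences (illustrative, not the paper's code)."""
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] holds the cheapest alignment of seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(seq_a[i - 1] - seq_b[j - 1])  # assumed local distance
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat the sample from seq_b
                cost[i][j - 1],      # repeat the sample from seq_a
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]

# Hypothetical feature traces: a reference sign, the same sign performed
# more slowly, and an unrelated motion.
reference = [0.0, 1.0, 2.0, 1.0, 0.0]
slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0]
different = [2.0, 2.0, 2.0, 2.0, 2.0]

print(dtw_distance(reference, slow))       # slowed-down copy aligns at zero cost
print(dtw_distance(reference, different))  # unrelated motion costs more
```

The point of the warping is that the slower performance still matches the reference exactly, which is why DTW suits gestures performed at varying speeds.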

Discussion



This method seems pretty hardcore on the computation, since they're doing a classifier for each of the ~5000 features. I don't know if that's how all DTW stuff works, but I think you could do something to dramatically reduce the amount of error.

If you wear a short sleeve shirt, will the algorithm break if it starts trying to track your forearms or elbows? It's just using skin color, so I think it might.
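Here is a rough sketch of why I suspect bare forearms would be a problem. It assumes the simplest possible color test (each RGB channel within a fixed tolerance of the sampled face color); the paper's actual color model is not something I can reproduce here, and the function name, colors, and tolerance are all invented for illustration.

```python
def skin_mask(image, face_color, tolerance=30):
    """Flag pixels whose RGB channels are all within `tolerance` of the face color.

    Hypothetical stand-in for the paper's skin detector.
    """
    return [
        [all(abs(c - f) <= tolerance for c, f in zip(pixel, face_color))
         for pixel in row]
        for row in image
    ]

# Made-up RGB values for one image row running from the table to the hand.
FACE = (205, 160, 135)
HAND = (200, 155, 130)
FOREARM = (198, 150, 128)
SLEEVE = (40, 60, 150)
TABLE = (90, 60, 40)

long_sleeve_row = [TABLE, SLEEVE, SLEEVE, HAND]
short_sleeve_row = [TABLE, SLEEVE, FOREARM, HAND]

print(skin_mask([long_sleeve_row], FACE)[0])   # only the hand pixel passes
print(skin_mask([short_sleeve_row], FACE)[0])  # forearm and hand both pass
```

With a test like this, the forearm is indistinguishable from the hand, so the tracked blob would grow or shift unless something else (shape, position, or a sleeve) separates the two.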

They use the naive Bayes assumption to make all their features independent of each other. I think this is pretty safe to do, especially as it simplifies computation. They do mention that even though some features might contain correlation, they've added features to capture this correlation independently, and extract it out of the space "between" features (that's a hokey way to put it, sorry).
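Treating the features as independent means each one can be scored by itself and the scores combined afterward. A minimal sketch of the average-then-threshold step from the summary, assuming each feature's classifier already outputs a probability for the class in question; the function name, the probabilities, and the 0.5 threshold are hypothetical, not values from the paper.

```python
def detect_sign(feature_probs, threshold=0.5):
    """Average per-feature probabilities for one class and compare to a threshold.

    Illustrative only; the threshold value is an assumption.
    """
    if not feature_probs:
        raise ValueError("need at least one feature probability")
    mean_prob = sum(feature_probs) / len(feature_probs)
    # The gesture is labeled as this class only if the mean is above the threshold.
    return mean_prob, mean_prob > threshold

# Hypothetical per-feature outputs for one class on two gestures.
good_attempt = [0.75, 1.0, 0.5]
poor_attempt = [0.25, 0.5, 0.0]

print(detect_sign(good_attempt))  # (0.75, True)
print(detect_sign(poor_attempt))  # (0.25, False)
```

Averaging keeps the cost linear in the number of features, which matters when there are thousands of them.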

They don't report accuracy, only true positive and false positive rates. This is pretty much bogus, as far as I'm concerned, as it doesn't tell you much about how accurate their system is at recognizing gestures correctly.

BibTeX



@inproceedings{lichtenauer2007ngtSign
,author="J.F. Lichtenauer and G.A. ten Holt and E.A. Hendriks and M.J.T. Reinders"
,title="{3D} Visual Detection of Correct {NGT} Sign Production"
,booktitle="Thirteenth Annual Conference of the Advanced School for Computing and Imaging"
,address="Heijen, The Netherlands"
,year="2007"
,month="June"
}

2 comments:

Grandmaster Mash said...

I haven't commented in a while, so get ready for a barrage of these.

That's a good point on the "short-sleeve shirt" issue. Researchers might have to provide sleeves that attach to the gloves, which is kind of both ridiculous and funny. This might not be an issue if they know where the hand/arm ends, but if the color of the hands is similar to the color of the background, it might make segmenting out the hand and arm harder.

Paul Taele said...

Haha, that's hilarious. Might as well have a shirt-based input device while they're at it. This paper had some flawed or omitted aspects that don't really motivate me to further investigate vision-based methods over glove-based methods. The bogus results at the end depress me a bit, too. Your observation of the paper's naive Bayes assumption does make me dislike the paper a bit less, since I didn't recall it in my last reading and it seems insightful.