Saturday, May 10, 2008

Poddar - Gesture Speech, Weatherman

I. Poddar, Y. Sethi, E. Ozyildiz, R. Sharma. Toward Natural Gesture/Speech HCI: A Case Study of Weather Narration. Proc. Workshop on Perceptual User Interfaces, pages 1-6, November 1998.

Summary



The paper identifies three categories of gestures in weather narration: pointing, area, and contour. Each gesture has three phases: preparation, the gesture stroke itself, and retraction. The authors compute features measuring distances and angles between the face and hands and feed them into an HMM, getting 50-60% accuracy on four test sequences.
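To make the setup concrete, here is a minimal sketch of the gesture-only classifier using the hmmlearn library. The feature extraction and training data layout are my assumptions; the paper's actual features are the face/hand distances and angles mentioned above.

```python
# Sketch: one Gaussian HMM per gesture class, classify by max log-likelihood.
# Assumes hmmlearn; feature arrays are hypothetical face/hand distance-angle
# vectors, one row per video frame.
import numpy as np
from hmmlearn import hmm

GESTURES = ["pointing", "area", "contour"]

def train_gesture_models(sequences_by_gesture):
    """Fit one HMM per gesture.

    sequences_by_gesture: dict mapping gesture name -> list of
    (T_i, n_features) arrays of face/hand distance-angle features.
    """
    models = {}
    for gesture, seqs in sequences_by_gesture.items():
        X = np.vstack(seqs)
        lengths = [len(s) for s in seqs]
        # Three hidden states, loosely matching the
        # preparation/stroke/retraction phases.
        m = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[gesture] = m
    return models

def classify(models, seq):
    """Label a feature sequence with the gesture whose HMM scores it highest."""
    return max(models, key=lambda g: models[g].score(seq))
```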

The authors then add speech to the gesture data. They compute co-occurrences of marker words (keywords in the narration) with the different gestures and use those statistics to help the HMM classify gestures. Accuracy goes up by about 10%.
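A plausible way to realize this fusion is to treat the keyword co-occurrence counts as a naive-Bayes-style speech score and add it to the HMM log-likelihood. The keyword table below is invented for illustration; the paper derives its counts from manually labeled transcripts.

```python
# Sketch: fuse HMM log-likelihood with a speech co-occurrence score.
# The cooccurrence table and keywords are hypothetical.
import math

cooccurrence = {
    "pointing": {"here": 40, "this": 30, "across": 5},
    "area":     {"here": 10, "this": 15, "across": 8},
    "contour":  {"here": 5,  "this": 10, "across": 35},
}

def keyword_log_score(gesture, keywords, alpha=1.0):
    """Laplace-smoothed log P(keywords | gesture) from co-occurrence counts."""
    counts = cooccurrence[gesture]
    total = sum(counts.values()) + alpha * len(counts)
    return sum(math.log((counts.get(w, 0) + alpha) / total) for w in keywords)

def classify_fused(models, seq, keywords, weight=1.0):
    """Pick the gesture maximizing HMM likelihood plus the speech score."""
    return max(models, key=lambda g: models[g].score(seq)
                                     + weight * keyword_log_score(g, keywords))
```

The `weight` parameter controls how much the speech evidence influences the decision; the paper does not specify its fusion formula at this level of detail, so this is just one reasonable combination rule.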

Discussion



Adding speech to gesture data improves accuracy. This is fairly unsurprising, and they've shown that it helps a little. The one thing I don't like is the manual labeling of the speech data.

I wish they had tested more gestures, and their accuracies weren't great. But at least it was a genuine fusion of contextual data.
