Summary
Take accelerometer data. Segment the gesture by holding a button down during the movement. Resample the gesture to 40 frames. Vector quantize (with k-means) the 40 3D points into 40 codewords, using a codebook of size 8. Plug the resulting 40-symbol sequences into an ergodic, 5-state HMM and classify. Train the HMMs until the percent difference in log likelihood is below a threshold.
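To make the preprocessing steps concrete, here is a minimal sketch of the resample-then-quantize part of the pipeline. It assumes linear interpolation for the resampling and a plain Lloyd's-style k-means for the codebook; the paper's exact choices are not stated in this summary. The function names (`resample`, `fit_codebook`, `quantize`) are mine, not the authors'.

```python
import numpy as np

def resample(gesture, n_frames=40):
    """Linearly interpolate a (T, 3) gesture onto n_frames evenly spaced samples."""
    gesture = np.asarray(gesture, dtype=float)
    old_t = np.linspace(0.0, 1.0, len(gesture))
    new_t = np.linspace(0.0, 1.0, n_frames)
    # Interpolate each accelerometer axis independently.
    return np.column_stack(
        [np.interp(new_t, old_t, gesture[:, d]) for d in range(gesture.shape[1])]
    )

def quantize(points, codebook):
    """Map each 3D point to the index of its nearest codeword."""
    dists = np.linalg.norm(points[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def fit_codebook(points, k=8, n_iter=50, seed=0):
    """Plain k-means (assumed variant): returns a (k, 3) codebook."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize the codewords from k distinct data points.
    codebook = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = quantize(points, codebook)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave empty clusters where they are
                codebook[j] = members.mean(axis=0)
    return codebook
```

In practice the codebook would presumably be fit once on the pooled frames of all training gestures, and each gesture would then become a sequence of 40 integers in 0..7 that serves as the HMM's observation sequence.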
Need more training data? Augment some of the training examples you do have with noise, either uniformly or normally distributed. A signal-to-noise ratio of about 3 is best for Gaussian noise and about 5 for uniform noise, and both slightly increase accuracy when used to generate training examples. Accuracy increases with more training examples. They get about 98% accuracy on their easy data set.
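The noise augmentation can be sketched as follows. This is only an illustration under stated assumptions: I take SNR to mean signal variance divided by noise variance, computed over the whole gesture, which may not match the paper's definition. `add_noise` is a hypothetical helper name.

```python
import numpy as np

def add_noise(gesture, snr, kind="gaussian", rng=None):
    """Return a noisy copy of a (T, 3) gesture at the requested SNR.

    Assumption: SNR = var(signal) / var(noise), a power ratio.
    """
    rng = np.random.default_rng() if rng is None else rng
    gesture = np.asarray(gesture, dtype=float)
    noise_std = np.sqrt(gesture.var() / snr)
    if kind == "gaussian":
        noise = rng.normal(0.0, noise_std, size=gesture.shape)
    elif kind == "uniform":
        # Uniform on [-a, a] has variance a^2 / 3, so a = sqrt(3) * std.
        half_width = np.sqrt(3.0) * noise_std
        noise = rng.uniform(-half_width, half_width, size=gesture.shape)
    else:
        raise ValueError(f"unknown noise kind: {kind!r}")
    return gesture + noise
```

Calling this several times on each real example, with `snr=3` for Gaussian or `snr=5` for uniform noise, would produce the kind of synthetic training set the paper describes.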
Discussion
Another paper that pairs a ridiculously easy gesture set with powerful hidden Markov models. I think Rubine or $1 would do just as well, and wouldn't require the complexity of HMMs.
I do like the idea of generating new training examples by adding artificial noise. This can be useful when you don't have a lot of training data to begin with. However, I don't like the way they did it. They should be learning the parameters of their distributions by examining the real data. For example, use the real training examples to estimate what the mean and covariance values should be. Then sample these distributions to get new training examples, rather than adding noise to a real training example (which will make outliers even worse). Also, it's not clear if there is any real advantage to using Gaussian over uniformly distributed noise. In Fig. 6, Gaussian seems to do better for low SNR and uniform better for high SNR. And in Fig. 7, the results are all over the place. Are the differences in accuracy statistically significant?
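A rough sketch of what I mean by learning the distribution from the data: for one gesture class, estimate a mean and a 3x3 covariance at each of the resampled frames, then draw new gestures from those Gaussians. Treating the frames as independent is my simplifying assumption (it ignores correlation across time), and the function names are hypothetical.

```python
import numpy as np

def fit_frame_gaussians(examples):
    """Estimate a per-frame mean and covariance from (N, T, 3) real examples."""
    examples = np.asarray(examples, dtype=float)
    means = examples.mean(axis=0)  # shape (T, 3)
    covs = np.stack(
        [np.cov(examples[:, t, :], rowvar=False) for t in range(examples.shape[1])]
    )  # shape (T, 3, 3)
    return means, covs

def sample_gestures(means, covs, n_samples, rng=None):
    """Draw synthetic gestures, sampling every frame independently."""
    rng = np.random.default_rng() if rng is None else rng
    frames = [
        rng.multivariate_normal(means[t], covs[t], size=n_samples)
        for t in range(len(means))
    ]
    return np.stack(frames, axis=1)  # shape (n_samples, T, 3)
```

Because the samples are centered on the class mean rather than on individual examples, an outlier in the training set widens the covariance a little but is not itself copied and perturbed.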