Thursday, September 20, 2007

Kim and Kim - Curvature Estimation for Segmentation

Kim, Dae Hyun, and Myoung-Jun Kim. "A Curvature Estimation for Pen Input Segmentation in Sketch-Based Modeling." 2006.

Summary



Kim and Kim look at the issue of using curvature to determine segmentation points for pen-drawn strokes. This is a recurring theme in the literature, including its use by Sezgin and by Yu. Kim and Kim's contribution is to examine two new methods of computing curvature. They make the assumption that every point drawn by the user is intentional and that there is no noise in a gesture.

The authors perform a sort of smoothing on the dataset, resampling the stroke points to produce a set of data where the points are approximately a fixed distance apart. The specifics of how this operation is carried out are not discussed, but they do say they use a distance of 5 pixels. This resampled set of points makes up a stroke. Changes in direction are then computed per point in the usual way.
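
To make the resampling step concrete, here is a rough Python sketch of one way to do it. The paper does not describe its procedure, so the linear-interpolation walk below (and the 5-pixel default) is my assumption about what an implementation might look like:

    import math

    def resample(points, spacing=5.0):
        # Walk along the polyline, emitting a new point roughly every
        # `spacing` pixels.  The trailing remainder of the stroke (less
        # than one full step) is simply dropped in this sketch.
        out = [points[0]]
        prev = points[0]
        carry = 0.0  # distance covered since the last emitted point
        for curr in points[1:]:
            seg = math.hypot(curr[0] - prev[0], curr[1] - prev[1])
            while seg > 0 and carry + seg >= spacing:
                t = (spacing - carry) / seg
                q = (prev[0] + t * (curr[0] - prev[0]),
                     prev[1] + t * (curr[1] - prev[1]))
                out.append(q)
                seg -= spacing - carry
                prev, carry = q, 0.0
            carry += seg
            prev = curr
        return out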

Curvature is defined as the change in direction divided by the arc length over which that change occurs. However, because resampling has occurred and the points are now approximately equidistant, curvature simplifies to just the change in direction at each point.
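
Since the spacing is approximately constant after resampling, the per-point curvature reduces to the signed turning angle between consecutive segments. A minimal sketch of that calculation, assuming the resample() helper above:

    import math

    def direction_changes(points):
        # Signed turning angle at each interior point of a resampled stroke.
        # With equal spacing, dividing by arc length only rescales by a
        # constant, so the angle itself serves as the curvature estimate.
        curv = [0.0]  # endpoints get zero curvature
        for p0, p1, p2 in zip(points, points[1:], points[2:]):
            a1 = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
            a2 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])
            d = a2 - a1
            # wrap into (-pi, pi] so clockwise vs. counter-clockwise turns keep their sign
            d = math.atan2(math.sin(d), math.cos(d))
            curv.append(d)
        curv.append(0.0)
        return curv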

The authors then propose two methods for adjusting the curvatures based on support neighborhoods around each point. The neighborhoods are initially of size k, so the support for point p_i consists of points p_(i-k) through p_(i+k). The first method looks at the convexity of the curvature changes moving away from p_i: the curvature of a neighboring point p_j (for i-k <= j < i and i < j <= i+k, moving away from p_i in either direction) is added only if it has the same sign as the curvature at p_i. In other words, any curvature occurring in the same direction (clockwise or counter-clockwise) is added together to bolster the curvature of p_i, within the window of k points on either side.
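
Here is how I read the convexity rule, as a sketch. Whether a neighbor with the wrong sign ends the outward walk or is merely skipped is not something I can pin down from the paper; this version simply skips mismatches, and the window size k is a free parameter:

    def convexity_support(curv, k=5):
        # Add to each point's curvature every neighbor within +/- k points
        # that bends in the same direction (same sign) as the point itself.
        n = len(curv)
        weighted = []
        for i in range(n):
            total = curv[i]
            for j in range(max(0, i - k), min(n, i + k + 1)):
                if j != i and curv[j] * curv[i] > 0:  # same sign
                    total += curv[j]
            weighted.append(total)
        return weighted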

This is a problem when there are several consecutive points (say, on an arc) that all have the same sign. In that case, the curvature values of those points would all be equal, since each gets the sum of the curvatures in its neighborhood. To fix this, the authors add a further constraint: not only must the curvature of a neighboring point have the same sign, its value must also be monotonically decreasing as you move away from p_i in order to be added. This has the effect of weighting certain points, the local curvature peaks, more heavily.
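
A sketch of the same idea with the monotonicity constraint added; requiring a strict decrease is my interpretation, chosen precisely so that equal-curvature points along an arc no longer tie:

    def monotonic_support(curv, k=5):
        # Like convexity_support, but a neighbor only contributes while the
        # curvature magnitudes keep strictly decreasing (and keep the same
        # sign) as we walk away from p_i in either direction.
        n = len(curv)
        weighted = []
        for i in range(n):
            total = curv[i]
            for step in (-1, 1):  # walk left, then walk right
                prev_mag = abs(curv[i])
                for off in range(1, k + 1):
                    j = i + step * off
                    if j < 0 or j >= n:
                        break
                    if curv[j] * curv[i] <= 0 or abs(curv[j]) >= prev_mag:
                        break  # sign flip, or no longer decreasing
                    total += curv[j]
                    prev_mag = abs(curv[j])
            weighted.append(total)
        return weighted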

The authors compare their curvature estimation methods with the circumscribing-circle and curve-bending-function methods, finding theirs comparable and arguing that their approach is justified. They then present an evaluation experiment showing gesture segmentation results using support neighborhoods alone, using the convexity information, and using both the convexity and monotonicity measures, compared against another method. They claim 95% success in detecting the correct segmentation points.

Discussion



It seems strange to me that the authors assume all the input points are intentional and that there is no noise. But then they resample the data! If you assume the data has no noise and that every point is intentional, why throw away features by resampling? They show how it makes the curvature calculations simpler, but is computing the curvature really that hard to begin with? Even if you have to divide by the length of the segment to calculate the initial curvature at each point, you can still use the convexity and monotonicity checks to weight each curvature.

I also have a problem with the assumption that there is no noise to begin with. I understand that you either assume noise or you don't, and that either way has its pros and cons. But until input devices are 100% accurate, and until the striated muscle fibers in my arm, hand, and fingers quit tensing and relaxing multiple times per second (hold your hand out in front of you and try to hold it perfectly still--you can't unless you're a robot and you assume outside factors don't affect your motion), there will ALWAYS be noise. The hard part is determining what is true noise and what is an intentional deviation. And guess what: that is also impossible, because what looks statistically like noise might really be me wanting to make intentional, small serrations in my otherwise-straight line. There is absolutely no way to guess user intention, in a general sense, with 100% accuracy (not even a human can do that 100% of the time, much less a computer).

</rant>

However, aside from that soap-box issue, I love the way that curvatures are weighted in this paper. It's nice to see curvature-weighting methods, especially when my own segmentation implementations face the problem of figuring out which of the gobs of curvature-based segmentation options to use. I also like how they use speed information to adjust the threshold for deciding whether a curvature change is intentional, assuming that lower speeds mean more deliberate movement and therefore giving those curvatures more weight. I do wish, however, that this had been discussed as more than an aside.
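
For what it's worth, here is the kind of thing I imagine they mean by the speed-based threshold; the paper gives no formula, so the linear scaling and the names below are entirely my own guess:

    def speed_scaled_threshold(base_threshold, local_speed, mean_speed):
        # Illustrative only: slower-than-average pen movement lowers the
        # curvature threshold, so deliberate, careful bends are more likely
        # to be accepted as segmentation points; fast movement raises it.
        return base_threshold * (local_speed / mean_speed)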

2 comments:

rg said...

Yeah, I think the way they use speed is interesting, but light on details. In implementing their algorithms it seems difficult to pick an appropriate threshold.

Paul Taele said...

The assumption of having no noise by the authors was quite humorous on my first reading of the paper. I figured I probably interpreted it incorrectly, but turns out that others were equally critical about that. I'm beginning to wonder if they really mean what they say, or if the concepts and terms discussed by the guys over in Asia are lost in translation...

In other news, I was a bit disappointed that the use of time was only mentioned in passing. On the one hand, the lack of detail means we didn't have to implement it (yay!), but leaving the details out made it seem like the authors didn't think it was that important. It did seem really handy, though, and at least deserved an addendum.