Thursday, September 20, 2007

Paper #8 - A Curvature Estimation for Pen Input Segmentation in Sketch-Based Modeling

Paper:
A Curvature Estimation for Pen Input Segmentation in Sketch-Based Modeling
(Dae Hyun Kim and Myoung-Jun Kim)

Summary:
Kim et al’s approach to sketch recognition involves segmenting pen input at high-curvature points in order to enhance shape recognition of pen strokes. Best curvature estimation is done by using only local shape information of the stroke on the fly. Several features are extracted from the stroke to aid in the process.

The first feature is taking the direction of the point by taking the angle from the preceding and succeeding point. Points are already re-sampled in this approach, which means choosing points a given distance from the previous point. The second feature is the use of support points, or a set of k number of preceding and k number of succeeding points (k = 3 usually), from a point aid in that point’s curvature estimation. Due to re-sampling, curvature is set to the point’s direction.

Two additional features are used to aid in locating segmentation points: local convexity and local monotonicity. In local convexity, a point’s curvature is computed as the sum of curvature values of itself and support points which share local convexity (same direction sign). Segmentation confusion can occur with local convexity. In other words, when points in the same neighborhood can end up having the same curvature values in local convexity, only complicated heuristics can remove the unnecessary features. To remedy it, local monotonicity is alternatively used. In local monotonicity, curvature estimation values of a point proceeds as follows. The sum is first taken to be the current points’ curvature value. While looping through the support points, add to the running sum the curvature value of a support point whose absolute curvature value is less than the current points’ absolute curvature value. When that is done, set that support point’s absolute value as the minimum. Repeat the process with the other support points to get the curvature value of the current point.

With the above definitions, the stroke can be scanned just once to compute the curvature at all input points. Segmentation occurs at the local maximum of positive curvature points and local minima of negative curvature points. These points are recognized as segmentation points if the absolute value of its curvature value does not exceed a threshold. Pen speed data is used in case of a low threshold value giving incorrect recognition (due to adding unnecessary segmentation points).

On one of the evaluation tests which the method was tested on (different drawing styles on a standard set of drawings), the method achieved a 95% recognition rate.

Discussion:
Shape recognition by segmentation of re-sampled points at high curvature points is a novel alternative to the Sezgin time-dependent approach. Unlike Yu’s solution to the problems of the Sezgin’s method by employing an approach independent of time, the Kim method does not entirely do away with time. The time component is only used sparingly though, as it is employed as a reference as opposed to a dominant feature. Instead, the focus of the Kim method in segmentation at high curvatures deals with the problems of corner-finding difficulties in Sezgin’s method with what appears to be heavy attention on computing curvature estimations of points in relation to neighboring points. Adding local convexity and local monotonicity as features to aid in shape recognition does seem to hold promise, given the recognition results from the evaluation tests used.

One of the issues that was mentioned in the paper was the inability of their method, along with segmentation algorithms in general, to deal with strokes of smaller drawing areas. With their method employing re-sampling, it is obvious why they restrict strokes to a minimum 30x20 drawing area. Thus, images which are scaled down pose a problem to the Kim method. Judging by the tone from the writers, it seems like the kind of restriction that has to be accepted with their method since its a fact of life. Scaling up the stroke’s size, extracting the line that runs through the middle of the scaled-up stroke, and smoothing that line for shape recognition would be an obvious solution to a naïve person. If only smoothing weren’t such a trivial operation…

2 comments:

Grandmaster Mash said...

You could always pick the point distance threshold based on the stroke's bounding box size.

And I'm still wondering whether they use both convexity and monotonicity in their implementation. From their wording, it seems like they ignore convexity completely when calculating their curvature values.

rg said...

I have to agree that they are low on details here for implementation. Their sparing use of time isn't very clear to me. I get that they use it only in the extreme cases, but what defines the extreme cases exactly?