Saturday, October 27, 2007

Paper #17 - Envisioning Sketch Recognition: A Local Feature Based Approach to Recognizing Informal Sketches

Paper:
Envisioning Sketch Recognition: A Local Feature Based Approach to Recognizing Informal Sketches
(Michael Oltmans)

Summary:
The author’s sketch recognition system aims to handle freehand sketching without relying on constant system feedback or rigid drawing constraints. The first challenge in freehand sketch recognition is variation in how shapes are drawn, both at the signal level (noise the user did not intend) and at the conceptual level (the diversity in how users choose to draw a shape). The second challenge is overtracing by the user, and the third is segmentation, the task of grouping strokes into a particular shape. The author relies on vision and machine learning techniques to cope with signal noise and conceptual variation, on a visual approach that treats overtraced strokes no differently from original strokes, and on an appearance-based approach that avoids stroke segmentation altogether because it works on a sketch’s visual patterns instead of individual strokes.

Recognition is built on two levels of representation: visual parts and shapes. Visual parts are represented with a shape context feature called the bullseye. Each part covers a circular region made up of a center point and concentric rings divided into wedges, and the wedges act as histogram bins that count the points falling inside them. Shapes are represented as sets of these bullseye parts, described in terms of a codebook, which is a standard collection of parts analogous to a vocabulary. To build the codebook, a set of parts is selected from the training set and clustered into groups of similar parts, and these clusters become the codebook entries. A shape is then represented by a match vector, which describes its set of bullseye parts in terms of the codebook, and shapes are identified by applying a distance metric to these match vectors.
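To make the bullseye and match vector ideas more concrete, here is a minimal Python sketch. It is my own illustration, not the author’s implementation. I assume equal-width rings, Euclidean distance between histograms, and a match vector whose entries are the distance from each codebook entry to its closest part; the paper’s actual ring spacing, distance metric, and codebook construction may differ. All names (`bullseye_histogram`, `match_vector`, the toy `square`) are hypothetical, and the hand-picked codebook stands in for the clustering step.

```python
import math

def bullseye_histogram(points, center, radius, n_rings=3, n_wedges=8):
    """Count points per (ring, wedge) bin around `center`; return a normalized flat list."""
    cx, cy = center
    hist = [0.0] * (n_rings * n_wedges)
    for x, y in points:
        dx, dy = x - cx, y - cy
        r = math.hypot(dx, dy)
        if r >= radius:
            continue  # point lies outside the bullseye
        # Assumption: equal-width rings (the paper's spacing may differ).
        ring = int(r / radius * n_rings)
        angle = math.atan2(dy, dx) % (2 * math.pi)
        wedge = int(angle / (2 * math.pi) * n_wedges) % n_wedges
        hist[ring * n_wedges + wedge] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def distance(a, b):
    """Euclidean distance between two histograms (an assumed metric)."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def match_vector(parts, codebook):
    """For each codebook entry, the distance to the closest part of the shape."""
    return [min(distance(part, entry) for part in parts) for entry in codebook]

# Toy shape: eight points on the outline of a square.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

# One bullseye part centered on each point of the shape.
parts = [bullseye_histogram(square, c, radius=3.0) for c in square]

# Hand-picked codebook standing in for the clustered training parts.
codebook = [parts[0], parts[2]]

vec = match_vector(parts, codebook)
```

Whatever the number of parts in a shape, the match vector always has one entry per codebook entry, which is what lets shapes with different numbers of parts be compared with a single distance metric.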

The author’s system handles two tasks: isolated shape classification and shape localization. For classification, the visual part representation described above is used to train a classifier that distinguishes the shapes. Localization is a three-stage process: candidate locations are first scanned across the sketch, a classifier is then applied to those locations, and a final set of predictions is selected from the sorted candidates. This set of localized shapes is the final output of the system. With this approach, the author achieved 89.5% recognition on isolated shapes, and a 74% recognition rate at 24% precision on shapes in complete sketches.
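The three localization stages can be sketched roughly as follows. This is only an illustration under my own assumptions: a fixed-size sliding window for the candidate scan, a placeholder scoring function in place of the trained classifier, and a greedy pass that keeps the highest-scoring candidates that do not overlap. The paper’s actual candidate selection and final prediction step may work differently, and every name here (`scan_candidates`, `localize`, `toy_score`) is hypothetical.

```python
def scan_candidates(width, height, size, step):
    """Stage 1: slide a square window of side `size` across the sketch."""
    return [(x, y, size)
            for y in range(0, height - size + 1, step)
            for x in range(0, width - size + 1, step)]

def overlaps(a, b):
    """True if two (x, y, size) windows share any area."""
    ax, ay, asz = a
    bx, by, bsz = b
    return ax < bx + bsz and bx < ax + asz and ay < by + bsz and by < ay + asz

def localize(width, height, score, size=20, step=10, threshold=0.5):
    candidates = scan_candidates(width, height, size, step)
    # Stage 2: apply the classifier (here, any scoring function) to each candidate.
    scored = [(score(c), c) for c in candidates]
    # Stage 3: sort by score and keep the best non-overlapping candidates.
    scored.sort(key=lambda sc: sc[0], reverse=True)
    kept = []
    for s, c in scored:
        if s < threshold:
            break  # remaining candidates score even lower
        if not any(overlaps(c, k) for k in kept):
            kept.append(c)
    return kept

def toy_score(candidate):
    """Placeholder classifier: scores peak for a window at (10, 10)."""
    x, y, _ = candidate
    return 1.0 / (1.0 + abs(x - 10) + abs(y - 10))

predictions = localize(40, 40, toy_score)
```

Lowering `threshold` keeps more candidates, which tends to find more of the true shapes at the cost of more false positives. That is the kind of trade-off behind reporting a recognition rate together with a precision figure.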

Discussion:
Up to this paper, most of the sketch recognition techniques covered this semester were stroke-based recognizers. What I found very interesting about this paper’s approach is that it abandons the core techniques of stroke-based recognition in favor of machine learning and computer vision. The author backs up this different approach with impressive recognition rates on remarkably sloppy sketches of various symbols, shown below.


It’s difficult to find noticeable faults offhand with the author’s alternative approach to sketch recognition. The bullseye feature, which serves as a histogram of points, handles shape orientation in a workable way, and it can also cope with shapes of varying size by altering its radius. Furthermore, information that is available to stroke-based recognition is still accessible to the author’s hybrid technique, so it can draw on the strengths of traditional stroke-based approaches without many of their weaknesses, if any. The only issue that comes to mind involves the shapes that the recognizer classified incorrectly. The author attributes these errors to strong similarity between shapes of different classes.


I can’t blame the recognizer for mis-recognizing those terribly drawn symbols, as they probably would have been difficult for the stroke-based recognizers introduced earlier in the semester anyway. One thing that might further improve the recognizer’s already high recognition rates would be to use context. It would be highly unrealistic to expect the recognizer to correctly classify symbols in isolation when even a person would find them difficult to recognize. However, these symbols appear in a domain where a symbol can be disambiguated by its context, that is, its relation to surrounding symbols that may already have been classified correctly. That’s the only improvement I can see, but it seems quite an ambitious addition to an already ambitious approach that was originally designed to recognize informal sketches. Plus, I think recognition through context is a difficult problem in itself.
