author = "Michael Shilman and Hanna Pasula and Stuart Russell and Richard Newton",
title = "Statistical visual language models for ink parsing",
booktitle = proc # aaai # "Spring Symposium on Sketch Understanding",
year = "2002",
pages = "126--132",
publisher = "AAAI Press",
abstract = "In this paper we motivate a new technique for automatic recognition of hand-sketched digital ink. By viewing sketched drawings as utterances in a visual language, sketch recognition can be posed as an ambiguous parsing problem. On this premise we have developed an algorithm for ink parsing that uses a statistical model to disambiguate. Under this formulation, writing a new recognizer for a visual language is as simple as writing a declarative grammar for the language, generating a model from the grammar, and training the model on drawing examples. We evaluate the speed and accuracy of this approach for the sample domain of the SILK visual language and report positive initial results."
}
Built on top of SILK (Landay et al.) to extend its recognition capabilities. Low-level primitives are recognized with Rubine's algorithm and features. Higher-level components are built from low-level primitives and the visual constraints placed on them. Constraints include:
- Distance, DeltaX, DeltaY, Overlap - spatial relations
- Angle
- WidthRatio, HeightRatio - size relations
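To make the constraint list concrete, here is a minimal sketch of how such pairwise features could be computed from two bounding boxes. The `Box` type, the `constraint_features` function, and every formula in it (center-to-center distance, overlap normalized by the smaller box, and so on) are assumptions for illustration; the paper's exact definitions are not given in these notes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a stroke or symbol (hypothetical type)."""
    x: float
    y: float
    w: float
    h: float

    @property
    def center(self):
        return (self.x + self.w / 2.0, self.y + self.h / 2.0)


def constraint_features(a: Box, b: Box) -> dict:
    """Pairwise features named after the constraints listed above.

    The formulas are assumptions, not the paper's definitions.
    """
    (ax, ay), (bx, by) = a.center, b.center
    dx, dy = bx - ax, by - ay
    # Overlap: intersection area divided by the smaller box area (assumed).
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    smaller = min(a.w * a.h, b.w * b.h)
    return {
        "Distance": math.hypot(dx, dy),   # center-to-center distance
        "DeltaX": dx,
        "DeltaY": dy,
        "Overlap": (ix * iy) / smaller if smaller > 0 else 0.0,
        "Angle": math.atan2(dy, dx),      # radians, direction from a to b
        "WidthRatio": a.w / b.w if b.w > 0 else math.inf,
        "HeightRatio": a.h / b.h if b.h > 0 else math.inf,
    }
```

For example, `constraint_features(Box(0, 0, 2, 2), Box(1, 0, 2, 2))` describes two equally sized boxes, the second shifted one unit to the right of the first.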
Rather than trying every possible set of ink strokes to find the features and low-level labels that maximize the MAP criterion, the authors propose a simple ink parsing algorithm. It takes one stroke at a time and considers only the groupings that are relevant to the new symbol. The parse tree is pruned using cutoff values on the constraint posteriors.
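The stroke-at-a-time idea can be sketched roughly as follows. This is not the authors' algorithm: it assumes that "relevant" groupings are simply those containing the newest stroke, and `parse_incrementally`, `score_group`, `cutoff`, and `max_size` are hypothetical names, with `score_group` standing in for the statistical model's posterior.

```python
from itertools import combinations


def parse_incrementally(strokes, score_group, cutoff=0.5, max_size=3):
    """Add one stroke at a time, scoring only groups that contain it.

    strokes: hashable stroke ids in drawing order.
    score_group: hypothetical callable mapping a tuple of strokes to a
        posterior-like score in [0, 1] (stands in for the statistical model).
    cutoff: groups scoring below this value are pruned.
    max_size: largest group size considered (an assumed bound).
    """
    kept = []   # surviving (group, score) hypotheses
    seen = []   # strokes processed so far
    for new in strokes:
        # Only groups that include the new stroke are candidates; groups of
        # older strokes were already scored when their last stroke arrived.
        for size in range(max_size):
            for others in combinations(seen, size):
                group = others + (new,)
                score = score_group(group)
                if score >= cutoff:
                    kept.append((group, score))
        seen.append(new)
    return kept
```

Because each group is scored exactly once, when its last stroke arrives, the work per new stroke depends on the strokes seen so far rather than on re-enumerating every subset of the drawing. A real parser would also restrict candidates spatially, which this sketch omits.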
The authors vary the threshold value and achieve at most about 80% stroke-level accuracy and 90% stroke-level precision @ 3.