Friday, October 26, 2007

Paper #16 - Naturally Conveyed Explanations of Device Behavior

Paper:
Naturally Conveyed Explanations of Device Behavior
(Michael Oltmans and Randall Davis)

Summary:
The authors of this paper created ASSISTANCE, a multi-modal system that interprets simple two-dimensional kinematic devices from sketch and verbal input. To describe a device to a computer as easily as to a person, a description needs to convey both the device’s structure and its behavior. ASSISTANCE enriches a device’s description by combining an understanding of its graphical representation with a verbal description of its behavior. To use ASSISTANCE, a designer first sketches the device, then explains its structure aloud, and finally has the system interpret the combined utterances and gestures. To keep the problem manageable, the domain is limited to two-dimensional kinematic devices.

The system is composed of three parts. First, the device’s sketched description (i.e., physical bodies and arrows) and spoken description (parsed textual phrases) are handled by the ASSIST sketch recognizer and the ViaVoice speech recognizer, respectively. Second, their output, in the form of propositional statements, is sent to ASSISTANCE for interpretation. Last, the event and causal links that ASSISTANCE generates with its truth maintenance and rule systems are passed to the final stage, which finds a consistent causal structure that most closely matches the designer’s description of the device.
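The three-stage flow described above can be sketched roughly as follows. This is only an illustrative toy, not the authors’ code: the function names, the proposition format, and the trivial "search" for a causal structure are all assumptions standing in for what ASSIST, ViaVoice, and ASSISTANCE actually do.

```python
# Illustrative sketch of the pipeline (hypothetical names and data shapes).

def recognize_sketch(strokes):
    # Stage 1a: stand-in for the ASSIST sketch recognizer; emits
    # propositional statements about bodies and arrows in the sketch.
    return [("body", "ball"), ("arrow", "ball", "down")]

def recognize_speech(utterance):
    # Stage 1b: stand-in for the ViaVoice recognizer; emits parsed
    # phrases as propositional statements.
    return [("moves", "ball", "down")]

def interpret(propositions):
    # Stage 2: stand-in for ASSISTANCE's rule system; derives candidate
    # causal/event links from the combined propositions.
    links = []
    for p in propositions:
        if p[0] in ("arrow", "moves"):
            links.append(("causes", "gravity", p[1]))
    return links

def best_causal_structure(links):
    # Stage 3: pick a consistent structure matching the descriptions;
    # here simple deduplication stands in for the real search.
    return sorted(set(links))

props = recognize_sketch(["stroke1"]) + recognize_speech("the ball moves down")
print(best_causal_structure(interpret(props)))
```

The point of the sketch is only the data flow: two recognizers feed one pool of propositions, and a later stage reconciles them into a single causal account.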

The authors evaluate ASSISTANCE by comparing it to alternative approaches and by analyzing its usability. The system strives to combine the mathematical precision of formal tools like CAD with the freedom from constraints of informal alternatives such as written and verbal explanations. It has also demonstrated the ability to interpret a designer’s explanation of a device in the early design stage.

Discussion:
Previous papers, besides Abam’s, introduced us to novel approaches to various sketch recognition problems. What those papers had in common was that they dealt solely with sketch recognition. With this paper, it’s very interesting that we’re introduced to a multi-modal method that employs speech recognition in addition to the already familiar sketch recognition. This is quite an ambitious combination for the authors to use to interpret two-dimensional kinematic devices in a domain as difficult as physics.

I agree with one fault Dr. Hammond brought up in class about using two different recognition systems: how does the system cope with speech input that completely contradicts the sketch input? This can be remedied in several ways, one being to give one input priority over the other. The other fault I see lies in how useful this kind of multi-modal system would actually be, given the overhead needed to see benefits over alternative approaches. The authors argue that their system offers the flexibility of descriptions made in normal communication together with the precision of computing tools. A sketch-plus-voice recognition approach is intriguing, but it’s difficult for me to gauge whether the effort it takes to employ such an approach outweighs simply resorting to an all-sketch recognition method instead.
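The priority-based remedy mentioned above could look something like this. Again, this is a hypothetical illustration, not anything from the paper: the fact representation and the `merge` function are my own assumptions about how one modality might be made to win on conflict.

```python
# Hypothetical conflict resolution: when speech and sketch assert
# contradictory facts, the prioritized modality wins.

def merge(sketch_facts, speech_facts, priority="sketch"):
    # Facts map (subject, attribute) -> value. Start from the
    # lower-priority modality, then overwrite with the higher one.
    merged = dict(speech_facts if priority == "sketch" else sketch_facts)
    merged.update(sketch_facts if priority == "sketch" else speech_facts)
    return merged

sketch = {("ball", "direction"): "down"}
speech = {("ball", "direction"): "up"}  # contradicts the sketch
print(merge(sketch, speech, priority="sketch"))  # the sketch's "down" wins
```

A fixed priority is of course the crudest option; one could instead weight each modality by its recognizer’s confidence, but that is beyond this simple sketch.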
