Where Are My Intelligent Assistant's Mistakes? A Systematic Testing Approach

Kulesza, T., Burnett, M., Stumpf, S., Wong, W., Das, S., Groce, A., Shinsel, A., Bice, F. & McIntosh, K. (2011). Where Are My Intelligent Assistant's Mistakes? A Systematic Testing Approach. Paper presented at the Third International Symposium on End-User Development (IS-EUD), 7-10 June 2011, Torre Canne, Italy.

Abstract

Intelligent assistants are handling increasingly critical tasks, but until now, end users have had no way to systematically assess where their assistants make mistakes. For some intelligent assistants, this is a serious problem: if the assistant is doing work that is important, such as assisting with qualitative research or monitoring an elderly parent’s safety, the user may pay a high cost for unnoticed mistakes. This paper addresses the problem with WYSIWYT/ML (What You See Is What You Test for Machine Learning), a human/computer partnership that enables end users to systematically test intelligent assistants. Our empirical evaluation shows that WYSIWYT/ML helped end users find assistants’ mistakes significantly more effectively than ad hoc testing. Not only did it allow users to assess an assistant’s work on an average of 117 predictions in only 10 minutes, it also scaled to a much larger data set, assessing an assistant’s work on 623 out of 1,448 predictions using only the users’ original 10 minutes’ testing effort.

Item Type: Conference or Workshop Item (Paper)
Additional Information: The original publication is available at http://www.springer.com/computer/swe/book/978-3-642-21529-2
Uncontrolled Keywords: intelligent assistants, end-user programming, end-user development, end-user software engineering, testing, machine learning
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science; T Technology > T Technology (General)
Divisions: School of Informatics > Centre for Human Computer Interaction Design
URI: http://openaccess.city.ac.uk/id/eprint/214
