End-user feature labeling: Supervised and semi-supervised approaches based on locally-weighted logistic regression

Das, S., Moore, T., Wong, W-K, Stumpf, S., Oberst, I., McIntosh, K. & Burnett, M. (2013). End-user feature labeling: Supervised and semi-supervised approaches based on locally-weighted logistic regression. Artificial Intelligence, 204, pp. 56-74. doi: 10.1016/j.artint.2013.08.003

When intelligent interfaces, such as intelligent desktop assistants, email classifiers, and recommender systems, customize themselves to a particular end user, such customizations can decrease productivity and increase frustration due to inaccurate predictions, especially in early stages when training data is limited. The end user can improve the learning algorithm by tediously labeling a substantial amount of additional training data, but this takes time and is too ad hoc to target a particular area of inaccuracy. To solve this problem, we propose new supervised and semi-supervised learning algorithms based on locally weighted logistic regression for feature labeling by end users, enabling them to point out which features are important for a class, rather than provide new training instances.
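
For readers unfamiliar with the base technique, the following is a minimal sketch of generic, textbook locally weighted logistic regression: a separate logistic model is fit for each query point, with training instances weighted by their proximity to that query. It is not the feature-labeling algorithm proposed in the article, and it does not show how end-user feature labels enter the model, which this abstract does not describe. The function name `lwlr_predict`, the Gaussian kernel, and the parameters `tau` and `l2` are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lwlr_predict(X, y, x_query, tau=1.0, l2=0.1, n_iter=25):
    """Estimate P(y = 1 | x_query) with a logistic model fit locally around x_query.

    Hypothetical helper for illustration only; kernel, bandwidth `tau`,
    and L2 penalty `l2` are assumptions, not taken from the article.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    # Prepend a bias term to the training instances and to the query.
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    xq = np.concatenate([[1.0], x_query])

    # Gaussian kernel: training instances near the query get larger weights.
    sq_dist = np.sum((X - x_query) ** 2, axis=1)
    w = np.exp(-sq_dist / (2.0 * tau ** 2))

    # Newton's method on the weighted, L2-regularized log-likelihood.
    theta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ theta)
        grad = Xb.T @ (w * (y - p)) - l2 * theta
        hess = -(Xb.T * (w * p * (1.0 - p))) @ Xb - l2 * np.eye(Xb.shape[1])
        theta = theta - np.linalg.solve(hess, grad)

    return float(sigmoid(xq @ theta))

# Toy one-dimensional data: class 1 is more common for larger x.
X_train = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
y_train = [0, 0, 1, 0, 1, 1]
p_right = lwlr_predict(X_train, y_train, [1.5])
p_left = lwlr_predict(X_train, y_train, [-1.5])
```

In this sketch a smaller `tau` makes each prediction depend more heavily on the nearest training instances, while a larger `tau` approaches ordinary (global) logistic regression.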

We first evaluate our algorithms against other feature labeling algorithms under idealized conditions using feature labels generated by an oracle. A further contribution is an evaluation of feature labeling algorithms under real-world conditions using feature labels harvested from actual end users in our user study. Our user study is the first statistical user study for feature labeling involving a large number of end users (43 participants), none of whom had a background in machine learning.

Our supervised and semi-supervised algorithms were among the best performers when compared to other feature labeling algorithms in the idealized setting and they are also robust to poor quality feature labels provided by ordinary end users in our study. We also perform an analysis to investigate the relative gains of incorporating the different sources of knowledge available in the labeled training set, the feature labels and the unlabeled data. Together, our results strongly suggest that feature labeling by end users is both viable and effective for allowing end users to improve the learning algorithm behind their customized applications.

Item Type: Article
Uncontrolled Keywords: Feature labeling, locally weighted logistic regression, machine learning, intelligent interfaces, semi-supervised learning
Subjects: Q Science > QA Mathematics > QA76 Computer software
Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science
Divisions: School of Informatics > Centre for Human Computer Interaction Design
URI: http://openaccess.city.ac.uk/id/eprint/2741
