City Research Online

Can Large Language Models Be Good Companions?

Xu, Z. ORCID: 0009-0008-4497-724X, Xu, H. ORCID: 0009-0000-7570-4078, Lu, Z. ORCID: 0009-0005-6159-6914, Zhao, Y. ORCID: 0000-0001-5902-1306, Zhu, R. ORCID: 0000-0002-9944-0369, Wang, Y. ORCID: 0000-0002-6220-029X, Dong, M. ORCID: 0000-0002-8897-5931, Chang, Y. ORCID: 0000-0003-2607-916X, Lv, Q. ORCID: 0000-0002-9437-1376, Dick, R. P. ORCID: 0000-0001-5428-9530, Yang, F. ORCID: 0000-0003-2164-8175, Lu, T. ORCID: 0000-0002-6633-4826, Gu, N. ORCID: 0000-0002-2915-974X & Shang, L. ORCID: 0000-0003-3944-7531 (2024). Can Large Language Models Be Good Companions? Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 8(2), pp. 1-41. doi: 10.1145/3659600

Abstract

Developing chatbots as personal companions has long been a goal of artificial intelligence researchers. Recent advances in Large Language Models (LLMs) have delivered a practical solution for endowing chatbots with anthropomorphic language capabilities. However, it takes more than LLMs to enable chatbots that can act as companions. Humans use their understanding of individual personalities to drive conversations. Chatbots also require this capability to enable human-like companionship. They should act based on personalized, real-time, and time-evolving knowledge of their users. We define such essential knowledge as the common ground between chatbots and their users, and we propose to build a common-ground-aware dialogue system from an LLM-based module, named OS-1, to enable chatbot companionship. Hosted by eyewear, OS-1 can sense the visual and audio signals the user receives and extract real-time contextual semantics. Those semantics are categorized and recorded to formulate historical contexts from which the user's profile is distilled and evolves over time, i.e., OS-1 gradually learns about its user. OS-1 combines knowledge from real-time semantics, historical contexts, and user-specific profiles to produce a common-ground-aware prompt input into the LLM module. The LLM's output is converted to audio, spoken to the wearer when appropriate. We conduct laboratory and in-field studies to assess OS-1's ability to build common ground between the chatbot and its user. The technical feasibility and capabilities of the system are also evaluated. Our results show that by utilizing personal context, OS-1 progressively develops a better understanding of its users. This enhances user satisfaction and potentially leads to various personal service scenarios, such as emotional support and assistance.
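
The prompt-assembly step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `CommonGround` class, the `build_prompt` function, the section labels, and the prompt wording are all hypothetical names chosen for illustration. It only shows one plausible way to merge real-time semantics, historical contexts, and a user profile into a single text prompt for an LLM.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CommonGround:
    """Hypothetical container for the three knowledge sources named in the abstract."""
    real_time_semantics: List[str] = field(default_factory=list)  # what the eyewear senses now
    historical_contexts: List[str] = field(default_factory=list)  # summaries of past events
    user_profile: Dict[str, str] = field(default_factory=dict)    # traits distilled over time


def _bullets(items: List[str]) -> str:
    """Render items as a bullet list, with a placeholder when nothing is known yet."""
    return "\n".join(f"- {item}" for item in items) if items else "- (none yet)"


def build_prompt(ground: CommonGround, user_utterance: str) -> str:
    """Assemble an illustrative common-ground-aware prompt (assumed layout, not OS-1's)."""
    profile_items = [f"{key}: {value}" for key, value in sorted(ground.user_profile.items())]
    sections = [
        "You are a companion chatbot. Use the context below when replying.",
        "Current scene:\n" + _bullets(ground.real_time_semantics),
        "Recent history:\n" + _bullets(ground.historical_contexts),
        "User profile:\n" + _bullets(profile_items),
        "User says: " + user_utterance,
    ]
    return "\n\n".join(sections)


if __name__ == "__main__":
    ground = CommonGround(
        real_time_semantics=["user is in a kitchen"],
        user_profile={"hobby": "cooking"},
    )
    print(build_prompt(ground, "What should I make?"))
```

In this sketch the profile and history start empty and would be filled in as more context is recorded, which mirrors the abstract's point that the system's knowledge of its user evolves over time.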

Publication Type: Article
Additional Information: © the authors | ACM 2024. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, https://doi.org/10.1145/3659600.
Publisher Keywords: Smart eyewear, large language model, common ground, context-aware
Subjects: H Social Sciences > HN Social history and conditions. Social problems. Social reform
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments: Bayes Business School
Bayes Business School > Actuarial Science & Insurance
Text - Accepted Version