UbiWell Lab at UbiComp 2025
Several lab members attended UbiComp 2025 in Finland!
Overview
- 4 papers
- 1 workshop
- PACM IMWUT editorial board meeting
- Members who attended: Varun, Akshat, Ha, Jiachen, and Yuna.
Papers
Choube et al., IMWUT 2025
Akshat presented his paper on GLOSS, a fully automated open-ended sensemaking system powered by a group of LLMs that triangulate raw passive sensing data to answer complex natural language queries across multiple health and behavior use cases.
The ubiquitous presence of smartphones and wearables has enabled researchers to build prediction and detection models for various health and behavior outcomes using passive sensing data from these devices. Achieving a high-level, holistic understanding of an individual's behavior and context, however, remains a significant challenge. Due to the nature of passive sensing data, sensemaking -- the process of interpreting and extracting insights -- requires both domain knowledge and technical expertise, creating barriers for different stakeholders. Existing systems designed to support sensemaking are either not open-ended or cannot perform complex data triangulation. In this paper, we present a novel sensemaking system, Group of LLMs for Open-ended Sensemaking (GLOSS), capable of open-ended sensemaking and performing complex multimodal triangulation to derive insights. We demonstrate that GLOSS significantly outperforms the commonly used Retrieval-Augmented Generation (RAG) technique, achieving 87.93% accuracy and 66.19% consistency, compared to RAG's 29.31% accuracy and 52.85% consistency. Furthermore, we showcase the promise of GLOSS through four use cases inspired by prior and ongoing work in the UbiComp and HCI communities. Finally, we discuss the potential of GLOSS, its broader implications, and the limitations of our work.
Li et al., IMWUT 2025
Jiachen presented her paper introducing Vital Insight, an interactive LLM-assisted visualization system that helps experts make sense of multi-modal personal tracking data by integrating sensor signals, self-reports, and contextual insights.
Passive tracking methods, such as phone and wearable sensing, have become dominant in monitoring human behaviors in modern ubiquitous computing studies. While there have been significant advances in machine-learning approaches to translate periods of raw sensor data into models of momentary behaviors (e.g., physical activity recognition), there still remains a significant gap in the translation of these sensing streams into meaningful, high-level, context-aware insights that are required for various applications (e.g., summarizing an individual's daily routine). To bridge this gap, experts often need to employ a context-driven sensemaking process in real-world studies to derive insights. This process often requires manual effort and can be challenging even for experienced researchers due to the complexity of human behaviors. We conducted three rounds of user studies with 21 experts to explore solutions to address challenges with sensemaking. We follow a human-centered design process to identify needs and design, iterate, build, and evaluate Vital Insight (VI), a novel, LLM-assisted, prototype system to enable human-in-the-loop inference (sensemaking) and visualizations of multi-modal passive sensing data from smartphones and wearables. Using the prototype as a technology probe, we observe experts' interactions with it and develop an expert sensemaking model that explains how experts move between direct data representations and AI-supported inferences to explore, question, and validate insights. Through this iterative process, we also synthesize and discuss a list of design implications for the design of future AI-augmented visualization systems to better assist experts' sensemaking processes in multi-modal health sensing data.
Le et al., GenAI for Human Sensing Workshop
Ha presented her paper exploring the potential of a multi-agent LLM system for human activity recognition at the GenAI4HS Workshop.
Accurate human activity recognition (HAR) is critical for health monitoring and behavior-aware systems. Developing reliable HAR models, however, requires large, high-quality labeled datasets that are challenging to collect in free-living settings. Although self-reports offer a practical solution for acquiring activity annotations, they are prone to recall biases, missing data, and human errors. Context-assisted recall can help participants remember their activities more accurately by providing visualizations of multiple data streams, but triangulating this information remains a burdensome and cognitively demanding task. In this work, we adapt GLOSS, a multi-agent LLM system that can triangulate self-reports and passive sensing data to assist participants in activity recall and annotation by suggesting the most likely activities. Our results show that GLOSS provides reasonable activity suggestions that align with human recall (63–75% agreement) and even effectively identifies and corrects common human annotation errors. These findings demonstrate the potential of LLM-powered, human-in-the-loop approaches to improve the quality and scalability of activity annotation in real-world HAR studies.
Watanabe et al., MHSI Workshop
Yuna presented her paper highlighting the need for adaptive filtering in PPG signals given the additional sources of noise beyond motion artifacts.
Wearable physiological monitors are ubiquitous, and photoplethysmography (PPG) is the standard low-cost sensor for measuring cardiac activity. Metrics such as inter-beat interval (IBI) and pulse-rate variability (PRV) -- core markers of stress, anxiety, and other mental-health outcomes -- are routinely extracted from PPG, yet preprocessing remains non-standardized. Prior work has focused on removing motion artifacts; however, our preliminary analysis reveals sizeable beat-detection errors even in low-motion data, implying artifact removal alone may not guarantee accurate IBI and PRV estimation. We therefore investigate how band-pass cutoff frequencies affect beat-detection accuracy and whether optimal settings depend on specific persons and tasks observed. We demonstrate that a fixed filter produces substantial errors, whereas the best cutoffs differ markedly across individuals and contexts. Further, tuning cutoffs per person and task raised beat-location accuracy by up to 7.15% and reduced IBI and PRV errors by as much as 35 ms and 145 ms, respectively, relative to the fixed filter. These findings expose a long-overlooked limitation of fixed band-pass filters and highlight the potential of adaptive, signal-specific preprocessing to improve the accuracy and validity of PPG-based mental-health measures.
Workshop
Varun co-organized the 9th Mental Health Sensing and Intervention Workshop. This year the workshop featured two amazing keynotes, 11 paper presentations, and roundtable discussions. The Center for Technology and Behavioral Health (CTBH) sponsored the best paper award!