ACM CHIIR 2024

SESSION: Session 1: Behaviour Analysis & Information Use

Unveiling Health Literacy through Web Search Behavior: A Classification-Based Analysis of User Interactions

More and more people are relying on the Web to find health information. Challenges faced by individuals with low health literacy in the real world likely persist in the virtual realm. To assist these users, our first step is to identify them. This study aims to uncover disparities in the information-seeking behavior of users with varying levels of health literacy. We utilized data gathered from a prior user experiment. Our approach involves a classification scheme encompassing events during web search sessions, spanning the browser, search engine, and web pages. Employing this scheme, we logged interactions from video recordings in the user study and subjected the event logs to descriptive and inferential analyses. Our data analysis unveils distinctive patterns within the low health literacy group. They exhibit a higher frequency of query reformulations with entirely new terms, engage in more left clicks, utilize the browser's backward functionality more frequently, and invest more time in interactions, including increased scrolling on results pages. Conversely, the high health literacy group demonstrates a greater propensity to click on universal results, extract text from URLs more often, and make more clicks with the middle mouse button. These findings offer valuable insights for inferring users' health literacy in a non-intrusive manner. The automatic inference of health literacy can pave the way for personalized services, enhancing accessibility to information and education for individuals with low health literacy, among other benefits.

Why Do Customers Return Products? Using Customer Reviews to Predict Product Return Behaviors

Product returns are an increasing environmental problem, as an estimated 25% of returned products end up as landfill [10]. Returns are expensive for retailers as well, and it is estimated that 15-40% of all online purchases are returned [34]. The problem could be mitigated by identifying issues with a product that are likely to lead to its return, before many have sold. Understanding and predicting return reasons can help identify manufacturing defects, misleading information in the product description or reviews, issues with a seller or shipping company, and customers who are habitual returners. While there has been much work to identify and predict return volume, little attention has been given to the reasons for the return. In this paper, we explore how customer reviews could be used as signals to identify return reasons. We developed a multi-class classifier to predict return reasons, with a fine-tuned BERT-based model to encode customer review text as features. The classifier with customer review text yields an increase of more than 20% average precision over the baseline classifier with no review text. We also show that we can use aggregated review information to predict product return in case the customer returning the product did not write a review. Lastly, we show that reviews can be used to identify nuanced return reasons beyond what the customer indicated.

Uncharted Territory: Understanding Exploratory Search Behaviours in Literature Reviews

In the realm of Information Seeking and Retrieval (ISR), searching the literature for relevant references in the context of academic work, such as theses or publications, is widely recognised as an exploratory search task. This task becomes particularly challenging when searchers lack prior knowledge of the subject matter. To help address the growing need for supporting exploratory search endeavours, ISR researchers have developed exploratory search models and interfaces. However, while much attention has been given to conceptualising exploratory searches, little focus has been placed on understanding the specific approaches and behaviours that searchers employ while conducting literature reviews. This paper aims to bridge this gap by conducting semi-structured interviews with 30 Master’s students at the end of a user study. This paper provides comprehensive definitions for the fundamental exploratory characteristics from the recent conceptual model, pinpoints potential factors that could influence these characteristics, introduces new exploratory dimensions, and extends our comprehension of existing ones in the academic context. It also uncovers a spectrum of approaches used in literature searches, shedding light on how individuals rely on specific paper sections to measure their relevance and highlighting essential facets of knowledge acquisition in the context of literature searches.

The Impact of CHIIR Publications: A Study of Eight Years of CHIIR

Across all scientific fields, there is an increased focus on the impact of scientific research: what academic and societal benefits does it provide? This question has spurred the development of a variety of different approaches to impact assessment, each appropriate in different circumstances. In this paper, we study the academic impact of the CHIIR community through a comprehensive analysis of the work published in the 2016-2023 CHIIR conference series. We collect citation counts, citing documents, and altmetrics scores for all CHIIR publications to determine their academic impact across a variety of different attributes of the CHIIR publications. In addition, we analyze a subset of citation contexts in the papers that have cited CHIIR publications to analyze how they are being used and what that means for their potential impact. Finally, we attempt to identify which properties of CHIIR publications are most predictive of future impact.

SESSION: Session 2: Presentation

Mobile search made easier: An ability-based mobile search prototype for people with dyslexia

Although 1 person in 14 has dyslexia, most search interfaces are designed based on a ‘one-size-fits-all’ approach, creating inequity for neurodiverse searchers. This is also the case for mobile search, which accounts for most Google searches. While existing research has found search typically presents greater challenges for people with dyslexia, no prior work has examined how best to support them when searching on mobile devices. Rather than focus on addressing their search difficulties, we adopted an ability-based design approach. This involved designing a prototype, based on modifications to Google's mobile SERPs, aimed at enhancing their abilities – identified through interviews and observations with mobile searchers with dyslexia. A user evaluation found several of the modifications were useful; they supported searchers with dyslexia in making relevance judgements and boosted their resilience and self-efficacy. This research provides valuable insight into how to better support mobile searchers with dyslexia that can inform IIR research and design. It also demonstrates the potential of ability-based design approaches in supporting neurodiverse searchers.

Improving expert search effectiveness: Comparing ways to rank and present search results

Expert search systems help professionals find colleagues with specific expertise. Expert search results can be presented as a list of documents with their associated experts, or as a list of candidate experts with evidence for their expertise based on documents they authored. The type of result may affect search behaviour, and therefore search task performance. Previous work has not considered such effects from the result presentation, focusing instead on how to rank experts or on ways to interact with the search results.

We compare the task performance of novice users using either a document-centric interface (where each search result is a document and its associated expert) or a candidate-centric interface (where each search result is a candidate expert and their associated documents). We also compare candidate-centric and document-centric ranking functions per interface.

A post-experiment survey indicated that two variables affect which interface participants preferred: the retrieval unit (candidates or documents) and the complexity (number of documents per search result). These variables affected participants’ search strategy, and consequently their task performance. A quantitative analysis revealed that 1) using the candidate-centric interface results in a higher rate of correctly completed tasks, as users evaluate candidates more thoroughly, and 2) the document-centric ranking yields faster task completion. Weak evidence of a statistical interaction effect was found that prevents a straightforward combination of the most effective interface type and the most efficient ranking type. The present work resulted in a more effective, albeit less efficient, search engine for expert search at the municipality of Utrecht.

Visual Keyword/Result Linking: Using Interaction to Dynamically Reveal Relationships

Keywords contain important contextual information about search results within academic digital library search interfaces. However, such information tends to be underutilized within modern search interface designs. In prior work, methods for visually linking keywords between search results have been proposed and studied. In this research, we analyze the design space and propose a new approach that aggregates the keywords over all items on the search engine results page (SERP), visually linking them back to their source search result. We have created interactive and static versions of both interfaces, and conducted a controlled laboratory study to assess the impact of the interfaces on measures of utility (efficiency, effectiveness, feature use) and perceived value (usefulness, ease of use, satisfaction, user engagement, knowledge gain, and interest gain). The findings from this research show the merit of using keywords to provide summaries of documents and search result sets, the value of making keywords interactive, and the benefit of using visualization to interactively link information within a search engine results page. The differences between providing the keywords alongside each document or aggregated over the entire SERP were minimal, suggesting that it does not matter how the keywords are represented as long as they can be used to interactively reveal relationships among the search results.

The Influence of Presentation and Performance on User Satisfaction

Information Retrieval (IR) systems are designed to provide users with a ranked list of results based on their queries. The effectiveness of an IR system is gauged not just by its ability to retrieve relevant results but also by how it presents these results to users; an engaging presentation often correlates with increased user satisfaction. While existing research has delved into the link between user satisfaction, IR performance metrics, and presentation, these aspects have typically been investigated in isolation. Our research aims to bridge this gap by examining the relationship between query performance, presentation and user satisfaction. For our analysis, we conducted a between-subjects experiment comparing the effectiveness of various result card layouts for an ad-hoc news search interface. Drawing data from the TREC WaPo 2018 collection, we centered our study on four specific topics. Within each of these topics, we assessed six distinct queries with varying nDCG values. Our study involved 164 participants who were exposed to one of five distinct layouts containing result cards, such as “title”, “title+image”, or “title+image+summary”. Our findings indicate that while nDCG is a strong predictor of user satisfaction at the query level, there exists no linear relationship between the performance of the query, presentation of results and user satisfaction. However, when considering the total gain on the initial result page, we observed that presentation does play a significant role in user satisfaction (at the query level) for certain layouts with result cards such as title+image or title+image+summary. Our results also suggest that the layout differences have complex and multifaceted impacts on satisfaction. We demonstrate the capacity to equalize user satisfaction levels between queries of varying performance by changing how results are presented. This emphasizes the necessity to harmonize both performance and presentation in IR systems, considering users’ diverse preferences. Ultimately, our insights can steer the evolution of more user-aligned IR systems, underscoring the balance between system performance and result presentation.
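For readers unfamiliar with the nDCG measure mentioned in the abstract above, the following is a minimal Python sketch of the commonly used log2-discounted form. It is an illustration only, not the authors' evaluation code: the function names `dcg` and `ndcg` are hypothetical, and the ideal ranking is assumed to be a reordering of the same list of graded gains, whereas TREC-style evaluation normally derives the ideal from all judged documents for the topic.

```python
import math
from typing import Optional, Sequence


def dcg(gains: Sequence[float], k: Optional[int] = None) -> float:
    """Discounted cumulative gain over the top-k ranks (log2 discount)."""
    top = gains if k is None else gains[:k]
    # Rank 1 has discount log2(2) = 1, rank 2 has log2(3), and so on.
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(top, start=1))


def ndcg(gains: Sequence[float], k: Optional[int] = None) -> float:
    """DCG normalised by the DCG of the same gains sorted best-first.

    Assumption: the ideal ranking is built from the given list only.
    """
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical graded relevance labels for one ranked result list.
    print(ndcg([3, 2, 1]))  # already in ideal order
    print(ndcg([1, 2, 3]))  # worst ordering of the same gains
```

A query whose most relevant results sit at the top scores 1.0; pushing them down the list lowers the score, which is the sense in which the six queries per topic could differ in performance.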

SESSION: Session 3: Recommendation

Trust Through Recommendation in E-commerce

We explore the influence of recommender systems on trust among consumers in the fashion e-commerce domain. Anchoring on the Trust Building Model (TBM) [13], we investigate its adaptability and applicability in the context of interactive communication in recommender systems. Primarily leaning on qualitative data collection methods, namely semi-structured interviews, our work evaluates the classic TBM components – structure assurance, perceived reputation, perceived site quality, perceived web risk, trusting belief, and behavioral intention – affirming their relevance to recommender systems. Furthermore, new components, i.e., perceived service and recommendation quality, previous experience, perceived enjoyment, perceived recommendation authenticity, and intention to share interaction data, were examined in the context of recommender systems. Significantly, our study unveils that trusting beliefs can notably influence TBM’s preliminary behavioral intentions, with the competence belief having the most substantial impact, challenging the conventional TBM findings. The outcomes highlight that consumers place heightened value on the tangible provisions from the company over ethics-based factors like integrity. The proposed refined TBM offers potential in enhancing recommender systems in fashion e-commerce, facilitating a better understanding of consumer behavior and trust dynamics.

The Effect of Simulated Contextual Factors on Recipe Rating and Nutritional Intake Behaviour

Despite the importance of context in Recommender Systems (RSs) more generally, and its clear applicability in the food domain, most existing research focuses on single contextual factors, and only considers simple extrinsic factors such as location and time. No RSs research has systematically explored the impact of multiple dynamic factors, or investigated the effect of emotion in determining people’s eating, recipe rating and nutritional intake behaviour. To bridge these gaps, we conducted a comprehensive large-scale (n=397) crowdsourced experimental study to uncover the intricate relationship between various simulated contextual factors and users’ subsequent recipe rating and implied nutritional intake behaviour. We further aimed to explore how these contextual factors can be incorporated to improve recommendation performance. Four distinct types of contextual factors were investigated: seasonal, emotional, busyness and physical activity, encompassing a total of seven elements. Our findings show that people’s eating preferences and the likelihood of them choosing to eat healthy recipes vary depending on the simulated context they find themselves in. Moreover, we demonstrate how these contextual features can be used to significantly improve recipe rating prediction performance. Our research has implications for the future development of food RSs, and shows that emotion-aware systems could lead to better healthy food recommendations.

The Dark Matter of Serendipity in Recommender Systems

Serendipity has been recognized as a valuable property of recommender systems. While there is a lack of consensus on the precise definition of serendipity, it is often conceptualized in terms of the relevance, novelty and unexpectedness of recommendations. However, the common understanding and original meaning of serendipity is conceptually broader, requiring serendipitous encounters to be neither novel nor unexpected. Recent work has highlighted the various ways in which serendipity can manifest, leading to a more generalized definition of serendipity. In this paper, we conducted an observational study where we collected 2002 survey responses from 397 users of an online article recommender system. In our study, we found a significant proportion of serendipitous recommendations were missed by the conventional definitions used in the recommender systems research literature, exposing the “dark matter” of serendipity that has been overlooked in prior studies. Interestingly, users’ opinions of which articles should be considered serendipitous did not strongly align with any of the definitions investigated. Furthermore, despite several user behaviors being significantly associated with a majority of definitions of serendipity, the overall goodness of fit was very low. Our findings highlight the issues of evaluating serendipity in recommender systems and the challenge of reconciling serendipity with user expectations.

SESSION: Session 4: Engagement

Stopped yet Completed: Exploring the Relationships between Session-stopping Reasons, Information Types, and Cognitive Activities in Cross-Session Searches

The aim of this study is to explore connections among the reasons that lead users to stop a search session, the types of information they found during the session, and the cognitive activities involved in interacting with such information – all in the context of complex tasks involving multiple search sessions. Prior research has primarily examined search-stopping behavior in single search sessions, focusing on instances when users identify needed information or are satisfied with their findings. However, recent insights from the Search as Learning (SAL) community highlight the relationships between user search behavior, the nature of the information they encountered, and various cognitive activities during the information-searching process. We conducted a diary study with 25 participants engaged in real-life tasks that spanned multiple search sessions over time. Our analysis found six predominant reasons users elected to stop a search session during the cross-session search process. We also examined the types of information that participants found and the cognitive activities that they engaged in during search sessions. We found statistically significant associations between the information types found and the session stopping reasons. We did not find significant associations between the cognitive activities and stopping reasons, but did observe that almost half the sessions reached the evaluate level of cognitive activity. Our findings provide insights about search behaviors and help inform the design of tools to support users working on multi-session search tasks.

From Potential to Practice: Intellectual Humility During Search on Debated Topics

An essential characteristic for unbiased and diligent information-seeking that can enable informed opinion formation and decision-making is intellectual humility (IH), the awareness of the limitations of one’s knowledge and opinions. While researchers have recognized the potential to boost IH in individuals, the effect of such interventions on their search behavior, along with the broader significance of IH in the context of web search on debated topics, remains unexplored. In this paper, we present the results of a preregistered user study (N = 299) that we conducted to (1) test the effect of three interventions that boost self-reported IH on opinionated individuals’ search behavior and (2) explore the role of IH in the search process of opinionated individuals more broadly. IH-boosting interventions did not affect search behavior; we attribute this to the high familiarity of the search environment, prompting searchers to default to their usual search behavior. Still, explorations of the role of IH in the search process indicate that IH and IH-related search intentions should be considered as relevant factors in the pursuit of supporting unbiased and diligent search on debated topics. Based on our exploratory findings, we argue that future research should investigate interventions that are more directly integrated into the search process, as well as interventions that combine boosting IH with encouraging searchers to approach the search task in an IH-driven way and promoting transparency for appropriate reliance on the search system and ranking.

A User Study on the Acceptance of Native Advertising in Generative IR

Commercial conversational search engines need a business model. Since advertising is the main source of revenue for “traditional” ten-blue-links web search, ads are not an unlikely option for conversational search either. In traditional web search, ads are usually placed above organic search results. However, large language models (LLMs) may be dynamically prompted to blend product placements with “organic” conversational responses, similar to native advertising in journalism. This type of advertising can be very difficult to recognize, depending on how subtly it is integrated and disclosed. To raise awareness of this potential development, we analyze the capabilities of current LLMs to blend ads with generative search results. In a user study, we ask people about the perceived quality of (emulated) search results in different advertising scenarios. In a substantial number of cases, our survey participants do not notice brand or product placements when they do not expect them. Thus, our results show the potential of LLMs to subtly mix advertising with generated search results. This warrants further investigation, for example, to develop appropriate advertising disclosure rules, and to detect advertising in generated results. Our research also raises broader concerns about whether commercial or open-source generative models can be trusted not to be fine-tuned to generate ads rather than “genuine” responses.

Seeking Socially Responsible Consumers: Exploring the Intention-Search-Behaviour Gap

The increasing prominence of “Socially Responsible Consumers” has brought about a heightened focus on the ethical, environmental, social, and ideological dimensions influencing product purchasing decisions. Despite this emphasis, studies have consistently revealed a significant gap between individuals’ intentions to be socially responsible and their actual purchasing behaviors: they often choose products that do not align with their values. This paper aims to investigate the role of “search” and how it influences this gap. Our investigation involves an online survey of 286 participants, where we inquire about their search behaviors and whether they considered various dimensions, ranging from price and features to environmental, social, and governance issues, in relation to a recent purchase. Contrary to expectations of a clear intention-behavior gap, our findings suggest most participants exhibited indifference or lack of awareness regarding these “responsible” aspects. Meanwhile, those participants who were more ethically minded reported difficulties related to searching for and acquiring information regarding such aspects, which contributed to the gap. Our findings suggest that part of the intention-behaviour gap can be framed as an information seeking problem. Moreover, our findings motivate the development of search systems and platforms that better support consumers in making more informed and responsible purchasing decisions.

Generative Information Systems Are Great If You Can Read

Generative models, especially in information systems like ChatGPT and Bing Chat, have become increasingly integral to our daily lives. Their significance lies in their potential to revolutionize how we access, process, and generate information [44]. However, a gap exists in ensuring these systems are accessible to all, especially considering the literacy challenges faced by a significant portion of the population in (but not limited to) English-speaking countries. This paper aims to investigate the “readability” of generative information systems and their accessibility barriers, particularly for those with literacy challenges. Using popular instruction fine-tuning datasets, we found that this training data could produce systems that generate at a college level, potentially excluding a large demographic. Our research methods involved analyzing the responses of popular Large Language Models (LLMs) and examining potential biases in how they can be trained. The key message is the urgent need for inclusivity in systems incorporating generative models, such as those studied by the Information Retrieval (IR) community. Our findings indicate that current generative systems might not be accessible to individuals with cognitive and literacy challenges, emphasizing the importance of ensuring that advancements in this field benefit everyone. By situating our research within the sphere of information seeking and retrieval, we underscore the essential role of these technologies in augmenting accessibility and efficiency of information access, thereby broadening their reach and enhancing user engagement.
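To make the notion of text "at a college level" concrete, the sketch below estimates a US school grade with the Flesch-Kincaid grade-level formula. This is an assumption for illustration: the abstract above does not say which readability measure the authors used. The helper names are hypothetical, and the syllable counter is a rough vowel-group heuristic, so scores are approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    # Hypothetical sample responses, not taken from the paper.
    simple = "The cat sat. The dog ran."
    dense = "Generative information retrieval systems necessitate considerable comprehension."
    print(flesch_kincaid_grade(simple))
    print(flesch_kincaid_grade(dense))
```

Short sentences of short words score low (even below zero), while long, polysyllabic sentences score well above grade 12, the range usually read as college level.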

SESSION: Session 5: Search

Teachable Facets: A Framework of Interactive Machine Teaching for Information Filtering

Interactive tools help users filter relevant information from massive online sources, like news feeds and online discussion forums, by enabling them to externalize their preferences. However, users’ information goals and preferences are often complex and are comprised of data attributes and a user’s subjective judgements over these attributes. For instance, when filtering news articles based on their newsworthiness, the system must capture both data attributes like recency and shareability of the article, along with the user’s personal and flexible assessment of news sentiment. While most interactive tools enable users to externalize goals that are expressible as true/false statements, they do not support incorporating subjective, loosely structured judgements of data attributes which fulfill complex goals. In this paper, we introduce Teachable Facets (TF), widgets that users can create on the fly to filter relevant information to improve the sense-making of analysts. These teachable widgets employ a Machine Teaching (MT) framework to enable users to formulate personalized filtering criteria for complex, multi-dimensional, loosely indexed, and unstructured data; teach a filtering criterion using representative samples; apply these filters to new data streams; and assess the relevance of outcomes. Through a user study, we evaluate the performance of these filters based on their ability to discover relevant items and the expressibility they offer to the users in teaching criteria. In our discussion, we identify ways this approach might improve future systems and delineate implications should such systems be deployed broadly.

Something Just Like This: A Secret History of the Role of Analogues in Information Seeking

Information seekers often want 'something just like this', an information object (song, book, movie) like one they already know about. This approach is the premise of many modern recommender systems, which ask information seekers for examples of things they like in order to return similar things. This type of information seeking, though, is not particularly well supported by search engines, partly because the ways in which objects can be alike are multifaceted and difficult to articulate, partly because the analogue might emerge in the process of information seeking. This approach to information seeking has been touched on repeatedly in information science literature but not examined in detail. In this paper, we present a pair of studies that explore both how people seek information using analogues online and the ways in which information objects might be alike. Our results offer a novel understanding of a previously underexplored but common behaviour, addressing how analogues are identified, how they are used in information seeking, and the ways in which objects can be analogous from an information seeker's perspective. Our work can support the development of recommender systems, conversational search approaches, and digital interfaces that support this common but little-examined type of information seeking.

Human and Large Language Model Intent Detection in Image-Based Self-Expression of People with Intellectual Disability

Non-verbal communication is essential for the social inclusion of individuals with an intellectual disability, affecting interactions with others as well as technological systems. This study focuses on non-symbolic communication of people with intellectual disability through generic images without specific or detailed subject matter. A key challenge in this medium is discerning the underlying intentions behind images selected as visual prompts for conversation.

Through interviews with people with intellectual disability, we collected a dataset of images and their associated communication intentions, as well as the interpreted intention from potential audiences including humans and systems. Notably, we employed GPT-4, a Large Language Model (LLM), to decipher the images and provide insights from the Conversational Systems’ (CSs) perspective. Adopting a user-centered qualitative approach, we analyzed this data to understand the nuances of image-based self-expression and identify areas of ambiguity. Additionally, we present an analysis of comprehension levels and challenges for intent detection faced by humans and systems.

Our findings suggest that generic images offer a rich medium for individuals to share personal interests and unique experiences, enriching communication. Within this framework, we identified that situational, personal, and historical contexts facilitate understanding intents. In a comparative analysis of human and systems viewpoints, we found that LLMs have encouraging capabilities for detecting various aspects of ambiguity in predicting user intentions. Based on this, we offer design strategies for intent clarification and crafting more inclusive multimodal conversational tools for individuals with intellectual disability. Findings can be extrapolated to enhance image-based information retrieval and recommendation systems.

SESSION: Session 6: Conversation

Towards Self-Contained Answers: Entity-Based Answer Rewriting in Conversational Search

Conversational Information Seeking (CIS) is an emerging paradigm for knowledge acquisition and exploratory search. Traditional web search interfaces enable easy exploration of entities, but this is limited in conversational settings due to the limited-bandwidth interface. This paper explores ways to rewrite answers in CIS, so that users can understand them without having to resort to external services or sources. Specifically, we focus on salient entities: entities that are central to understanding the answer. As our first contribution, we create a dataset of conversations annotated with entities for saliency. Our analysis of the collected data reveals that the majority of answers contain salient entities. As our second contribution, we propose two answer rewriting strategies aimed at improving the overall user experience in CIS. One approach expands answers with inline definitions of salient entities, making the answer self-contained. The other approach complements answers with follow-up questions, offering users the possibility to learn more about specific entities. Results of a crowdsourcing-based study indicate that rewritten answers are clearly preferred over the original ones. We also find that inline definitions tend to be favored over follow-up questions, but this choice is highly subjective, thereby providing a promising future direction for personalization.

[citation needed]: An Examination of Types and Purpose of Evidence Provided in Three Online Discussions on Reddit

In a world where misinformation is abundant, and conspiracy theorists urge others to ‘do their own research’, how do people use evidence in online discussions? What types of evidence do they provide, and for what purpose? Decades of human information interaction research have focused on making it easy to share and discuss information online, and decades of information literacy research have examined how to promote critical thinking and evaluation. However, there is a lack of systematic analyses both of evidence use in online discussions and of the ways community norms affect the use of evidence in those discussions. We present a mixed methods analysis of the use of three formats of external evidence (images, links, and direct quotation by using blockquotes) across three Reddit communities with very different norms. One focuses on promoting conspiracy theories, another on debunking them, and a third on personal view change. We investigate the use of these evidence formats within and between communities to understand how evidence is used in different kinds of conversation. Our findings support the design of online information tools that promote good evidentiary practice.

Restarting the conversation about conversational search: exploring new possibilities for multimodal and collaborative systems with people with intellectual disability

Advanced sensing capabilities are emerging in smart conversational systems. Without support, keyword-based searching can be challenging, requiring expertise, experience and the ability to formulate abstract knowledge of the information-seeking process. However, people with intellectual disability are often comfortable communicating in conversations that allow for both verbal and visual communication. This paper presents an understanding of multimodal and collaborative search systems for people with intellectual disability. First, we present a Wizard of Oz probe, called MMCS, that provides flexibility and customisation to explore different modalities and opportunities for accessibility. We present the outcomes of an ethnographic study conducted in a collaborative setting with twenty participants across four sessions and follow-up interviews. We found that multimodal and conversational interaction can play a crucial role in social support, peer awareness, and personal interests. Finally, we provide design implications and future research directions towards understanding, responding, and reformulating query intent for multimodal and collaborative search systems.

SESSION: Session 7: Learning & Information Acquisition

Design Principles for a Study Planning Assistant in Higher Education

Digital study assistants (DSA) aim to support the challenging tasks of searching for relevant information and organizing this data for individual study planning. Although such systems are characterized by complex process flows, their user experience (UX) has rarely been examined adequately. This research comprehensively analyzes the UX of such a system at the University of Bamberg, which includes short-term planning for one semester as well as the distinctive feature of long-term planning beyond one semester. Via remote usability testing including an online questionnaire and involving 26 participants, this study explores students’ interactions with the system and evaluates the impact of the system design on the UX, identifying strengths and weaknesses. The study has revealed that participants faced major challenges related to complex processes resulting from the lack of functional and terminological differentiation between short- and long-term study planning for users. In addition, certain features, including extended search options, were hidden and could not be found immediately. Derived from these findings, we present nine design principles to guide the development of effective DSA and similar support systems.

Balancing Act: Boosting Strategies for Informed Search on Controversial Topics

In this work, we investigate the efficacy of boost interventions rooted in information literacy principles to enhance user interaction with search results and promote knowledge acquisition on debated topics. We conducted a pre-registered online user study with a between-groups design involving 351 participants who each completed knowledge assessments before and after performing a search task. In total, 9 boost conditions were tested (consisting of 4 search tips × 2 presentations and one control without a boost).

Our findings indicate that the tested boost interventions successfully steer users towards examining a greater number of search results, investing more time in their search, and achieving a more equitable distribution of arguments presented for each side of a debated topic. Nonetheless, in terms of overall knowledge gain, the interventions do not yield a significant difference when compared to the baseline.

The results underscore that boosts could be useful in effectively restricting some of the biases involved when users perform searches on debated topics, and should therefore be tested in more naturalistic settings. However, additional support mechanisms are essential if the goal is to enhance overall knowledge acquisition.

On the Effects of Automatically Generated Adjunct Questions for Search as Learning

Actively engaging learners with learning materials has been shown to be very important in the Search as Learning (SAL) setting. One active reading strategy relies on asking so-called adjunct questions, i.e., manually curated questions geared towards essential concepts of the target material. However, manual question creation is impractical given the vast online content. Recent research has explored the effects of Automatic Question Generation (AQG) on aiding human learning. These studies have primarily focused on user studies in controlled online reading scenarios with limited documents. However, the impacts of adjunct questions on learning in the SAL setting, which involves learning through web searching, are not yet well understood. This paper addresses this gap by conducting a user study with automatically generated adjunct questions integrated into the reading interface built on top of a search system. We conducted a between-subjects user study (N = 144) to investigate the effect of incorporating automatically generated adjunct questions on participants’ learning. We employed three different question generation strategies as well as a control condition: (i) synthesis questions; (ii) factoid questions targeting random text spans; and (iii) factoid questions targeting terms and phrases relevant to the information need at hand. We present four major findings: (i) participants who received adjunct questions exhibited significantly more fine-grained reading behaviour, such as longer document dwell time and more scrolls, than those without adjunct questions. However, adjunct questions’ influence on learning outcomes depends on the AQG strategy. (ii) Question types significantly influence participants’ reading behaviour. (iii) The adjunct questions’ target spans significantly influence learning outcomes. Lastly, (iv) participants’ prior knowledge levels affect adjunct questions’ effects on their learning outcomes and their reaction to different AQG strategies.
Our findings have significant design implications for learning-oriented search systems. The data and code are available at

The Effects of Goal-setting on Learning Outcomes and Self-Regulated Learning Processes

We present a user study (N = 40) that investigated the role of goal-setting on learning during search. To this end, we developed a tool called the Subgoal Manager (SM). The SM was designed to help searchers break apart a learning-oriented search task into smaller subgoals. The tool enabled participants to add, delete, and modify subgoals; take notes with respect to subgoals; and mark subgoals as completed. During the study, participants completed a single learning-oriented search task and were assigned to one of two subgoal conditions. In the Subgoals condition, participants had access to the SM; were instructed to develop at least three subgoals before the search session; and could add, delete, and modify subgoals during the search session. In the NoSubgoals condition, participants were not instructed to set subgoals and were simply provided with a text editor to take notes. We investigate the effects of the subgoal condition on: (RQ1) learning and retention and (RQ2) the extent to which participants engaged in specific self-regulated learning (SRL) processes during the search session. Our results reveal two important trends. First, participants in the Subgoals condition had better learning outcomes, especially with respect to retention. Second, based on a qualitative analysis of participants’ search sessions, participants in the Subgoals condition engaged in more SRL processes. Combined, our results suggest that goal-setting improves learning during search by encouraging and supporting greater engagement with SRL processes.

SESSION: Session 8: Information You Can See

Comparing Traditional and LLM-based Search for Image Geolocation

Web search engines have long served as indispensable tools for information retrieval; user behavior and query formulation strategies have been well studied. The introduction of search engines powered by large language models (LLMs) suggested more conversational search and new types of query strategies. In this paper, we compare traditional and LLM-based search for the task of image geolocation, i.e., determining the location where an image was captured. Our work examines user interactions, with a particular focus on query formulation strategies. In our study, 60 participants were assigned either traditional or LLM-based search engines as assistants for geolocation. Participants using traditional search more accurately predicted the location of the image compared to those using the LLM-based search. Distinct strategies emerged between users depending on the type of assistant. Participants using the LLM-based search issued longer, more natural language queries, but had shorter search sessions. When reformulating their search queries, traditional search participants tended to add more terms to their initial queries, whereas participants using the LLM-based search consistently rephrased their initial queries.

Exploring the Impact of Verbal-Imagery Cognitive Style on Web Search Behaviour and Mental Workload

Cognitive style has been shown to influence users’ interaction with search interfaces. However, as a fundamental dimension of cognitive styles, the relationship between the Verbal-Imagery (VI) cognitive style dimension and search behaviour has not been studied thoroughly, and it is not clear whether VI cognitive style can be used to inform search user interface design. We present a study (N=29), investigating how search behaviour and mental workload (MWL) changes relate to VI cognitive styles by examining participants’ search behaviour across three increasingly complex tasks. MWL was subjectively rated by participants, and blood oxygenation changes in the prefrontal cortex were measured using functional near-infrared spectroscopy (fNIRS).

Our results revealed a significant difference between verbalisers and imagers in search behaviour. In particular, verbalisers preferred a Sporadic navigation style and adopted the Scanning strategy as they processed information, according to their viewing and bookmarking patterns, whereas imagers preferred the Structured navigation style and reading information in detail. The fNIRS data showed that verbalisers had significantly higher blood oxygenation in the prefrontal cortex when using the same search interface, suggesting a higher MWL than imagers. When based on task complexity bias, the search time significantly increased as task complexity increased, but there were no significant differences in search behaviours. Our study indicated that VI cognitive styles have a noticeable and stronger impact on users’ searching behaviour and their MWL when interacting with the same interface than task complexity, which can be considered further in future search behaviour studies and search user interface design.

Keepin' it Reel: Investigating how Short Videos on TikTok and Instagram Reels Influence View Change

Novel short video platforms such as TikTok and Instagram Reels often entertain, can inform and may persuade. Recent human-information interaction research has demonstrated the potential for information encounters on social media to sow the seeds of view change. However, little research has examined the role of this new type of social media platform in view change. To examine this role, we conducted a two-week diary study, followed by interviews, with 12 regular users of TikTok and Instagram Reels. All participants reported viewing videos that influenced their views. They predominantly passively encountered these videos on their personalized feeds, rather than actively seeking them. Content verification was limited, with many participants voicing (potentially misplaced) trust in influencers and accessible experts. Reassuringly though, some participants demonstrated a higher level of critical engagement. Overall, our findings highlight the strong persuasive power of short video platforms and the risk they may be used to misinform or manipulate. Based on our findings, we discuss key implications for research and platform design.

SESSION: Short Papers

Decoding Distress: How Search Engine Data Reveals Socioeconomic Disparities in Mental Health

Technology-based tools for mental health support, powered by certain algorithms, often become the first point of contact for people looking for mental health resources, especially in underserved communities. The different ways people understand and talk about their mental distress significantly affect the resources these tools recommend or exclude. Using the Google search symptom dataset, we looked into how people from socioeconomically different counties in Alabama search for mental health symptoms. We found that individuals in better-off areas mostly conducted searches using specific clinical terms, while those in less advantaged areas mostly used general mental health symptom terms. A similar trend was seen for physical symptoms, with people in disadvantaged areas often using general and pain-related terms, possibly indicating a lack of access to specialized care or wrong clinical treatments and raising concerns about opioid misuse. Also, counties with mainly African American populations had fewer mental health-related searches, suggesting there might be cultural or linguistic barriers. We discuss what these findings mean for designing algorithms focusing on health fairness. This study helps suggest how search engine algorithm designers can be aware of the societal factors affecting how people express distress and adjust search engine algorithms and interfaces to help reduce gaps in healthcare access.

Enabling Exploratory Browsing using Dynamic Search Result Tagging, Highlighting, and Filtering

In academic digital libraries, searchers commonly engage in exploratory search when faced with complex search tasks. An important part of exploratory search is exploratory browsing, where the focus is on search activities associated with discovery, learning, and investigation. However, these critical aspects of exploratory browsing are often not adequately supported by existing digital library search systems. In particular, they are hindered by the inability of searchers to add further information to inform their exploratory browsing style of searching. We address this issue by providing two new features: dynamic tagging of search results and an interactive workspace that allows the searcher to highlight and filter the search results using these tags. We have evaluated this approach compared to a baseline search system in a 32-participant user study. Increases in typical subjective measures were found, along with increases in perceived motivation and ability. Further, the documents saved as part of the exploratory browsing process were of higher precision when using this approach. These results show the value of providing searchers with interactive features that enable an exploratory browsing style of searching, beyond simply entering a query and selecting/saving search results.

Enhancing Human Annotation: Leveraging Large Language Models and Efficient Batch Processing

Large language models (LLMs) are capable of assessing document and query characteristics, including relevance, and are now being used for a variety of classification and labeling tasks as well. This study explores how to use LLMs to classify an information need, often represented as a user query. In particular, our goal is to classify the cognitive complexity of the search task for a given “backstory”. Using 180 TREC topics and backstories, we show that GPT-based LLMs agree with human experts as much as human experts agree with one another. We also show that batching and ordering can significantly impact the accuracy of GPT-3.5, but rarely alter the quality of GPT-4 predictions. This study provides insights into the efficacy of large language models for annotation tasks normally completed by humans, and offers recommendations for other similar applications.

Modeling Activity-Driven Music Listening with PACE

While the topic of listening context is widely studied in the literature of music recommender systems, the integration of regular user behavior is often omitted. In this paper, we propose PACE (PAttern-based user Consumption Embedding), a framework for building user embeddings that takes advantage of periodic listening behaviors. PACE leverages users’ multichannel time-series consumption patterns to build understandable user vectors. We believe the embeddings learned with PACE unveil much about the repetitive nature of user listening dynamics. By applying this framework on long-term user histories, we evaluate the embeddings through a predictive task of activities performed while listening to music. The validation task’s interest is two-fold: while it shows the relevance of our approach, it also offers an insightful way of understanding users’ musical consumption habits.

Product Query Recommendation for Enriching Suggested Q&As

To help customers who are still in the exploration phase, Web search engines and e-commerce websites often provide relevant Q&As in widgets, such as ‘People Also Ask’ and ‘Customers Also Ask Alexa’, with additional information. In this work, we propose to enrich this customer experience by rendering related products under each Q&A based on an automated online query recommendation. We define the tenets of high-quality query recommendations and explain why this challenge differs from existing query re-writing, query expansion and keyphrase generation methods. We describe a data collection method which uses customer co-click information on a proprietary website in order to successfully guide our model into generating query recommendations that satisfy all tenets. Offline and online evaluation results demonstrate that our proposed approach generates superior query recommendations and brings much more customer engagement over strong baselines.

Product Spam on YouTube: A Case Study

YouTube videos are a popular medium for online product reviews. They are not only informative and entertaining, but may also be perceived as quite credible under the viewer’s impression of a personal product demonstration by an expert. As the world’s largest online video platform, YouTube’s content is included prominently in the results of most general-purpose web search engines. Consequently, online marketeers are also using classic Search Engine Optimization (SEO) techniques to place their video content in search engines. Over the years, we have noticed an ever-increasing noise floor of low-quality SEO content in product search results, and in this study, we show that this trend has spilled over into videos as well. We examine YouTube video reviews for several thousand products retrieved from three commercial search engines and conduct spam detection experiments based directly on the videos’ subtitle transcripts rather than relying on metadata and comments. We find that at least a third of the retrieved videos can be regarded as spam or low-quality productions. We are further able to distinguish these spam product reviews accurately from higher-quality videos with a semi-supervised n-gram classification approach.

RULKKG: Estimating User’s Knowledge Gain in Search-as-Learning Using Knowledge Graphs

In the context of search as learning, users engage in search sessions to fill their information gaps and achieve their learning goals. Tracking the user’s state of knowledge is therefore essential for estimating how close they are to achieving these learning goals. In this respect, we extend a recently proposed approach that uses the recognition of entities present in the text to track the user’s knowledge. Our approach introduces a more complete representation by considering both the entities and their relations. More precisely, we represent both the user’s knowledge and the user’s learning goals (or target knowledge) as knowledge graphs.

We show that the proposed representation captures a complementary aspect of knowledge, thus helping to improve the user knowledge gain estimation when used in combination with other representations.

Task Supportive and Personalized Human-Large Language Model Interaction: A User Study

Large language model (LLM) applications, such as ChatGPT, are a powerful tool for online information-seeking (IS) and problem-solving tasks. However, users still face challenges initializing and refining prompts, and their cognitive barriers and biased perceptions further impede task completion. These issues reflect broader challenges identified within the fields of IS and interactive information retrieval (IIR). To address these, our approach integrates task context and user perceptions into human-ChatGPT interactions through prompt engineering. We developed a ChatGPT-like platform integrated with supportive functions, including perception articulation, prompt suggestion, and conversation explanation. Findings from our user study demonstrate that the supportive functions help users manage expectations, reduce cognitive loads, better refine prompts, and increase user engagement. This research enhances our comprehension of designing proactive and user-centric systems with LLMs. It offers insights into evaluating human-LLM interactions and emphasizes potential challenges for underserved users.

SESSION: Demonstrations & Resource Papers

Assisted Knowledge Graph Authoring: Human-Supervised Knowledge Graph Construction from Natural Language

Encyclopedic knowledge graphs, such as Wikidata, host an extensive repository of millions of knowledge statements. However, domain-specific knowledge from fields such as history, physics, or medicine is significantly underrepresented in those graphs. Although a few domain-specific knowledge graphs exist (e.g., Pubmed for medicine), developing specialized retrieval applications for many domains still requires constructing knowledge graphs from scratch. To facilitate knowledge graph construction, we introduce WAKA: a Web application that allows domain experts to create knowledge graphs through the medium with which they are most familiar: natural language.

FrameFinder: Explorative Multi-Perspective Framing Extraction from News Headlines

Revealing the framing of news articles is an important yet neglected task in information seeking and retrieval. In this work, we present FrameFinder, an open tool for extracting and analyzing frames in textual data. FrameFinder visually represents the frames of text from three perspectives, i.e., (i) frame labels, (ii) frame dimensions, and (iii) frame structure. By analyzing the well-established gun violence frame corpus, we demonstrate the merits of our proposed solution to support social science research and call for subsequent integration into information interactions.

From Chat to Publication Management: Organizing your related work using BibSonomy & LLMs

The ever-growing corpus of scientific literature presents significant challenges for researchers with respect to discovery, management, and annotation of relevant publications. Traditional platforms like Semantic Scholar, BibSonomy, and Zotero offer tools for literature management, but largely require laborious and error-prone manual input of tags and metadata. Here, we introduce a novel retrieval-augmented generation system that leverages chat-based large language models (LLMs) to streamline and enhance the process of publication management. It provides a unified chat-based interface, enabling intuitive interactions with various backends, including Semantic Scholar, BibSonomy, and the Zotero Webscraper. It supports two main use-cases: (1) Explorative Search & Retrieval - leveraging LLMs to search for and retrieve both specific and general scientific publications, while addressing the challenges of content hallucination and data obsolescence; and (2) Cataloguing & Management - aiding in the organization of personal publication libraries, in this case BibSonomy, by automating the addition of metadata and tags, while facilitating manual edits and updates. We compare our system to different LLMs in three different settings, including a user study, and show its advantages on different metrics.

JayBot -- Aiding University Students and Admission with an LLM-based Chatbot

This demo paper presents JayBot, an LLM-based chatbot system aimed at enhancing the user experience of prospective and current students, faculty, and staff at a UK university. The objective of JayBot is to provide information to users on general enquiries regarding course modules, duration, fees, entry requirements, lecturers, internship, career paths, course employability and other related aspects. Leveraging the use cases of generative artificial intelligence (AI), the chatbot application was built using OpenAI’s advanced large language model (GPT-3.5 turbo); to tackle issues such as hallucination as well as focus and timeliness of results, an embedding transformer model has been combined with a vector database and vector search. Prompt engineering techniques were employed to enhance the chatbot’s response abilities. Preliminary user studies indicate JayBot’s effectiveness and efficiency. The demo will showcase JayBot in a university admission use case and discuss further application scenarios.

Mixed Reality Interaction Enhanced by Whiteboard for Product Search

In the modern era of digitization, information retrieval has become an indispensable part of daily life. In the future, with further advances in digital technology, information retrieval is expected to become even more diverse and complex. Therefore, we propose an information organization method using a whiteboard and sticky notes in combination with Mixed Reality (MR) devices. We use this method to organize virtual objects displaying information about searched products on a real whiteboard and decide on the items to purchase. With this proposed approach, users can interact with search results in a spatial manner, affixing them to the whiteboard as sticky notes and adding handwritten notes to virtual objects, offering a more intuitive and efficient way to decide on the purchase of products.

Walert: Putting Conversational Information Seeking Knowledge into Action by Building and Evaluating a Large Language Model-Powered Chatbot

Creating and deploying customized applications is crucial for operational success and enriching user experiences in the rapidly evolving modern business world. A prominent facet of modern user experiences is the integration of chatbots or voice assistants. The rapid evolution of Large Language Models (LLMs) has provided a powerful tool to build conversational applications. We present Walert, a customized LLM-based conversational agent able to answer frequently asked questions about computer science degrees and programs at RMIT University. Our demo aims to showcase how conversational information-seeking researchers can effectively communicate the benefits of using best practices to stakeholders interested in developing and deploying LLM-based chatbots. These practices are well-known in our community but often overlooked by practitioners who may not have access to this knowledge. The methodology and resources used in this demo serve as a bridge to facilitate knowledge transfer from experts, address industry professionals’ practical needs, and foster a collaborative environment. The data and code of the demo are available at

QookA: A Cooking Question Answering Dataset

Conversational agents have become increasingly integrated into our daily lives, including assisting with cooking-related tasks. To address these issues and supplement other datasets, we introduce QookA—a unique dataset featuring spoken queries, associated information needs, and answers rooted in cooking recipes. QookA overcomes shortcomings in existing datasets, laying the foundation for more effective conversational agents tailored to cooking tasks. This paper outlines the dataset construction process, analyzes the data, and explores research applications, providing a valuable resource to enhance conversational agents in the cooking domain.

"You tell me": A Dataset of GPT-4-Based Behaviour Change Support Conversations

Conversational agents are increasingly used to address emotional needs on top of information needs. One use case of increasing interest is counselling-style mental health and behaviour change interventions, with large language model (LLM)-based approaches becoming more popular. Research in this context so far has been largely system-focused, foregoing the aspect of user behaviour and the impact this can have on LLM-generated texts. To address this issue, we share a dataset containing text-based user interactions related to behaviour change with two GPT-4-based conversational agents collected in a preregistered user study. This dataset includes conversation data, user language analysis, perception measures, and user feedback for LLM-generated turns, and can offer valuable insights to inform the design of such systems based on real interactions.

SiTunes: A Situational Music Recommendation Dataset with Physiological and Psychological Signals

With an increasing number of music tracks available online, music recommender systems have become popular and ubiquitous. Previous research indicates that people’s preferences, especially in music, dynamically change with various factors, such as surrounding situations and emotional status. However, few existing public recommendation datasets contain such situation or emotion information. Therefore, we constructed SiTunes, a situational music recommendation dataset with rich physiological and psychological signals. We collected the data through a three-stage user study: (1) recording users’ inherent music preferences in a lab setting (Stage 1); (2) recording physiological and environmental situations with smart wristband devices in users’ daily life, while users provided psychological and rating feedback for music recommended by traditional recommenders (Stage 2); and (3) doing the same for music recommended by situation-aware recommenders (Stage 3). The experiments were conducted with strict privacy concerns and ethical approval. The dataset contains over 2000 listening logs from 30 users on over 300 music tracks. SiTunes serves as a valuable resource for future studies on situational recommenders and user understanding in recommendation. The dataset is available at

SESSION: Tutorials

NORMalize: A Tutorial on the Normative Design and Evaluation of Information Access Systems

Information access systems, such as Google News or YouTube, increasingly employ algorithms to rank diverse content such as music, recipes, and news articles. Acknowledging the influential role of these algorithms as gatekeepers to online content, the research community is increasingly exploring ‘beyond-accuracy’ metrics. However, deciding what norms and values are relevant and should be prioritized when designing and evaluating information access systems is a challenging task. This tutorial aims to cultivate normative thinking and decision-making in the design and evaluation of information access systems. The tutorial comprises two key components. The first part involves a lecture on the foundational principles of normative thinking, emphasizing the importance of reflecting on the desired state of a system rather than its current state. The second part is an interactive session where participants engage in group discussions, applying normative thinking to a specific use case. Participants analyze the system’s usage, stakeholders, and relevant norms and values and address potential conflicts between stakeholders and/or values. Through a point-allocation exercise, participants represent stakeholders and advocate for specific values, fostering a deeper understanding of normative decision-making in the context of information access systems.

Qualitative Research in Information Interaction: Data Gathering

Qualitative research is essential for understanding how people interact with information and for informing the design of interactive information retrieval systems. However, it is often challenging to do well. There is a plethora of possible approaches, often complementary but sometimes at odds with each other, and combining quantitative and qualitative approaches to data collection adds further complexity. In this tutorial, focusing on qualitative data collection approaches, we will provide attendees with a comprehensive toolkit of CHIIR-relevant qualitative methods, including interviews and both naturalistic and ‘in the wild’ observations. We will also support development of an agile skillset so attendees can adopt or adapt qualitative approaches to meet their research aims. We will cover a core of flexible, proven and rigorous practice that will provide a thorough grounding in how to plan and conduct high-quality research. The tutorial leaders come from different backgrounds and can provide insight into how to use qualitative methods for attendees from all disciplinary origins. On completion of the tutorial, you will be able to plan and conduct rigorous qualitative investigations into how people interact with information, with potential to result in high-impact information interaction research.

Search under Uncertainty: Cognitive Biases and Heuristics - Tutorial on Modeling Search Interaction using Behavioral Economics

Modeling how people interact with search interfaces is core to the field of Interactive Information Retrieval. While various models have been proposed, ranging from conceptual (e.g., Belkin’s ASK[12], Berry picking[11], Everyday-life information seeking, etc.) to theoretical (e.g., Information foraging theory[50], Economic theory[4], etc.), more recently there has been a body of work exploring how people’s biases and the heuristics that they use influence how they search. This has led to the development of new models of the search process drawing upon Behavioural Economics and Psychology. This half-day tutorial will provide a starting point for researchers seeking to learn more about information searching under uncertainty. The tutorial will be structured into two parts. First, we will provide an introduction to the biases and heuristics program put forward by Tversky and Kahneman [59], which assumes that people are not always rational. The second part of the tutorial will provide an overview of the types and space of biases in search [6, 42], before doing a deep dive into several specific examples and the impact of biases on different types of decisions (e.g., health/medical, financial etc.). The tutorial will wrap up with a discussion of some of the practical implications for how we can better design and evaluate IR systems in the light of cognitive biases.

The concept of information need and its operationalization in CHIIR research

One of the most utilized concepts in CHIIR research is ‘information need’, which has been used as part of research designs in both qualitative and quantitative studies, either as an integral element of experiments and evaluations or as an explanatory aspect of information searching. Its definitions and operationalizations are likewise many, and they often remain vague, making it hard to compare findings credibly. This tutorial will help participants to use the concept more precisely and appropriately in their own research. The tutorial consists of talks by experienced researchers combined with small group and plenary sessions.

SESSION: Workshops

The Eighth Workshop on Search-Oriented Conversational Artificial Intelligence (SCAI’24)

With the emergence of voice assistants and large language models, conversational interaction with information has become part of everyday life. The eighth edition of the search-oriented conversational AI (SCAI) workshop brings together practitioners and researchers from various disciplines to discuss challenges and advances in conversational search systems. This year’s edition focuses on evaluations beyond relevance and accuracy and looks at conversational search from the user’s perspective. The workshop features a shared task on user-centered evaluation datasets and metrics, challenging participants to develop new and innovative ways to evaluate conversational search systems while accounting for the needs and preferences of users.

UnExplored FrontCHIIRs: A Workshop Exploring Future Directions for Information Access

With the rise and growing prevalence of generative models, particularly multi-modal ones, it is an opportune time to explore beyond existing interactive information retrieval research trends. Indeed, it is essential to determine new avenues to explore how users interact with these models as well as revisit existing avenues that can be embellished with new technology. In this session, we aim to create a venue to workshop ideas that explore the future of search experiences and user interactions with information in a collaborative, low-pressure environment. This UnExplored FrontCHIIRs workshop enables participants to form a sub-community within CHIIR to facilitate further development of the proposed ideas and allow deeper collaborative problem-solving than just presenting late-breaking work.

PIM 2024: The Information We Need, When We Need It…: As We Get Ever Closer, Is this Ideal Still Ideal?

An oft-repeated ideal of personal information management (PIM) is to have “the right information, at the right time, in the right place…” for the current need. But the technologies and innovations that bring us ever closer to this ideal carry costs as well as benefits. In this ninth in a series of PIM workshops, we give closer, critical consideration to the “right time, right place” ideal of PIM. Can we manage the potential downsides involved in achieving this ideal, while preserving its obvious benefits? Or should we revise our ideal of PIM?

SESSION: Doctoral Consortium

A Proactive System for Supporting Users in Interactions with Large Language Models

With the advancement of Large Language Models (LLMs) and the prevalent application of ChatGPT, there is significant interest in maximizing productivity and user experience through proactive systems. Current proactive conversational systems mostly concentrate on user preference in recommendation scenarios, but overlook critical user perceptions, which impact their experience and task completion. Addressing this gap, the study proposes a novel framework integrating user perceptions into LLM interactions to support user tasks and improve learning outcomes. This framework includes two approaches: a user interface design dedicated to streamlining LLM interactions by mitigating complexities in the interaction with mainstream LLM systems like ChatGPT, and an adaptation of reinforcement learning from human feedback (RLHF) to incorporate user perceptions, enhancing personalization and effectiveness of LLM learning paths. The project’s significance extends beyond user engagement, promising broader societal impacts.

Conversational Bibliographic Search

Finding experts, publications, and topics is a daily task not only for every scientist and student but also for journalists and people who search for sources when consuming information. To support this process, we aim to develop a conversational search engine with which it is possible to search for experts interactively and to explore interesting publications and topics where existing tools reach their limits. An important aspect of the search is that the search query is formulated in such a way that it leads to the desired result. However, formulating a query as a user and understanding a query as a system are both challenging tasks. For example, when a query is formulated too unspecifically, the search results might not entirely cover the information need, in which case small additional pieces of information can help immensely. Current systems do little to accurately understand the user’s search intent and offer little support during the search process. Thus, we designed an interactive search engine which runs in a chat window, so that the query can be specified over several turns until the desired search results are obtained. The search engine initiates the conversation by asking the user what they want to search for. The user answers in natural language or can choose adequate answers suggested by the system. The conversation continues until the user has fulfilled their search need or wants to start the conversation from the beginning in order to perform a new search.

Educational Resource Search in Scottish Schools

This project investigates the needs and challenges of school teachers in Scotland involved in finding, using, and sharing educational resources online. The first exploratory stage comprises interviews with primary and secondary school teachers, teacher trainees, and other school staff to define the processes involved and how these processes are situated in teachers’ work context. The second stage is a review of existing tools that facilitate these tasks. The third and final stage consists of user-centred iterative design and prototype evaluation studies, experimenting with potential improvements to online resource discoverability.

Identifying textual disinformation using Large Language Models

The spread of disinformation is becoming a more acute challenge in modern society. The rise of AI technologies is providing it with an additional boost, making disinformation creation and propagation available to almost anyone. This change in the disinformation landscape calls for improved debunking and detection techniques. Plain fact-checking solutions might not be sufficient, since disinformative articles often consist of manipulated versions of correct information. Large Language Models (LLMs) can be employed to identify emotions, stances, and motives behind the text. This research aims to find how those abilities of LLMs can be used for disinformation detection with an accuracy comparable to that of a debunking expert. Additionally, the question of multilingual detection with LLMs will be addressed, since different languages might require different approaches in LLM training and tuning. Based on these results, a semi-automated disinformation labeling system is to be built.

Supporting Neuroscience Literature Exploration by Utilising Indirect Relations between Topics in Augmented Reality

Neuroscientists need to analyse a large number of publications to identify potentially fruitful experiments. This task is necessary before undertaking any costly practical experiments. Exploring direct relations between topics (rather than publications), such as brain regions and brain diseases, has been shown to help neuroscientists identify fruitful experiments. In previous studies, users were able to query and visualise direct relations between topics using DatAR, an Augmented Reality prototype. Neuroscientist participants suggested that identifying previously unknown, or indirect, relations between topics could provide additional information for identifying fruitful experiments. I follow a user-centred design approach: defining functional requirements for finding indirect relations, designing interactive AR visualisations for the specified functionalities, and engaging neuroscientists in evaluating the usefulness of finding indirect relations. Neuroscientists who participated in my initial study of finding indirect relations pointed out the potential of current indirect relations by demonstrating how indirect relations in the past may have evolved into present direct relations. This suggestion informs Study 2 on exploring publication-date-dependent direct and indirect relations. Participating neuroscientists also suggested providing specific intermediate topics, such as genes, when indicating indirect relations between topics. This proposal informs Study 3 on identifying specific intermediate topics and publications indicating indirect relations. My final study will assess the usefulness of the designed DatAR in neuroscientists’ daily research work for identifying potentially fruitful experiments.

Visualization-Enhanced Aggregated Search Interfaces

Search interfaces serve as the primary gateway to information retrieval (IR) platforms, with each IR platform possessing its unique interface. Yet, the challenge remains in delivering diverse content effectively to users, particularly when dealing with unfamiliar sources or content. Instead of presenting search results from different sources in individual tabs, which can lead to user confusion and over-reliance on certain tabs, this paper investigates an approach to aggregate search results into a single, unified display, providing users with a consolidated list. Our research emphasizes novel presentation methodologies that combine insights from previous studies with advanced visualization techniques. The aim is to offer an intuitive and streamlined search experience for all users. Key research questions address: the design of interfaces to blend aggregated results while visually indicating their provenance; the advantages of such interfaces; the impact of search result diversity on perceived trustworthiness; and the applicability of the approach in structured data domains, specifically digital humanities archives and digital academic libraries. Our structured, iterative research methodology encompasses a three-phased approach: starting with low-fidelity prototyping, moving to medium-fidelity design iteration, and currently working towards functional prototype development. An upcoming controlled laboratory study, complemented by data collection tools like LogUI and eye-tracking, aims to gain a comprehensive understanding of user interaction and attention patterns and evaluate the proposed designs, providing insights into their effectiveness and the implications of visual result provenance.
