A critical review of consumer health search engine functionalities for patients with chronic health conditions
Event Type: Poster Presentation
Time: Thursday, April 15, 2:00pm - 2:01pm EDT
Location: Digital Health

Patients with chronic conditions (PwCC) increasingly search for information online, using various search engines, to self-manage their health. However, studies show that patients face many challenges during online search. In this work, we identify the key functionalities of different search engines (focusing on algorithms) that support specific patient health information search behaviors (HISB) and health needs; critique their advantages and disadvantages for patient health management; and identify future research directions in consumer health information (CHI) search engines (CHISEs) for PwCC. Recently, there have been many efforts to apply machine learning (ML) and natural language processing (NLP) algorithms to make the current generation of search engines more intelligent. We therefore focus on state-of-the-art ML and NLP algorithms in search engines, some of which have already been used in existing CHISEs and some of which have not.


We created search queries to find relevant papers published up to 2020. To find relevant literature on CHISEs, we used the following query: “consumer health” AND (“search engine” OR “search interface”). To find literature on PwCC HISB and needs, we used the following query: “chronic health” AND “information needs”. The searches were performed on the following databases: (1) Google Scholar, (2) ACM Digital Library, and (3) PubMed. The first query returned 5570, 116, and 280 results, and the second query returned 4650, 22, and 9 results, from the three databases respectively. We selected the 29 most relevant and representative articles by scanning titles and abstracts, reading full papers, and further exploring citations and related articles.
Next, we categorized the search functionalities of existing CHISEs according to the HISB and needs of PwCC found in the literature and identified the state-of-the-art ML/NLP algorithms and Interactive Information Retrieval (IIR) models that could address those needs.


Below, we discuss our findings according to the HISB and needs of PwCC. We summarize the search engine functionalities that have been implemented in existing CHISEs, such as query formulation support, scatter/gather search, and health cards, and recommend ML/NLP algorithms and IIR models that should be implemented in future CHISEs, such as context-based personalization, text simplification, reliability assessment, stance classification, and collaborative search.

1) Unsure of information need: Patients might face difficulties during Query Formulation due to a vocabulary gap. Existing algorithms in CHISEs provide features such as Query Reformulation, Autocompletion, and Expansion to suggest suitable queries, but they can sometimes distract patients. Another strategy is to use Faceted Search or Scatter/Gather search engines that categorize results into one or more pre-defined (e.g., based on taxonomies), or automatically generated (e.g. by clustering results) topics. This helps patients discover multiple aspects of a health information need without explicitly formulating focused queries.
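
As a rough illustration of the faceted grouping described in item 1, the following minimal Python sketch assigns result titles to pre-defined facets by keyword matching. The `FACETS` taxonomy, the `group_by_facet` helper, and the sample titles are hypothetical and chosen only for illustration; a real CHISE would more plausibly rely on a medical taxonomy or on clustering of the result text.

```python
from collections import defaultdict

# Hypothetical facet taxonomy: facet name -> trigger keywords (illustrative only).
FACETS = {
    "Symptoms": {"symptom", "symptoms", "pain", "fatigue"},
    "Treatment": {"treatment", "medication", "insulin", "therapy"},
    "Diet": {"diet", "nutrition", "food", "carbohydrate"},
}


def group_by_facet(results):
    """Assign each result title to every facet whose keywords it mentions."""
    groups = defaultdict(list)
    for title in results:
        words = set(title.lower().replace(",", " ").split())
        matched = [name for name, keys in FACETS.items() if words & keys]
        # Titles that match no facet are collected under "Other".
        for name in matched or ["Other"]:
            groups[name].append(title)
    return dict(groups)


# Made-up result titles for a diabetes-related search.
results = [
    "Early symptoms of type 2 diabetes",
    "Insulin therapy explained",
    "Diet and medication for diabetes",
    "Living with a chronic condition",
]
facets = group_by_facet(results)
for name, titles in facets.items():
    print(name, "->", titles)
```

Here a result can appear under several facets, which mirrors the idea that one page may cover multiple aspects of a health information need.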

2) Context governs information needs: Several contextual factors (e.g., demographic, situational) impact the type of information sought by patients. CHISEs provide filters (e.g., age) to manually refine the search results. Algorithms for personalization of search results based on readability alone have been developed but not yet deployed in CHISEs. Personalization based on all contextual factors could be implemented in future CHISEs.

3) Navigating through large volumes of search results: Traditional CHISEs typically display search results as pages of lists of articles. Patients feel overwhelmed while evaluating relevant results from long lists and feel lost after navigating for a long time. Faceted Search or Scatter/Gather CHISEs support the organization of the results but may increase cognitive load. Recently, Health Cards containing summarized information about the search results have been shown to support efficient search with less effort during well-defined health search tasks. Their benefits for user-formulated queries remain to be studied. CHI question-answering systems that extract answers from reliable sources, as well as voice-based and conversational search, are also promising directions that are currently under-explored in CHISEs.

4) Understanding search results: To help understand search results, some CHISEs use readability scores (e.g., Flesch Reading Ease) to indicate the difficulty level of a webpage or re-rank the search results. Since images/videos usually aid in comprehension, some CHISEs allow users to tune the webpages shown based on the number of images/videos. Another technique is to use understandability, estimated using machine learning (e.g., Extreme Gradient Boosting), as a feature in machine learning-based ranking of search results, but it is not currently incorporated in CHISEs. One major disadvantage of all these methods is that they simply prioritize more readable sources and do not assist with understanding complex sources. For such assistance, Text Simplification could be a promising research area. It simplifies text by adding definitions of jargon terms, splitting long sentences, etc. Although it has been explored for simplifying medical text with reasonable accuracy (e.g., using sequence-to-sequence neural networks), future research should integrate and evaluate it for CHI search.
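
To make the readability-based re-ranking in item 4 concrete, here is a minimal Python sketch. It uses the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words), where higher scores mean easier text. The syllable counter is a crude vowel-group heuristic, and `rerank_by_readability` and the sample pages are hypothetical; this is not the method of any particular CHISE.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Flesch Reading Ease score; higher values indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0  # Arbitrary fallback for empty input.
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


def rerank_by_readability(pages):
    """Order pages from most to least readable (stable for equal scores)."""
    return sorted(pages, key=lambda p: flesch_reading_ease(p["text"]), reverse=True)


# Made-up pages, listed in their original (relevance) order.
pages = [
    {"title": "Clinical guidance",
     "text": "Glycemic management necessitates individualized pharmacological intervention."},
    {"title": "Patient leaflet",
     "text": "Eat less sugar. Walk each day."},
]
ranked = rerank_by_readability(pages)
print([p["title"] for p in ranked])
```

As the item notes, a ranking like this only moves easier pages up the list; it does nothing to help a patient understand the harder page.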

5) Assessing results credibility: Due to factors such as low health literacy, patients have difficulty evaluating the credibility of search results. Studies show that CHISEs displaying visualizations (e.g. bar graphs) indicating the reliability of search results, based on features such as the number of citations, help enhance users’ credibility judgments. Using reliability of webpages, estimated using machine learning, as features for ranking has also achieved reasonable accuracy and could be incorporated in future CHISEs.

6) Integrating inconsistent information across different sources: Although inconsistency may sometimes arise due to misinformation from unreliable sources, multiple perspectives are often valid and useful for making informed decisions. State-of-the-art ML algorithms, such as Stance Classification and Multi-Perspective QnA techniques that identify contrasting viewpoints, could be deployed and evaluated for CHI search.

7) Collaboration with healthcare providers (HCPs) and other patients: Patients like to share the information they find in order to consult with their HCPs and to communicate with and learn from other patients. Some existing CHI websites provide chat support and discussion forums. However, it is not easy to capture the entire search context using such functionalities. Collaborative Search Engines enable users to search as a group, thus providing a shared space where they can collectively submit queries, view the search history, annotate search results, etc. They could be useful in CHISEs, specifically to support the two different types of users (i.e., patients and HCPs) by providing different types of views and features. For example, patients could share a summary of their search activities with HCPs to facilitate their communication, and benefit from other (similar or expert) patients by having them more closely involved in the entire search process.

8) Long-term knowledge acquisition: A major information need of patients is to continuously understand their disease at various stages for self-management. However, current CHISEs do not support long-term educational needs. While there are patient education websites with expert-created content and useful educational features, such as tracking and testing patients’ knowledge, they are often limited by the availability of experts. Ideally, existing search engines, which have vast and up-to-date information, should be enhanced to support patient education. For example, long-term search history visualizations displaying knowledge graphs of concepts learned by the patients could help in tracking and recalling the information learned. Contextual cues have been shown to improve recall of past searches. Since patients often search before or after certain events (e.g., consultations), those events can naturally be used as cues for Contextual Search within their search histories.
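
The event-cued Contextual Search suggested in item 8 could look roughly like the following Python sketch, which retrieves past queries issued within a few days of a consultation. The `searches_near_event` helper, the three-day window, and the sample history are assumptions made for illustration only.

```python
from datetime import date, timedelta


def searches_near_event(history, event_date, window_days=3):
    """Return past queries issued within window_days before or after an event."""
    window = timedelta(days=window_days)
    return [query for day, query in history if abs(day - event_date) <= window]


# Hypothetical search history as (date, query) pairs.
history = [
    (date(2021, 3, 1), "metformin side effects"),
    (date(2021, 3, 9), "questions to ask endocrinologist"),
    (date(2021, 3, 11), "what does A1C of 7.2 mean"),
    (date(2021, 4, 2), "low carb breakfast ideas"),
]

# Hypothetical consultation date used as the contextual cue.
consultation = date(2021, 3, 10)
near = searches_near_event(history, consultation)
print(near)
```

A patient could then recall "what I searched around my last appointment" without remembering the exact query terms.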

Specific audience "takeaways"

Although many CHISEs have been developed, the literature shows that they do not fully support PwCC, particularly for “Understanding search results”, “Assessing results credibility”, and “Long-term knowledge acquisition”. To bridge this gap, we suggest potential future research directions, such as adapting and evaluating algorithms (e.g., Reliability Assessment, Text Simplification) and IIR models (e.g., Contextual Search, Collaborative Search) to advance patient-centered CHI search.