Research summary: Recalls and safety alerts search optimization, Round 2
In October of 2019, the Treasury Board’s Digital Transformation Office (DTO) identified an opportunity to follow up on a previous optimization project from 2018: Health Canada’s recalls and safety alerts management system (RSAMS).
In the first project, we had created a prototype search using the Canada.ca search engine. Results from that project validated many of the hypotheses that had informed the prototype’s design, but the search functionality was limited. This meant that the prototype couldn’t be deployed to the RSAMS environment that was in place at the time.
For this second project, the team was able to experiment with an open source search solution. We also added a search expert to the project team. The intent was to help Health Canada prepare for implementing search functionality in a new Drupal-based RSAMS publishing infrastructure, to be launched in the 2020 to 2021 fiscal year.
For the DTO, our goal was to learn more about delivering effective specialized search, so we could provide advice and guidance to other Government of Canada (GC) institutions offering similar search functionality to their users.
What made this project stand out
To help us experiment with features and options, we created a search testbed. We got the data for the testbed’s index using a combination of a database extraction (from RSAMS) and open data sets for vehicle recalls, published by Transport Canada on open.canada.ca. The testbed allowed the project team to experiment with a range of options for configuring the search solution and interface. This included features such as facets and filters, refinement mechanisms and feedback, auto-suggestion (typeahead), query correction, highlighting, and more. Applying these features to real data was extremely helpful.
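To make one of these features concrete, here is a minimal sketch of facet counting and filtering over a handful of records. The sample records, the field names (`category`, `year`) and the function names are hypothetical. They are not taken from the RSAMS data or from the search solution used in the testbed.

```python
from collections import Counter

# Hypothetical sample records; titles and field names are assumptions for illustration.
RECORDS = [
    {"title": "Toy car recalled due to choking hazard", "category": "toys", "year": 2019},
    {"title": "Sedan recalled for airbag defect", "category": "vehicles", "year": 2019},
    {"title": "Plush toy recalled for loose parts", "category": "toys", "year": 2018},
]


def facet_counts(records, field):
    """Count how many records carry each value of the given field."""
    return Counter(record[field] for record in records)


def apply_filter(records, field, value):
    """Keep only the records whose field equals the selected facet value."""
    return [record for record in records if record[field] == value]
```

In this sketch, `facet_counts(RECORDS, "category")` gives the numbers a user would see beside each facet value, and `apply_filter(RECORDS, "category", "toys")` narrows the result list once a value is selected.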
Optimization testing vs. product testing
When we started the project, we planned for one round of testing. We planned to reuse tasks and baseline data from the first project, and compare task performance to complete the study. Our optimization studies require a minimum of 16 performances of each task for reliable comparison.
Implementing a full search solution within a constrained time frame is a complex task. We found ourselves implementing new features, updating the index, and making significant changes to the product as we tested. This was fantastic for learning and experimenting. It is exactly what should be done in a product development process. The drawback was that it meant we didn’t have our minimum of 16 task performances, so we couldn’t use the test data for comparison.
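The constraint described above can be sketched as a simple check: a task is only compared between rounds when both rounds have at least 16 performances of it. The data layout (a task name mapped to a list of success flags) and the function name are assumptions made for this illustration. They do not describe the tooling used in the study.

```python
MIN_PERFORMANCES = 16  # minimum performances per task, as stated in the text


def comparable_tasks(baseline, followup, minimum=MIN_PERFORMANCES):
    """Return (baseline rate, follow-up rate) for tasks with enough data in both rounds.

    Each argument maps a task name to a list of booleans (True = task success).
    Tasks with fewer than `minimum` performances in either round are left out.
    """
    results = {}
    for task in baseline.keys() & followup.keys():
        before, after = baseline[task], followup[task]
        if len(before) >= minimum and len(after) >= minimum:
            results[task] = (sum(before) / len(before), sum(after) / len(after))
    return results
```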
Defining specialized search
Specialized search is different from web search. It focuses on queries of a specific structured database, or a collection of related content, which may include both structured and unstructured content sources.
Unlike web search, all components of the search system should be within the control — or at least the influence — of the product manager. To help institutions deliver and maintain an effective user experience, we wanted to define these various components and the skills and activities required to support them.
Components of specialized search
A full search solution combines multiple components, each of which involves design decisions.
- Content stores
  - Sources of structured content to be indexed and/or unstructured content to be crawled and indexed
  - Content/data structure, changes and improvements, such as adding summaries to aid scanning and comprehension, and creating fields to support facets and filters
- Content processing and indexing
  - Identifying what should be included or excluded from the search index
  - Functions such as tokenization, lemmatization, stemming, and stop words that can be configured in the search solution
- Query processing and search features
  - Constraining results by date or other criteria
  - Classification, faceting, auto-suggest, error correction (“did you mean?”), etc.
- Search interface design and results presentation
  - Snippets, sort order, refinement options, zero-result recovery, etc.
- Search analytics and performance measurement
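As an illustration of the content processing functions named in the list above, here is a minimal sketch of tokenization, stop-word removal and stemming. The stop-word list and the suffix-stripping rule are deliberately crude assumptions. A real search solution would normally provide configurable analyzers for this instead of hand-written code.

```python
import re

# Tiny illustrative stop-word list; an assumption, not a recommended configuration.
STOP_WORDS = {"a", "an", "and", "due", "for", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_stem(token):
    """Strip a few common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [naive_stem(token) for token in tokenize(text) if token not in STOP_WORDS]
```

Applying the same analysis to both the indexed content and the query is what lets a query such as "recalled toy" match a title that contains "Recalls of toys": both reduce to the terms `recall` and `toy`.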
Supporting effective search
At the end of the project, we gave Health Canada recommendations for the new RSAMS search implementation.
Our experience with the RSAMS project reinforced our understanding of the complexity of delivering effective specialized search. The key recommendations, summarized here, apply to any specialized search implementation.
Understand search use cases
It is important to look at the evidence of how people are using specialized search and design the solution to support this.
For example, earlier research had shown that RSAMS search behaviour follows distinct patterns:
- searching by product (brand) or product category (toys, cars)
- searching by health concern (specific allergies)
- searching by issue (recalls due to food-borne listeria or E. coli)
A shared understanding of these patterns can help determine how to structure your content. For example, the RSAMS data did not capture brand names or product codes as a separate field. That undermined precision in search results.
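The effect on precision can be shown with a small sketch. The notices, the brand names and the `brand` field below are all hypothetical. The point is only that matching a brand anywhere in free text returns notices that mention the brand in passing, while matching against a dedicated field does not.

```python
# Hypothetical notices; the "brand" field is the separate field the sketch assumes.
NOTICES = [
    {"id": 1, "text": "Acme kettle recalled due to burn hazard", "brand": "Acme"},
    {"id": 2, "text": "Bolt lid, also sold for Acme kettles, recalled", "brand": "Bolt"},
    {"id": 3, "text": "Bolt toaster recalled due to fire hazard", "brand": "Bolt"},
]


def full_text_search(notices, term):
    """Match the term anywhere in the free text of a notice."""
    term = term.lower()
    return [notice["id"] for notice in notices if term in notice["text"].lower()]


def brand_search(notices, brand):
    """Match only against the dedicated brand field."""
    brand = brand.lower()
    return [notice["id"] for notice in notices if notice["brand"].lower() == brand]
```

Here a full-text search for "acme" returns notices 1 and 2, although only notice 1 is an Acme recall. The brand search returns notice 1 alone.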
Fix the data at source, with the right people involved
For RSAMS, a significant challenge has been that the content is sourced from multiple groups and institutions. Each has its own processes, templates and formats for publishing recalls and other notices related to the specific products it is responsible for monitoring.
For this project, we had a static dataset. That meant we could make manual additions and deletions to prepare a single, search-specific data source for indexing. Changes made to the data included:
- separating multi-product notices into one-product-per-notice
- revising the format for recall titles
- making improvements to the RSAMS taxonomy, and applying updated values
- excluding outdated and redundant records from the index
This “post-processing” of search data prior to indexing was labour-intensive. It was also difficult because of inconsistencies and issues with data quality.
Ideally, these types of issues should be fixed at the source, rather than patched afterwards. Getting this right requires bringing together the people most knowledgeable about the data or content.
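Two of the preparation steps listed above, splitting multi-product notices and excluding outdated records, can be sketched as follows. The record layout, the title format and the use of a publication-date cutoff to decide what counts as outdated are assumptions for this illustration. They do not describe how the project team actually prepared the data.

```python
from datetime import date

# Hypothetical source notices; field names are assumptions for illustration.
SAMPLE_NOTICES = [
    {"products": ["Kettle A", "Kettle B"], "issue": "burn hazard", "published": date(2019, 5, 1)},
    {"products": ["Old toy"], "issue": "choking hazard", "published": date(2005, 1, 1)},
]


def split_multi_product(notice):
    """Turn one notice that lists several products into one notice per product."""
    return [
        {
            "title": f"{product}: {notice['issue']}",  # assumed title format
            "product": product,
            "issue": notice["issue"],
            "published": notice["published"],
        }
        for product in notice["products"]
    ]


def prepare_for_indexing(notices, cutoff):
    """Drop notices published before the cutoff, then split the rest by product."""
    prepared = []
    for notice in notices:
        if notice["published"] < cutoff:
            continue  # treated as outdated in this sketch
        prepared.extend(split_multi_product(notice))
    return prepared
```

With a cutoff of `date(2010, 1, 1)`, the sample data yields two single-product notices for the kettles, and the 2005 notice is left out of the index.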
Dedicate resources to performance monitoring and data curation
Is your search effective? To answer this question, you need to define metrics for search performance and perform routine analytics.
You need to continually monitor these metrics. They should drive improvements to the relevant search solution component.
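As a minimal sketch of what routine analytics could look like, the function below computes two commonly used search metrics from a query log: the share of queries that returned no results, and the share of queries where a result was clicked. The choice of metrics and the log format are assumptions. The right metrics depend on the use cases a given search needs to support.

```python
def search_metrics(log):
    """Compute simple search metrics from a list of query log entries.

    Each entry is a dict with "results" (number of results returned) and
    "clicked" (True if the user clicked any result). This format is hypothetical.
    """
    total = len(log)
    if total == 0:
        return {"zero_result_rate": 0.0, "click_through_rate": 0.0}
    zero_results = sum(1 for entry in log if entry["results"] == 0)
    clicked = sum(1 for entry in log if entry["clicked"])
    return {
        "zero_result_rate": zero_results / total,
        "click_through_rate": clicked / total,
    }
```

Tracked over time, a rising zero-result rate might point to gaps in the index or in query processing, while a falling click-through rate might point to problems with relevance or results presentation.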
If your search solution aggregates data and content from multiple sources, data curation is essential. The publishing infrastructure (content management system) should enforce structural consistency, and impose constraints (as in the use of controlled vocabularies for tagging content). Content needs to be monitored for quality. This monitoring should, in turn, inform improvements to structure, guidance and shared assets (such as those controlled vocabularies).
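The kind of constraint a publishing infrastructure could enforce can be sketched as a validation step that runs before a record is accepted. The required fields and the controlled vocabulary below are hypothetical examples, not the RSAMS taxonomy.

```python
# Hypothetical controlled vocabulary and required structure, for illustration only.
ALLOWED_CATEGORIES = {"food", "toys", "vehicles"}
REQUIRED_FIELDS = ("title", "category", "published")


def validate_record(record):
    """Return a list of problems found in a record; an empty list means it passes."""
    problems = [
        f"missing field: {field}" for field in REQUIRED_FIELDS if not record.get(field)
    ]
    category = record.get("category")
    if category and category not in ALLOWED_CATEGORIES:
        problems.append(f"unknown category: {category}")
    return problems
```

Collecting the reported problems over time also gives a simple view of content quality, for example which values contributors keep trying to use that the vocabulary does not yet cover.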
Search is never done
An effective search solution requires dedicated resources. It needs to involve multiple skill sets in its design and maintenance. This should include subject matter experts, information architects, content designers, developers, data and network specialists, search analysts, and user researchers.
To quote our search expert:
“Search is hard. Good search is harder.”
Request the research
If you’d like to see the detailed research findings from this project, email us at firstname.lastname@example.org.
Let us know what you think
Tweet using the hashtag #Canadadotca.