Blacklight and Stemming

With the coming transition of the IUCAT public interface from the existing SirsiDynix OPAC to the new Blacklight discovery layer, there are a lot of exciting new features coming our way. Some examples include faceted searching, better results, and an easier-to-use interface. Along with the change in the interface, we will see changes in how search works. One of these changes relates to truncation and word stemming.

Truncation is the ability to expand a keyword search to retrieve multiple forms of a word by using a specified symbol to replace a character or set of characters. The truncation symbol can typically be used anywhere within a term: at the beginning, in the middle, or at the end. For example, in the current IUCAT, a search for comput$ would find words such as computer, computers, computing, and computation. Truncation is a handy tool that can bring back a wide range of results, and it is a common search feature in most traditional OPACs and in many vendor databases. Blacklight, like other discovery layer interfaces such as VuFind, relies on a technique called word stemming rather than on truncation.
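To make the idea concrete, here is a minimal sketch of truncation in Python. It is not how IUCAT or any particular OPAC implements the feature; it simply assumes that the $ symbol stands for zero or more letters and that a query must match a whole word. The function names and the word lists are made up for illustration.

```python
import re

def truncation_to_regex(query: str, symbol: str = "$") -> re.Pattern:
    """Turn a truncated query such as 'comput$' into a regular expression.

    Assumption: the truncation symbol stands for zero or more word characters.
    """
    # Escape the literal pieces, then join them with a "any word characters" wildcard.
    parts = [re.escape(part) for part in query.split(symbol)]
    return re.compile(r"\w*".join(parts), re.IGNORECASE)

def truncation_search(query: str, words: list[str]) -> list[str]:
    """Return the words that match the truncated query in full."""
    pattern = truncation_to_regex(query)
    return [word for word in words if pattern.fullmatch(word)]

# Hypothetical index terms, for illustration only.
words = ["computer", "computers", "computing", "computation", "company", "compute"]
print(truncation_search("comput$", words))
# ['computer', 'computers', 'computing', 'computation', 'compute']

# The symbol can also sit inside a term.
print(truncation_search("wom$n", ["woman", "women", "wombat"]))
# ['woman', 'women']
```

The key point is that the searcher decides where the expansion happens by placing the symbol.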

Word stemming is when the catalog reduces a search term to its “root” and retrieves results containing any word that shares that stem. Rather than relying on the searcher to place a specific character to expand the search, as in truncation, word stemming finds the “root” of a search term automatically, then returns results with all words associated with that stem. This is similar to how Google searches, so users who use Google a lot won’t notice much of a difference.

Because this is an automatic process, it is often difficult or impossible to predict the “stem” of any particular word. For example, knees has a stem of knee, but kneel has a stem of kneel, not knee. Similarly, “searching”, “search”, and “searches” all stem to “search”, but “searcher” does not; it stems to “searcher”.
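The examples above can be reproduced with a toy stemmer. The sketch below is a deliberately simplified illustration, not the algorithm Blacklight actually uses; real stemmers apply many more rules. It assumes only three hypothetical rules: strip "ing", strip "es" after endings such as "ch" or "sh", and strip a plain plural "s".

```python
from collections import defaultdict

def toy_stem(word: str) -> str:
    """Reduce a word to a stem using three simplified, illustrative rules."""
    w = word.lower()
    # Rule 1: drop "-ing" from longer words (searching -> search).
    if w.endswith("ing") and len(w) > 5:
        return w[:-3]
    # Rule 2: drop "-es" after a sibilant ending (searches -> search).
    if w.endswith(("ches", "shes", "sses", "xes", "zes")):
        return w[:-2]
    # Rule 3: drop a plain plural "-s" (knees -> knee), but leave "-ss" alone.
    if w.endswith("s") and not w.endswith("ss"):
        return w[:-1]
    # No rule covers "-er" or "-l", so searcher and kneel are left unchanged.
    return w

def group_by_stem(words: list[str]) -> dict[str, list[str]]:
    """Group words that share the same stem, as a stemmed search would."""
    groups = defaultdict(list)
    for word in words:
        groups[toy_stem(word)].append(word)
    return dict(groups)

print(group_by_stem(["search", "searches", "searching", "searcher", "knees", "kneel"]))
# {'search': ['search', 'searches', 'searching'], 'searcher': ['searcher'],
#  'knee': ['knees'], 'kneel': ['kneel']}
```

Because the rules are fixed in advance, the searcher has no say in which word forms end up grouped together, which is why the results can be hard to predict.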

Searchers who are accustomed to truncation may find that some similar terms that would have been retrieved using truncation will not be retrieved using word stemming, because they do not share the same stem. For example, a truncated search for search$ would retrieve “searcher”, but a stemmed search for “search” would not.

For many of our users, this change will not be apparent, but we hope this is a helpful explanation of this change for expert searchers accustomed to relying on truncation.

Author: Rachael Cohen

Discovery User Experience Librarian in the Discovery & Research Services Department, IUB Libraries.