Subject Indexing
Most fields of study rely on authoritative lists of domain-specific subjects, known as controlled vocabularies, curated by subject matter experts such as librarians and researchers. Prestigious examples include the vocabulary used in PsycInfo by the American Psychological Association (APA), the Cumulative Index to Nursing and Allied Health Literature (CINAHL), and the Medical Subject Headings (MeSH) assigned in Medline. These vocabularies summarize content with subject tags applied by clinicians and medical experts. EBSCO Discovery Service stands out for its one-click search across various subject authorities and its linked data network, which helps users discover related research, authors, and institutions. This meticulous subject indexing gives users a more confident, higher-quality, and more targeted search experience.
EBSCO Discovery Service (EDS) is the only discovery service that leverages these prestigious controlled vocabularies in two ways: in search relevancy, to ensure the most relevant information is retrieved, and in query expansion, where equivalent subjects and users’ natural language are mapped together. Researchers can focus on their research without worrying that their queries will miss articles because they did not know which subject headings to use. Relevancy ranking and query expansion work together to create a highly relevant, multidisciplinary, and user-focused research journey within EDS.
Because EBSCO treats controlled vocabularies and subject indexing with a level of respect not typically found in the research discovery space, EBSCO has the most robust linked data knowledge graph of controlled vocabularies and their metadata, called the Unified Subject Index (USI), which is the main foundation of our search engine relevancy algorithm.
Search: Relevancy Ranking
EBSCO Discovery Service maintains impartiality towards content providers. Search results prioritize relevance to user queries, ranking metadata in the following order of contextual significance:
- Match on subject headings from controlled vocabularies
- Match on article titles
- Match on author keywords
- Match on keywords within abstracts
- Match on keywords within full text
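The ordering above can be pictured with a minimal sketch. It is an illustration only, not EBSCO's relevancy algorithm: the record layout, the field names, and the simple substring matching are all assumptions made for the example.

```python
# Hypothetical field names, listed in the priority order given above.
FIELD_PRIORITY = ["subject_headings", "title", "author_keywords", "abstract", "full_text"]


def best_match_rank(record: dict, query: str) -> int:
    """Return the index of the highest-priority field containing the query.

    Returns len(FIELD_PRIORITY) when no field matches.
    """
    needle = query.lower()
    for rank, field in enumerate(FIELD_PRIORITY):
        value = record.get(field, "")
        # Fields may hold a single string or a list of strings.
        text = " ".join(value) if isinstance(value, list) else value
        if needle in text.lower():
            return rank
    return len(FIELD_PRIORITY)


def rank_records(records: list, query: str) -> list:
    """Keep matching records and order them by their best-matching field."""
    no_match = len(FIELD_PRIORITY)
    matched = [r for r in records if best_match_rank(r, query) < no_match]
    return sorted(matched, key=lambda r: best_match_rank(r, query))
```

In this sketch a record that matches on a subject heading always sorts ahead of one that matches only in its title, which in turn sorts ahead of a full-text-only match. A production engine would combine many more signals than a single best-field rank.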
Search: Value Ranking
Once the most relevant results are identified, it is helpful to the user to make some value judgments, which we call “value ranking.” Here are some examples of value ranking used in EDS:
- Recency/currency: Recently published content is scored higher for value than older content.
- Document type: Certain document types are weighted higher than others. For instance, a research article ranks higher because it applies to a wider range of research needs than a book review, which serves a narrower set of research goals. Both are important to research, and users can bring either to the top by filtering on the document types they want, but in general document types with broader applicability are ranked higher.
- Document length: Longer, more substantial documents are likely to be more useful for research than, for example, a quarter-page article, which will likely have less substance.
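A minimal sketch of how the three value signals above might be combined into one score follows. The weights, the document-type labels, the ten-page saturation point, and the additive formula are all hypothetical, chosen only to illustrate the idea; they do not describe how EDS actually computes value.

```python
from datetime import date

# Hypothetical weights: broader-use document types score higher.
DOC_TYPE_WEIGHT = {"research_article": 1.0, "book_review": 0.4}
DEFAULT_TYPE_WEIGHT = 0.5  # assumed weight for any other document type
LENGTH_CAP_PAGES = 10      # assumed point beyond which extra length adds nothing


def value_score(pub_year, doc_type, page_count, current_year=None):
    """Combine recency, document type, and length into a single value score."""
    if current_year is None:
        current_year = date.today().year
    age = max(0, current_year - pub_year)
    recency = 1.0 / (1.0 + age)  # 1.0 for this year's content, decaying with age
    type_weight = DOC_TYPE_WEIGHT.get(doc_type, DEFAULT_TYPE_WEIGHT)
    length = min(page_count, LENGTH_CAP_PAGES) / LENGTH_CAP_PAGES
    return recency + type_weight + length
```

Because value ranking is applied once the most relevant results have been identified, a score like this would most naturally serve as a secondary sort key among results of similar relevance rather than as a replacement for relevance.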
Honoring Users’ Intent
EBSCO Discovery Service honors user intent by providing relevant results when a user searches for a specific subject term, publication name, database name, chemical compound or formula, term-in-time (such as from archival material), or question from the user.
EDS also helps researchers recover from mistyped queries through autocorrect, and its autocomplete feature helps users quickly find a common search string.
Subject Mapping in Action
The Unified Subject Index (USI) contains hundreds of multidisciplinary controlled vocabularies, including many from the Unified Medical Language System (UMLS) like the Systematized Nomenclature of Medicine (SNOMED) and the National Cancer Institute (NCI). The EDS knowledge graph is where the controlled vocabulary equivalencies are stored and maintained. Some of the data is mapped from linked data, but vocabularies not formatted as linked data and users’ natural language are mapped by EBSCO subject matter experts. All mappings are regularly reviewed for correctness, appropriateness, completeness, consistency, and timeliness.
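As a rough illustration of stored equivalencies driving query expansion, the sketch below groups equivalent terms and looks up the group for a query term. The groups shown are illustrative examples that pair a controlled heading with everyday wording; they are not taken from the USI, and the function names and data structure are assumptions.

```python
# Hypothetical equivalence groups: each set holds terms treated as one concept.
EQUIVALENCE_GROUPS = [
    {"myocardial infarction", "heart attack"},
    {"neoplasms", "cancer", "tumors"},
]


def build_index(groups):
    """Map every term to the full group of terms it is equivalent to."""
    index = {}
    for group in groups:
        for term in group:
            index[term] = group
    return index


def expand_query(query, index):
    """Return the query term plus all of its mapped equivalents, sorted."""
    term = query.strip().lower()
    # Terms with no mapping expand only to themselves.
    return sorted(index.get(term, {term}))


index = build_index(EQUIVALENCE_GROUPS)
print(expand_query("Heart Attack", index))
# prints ['heart attack', 'myocardial infarction']
```

A real knowledge graph would also record where each mapping came from and when it was last reviewed, which this flat lookup leaves out.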
Users’ natural language is gathered from bidirectional card sorting exercises with librarians (scientific warrant), domain subject matter experts (literary warrant), and end-user studies (user warrant). These studies are conducted as surveys that follow Institutional Review Board (IRB) methodologies for ethical human-subject research.
An example of a subject query expansion using the USI: A search for “plastic surgery” is expanded to the following controlled vocabularies from publishers, EBSCO, National Authorities, Open Linked Data sources, and natural language: