Online Searching
A Guide to Finding Quality Information Efficiently and Effectively
Third Edition
Karen Markey
Revised by Cheryl Knott
ROWMAN & LITTLEFIELD
Lanham • Boulder • New York • London
Published by Rowman & Littlefield
An imprint of The Rowman & Littlefield Publishing Group, Inc.
4501 Forbes Boulevard, Suite 200, Lanham, Maryland 20706
www.rowman.com

86-90 Paul Street, London EC2A 4NE

Copyright © 2023 by The Rowman & Littlefield Publishing Group, Inc.
All rights reserved. No part of this book may be reproduced in any form or by any electronic or mechanical means, including information storage and retrieval systems, without written permission from the publisher, except by a reviewer who may quote passages in a review.
British Library Cataloguing in Publication Information Available

Library of Congress Cataloging-in-Publication Data

Names: Markey, Karen, author. | Knott, Cheryl, 1954- author.
Title: Online searching : a guide to finding quality information efficiently and effectively / Karen Markey ; revised by Cheryl Knott.
Description: Third edition. | Lanham : Rowman & Littlefield Publishers, [2023] | Includes bibliographical references and index.
Identifiers: LCCN 2022055420 (print) | LCCN 2022055421 (ebook) | ISBN 9781538167724 (cloth) | ISBN 9781538167731 (paperback) | ISBN 9781538167748 (ebook)
Subjects: LCSH: Electronic information resource searching. | Information retrieval.
Classification: LCC ZA4060 .M37 2023 (print) | LCC ZA4060 (ebook) | DDC 025.0425—dc23/eng/20221128
LC record available at https://lccn.loc.gov/2022055420
LC ebook record available at https://lccn.loc.gov/2022055421

The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences—Permanence of Paper for Printed Library Materials, ANSI/NISO Z39.48-1992.
(KM) For Pauline Cochrane, mentor, colleague, and friend
(CK) For James Edward Syme
Contents

Cover
Half Title
Title
Copyright
Dedication
Contents
List of Figures, Textboxes, and Tables
Preface to the Third Edition
Acknowledgments
Chapter 1. The Age of Search
Chapter 2. Accessing Quality Information at the Library Website
Chapter 3. The Reference Interview
Chapter 4. Selecting a Relevant Database
Chapter 5. Presearch Preparation
Chapter 6. Controlled Vocabulary
Chapter 7. Free-Text Searching
Chapter 8. Web Search Engines
Chapter 9. Known-Item Searching
Chapter 10. Assessing Research Impact
Chapter 11. Search Strategies
Chapter 12. Working with Results
Chapter 13. Performing a Technical Reading of a Database’s Search System
Chapter 14. Interacting with Library Users
Chapter 15. Online Searching Now and in the Future
Glossary
About the Authors
Figures, Textboxes, and Tables

FIGURES

Figure 2.1.
University of Illinois Library Easy Search interface.
Figure 2.2.
University of Illinois Library Easy Search results screen.
Figure 2.3.
Search results with document-type labels.
Figure 2.4.
Pima County Public Library home page.
Figure 2.5.
New Mexico State Library’s El Portal home page.
Figure 2.6.
Resources for Learners, Digital Arizona Library.
Figure 3.1.
Phases of the reference interview and steps of the online searching process.
Figure 4.1.
Description of the Historical Abstracts database.
Figure 4.2.
Search screen of the USC Shoah Foundation Visual History Archive Online.
Figure 4.3.
Results page in the Library of Congress Prints & Photographs Online Catalog.
Figure 4.4.
Data.gov search screen.
Figure 4.5.
Geospatial dataset results for the search “oil wells” in the Data.gov catalog.
Figure 4.6.
Databases by title and subject at the University of Arkansas Libraries.
Figure 4.7.
Entry from the database Gale in Context: Biography.
Figure 4.8.
Entry from the Encyclopedia of Associations in the Gale Directory Library.
Figure 4.9.
Brief records and resolver links for a search in the Music Index.
Figure 4.10.
Results page in the MERLOT database.
Figure 5.1.
Venn diagram of the Boolean operator AND.
Figure 5.2.
Venn diagram of the Boolean operator OR.
Figure 5.3.
Venn diagram of the Boolean operator NOT.
Figure 5.4.
Venn diagram depicting the overlap containing two facets.
Figure 5.5.
Venn diagram depicting the overlaps of all three facets.
Figure 5.6.
Venn diagram depicting NOT eliminating book reviews from the results.
Figure 5.7.
Venn diagram depicting author name AND book title word overlap.
Figure 6.1.
ERIC thesaurus results for the term aids.
Figure 6.2.
ERIC thesaurus entry for the subject descriptor autoinstructional aids.
Figure 6.3.
First three results of a subject search for autoinstructional aids.
Figure 6.4.
Advanced search screen for the ERIC database on the EBSCOhost platform.
Figure 6.5.
Surrogate record for an article indexed in the ERIC database.
Figure 6.6.
Browsing for all terms containing the word depression in the APA Thesaurus of Psychological Index Terms.
Figure 6.7.
Authority record for the subject descriptor “Major Depression” in the APA Thesaurus of Psychological Index Terms, with additional terms selected.
Figure 6.8.
Authority record for the subject descriptor “Depression (Emotion)” in the APA thesaurus with the related term “Sadness” selected.
Figure 6.9.
EBSCOhost’s searching language for selected APA thesaurus terms.
Figure 6.10.
The “Search History” feature in PsycInfo.
Figure 6.11.
Filters on the PsycInfo results page.
Figure 6.12.
Filters for two subject categories in the PsycInfo database.
Figure 6.13.
Filter for “Subject: Major Heading” with descriptors ranked by number of results in PsycInfo.
Figure 6.14.
Subject search using field selection menu.
Figure 6.15.
MeSH authority record for Monkeypox.
Figure 6.16.
Selection of Library of Congress Subject Headings beginning with the letter C.
Figure 7.1.
Search history showing three sets of results in Business Source Ultimate.
Figure 7.2.
Results sets showing high recall of free-text searches in full-text sources indexed in Academic Search Ultimate.
Figure 7.3.
Results page for simple keyword search in the Access World News database.
Figure 7.4.
Results page for a keyword search using Boolean and proximity operators and truncation in Access World News.
Figure 7.5.
Results page for a broad free-text search limited to the “lead/first paragraph” field in Access World News.
Figure 7.6.
Results for a free-text search limited to the features sections of US news sources indexed in Access World News.
Figure 7.7.
Small number of results in Nexis Uni for a search using only one term for each of three facets.
Figure 7.8.
Algorithmically assigned subject terms for a full-text article indexed in Nexis Uni.
Figure 7.9.
First result for a free-text search using Boolean and proximity operators and truncation in Nexis Uni.
Figure 7.10.
Chronicling America advanced search screen.
Figure 8.1.
Google search engine results page.
Figure 8.2.
Bing search engine results page.
Figure 8.3.
DuckDuckGo search engine results page.
Figure 8.4.
Google advanced search form.
Figure 8.5.
Google advanced search for data about poverty levels.
Figure 8.6.
Google results for the poverty-level data search.
Figure 8.7.
Google search results for .gov sites and .xls files.
Figure 8.8.
Google search results for science fair guides limited to pdf files.
Figure 8.9.
Google Images advanced search form.
Figure 8.10.
Google Images results for “electric cars” search.
Figure 8.11.
Google Scholar advanced search form results.
Figure 8.12.
Google Scholar search limited to titles.
Figure 9.1.
Record for Babe Didrikson in the Library of Congress Name Authority File.
Figure 9.2.
Library of Congress catalog browse for last name Lessing.
Figure 9.3.
Catalog of U.S. Government Publications browse for United States agencies.
Figure 9.4.
Searching by author last name in the PubMed advanced search builder.
Figure 9.5.
Browsing the author index in Academic Search Ultimate and selecting names to add to the search box.
Figure 9.6.
Browsing the Library of Congress Online Catalog by title.
Figure 9.7.
Citation-matching form with journal name, title phrase, and year of publication in the labeled search boxes.
Figure 9.8.
Form for the “Find Citation” feature on the Ovid platform.
Figure 9.9.
A search for a known item using the advanced search screen in Access World News.
Figure 9.10.
OSTI.gov search results including DOIs.
Figure 10.1.
Dimensions home page.
Figure 10.2.
Full record including publication metrics in the Dimensions database.
Figure 10.3.
Detailed record in the Scopus database.
Figure 10.4.
Scopus and PlumX metrics for an article indexed in Scopus.
Figure 10.5.
Complete PlumX metrics for an article indexed in Scopus.
Figure 10.6.
Web of Science beamplot.
Figure 10.7.
Metrics for an article published in the open-access journal Molecules.
Figure 10.8.
Google page for Naomi Oreskes.
Figure 11.1.
Building block search strategy.
Figure 11.2.
Can’t live without this facet first search strategy.
Figure 11.3.
Pearl growing strategy.
Figure 11.4.
Results sets for controlled-vocabulary searches as part of a pearl growing strategy.
Figure 11.5.
Company name and results by topic in the Gale OneFile: Business subject guide.
Figure 11.6.
Browsing company entity field index for Oracle.
Figure 11.7.
Gale OneFile: Business “Topic Finder” visualization of the subject cybersecurity, with the subtopic artificial intelligence selected.
Figure 11.8.
Visualization of a browse starting with the category “Plants” in the Gale in Context: Elementary database.
Figure 12.1.
Search results database.
Figure 12.2.
Brief records and filters on the Academic Search Ultimate results page.
Figure 12.3.
First two results selected for downloading or exporting.
Figure 12.4.
Beginning of full record for a search result in the General OneFile database.
Figure 12.5.
Tools for managing results displayed on the right side of a detailed record page.
Figure 12.6.
Options to add the search statement to a folder and send a notification when new items matching the search are added to the database in EBSCOhost.
Figure 12.7.
Search alert feature at the top of the results screen on the Gale platform.
Figure 12.8.
Access World News search alert feature.
Figure 13.1.
Basic search screen in Gale OneFile: News.
Figure 13.2.
Advanced search screen in Gale OneFile: News.
Figure 13.3.
List of subject descriptors containing the term gender in the Gale General OneFile.
Figure 13.4.
Gale Academic OneFile subject guide search for toxins.
Figure 13.5.
AGRICOLA advanced search screen with presearch qualifiers.
Figure 13.6.
Science.gov results page with date and topic clusters for filtering results.
Figure 13.7.
Postsearch visualization of results in science.gov.
Figure 14.1.
Simple Boolean search using AND.
Figure 14.2.
Advanced search screen with three boxes for terms.
Figure 14.3.
PubMed results screen with filters for refining the search.
Figure 15.1.
Open science.
Figure 15.2.
Components of information disorder.
TEXTBOXES

Textbox 1.1.
Brenda Linares, Research & Learning Health Sciences Librarian, A. R. Dykes Library, University of Kansas Medical Center
Textbox 1.2.
Earnrolyn “Lynn” Smith, Academic Customer Success Manager, LexisNexis
Textbox 1.3.
Magali Sanchez, School Library Media Specialist
Textbox 1.4.
Marydee Ojala, Editor in Chief, Online Searcher
Textbox 3.1.
Helping Users Transition from Q2 (Conscious) to Q3 (Formal-Need) Questions
Textbox 4.1.
The Utility of Surrogates in Full-Text Databases
Textbox 6.1.
Changes Searchers Make to Ongoing Searches
Textbox 7.1.
Sources of Free-Text Search Terms
TABLES

Table 3.1.
Reasons People Don’t Reveal Their Real Information Needs
Table 3.2.
Open-Ended Questions That Give Rise to Negotiated Queries
Table 3.3.
More Negotiation for In-Depth Queries
Table 3.4.
Seven Steps of the Online Searching Process
Table 4.1.
Bibliographic Information in a Surrogate Record
Table 5.1.
Relationship Types
Table 6.1.
Sample Database Bearing Seven Surrogate Records
Table 6.2.
Sample Database’s Inverted Index for Controlled Vocabulary Searching
Table 7.1.
Sample Database’s Inverted Index for Free-Text Searching
Table 11.1.
PsycInfo Subject Descriptors and Classification Captions Extracted from Results Using the Pearl Growing Search Strategy
Table 12.1.
Search Tactics
Table 14.1.
Course Documentation
Table 14.2.
Using Prompts to Elicit Queries from Users
Table 14.3.
Inviting Users to Dissect Their Queries into Main Ideas
Table 14.4.
Advising Users to Break Up or Restate Wordy Big Ideas
Preface to the Third Edition

“Why didn’t I know all this before?” That’s a common reaction among undergraduate and graduate students about midway through an online searching course or learning module. You can find the answer to their question on the web. Not by putting the question in the search box, of course, but by searching for the phrase online searching. The results will reflect what most people think online searching is, that is, the thing you just did. But there’s more to online searching than that, as this book proves.

Much has changed since the publication of the second edition of Online Searching, and this third edition includes essential updates and additions throughout. One of the major additions is a chapter on web search engines, which are capable of much more than we ask of them with our simple keyword searches. Chapter 8 goes into some detail about the ways in which you can apply your developing search skills to make the most of Google. Another major revision appears in chapter 10, on assessing research impact. Since the last edition, research impact metrics have become ubiquitous, with scores, badges, and pop-ups appearing on journal articles accessed through databases, publishers’ websites, and the profile pages of individual scholars. Chapter 10 surveys these and introduces a relatively new and freely accessible database, Dimensions, that uses algorithms to identify relationships among publications, funding sources, citations, policy documents, and researchers as signs of researchers’ reputation and influence. Dimensions also uses automated indexing of many resources to provide access to an enormous number of research publications, not only for retrieval but also for making new connections within and across areas of scholarship.

The third edition also takes a broad perspective on the role of online searching throughout the many kinds of library and information environments serving diverse information seekers. Chapter 1 establishes this perspective by including some words of wisdom from four information professionals—only two of whom are librarians—that highlight some persistent truths about online searching. Additionally, the first chapter provides historical context that will serve as the backdrop for the learning you are undertaking. Since the present always carries some of the past with it, you will encounter vestiges of early search systems as you read this book and try the exercises in it. Knowing where the vestiges came from will help you know which to pay attention to and which to ignore in different circumstances. It will also help you understand why your experienced colleagues use some of the terminology they do and why they think through research and reference questions in the ways they do. It will help you function well in an environment where printed books, newspapers, and journals remain important even as e-publishing and web-based information expand. You may grow to appreciate long-established and still powerful methods of information organization and access even as artificial intelligence and machine learning techniques supplement or supplant the long tradition of human-generated indexing.

Chapter 2 underscores why libraries persist: they are entities people can rely on for access to a well-selected collection of authoritative research, credible information sources, educational materials, and generally desirable stuff that supports reading, viewing, playing, and making. While a lot of browsing takes place in library buildings and on library websites, search makes it possible to find information resources that the information seeker doesn’t happen upon while browsing. Information discovery also works in the other direction, when a search ends and a browse begins. Each can stand alone as well. But a library itself cannot stand alone, so chapter 2 includes an overview of some of the companies that market library systems and services. Much of this chapter introduces
you to aspects of libraries that most users may not notice, but as a professional librarian, you’ll want to make a point of exploring physical and virtual libraries and the many ways they make online searching possible and productive.

Chapters 3 to 12 cover the seven steps of the online search process:

1. Conducting the reference interview, where you determine what the information seeker really wants (chapter 3)
2. Selecting a relevant database (chapter 4)
3. Typecasting the negotiated query as a subject or a known item (chapter 5)
4. Conducting a facet analysis and logical combination of the negotiated query (chapter 5)
5. Representing the negotiated query as input to the search system (chapters 6 to 10)
6. Entering the search and responding strategically (chapter 11)
7. Displaying retrievals, assessing them, and responding tactically (chapter 12)

There are so many databases, search systems, and search engines that you cannot be an expert at searching every single one. Chapter 13 gives you a methodology for conducting a technical reading of a database: a way to quickly and efficiently familiarize yourself with its content and search system before using it. As a new librarian, you can use this method systematically to familiarize yourself with the information resources available to the users of your library. Chapter 14 reviews the seven steps of the online searching process from another perspective, with a focus on helping the information seeker become more proficient at finding needed information. The book concludes with chapter 15 and a discussion of salient trends in the online searching arena.

With numerous figures, tables, textboxes, and practical exercises supplementing the text, the book is designed for many different
types of readers with their own individual motivations and objectives:
- Students in library and information science (LIS) programs who are learning how to intermediate between information seekers and information resources for the best possible outcomes
- Faculty whose courses include online searching modules or are entirely devoted to the topic—this is your textbook
- End users who want to know expert searchers’ secrets for finding information, including independent information professionals, ambitious undergraduates engaged in research, graduate students preparing to write master’s theses and doctoral dissertations, university faculty and school teachers, and professional amateurs who are passionate about their avocations
- Practicing librarians and other information specialists who want to upgrade their search skills and knowledge

As you work your way through the chapters, you may want to look up the definitions of unfamiliar terms in the glossary at the back of the book. Although the book follows the sequence of the online searching process, feel free to dip into any section in any order to review parts you’ve already read or to consult the text or figures as needed. Supplementary videos covering the key basics of online searching are at https://vimeo.com/showcase/9817164. You’ll also find interviews with librarians and other information professionals there.
Acknowledgments

When Karen Markey, the author of the first two editions of Online Searching, asked me to work on the third edition, I was thrilled. I am grateful to her for having faith in my ability to revise and update what I have long considered the definitive textbook for online searching courses and course modules. For the material that has carried over to the third edition, the acknowledgments and appreciation published in the earlier editions still apply, although they are not repeated here.

My thanks go to Charles Harmon at Rowman & Littlefield Publishers, a model editor who has always been kind and patient and full of wise counsel. Thanks also to Erinn Slanina and Melissa McClellan, whose fast answers to myriad questions helped keep the publishing process moving. The three anonymous reviewers of the proposal for the third edition offered invaluable advice, for which I am grateful.

Four busy professionals took time away from their work to talk about online searching, and I am forever grateful to them: Brenda Linares, Marydee Ojala, Magali Sanchez, and Lynn Smith. Their words add immeasurably to the book’s usefulness by letting students know firsthand how working professionals think about online searching in their respective fields. Without Niamh Wallace, who arranged and completed the interviews, edited the videos, and crafted the title slides, this rich addition to the text could not have happened, and I am thankful for her diligence.

Over the years, many librarians at the University of Arizona Libraries have helped me in a variety of ways. For this project, I especially want to recognize Mona Ammon, Ellen Dubinsky, Lori Eagleson, and Devin Johnson.

I’m also grateful to the librarians and university officials who gave me permission to use screenshots of their library websites and related collections, and to the individuals who helped along the way: Jason J. Battles, Dean of Libraries, University of Arkansas; C. Ellen Foran, Associate Secretary to the Board of Trustees, University of Illinois System; Georgiana Gomez, Access Supervisor, University of Southern California (USC) Shoah Foundation–The Institute for Visual History and Education; Megan Hammond, Research Library Administrator, Arizona State Library, Archives and Public Records; Loretta Hunter, Arthur Cartwright Learning Resource Center (LRC) Library, Wayne County Community College District; Zach Larkin, Program Management Officer, USC Shoah Foundation; Amber D. Mathewson, Executive Director, Pima County Public Library, Tucson, Arizona; Heather Murphy, Chief Communications Officer, University Library, University of Illinois Urbana-Champaign; Oneka Samet, Associate Provost, Learning Resources and Instructional Technology, Wayne County Community College District, Detroit, Michigan; Svetlana Sowers, Associate Director, Office of Technology Management, University of Illinois Urbana-Champaign; and Lori Smith Thornton, Public Services Bureau Chief, New Mexico State Library.

I am indebted to Mary Ann Francis at the University of Maryland College of Information Studies, who invited me to give a talk in the Search Mastery series as I was writing chapter 8. Her helpful advice and encouragement came at the best possible time. I also appreciate the comments and suggestions I have received from Melissa Wong, Joyce Valenza, Dan Russell, and Beth St. Jean. I appreciate Claire Wardle’s allowing me to reproduce her information disorder graphic. I am grateful to Naomi Oreskes for giving me permission to use her Google page and to her assistant, Yazid Alfata, for her help. Many thanks to Sean Burns, who tweeted a crucial tip that helped me find an obscure source.

Diane Kraut secured permission to use many of the screenshots, and I thank her for her work. Also assisting with securing permission to use screenshots, for which I am grateful, were Zeina Al-Muhtaseb, Researcher & Account Developer, Global Content Alliances ProQuest, Clarivate; Angela Albert, Customer Engagement Product Specialist, Team Lead Health Learning, Research & Practice, Wolters Kluwer; Mollie Belli, Associate Counsel, EBSCO Information Services; Megan Burnside, Director of Product Marketing, LexisNexis; Rory Butler, Technical Support Representative II, EBSCO Information Services; Kevin Calanza, Technical Support, Legal & Regulatory U.S., Wolters Kluwer; Quinton Carr, Director of Publisher Support, NewsBank; Stephanie Faulkner, Director of Product Management, Research Indicators and Subject Classifications, Elsevier (PlumX); Eizatul Hannis, Product Support Representative, Web of Science, Clarivate; Regor Layosa, Customer Experience Champion, Elsevier (Scopus); Simon Linacre, Head of Content, Brand & Press, Digital Science; Thomas Ramsden, Director of Licensing, Rights, and Permissions Operations, Wolters Kluwer Health, Inc.; Rachel Tilney, IP Granting Department, Cengage; and Thomas Rexson Yesudoss, Copyrights Coordinator, Elsevier.

During the many years that I have taught courses devoted to or involving online searching, I have learned from countless undergraduate and graduate students. I am grateful for their questions and comments, which have inspired me to keep learning. Early in my career, I worked as a reference and instruction librarian at the University of Texas at Austin, where I learned about online searching from a stellar cast of colleagues as well as from hands-on workshops sponsored by database vendors. The best thing about my years at the university, however, was meeting Margo Gutiérrez, whose friendship over the decades has meant more to me than I can say.

I also want to express my gratitude to Jim Syme, for being there and for being him.
TRADEMARK ACKNOWLEDGMENTS

APA PsycInfo® and the Thesaurus of Psychological Index Terms® are registered trademarks of the American Psychological Association.
The Dax logo is a registered trademark of DuckDuckGo.
DOI® is a trademark of the International DOI Foundation.
Google™, Google Books™, Google Images™, and Google Scholar™ are trademarks of Google LLC; this book is not endorsed by or affiliated with Google in any way.
Microsoft® and Bing® are registered trademarks of the Microsoft group of companies.
NewsBank® is a registered trademark of NewsBank, Inc.
Nexis Uni® is a registered trademark of LexisNexis.
Ovid® is a registered trademark of Wolters Kluwer.
Scopus® and ScienceDirect® are registered trademarks of Elsevier B.V.
ProQuest® and Web of Science™ are trademarks of Clarivate™.
1
The Age of Search

We’re awash in information. But information is more than the amorphous barrage of news, social media posts, books, magazine stories, streaming services, radio shows, scholarly journal articles, websites, blogs, and podcasts we are inundated with every day. Information is an important industry, designated as economic sector 51 in the North American Industry Classification System (NAICS). Business entities in the US information industry had a payroll totaling almost $440 billion in 2020 and almost 3.6 million employees, with around 138,000 of them librarians and library media specialists (Bureau of the Census 2022; Bureau of Labor Statistics 2022). The US e-publishing industry earns revenue of around $9.1 billion annually, a figure projected to increase over time; globally, revenues are close to $25 billion annually (Statista 2022a). In 2019, the parent company of mega-publisher Elsevier, whose products include major databases available in many libraries, had revenues totaling $9.8 billion, with Elsevier accounting for around a third of that amount (MIT Libraries 2022).

The information industry is about more than revenues and profits, however. Its products—including books, periodicals, movies, news, and software, among others—enlighten us, entertain us, connect us to others, and help us get our work done. The individuals who understand how to search through vast information resources to find what interests and informs them and, in the case of librarians and other information specialists, what interests and informs their clientele play a special role in the industry. Being able to find the right information at the right time helps people participate in
government, learn, create, earn a living, and realize their dreams. At times, it can save lives.

In the next minute, millions of Google searches will be done. Almost 30 percent of all internet traffic comes from search engines (Statista 2022b). As immense as the web is, however, it can’t satisfy all information needs, and Online Searching will introduce you to concepts, techniques, and resources that go beyond the common keyword approach so many of us rely on every day for a wide variety of products, services, social interaction, and knowledge. This chapter provides some historical and professional context that will serve as the backdrop for the learning you are undertaking.
A BRIEF HISTORY OF TOO MUCH INFORMATION

Cataloging, classification, and indexing developed over the course of the age of print and were well established and in wide use before World War II. When the war ended, the Cold War between the United States and the Soviet Union ushered in an arms race, a space race, and heavy investment in research that would lead to inventions, discoveries, medical and pharmaceutical advances, development of energy sources, and the creation of new knowledge, all of it documented in reports and publications.

One of the most important inventions of the immediate postwar period was the computer. Originally designed as a fast calculator, it was soon able to do much more. Among the many pioneers of software was Grace Hopper, who in the 1950s and 1960s helped develop the COBOL programming language and then worked to create methods for standardizing and testing the programming languages that made computers more than calculators (Williams 2004).
The Library of Congress, seeing that automation would make it possible to share its cataloging metadata with other libraries via computers rather than cards, hired systems analyst Henriette Avram in the 1960s to create the necessary standards and systems. She and her team created the Machine-Readable Cataloging format (MARC) that underlies libraries’ ability to use a single union catalog, in the form of an electronic database, to share records and resources (Spicher 1996).

Making cataloging and classification of books, periodicals, and other information objects shareable was a major breakthrough, but MARC wasn’t designed to solve the problem of the exponential growth of scholarly literature in science, medicine, technology, and business. That fell to another area of the information industry, where indexing of individual research articles published in journals, rather than cataloging of books, prevailed. By the 1950s, it was clear that indexing alone wasn’t sufficient for making the voluminous literature of scholarly disciplines, many of them newly spawned, accessible. Several organizations began developing thesauri of controlled vocabulary, listing the preferred terms for the different topics in a broad subject field and indicating how the terms related to each other. One of the first, the Thesaurus of Engineering and Scientific Terms, published in 1967, included an appendix of guidelines and practices for creating thesauri. The Educational Resources Information Center (ERIC) of the U.S. Department of Education published its Rules for Thesaurus Preparation, and over time thesauri of preferred terms for subject areas became more standardized (Dextre Clarke 2018). Among the subject-specific thesauri currently in use are the ERIC thesaurus, the Thesaurus of Psychological Index Terms, and the Art & Architecture Thesaurus.

The U.S. Department of Defense’s Advanced Research Projects Agency (ARPA) funded research at the Lockheed Missiles and Space Company that led to one of the first online search systems, DIALOG. Lockheed implemented DIALOG on NASA computers, where it could be used to search a database of citations on foreign technology, defense, and space research topics (Bourne 1980; Summit 2002). By the early 1970s, Lockheed was exploring the commercial potential of its DIALOG search system. Minus a database, search
systems have nothing to search. The key to Lockheed’s success was government contracts from the U.S. Office of Education, National Technical Information Service, National Agricultural Library, and National Library of Medicine that supported the development of the ERIC, NTIS, AGRICOLA, and MEDLINE databases, respectively (Bourne 1980). Over time, the number of databases grew through the efforts of for-profit publishers and scholarly and professional societies that transformed their print indexes into electronic databases and licensed them to DIALOG and other database vendors. By the end of the 1970s, Lockheed, the System Development Corporation (SDC), and Bibliographic Retrieval Services (BRS) had become database aggregators, hosting databases from a variety of publishers and marketing online searching of these databases through their DIALOG, Orbit, and BRS search systems, respectively, to libraries and information centers (Björner and Ardito 2003).

Only librarians who were expert intermediary searchers were authorized to use these search systems, for a variety of reasons:

- Special training was necessary because each search system had its own searching language that operated within a terse command-line interface.
- Online searching was a high-cost enterprise that involved the purchase and maintenance of computer equipment and supplies that few people could afford.
- Search systems billed librarians for every search they conducted, and to pay for these services, librarians had to pass much of the cost of the enterprise on to people who used the services.

By the mid-1980s, end users were doing their own online searching, first through the library’s online public access catalog (OPAC) that listed the library’s holdings (Markey 1984) and later through the library’s CD-ROM search systems that accessed many of the same databases that expert intermediary searchers accessed online through the DIALOG, Orbit, and BRS search systems (Mischo
and Lee 1987). These CD-ROM systems were difficult to use because information seekers had to know what commands to issue to activate a search, but that didn’t matter to end users. Online searching eliminated the most tedious and time-consuming elements of the search task: consulting year after year of indexes in book form, writing down citations, and then finding the periodicals containing the desired articles in the stacks. End users lined up to use electronic databases, and even if their searches were not as sophisticated or their results as precise as those of expert searchers, end users wanted to conduct online searches on their own (Mischo and Lee 1987). The trend was clear, and database publishers and vendors developed online search systems and interfaces, made licensing deals with the publishers of print and CD-ROM indexes, and began marketing their brand of information access to libraries under names like ArticleFirst, FirstSearch, SilverPlatter, Ovid, LexisNexis, ProQuest, H. W. Wilson, and EBSCOhost.

In addition to its impact on information organization and access, the Cold War gave rise to the internet. In the 1960s, US scientists and military strategists worried that a Soviet attack could wipe out the nation’s telephone system. In 1962, computer scientist J. C. R. Licklider proposed a solution in the form of an “intergalactic network of computers” that would continue working even if attacks on one or more computers succeeded (Hafner and Lyon 1996, 38). ARPA, the same US agency that funded the development of the earliest search systems, funded the building of such a network and named it ARPANET. From the 1970s through the 1990s, scientists in the United States and around the world routinely used ARPANET and its successor NSFNET to send messages, share files, and conduct their research on the network’s supercomputers (Abbate 1999).

In the late 1980s, computer scientist Tim Berners-Lee envisioned an online space where information stored across the vast network of interlinked computers (i.e., the internet) could be shared with anyone anywhere (Berners-Lee 1999). By the early 1990s, he had prototyped a suite of tools that transformed his vision into reality. Ensuring the success of Berners-Lee’s new invention, which he called the World Wide Web, were computer enthusiasts who set up web
servers, populated them with hypertext documents, and developed free web browsers that people could use to surf the web.

People searched the web right from the start. Personal computers had become commonplace a decade before, and people became familiar with networked personal computers as a result of using them at school or on the job. Once web browsers began to make searching user friendly in the mid-1990s, individuals were ready to explore on their own. Anyone with access to a computer and a telecommunications connection could publish directly to the web and communicate almost instantaneously with potentially hundreds, thousands, and even millions of people. With so many new and compelling ways for individuals and groups to express themselves and interact with others, the web grew at an astonishing rate. It is now in the trillions of pages and has become the dominant information and communication medium.

Initially, finding useful information meant clicking on a series of web page links until you found something of interest. In the web’s first decade, a series of search engines came into existence, but none was overwhelmingly more popular than another. That all changed in the new millennium when Google rose to prominence, topping the most-popular list, where it still stands today. Google’s initial success at achieving better search results was based on its PageRank innovation, which displayed at the top of the screen the web pages with the most and the highest-quality links to them (a toy sketch of the idea appears at the end of this section). Noticing that the number of web queries far surpassed the number of OPAC queries, librarians understood that the web had become the database of choice and that search engines were the system of choice for many information seekers, supplanting the library’s longtime role in this regard (Fast and Campbell 2004; Yu and Young 2004; Donlan and Carlin 2007). To win users back, library systems companies redesigned OPACs with graphical user interfaces featuring the kinds of elements and icons that users of web browsers had come to expect. Database publishers phased out CD-ROMs in favor of web-based search systems. Academic and public libraries refashioned themselves as places that users could visit physically or virtually, inviting users to their websites, where they
could use the library catalog and subscription-based databases to find needed and wanted information.

Despite the user-friendly interfaces, information seekers found the sheer number of separate search systems and databases daunting, especially in academic libraries (King 2008). Consequently, academic libraries introduced web-scale discovery systems with a Google-like search box on the home page that retrieved material from the library catalog and databases simultaneously.
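Because PageRank comes up so often, it is worth seeing how little machinery its core idea requires. The Python sketch below is a toy illustration only, not Google’s production algorithm or an exercise from this book, and the pages and links are invented. Each page repeatedly passes its score to the pages it links to, so a page that collects links from other well-linked pages rises to the top.

    # Toy PageRank sketch (illustration only; not Google's actual code).
    # links maps each page to the list of pages it links to.
    def pagerank(links, damping=0.85, iterations=50):
        pages = list(links)
        rank = {p: 1 / len(pages) for p in pages}
        for _ in range(iterations):
            new_rank = {p: (1 - damping) / len(pages) for p in pages}
            for page, outlinks in links.items():
                targets = outlinks or pages  # a page with no links shares evenly
                for target in targets:
                    new_rank[target] += damping * rank[page] / len(targets)
            rank = new_rank
        return rank

    # Page C collects links from both A and B, so it ends up ranked highest.
    print(pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]}))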
PROFESSIONAL ONLINE SEARCHING

When self-directed searchers reach an impasse in their quest for information, they are much more likely to ask a friend, a colleague, their instructor, or a family member for help than a librarian (Head and Eisenberg 2010; Markey, Leeder, and Rieh 2014; Thomas, Tewell, and Willson 2017). In fact, the librarian is usually the last person whom library users consult about their information needs (Head and Eisenberg 2009; Beiser and Medaille 2016). Some people even experience library anxiety, “an uncomfortable feeling or emotional disposition experienced in a library setting” (Onwuegbuzie, Jiao, and Bostick 2004, 25), that prevents them from approaching librarians for help. When library users ask librarians for help, they have probably already exhausted the people they usually consult and searched the web. Librarians must be prepared to answer the most difficult questions people have.

People search the web because it is easy, quick, and convenient (Fast and Campbell 2004; Vinyard, Mullally, and Colvin 2017). If they know that the library offers subscription databases (and many don’t), they may complain about how difficult it is to use the library’s search systems and databases. They don’t know which database to
choose, they can’t find the right keywords, they are stymied by the complex interfaces, they are overwhelmed with too many irrelevant retrievals, and they are stumped about how to narrow their topics (Head and Eisenberg 2010; Colón-Aguirre and Fleming-May 2012; Thomas, Tewell, and Willson 2017). Yet people sense that the sources they find through the library’s search systems are more credible, accurate, and reliable than many of the sources they might find on the web (Fast and Campbell 2004; Colón-Aguirre and Fleming-May 2012; Markey, Leeder, and Rieh 2014).

But even the notion that what’s provided by the library is automatically more authoritative than what’s found on the web has evolved. With more research reports and data being made freely available, and with services such as Google Scholar making it easy to identify and locate research publications, it’s necessary to think about the whole universe of online information when making decisions about where and how to find information. A central node in this universe is the library, and the next chapter makes the case for the enduring reliability of the library as a site for quality information.

Not surprisingly, the changes of the past seventy years have altered how libraries function. Throughout libraries and other information-intensive environments, the role of librarians and other information specialists has evolved as the amount of information has grown exponentially; as knowledge organization, standards, and technology have developed; and as users’ sensibilities and expectations have changed. In all kinds of settings, professionals with master’s degrees in library and information science are involved in online searching. In textboxes 1.1 to 1.4, four professionals in different areas of the information industry share their experiences and wisdom. Links to short video interviews with each person are provided at the end of each textbox.

In textbox 1.1, Brenda Linares shares her experience as a health sciences librarian at the University of Kansas Medical Center. She works primarily with faculty, researchers, and students in the School of Nursing. She also provides information services and assistance to the practicing nurses in the hospital. After completing her bachelor of science degree in finance at California State University,
Northridge, she earned a master of library and information science degree from the University of California, Los Angeles. She also has a master’s in business administration from North Carolina State University and an Academy of Health Information Professionals Certification.
TEXTBOX 1.1.
Brenda Linares, Research & Learning Health Sciences Librarian, A. R. Dykes Library, University of Kansas Medical Center

I did the National Library of Medicine Associate Fellowship in Maryland. I got the opportunity to be there with people who were working with and updating and creating PubMed. I got a chance to sit there and see how people were indexing those articles every day. I definitely have signed up for a lot of the PubMed and the National Library of Medicine trainings that they offer for free when they update and new things are added to PubMed. Those are really helpful. There are other databases that we use, such as CINAHL and Embase, and a lot of those vendors have trainings available and I’ve been able to sign up for some of them. There’s always something new that they tweak or add to the database or the interface changes. And so I always like to keep up-to-date with those trainings because it helps to see how things are changing when I teach a class or meet one-on-one with students using those databases.

Another thing that is always helpful is if you have colleagues who have been librarians longer than you. I always like to observe how they teach the databases or any tricks that they know for the databases because they’ve done it for so long that it’s nice to see the approach they take when it comes to searching the databases. That has been helpful in my career when it comes to learning and keeping up-to-date—doing
training classes, observing people, and keeping an eye on any updates that are being done to the databases.

I tend to do a lot of one-on-one consultations with students. That’s when I get a chance to learn about the research question, and I teach them a lot about Boolean searching with AND and OR, how each database works, and how they can work in each database. I like to then teach the students about concept searching. A lot of them think that searching a database is like searching Google, but it’s not. They can’t just write a phrase or a sentence. It’s more concept based. I take my time explaining to them how different that is from Google when it comes to databases.

I help people with literature review assignments that they have. I work with some teams on scoping reviews and systematic reviews, and those tend to take a little bit longer to work with because a lot of the time spent is in the preliminary consultations that I have with them. I need to have a good understanding of what the research question is. What approach are they taking? What is it that they’re including and excluding? A lot of times it helps to have some type of protocol so that we can all follow those steps and be on the same page. In those types of projects, I search a lot because we start with a very broad search strategy, and then as we’re trying to tweak it specifically to what they’re trying to find, search terms can be changed, and terms can be added based on my follow-up meetings with the researchers.

It’s a matter of evaluating what they’re looking for and how each database is organized, because whenever I do work on a systematic review or scoping review, it involves working in more than one database. It requires understanding that each database has its own way of indexing the citations and that often the concepts they use might not be the same. For the most part the databases use similar concepts to index the material, but sometimes there might be a word or a phrase that’s a bit different for one concept. You want to make sure that when you’re searching you know how that information is
organized. You also want to explain that to the researchers, so they understand why the approach might be a little bit different within different databases, but at the end of the day you’re trying to come up with the same results. So I tend to do a lot of showing them how to do the search, but I also like to ask them questions and make sure that they understand what I’m doing, so that hopefully if they have to do something on their own in the future they’ll be prepared. This applies especially when I work with students. They need to understand why I am using quotation marks, or why I am using parentheses.

Something that I always tell people to keep in mind is that searching is not a science, like here are the three steps to take. It’s more of an art, and with each research question you have to take a different approach and have a lot of follow-up questions, especially if you’re a library student or you’re going to be a future librarian. The reference interview is essential for you to understand what they’re looking for because you’re not a subject expert; you know how to search the databases, but you need to understand what exactly they’re looking for. That is so essential from the beginning, asking the right questions and understanding what exactly they’re looking for, because if you understand that, then you can translate that research question into the best search strategy that you can come up with as you’re working together.

In terms of the researchers or the students, I always tell them that they’re probably not going to find what they’re looking for the first time around. It’s a lot of trial and error and becoming familiar with the databases. The more you do it, the more comfortable you become with the concepts and how the information is organized, and you don’t get frustrated. As long as you have the basic understanding of how the information is organized, how the ANDs and the ORs work, and you understand the subject headings which are involved in how the information is indexed, you have a good foundation, because you might have different searches for different research
questions that you’re going to be working on throughout your classes and your profession. (Edited for clarity and length)

https://vimeo.com/723849260/3e58790fd2
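Linares’s distinction between typing a Google-style phrase and concept-based Boolean searching is easy to model in miniature. The Python sketch below is a toy illustration, not any vendor’s system, and its terms and document IDs are invented: each entry in a small inverted index maps a term to the set of documents indexed under it, and AND and OR act as set intersection and union, the same relationships the Venn diagrams of chapter 5 depict.

    # Toy inverted index (invented data): term -> set of document IDs.
    index = {
        "nursing":    {1, 2, 5},
        "burnout":    {2, 3, 5},
        "exhaustion": {3, 4, 5},
    }

    # nursing AND burnout: only documents indexed under both terms
    print(index["nursing"] & index["burnout"])                          # {2, 5}

    # burnout OR exhaustion: documents indexed under either term
    print(index["burnout"] | index["exhaustion"])                       # {2, 3, 4, 5}

    # nursing AND (burnout OR exhaustion): one concept per parenthesized group
    print(index["nursing"] & (index["burnout"] | index["exhaustion"]))  # {2, 5}

Grouping synonyms with OR inside parentheses and joining concepts with AND is exactly the facet-by-facet combination that later chapters build on.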
Earnrolyn “Lynn” Smith, whose advice can be found in textbox 1.2, is an academic customer success manager at LexisNexis, where she has worked for more than 20 years. She has a bachelor’s degree in history with a minor in political science from Loyola University in New Orleans, a master’s degree in information and library studies from the University of Michigan, and a JD from Saint Louis University’s School of Law.
TEXTBOX 1.2.
Earnrolyn “Lynn” Smith, Academic Customer Success Manager, LexisNexis

My job is to provide training and consulting work for our customers, primarily in the academic market. On average, I am conducting about three to four webinars a day strictly focused on searching one of our products, whether it’s one of our academic products or one of our corporate products that an academic customer has purchased. My job also entails helping customers to create search queries and learn search strategies. So, basically, that’s my job—always assisting someone with how to do online searching, primarily in a LexisNexis product. Now, one of the wonderful things about online research is that when you have the foundation of how to search, you can pretty much search any database.

I get asked to speak to students about “non-traditional” careers, and one of the things I can say is that a lot of librarians don’t work in a library. They may work in a corporation or work
for the government where they are doing research. I always like to talk about the Library of Congress, and within the Library of Congress, a lot of people don’t know this, you have the Congressional Research Service (CRS). Many of the individuals who work in CRS are librarians, but their primary job is to go out and do research for Congress. I mean, that’s their customer. That’s their patron, Congress. I don’t think people realize that here you have this group within our government system and their job is strictly to do research on whatever topic a member of Congress needs information on. That’s not an avenue that a lot of people think about when they think about librarianship.

You know, most people, I think they get the general idea that when you’re searching, you’re doing full-text keyword searching, and I like to tell people that full-text keyword searching is a blessing, but it’s also a big-old curse. The two key things I like to remind new librarians and also older librarians about is segment searching and field searching, because the placement of a search term will make that document more relevant, more on point. So, if you’re searching newspapers, you want to search the headline, the title, etc. Search terms that are mentioned in that part of the document are truly going to make the document more relevant.

Training is ongoing, because all database providers are constantly making enhancements to their products to give the end user a better research experience. Some of the fundamentals of a product that you learned in library school are always going to be there, but they’re constantly trying to make the product better. Whenever someone is offering you training on a product, take it. Even when you think you know a product, trust me: if you pay attention, you’re going to pick up at least one thing that you didn’t know was available. Training doesn’t stop once you leave graduate school. (Edited for clarity and length)

https://vimeo.com/723848930/099789dd56
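Smith’s point about term placement, that a match in the headline or title field signals more relevance than a match buried anywhere in the full text, can also be modeled in a few lines. The Python sketch below is an invented illustration, not LexisNexis functionality; the records and the search function are made up for the example.

    # Toy illustration of field (segment) searching; the records are invented.
    documents = [
        {"title": "Aircraft Manufacturing Trends",
         "text": "Boeing and Airbus ramp up production."},
        {"title": "Airline Meal Reviews",
         "text": "Meals served on most aircraft have improved."},
    ]

    def search(docs, term, field=None):
        # Match the term in one named field if given; otherwise anywhere.
        term = term.lower()
        if field:
            return [d for d in docs if term in d[field].lower()]
        return [d for d in docs if any(term in value.lower() for value in d.values())]

    print(len(search(documents, "aircraft")))                 # 2: full-text match
    print(len(search(documents, "aircraft", field="title")))  # 1: title-field match

Restricting the term to the title field screens out the record that mentions aircraft only in passing, which is the precision gain Smith describes.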
Textbox 1.3 contains Magali Sanchez’s thoughts about her work as a public school library media specialist. She has a bachelor’s degree in ethnic studies with an emphasis in Chicanx studies and a minor in Spanish from Oregon State University. Her master’s in library and information science is from the University of Arizona.
TEXTBOX 1.3.
Magali Sanchez, School Library Media Specialist

I’ve been teaching students that are under the age of twelve, and it’s really important for them to be able to filter out information and find information that’s accurate. A lot of the students like going on Google and just typing in their whole question, which can give you quick answers for important things. Like, “How long is a mile in meters or inches,” right? But there are other instances where they are actually looking for information about animals or historical facts, and although you can find those quickly, sometimes getting more details or more information around the subject can be more beneficial than just getting a quick answer. Helping them know that quick answers are good for certain things, but not everything, is really challenging at times because it’s so easy to just go on Google.

Recently, we’ve been going on online encyclopedias and looking at how that information is sorted and the different features that these databases can actually offer. A lot of students may or may not be fluent readers yet. Having the ability to press “play” and listen to the article being read to you is so fascinating to them. Even having the letters be enlarged. The accessibility to them has made online searching kind of fun.

For me, it’s been a challenge because I learned online searching as an adult. Looking at databases as an adult versus as a student or an elementary school student can be very different. I have done online searching for my own personal
professional development, looking for ways to create my lessons and for them to be accessible to the different learning types that we have in students.

Libraries are always changing, and so I’ve had to produce data for our principal and for other people in leadership in our school to advocate for the importance of libraries and of us having the time to teach students these online searching skills and why it’s so important to start at elementary school and not wait until high school to teach those skills.

It can be intimidating to even start thinking of keywords to search, or about how you’re going to approach or answer a question. Just dive in, start typing, because more things will pop up as you start looking and it will help narrow it down, even if you start super broad. (Edited for clarity and length)

https://vimeo.com/723848614/c706d75a38
Marydee Ojala, editor in chief of Online Searcher magazine, provides some historical perspective in textbox 1.4. She has run her own research and writing business and has worked at corporate, academic, and public libraries. She also does program planning for the Internet Librarian and Computers in Libraries conferences and is in demand as a speaker for Special Libraries Association groups and events as well as other international conferences. Her undergraduate degree in English is from Brown University, and she has a master’s in library science from the University of Pittsburgh.
TEXTBOX 1.4.
Marydee Ojala, Editor in Chief, Online Searcher
I started working in my public library when I was sixteen, so I sort of grew up in libraries, and when I was in graduate school, that was really the dawn of online searching. That was when people were still trying to figure out what it was all about. And in case anybody has ever wondered why some of our subscription databases are so highly structured, it’s because back in the late seventies, when online searching came on the scene, the only people who ever were going to use it, went the theory, were librarians, and we tend to be highly structured in what we want to see in an information source, and so that’s how they were designed.

My first job out of library school was at a large multinational bank. Most of our research was done on companies, industries, products, and strategic planning for financial services, but it was primarily business research, and that is a very different animal than general research. More recently, what I do now, in addition to business searching—because I do write “The Dollar Sign” column for Online Searcher magazine, which is all about business searching, business research—as an editor, I do a lot of fact-checking. We publish articles on any number of different topics, covering loads of different disciplines, and fact-checking is really important when you’re doing this. What I’ve learned from doing fact-checking is how differently searchers think about different topics. It’s not just the sources, it’s how we conceive of a research question, and I actually find that quite fascinating.

People have a tendency to ask for things they think you can do. They have no clue about what you really can do. So you have to talk them through it and recognize how different the web search technology is from our subscription databases. As professional librarians, we need to know how the databases are structured, and I will say that a business database that has company information or industry information is structured quite differently from a database about general articles, even if they’re from business-type publications. When you’re doing business research, you need to think about things like the
industry codes, such as NAICS [North American Industry Classification System], or the difference between a public company and a private company. Never promise somebody you can get full financial information about a private company, because you can’t do it. With public companies you can; they have to tell you and they tell the government, and so yes, that’s obtainable. At one point, I was actually teaching business research as an adjunct professor for a couple of different universities. And I think that was one of the hardest things to get across to the students—the basics of business. Because until you know the basics of business you can’t really do a good online search, and that’s true whether you’re looking at subscription databases or whether you’re looking at a general web search. Suppose somebody comes up to you and says, “I want information on airplanes.” Well, that’s not enough information. Maybe they want to know about manufacturing. Maybe they want to know about flight schedules. Maybe they want to know about crash reports. There are all kinds of things they might want to know about. They might even be interested in that movie, who starred in the movie Airplane. You don’t know until you ask them, until you work it through, and what is it that you really want to know, and have you thought about this aspect, have you thought about that aspect, and then translate that into language for searching. The fact is that if you were looking for information on airplane manufacturing, you’d probably be better off with the word aircraft instead of airplane. It’s not necessarily that specialized a vocabulary, but when you get into business in particular, very subtle differences in words make a difference in how valid the information you retrieved is to your end user. People who are not librarians think they are expert searchers. This is very dangerous because by and large they are not. But if we are going to say that we are expert searchers, we need to make sure that we really are. We need to understand the difference between a subscription database and the web. We need to understand the differences among all of the
different websites that we might access. We need to be able to evaluate the information we retrieve. We need to understand that Boolean searching works great sometimes, but not always. We need to know when to employ Boolean logic and when to walk away from that. (Edited for clarity and length) https://vimeo.com/723848197/edaafcb3aa
What shines through in the words of all four of these information professionals is a sense of wonder: at the behind-the-scenes view of indexing in a major health sciences database; at the utility of limiting search words to only one section of a document; at the magic of having a database read an encyclopedia entry to a child; and at the striking differences between information resources for business research versus general research. Here, you might be thinking, are areas of professional practice that can remain interesting for the entirety of a long career. And you would be right.
SUMMARY
This first chapter has a historical bent, looking backward at the factors that gave rise to online searching as a way to harness the information explosion and forward to the technological advances that have steered online searching onto its present course. Online searching was born during the Cold War, when computers developed from stand-alone behemoths into networked personal devices and when information-organization standards were developed to make sharing resources possible in the networked environment. The first online search systems and databases were expensive and difficult to search, so their users were librarians who were trained as
expert intermediaries to search on behalf of others. It wasn’t until the 1980s that library users could perform their own searches. Despite the difficulty of querying early search systems, users welcomed the opportunity to search for information on their own. The internet was a by-product of the Cold War. What made it so usable was Tim Berners-Lee’s several inventions that gave birth to the web. Infinitely more versatile than the printed page, the web was an overnight success. Not only could people access digital information, but with a nominal investment in a computer, internet connection, and web browser, they could become publishers, posting anything they wanted to say and communicating instantaneously with anyone who cared to read, listen, look, watch, play, share, or like their messages. Shortly after the turn of the century, Google became the search engine of choice, and Google searches far outnumbered library searches. This prompted librarians, library services companies, and database vendors to improve their systems with graphical user interfaces and federated searching that retrieved results from the library’s subscription databases along with records from the catalog. Librarians have driven many of these changes, and they have deftly handled, even leveraged, the changes driven by other forces. As the four information professionals interviewed for this chapter suggest, online searching is a dynamic field of endeavor dependent not only on technology but also on the human element in every quest for information.
2
Accessing Quality Information at the Library Website
Library websites are as important as physical library spaces; the two together offer collections and services in support of intellectual and physical access to research, reading, learning, and recreation. Search runs through it all to a greater or lesser degree, depending on the library. Since search is fundamental to learning, academic library home pages put search boxes front and center. Public library home pages, reflecting their institutional mission, tend to display the search box in the upper-right corner, while the center of the screen features upcoming events and new books. A school library may have a simple web page that includes a link to the state library website, where centralized electronic information resources are available to all state residents. The resources for students offer search boxes for grade-appropriate results, but just as essential, especially for younger students, are the links for browsing preselected topics. Depending on the library, tools may include an advanced search screen with multiple search boxes in place of the basic screen with a single search box. Search results may include items only from the library’s online public access catalog (OPAC) or additional material from the commercial databases the library subscribes to, as well as from other sources. The exact configuration of the search tools and what they retrieve varies depending on the type of library, the extent of the library’s holdings and subscriptions, and the financial and staffing resources available to customize and maintain the system.
In whatever ways search and retrieval functions are configured, the library website provides access to quality information. To learn about, assess, and choose the most useful tools and collections for their communities of users, librarians work with companies in the information industry. Publishers, library services vendors, and database producers and aggregators often consult with librarians when introducing new products or revising existing ones. On top of these companies’ own efforts to make authoritative information findable, librarians evaluate content and tools to ensure that the library’s users have access to quality information aligned with their interests and goals. This chapter describes some of the ways in which library websites present options for accessing authoritative and credible information. For a more interactive experience as you read this chapter, you may want to go to the website being described and navigate to the features being discussed. You should also explore a favorite or familiar library’s search tools that are accessible from its website. Types of search tools and their names vary from library to library. Online Searching gives multiple examples of search tools but can’t be comprehensive because there are so many variations. You’ll have to experiment on your own, click on help, or ask a librarian at that very library for assistance. You may also want to identify the search tool vendor and go to its website to read more about its search products and services.
SEARCH TOOLS AT THE ACADEMIC LIBRARY’S WEBSITE
While it’s possible to use a popular search engine to find authoritative information on the web, it can take a lot of effort to craft successful searches and to sift the quality information from the inaccurate, false, and misleading material whose sources are obscure or unnamed. At the same time, people searching the web appreciate the ease of using keywords in a single search box and the relevance ranking of results. Consequently, search tools at the academic library website have evolved to resemble Google’s search interface by offering a single search box to gain access to a wide variety of results. Web-scale discovery (WSD) systems, or library discovery systems, can be counted on to retrieve authoritative information resources deliberately curated for quality. Behind the simple WSD search interface are a sophisticated indexing system and link resolver functions that make it possible for the user to identify and access material relevant to their research interests (Day 2017). To familiarize you with the search tools at an academic library’s website, Online Searching features the library discovery system, Easy Search, at the University of Illinois at Urbana-Champaign (UIUC). Figure 2.1 shows the Easy Search interface at https://www.library.illinois.edu/. Easy Search dominates the screen with its placement on the top left of the page, where your attention is likely to rest. To the right of the search box is a drop-down list that enables you to search broadly or limit the search to a specific subject area. Atop the search box are tabs that you can activate to generate results from searches of everything, books, articles, journals, or media. Scrolling to the bottom half of the home page, you will find icons with links to the library catalog, databases, digital collections, and other categories of information resources with their own accompanying search tools. As is typical of academic library home pages, there are links to information about the library and its services, as well as a chat box for asking questions and getting answers from a member of the library’s reference and information services staff. Choosing Easy Search’s “Everything” tab retrieves results from the library discovery system, including books, journal articles,
journals, and media. Easy Search’s everything search displays results in a bento-box arrangement, dividing them into units labeled by type such as articles, catalogs, subject suggestions, and other resources (Mischo et al. 2018). The everything search is Google-like in the sense that the user can input a few keywords and get lots of results. For a more refined search, you can limit results to books, articles, journals, or media by choosing the tabs shown above the search box.
Figure 2.1 University of Illinois Library Easy Search interface. Courtesy of the University of Illinois Board of Trustees.
Easy Search’s “Everything” tab is the default for searching digital resources at UIUC. Of course, Easy Search doesn’t search everything, but the label conveys to users that it is the most comprehensive search that the library’s discovery tool has to offer. To find out what the “Everything” tab searches, click on the “What am I searching?” link under the search box. You can also click on each tab and then click on the question “What am I searching?” to find out which resources are searched, by type and subject area, when using each tab. Generally, an academic library’s “everything”
search retrieves surrogates and sources from its subscription databases along with surrogate records, e-books, and possibly full-text government documents from its catalog. At some libraries, the everything option searches the institutional repository and one or more digital collections as well. In figure 2.2, Easy Search’s everything search results are arranged in a bento-box format showing results in categories for a search using the keywords effect of sleep on academic achievement in college. To the left is a list of the journal articles retrieved by the search, and to the right are the books. Scrolling down, suggestions for additional resources are given, followed by results from a Bing search for your terms. The more general the user query is, the more suggestions Easy Search puts into the suggestions box located at the top of the results display. Common suggestions are to use the linked Library Resource Guides and the University Library’s Ask-a-Librarian online chat for help. Many articles and e-books have links to pdfs and e-readers so users can go directly to full texts instead of lingering over surrogates along the way. On the bottom right are other resources (such as Crossref, WorldCat, and Google Scholar) that are external to the library but promising for finding more information on the topic.
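The one-click full-text links in displays like this are typically generated by the link resolver mentioned earlier. The short Python sketch below shows the general idea under stated assumptions: the resolver base URL and the keys of the citation dictionary are hypothetical, while the query parameters follow the OpenURL (ANSI/NISO Z39.88) key/encoded-value convention that many library link resolvers accept.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; each library substitutes its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

def openurl_for_article(citation):
    """Build an OpenURL query string for a journal-article citation.

    The resolver reads this metadata, checks the library's subscriptions,
    and sends the user to an entitled copy of the full text (or to an
    interlibrary loan form when no copy is available).
    """
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.atitle": citation["article_title"],
        "rft.jtitle": citation["journal_title"],
        "rft.volume": citation["volume"],
        "rft.issue": citation["issue"],
        "rft.spage": citation["start_page"],
        "rft.date": citation["year"],
    }
    return RESOLVER_BASE + "?" + urlencode(params)

print(openurl_for_article({
    "article_title": "An Example Article",
    "journal_title": "An Example Journal",
    "volume": "42", "issue": "4", "start_page": "390", "year": "2016",
}))
```

Whether the resulting link leads straight to a pdf or to a "look for full text" page depends on the resolver's knowledge base of the library's holdings, which is why the same search can behave differently at different institutions.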
Figure 2.2 University of Illinois Library Easy Search results screen. Courtesy of the University of Illinois Board of Trustees.
The moment you click on an accompanying link for the full text, Easy Search prompts you to enter a University of Illinois username and password. You should be able to use any library’s search tools, but the moment you click on a full-text link from a subscription database, you will be prompted to authenticate your affiliation with the institution. “Everything” isn’t the only Easy Search option. Choose “Books” instead and enter the query effect of television violence on children. You can limit the search to author or title words from the drop-down list, but leave it at the default “Keyword” search for now. Easy Search responds by placing results from the library catalog and from the I-Share Catalog in the bento box’s left and center units,
respectively. (I-Share is a union catalog of the holdings of Illinois-based research, college, university, and community college libraries.) Catalog results are accompanied by real-time availability indicators, format designations (e.g., book, e-book, journal, DVD, movie), and direct links to e-books. Easy Search fills bento-box compartments with content suggestions, subject suggestions, and additional resources with useful information. When you browse beyond the first page of library catalog results, you leave Easy Search and go to the catalog’s discovery interface. Choose Easy Search’s “Articles” tab to search for journal articles indexed by the library’s subscription databases. Enter the query effect of television violence on children. Use the drop-down list to limit the search to a specific subject area or leave it at the default multi-subject search for the greatest number of results. Easy Search responds to the default by placing wide-ranging results on the left and those from the Scopus database in the center. Choose Easy Search’s “Journals” tab to search for journals and journal articles. You won’t find journals focused entirely on or named for our two sample queries, so simplify your queries, entering single words such as children, violence, television, or sleep. The system responds with two sets of results: (1) journals bearing your entered word in their titles, and (2) journal articles bearing your entered word. The first set is useful for finding full texts when a full-text or pdf link fails to retrieve the whole article automatically. It can also be useful for browsing issues of a journal to find articles. A better approach for retrieving journal articles by topic is to use the “Articles” tab instead. Easy Search’s “Media” tab limits results to movies, DVDs, music recordings, software, and computer files in the library’s catalog. Easy Search encourages users to search comprehensively, gives them one-click access to actual sources, blends library information and assistance into the results display, and extends retrieval to quality open-access resources. Other academic libraries offer the same kind of approach, although the search interface and results display may differ depending on the commercial platform being used. For example, the
Wayne County Community College District Learning Resource Center home page includes a library catalog link, where the search box offers a drop-down list for selecting the catalog only, articles only, or everything. On the results page of an everything search, retrieved items are listed down the center of the page, each with a small label indicating whether it is a book, article, video, or other format. Figure 2.3 shows the first three results of an everything search for the keywords judy chicago (using the artist’s name as keywords). To the right of this list are filters for limiting results to specific document types (peer-reviewed journal articles only, for example), subjects, and publication dates, among others. Whatever the academic library’s discovery system is called and however it displays results, the common factor is the scale of the everything search, which is far more expansive than any single catalog, index, or collection.
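To make the bento-box arrangement concrete, here is a minimal, runnable Python sketch. The three search functions are hypothetical stand-ins for calls to a catalog, an article index, and an institutional repository (a real discovery system queries a central index and other back ends); what matters is the shape of the output, with each source keeping its own labeled compartment rather than being merged into a single ranked list.

```python
# Hypothetical back ends: each stub returns canned records so the
# aggregation logic runs as-is.
def search_catalog(query):
    return [{"title": "Television and Child Development", "format": "book"}]

def search_articles(query):
    return [{"title": "Media Violence Effects on Young Viewers", "format": "article"}]

def search_repository(query):
    return [{"title": "A Campus Study of Children's Screen Time", "format": "preprint"}]

def bento_search(query, per_box=5):
    """Fan one query out to every source; keep each source's results in
    its own labeled compartment, mirroring the bento-box display."""
    sources = {
        "Articles": search_articles,
        "Books": search_catalog,
        "Campus Repository": search_repository,
    }
    # Compartments are never blended into one ranked list; that separation
    # is what distinguishes a bento display from a single results list.
    return {label: search(query)[:per_box] for label, search in sources.items()}

for label, results in bento_search("effect of television violence on children").items():
    print(label, "->", [r["title"] for r in results])
```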
SEARCH TOOLS AT THE PUBLIC LIBRARY’S WEBSITE
Unlike academic library home pages, public library websites usually spread search tools across several web pages. Rather than offering an everything search, a public library home page will offer a search box where users enter a query that retrieves results from the library catalog. A good example is the home page of the Pima County Public Library in Tucson, Arizona, which features a catalog search box at the top (figure 2.4). Three paths to electronic databases and collections are present on the home page: clicking on “E-Library” in the navigation bar, using the E-Library A–Z quick link on the right side of the screen, and scrolling down almost to the bottom of the home page. Other resources for browsing and searching are at the homework link in the navigation bar and are also listed at the
comprehensive E-Library link. Very large public libraries may offer digital collections related to local history and culture. On the home page of the Los Angeles Public Library (LAPL), for example, there’s a link to “Education & Research” in the navigation bar under the search box. It takes the user to a web page listing several types of resources, including LAPL’s special collections, which include a searchable digital collection of California-centric historical and rare materials such as fruit crate labels and movie posters.
Figure 2.3 Search results with document-type labels. Courtesy of Wayne County Community College.
Figure 2.4 Pima County Public Library home page. Courtesy of Pima County Public Library.
SEARCH TOOLS AT THE STATE LIBRARY’S WEBSITE
State libraries collect and preserve publications and records of state government agencies. They also function as hubs in a statewide network of public and school libraries. Residents can visit their state library’s website directly, but it’s more likely that people will simply follow the links on their local public or school library websites to access the databases provided by the state library. New Mexico State Library’s El Portal (figure 2.5) is a good example of the ways in which a state library can provide students at every level with access to databases and other information resources that help them complete assignments. Resources include grade-appropriate databases from Gale, a Cengage company, which
publishes them for the education market. Many college and university libraries subscribe to the subject-specific Gale in Context suite of databases; K–12 students who use Gale databases for their homework will develop familiarity with the interface before they undertake undergraduate studies. Clicking on one of the Gale in Context databases or any of the other subscription resources will lead to a page for authenticating that the requestor is a New Mexico resident. The “All Resources” link in the upper-right corner of the El Portal home page leads to a list that includes all of the Gale databases available to New Mexico residents, including the Informe Académico index of Spanish- and Portuguese-language material. Among the other resources in the list is the Newsbank database, which indexes many newspapers, including three based in New Mexico. Another example is the Arizona State Library, Archives, and Public Records website. It includes the Digital Arizona Library page with clickable categories for databases by educational level, along with other electronic resources (figure 2.6). Among the databases available are the Britannica Library and several from Gale, including General OneFile, Books and Authors, and Opposing Viewpoints.
Figure 2.5 New Mexico State Library’s El Portal home page. Courtesy of the New Mexico State Library.
Figure 2.6 Resources for Learners, Digital Arizona Library. Courtesy of Arizona State Library, Archives and Public Records.
THE ROLE OF THE INFORMATION INDUSTRY
Libraries work together with nonprofit organizations and for-profit firms in the information industry to put search tools in their users’ hands. It all starts with vendors who market proprietary and open-source integrated library systems (ILS) or library services platforms (LSP) to libraries, enabling them to automate their most important functional operations including acquisitions, cataloging, circulation,
interlibrary loans, public access, and serials control. As academic libraries now emphasize access to electronic resources over acquisition of print materials, vendors have developed cloud-based LSPs capable of handling print, electronic resources, and digital collections in a single system that includes the ability to generate use metrics that support collection management decisions. Some university libraries are transitioning away from Voyager, originally developed by the for-profit ILS vendor Endeavor Information Systems, which later merged into Ex Libris Group and was subsequently acquired by ProQuest. Many academic libraries are switching to Alma, the Ex Libris LSP, which works well with the Ex Libris discovery system, Primo. By the spring of 2021, more than two thousand academic libraries had begun using Alma. A simultaneous development is the creation of FOLIO, an open-source LSP involving a collaboration among libraries and vendors. In particular, EBSCO began offering an instance of the software branded EBSCO FOLIO Services in 2020, available to libraries that prefer to let an outside company handle training, support, troubleshooting, and development of an open-source product (Breeding 2021). While cataloging librarians create surrogate records for books, media, and periodicals, they have left the creation and organization of surrogate records for periodical articles to others in the information industry. Database creators, publishers, and vendors create the search system, which includes not only the structured data in the surrogates but also the indexing, interfaces, searching language and protocols, and facet filters that make scholarly and trade journal articles, magazine stories, and news content discoverable. Some databases for the academic market provide multidisciplinary scholarly sources, while others specialize in the published literature of a discipline, such as chemistry, education, or psychology. Databases come from governmental, for-profit, and nonprofit sectors. Especially important are professional societies, such as the American Psychological Association and the Modern Language Association, that have assumed the role of publisher, producing databases that serve their membership specifically and
entire professions generally. Social media has also given rise to databases, with users serving as contributors when they add citations, media, tags, or biographical information to websites such as LibraryThing, YouTube, and LinkedIn. Database publishers employ professional staff to select, describe, and organize content. Database publishers might index all of the articles in each issue of a periodical or they might index selectively to include only the material appropriate to a subject-specific database. For each article, they generate a citation, and most add subject terms from a controlled vocabulary that represents the article’s main topics and an abstract that summarizes the article. Full texts come from journal publishers or from aggregators with licenses allowing them to distribute journal publishers’ digital content. Full-text pdfs look like printed pages in a journal, magazine, or newspaper. Databases and repositories storing numeric data may offer downloadable spreadsheets as well as pdfs of statistical tables. Some databases index news broadcasts, documentaries, and other types of videos. Information seekers can search a database using the publisher’s own search system, such as a member of the American Psychological Association searching the database on the organization’s website, or they can search through a database vendor with a license allowing them to provide access to the database on their own platform. The vendor may be a database aggregator providing many different databases on the same platform, where each can be searched singly or simultaneously with others on the platform. Some database aggregators are also database publishers, adding their own databases into the mix. For example, EBSCOhost allows multiple databases to be searched at one time, including EBSCO’s Academic Search Ultimate as well as the American Psychological Association’s APA PsycArticles, the National Library of Medicine’s MEDLINE, and the University of California at Berkeley Ethnic Studies Library’s Chicano Database. Users can also select only one or a group of related databases to search via the EBSCO platform.
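The following sketch shows what such a surrogate record might look like as a Python data structure. The field names are illustrative rather than any vendor's actual schema, and the article and journal are invented; the pattern to notice is the combination of citation data, controlled-vocabulary subject terms, and an abstract, which is what makes fielded searching possible.

```python
# Illustrative surrogate record for one journal article (hypothetical data).
surrogate = {
    "title": "Televised Violence and Aggressive Play in Young Children",
    "authors": ["Lastname, Firstname"],
    "source": {"journal": "Example Journal of Media Studies",
               "volume": 12, "issue": 3, "pages": "101-118", "year": 2021},
    # Subject terms assigned by an indexer from the database's controlled
    # vocabulary, so every article about the concept carries the same term
    # regardless of the words its author happened to use.
    "subjects": ["Violence in mass media", "Television programs",
                 "Child psychology"],
    "abstract": "Reports an observational study of aggressive play ...",
    "document_type": "journal article",
    "peer_reviewed": True,
}
```

Because the subject terms are assigned from a thesaurus, a subject search on violence in mass media retrieves this record even if that phrase never appears in the title or abstract.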
Libraries are deliberate about the databases they choose for their information seekers, and they draw on knowledge of their user groups and communities when making selection decisions. The process starts with librarians and prospective database users evaluating the database. Some of the evaluations that librarians have conducted are reported in the professional literature, making their methods and conclusions available to others facing selection decisions (e.g., Ritchie, Banyas, and Sevin 2019). Although the evaluation criteria may be applied differently in different contexts, librarians generally agree on several desirable features: clarity of the interface, quality of the relevance ranking of results, incorporation of a subject thesaurus, full-text availability, and ease of use so that librarian assistance and instruction are not required (Calvert and Jordan 2021). Databases that pass the evaluation proceed to license and subscription negotiation, which may be carried out by a single library or by a consortium of libraries. Licensing involves the database vendor’s restrictions on use, such as requiring authentication that a user is affiliated with the university or library. The final three steps pertain to both licensed and open web databases: profiling the database at the library’s website, marketing it to prospective users, and maintaining it in the library’s search tools. Be prepared to evaluate new databases and shoulder responsibilities for profiling, marketing, and maintaining some of your library’s databases. At the largest libraries, one librarian may serve as the library’s chief electronic resources officer, competent at managing the business connected with databases and their evaluation, selection, licensing, renewal, or deselection (NASIG 2013). Library discovery systems are an alternative to searching databases, catalogs, and collections individually, allowing users to search a large portion of the library’s digital collection simultaneously. Libraries have a choice of systems: EBSCO Discovery Service, with versions tailored to academic, public, and school libraries; Ex Libris Primo; Ex Libris Summon; and WorldCat Discovery.
Because Google has indexed books (books.google.com) and academic journals (scholar.google.com), its search system finds not only web pages but also books and articles. Google is not licensed to provide direct full-text access to copyrighted articles available behind paywalls at the publisher’s website or in a subscription database. Even though library discovery tools have the easy functionality of a Google search and retrieve the massive numbers of results web searchers have become accustomed to, they present a quandary: how to choose the best tool for the task and then craft a search that produces far fewer but better results. Many of the library users who come to you for help may have little to no experience searching databases and may not understand how an everything search at the library website differs from the everything search of Google. At the same time, many other library users will have already used your library’s search tools to find some, but not all, of the information they need. Consequently, you may be helping one person learn the basics and helping another find more in-depth or obscure information. You might start by retracing their steps in the most popular and obvious search tools (i.e., the discovery system, the library catalog search box, and the subscription databases). When these fail, you’ll have to determine if the search tool, the search strategy itself, or both are the problem. You may need to go further afield, checking search tools users are less likely to consult such as the library’s digital assets management system, an open-access institutional or subject-based repository, or a federal government website. Searching databases individually should be high on your list because their interfaces, search systems, and structured information can provide functionality that’s not available in massive multidisciplinary discovery systems. Finding and using the right search tool—and using it expertly—to answer a user’s query is a difficult task, but it will become easier as you gain experience.
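As a toy illustration of the functionality that structured records make possible, the sketch below contrasts a fielded subject search with a plain keyword search, assuming records shaped like the hypothetical surrogate sketched earlier in this chapter.

```python
def subject_search(records, term):
    """Fielded search: match only the controlled-vocabulary subject field."""
    term = term.lower()
    return [r for r in records
            if term in (s.lower() for s in r["subjects"])]

def keyword_search(records, word):
    """Keyword search: match the word anywhere in the title or abstract."""
    word = word.lower()
    return [r for r in records
            if word in r["title"].lower() or word in r["abstract"].lower()]

# The fielded version finds every record indexed under the concept, even
# when the record's own wording differs, and skips records that merely
# mention the word in passing; the keyword version does the opposite.
```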
QUESTIONS
These questions are meant to give you hands-on experience exploring search tools at library websites. Answers conclude the chapter.
1. Go to an academic library’s website. Does it feature an everything search, and if it does, what is it called? What sources does it search? Do a keyword search. Does the search system use a bento-box display? If it does, what are the names of each compartment, and what are the sources it searches to produce each compartment’s results? If it doesn’t, how does it order results? How far can you drill down before you must authenticate your affiliation with the library?
2. On the same academic library’s website, what do you have to do to search only the library catalog? How many links or tabs do you have to click on to get to the catalog search interface? What are the labels for each link or tab and what is the catalog itself called? Do a keyword search. How are results displayed? How far can you drill down before you must authenticate your affiliation?
3. What link or tab do you click on to display open-access and subscription databases? What tools are given to help you select a database for your topic? How far can you drill down before authentication is required?
4. Now look for the institutional repository (IR) of the university or college. (Not all have one, so you may have to visit the website of a top research university.) How did you find it? What kinds of materials are in it—technical reports, dissertations, journal articles? Do a keyword search. How are results displayed?
5. Visit the website of a public or state library. On what link or tab do you click to display databases? What tools are given to help you select a database? Are there separate tools for different age groups? For different information needs? How far can you drill
down before you must authenticate as a library cardholder or state resident?
SUMMARY
Library buildings and library websites are both important to the user’s experience and to the user’s ability to access quality information. Library websites make it possible for information seekers to search the library’s subscription databases online, without entering the building. At most academic library websites, that means using the single search box on the home page, which retrieves items from the library catalog, the licensed databases the library subscribes to, and other resources such as the campus repository of open-access material by researchers and others affiliated with the university. At the public library website, it’s more common to have a single search box that retrieves items only from the catalog. At the state library website, a web page may list all of the subscription databases available to residents of the state, often categorized by age group to serve teachers and students.
REFERENCES
Breeding, Marshall. 2021. “2021 Library Systems Report.” American Libraries, May 3. https://americanlibrariesmagazine.org/2021/05/03/2021-library-systems-report/.
Calvert, Kristin, and Whitney Jordan. 2021. “Gone, but Not Forgotten: An Assessment Framework for Collection Reviews.” In
Ascending into an Open Future: The Proceedings of the ACRL 2021 Virtual Conference, April 13–16, 2021, edited by D. M. Mueller, 29–39. Chicago: American Library Association. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2021/GoneNotForgotten.pdf.
Day, James M. 2017. “Discovery Services: Basics and Resources.” Library Technology Launchpad, March 8. https://libtechlaunchpad.com/2017/03/08/discovery-services-basics-and-resources/.
Mischo, William H., Michael A. Norman, Mary C. Schlembach, B. R. Bernhardt, L. Hinds, and L. Meyer. 2018. “Innovations in Discovery Systems: User Studies and the Bento Approach.” In What’s Past Is Prologue: Proceedings of the 37th Charleston Conference, edited by Katina P. Strauch, Beth R. Bernhardt, Leah H. Hinds, and Lars Meyer, 299–304. West Lafayette, IN: Purdue University Press.
NASIG. 2013. Core Competencies for Electronic Resources Librarians. https://www.nasig.org/Competencies-Eresources.
Ritchie, Stephanie M., Kelly M. Banyas, and Carol Sevin. 2019. “A Comparison of Selected Bibliographic Database Search Retrieval for Agricultural Information.” Issues in Science and Technology Librarianship 93 (Winter): n.p. https://doi.org/10.29173/istl48.
SUGGESTED READING
Wong, Sandra. 2020. “Web-Scale Discovery Service Adoption in Canadian Academic Libraries.” Partnership: The Canadian Journal of Library & Information Practice & Research 15, no. 2: 1–24. https://doi.org/10.21083/partnership.v15i2.6124.
ANSWERS
1. The everything search tool is more characteristic of an academic library than a public library because of the former’s emphasis on electronic resources for research. An academic library’s everything search likely retrieves journal articles from its databases, surrogates from its catalog, and digital content from its IR. UIUC’s everything tool goes beyond UIUC, searching multidisciplinary databases for books, e-books, journal articles, newspaper stories, and open-access resources. Results are augmented with a variety of full-text links, with some links implying one-click access to full texts (e.g., “pdf full-text,” “pdf full-text from publisher,” and “Get pdf”) and other links implying that full texts aren’t immediately available (e.g., “Request through interlibrary loan,” “Full text finder,” and “Look for fulltext”).
2. The UIUC University Library home page offers a tab for books. Other libraries may offer a link to an advanced search screen where you can limit your results to books as a material type. On some sites you’ll have to click once or twice to get to a screen that lets you search the classic catalog. The point is that a library’s catalog of local holdings is no longer the centerpiece of the academic library.
3. A library discovery search box may allow you to limit your search results to articles; you’re doing a search but not in a single database. To find a list of all the databases to choose from, you’ll need to find a link or tab labeled something like research tools, research databases, articles, or databases. Once you click on that link or tab, you may find a list of all the available databases in alphabetical order by title as well as lists of databases categorized by the subjects they cover. For commercial databases that the library subscribes to, you’ll have to log in with your university credentials to get access.
4. Many universities have open-access repositories that may hold research articles by their faculty members, dissertations by their
doctoral students, their librarians’ instructional material and articles, and other items produced by individuals affiliated with the institution. Academic libraries are often the administrators of these institutional repositories. Because the name institutional repository isn’t likely to be understandable to end users, you might try campus repository or university repository. Your IR probably has a name with local significance such as Deep Blue (University of Michigan), Knowledge Bank (The Ohio State University), and IDEALS (UIUC). A library’s everything search might include IR holdings. If not, and if you can’t find the IR by searching the library’s website, try using the search box on the university’s home page. Open access is fundamental to the IR’s mission, so authentication isn’t necessary to use the full texts you find. 5. Most public libraries provide a catalog search box on the home page so it’s easy to find a book, magazine, or music CD to borrow, but it will probably take at least one click to find the databases. A state library’s home page may emphasize its holdings of state government documents, archives, and digital collections of historical material, so you’ll need to look for clickable icons or links labeled resources, homework help, or research databases.
3
The Reference Interview
You are learning about online databases, search systems, and search tools so that you can become an expert intermediary searcher, applying your knowledge to help library users satisfy their information needs. The interaction with users in which they express these needs is called the reference interview. In this chapter, we’ll consider the reference interview in different contexts shaped by the library user’s purpose and objectives. The reference interview is the first of seven steps in the search process, as shown in figure 3.1. A reference interview begins with the need to determine whether the library user’s initial question is a single question easily answered by a web search, dictionary, encyclopedia, atlas, or other resource, or whether it is the beginning of a more complex research query requiring a search for sources. The complex research query is common in the academic library, although queries in public and school libraries can also require a reference interview to determine the true information need and how to address it. In a public library, this may include a quest for the next novel to read or movie to view, and in a school library it may involve help with a homework assignment. An expert searcher intermediating between an information seeker and information resources must be an expert communicator. The expert searcher knows where and how to find information but must undertake a question-negotiation process to arrive at an understanding of what the information seeker needs. “Without doubt, the negotiation of reference questions is one of the most complex acts of human communication,” according to a classic article
by Robert S. Taylor (1968). “During this process, one person tries to describe for another person not something he knows, but rather something he does not know” (180). Arriving at a common understanding of a query can be challenging in person and even more so on the phone or in a chat where facial expressions and body language are absent. Initially, it may be difficult to distinguish queries that require in-depth analysis from the rest of the queries that come your way. Eventually, you will sense the open-ended nature of complex queries, and instead of searching for hard-and-fast answers, you’ll conduct a search that produces multiple results. You will engage users in the process of identifying relevant results so you can use them to find more information or take the search in new directions. The reference interview is fundamental to the process of information intermediation. Round out your understanding of it by consulting books about the reference interview (Harmeyer 2014; Ross, Nilsen, and Radford 2019), the briefer one-chapter treatments of the reference interview in major textbooks on reference services (Cassell and Hiremath 2018; Wong and Saunders 2020), and the relevant sections of textbooks on specialties such as data librarianship (Rice and Southall 2016) and school librarianship (Riedling and Houston 2019).
MODELS OF THE INFORMATION-SEEKING PROCESS
Models are used to shrink real-world situations, processes, and systems down to their essential components. They help us understand complex phenomena and make them easier to study. No
one model represents the full gamut of information-seeking activities, but each can help you recognize and understand aspects of the process.
Levels of Need Model
Taylor (1968) identifies four levels of need experienced by an information seeker, with each level corresponding to a type of question:
Q1—the actual but unexpressed need for information (the visceral need);
Q2—the conscious, within-brain description of the need (the conscious need);
Q3—the formal statement of the need (the formalized need);
Q4—the question as presented to the information system (the compromised need). (255)
A person’s earliest experience of an unmet information need, Q1, may involve the inability to recognize exactly what is needed. As the person is exposed to more information, the need may take on a more definite shape. At the Q2 stage, the person may begin to articulate the information need, but in an unclear way, and so they may reach out to someone else for more clarity. When you find it difficult to understand a user’s information need and respond with a plan of action, you may be encountering a user at the Q2 level. The role-playing exercise in textbox 3.1 creates the conditions for you to experience this phenomenon from the perspective of the end user, reference librarian, or both.
TEXTBOX 3.1. Helping Users Transition from Q2 (Conscious) to Q3 (Formal-Need) Questions
To perform this role-playing exercise, work with a partner. One person plays the role of the end user, and the second person
plays the role of the reference librarian. You could even switch roles at some point. Before getting started, the person playing the end user should think about a topic that interests them but that they haven’t had time to pursue through conducting research online. It could be a topic of personal interest or a topic that they must write about for a class. Don’t give this much thought, and don’t spend time searching online in advance. The end user approaches the librarian and expresses their query. The librarian reacts, getting the end user started. Both end user and librarian need to decide what “started” means within the context of the role-playing episode and when to stop. The episode should include these events: the end user describing their information needs to the librarian; the librarian listening to the end user and negotiating with the user to arrive at a full understanding of the query; and the librarian getting the user started on the search for information, which might include user and librarian working together online to find relevant information. When you are done, take a few minutes to answer the questions below.
librarian could have done a better job developing such an understanding? 4. (For both) As a result of the negotiation that went on between the two of you, what does the user really want? Are you both in agreement as to what the negotiated query is? (The negotiated query is the librarian’s understanding of what the user wants as a result of conducting a reference interview with the user.) Do you detect differences between the initial query and negotiated query, and if you do, what are they? 5. (For both) Did you go online to find relevant information? If yes, what were your respective contributions during the finding process? To what extent did searching help either or both of you refine the query? If you role-play in class, be prepared to share your answers in a class discussion.
Reaching the Q3 level means the person can express their information need clearly, although they may not have an accurate mental model of the scope of the available information resources and search systems. While negotiation that enables the librarian to develop a full understanding of the user’s query might be necessary at this level, it transitions in a timely manner to related concerns, such as how much information is needed, how technical or scholarly the information should be, deadlines connected with the project, and so on. Q4 involves developing the actual search statements and entering them into search boxes. Taylor’s name for Q4 questions, the compromised need, acknowledges that user queries and the search statements librarians formulate are not one and the same. The search statement is the compromise between the user and the system made necessary by the constraints of the database and its functionality.
Information Search Process (ISP) Model
The information search process (ISP) model describes the user’s information-seeking experience as an evolution that involves specific thoughts, feelings, and actions (Kuhlthau 1991). In her seminal study of high school and college students, still widely cited by other researchers, Kuhlthau named and described six stages of the search process, along with the feelings students experience at each stage (366–68):
1. Initiation—The person has a vague need for information and feels uncertain about how to address it.
2. Selection—Working through possible topics to pursue, the student homes in on a particular one, giving rise to a feeling of optimism.
3. Exploration—As the search for information proceeds, the sense of optimism fades when the student finds confusing or conflicting information that requires effort to understand as they learn more about their topic.
4. Formulation—The effort of exploration leads to the formulation of a focused topic that the student feels quite able to research and understand at the level needed to complete an assignment.
5. Collection—The student can articulate their information need to intermediaries and can work with librarians and alone to craft search statements that retrieve relevant results, further solidifying their interest in the topic and their belief that they will be able to complete their project.
6. Presentation—The student is ready to prepare their final product and is relieved to have reached this point. There may be a sense of a job well done, or, if the search did not go as well as they hoped, a sense of disappointment.
Berrypicking Model
As students and researchers work on a topic, they find information in a variety of places, and as they learn more about their topics they pick and choose sources that lead to new information (Bates 1989). This iterative process applies in the short-term case of the student working on a research paper for a course and in the long-term case of an expert engaged in research in an academic discipline. The berrypicking model acknowledges that some people search for information repeatedly on the same or similar topics; they apply the sources that they’ve found to enhance their knowledge; and, in the course of doing so, their research focus or topic may shift a bit. As a result, the next time they search, their follow-up query has changed, which changes the questions they ask librarians, the queries they enter into search systems, and their relevance assessments of the new sources that their follow-up searches retrieve. This sequence of events may have a long duration, especially in the case of domain experts—scholars, scientists, and veteran researchers—who build on the results of current research initiatives when proposing new research and advising and teaching students. Consider the implications of this model. Your initial assistance produces information for users. They process what you have found, and in the course of doing so, not only does their original query evolve, but they may also generate new and different queries altogether. Their subsequent searches (with or without your assistance) may yield results they deem more useful than the original ones or build on what they learned from the original results. If users revisit you for additional information, they may express new and entirely different queries generated from their exposure to information you helped them find earlier. The experience and expertise they bring to a reference interview mean they may be able to express their information need quite clearly, eliminating some of the librarian’s basic questions to focus on a deeper search or perhaps a broader one.
Reading Experience Model
Traditional readers’ advisory services in public libraries have focused on recommending books in certain genres or with particular features that a reader likes. A newer approach, the reading experience model, recognizes that today many readers (and viewers and gamers) use social media to discover books, films, music, and games (Dali, Vannier, and Douglass 2021). Additionally, automated recommendation systems incorporated into many public library catalogs offer a more like this feature that helps users find new fiction. But there is still a role for the human intermediary who works to understand the seeker’s reading experience and enhance it with new and appealing material. The reading experience model focuses on what the reader wants to experience based on the reader’s own story about their personal and social reading activities. Eliciting the reading experience involves a conversation visualized as a framework of three circles, moving from the largest and most general to the smallest and most specific (Dali et al. 2021):
Circle 1. The reader’s story. The librarian begins the conversation by using a qualitative research method of interviewing (adapted for library users) called SQUIN, the single question aimed at inducing narrative. A question such as “Do you remember how you came across this book?” allows the reader to tell the story of their reading experiences (Dali 2013, 495). The librarian listens carefully and patiently as the reader reveals the history and context of their reading.
Circle 2. The librarian asks an open-ended question to begin to focus on the kinds of stories the reader wants to experience, whether in print, digital, or audio books, or in movies, music, or games.
Circle 3. Only after learning about the library user as a consumer of stories does the librarian begin to ask closed-ended questions to identify specific elements of appeal for the reader, such as genre, characters, action, and other factors characteristic of the readers’ advisory approach to recommending titles.
As with the pursuit of research in the berrypicking model, the reading experience unfolds over a long period of time as tastes and interests change. As with the ISP model’s attention to affective
factors, a person’s narrative about the stories they have enjoyed may include emotional reactions to the books, movies, music, and games they’ve experienced.
THE NATURE OF THE REFERENCE INTERVIEW
The reference interview is a conversational exchange between a librarian and a library user, in which the user is likely to describe something they don’t know, which leads to a negotiation so that the librarian is able to determine what the user really wants. Because queries may involve sensitive or personal information and because they require mental concentration, it is best to conduct the reference interview quietly in a relatively private area. The interview may include a search of the web, the library collection, the library’s subscription databases, or an open access repository for what the librarian believes is relevant information that has the potential to completely or partially resolve the user’s quest. Some reference interviews are brief because the user expresses their information need clearly. Others are protracted, involving a negotiation between librarian and information seeker to reach a shared understanding of the query. Researchers have studied user inquiries, categorizing them and determining which are more or less likely in academic and public libraries (Arnold and Kaske 2005; McKewan and Richmond 2017; Radford and Connaway 2013). Categories, category names, and their definitions vary across studies, but there are enough similarities to suggest that you can expect the following:
Subject Research. A query that involves a topic for which your searches are likely to produce multiple results, none of which
answers the user’s question entirely. To generate answers, the user must extract useful information from these results and synthesize what they learn to answer their query. Examples: “What is the effect of television violence on children?” “How successful was President Johnson’s War on Poverty?” Known Item. A query for a specific source that you or the user knows exists and for which your searches are intended to target the one item that the user seeks. Examples: “Where can I find this government report mentioned in today’s newspaper?” “Do you have Richard Flanagan’s latest novel?” Readers’ Advisory or Reading Experience. A readers’ advisory query is similar to known-item queries in that a library user has a book or game they enjoyed or a documentary they learned from and they’re looking for more like it. A reading experience query begins with a broad conversation about the reader’s interactions with books or other media in personal and social contexts and then winnows down to the more targeted focus of the readers’ advisory query. Ready Reference. A query for a fact that needs no additional analysis beyond verification in an authoritative source. Example: “What peaks do I have to climb to become a 46er?” Policy and Procedural Matters. An inquiry that involves the library’s policies or procedures. Examples: “I want to check these materials out, but I forgot my library card.” “Under what conditions do you forgive fines?” Technical Assistance. An inquiry that asks for help with hardware, software, apps, or physical resources. Examples: “How do I print this screen?” “How do I cite this blog in my term paper?” Directions. An inquiry about a virtual or physical library space. Examples: “Where is the local history section of the library?” “Where do I find the databases on the library’s website?” Referrals. An inquiry for information or services provided by another organization. Examples: “Where can I get food aid?” “I need help studying for my GED.”
Online searching resolves most queries categorized as subject research, known item, readers’ advisory/reading experience, and ready reference. Indirectly, it helps you answer inquiries categorized as policy and procedural matters, technical assistance, directions, and referrals. For example, you may search your library’s website for answers to a policy matter or the web for an application’s technical manual while troubleshooting equipment problems. This chapter’s questions and answers sections demonstrate how fluid queries are, beginning as one category and transforming to another category as your negotiation with users during the reference interview proceeds. Research findings reveal that query categories depend on the type of library. According to McKewan and Richmond (2017), academic librarians can expect about two of every five queries to pertain to subject research while technical assistance and ready reference queries will be rarer, accounting for about one of every ten inquiries. At a public library, expect one of every two inquiries to involve either technical assistance or policy and procedural matters and one of every three queries to be related to either subject research or finding a known item. Ready reference and readers’ advisory/reading experience queries account for less than one of every twenty inquiries in a public library. Researchers who study the reference interview observe that certain behavioral attributes of the reference librarian are likely to leave library users with positive perceptions of the librarian’s performance and of libraries generally. In response, the Reference & User Services Association (RUSA 2018) has published guidelines for the behavioral performance of reference-interview providers that advise them to be approachable and interested, good communicators, and effective searchers, and to be capable of conducting follow-ups to original interactions.
PHASES OF THE REFERENCE INTERVIEW

Underlying the RUSA guidelines is a multiphase reference interview, in which the process of online searching is embedded. Figure 3.1 shows the interview’s phases, including a detailed depiction of the search process. For searches that fail or get off track, iterative loops back to earlier phases help further clarify the information need.

Greeting

The in-person reference interview begins with a greeting, such as “Hi, may I help you?” The librarian’s smile, relaxed demeanor, eye contact, and open body stance all contribute to putting the user at ease and establishing rapport so that the user feels comfortable and the interview begins on a positive note. The greeting “Hi, how may I help you?” may not be as inviting as it seems at first glance. Such wording might suggest to users that they must know how the librarian can help, leading them to phrase their initial question in a way that anticipates what the librarian should do with the query, such as which database the librarian should use. A greeting such as “Hi, may I help you?” encourages users to say what is on their mind so that you can begin to gain an understanding of what they want and use it to initiate the search. First impressions go a long way toward setting the tone of the interaction, and thus, your overall demeanor should convey the message that you are approachable, friendly, eager, and willing to help. If you are working on an unrelated task while you are on duty at the reference desk, users may be hesitant to approach and interrupt you. Whether in person or virtually, it is important to send a message that you are ready, prepared, and pleased to help everyone, and it is your job to do so.
If you are helping one user and notice more users approaching, smile and make eye contact with each of them so they know they have been seen. You may be uncomfortable interrupting the ongoing reference interview to verbally acknowledge them with a comment, such as “I’ll be with you shortly,” so a nonverbal signal, such as a nod of the head, may be a good substitute. Library reference staff may have an agreed-upon greeting for answering the phone. If you smile while you answer the phone, your voice is likely to convey the smile, sending a signal to the inquirer that you are ready and willing to help them. Answer the phone with an upbeat and enthusiastic tone of voice so that the interaction starts on a positive note. An online chat session may begin with a standard greeting as well, and, if a chat with another patron is in progress, may include an assurance that the librarian will return shortly.
Figure 3.1 Phases of the reference interview and steps of the online searching process.

Initial Inquiry
The interview transitions to the initial inquiry when users make a statement that describes what they want. Librarians agree that users’ initial queries rarely describe what they really want, and research confirms their suspicions (Dewdney and Michell 1997; Radford et al. 2011; Ross 2003). Consider this reference interview:

LIBRARIAN: Hi, may I help you?
USER: I need some books on alcoholism.
LIBRARIAN: Okay. [Types the subject heading alcoholism into the library catalog.] We have lots of books on alcoholism. It looks like most of them have the call number 362.29. I’ll write that down for you. Take a look at the books there, and you’ll probably find what you’re looking for.
USER: Thanks. [Leaves for the bookshelves.]
No negotiation between the librarian and the user takes place. The librarian takes this user’s query at face value, checks the library catalog for the right call number, and dispatches the user to the bookshelves to browse books on alcoholism. Let’s see what happens when the librarian negotiates the user’s initial query.

LIBRARIAN: Hi, may I help you?
USER: I need some books on alcoholism.
LIBRARIAN: We have lots of books on alcoholism. Tell me what it is about alcoholism that interests you. [Navigates to the library catalog just in case.]
USER: Like, I’m interested in recovery and meetings and stuff.
LIBRARIAN: Okay. We can look for books that talk about what happens at meetings, how meetings help with recovery.
USER: Yes, meetings. Are there any meetings around here?
LIBRARIAN: Well, let’s check a local events calendar.
This reference interview has a different outcome because the librarian probes the user’s interest in alcoholism by saying, “Tell me what it is about alcoholism that interests you,” to which the user
volunteers more information, first “recovery and meetings” and then “meetings around here.” These cue the librarian to what the user really wants: a list of Alcoholics Anonymous meetings with their exact locations and meeting times. The query shifts from the subject research to the referral category. For a different user, such as a high school, community college, or university student, the interview might have shifted to research for a term paper on a specific topic related to alcoholism. In a public library, it might have become a readers’ advisory interaction. Consider this reference interview:

LIBRARIAN: Hi, may I help you?
USER: Do you have a map of Michigan’s Upper Peninsula?
Stop right here and ponder the librarian’s next step. You could take this initial query at face value, referring the user to the library’s map room, where atlases and maps are stored, or to map websites on the web. Or, instead, you can work to understand the user’s initial interest in maps:

LIBRARIAN: Sure, but is there something in particular about the Upper Peninsula that interests you?
USER: Yeah. The parks and wilderness areas.
LIBRARIAN: OK. So you are interested in the parks and wilderness areas in the Upper Peninsula. Anything else?
USER: I’m going birding.
LIBRARIAN: OK. Is there a particular type of forest or wilderness area that is best for birding in the Upper Peninsula?
USER: I want to see a Connecticut Warbler—that’s where they nest—and a few other northern species while I’m there.
LIBRARIAN: OK. So you are interested in finding Connecticut Warblers and other birds in the Upper Peninsula.
USER: Yeah.
LIBRARIAN: Let’s search the eBird database, where birdwatchers record the species they’ve seen. Maybe they’ve reported the bird you want to see, plus its location.
This second reference interview starts with the user pursuing a known item, a map of Michigan’s Upper Peninsula, and ends with the librarian and the user searching the eBird database on the web for reports of the species that interests the user. The librarian may work with the user for a while, both of them figuring out how to search eBird for a particular species in the geographical location that interests the user. When they do, the librarian is likely to pass the searching task on to the user for further exploration.

Both of these discussions capture interactions with users who don’t immediately disclose their interests to the librarian. Table 3.1 indicates why people aren’t forthcoming about their interests. Of these reasons, the user seeking “some books on alcoholism” is probably concerned about self-disclosure, and the user seeking “a map of the Upper Peninsula” probably doesn’t think the librarian knows enough about birding to be able to help. Users accustomed to searching the web solo whenever they have an information need may underestimate the amount of relevant, authoritative information available via the library. They may overestimate their search skills, based on their satisfaction with the relevance ranking of their web search results. Users’ initial queries often do not describe what they really want, and as a consequence, you have to negotiate all your interactions with users to discover what it is that they hope to discover.

Negotiation

The negotiation phase is the crux of the reference interview during which the librarian determines what the user really wants. Getting it wrong means that whatever you do or advise the user to do may be for naught because both you and the user are on the wrong path toward an answer. As a result, the user might regret interacting with you and may be reluctant to consult you or other librarians in the
future. Even when the user appears confident, assertive, and knowledgeable about the topic, you should ask open-ended questions such as “What kind of information are you hoping to find?” “What specifically are you looking for relating to [topic]?” or “Tell me more about [topic],” just in case the user has something in mind that hasn’t yet been revealed.

Table 3.1. Reasons People Don’t Reveal Their Real Information Needs

The nature of the reference interview
- Users’ initial inquiries are their way of starting and getting involved in the interaction.
- Users think they are being helpful by expressing their queries in ways that anticipate the systems that librarians will search to produce relevant retrievals or the genres that will characterize relevant retrievals.

The difficulty of developing a searchable topic
- Users express their queries in broader terms than their specific topics, thinking that, if the library has information on the former, they will be able to find information on the latter.
- Users don’t comprehend the scale of information available at the library and its website.
- Users are “just beginning to explore” their topics and are sampling what’s available in the library. (15)
- Users don’t know whether there is anything to be found on their topic.
- Users asking an imposed query (a query based on another person’s question) are unsure what the original requestor wants.

The difficulty of expressing one’s information needs to someone else
- Users don’t know what they want, and it is difficult for them to express what they don’t know in the course of asking someone else to help them find it.
- Users think that they have to express their queries in ways that anticipate what librarians will do to find useful information for them.
- Users don’t realize that even simple requests, such as “I’d like information on football,” are bereft of the context that they have in mind. To successfully communicate their needs, they have to add context to their queries; for example, “football in the United States, the college game, injuries, particularly to the head.”

The erroneous assumptions users have about libraries and their resources
- Users think libraries are information supermarkets, where they are on their own except to ask for directions when they can’t find something.
- Users think libraries and their systems are simple and they can figure them out if only they have a few clues about how things work.
- Users don’t “volunteer information that they don’t perceive as relevant. And they don’t understand the relevance because they don’t know how the system works.” (15)
- Users don’t know that the library’s mission includes a commitment to service and helping people like themselves find information.

The erroneous assumptions users have about librarians
- Users don’t know that librarians are there to help them satisfy their information needs.
- Users don’t think librarians know enough about their specialized topics to be able to help them.
- Users don’t want to bother librarians, who they think have more important things to do.

Users want to be seen as competent, secure, and knowledgeable
- Users avoid self-disclosure, especially if their queries reveal financial, health, or legal issues they are dealing with or sensitive personal matters that they feel expose something about them or make them feel vulnerable.
- Users think that the librarian will consider them to be incompetent, dumb, or worse because they have to ask for help.
- Users think that the librarian will think they are cheating when they ask for homework help.

Source: Adapted from Ross, Nilsen, and Radford (2019, 15–22).

Consider this chat session between a user and librarian:

USER: Hi, I’m looking for something on the Vietnam War.
LIBRARIAN: Welcome to Ask-a-Librarian. What specifically are you looking for relating to the Vietnam War?
USER: Well, I’m interested in the response to the war that old veterans are having.
LIBRARIAN: OK, so how veterans of the Vietnam War are looking back at it in a more current time?
USER: More like the old veterans who are going back to Vietnam. Some just visit, and others actually stay and do good things. I want to know the reasons they go back, OK?
LIBRARIAN: OK, please hold while I look.
The user begins the interaction, telling the librarian she is interested in the Vietnam War. Do not expect the interaction to proceed immediately from initial query to negotiated query just because you used one open-ended question. It may take several back-and-forth interactions for you to understand what exactly the user wants. Be prepared for reference interviews that leave you hanging, wondering whether you truly understood what the user wanted. In some of these interviews, the user won’t know what they want. They may be testing the waters, determining whether there is any information available to warrant further investigation of the topic or some interesting aspect of it.

The question about Vietnam War veterans is an example of an in-depth subject query that may take time to set up and conduct. If such a question is posed in an ask-a-librarian chat session, you may want to ask the user whether they are in the library and can meet you at the reference desk or your office. There, you can involve them in the search more directly, negotiating questions, gathering search terms, getting immediate feedback on initial results, and so on. If they’re working remotely, it would be useful to schedule an appointment for in-person consultation or for a virtual synchronous meeting during which you can share your search screen.

A distinguishing feature of in-depth queries is their open-endedness, and the impossibility of finding one rock-solid answer to the question. Instead, users must gather as much information as possible, study it, and generate an answer or course of action based on their knowledge and understanding of everything they have read. Especially in academic settings, users typically report the results of their analyses in written form as an essay, journal article, or book, or in spoken form as a speech to a class or a talk at a scholarly conference. However, they could just as well put the results to work in some other way, such as to expand their reading, listening,
viewing, or gaming experience, or to make something such as a decision, a diagnosis, or an object.
Open-Ended Questions

Open-ended questions help information seekers open up about their real interests because they elicit anything but a yes or no response. Interspersed between the queries in table 3.2 are a librarian’s open-ended questions that were instrumental in eliciting a user’s negotiated query. Open-ended questions encourage users to open up about their topics and explain them in their own words. They elicit user responses that reveal the context of their queries. For example, the information seeker may ask where the art books are shelved but really wants to know if the library has a book about the sculptor Augusta Savage. A teacher wants to prepare a lesson plan on California wildfires, but needs help focusing the topic for her sixth-grade students. There are many reasons why someone may seek information about dementia, and open-ended questions can help elicit the true nature of the query without prying into personal health matters. The reasons users express their queries in broad terms may be a result of their way of summing up their interests at the start of the reference interview, their limited knowledge of the subject, their desire to sample what the library has on a topic before committing to something specific, their obfuscation of personal interests that they don’t want to admit to others, their reaction to a topic imposed by a college instructor or a boss, and so on. The imposed query comes from someone else, typically a teacher, family member, boss, friend, or colleague. Whether users who approach you with imposed queries have negotiated them with the original inquirer, identifying the impetus for the query or what the inquirer really wants, is doubtful, so you will have to do the best you can with the information they are able to provide. Imposed queries are common at public, academic, and school libraries, accounting for one-quarter to one-half of the queries posed to reference librarians (Gross 2001; Gross and Saxton 2001).

Table 3.2. Open-Ended Questions That Give Rise to Negotiated Queries

User’s Initial Query: Where are the art books?
Librarian’s Open-Ended Questions: We have a lot of art books. What kinds of art are you interested in?
Negotiated Query: Biography of sculptor Augusta Savage

User’s Initial Query: California wildfires
Librarian’s Open-Ended Questions: Tell me about your class and your lesson plan learning objectives.
Negotiated Query: Encyclopedia articles with facts and figures about the largest wildfires in California’s history

User’s Initial Query: Dementia
Librarian’s Open-Ended Questions: That’s a big topic. What is it that you’d like to find out about dementia?
Negotiated Query: Current medical research on the prevention of Alzheimer’s disease

As a reference librarian, you will encounter just about every topic under the sun. You cannot be an expert on everything, but you can be an expert on how to find information to meet all kinds of information needs. Get in the habit of responding to users’ initial questions with open-ended questions—it’s your key to getting past what users already know to what they’d like to know.
Being Present
In the context of the reference interview, being present means actively listening to the user, being interested in the task of answering the user’s question, and enlisting both verbal and nonverbal cues to reassure the user that you truly are engaged in their particular problem. Active listening is a communication skill that requires the librarian to listen to what the user says and repeat it back, paraphrasing it to confirm understanding. Active listening enables librarians to successfully usher user queries from initial to negotiated status. Unfortunately, the rapid pace and constant interruptions of everyday life make it difficult to actively listen. The only way to develop your active-listening skills is to practice them by concentrating on the reference interaction as it is happening, to the exclusion of almost everything else. Interest means your interest in the task of answering the user’s question, not in the actual question itself. How you demonstrate your interest in the task is conveyed to the user both nonverbally and verbally. Nonverbal cues are your eye contact with the user, smiling, head nods in response to the user’s replies, and an open stance. Resist the temptation to cross your arms, fold your hands, or put your hands in your pockets, because these are closed-body gestures that users may interpret as your disinterest, disapproval, or even outright hostility. Avoid nervous gestures, such as tapping your fingers, playing with your hair, or fidgeting with an object. In a reference chat interaction, periodically let the user know you are still there or still working on finding the answer. While assisting one user, resist the temptation to multitask and work on another user’s inquiry or your own work. In person, on the phone, and in chat or instant messaging, verbal cues include your explanations of the online sources you are accessing and why you are accessing them and, possibly, sentence fragments that are indicative of what you are reading. Your cues need not be a play-by-play account of everything you are doing, but should include enough information to assure the user of your attention and effort. Verbal cues are essential when responding to phone calls, chats, and texts because users cannot see you. When
the user is talking on the phone, an occasional “uh-huh,” “OK,” or “I see” substitutes for the head nods that are characteristic of in-person reference interactions. These utterances function as “noncommittal acknowledgments” that you are actively listening to what the user is saying and “can encourage the user to continue talking” (Smith and Wong 2016, 68). Somewhat comparable are the simple “searching . . .” and “working . . .” messages that you should send to a user every two or three minutes or so as you work to answer a reference question during a reference chat or text-message interview. Such messages give you time to think or work during the interview while reassuring the user that you are busy solving their problem.

When you have answered the same question on numerous past occasions or are close to the end of your shift at the reference desk or in chat, demonstrating interest may be difficult. Get in the habit of physically and mentally resetting between users. Take at least two or three long, deep breaths and relax your shoulders, your jaw, and any other part of your body that tenses when you are under pressure. Taking at least twenty seconds to regroup from time to time should refresh you physically and mentally for the next information seeker.
Closed-Ended Questions

When users have in-depth queries in mind that warrant a comprehensive database search, additional negotiation may be necessary to shed more light on their queries and to answer database-specific questions (table 3.3). Closed-ended questions requiring yes, no, or short answers are appropriate for obtaining answers to most of these questions. Knowing the user’s discipline or field of study, the kind of information they want, their level of sophistication with the topic and discipline, and what research they
have already done helps the librarian make database selection decisions. Of all the questions listed in table 3.3, asking users how they intend to use the information may be the most sensitive. Asking this in an academic or school library can be helpful because so many uses pertain to coursework. In other situations and settings, users might balk at a direct question that asks how they will use the information they find. The rule of thumb is to ask indirectly, explaining to users that knowing how they intend to use the information will help you answer their question. Users who respond with “I’m not sure yet” or “I’m helping a friend” are probably not ready to share their intended uses with you, so back off and do the best you can with the information they are willing to share. Resist the temptation to make assumptions about users based on their personal appearance, cultural differences, age, race, ethnicity, gender, religion, economic status, disability, or their use of English, including whether they speak with an accent. Such assumptions can compromise the effectiveness of the reference interview. Treat everyone with respect, and do your best to help each user. Respect their privacy, and let them be in control of what they are willing to share about their information needs.
Table 3.3. More Negotiation for In-Depth Queries

What You Want to Know: What discipline or field of study underlies the user’s query
How to Ask: Are you looking for answers from a particular perspective or discipline? What is it?

What You Want to Know: What the user’s level of sophistication is with both the topic and discipline
How to Ask: Do you want advanced material or something basic and easy to understand?

What You Want to Know: What kind of information the user wants
How to Ask: What instructions did your teacher give you about the types of information they want you to use? What kind of information do you want? [statistics, news stories, lesson plans, scholarly journal articles, etc.]

What You Want to Know: What research the user has done already
How to Ask: What have you done so far? If you’ve found something you really like, could you show it to me so we can use it to find more like it?

What You Want to Know: Whether the search should produce high-recall or high-precision results
How to Ask: Do you want to find everything written about this topic, or do you want a few useful articles?

What You Want to Know: How the user intends to use the information he or she finds
How to Ask: It would help me to answer your question if you could tell me a little about how you will use the information we find. We have lots of information on [topic]. I could help you better if I knew what you are trying to do with the information.

What You Want to Know: When the user needs the information
How to Ask: When do you need this information? What’s your deadline?

Search

The next phase is the librarian’s search for information. Some reference interviews don’t involve searches or may require only a quick one, particularly inquiries categorized as ready reference, referrals, directions, technical assistance, and policy and procedural matters. Depending on the circumstances, the librarian may conduct the search or guide the user through the searching process while explaining the ongoing search using relatively jargon-free language that is understandable to the user. Talking through the database
selection, search query formulation, and interface interactions as they are happening can serve as a form of instruction on the fly, letting the reference interview expand a bit if the librarian recognizes a teachable moment when the user is ready to learn something that will help them become a more self-reliant searcher. Table 3.4 lists the seven steps that make up the search process. Included are each step’s objective and the numbers of chapters that cover each step in-depth. The seven steps of the online searching process function as a road map to tell you where you have been, where you are, where you are going, and what you can expect to accomplish along the way. Initially, you will experience steps 2, 3, and 4 as discrete and separate activities, but as you transition to expert-searcher status, you will perform them almost simultaneously. Iteration and looping might occur during steps 6 and 7, when searchers enter their searches, display results, and assess the relevance of retrieved sources. Should the search fail, you might loop back to step 2 to choose another database and then continue. At any point, you might loop back to step 1, consulting the user for clarification and direction. Although the reference interview is the first step, elements of the interview may be repeated throughout the process as the librarian and information seeker interact with each other and with the search tools and results.
Table 3.4. Seven Steps of the Online Searching Process

Step 1. Conducting the reference interview. Objective: To determine what the user really wants. (Chapter 3)
Step 2. Selecting a relevant database. Objective: To produce useful information that reflects the user’s knowledge of their topic. (Chapter 4)
Step 3. Framing the negotiated query as a subject or known-item. Objective: To reveal clues about how to formulate search statements. (Chapter 5)
Step 4. Conducting a facet analysis and logical combination. Objective: To plan for search statements that address the big ideas, concepts, or themes that make up the negotiated query. (Chapter 5)
Step 5. Representing the negotiated query as an input into the search system. Objective: To formulate search statements that produce relevant retrievals. (Chapters 6 to 10)
Step 6. Entering the search and responding strategically. Objective: To conceptualize the search overall so that its execution is efficient and effective. (Chapter 11)
Step 7. Displaying retrievals, assessing them, and responding tactically. Objective: To ensure that the execution of important aspects of the search are done efficiently and effectively. (Chapter 12)
Results Presentation

The interview’s results-presentation phase could involve a simple fact: the answer conveyed verbally on the spot and the source of the answer. For in-depth queries, whether handled in person, on the phone, or virtually, the librarian and the seeker evaluate the nature of the results in the context of the information need. This phase may involve an email message from you or from the system you searched, bearing links to the ongoing search, search results, and full texts. Your email message should invite the user to contact you or your colleagues with additional questions about the search results or anything else and close the interview on a positive note.

Follow-Up
Users who have additional questions set the follow-up phase in motion. Some users might ask you to explain the search results, the terminology you used to conduct the search, or the databases you searched. Other users might tell you that the search results are not relevant, making the follow-up phase loop back to the negotiation and searching phases so that you can further develop your understanding of the user’s query and respond with additional searches and relevant results.

Closing

The final phase of the reference interview is the closing, in which you verify that the information satisfies the user’s query. Your closing should make users feel good that they sought your assistance and assure them that they can ask follow-up questions on this particular query or on something else. Examples are the simple “Glad to help; come back and see us anytime” or the somewhat longer “Glad I could help; come back if you need help on this or anything else you are working on, and if I’m not here, one of my colleagues will be happy to help.” This might also be a good time to promote the online ask-a-librarian service.
KNOWING WHEN TO STOP SEARCHING FOR INFORMATION

Knowing when to stop may be easy for referrals, directions, ready reference, policy and procedural matters, and technical assistance. It’s not as easy for in-depth subject research and hard-to-find known
items because there’s always more information available in subscription databases, on the web, and even in older print material. Being an expert intermediary searcher, you know it’s there and how to find it. Some users will tell you when they have enough information. Others will emit subtle nonverbal cues—focusing on the screen with their fingers on the keyboard, picking up their backpack, fidgeting, closing their laptop, beginning to edge away—that indicate they’ve reached the saturation point. Let them get started with what you found initially. In your closing, tell users that you know that you can find more information and they are welcome to contact you if they need to. And then edge away yourself.
QUESTIONS

1. Categorize table 3.2’s initial and negotiated queries. Notice how the librarian’s negotiation changes the categorization.

2. Is negotiation always necessary? Scrutinize these five initial queries:

a. Good morning! Where would I be able to find really good information on fashion? I tried women’s studies but didn’t get much.
b. Do you have a current travel guidebook to France?
c. Hi. I need something on etiquette.
d. I’m having trouble using PubMed.
e. I keep hearing references to cryptocurrency, but I don’t understand what that is. Can you help me find a simple explanation?

Decide which you should negotiate and which you should take at face value. When you’ve made your decision, jot down the characteristics of initial queries that fit into your negotiate and don’t negotiate categories. Answers are given at the end of the chapter.
SUMMARY

This chapter begins with an examination of the information-seeking process to help you, the reference librarian or intermediator, bridge the gap between your understanding of this process and being a key participant in it. The process encompasses an incongruity between what users say they want and what they really want, placing the burden on you to resolve the incongruity during the reference interview and gain a complete and thorough understanding of what the user wants. This chapter also spotlights the reference interview and categorizes the types of queries you should expect from users. Your interaction with them is divided into these seven steps: (1) greeting the user, (2) eliciting the user’s initial query, (3) negotiating the query, (4) searching for relevant information online, (5) presenting search results, (6) following up, and (7) closing the interview in a positive manner that encourages the user to seek the assistance of reference staff in the future. An important goal is making users feel comfortable during the interview so that they are forthcoming about their information needs and positively inclined to consult you or your colleagues for help with their information needs in the future. Eventually, you’ll put your knowledge of the user’s information needs to work and conduct online searches that produce relevant information.
REFERENCES

Arnold, Julie, and Neal K. Kaske. 2005. “Evaluating the Quality of a Chat Service.” portal: Libraries and the Academy 5, no. 2: 177–93.
Bates, Marcia J. 1989. “The Design of Browsing and Berrypicking Techniques for the Online Search Interface.” Online Review 13, no. 5: 407–24.

Cassell, Kay Ann, and Uma Hiremath. 2018. Reference and Information Services: An Introduction, 4th ed. Chicago: ALA Neal-Schuman.

Dali, Keren. 2013. “Hearing Stories, Not Keywords: Teaching Contextual Readers’ Advisory.” Reference Services Review 41, no. 3: 474–502.

Dali, Keren, Alyssa M. Brillante, Pearl I. Bass, Ashley M. Love, Leah Byrnes, Aimée Fontaine, and Miranda M. Buren. 2021. “Conversing with Readers: A Framework for the Reading Experience Conversation.” The Reference Librarian 62, no. 2: 81–97.

Dali, Keren, Clarissa Vannier, and Lindsay Douglass. 2021. “Reading Experience Librarianship: Working with Readers in the 21st Century.” Journal of Documentation 77, no. 1: 259–83.

Dewdney, Patricia, and Gillian Michell. 1997. “Asking ‘Why’ Questions in the Reference Interview: A Theoretical Justification.” Library Quarterly 67 (January): 50–71.

Gross, Melissa. 2001. “Imposed Information Seeking in Public Libraries and School Library Media Centers: A Common Behaviour?” Information Research 6, no. 2. http://informationr.net/ir/6-2/paper100.html.

Gross, Melissa, and Matthew L. Saxton. 2001. “Who Wants to Know? Imposed Queries in the Public Library.” Public Libraries 40: 170–76.

Harmeyer, Dave. 2014. The Reference Interview Today: Negotiating and Answering Questions Face to Face, on the Phone, and Virtually. Lanham, MD: Rowman & Littlefield.

Kuhlthau, Carol. 1991. “Inside the Search Process: Information Seeking from the User’s Perspective.” Journal of the American Society for Information Science 42, no. 5: 361–71.

McKewan, Jaclyn, and Scott S. Richmond. 2017. “Needs and Results in Virtual Reference Transactions: A Longitudinal Study.” Reference Librarian 58, no. 3: 179–89.
Radford, Marie L. 1999. The Reference Encounter: Interpersonal Communication in the Academic Library. Chicago: American Library Association.

Radford, Marie L., and Lynn Silipigni Connaway. 2013. “Not Dead Yet! A Longitudinal Study of Query Type and Ready Reference Accuracy in Live Chat and IM Reference.” Library & Information Science Research 35, no. 1: 2–13.

Radford, Marie L., Lynn Silipigni Connaway, Patrick A. Confer, Susanna Sabolcsi-Boros, and Hannah Kwon. 2011. “‘Are We Getting Warmer?’ Query Clarification in Live Chat Virtual Reference.” Reference & User Services Quarterly 50, no. 3: 259–79.

Reference & User Services Association (RUSA). 2018. “Guidelines for Behavioral Performance of Reference and Information Services Providers.” Accessed December 19, 2021. http://www.ala.org/rusa/resources/guidelines/guidelinesbehavioral.

Rice, Robin, and John Southall. 2016. The Data Librarian’s Handbook. Chicago: American Library Association.

Riedling, Ann Marlow, and Cynthia Houston. 2019. Reference Skills for the School Librarian: Tools and Tips, 4th ed. Santa Barbara, CA: Libraries Unlimited.

Ross, Catherine Sheldrick. 2003. “The Reference Interview: Why It Needs to Be Used in Every (Well, Almost Every) Reference Transaction.” Reference & User Services Quarterly 43, no. 1: 38–43.

Ross, Catherine Sheldrick, Kirsti Nilsen, and Marie L. Radford. 2019. Conducting the Reference Interview: A How-to-Do-It Manual for Librarians, 3rd ed. Chicago: ALA Neal-Schuman.

Smith, Linda C., and Melissa A. Wong, eds. 2016. Reference and Information Services: An Introduction, 5th ed. Santa Barbara, CA: Libraries Unlimited.

Taylor, Robert S. 1968. “Question-Negotiation and Information Seeking in Libraries.” College & Research Libraries 29, no. 3 (May): 178–94.
Wong, Melissa Autumn, and Laura Saunders, eds. 2020. Reference and Information Services: An Introduction, 6th ed. Santa Barbara, CA: Libraries Unlimited.
SUGGESTED READING

Ross, Catherine Sheldrick, Kirsti Nilsen, and Marie L. Radford. 2019. Conducting the Reference Interview: A How-to-Do-It Manual for Librarians, 3rd ed. Chicago: ALA Neal-Schuman.

St. Jean, Beth, Ursula Gorham, and Elizabeth Bonsignore. 2021. Understanding Human Information Behavior: When, How, and Why People Interact with Information. Lanham, MD: Rowman & Littlefield.
ANSWERS

1. Categorize table 3.2’s initial and negotiated queries. Directions (the library’s art books) becomes known item (a specific artist’s biography). Subject research for a specific purpose (wildfires, lesson plans) becomes ready reference (facts and figures). A broad subject (dementia) becomes focused research (Alzheimer’s disease prevention).

2. Is negotiation always necessary? Of the five initial queries, only two describe exactly what the user wants—the travel guidebook query and the cryptocurrency query. As a result of negotiation, the three remaining queries on fashion, etiquette, and PubMed become these negotiated queries:

- Successful entrepreneurs in the fashion industry
- Whether it is ever appropriate to cry in the workplace
- Chronic fatigue in long-haul Covid-19 patients

There’s nothing special, unique, or distinctive about the former or latter queries that initially identifies them as candidates for negotiation. Sometimes one-faceted queries, such as fashion or etiquette, mean that users are stating their interests broadly to get the interview started. Other times, users really are interested in their one-faceted queries. They just need to shop around for ideas to limit their topics, possibly using the clusters that accompany search results in some databases. The PubMed query is especially interesting; the librarian’s negotiation with the user transforms what initially appears to be technical assistance into a subject query. Question negotiation is a wise course of action, no matter how simple the initial query seems.
4
Selecting a Relevant Database

As a result of the negotiation phase of the reference interview, you should have an understanding of what the user wants in the form of a negotiated query. Categorizing queries helps you identify which of them necessitate online searching. For queries that do, you can proceed to steps 2 through 4 of the online searching process: database selection, typecasting, and facet analysis. Selecting the right database from the hundreds you may have access to requires understanding the different genres and forms involved. In this chapter, you’ll learn about the different types of databases and you’ll gain an understanding of the importance of knowing what a database does and doesn’t index. In the context of a specific information need, effective intermediation requires database knowledge as well as database searching skills.
DEFINING DATABASES

A database is a collection of data or information systematically organized to facilitate retrieval. That simple definition encompasses a vast array of forms. The web itself can be considered a database, with search engines indexing websites, web pages, blog posts, and
the contributions of everyday people who add reviews, citations, media, field observations, tags, biographical information, and much more to social media and other sites. Even individual websites have database-like attributes. For example, major retailers offer search features to help shoppers find pictures, descriptions, prices, and reviews of products and services. And although a library catalog is usually referred to as a catalog, it is a database by definition.

What facilitates retrieval in a database is indexing. A nonfiction book often has an index at the back where you can look up a term and find the page numbers in the book where that term appears. The terms are indexed to make them findable. The same principle applies in the online environment. Databases contain records, each of which contains metadata that describes the item the record represents. The metadata is input into designated fields, with the author’s name in the author field, the article title in the title field, and so on. Database systems are programmed to index the metadata such as author name and title of the work, subject terms identifying the main topic of the work, the content of abstracts summarizing the work, and the full text if it is included in the database. This indexing function makes it possible for you to search by author name to find books by that author, even when you know nothing else about the book, or to find articles using the words that you included in your keyword search.

We can think of an index at different scales as well. A database indexes the meaningful words in the surrogate records so we can identify when, where, and in what magazine an article was published. A full-text database may index every meaningful word in all of the full-text magazine articles it stores. Very common words such as the, it, and on are not indexed; they are stop words and cannot be retrieved in any database system that is programmed to ignore them.

The earliest electronic databases created for the library market were referred to as bibliographic databases because they organized and indexed records representing scholarly journal articles, trade and popular magazines, and other written works. In a sense, their content was a bibliography of works. Although the term is still used,
it’s more likely you’ll hear databases called research databases or just databases now that media, images, and data are also being indexed and included.

In information-intensive environments such as libraries, the term database usually refers to electronic indexes and full-text collections of scholarly publications and other material useful to researchers, such as newspaper articles, statistical information, audiovisual content, and images. College and university libraries subscribe to many databases in support of coursework and research. Traditionally, these have been databases with powerful search systems requiring excellent online searching skills to get the most relevant retrievals for the intended use. But there are several other kinds of databases—data repositories and collections of digital archives, among others—that may serve the purposes you uncover in reference interviews. Being aware of the panoply of database types will help you discern the best ones to use for each individual’s information needs.

The remainder of this chapter categorizes databases using various criteria, but there is one over-arching distinction to note now: whether the database is fee-based or freely accessible. This is not obvious to individuals who consider themselves web-savvy but are unaware that web search engines cannot retrieve copyrighted publications paywalled in commercial databases. It’s also not obvious to library users that the commercial databases they use at the library website are paid for by their tax dollars or student fees. Even though largely invisible to library users, paywalls can have a profound impact on the information seeker’s ability to find and use information relevant to their interests.

This key difference, fee or free, has to do with intellectual property rights. Many, if not most, of the databases accessible at library websites index copyrighted material and charge for access, including subscription fees that libraries pay so their users can search the databases. Subscription database publishers use some of the revenue to pay licensing fees to copyright holders so their material can be legally included in the database. Database publishers use some of the revenue from subscriptions to provide sophisticated
search systems and user-friendly interfaces. And some of the revenue provides profits for database publishers and income for database creators (which are sometimes one and the same). The full text of copyrighted material in password-protected databases, such as texts available at library websites, is not openly accessible on the web. These paywalled databases are referred to interchangeably in this book as commercial, licensed, or subscription databases. The exceptions are publications whose copyright holders have waived some restrictions to make their work more widely available. Such publications are stored in open-access repositories, which do not charge fees for users to search and download the material they index. Open-access repositories do not have to pay licensing fees for the material they index because the copyright holders have chosen not to exercise their exclusive right to reproduce and distribute their work, instead making it freely available. Although the copyrighted version of an article published in a scholarly journal may only be accessible in a commercial subscription database or for a per-item fee at the publisher’s website, an open-access version, if one exists, may be discoverable using Google Scholar (scholar.google.com) or the Directory of Open Access Journals (doaj.org). Another kind of database, the digital collection or digital library, may consist of electronic replicas of physical items held in special collections; born-digital material that has no preexisting physical counterpart; and material created by the holding institution, such as videos, audio recordings, and transcripts of oral history interviews. The holding institution may make the digital collection freely available on the web, or it may license access to a database publisher who then markets it to other libraries. A source of free research publications, numeric data, and other information is the US government, whose publications are in the public domain when issued, with a few exceptions. Federal government databases are freely available on the web, and some commercial databases also include government publications. The search systems developed for commercial databases tend to offer more features and options than the search engines enabling retrievals from open-access and public domain databases.
Your decision regarding which databases to search after conducting a reference interview will involve many considerations. It’s important to recognize that individuals not affiliated with a university, public, school, or other institution’s library may face restricted access to material that could be of use to them, and that even affiliated users may have a limited understanding of the number and variety of databases offering access to information not discoverable by web search engines.
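Before moving on, it may help to make the indexing function described under "Defining Databases" concrete. The following is a minimal sketch in Python; the records, stop-word list, and function names are invented for illustration, and production search systems use far more elaborate structures, but the core idea of an inverted index is the same: every non-stop word in a record's metadata is mapped to the records that contain it.

```python
# A minimal sketch of an inverted index over surrogate records.
# Record data and the stop-word list are invented for illustration.

STOP_WORDS = {"the", "it", "on", "a", "an", "and", "of", "to"}

records = {
    1: {"author": "Gordon Parks", "title": "A Choice of Weapons"},
    2: {"author": "Richard Flanagan", "title": "The Narrow Road to the Deep North"},
}

def build_index(records):
    """Map every indexed (non-stop) word to the set of record IDs containing it."""
    index = {}
    for rec_id, fields in records.items():
        for value in fields.values():
            for word in value.lower().split():
                if word in STOP_WORDS:
                    continue  # stop words are never indexed, hence never retrievable
                index.setdefault(word, set()).add(rec_id)
    return index

index = build_index(records)
print(index.get("flanagan"))  # {2}: an author search finds the record
print(index.get("the"))       # None: "the" was discarded at indexing time
```

Because "the" is discarded at indexing time, no search statement can ever retrieve it, which is exactly the stop-word behavior described earlier.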
CATEGORIES OF DATABASES

There are thousands of databases that libraries may make available to their users. The library’s mission and budget are factors librarians consider when curating their list of accessible databases. Although numbers don’t tell the whole story, you can get a sense of the relative availability of databases at different types of libraries by considering a few in one large city, Houston. The Texas Medical Center Library offers almost 170 databases, the Houston Public Library system and the Texas Southern University Library provide access to around 250 databases each, Houston Community College offers almost 300, the University of Houston around 530, and Rice University more than 1,400.

No matter how many databases a library offers its users, it can be difficult for the intermediating librarian to remember the types of content and search features of every single one. A library’s list of available databases often includes a brief description of each one. Database search screens usually have an information icon or help link where a description of the database content and search features can be found. When you are logged in to the EBSCOhost suite of databases, you can use the “choose databases” link to see descriptions of the ones available from your library. As figure 4.1 shows, clicking on the arrow next to a database name opens a
description, in this case of the Historical Abstracts database, which tells you how far back in time publications are indexed and what geographic areas are and are not included. Checking the database publisher’s website is another way to find database descriptions. Your library’s website will have tools to help you with database selection as well, offering lists by subject, type of material indexed, type of user the database is designed to serve, and by database name and possibly even publisher or aggregator name. It can be useful to sort databases into categories that will help you quickly identify and eliminate types that won’t retrieve the kind of information a user needs. Categorizing databases in terms of five elements—source type, genre, selection principle, form, and editorial control—helps with the process of identifying the right databases to search for a given query. Although some databases fit into more than one category, thinking in terms of categories can help you quickly eliminate some from consideration while homing in on the best one(s) for the task.
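As a rough sketch of how the five elements support this kind of elimination, the snippet below treats each database as a profile that can be filtered; the attribute values are illustrative guesses for this sketch, not official database descriptions.

```python
# A sketch of elimination by the five elements; the profile values are
# illustrative guesses, not official database descriptions.

profiles = [
    {"name": "Historical Abstracts", "source_type": "surrogate",
     "genre": "texts", "selection": "world history excluding the US and Canada",
     "form": "abstracting and indexing records", "editorial_control": "scholarly"},
    {"name": "Data.gov", "source_type": "surrogate",
     "genre": "numeric and geospatial data", "selection": "US federal agency datasets",
     "form": "catalog records", "editorial_control": "governmental"},
]

def shortlist(profiles, **required):
    """Keep only databases whose profile matches every required element."""
    return [p["name"] for p in profiles
            if all(p.get(key) == value for key, value in required.items())]

print(shortlist(profiles, genre="texts"))  # ['Historical Abstracts']
```

Filtering on one element at a time mirrors the mental process of quickly ruling out databases that cannot supply what the user needs.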
DATABASE SOURCE TYPES

Database search results may be of two types: actual sources and surrogates for sources. Whether a database is a source database or a surrogate database must be determined along two dimensions. The first dimension pertains to the document representation, in the form of database records, that your search statements search. In a surrogate database, each database record serves as a descriptive representation, a surrogate, of a unique item in the database. For example, a surrogate record for a journal article will include metadata such as the author’s name, article title, journal name, the date and page numbers indicating when and where in the journal the article was published, and often an abstract or summary of the article. When you use a surrogate database, the system has only this
abbreviated form of the actual source to search. In contrast, a source database includes the full items themselves. With many more keywords to process, a source database search system will likely retrieve more items that match your keyword search than a surrogate database can. Textbox 4.1 explains why surrogate records are important even in full-text databases.
Figure 4.1 Description of the Historical Abstracts database. By permission of EBSCO Publishing, Inc.

The second dimension addresses whether the database system retrieves results that are the actual sources themselves or are
descriptions of sources, such as a citation or summary. The former is a source database, and the latter a surrogate database. Most surrogate databases at the library’s website bear resolver links that query other databases for desired sources and, when they find them, display them to users, along with the option to download them to their personal computers. To complicate things a bit, a database that generally provides sources may include only surrogates for some items. For example, JSTOR has agreements with many journal publishers who impose embargoes that allow surrogate records representing recently issued full-text articles, but not the articles themselves, to be indexed in the database. Embargoes may last from months to years; once they expire the full-text sources can be retrieved. Knowing that a journal’s most recent articles are not in the database may be a consideration when you’re choosing the best database for a particular search. Another example of a database that indexes both surrogate and source material but limits access to the full sources is the Shoah Foundation Visual History Archive Online at the University of Southern California. A digital collection that began as a repository of Holocaust testimony and has expanded to include additional twentieth-century genocides, the Visual History Archive includes approximately fifty-five thousand interviews conducted worldwide. Institutions that subscribe to the archive have access to all interviews (USC Shoah Foundation 2022). Individuals who register to use the archive on the web can conduct searches for free to retrieve surrogates, so they can discover specific interviews. Figure 4.2 shows the search screen available to those who register. About four thousand of the full video interviews are accessible without charge. The default results screen lists items by availability, with full interviews at the top of the list, making it easy to see which sources are freely available.
TEXTBOX 4.1. The Utility of Surrogates in Full-Text Databases
Let’s take a moment to determine why source databases include both surrogates and full texts. You may think that the existence of the latter should preclude a need for the former; that is, surrogates should be superfluous in a full-text database because of the availability of the full texts. Yet full-text databases almost always include surrogate records, typically in the form of abstracting and indexing (A&I) records bearing citations, subject descriptors, and abstracts. Why? The easy answer is that A&I databases predate full-text databases, and there’s no reason not to include surrogates. But even LexisNexis, one of the earliest providers of databases that indexed full texts from the beginning, includes surrogate records. The deeper answer to this question relates to the role of the surrogate as a representation of the full text it is linked to. Think about the keyword searches you’ve conducted in full-text databases. Most likely, they’ve retrieved hundreds of results, presented to you as a list of surrogate records. Retrieving all those sources minus the surrogate records would be akin to someone plunking down a stack of three hundred books on your desk in response to their search of the library’s book collection for a topic that interests you. How would you respond? After rolling your eyes and taking a deep breath, you’d probably scan through the pile, reading titles, winnowing the list to the most promising titles, and then reading their book jackets, forewords, or introductions to determine whether you should put the effort into reading the whole book. Surrogate records provide summaries of content that you’d otherwise have to spend more time searching for in a full-text record. You review surrogate contents, scanning their titles, reading their abstracts, and checking their subject descriptors, all the while looking for evidence that the source is relevant. Convinced of relevance, you are then ready to invest your time in reading the full texts of the most promising ones.
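A small sketch can make the textbox's point concrete. The records and field names below are hypothetical rather than any vendor's actual schema, with a sample title borrowed from an article cited in this chapter; the contrast shows why a keyword search against full texts tends to match more items than the same search against surrogates alone.

```python
# Hypothetical surrogate and source records; the field names are invented
# for this sketch, not a vendor's actual schema.

surrogate = {
    "title": "Query Clarification in Live Chat Virtual Reference",
    "abstract": "Examines how librarians clarify user queries in chat reference.",
}

# The source record carries the same metadata plus the article's full text.
source = dict(surrogate, full_text=(
    "... the transcripts showed that open-ended questions elicited "
    "longer responses from users than closed-ended questions ..."
))

def matches(record, keyword):
    """True if the keyword occurs in any field of the record."""
    return any(keyword.lower() in value.lower() for value in record.values())

print(matches(surrogate, "transcripts"))  # False: only the full text has the word
print(matches(source, "transcripts"))     # True: sources expose many more keywords
```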
Figure 4.2 Search screen of the USC Shoah Foundation Visual History Archive Online. By permission of the USC Shoah Foundation–The Institute for Visual History and Education.
DATABASE GENRES

Four types of sources, or genres, are contained in databases: texts, media, statistical information and numeric data, and geospatial data. Although these are described separately here, many databases include more than one genre. Texts are written documents. The most common text-based genres contained in commercial databases are the products of academic scholarship, such as journal articles, conference papers, dissertations, theses, research reports, books and book reviews, and
the products of news reporting, such as newspaper and newsmagazine articles, newswire stories, and broadcast news videos and transcripts. Other commercial databases contain entire texts of different types of material: e-books, poems, plays, and archival documents, including primary sources such as digitized letters and diaries. Databases that include media such as audio files, videos, and images index word-based surrogate records that describe each item. For example, the Library of Congress Prints & Photographs Online Catalog provides an advanced search screen where you can limit your search to author/creator name, title, or subject. Figure 4.3 shows the first few results for an author/creator search for Gordon Parks. Next to each image is the metadata from the surrogate record that describes and represents the photograph taken by Parks.
Figure 4.3 Results page in the Library of Congress Prints & Photographs Online Catalog. Source: Library of Congress Prints & Photographs Online Catalog, https://www.loc.gov/pictures/
Numeric data includes statistical information that can be presented in tables, graphs, charts, and figures as well as raw data generated during scientific research or gathered using methods such as surveys, polls, interviews, scientific observations, and experiments. Long-established organizations such as the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan and the Roper Center for Public Opinion Research at Cornell University have made their data available for years. Countless other nonprofit, commercial, and governmental suppliers of data worldwide share and sell data as well. Since 2019, the OPEN Government Data Act has required US federal agencies to make their machine-readable datasets discoverable in the data.gov catalog on the web. Datasets are in high demand by expert researchers who deploy methods of data analytics and visualization to understand complex phenomena. College instructors and K–12 teachers also use datasets to help students develop math and coding skills and, in some cases, to help community members document local problems needing solutions. Queries for numeric information search surrogate records that describe in words the topic and type of data in tables, graphs, maps, charts, and figures, along with additional metadata such as the creator’s name and the date created. For example, Statista, a subscription database, produces reports, infographics, and datasets that can be discovered by browsing topical categories, searching by keywords, or combining browsing and searching. Search results, such as reports on a particular retail industry, include citations and links to the source of the original data. On the web, Google Images indexes the words in web page image tags, making it possible to input a query such as electric vehicles table to retrieve not only statistical tables about electric vehicles but also links to the web pages or publications that include them. Government websites, such as that of the Census Bureau, provide access to data that can be downloaded as PDFs or spreadsheets. The information seeker’s purpose—such as a speech needing a ready-made illustration or a dataset for use with analytic software—will determine what format is needed and how to search for it.
With geospatial data providing locations of items on or near the earth’s surface, it’s possible to create maps depicting many kinds of phenomena, from the worldwide (women in national parliaments) to the local (land parcel ownership in a county). An excellent example of the use of library collections to create a geotagged dataset from which to generate an informative map is high school student Skye Lam’s “Mapping the Green Book in New York City” (2021). Because many individual and institutional creators of geospatial datasets share their data freely, researchers can use the data to create new knowledge, and teachers and students can use it to work with geographical concepts and build skills. Open-access datasets can be found at university-sponsored sites such as the Big Ten Academic Alliance Geoportal (https://geo.btaa.org/) and the Open Geoportal (http://data.opengeoportal.org/), both of which offer advanced search forms where you can look for data by creator name, title, and other elements. The data.gov catalog also includes many geospatial datasets. As shown in figure 4.4, the left side of the data.gov search screen offers topical and other kinds of filters, including one for geospatial data. Once that limit is applied, you can use the search box to find data matching your keywords. Figure 4.5 shows results for a search for “oil wells,” including the quotation marks to retrieve the phrase and not the separate keywords. Results include data from NASA, NOAA, and a Pennsylvania state government agency. Clicking on a dataset title leads to a description of the data and to published articles whose authors have used the data to create maps and other graphics supporting the articles’ main points.
Figure 4.4 Data.gov search screen.
Figure 4.5 Geospatial dataset results for the search “oil wells” in the Data.gov catalog.
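The difference between a quoted phrase and separate keywords can be sketched in a few lines of code. The Python below is illustrative only (it is not how data.gov’s search is implemented), and the two sample records are invented for the example.

# Illustrative sketch: phrase matching versus separate-keyword matching.
records = [
    "monitoring data for abandoned oil wells in pennsylvania",
    "oil prices and water wells in rural counties",
]

def matches_keywords(text, words):
    # Separate keywords: each word must appear somewhere in the text.
    return all(word in text.split() for word in words)

def matches_phrase(text, phrase):
    # Quoted phrase: the words must appear together, in order.
    return phrase in text

for record in records:
    print(matches_keywords(record, ["oil", "wells"]),
          matches_phrase(record, "oil wells"), "-", record)

# The first record matches both ways; the second matches the separate
# keywords oil and wells but not the phrase "oil wells".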
SELECTION PRINCIPLE
The selection principle governs the database publisher’s decisions about the types of resources to include in and exclude from the database. Databases bear genre- or form-specific content, subject-specific content, and/or multidisciplinary content. You can see the selection principle in action when you choose a database at your library’s website. You can list databases alphabetically and apply qualifiers to limit the list to general multidisciplinary content, specific subjects, databases by form, databases by genre, or combinations of these. For example, the University of Arkansas Library’s list of four hundred databases can be limited by type, as shown in figure 4.6. Subject-specific databases may choose to index only a journal’s articles relevant to the subject rather than every article in the journal.
FORM

Form relates to the purpose of the database as reflected in the types of information included in it. Databases filled with facts and meant for quick look-ups are reference databases, which themselves come in a variety of genres. Research databases are also filled with facts and information but not designed for quick look-ups; instead, the material they index is meant for study, in-depth analysis, and synthesis.
Figure 4.6 Databases by title and subject at the University of Arkansas Libraries. Courtesy of the University of Arkansas.
REFERENCE DATABASES

Reference sources are your destination for facts. A fact is something that exists now or is known to have existed in the past, such as a person or place, or an object, event, organization, or institution. This section analyzes reference databases based on their genre: biographical information, dictionaries, directories, encyclopedias, and a group of resources including almanacs, handbooks, manuals, and yearbooks. Understanding the wide variety of reference database genres is the first step to becoming more knowledgeable and experienced with the databases at your library’s website so that you may quickly and efficiently target the right reference database for the job.

Biographical Information
Consult a biography for information about a person’s life, anything from their accomplishments to such factual data as birthdate, birthplace, names of immediate family members, and education. One example is Gale in Context: Biography, a database with more than six hundred thousand biographical entries from various sources (figure 4.7). With it, you can retrieve entries for well-known individuals, including their birth and death dates, family members, education, career histories, accomplishments, and other information. You can also search for categories of well-known people, such as by occupation, ethnicity, or both. Biographies may cover living or deceased persons or both. Biographical databases may be national, regional, or international in scope. Encyclopedias and dictionaries are also excellent sources of biographical information. The web is a source of biographical information, but be careful about accepting web-based information at face value. Someone who publishes a website about a famous person, living or dead, may be doing so to promote a personal or group agenda. Verify what you find on the web in other, more trustworthy sources.

Dictionaries

Most people are familiar with language dictionaries, which give definitions, etymology, pronunciations, and usages of words. Less familiar are discipline-based dictionaries, which give definitions for and explanations of concepts, events, objects, and overarching topics related to a single subject area. The most comprehensive and authoritative dictionary of the English language is the Oxford English Dictionary (Oxford University Press 2022). Enter a word into the “quick search” box to learn its meaning, history, and pronunciation. Alternatively, launch the advanced interface, choose “Senses” or “Quotation” from the drop-down menu, and enter a word or phrase to search for its occurrence in the texts of definitions or in quotations. Another language-oriented dictionary is the web-based Acronym Finder (http://www.acronymfinder.com/). Enter an acronym
and the system will list written-out forms of the acronym. Browse the list, and you are likely to recognize the one that answers your ready-reference query.
Figure 4.7 Entry from the database Gale in Context: Biography. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.

Discipline-based dictionaries can be helpful to information seekers new to a field of study. Some dictionaries focus narrowly on a specific subject area or field of study. Others are more comprehensive, covering whole disciplines comprising multiple subject areas. Some offer one-line entries, and others feature extensive entries with author attribution so you’ll know which subject-matter expert wrote each entry. Discipline-based dictionaries may use the terms encyclopedia, companion, and reference guide in place of dictionary in their titles, leading you to believe they give encyclopedia-length entries. Only by entering a few search statements and displaying retrieved entries will you know how much depth to expect. For example, netLingo (http://www.netlingo.com) entries describe the meanings of new technology-related words, phrases, and abbreviations in a sentence or two, whereas entries in the Dictionary of Canadian Biography Online
(http://www.biographi.ca/en/) are multipage essays with bibliographies, author names, and links to author affiliations. Licensed database publishers Gale and Oxford University Press have aggregated hundreds of reference sources into their Gale Virtual Reference Library and Oxford Reference Online databases. Both databases give users the option to search one reference source, multiple reference sources by subject, or all sources simultaneously.

Directories

Directories provide contact information for individuals and organizations. You can search telephone directories at http://www.whitepages.com and http://www.yellowpages.com to find people, businesses, and business categories and do reverse lookups by inputting phone numbers or addresses rather than names. The subscription database Gale Directory Library includes multiple resources such as Associations Unlimited, the National Directory of Nonprofit Organizations, and the International Research Centers Directory, among others. Entries include contact information; contact name; and a brief description of the organization’s purpose, services, and activities. Figure 4.8 displays an entry for the Warren County Beekeepers Association, showing contact information, the group’s website address, and a brief description.

Encyclopedias

Consulting encyclopedias for information enables readers to develop a working knowledge of the topics that interest them. Encyclopedias are also satisfactory for answering simple reference questions, such as “Why isn’t Pluto considered a planet anymore?” and “In what years did the Hekla volcano erupt?” In addition to Wikipedia, there are general encyclopedias for K–12 students that can help them learn something new and get started on a paper or project. At the college level, subject-specific encyclopedias feature entries written
by experts that give background information, definitions, detailed explanations, and current issues and trends about topics in the subject area. The bibliographical references at the end of an entry can lead students and researchers to authoritative sources. One example of an authoritative specialized encyclopedia is the open-access Stanford Encyclopedia of Philosophy (https://plato.stanford.edu/index.html). Each entry includes the name of the expert in the field of philosophy who wrote the entry as well as a bibliography of sources used in the entry.

Ready-Reference Mix
Figure 4.8 Entry from the Encyclopedia of Associations in the Gale Directory Library. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.

The final group of reference sources is a potpourri that yields facts to answer typical ready-reference questions. Much of the information published in almanacs, handbooks, manuals, and yearbooks is
available elsewhere, but for quick answers, librarians keep these handy reference sources in mind. Consult almanacs for facts, statistics, and lists (Smith and Wong 2016). An example is the Information Please Almanac (http://www.infoplease.com/almanacs.html), which can answer such questions as “What was the cost of a first-class stamp in 1965?” “What day of the week was I born on?” and “What religions have current and former US Supreme Court justices practiced?” Yearbooks review trends, issues, and events pertaining to a topic, place, or phenomenon in a particular year. Part almanac and part yearbook is the CQ Almanac, which “organizes, distills, and cross-indexes for permanent reference the full year in Congress and in national politics” (CQ Press 2022). Handbooks “serve as a handy guide to a particular subject, with all of the critical information that one might need for a particular field” in a single source (Smith and Wong 2016, 478). A prime example of a handbook is the Diagnostic and Statistical Manual of Mental Disorders. Published by the American Psychiatric Association, it is included in the organization’s subscription-based Psychiatry Online collection.
RESEARCH DATABASES

Consult research databases to answer queries that require subject searches. Typically, searches produce multiple results, none of which answer the user’s question entirely. To generate answers, users must synthesize the useful information they extract from retrievals and exercise their judgment, weighing the evidence and taking into account their particular situation and circumstances. Research databases can also be useful for ready-reference, known-item, and readers’ advisory queries, but knowing which research database is likely to yield answers and how to search
specifically for those answers requires knowledge of and experience with research databases generally. Although the boundaries defining different database genres have become porous as electronic resources have evolved, understanding the terminology and traditional importance of databases can be helpful when deciding how to spend your time searching.

Catalogs

A catalog holds surrogate records that describe sources contained in a collection, library, or group of libraries and that are organized according to a formal scheme or plan. The most familiar is the library’s online public access catalog (OPAC). Catalogers add subject terms to each item’s surrogate record that make the item findable even if its title doesn’t indicate what the item is about. Most libraries in the English-speaking world choose subject headings from the controlled vocabulary called the Library of Congress Subject Headings (LCSH). Although the metadata elements on the surrogate records in a catalog are indexed to make them discoverable, a library catalog doesn’t index the chapters in a book or the articles in a periodical. And while a catalog used to be understood as a list of all the holdings in a library—that is, everything the library owned—that distinction no longer makes sense as some library catalogs include surrogates for the e-books and databases they subscribe to but don’t own. Additionally, the single everything search box on a library’s home page blurs these distinctions by retrieving much more than the books, journals, and videos in a library’s own collection.

Bibliographies

A bibliography is a systematic listing of source citations; it can appear at the end of an encyclopedia entry, journal article, or book, among other publication types. It can also be a list of everything
published by a particular author. Here is an example of an item in a bibliography:

Thurman, Robert A. F. 1999. Inner Revolution: Life, Liberty, and the Pursuit of Real Happiness. 1st Riverhead Trade Paperback ed. New York: Riverhead Books. 322 pp.

The entry presents just enough information, or metadata, for the user to find the actual source in a source database or physical collection. It is easy to convert this item into a searchable surrogate in a database by parsing the metadata into separate fields (table 4.1).

Table 4.1 Bibliographic Information in a Surrogate Record

Field Type    Field Value
Title:        Inner Revolution: Life, Liberty, and the Pursuit of Real Happiness
Author:       Thurman, Robert A. F.
Edition:      1st Riverhead Trade Paperback
Place:        New York
Publisher:    Riverhead Books
Date:         1999
Pages:        322

A database search system’s indexing program processes field values (words, phrases, codes, or numbers) into separate indexes, one for each field type, and into one big combined index. With this process, the bibliography becomes a searchable index (i.e., a bibliographic database). A classic example is the Bibliography of Asian Studies, which was launched as an annual print volume in 1956 and which today is an online database of records containing citations but not abstracts. Users generally expect more than just citations, so over the years publishers of bibliographies have
enhanced them with subject terms, summaries, and even full texts. The Bibliography of Asian Studies has a searchable all-text field that makes it possible to retrieve full-text articles through resolver links from other library-subscribed databases. Another long-standing resource that began as a series of print volumes is the MLA International Bibliography, now a database of citations with some abstracts and with links to full texts available in other databases. If a database’s title contains the word bibliography, chances are that it has transitioned or is in the process of transitioning into a different genre of research database. Why their publishers have not changed their names to reflect their more-than-a-bibliography status may be out of deference to the many loyal and longtime users of these databases who are so familiar with their names.

A&I Databases

The surrogate records in abstracting and indexing (A&I) databases include not only bibliographic citations but also abstracts that summarize the items. You’ll encounter three types of abstracts in A&I databases. Informative abstracts function as a substitute for the source, detailing its quantitative or qualitative substance. Indicative abstracts function like tables of contents, describing the source’s range and coverage and making general statements about the source. Indicative-informative abstracts are part indicative of the source’s more significant content and part informative of its less significant content. Many A&I databases began life as print indexes dating back to the 1960s. Examples include Sociological Abstracts, Historical Abstracts, and Meteorological & Geoastrophysical Abstracts. Their titles were intended to distinguish them from the many indexes available at the time that did not include abstracts of the articles listed, such as the Sports Medicine & Education Index, which began publication in print with the title Physical Education Index and began to include abstracts in 2001 when it transformed into an electronic database. Few pure A&I databases exist today because full-text
sources may be available via the database itself or via journal publishers and journal aggregators through accompanying resolver links, as shown in figure 4.9, for example, on the brief results screen for a search for blues AND jazz in the Music Index.
Figure 4.9 Brief records and resolver links for a search in the Music Index. By permission of EBSCO Publishing, Inc.

Full-Text Databases

Search a full-text database, and the system matches your search terms with words in surrogates and sources. Full-text databases almost always include surrogates, typically in the form of A&I records containing citations, subject terms, and abstracts. Because the metadata elements are indexed, users can limit their searches to title only, abstract only, or a combination of fields, rather than searching through the entire full text. Users can also use the surrogate information to decide whether an item sounds relevant enough to download the full text.
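To make the indexing idea from the bibliographies discussion concrete, here is a minimal Python sketch of per-field and combined indexes, using the surrogate record from table 4.1. It is a toy under simplifying assumptions (one record, crude tokenization); production search systems normalize, weight, and rank far more elaborately.

from collections import defaultdict

# The surrogate record from table 4.1, parsed into fields.
record_id = 1
record = {
    "title": "Inner Revolution: Life, Liberty, and the Pursuit of Real Happiness",
    "author": "Thurman, Robert A. F.",
    "date": "1999",
}

field_indexes = defaultdict(lambda: defaultdict(set))  # field -> word -> record ids
combined_index = defaultdict(set)                      # word -> record ids

for field, value in record.items():
    for word in value.lower().replace(",", " ").replace(":", " ").split():
        field_indexes[field][word].add(record_id)  # one index per field type
        combined_index[word].add(record_id)        # one big combined index

print(field_indexes["title"]["revolution"])  # {1}: a title-only search
print(combined_index["thurman"])             # {1}: a search of the combined index

Limiting a search to the title index corresponds to the field-restricted searching described above; the combined index is what an unrestricted keyword search consults.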
EDITORIAL CONTROL

The fifth of the five categories of databases involves whether domain experts mediate content by applying quality-control methods. When librarians evaluate a database, they want to know the nature of the editorial control that its creators have exerted over the content. Databases that usually pass the test have been issued by national governments; scholarly academies; professional associations; cultural heritage institutions such as libraries, museums, colleges, and universities; and long-established and respected database vendors and publishers. Such entities enlist professionals with scholarly, scientific, or technical expertise to both author and review content. For example, Web of Science, a suite of databases owned by Clarivate, has an in-house staff specializing in different subject areas who apply more than two dozen criteria when deciding which scholarly journals to include (Web of Science Core Collection 2022). The articles in scholarly journals, magazines, newspapers, government publications, and e-books include material that has already been vetted by the publishers, either through peer review in the case of scholarly journals and government research publications or by in-house editorial staff for magazines, newspapers, and books. This kind of editorial control is built into databases indexing such pre-vetted material. Sitting between the long-established editorial control of traditional publishing and the lack of editorial control or oversight over many websites is the open-access repository. Institutional repositories, often managed by a university library, include research publications, dissertations, conference presentations, and similar kinds of material produced by faculty, graduate students, and others affiliated with the institution, including librarians. Subject-specific repositories archive the work of scholars in the discipline no matter where they are employed. Authors self-archive their work by uploading it to the subject-specific repository along with some descriptive metadata.
Although many of the works stored in a repository may be final versions of research articles, many may be preprints, drafts of articles that have not yet undergone peer review. The majority of subject-specific open-access repositories that accept preprints do screen submissions for basics such as plagiarism, completeness, and relevance. Although this screening may involve subject experts doing quick reviews for obvious problems, repositories do not conduct the kind of in-depth, lengthy evaluations and critiques typical of peer-reviewed journals. During the fast-moving coronavirus pandemic, however, health-related repositories increased their editorial control by subjecting preprints to scrutiny by experts looking for research that might be harmful to health or that could be used to fuel bogus conspiracy theories (Kwon 2020). Other kinds of repositories also store and make available material submitted by registered participants. For example, GitHub serves as a storage and retrieval system for open-source computer code and related material uploaded by creators. Open-source code is by its nature reviewed, corrected, and developed further by the original creators and by others who find it useful as a starting point for their own projects. GitHub’s search box can be used to find a wide variety of code and documentation, with the results screen providing filters for computer language and other factors. One more example demonstrates the breadth and depth of repositories, which can also be considered digital collections. Participants in the Multimedia Education Resource for Learning and Online Teaching (MERLOT) repository upload assignments, assessments, syllabi, and other educational resources to create a collection that educators across the globe and at all levels can use. Figure 4.10 shows part of the results page for a search for calculus. Filters to the left make it possible to limit results by type of material, educational level, and other facets. Repositories and digital collections can be of use to a wide variety of people worldwide, including not only researchers and educators but also autodidacts and homeschoolers.
Figure 4.10 Results page in the MERLOT database. Courtesy of MERLOT, California State University Long Beach.
CHOOSING THE RIGHT DATABASE FOR THE JOB

Sorting databases using this chapter’s descriptive categories gives you a broad idea of the potential of a particular database for answering user queries. Pairing your database categorization with your categorization of the user query as subject, known item, ready reference, or readers’ advisory gets you even closer to being able to answer the user query. You’ll conduct subject searches in research databases for many of the negotiated queries you categorize as subject. Likewise, you’ll choose reference databases for queries you categorize as ready reference.
The subject matter of a user’s query often dictates the choice of database, matching the query with a subject-specific research database. However, you could circumvent the subject matter issue altogether, choosing instead a database that indexes many different kinds of publications covering sundry subjects, such as Academic Search Ultimate. Another technique is to search multiple databases at one time. For example, at a library that subscribes to 163 databases on the ProQuest platform, a search for snowmelt retrieves 115,510 results. On the left side of the list of results are several filters, including one listing the names of the databases ranked from greatest to least number of results returned by each one. For the snowmelt search, the top two are the Agricultural and Environmental Science Collection with 39,220 results and Global Newsstream with 34,228 results. Using this filter helps you understand which database has indexed the most material on the topic, but rather than choosing on that basis alone, consider the types of publications the information seeker needs for their project. If they are only interested in current news stories, then Global Newsstream is the better choice. Database selection for ready-reference queries can be trickier than for research queries because of the problem of choosing from among the many genres of reference databases and because the genres themselves are in flux. As the reference interview proceeds, it may become obvious that a combination of reference and research databases is needed. Someone looking for new reading material they will enjoy is engaged in a kind of research task and may be served by searching the NoveList Plus database designed for readers’ advisory queries. Once a book is identified, it becomes a known item that can be searched in the library catalog, and reviews of it can be retrieved in a research database like Book Review Index Online. For the reader who wants to know more about the author, a biographical dictionary is useful; Gale’s Literature Resource Center includes Contemporary Authors, Contemporary Novelists, and the Dictionary of Literary Biography, among others. Initially, you might feel that trial and error best describes how you respond to user queries. Be patient with yourself, and when
things don’t go as planned, keep in mind that your use of a particular resource might not be fruitful this time, but by using it, you learned something about the resource that you can rely on in the future. Over time, as you gain more experience using library resources and interacting with library users, you’ll find that making choices among the many forms and genres of databases becomes easier.
QUESTIONS

1. How much can you learn about a database before searching it? Look through the list of databases at one of the library websites listed below. To what extent are the descriptions of each database helpful for selecting one to search? Is there any information you would find helpful that’s not included?

Art Institute of Chicago, Ryerson and Burnham Libraries, https://artic.libguides.com/az.php
California State Library, https://www.library.ca.gov/services/online-resources/
Gallaudet University Library, https://libguides.gallaudet.edu/az.php
Pennsylvania State University, Harrell Health Sciences Library, https://hershey.libraries.psu.edu/resources/databases (use the “show all descriptions” button)
Pine Technical and Community College Library, https://pine.libguides.com/az.php

2. Look through one of the database lists again and tally the databases by genre or type. How many are research and how many are reference databases? Are there overlaps that make it difficult to assign a database to only one category? Are any
freely available web-based collections or resources included in the list, and how did you identify them?
SUMMARY

During the reference interview, you generate an understanding of what the user wants in the form of a negotiated query. You should also be able to categorize the negotiated query. Now you are prepared to tackle database selection, a daunting task because of the sheer volume of databases at your library’s website and on the web. It’s helpful to categorize databases according to five attributes: (1) source type, (2) genre, (3) selection principle, (4) form, and (5) editorial control. Categorizing queries and databases can help you bridge the gap between the reference interview and conducting a search for the user’s query. When searches go awry (and they will), rethink the query in collaboration with the information seeker and consider the full range of fee-based and freely available databases of potential use. After all, a research database indexing newspaper articles can answer a question about when an event happened, just as an almanac can. And an encyclopedia entry’s bibliography can lead to articles useful for a research paper, just as a research database can. Help yourself make a decision by reading the descriptions of databases on library websites, in the databases themselves, at database vendors’ sites, and at the About links on repository, digital collection, and other database home pages.
REFERENCES
CQ Press. 2022. “About CQ Almanac Online Edition.” Accessed September 3, 2022. https://library.cqpress.com/cqalmanac/static.php?page=about&type=public.

Kwon, Diana. 2020. “How Swamped Preprint Servers Are Blocking Bad Coronavirus Research.” Nature 581, no. 7807: 130–31. https://doi.org/10.1038/d41586-020-01394-6.

Lam, Skye. 2021. “Mapping the Green Book in New York City.” Accessed January 6, 2022. https://storymaps.arcgis.com/stories/c61ac50131594a4fb2ff371e2bce7517.

Oxford University Press. 2022. “OED, Oxford English Dictionary: About.” Accessed September 3, 2022. http://public.oed.com/about/.

Smith, Linda C., and Melissa Autumn Wong. 2016. Reference and Information Services: An Introduction, 5th ed. Santa Barbara, CA: Libraries Unlimited.

USC Shoah Foundation. 2022. “Visual History Archive Online.” Accessed January 3, 2022. https://vhaonline.usc.edu/access.

Web of Science Core Collection. 2022. “Editorial Selection Process.” Accessed September 2, 2022. https://clarivate.com/products/scientific-and-academic-research/research-discovery-and-workflow-solutions/web-of-science/core-collection/editorial-selection-process/.
ANSWERS

1. Descriptions of each database on a library’s list must strike a balance that gives the potential user enough specific information to understand the kinds of results to expect without taking up too much space on the screen, especially when a library subscribes to many databases. Descriptions may mention
document types (journal articles, dissertations), the approximate number of indexed items, and the topics covered. A description may not indicate how far back in time the indexed publications go or whether full texts are available only for the later years. For most users, the more recent material is all that matters, but in some fields, researchers are interested in earlier work and might appreciate knowing the dates covered in a database.

2. Reading the descriptions helps you sort databases into categories, if the description is clear about the kinds of material indexed. When open-access repositories are included in lists with commercial databases, the descriptions may be somewhat vaguer for the repositories. This stands to reason since the creators of commercial databases have concrete selection criteria for what to include and exclude, while repositories, driven by the interests and motivations of potential contributors, may have a more eclectic collection of works. The best approach for understanding the usefulness of any given database is to use it and evaluate your experience with it.
5
Presearch Preparation

Search systems have had plenty of time to evolve over their sixty-year history. Their developers have listened to both expert searchers and end users, responding to their needs by implementing a wide range of tools in their systems. Yet successful searches don’t start with great system functionality. They start with the expert intermediary searcher developing a clear understanding of the user’s query and restating this query in the language of the search system so that the system produces relevant results. Chapter 5 covers the presearch preparation that enables you to intermediate successfully between the seeker and the search system. The preparation that you do immediately after negotiating a user’s query, typecasting it, and choosing which database(s) to use involves two steps that will set you up for a successful search: facet analysis and logical combination. This chapter begins with an overview of Boolean logic as used in search systems as a basis for understanding query facets and the combination of facets using logical operators. Consider the purpose of the information seeker’s quest when preparing to search. It will help you think about the relative importance of two key gauges of search effectiveness: recall and precision. The result of a broad search statement, such as a single keyword, is greater recall of surrogates and sources. If you are interested in western wildfires, you can search using only the keyword wildfire for maximum recall. You can judge the level of recall by evaluating how much the results diverge from the intended topic. Perhaps some of the sources use the word wildfire in a simile, “spreading like wildfire,” matching your literal keyword but not your
intended meaning. In the early stages of a topic search, having these kinds of irrelevant results, or false drops, in a set of results is not a problem because they indicate you have reached a high level of recall, and there are techniques for creating results that more precisely match the information seeker’s negotiated query. The result of a narrow search is greater precision in the set of results. In subject research, beginning with a high-recall search and then using additional techniques to retrieve a more precise set of results usually works well. Beginning with a narrow search instead will retrieve fewer and better results, but at the risk of inadvertently eliminating some surrogates or sources relevant to the topic. When looking for a known item or a fact, on the other hand, the search should be designed with precision in mind. The goal is to find information relevant to the seeker’s information need. The idea that relevance of search results is a good way to evaluate the effectiveness of an information retrieval system dates back to the beginning of the information science field. In a study of the concept of relevance during the first decades of information retrieval system development, Mizzaro (1997) found no consensus on how to calculate the relevance of search results or to what extent the literal matching of the results to the search statement should be balanced with the information seeker’s own perceptions. Was a search result relevant because it matched the search statement, or was it relevant because the seeker saw something useful in the result that wasn’t represented in the search statement? To complicate matters further, the seeker’s perception of which search results are relevant and how to search for more shifts over time as the seeker learns more about the topic and becomes more familiar with its literature (Bates 1989). One of the main reasons why web search engines are so successful is that they can quickly calculate relevance based on matching keywords, the context of their use, the information seeker’s own search history, and other elements so as to present results ranked by the algorithm’s calculation of relevance. While many commercial databases offer the option of seeing results in relevance-ranked order, it’s still best to
craft a search statement that retrieves only, or mainly, relevant surrogates and sources for the user to choose from. Central to crafting a search that retrieves the most relevant material is the use of Boolean operators, which help achieve a balance between recall and precision.
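Recall and precision can be expressed as simple ratios, which makes the trade-off easier to see. The counts in this Python sketch are invented for illustration.

# Hypothetical counts for a single search, invented for illustration.
relevant_in_database = 200  # all the relevant items the database holds
retrieved = 150             # everything the search statement returned
relevant_retrieved = 120    # retrieved items that are actually relevant

recall = relevant_retrieved / relevant_in_database  # 0.6: 60% of what exists was found
precision = relevant_retrieved / retrieved          # 0.8: 80% of what was found is useful

print(f"recall = {recall:.0%}, precision = {precision:.0%}")

A broader search statement tends to raise recall and lower precision; a narrower one does the reverse.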
BOOLEAN OPERATORS AND VENN DIAGRAMS

Most database search systems are programmed not to search for stop words: the and an, prepositions, and conjunctions that are used so frequently that they have no substantive meaning of their own. The conjunctions and and or and the adverb not are typically stop words that are not indexed or searched. Instead, they are programmed to have a special function as Boolean operators. These operators are named for George Boole, a famous nineteenth-century English mathematician, who invented the form of logic that underlies their function. Online Searching represents Boolean logical operators in capital letters so you can distinguish the Boolean AND, OR, and NOT from the English-language conjunctions and and or and the adverb not. Boolean-based search systems use AND for increasing precision, OR for increasing recall, and NOT for eliminating items:

AND. Inserted between two different facets of a topic, this logical operator tells the search system to find sources containing both facets.

OR. Inserted between similar facets, this operator tells the search system to find sources containing any of the facets. Some sources may include some or all of the facets, but all of the results will have at least one of them.
NOT. Inserted between two facets, this operator tells the search system to exclude sources containing the second facet from the set of results.
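Because search systems implement these operators as operations on sets of record identifiers, their behavior can be sketched with Python’s built-in set type. The record numbers below are arbitrary stand-ins, not real retrieval counts.

# Each set holds made-up record numbers matching one search term.
california = {1, 2, 3, 4, 5}
wildfires = {4, 5, 6, 7}
forest_fires = {7, 8}

print(california & wildfires)    # AND -> intersection: {4, 5}
print(wildfires | forest_fires)  # OR  -> union: {4, 5, 6, 7, 8}
print(wildfires - forest_fires)  # NOT -> difference: {4, 5, 6}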
Figure 5.1 Venn diagram of the Boolean operator AND.

When you conduct a search, the retrieval system creates a set of results. Sets of search results can be illustrated using Venn diagrams (named for their originator, English logician John Venn). Figure 5.1 shows a Venn diagram depicting the use of AND between the two different facets of the topic California wildfires. The rectangle represents all the items in a database. The medium gray circle represents the set of more than one million records containing the keyword California in a search of the Academic Search Ultimate database. The light gray circle represents the set of more than seventeen thousand records containing the keyword wildfires. The darker gray overlap is the set of 3,273 results containing both keywords. The set for California represents 100 percent recall because it includes every surrogate or source containing that word.
The set for wildfires represents 100 percent recall of all the surrogates and sources containing that word. But if you’re researching California wildfires, neither set alone offers much, if any, precision. Most of those one million California items don’t even mention wildfires, nor do all the results in the wildfires set mention California. It’s only by combining the two sets using the logical operator AND that we begin to retrieve articles that we know mention both keywords. But what if there are some articles that don’t use the word wildfires but instead use the terms forest fires or grass fires? We would have missed these articles in our first search using AND. To increase recall, use the Boolean OR between synonyms, closely related terms, and variant spellings to make sure you’re not missing something. The Venn diagram in figure 5.2 represents the search wildfires OR “forest fires” OR “grass fires.” The wildfires set consists of all 17,224 items in the database containing the word wildfires. The forest fires set represents 100 percent recall of the 12,179 records containing that phrase, and the grass fires set represents all 200 of the records containing that phrase. All of the circles in figure 5.2 are shown in the same color because all the results are in one set. The set doesn’t contain 29,603 items, the sum of the three sets’ sizes. Instead, it contains 19,350 items. Why? Because some of the records in this set contain all three terms, some contain two of them, and some contain only one; these overlaps are shown in the circles on the left side of figure 5.2. The OR operator tells the system to retrieve any record that has any one of the three search terms. Using OR generates huge recall but very little precision, because the other facet of the topic, California, has not yet been added into the search statement using AND.
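The arithmetic behind that difference is the overlap among the sets: a record containing two or three of the terms is counted once in the union, so the union is smaller than the sum of the set sizes. A tiny sketch with invented record numbers shows the principle.

# Invented record numbers; record 3 contains all three terms, record 4 two of them.
wildfires = {1, 2, 3}
forest_fires = {3, 4}
grass_fires = {3, 4, 5}

print(len(wildfires) + len(forest_fires) + len(grass_fires))  # 8: overlaps counted repeatedly
print(len(wildfires | forest_fires | grass_fires))            # 5: each record counted once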
Figure 5.2 Venn diagram of the Boolean operator OR.

If you and the information seeker decide that forest fires aren’t of interest, the NOT operator can be used. A search for wildfires NOT “forest fires” isn’t the best strategy, as figure 5.3 shows. All of the records that contain both keywords are eliminated. We won’t be able to consider whether the ones that mention wildfires and forest fires are useful to our research because we won’t even know they exist. In other words, using NOT in front of a search term will eliminate all the items with that term, even if they also contain the search term for another facet of the information seeker’s query. As the Venn diagram shows, NOT takes a bite out of the set of relevant results. There are better strategies for crafting search statements that yield precise results. The logical operator NOT is best reserved for eliminating other aspects of results, such as specific periodicals a student has been told not to use in their research or certain document types. For example, in a database that indexes full journal issues, including the scholarly book reviews that appear in each issue, it may be useful to use NOT to keep the document type book reviews out of the search results when the information seeker wants only research articles.
Figure 5.3 Venn diagram of the Boolean operator NOT.
CONDUCTING THE FACET ANALYSIS

Boolean operators are useful search tools because of their ability to create and hone sets of search results. They work best after you have typecast negotiated queries and selected a database. The next step in presearch preparation, before actually doing the search, is a facet analysis of the user’s query, expressing it in no more than a handful of big ideas, major concepts, or facets that should be present in results. A facet is a word or very short phrase that describes a single concept or idea. To distinguish facets from other online searching conventions, facets are shown in bold type, with the first letter of each facet word capitalized. Consider the number of facets in this research question:
“Does methane cause climate change?”

Your facet analysis should identify these two facets:

A. Methane
B. Climate Change

If it did, then you detected two facets: one facet expressed as the one-word noun Methane and the second facet expressed as the two-word adjectival phrase Climate Change. Perhaps you analyzed this differently, arriving at these three facets:

A. Methane
B. Climate
C. Change

If you did, then you divided the facet Climate Change into two separate and broader facets, Climate and Change. To determine whether your division is right, wrong, or somewhere in between, a discussion of adjectival phrases is in order. Adjectival phrases are common in the English language. On their own, these phrases convey ideas that are both specific and complex. Here are some examples:

college athletes
role-playing
health care reform
academic achievement
opioid addiction

Separately, the individual terms college and athletes, role and playing, or academic and achievement convey ideas that are broader and simpler than their original forms, college athletes, role-playing, and academic achievement. The English language uses adjectival phrases to express many specific, complex ideas. Accustomed to the English language, you understand an adjectival phrase as a specific, complex idea; it is a single, unitary, and indivisible idea. If your environmental studies professor gave a
lecture on the relationship between methane and climate change, you wouldn’t break the adjectival phrase into two separate parts, climate and change. Using adjectival phrases frequently in everyday speech, English-language speakers have come to think of them as one concept, not two or three concepts or however many words make up the phrase. In library and information science, this phenomenon is called precoordination, the combination of individual concepts into complex subjects before conducting a search for them. Let’s revisit the facet analysis for the query “Does methane cause climate change?” Now, knowing about precoordination, which facet analysis is right—the two facets Methane and Climate Change or the three facets Methane, Climate, and Change? For the time being, both facet analyses can be considered acceptable. You won’t know for sure which is correct until you choose a database and represent these facet names with the search terms that the database uses to express these facets. Nevertheless, the two-facet formulation is probably more correct than the three-facet formulation because of precoordination and because the phrase is so widely used.
CRAFTING THE LOGICAL COMBINATION OF FACETS

Your next step is to combine the query’s facets using Boolean logical operators. The logical combination of facets you construct now will become the search statements you input in database search boxes later. For this query, you want the search system to find sources bearing both facets, so you insert the Boolean AND operator between them: Methane AND Climate Change. In figure 5.4, the rectangle represents all the sources in the database. The dotted circle on the left represents all sources in the
database that contain the word methane. The gray circle on the right represents all sources in the database that contain the phrase climate change. The wedge where the two circles overlap is the set of sources that discuss both methane and climate change. This overlap is the set of results that the database search system will retrieve in response to the AND Boolean search. Presumably, sources that discuss both will include some discussion of the cause-and-effect relationship between the two.

Expressing Relationship Facets

Some user queries are expressed in ways that anticipate a relationship between two facets, tempting the searcher to include a third facet describing a type of relationship between the conditions. The query “Does methane cause climate change?” is a candidate in this regard, inviting the searcher to identify a relationship or a causality facet. Table 5.1 describes some relationship types that predispose the searcher to establish a relationship facet and add it to the facet analysis and search statement. Queries like these tempt you to include an effect, impact, or influence term in the logical combination of terms in the search statement. Let the Boolean operator AND establish the relationship for you. Conduct the search without including the relationship facet and then evaluate the results with the information seeker before deciding whether a revised search is necessary.
Figure 5.4 Venn diagram depicting the overlap containing two facets.

More about Facet Analysis and Logical Combination

Let’s work on a second facet analysis and logical combination. The query is “I need to know whether switching to flextime or flexible work schedules will improve my employees’ morale and motivation.” Take a few moments to think about this query’s facets. If you are stumped, restate the same query in different ways:

Is there a relationship between flextime and employees’ morale and motivation?
Does flextime matter when it comes to the morale and motivation of employees?
Do employees on a flextime work schedule have better morale and motivation?

The facet analysis for this query yields four facets:

A. Flextime
B. Employees
C. Morale
D. Motivation

Table 5.1. Relationship Types

Relationship Type: Effect
Query Example: I am doing a project on how steroid use (A) affects election into the Baseball Hall of Fame (B).
Description: Does condition A have an effect on B?

Relationship Type: Impact
Query Example: What impact does moderate exercise (A) have on heart health (B) and brain functioning (C)?
Description: Does condition A have an impact on B and C?

Relationship Type: Influence
Query Example: How did Laura Bush (A) and Michelle Obama (B) influence education policy (C)?
Description: Do conditions A and B influence C?
Your next step is to indicate to the search system how it should combine these four facets in an online search. You want the search system to find sources bearing the Flextime and Employees facets and thus insert the Boolean AND operator between them. Then things get tricky. The user is interested in both their employees’ morale and their employees’ motivation. The retrieved sources don’t have to discuss both; as long as they discuss one or the other in the context of flextime and employees, that will be satisfactory. Insert the Boolean OR operator between the Morale and Motivation facets. The logical combination for this query is:

Morale OR Motivation AND Flextime AND Employees

A Venn diagram helps us visualize the logical combination of this query’s facets. In figure 5.5, the rectangle represents all the sources
in the database. The light gray areas are the results for the three different facets. The medium gray areas are the results containing two facets. The medium gray wedge on the upper left depicts the set of items that contain the Flextime facet and at least one of the keywords from Morale OR Motivation. The medium gray wedge on the upper center depicts the set of items that contain both Flextime and Employees, while the one below it and to the left contains at least one of the Morale OR Motivation facet words and Employees. The dark gray area represents results containing all three facets.
Figure 5.5 Venn diagram depicting the overlaps of all three facets.
Your logical combination for this query uses two kinds of Boolean operators: one OR operator and two AND operators. Each search system has its own rules for processing queries. If a single search box is used, a system may process a search statement from either left to right or right to left. Some systems have a rule specifying the precedence of operators; for example, they process ANDs first, then ORs, or vice versa. All of the databases on a single publisher’s platform will process Boolean operators the same way. Nevertheless, memorizing many different publishers’ precedence of operators, or having to look them up in the help screens or discover them through trial and error, isn’t necessary. Instead, insert parentheses into your logical combinations to force the system to perform Boolean operations in your preferred order. The use of parentheses is called nested Boolean logic, and it works the same way as the parentheses used in algebra:

(5 × 5) + 3 = 28
5 × (5 + 3) = 40

That is, the search system first performs the Boolean operation(s) nested in parentheses and then moves on to operations not in parentheses. When nested Boolean logic is added, the logical combination becomes:

(Morale OR Motivation) AND Flextime AND Employees

When users discuss their queries with you, they will use natural-language parts of speech (everyday English, in this case). The negotiated query statement uses the English conjunction and between the Motivation and Morale facets. Boolean operators and natural-language parts of speech are not one and the same. Disregard the parts of speech information seekers use when you conduct the facet analysis and logical combination. If you are unsure whether the user would agree with your logical combination, ask something like: “Would you be satisfied with sources that discuss whether flextime improves employee morale but fail to mention motivation, or are you only interested in sources that discuss
flextime, morale, and motivation?” The response determines your search statement’s logical combination:

(Morale OR Motivation) AND Flextime AND Employees

or

Morale AND Motivation AND Flextime AND Employees

Let’s conduct a third facet analysis and logical combination. The user’s query is “I’m working on a term paper about graphic novels and the adults who read them. I’ve found a couple of books, but I need some research articles.” Take a few moments to think about this query’s facets. The facet analysis for this query yields two topical facets:

A. Graphic Novels
B. Adults

Since the information seeker has told you that books on the topic have been published, you know to expect some book reviews in your results when you search a research database. The seeker only wants substantive research in the form of articles. The logical combination of these facets as keywords in a search statement would be:

(graphic novels OR comic books OR manga) AND adults NOT book reviews

The Venn diagram in figure 5.6 visually represents the logical combination of this query’s facets. The rectangle represents all the sources in the database. The light gray circle represents the Graphic Novels facet of the topic. The medium gray circle represents the Adults facet. The dark gray circle represents all of the book reviews in the database, and it eliminates the book reviews from both of the facet circles. The gradient gray wedge is the set of items that meet the seeker’s information need. There’s almost always a hitch when using the NOT operator: NOT is likely to eliminate relevant items. In the graphic novels example, it’s possible the database has indexed a research article discussing how adults use book reviews to discover graphic novels they want to
read, which would be of interest for this user’s research paper. Using NOT to exclude book reviews would eliminate that very relevant article. When you build logical combinations including the NOT operator, think about the relevant results that this Boolean operator eliminates. For this query, you may have saved the user a bit of time since they don’t have to go through all the results to identify each book review and eliminate it from their list of sources. But that time saving may come at the cost of missing a relevant article.
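To see how parentheses control the order of evaluation, the sketch below runs the graphic-novels statement over made-up record sets. The record numbers and set contents are invented; the point is only that grouping the OR terms first changes the result.

# Made-up record numbers for each search term.
graphic_novels = {1, 2}
comic_books = {2, 3}
manga = {4}
adults = {2, 3, 4, 5}
book_reviews = {3}

# (graphic novels OR comic books OR manga) AND adults NOT book reviews
grouped = ((graphic_novels | comic_books | manga) & adults) - book_reviews
print(grouped)  # {2, 4}

# A system that performed AND and NOT before OR would instead compute:
ungrouped = graphic_novels | comic_books | ((manga & adults) - book_reviews)
print(ungrouped)  # {1, 2, 3, 4}: record 3, a book review, creeps back in

The parentheses guarantee the same result no matter which precedence rules a given search system follows.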
Figure 5.6 Venn diagram depicting NOT eliminating book reviews from the results.
FACET ANALYSIS AND LOGICAL COMBINATION FOR READERS' ADVISORY QUERIES
A reference interview with a teenager looking for reading material like what they have enjoyed in the past will yield some elements that differ from the kinds of research-oriented queries discussed so far in this chapter. Nevertheless, the facet analysis and logical combination process can be applied to the quest for the next book to read. Such a quest is itself a form of research, even though the purpose in this example does not involve writing a paper on a topic. After hearing a description of the teenager's reading experiences and preferences, the negotiated query might develop into something along these lines: Holly Black's Folk of the Air series of books, especially the first one, The Cruel Prince, the suspenseful story of a young woman who's abducted from her home and taken to the land of faeries, where she has to be smart and brave to survive. A lot of elements are mentioned, and an analysis might include these facets:
Fantasy Fiction
Young Adult
Strong Female Protagonist
Fast-paced and Suspenseful
Author's Name
Book or Series Title
Websites and databases designed for readers index facets that reflect the distinctive features of literature: genre, intended readership by age, character types, and story appeal factors. The ability to search by author name and title yields results describing
authors and books, along with read-alike suggestions of other authors and books. Readers' reviews of books may also be included to help the user decide whether a potential next read will offer an experience similar to the books they've already read. Although a lot of facets can be identified from this reader's story of their reading experience, not all should be included in a logical combination. That's because readers' advisory databases, like public library websites, emphasize the discovery of new books through browsing, just as physical library shelves do. Because it will lead to read-alike suggestions and to browsing by genres and reader age groups, the simplest logical combination for this readers' advisory query is: Holly Black AND Cruel Prince. But that might be too literal and too limited for a voracious reader interested in expanding their horizons, something that might become more apparent after they see the results for this first search. The focus of the negotiated query might shift. An approach might be this logical combination: (Fantasy OR Science Fiction) AND Strong Female Protagonist. As with a research-oriented search, the reader must evaluate results and choose the ones that fill the immediate information need, knowing that the need may change over time as they read more and learn more.
FACET ANALYSIS AND LOGICAL COMBINATION FOR KNOWN-ITEM SEARCHES
The facet analysis for known-item searches involves the same process. Scrutinize the negotiated query for facets that are publication-specific attributes of an existing source. A reader's desire
to read more of an author’s work has one facet, the author’s name. Other specific attributes of an existing source can be used, such as the title of an article, book, report, conference paper, film, game, or dataset; contributor name; publisher; journal, magazine, or newspaper name; sponsoring organization or funding agency; and date. Identify these elements in the negotiated query. For the logical combination, insert Boolean AND operators between each publication-specific element. For example, the information seeker may remember only the author’s last name and a few words from the article title, so the facet analysis is obvious. Since you want the search system to find sources—or the source—containing both title and author facets, insert the Boolean AND operator between them to complete the logical combination: [Title] AND [Author]. Brackets around the publication-specific element names here mean that you’d replace element names with the title and author identified in the query. Figure 5.7 visually represents the logical combination of this query’s facets. The rectangle represents all the sources in the database. The circle on the left represents all sources in the database that have an author whose name includes the facet Allende. The circle on the right represents all sources in the database that have the facet word Spirits in the title field. Where the two circles overlap reside sources that match both title and author elements, retrieving results that are most likely to include the one item that interests the user. If the author has a distinctive or unusual name and the seeker remembers many of the words in a title, chances are good that the overlap is a single item and that it is the one the user is seeking. If the author has a common name and the title uses common words or phrases, there will be more results. Having several results to choose from at least increases the likelihood that the needed one is in the set. It’s not necessary to have the author’s name to track down a needed item. The known-item seeker may be a student whose professor recommends an article. By the time the student contacts a librarian, they may be able to remember only a few of the keywords from the title. Since most database search systems index every word in every title in the database, using AND to string
together as many keywords as the student can remember will yield a few results that the student can scan for recognition. Precision is important when looking for a known item, but recall can also serve a purpose.
Figure 5.7 Venn diagram depicting author name AND book title word overlap.
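To see why the AND of remembered title words and an author surname homes in on one item, consider this small Python sketch. The three records describe real books, but the miniature "database" and matching function are invented for illustration.

# Known-item search: AND remembered title words with an author surname.
records = {
    1: {"title": "The House of the Spirits", "author": "Allende, Isabel"},
    2: {"title": "City of the Beasts", "author": "Allende, Isabel"},
    3: {"title": "Animal Spirits", "author": "Akerlof, George A."},
}

def known_item(title_words, author_word):
    """Return records whose title contains every word and whose author matches."""
    hits = []
    for rec_id, rec in records.items():
        title = rec["title"].lower().split()
        if (all(w.lower() in title for w in title_words)
                and author_word.lower() in rec["author"].lower()):
            hits.append(rec_id)
    return hits

print(known_item(["spirits"], "allende"))  # [1]

The title word alone matches two records, and the author alone matches two, but the intersection is the single item the user wants.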
QUESTIONS
First, typecast the following negotiated queries as candidates for a subject, ready-reference, readers' advisory, or known-item search. Second, perform a facet analysis and logical combination on these queries. If you are stumped, rephrase the query in your own words, and draw a Venn diagram to represent the logical combination visually. Omit relationship facets. Third, jot down whether you'd search for each query in a research database (including a readers'
advisory database) or a reference database. Answers conclude the chapter.
1. I'm a new school bus driver, and I want to know more about handling discipline problems on school buses.
2. I just read Kate Bornstein's memoir, Gender Outlaw, and learned a lot. Can you help me find more memoirs or biographies of nonbinary or trans people?
3. Are teens whose parents are divorced likely to develop problems like eating disorders and substance abuse?
4. When adult learners engage in conservation education, does their behavior change? For example, are they more likely to recycle, turn down their home thermostats in the winter, purchase energy-efficient vehicles and appliances, and so on?
5. Get a complete citation for a source with the title The Intersectionality of Sex, Race, and Hispanic Origin in the STEM Workforce.
6. Who was the speaker preceding Martin Luther King Jr. on the day he gave his "I Have a Dream" speech, and what did this speaker have to say?
7. Someone told me that a wandering mind is an unhappy mind. Really? Show me the research!
8. Who were the forty-niners?
SUMMARY
When you develop a working understanding of what the user wants, you'll typecast the negotiated query as a research, reference, readers' advisory, or known-item query. You'll identify at least one database that can potentially provide an answer. Then, you'll conduct a facet analysis of the query, expressing it in no more than a handful of big ideas, major concepts, or facets that should be
present in the results. The next task is to construct a logical combination of facets. Although there is not necessarily a set order for performing these steps, initially you may find yourself being deliberate about the process, accomplishing each step in a checklist fashion. With experience and practice, your execution of presearch preparation steps (typecasting the query, selecting a database, analyzing facets, and combining facets with logical operators) will take place almost simultaneously. On occasion, the facet analysis reveals queries that would benefit from a relationship facet. For the time being, restrain yourself from adding such a facet because the nature of the research query and the use of the Boolean AND operator together may retrieve results that discuss impact, effects, causes, and similar relationships among the facets. Thinking through and putting into practice all of the presearch preparation steps will help you conduct an efficient and effective search.
ANSWERS
Answers are given in this order: typecasting (subject, reference, readers' advisory, or known item), facet analysis and logical
combination, and database type.
1. Subject. School Buses AND Discipline Problems. Research.
2. Readers' advisory. Bornstein AND Gender Outlaw. Research.
3. Subject. (Eating Disorders OR Substance Abuse) AND Divorce AND Adolescents. Research.
4. Subject. Conservation Education AND Adults AND Behavior Change. Research.
5. Known item. Intersectionality AND Sex AND Race AND Hispanic Origin AND STEM AND Workforce. Research.
6. Subject. Speakers Before AND Martin Luther King Jr. AND I Have a Dream. Reference.
7. Subject. Wandering Mind AND Unhappiness. Research.
8. Reference. Forty-niners OR Fortyniners OR 49ers. Reference.
6
Controlled Vocabulary
Four steps of the online searching process are now behind you. You've conducted the reference interview, selected a database, typecast the negotiated query as a subject or known-item search, and created a facet analysis and logical combination. Next, you'll represent the search as an input to the search system. This chapter focuses on subject searches using controlled vocabulary, a technique for greater precision in search results. It begins with an overview of the basics and then delves into the details of searching databases and catalogs using their respective forms of controlled vocabulary. As seen in chapter 5, facet analysis generates terms that can be used as keywords when you search, either for information about a topic or to locate a known item. These facet names and keywords are in natural language used in everyday speech and writing. Keyword searches work well when the keywords have clear meanings in common usage, and when the keywords you think of match the words the author uses. But language is messy. Synonyms are part of the problem; you may think movies and the author may write films. Homonyms are another problem; a single word can have many meanings:
He rose from his seat as his sister handed him a rose.
Dinner was served on china from China.
It was a tie between the guy wearing a tie and me.
A controlled vocabulary is designed to eliminate such problems by designating a word or phrase as the preferred term for a concept. Knowing the preferred term, and using it correctly, will be more efficient because you won't have to use OR between every synonym you can think of to make sure you have sufficient recall of relevant items. No
need to use a long search statement such as glasses OR eyeglasses OR spectacles when the controlled vocabulary adopted for use in a database designates eyeglasses as the preferred term. An indexer will assign the authorized subject descriptor eyeglasses to an article about glasses that never uses the word eyeglasses, and thus it will be in the results when you search using that preferred term.
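In code terms, a controlled vocabulary acts like a lookup table of used-for references. Here is a toy Python sketch; the mapping contains only the eyeglasses example from the text and is not drawn from any published thesaurus.

# Toy used-for mapping: variant terms resolve to one preferred term.
USE_FOR = {
    "glasses": "eyeglasses",
    "spectacles": "eyeglasses",
    "eye glasses": "eyeglasses",
}

def preferred_term(user_term):
    """Return the authorized descriptor for a searcher's term."""
    term = user_term.lower().strip()
    return USE_FOR.get(term, term)  # preferred terms pass through unchanged

print(preferred_term("Spectacles"))  # eyeglasses
print(preferred_term("eyeglasses"))  # eyeglasses

One lookup stands in for the long OR statement because the vocabulary's designers have already merged the synonyms at indexing time.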
TYPES OF CONTROLLED VOCABULARY
There are different types of controlled vocabulary, and they vary in complexity. One type is a relatively simple list of the correct terms to use in a database. A good example is the NoveList readers' advisory database, which offers a list of terms to use when looking for books about subjects or with specific characteristics that the reader finds appealing. The "Subjects/Appeals" link on the NoveList search screen opens an alphabetical list of topics, names, character personalities, qualities of illustrations, and other elements. You can browse the list or use the search box to find a term in the list. For example, typing candy in the search box will take you to that word in the alphabetical list, and you'll see related words and phrases below it:
candy canes
candy industry and trade
candy stores
Candy, John, 1950–1994
Clicking on one takes you to a list of results containing all the relevant books. The database publisher has produced a forty-one-page booklet of the preferred vocabulary terms for elements such as pace, tone, and themes. Each term has a short definition and a note regarding whether the term applies to adult or juvenile fiction or nonfiction (NoveList 2019).
Such lists can become quite elaborate, giving preferred terms for topics in a syndetic structure that indicates a term's relationship to other terms. The Library of Congress Subject Headings (LCSH) list designates eyeglasses as the subject heading and indicates the hierarchy of related terms, such as the broader heading opticianry and the narrower heading eyeglass frames. Library catalogs use LCSH to indicate the aboutness of the books and other items in the catalog. It's common for licensed database publishers to use a different type of controlled vocabulary, the thesaurus of subject descriptors, to indicate the aboutness of articles and other items in their databases. In the field of information organization and description, the term thesaurus has a technical meaning quite different from the thesaurus that writers use to look up alternatives to a word to enliven their prose. In information science, a thesaurus organizes subject terms into a hierarchy of broader terms and narrower terms, revealing the structure of a subject area. There are also related-term references, designating associations between index terms that are closely related conceptually but not hierarchically. A thesaurus also includes used-for references, terms that are synonyms but whose wording is not the preferred wording for the topic. A subject thesaurus disambiguates terminology by providing authorized index terms for topics. These are the preferred terms added to the metadata describing an item in a database that indicate what the item is about. With the addition of the appropriate subject terms—or descriptors, as they are usually called in the context of commercial databases—retrieval can be more precise, with more relevant items in results sets. But who decides which term is "preferred" or "authorized"? The creation and maintenance of a thesaurus of preferred terms for a subject area involves the input and review of experts working in the subject field and librarians or other information professionals knowledgeable about the field and the organization of information. An example from the field of education illustrates how this works.
Subject Descriptors for Education
ERIC is one of the largest and most widely used indexes for educational literature, and one of the oldest electronic databases in existence. The index and thesaurus, in print form, launched in 1966, and the database went online in 1971 and on the internet in the early 1990s (“50 Years of ERIC” 2014). In addition to being offered on a variety of vendor platforms, including EBSCOhost and ProQuest, the ERIC database and the Thesaurus of ERIC Descriptors are now openly accessible on the web at https://eric.ed.gov/. The database indexes 1.7 million items, including a variety of document types such as journal articles, dissertations, and curricular material, with some full text. The thesaurus includes 4,552 descriptors in hierarchical arrangement and with scope notes explaining how a term is used. Both the database and the thesaurus are sponsored by the Institute of Education Sciences, an agency of the U.S. Department of Education. Thesaurus updates and revisions are handled by taxonomists and indexers, with suggestions and input from users. On the ERIC home page, the database is at one tab labeled “Collection” and the thesaurus is at a second tab next to it. It’s easy to start a search of the ERIC collection with some keywords, but it’s also easy to look up a word in the thesaurus and launch a search from there using the preferred subject descriptor. Figure 6.1 shows the results page for a search of the keyword aids in the thesaurus. Since the search is in the thesaurus, the results are subject descriptors, not surrogate records or sources indexed in the database. Of the five terms, at least one, autoinstructional aids, isn’t an obvious term to use if the user is doing a simple keyword search. To discover how the term is defined and how it is related to other terms, you can click on it to read the scope note and see where it fits in the hierarchy of broader and narrower educational concepts and terminology, as shown in figure 6.2. Also helpful is the “Use this term instead of” list; these are the terms rejected by the thesaurus’s experts, who prefer the term autoinstructional aids for the concept described in the scope note. After reading the scope note indicating the definition of the descriptor, the user may want to consider some of the related terms instead to get at a different aspect of the topic.
Figure 6.1 ERIC thesaurus results for the term aids. Source: ERIC, https://eric.ed.gov/
Figure 6.2 ERIC thesaurus entry for the subject descriptor autoinstructional aids. Source: ERIC, https://eric.ed.gov/
From the thesaurus page, you can click on the “Search collection using this descriptor” link to launch a search without having to manually input it in the search box at the top of the screen. The link takes the user to the first page of search results, shown in figure 6.3, with publication information, a truncated abstract, and a list of a few of the subject descriptors assigned to each item. All 1,534 results have been tagged with the descriptor autoinstructional aids, but they also have additional descriptors assigned to them based on the indexer’s judgment
regarding what the item is about. Aboutness involves the indexer’s judgment regarding which topics the article discusses in enough depth to warrant subject descriptors for those topics. For example, the first result listed, “Online Chemistry Crossword Puzzles Prior to and during COVID-19,” has fifteen subject descriptors. Searching any one of them would retrieve this surrogate record from the ERIC collection. Notice that the author of the article refers to revision aids in the abstract, and the term autoinstructional aids appears nowhere else but in the list of descriptors assigned to this article. Having checked the thesaurus, we know that revision aids is not a subject descriptor. And we also know that a keyword search in the database for aids would retrieve results about all kinds of teaching aids as well as about the disease AIDS in educational contexts. Using the preferred descriptor yields greater precision in the search results.
Figure 6.3 First three results of a subject search for autoinstructional aids. Source: ERIC, https://eric.ed.gov/
DATABASE RECORD FIELDS
One thing to note in figure 6.3 is the search box. The system has automatically filled it with its language for the search we conducted from the descriptor page in the thesaurus. The system's search statement, descriptor:"Autoinstructional Aids," indicates that the search took place only in the subject field of the surrogate record. It is the combination of a subject descriptor and a fielded search specifying the subject field that makes subject searching so much more precise than keyword searching, and not only because natural-language keywords can be ambiguous. Default search boxes may be programmed so that the system searches for keywords in several different fields of the surrogate and throughout the full text of a source. Changing the default to a subject search makes a difference. Just as thesaurus has a technical definition in the realm of information science, the term subject search means something specific. In an information seeker's common parlance, searching for a subject means looking for some information about a topic that they have to write about for a class assignment or that they are researching for their own purposes. It doesn't matter how they search; they just want to find the information. In contrast, the intermediating search professional uses the term subject search to refer to the two elements that distinguish it from a common keyword search on a topic: using an authorized subject term and limiting the search to the subject term index. If you input a subject descriptor in the search box and leave it set to the default search—usually including title, abstract, and subject fields, but also possibly author names and organizational affiliations—the system will search all of the fields it is programmed to search by default. In effect, you are turning the descriptor into a keyword. The results set will include records with the word in the subject field, but it will also include records with that word in any of the default search fields even if the word isn't in the subject field. Using a subject descriptor without also limiting the search to the subject field index undermines the system's ability to provide the desired level of precision. On the EBSCOhost platform, the search screen offers a different way to do a subject search in ERIC, but it still involves limiting the search to the subject field index. The advanced search screen features multiple search boxes, between which are options for your choice of Boolean operator. Figure 6.4 shows the autoinstructional aids search on the
EBSCOhost advanced search screen. The drop-down menu to the right of each search box makes it possible to search a specific field index. The first search box could be a subject search for autoinstructional aids. The Boolean AND is the default operator between the search boxes. The second search box could be a default search for a keyword representing another facet of the user's query and limited to the abstract only (as just one possibility). The multiple search boxes, readily available Boolean operators, and field limiters make it easy to input a sophisticated search statement that yields precise results. Another thing to be aware of is the availability of the thesaurus at a link in the navigation bar above the search boxes. As with the free web version, the thesaurus presents subject terms as links that can be used to launch a search.
Records and Fields
Each record in a database represents an item indexed in the database. Records break up the different elements of an item's metadata into separate fields, such as author, title, or abstract. The record for the first item in the autoinstructional aids search, shown in figure 6.5, includes the field labels (in bold type) for each metadata element, as well as some additional information such as the publisher of the journal and the location of the article in the journal (which in this case is not available as a full text in ERIC). Fields perform a very useful function: they designate the meaning or semantics of fielded data. When search systems index the values in fields, they keep track of the fields from which the values were extracted. When a user enters terms into the search box and limits them to the title, for example, the system conducts a search of the title index, comparing the user-entered title words to values it has extracted from title fields of the records and placed in its title index. When there's a matching record, the system reports the match to the user in the set of search results. The surrogate records in a subject-specific database may include a field not included in other databases. For example, educational research often relies on standardized surveys and assessment instruments, so ERIC has a field, "SB Assessment and Survey Identifiers," that is indexed and therefore searchable. The field uses a simple controlled vocabulary
for consistent wording identifying the different instruments. For example, the Basic Reading Inventory, the Rosenberg Self Esteem Scale, and the Test of English as a Foreign Language are all terms that can be searched in the SB field, which, like other searchable field indexes, is listed in the drop-down menu to the right of the search box.
Figure 6.4 Advanced search screen for the ERIC database on the EBSCOhost platform. By permission of EBSCO Publishing, Inc. Without the structure of fields in surrogate and source records, the search system is unable to identify the meaning of surrogate-record or source-record data. About all the system can do is process all the data into one big index. Searches would be possible, but the results would lack precision. For example, a search for paris would retrieve sources published in or about the city of Paris, mentioning the movie Paris, Texas, or written by authors whose first, middle, or last name is Paris.
Figure 6.5 Surrogate record for an article indexed in the ERIC database. By permission of EBSCO Publishing, Inc.
How Field Indexing Works
Table 6.1 is a sample database bearing seven surrogate records that Online Searching uses to demonstrate the use of controlled vocabulary in this chapter. To fit this database into Online Searching, its records have been edited for conciseness. The abstracts are shorter than their actual counterparts in the ERIC database, and the fields are limited to title, author, publication, date, abstract, and subject descriptor (plus an identifier field in one record). To index table 6.1's seven surrogate records for subject searching using a controlled vocabulary, a search system's indexing program processes each record, extracting descriptors from the subject descriptor field and placing them in a huge index arranged alphabetically. Next to each descriptor is the number of the record and the abbreviated name of the field from which it was extracted. This is called an inverted index because indexed words and phrases are listed along with positional data that can be used to rebuild the original text. Table 6.2 displays the sample database's inverted index, which the system will search whenever a subject search is conducted in the database. (Some systems use the label subject and the abbreviated SU or DE for the descriptor field.)

Table 6.1. Sample Database Bearing Seven Surrogate Records

Record 1
Title: Squirrels—A Teaching Resource in Your Schoolyard
Author: LaHart, David E.
Publication: Nature Study; v44; n4; pp20–22
Date: 1991
Abstract: This lesson plan demonstrates how to use common animals in your backyard or school grounds to study basic ecological principles with students. An example study uses squirrels for observational study.
Descriptor: Ecology; Educational Resources; Environmental Education; Field Studies; Field Trips; Instructional Materials; Intermediate Grades; Learning Activities; Natural Resources; Observational Learning; Outdoor Education; Teaching Methods

Record 2
Title: Woodland Detection
Author: Fischer, Richard B.
Publication: Outdoor Communicator; v20; n1; pp2–7
Date: 1988–1989
Abstract: Presents tips on nature observation during a woodland hike in the Adirondacks. Discusses engraver beetles, Dutch elm disease, birds' nests, hornets' nests, caterpillar webs, deer and bear signs, woodpecker holes, red squirrels, porcupine and beaver signs, and galls.
Descriptor: Environmental Education; Naturalistic Observation; Outdoor Education; Wildlife
Identifier: Adirondack Mountains; Hiking; Nature; New York

Record 3
Title: Transforming Campus Life: Reflection on Spirituality and Religious Pluralism
Author: Miller, Vachel W. (editor); Ryan, Merle M. (editor)
Publication: New York; P. Lang
Date: 2001
Abstract: This collection explores the religious and spiritual dimensions of college life. It offers innovative approaches for positive change and addresses legal, organizational, and cultural issues involved in making campuses more hospitable to the human spirit.
Descriptor: College Students; Cultural Pluralism; Diversity (Student); Educational Environment; Higher Education; Religion; Religious Differences; Religious Education; Spirituality

Record 4
Title: How to Help Students Confront Life's "Big Questions"
Author: Walvoord, Barbara E.
Publication: Chronicle of Higher Education; v54; n49; pA22
Date: 2008
Abstract: Many college students are interested in spirituality and the "big questions" about life's meaning and values, but many professors seem not to know how to respond to that interest.
Descriptor: Critical Thinking; College Students; Higher Education; Religious Factors; College Faculty; Religion Studies; Religious Education; Beliefs

Record 5
Title: The Correlates of Spiritual Struggle during the College Years
Author: Bryant, Alyssa N.; Astin, Helen S.
Publication: Journal of Higher Education; v79; n1; pp1–27
Date: 2008
Abstract: This study explores factors associated with students' experiences of spiritual struggles during college. Data indicate that spiritual struggle is associated with experiences in college that challenge and disorient students, affecting psychological well-being negatively but increasing students' acceptance of individuals of different faith traditions.
Descriptor: Religion; Spiritual Development; College Students; Student Experience; Well Being; Psychological Patterns; Consciousness Raising; Religious Factors; Hypothesis Testing; Self Esteem

Record 6
Title: Religiousness, Spirituality, and Social Support: How Are They Related to Underage Drinking among College Students?
Author: Brown, Tamara L.; Salsman, John M.; Brechting, Emily H.; Carlson, Charles R.
Publication: Journal of Child & Adolescent Substance Abuse; v17; n2; pp15–39
Date: 2008
Abstract: This study's findings indicate that religiousness and spirituality are differentially associated with alcohol use and that only certain aspects of religiousness (intrinsic but not extrinsic) are related to lower levels of alcohol use.
Descriptor: College Students; Drinking; Religious Factors; Social Support Groups

Record 7
Title: The Greening of the World's Religions
Author: Tucker, Mary Evelyn; Grim, John
Publication: Chronicle of Higher Education; v53; n23; pB9
Date: 2007
Abstract: This article presents what religious leaders and local communities from different countries are doing regarding environmental issues.
Descriptor: Environmental Education; Religion; Conservation (Environment); Environmental Standards; Ethics; World Problems; International Cooperation

Source: Adapted from the ERIC Database.

Search systems impress their own searching protocols on searchers. Encoding search statements into a system's searching language, searchers are able to control how systems process their search statements. In most database search systems, prefacing your search statement with a field label such as DE or SU instructs the system to search for your entered words and phrases only in the specified field index. Since most systems index each word separately, you'll need to put quotation marks around subject descriptors that are phrases if you want the system to retrieve records tagged with an exact subject descriptor. Without the quotation marks, the system will retrieve all records that have even one of the words somewhere in the descriptors index. The following are nine search statements for retrieving records from the sample seven-record database in table 6.1. The sample database's search system features a controlled vocabulary searching language that
requires the entry of abbreviated field labels (DE in this case) to limit results to the controlled vocabulary field, quotation marks around controlled vocabulary terms to limit results to the exact phrase, and Boolean operators between the different facets. Keeping this searching language in mind, refer only to the sample database's inverted index (table 6.2) to determine which records these search statements retrieve (answers conclude this section):
1. de ("outdoor education")
2. de ("religion studies" OR "religious education")
3. de ("college students" AND "well being")
4. de ("world problems" AND "environmental education")
5. de ("environmental education" OR "outdoor education")
6. de ("religion" OR "spirituality")
7. de ("college studnets")
8. de ("self-esteem")
9. de ("critical thinking" AND "world problems")
Tables 6.1 and 6.2 prepare you to perform manually what search systems do automatically, and if you completed the exercise, then you experienced firsthand how controlled vocabulary indexing and searching work.

Table 6.2. Sample Database's Inverted Index for Controlled Vocabulary Searching

Controlled Vocabulary Term: Record Number and Field
adirondack mountains: 2 id
beliefs: 4 de
college faculty: 4 de
college students: 3 de; 4 de; 5 de; 6 de
consciousness raising: 5 de
conservation (environment): 7 de
critical thinking: 4 de
cultural pluralism: 3 de
diversity (student): 3 de
drinking: 6 de
ecology: 1 de
educational environment: 3 de
educational resources: 1 de
environmental education: 1 de; 2 de; 7 de
environmental standards: 7 de
ethics: 7 de
field studies: 1 de
field trips: 1 de
higher education: 3 de; 4 de
hiking: 2 id
hypothesis testing: 5 de
instructional materials: 1 de
intermediate grades: 1 de
international cooperation: 7 de
learning activities: 1 de
naturalistic observation: 2 de
natural resources: 1 de
nature: 2 id
new york: 2 id
observational learning: 1 de
outdoor education: 1 de; 2 de
psychological patterns: 5 de
religion: 3 de; 5 de; 7 de
religion studies: 4 de
religious differences: 3 de
religious education: 3 de; 4 de
religious factors: 4 de; 5 de; 6 de
self esteem: 5 de
social support groups: 6 de
spiritual development: 5 de
spirituality: 3 de
student experience: 5 de
teaching methods: 1 de
well being: 5 de
wildlife: 2 de
world problems: 7 de
Answers:
1. Records 1 and 2
2. Records 3 and 4
3. Record 5
4. Record 7
5. Records 1, 2, and 7
6. Records 3, 5, and 7
7. No records because students is misspelled
8. No records because there is no hyphen in this term in the inverted index
9. No records because no record bears both subject terms
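The exercise can also be expressed in a few lines of Python. The sketch below builds an inverted index from descriptor fields and answers DE searches against it; to keep it short, it includes only records 1, 2, and 7 from table 6.1, with abbreviated descriptor lists, so it is a model of the process rather than a working search system.

# Build an inverted index from descriptor fields, then run DE searches.
records = {
    1: ["Ecology", "Environmental Education", "Outdoor Education"],
    2: ["Environmental Education", "Outdoor Education", "Wildlife"],
    7: ["Environmental Education", "Religion", "World Problems"],
}

inverted = {}  # descriptor phrase -> set of record numbers
for rec_id, descriptors in records.items():
    for term in descriptors:
        inverted.setdefault(term.lower(), set()).add(rec_id)

def de(term):
    """DE search: exact match against the descriptor index."""
    return inverted.get(term.lower(), set())

print(sorted(de("outdoor education")))                               # [1, 2]
print(sorted(de("world problems") & de("environmental education")))  # [7]
print(sorted(de("outdor education")))                                # [] (misspelled)

As in exercise statement 7, a misspelled term simply finds nothing, because matching happens against the index entries, not against the searcher's intent.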
CONDUCTING SUBJECT SEARCHES
Having provided an overview of the basics, this section of chapter 6 delves into detailed subject searches in a research database, and the section after this discusses subject searches in library catalogs. If you have access to the databases discussed in this section through your university library, you may want to practice each step to help solidify your learning. The negotiated query we're using is "the use of humor to treat people who are depressed." The facet analysis for this query results in the three facets Humor, Depression, and Treatment. The logical combination is: Humor AND Depression AND Treatment.
Database Selection
The facet Depression suggests that a psychology database is a good choice for this query, and one of the largest and most widely used is APA PsycInfo. Published by the American Psychological Association (APA), the database began as an abstracting and indexing (A&I) service offering surrogate records with citations, subject descriptors (also called index terms), and abstracts. Its more than five million A&I records describe the scholarly literature of psychology published in journal articles, books, and dissertations back to the field's sixteenth-century origins (American Psychological Association 2022a). The PsycInfo controlled vocabulary for subject terms is the Thesaurus of Psychological Index Terms (American Psychological Association 2022b), which lists authorized terms in a syndetic structure. The APA has licensed their databases to EBSCO, ProQuest, and Ovid. The following examples are from the EBSCOhost platform, so the screens included here will look different than other aggregators' screens, but the database and the thesaurus are the same.
Browsing the Online Thesaurus
Browse PsycInfo's online thesaurus, choosing subject index terms for each facet. Start with the Depression facet. On the advanced search screen, click on the "Thesaurus" link. EBSCOhost's PsycInfo allows you to find authorized subject descriptors in three ways: (1) browsing for subject terms beginning with the entered term depression, (2) browsing for subject terms containing the entered term depression, or (3) browsing for subject terms based on relevance ranking. For now, choose "Term Contains," enter depression into the search box, and click on the "Browse" button (figure 6.6). PsycInfo responds with a list of authorized subject terms containing your entered term depression. No PsycInfo index term is an exact match of your entered term. The closest match is "Major Depression." Select the index term "Major Depression," and the EBSCOhost search system displays these elements in the index term's authority record (figure 6.7):
The index term "Major Depression"
The date this index term was added to the PsycInfo thesaurus
The index term's scope note giving its definition and proper usage
The index term's syndetic structure in the form of broader terms, narrower terms, and related terms
The synonyms that the subject index term is used for (used for)
Examine the authority record, first reading the index term's scope note to make sure this term is in sync with your interests. This scope note defines "Major Depression" and suggests the index term "Depression (Emotion)" for PsycInfo content on people with nonclinical depression. "Major Depression" has one broader and several narrower and related terms. Almost all listed narrower terms are satisfactory for representing the Depression facet. You can check the boxes to the left of these subject terms to add them to the search. If you deem all narrower terms relevant, check the box to the right of the index term in the "Explode" column. After finishing up here, you can check the authority record for "Depression (Emotion)" for additional subject terms.
Figure 6.6 Browsing for all terms containing the word depression in the APA Thesaurus of Psychological Index Terms. By permission of EBSCO Publishing, Inc. Always be circumspect about checking “Explode” and “Major Concept” boxes in authority records. “Explode” automatically selects all
listed narrower terms, so make sure this is really what you want because the increased recall of records may include many nonrelevant ones. Checking "Major Concept" limits results to items whose main theme is the topic the descriptor describes, excluding items that merely discuss that topic along with others. Use it sparingly because its keen precision may eliminate items that would be of interest to the information seeker. Next, click on "Depression (Emotion)," scan its syndetic structure for relevant index terms, and check their selection boxes (figure 6.8). When you have finished selecting subject terms from the "Major Depression" and "Depression (Emotion)" entries, click the "Add" button, and leave the default Boolean operator set to OR. In response, EBSCOhost encodes your thesaurus selections in its controlled vocabulary searching language and places a search statement in the search box (figure 6.9). Your next step is to click on the "Search" button. EBSCOhost places PsycInfo results into set 1, reports the number of results, and displays the first twenty. Find this query's remaining two facets, Humor and Treatment, the same way, entering terms for each facet into the Thesaurus of Psychological Index Terms browse box and choosing relevant index terms from authority records. Ultimately, the expert searcher's objective is to gather index terms for each query's facet using the online thesaurus, save the results for each facet in sets (e.g., Depression results in set 1, Humor results in set 2, and Treatment results in set 3), and then combine these three sets using the Boolean AND operator.
Figure 6.7 Authority record for the subject descriptor “Major Depression” in the APA Thesaurus of Psychological Index Terms, with additional terms selected. By permission of EBSCO Publishing, Inc. Once you have created three separate sets of results for this query’s three facets, click on the “Search History” link under the search box to see your sets. Combine results using the Boolean AND operator by checking the box for each set and clicking the “Search with AND” button (figure 6.10). In response, EBSCOhost combines the three separate sets 1, 2, and 3 in a Boolean AND operation, retrieving surrogate records bearing at least one index term per facet. You can also combine sets directly, entering set numbers and Boolean AND operators into the
search box:
s1 AND s2 AND s3
The s preceding each number stands for set. If you omit it, EBSCOhost searches for the numbers 1, 2, and 3.
Rule of Specific Entry
When you check a database's thesaurus for index terms, you may find several relevant ones to represent one or more of the query's facets. That there are several relevant subject terms per facet has to do with the rule of specific entry (also called specificity). This rule governs indexers' assignment of subject terms to surrogate records, requiring them to assign the most specific index term to the surrogate
Figure 6.8 Authority record for the subject descriptor “Depression (Emotion)” in the APA thesaurus with the related term “Sadness” selected. By permission of EBSCO Publishing, Inc.
Figure 6.9 EBSCOhost’s searching language for selected APA thesaurus terms. By permission of EBSCO Publishing, Inc.
Figure 6.10 The "Search History" feature in PsycInfo. By permission of EBSCO Publishing, Inc.
that describes the subject content of the actual source, not a broader index term that encompasses the specific term. Here are three examples of the rule of specific entry in action:
If the source is about the use of humor to treat patients diagnosed with recurrent depression in client-centered therapy, the PsycInfo indexer will assign the index term "Recurrent Depression," not the broader term "Major Depression."
If the source is about the use of humor to treat clinically depressed patients in client-centered therapy, the PsycInfo indexer will assign the index term "Client Centered Therapy," not the broader term "Psychotherapy."
If the source is about the use of jokes to treat clinically depressed patients in client-centered therapy, the PsycInfo indexer will assign the index term "Jokes," not the broader term "Humor."
Expert searchers have to accommodate the rule of specific entry in their controlled vocabulary searches, choosing the most specific subject terms to represent each facet. Strive to be comprehensive, selecting as many relevant subject terms as there are for each facet, and avoid going too far afield. So far, the focus has been on well-defined negotiated queries. Databases and search systems are able to help users who have unfocused queries as well, situating them in a search space that addresses their interests but placing the burden on the database to suggest fruitful avenues that they might explore within the search space. One approach presents categories of controlled vocabulary terms that can be used to filter results. The Depression query serves as an example. A subject descriptor search for this query yields almost 153,000 results in EBSCOhost's PsycInfo database (figure 6.11). On the left side of the results screen, EBSCOhost invites the searcher to "refine results" with filters organized in categories of elements present in the results. The filters useful for limiting results by subject include the following:
Subject: Major Heading (various PsycInfo index terms, such as "Self Esteem," "Social Support," and "Dementia" that represent the main focus of the article described in the surrogate record)
Subject (various PsycInfo index terms, such as "Drug Therapy," "Risk Factors," "Epidemiology," and "Mothers")
Classification (various PsycInfo classification captions, such as "Clinical Psychological Testing," "Military Psychology," and "Childrearing & Child Care")
Tests & Measures (names of specific tests and measurements, such as "Hamilton Rating Scale for Depression," "Beck Depression Inventory," and "Mini Mental State Examination")
Figure 6.11 Filters on the PsycInfo results page. By permission of EBSCO Publishing, Inc. Clicking on the arrow adjacent to each facet opens the category, revealing more details. For example, open in figure 6.12 is the “Subject: Major Heading” cluster, where specific major subjects assigned to items in the results set are listed in rank order from greatest to least number of results. The highest-posted cluster values aren’t very useful because they are PsycInfo index terms you searched for in the earlier examples. Clicking on the “Show More” link opens a pop-up window presenting more subjects than can be listed along the left side of the page. Scroll down to the medium-posted subjects, and you will see several major subjects that are likely to pique the interests of users researching depression. Examples are “posttraumatic stress disorder,” “aging,” and “quality of life” (figure 6.13). Selecting one of these subjects reduces this search’s results to several hundred and keeps the focus at least a bit more on the aspects of the topic of most interest.
Figure 6.12 Filters for two subject categories in the PsycInfo database. By permission of EBSCO Publishing, Inc.
Figure 6.13 Filter for "Subject: Major Heading" with descriptors ranked by number of results in PsycInfo. By permission of EBSCO Publishing, Inc.
Facets and Filters
On occasion, there are so many relevant subject descriptors for a facet that entering them directly would not only be time-consuming, but would also require much effort, patience, and attention to detail to collect the subject descriptors, spell them correctly, and type them into the advanced-search interface boxes. Search systems offer a shortcut, allowing users to search for the words that occur repeatedly in index
terms. Take your sample query's Treatment facet as an example. The scope note under "Treatment" advises searchers to use more specific index terms. Checking several other index terms' authority records reveals that the number of relevant descriptors under "Treatment," "Psychotherapy," "Therapy," and "Counseling" may surpass six dozen or so. Keeping in mind the rule of specific entry, the searcher might want to include all of them for greater recall of results. Typing them into the system's advanced-search interface or placing check marks in the boxes accompanying these terms in authority records could take a long time. You also know that the controlled vocabulary search for this topic retrieves few results. Populating your controlled vocabulary search with all descriptors bearing the words "treatment," "psychotherapy," "therapy," and "counseling" may increase the final result set without sacrificing precision because the search is still based on subject terms. Here are your next steps. Enter the four relevant index terms into the advanced-search box, nest these terms inside parentheses, and preface them with the SU field label. Here is a search statement for the Treatment facet: SU (treatment OR psychotherapy OR therapy OR counseling). The searcher can either preface nested index terms with the SU field label or select "SU Subjects" from the "Select a Field" drop-down menu (figure 6.14). The EBSCOhost search system responds by retrieving surrogate records bearing these words in subject index terms and author-supplied keyword fields. Putting quotation marks around each word limits retrieval to the word in quotation marks, and does not include its plural, singular, and possessive forms. Omitting quotation marks retrieves the word and its plural, singular, and possessive forms, in effect doing an OR search for variant spellings. Substituting a search statement that retrieves index term words for the Treatment facet more than doubles the results for the humor query. Examples of additional relevant titles are:
"Effects of Laughter Therapy on Depression, Cognition, and Sleep among the Community-Dwelling Elderly"
"Laughter Therapy in a Geriatric Patient with Treatment-Resistant Depression and Comorbid Tardive Dyskinesia"
"Effects of Laughter Therapy on Immune Responses in Postpartum Women"
You can preview the subject index term words that the search system retrieves by browsing the thesaurus, choosing the “Term contains” option, and entering the index term word you intend to search. If you think that the majority of listed subject terms are relevant, then use the shortcut. If not, select terms from the display of authority records, or search for index terms directly.
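The difference between the exact-descriptor search and the descriptor-word shortcut is easy to see in a sketch. The descriptor list below is a small invented sample, not an excerpt from the APA thesaurus, and the matching logic is a simplification of what the search system does.

# Exact descriptor match versus a search for words inside descriptors.
descriptors = [
    "Client Centered Therapy",
    "Cognitive Behavior Therapy",
    "Group Counseling",
    "Psychotherapy",
    "Treatment Outcomes",
]

def exact(term):
    """Match a descriptor phrase exactly."""
    return [d for d in descriptors if d.lower() == term.lower()]

def word_search(words):
    """Match any descriptor containing any of the words as a whole word."""
    wanted = {w.lower() for w in words}
    return [d for d in descriptors if wanted & set(d.lower().split())]

print(exact("Psychotherapy"))  # ['Psychotherapy']
print(word_search(["therapy", "counseling", "treatment"]))
# ['Client Centered Therapy', 'Cognitive Behavior Therapy',
#  'Group Counseling', 'Treatment Outcomes']

Notice that the word therapy does not pick up the one-word descriptor Psychotherapy, which is one reason the search statement above spells out all four words.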
Figure 6.14 Subject search using field selection menu. By permission of EBSCO Publishing, Inc.
SEARCHING WITH SUBJECT HEADINGS
For more than a century, library catalogs have used Library of Congress Subject Headings (LCSH) as the controlled vocabulary for books, serials, and other "whole" items (i.e., periodicals but not the articles in the periodicals). LCSH works well in a classic library catalog, the OPAC. However, it can be difficult to get good results when using a library's everything discovery system, since it searches not only the catalog but also databases using different controlled vocabularies. In this section, the focus is on using the classic library catalog, but not just the one at your local college, public, or school library. LCSH is also used in WorldCat, the database of OCLC member libraries' holdings; in the
Catalog of United States Government Publications, maintained by the U.S. Government Publishing Office; and in the Library of Congress catalog containing more than fourteen million records. The National Library of Medicine’s PubMed, an A&I database with links to the full text of free open-access articles, uses the more specific Medical Subject Headings (MeSH), designed to reflect the nomenclature of the health sciences. Figure 6.15 shows the MeSH authority record for monkeypox, including a scope note.
Figure 6.15 MeSH authority record for Monkeypox. Source: National Library of Medicine, https://www.ncbi.nlm.nih.gov/mesh
Clicking on the MeSH "Tree Structures" tab on the authority record reveals where the concept fits in the hierarchy of headings for infections.
Infections [C01]
  Virus Diseases [C01.925]
    DNA Virus Infections [C01.925.256]
      Poxviridae Infections [C01.925.256.743]
        Cowpox [C01.925.256.743.175]
        Ecthyma, Contagious [C01.925.256.743.193]
        Ectromelia, Infectious [C01.925.256.743.239]
        Fowlpox [C01.925.256.743.366]
        Lumpy Skin Disease [C01.925.256.743.494]
        Molluscum Contagiosum [C01.925.256.743.611]
        Monkeypox [C01.925.256.743.615]
        Myxomatosis, Infectious [C01.925.256.743.665]
        Smallpox [C01.925.256.743.826]
        Vaccinia [C01.925.256.743.929]
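The bracketed tree numbers are what make this hierarchy machine-actionable: a narrower heading's number extends its broader heading's number, so broader- and narrower-term lookups reduce to prefix matching. Here is a minimal Python sketch using only the headings shown above; it models the principle and is not the National Library of Medicine's software.

# MeSH tree numbers encode hierarchy: descendants share a prefix.
tree = {
    "C01": "Infections",
    "C01.925": "Virus Diseases",
    "C01.925.256": "DNA Virus Infections",
    "C01.925.256.743": "Poxviridae Infections",
    "C01.925.256.743.615": "Monkeypox",
    "C01.925.256.743.826": "Smallpox",
}

def narrower(number):
    """Headings at or below a tree number (prefix match)."""
    return [h for n, h in tree.items()
            if n == number or n.startswith(number + ".")]

def broader(number):
    """Broader headings, found by truncating the tree number."""
    parts = number.split(".")
    return [tree[".".join(parts[:i])] for i in range(1, len(parts))]

print(narrower("C01.925.256.743"))
# ['Poxviridae Infections', 'Monkeypox', 'Smallpox']
print(broader("C01.925.256.743.615"))
# ['Infections', 'Virus Diseases', 'DNA Virus Infections', 'Poxviridae Infections']

An "explode"-style search, retrieving a heading and everything beneath it, is just the narrower() lookup applied to each record's tree numbers.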
LCSH expresses some subjects in main headings (e.g., “Women,” “Women adventurers,” and “Women in computer science”) and subdivided headings expressing additional subjects, geographic locations, genres, or other elements (e.g., “Women—Korea (South)— Economic conditions—Regional disparities” and “Women adventurers— United States—Biography—Juvenile literature”). Except for one-word main headings, most combined main and subdivided headings are examples of precoordination—the combining of individual concepts into complex subjects before conducting a search for them. For example, implicit in the LCSH concept “Women in computer science—Developing countries” are three facets: Women, Computer Science, and Developing Countries. Surrogate records assigned this heading are likely to be relevant in searches with these facet analyses and logical combinations: Women AND Computer Science, Women AND Developing Countries, and Women AND Computer Science AND Developing Countries. A classic-catalog subject search using the heading and limiting the search to the subject field will retrieve records for all the items tagged with that subject heading. On the full record for each item (usually accessed by clicking on the item’s title in the list of results) are the subject headings assigned to the book (or other item). In many catalogs, each heading is a link you can click on to see all the items in the catalog under that heading. To browse the complete LCSH, you have three choices. You can download files in sections by letter of the alphabet at https://www.loc.gov/aba/publications/FreeLCSH/freelcsh.html. Although downloading pdfs provides static pages, looking at all the headings in alphabetical order provides a clear view of the syndetic structure of the list. The 842-page pdf for headings beginning with C can be downloaded and then searched or scrolled. You can see at a glance that information about many straightforward headings is minimal. For main headings with many possible uses, you will find a scope note. For example, on page C601, in the downloaded pdf, the entry for the heading Computer art
includes a scope note and used for, broader terms, and narrower terms references (figure 6.16).
Figure 6.16 Selection of Library of Congress Subject Headings beginning with the letter C. Source: Library of Congress Subject Headings, https://www.loc.gov/aba/publications/ArchivedLCSH41/freelcsh.html Another way to access LCSH is to start at the Library of Congress Linked Data Service page, https://id.loc.gov/. Starting here gives you a glimpse of the prodigious amount of effort put into the creation and maintenance of headings lists, thesauri, and ontologies, all intended to disambiguate terminology and provide structure for items in databases and catalogs and on the web. Searching LCSH from this website will yield not only the customary information such as scope note (if there is one) and syndetic structure, but also contextual information indicating the relationship to various open-data projects. For most purposes, the most convenient way to consult LCSH is by beginning at the Library of Congress Catalog website (https://catalog.loc.gov/). Scrolling down on the home page leads to a link to the LC Authorities page at https://authorities.loc.gov/, where you
can limit your search to “Subject Authority Headings,” and enter a word or phrase to find the LCSH term for your topic. The Library of Congress Catalog page also offers a browse tab where you can search for words and specify “SUBJECTS beginning with” or “SUBJECTS containing.” The result will be an alphabetical list of headings beginning with or containing your term, with narrower terms listed for authorized headings and with clickable “see” references to the authorized heading for the unauthorized ones shown in the list. The convenience of accessing LCSH this way is that clicking on an authorized subject heading takes you to the list of items in the catalog tagged with that heading. Also of interest is Library of Congress Subject Headings Supplemental Vocabularies: Children’s Headings (LCSHAC) at https://id.loc.gov/authorities/childrensSubjects.html. It can be consulted when LCSH terminology is too complex to use in a catalog that includes juvenile collections. For instance, in a university library catalog the LCSH term is Lunar petrology while in a public or school library catalog the LCSHAC term is Moon rocks. The sixteen-page pdf of children’s subject headings is linked at https://www.loc.gov/aba/publications/FreeCYAC/freecyac.html. As with thesauri in databases, you can use LCSH from within a catalog to look up headings and click on links to find all the records tagged with those headings, as in the LC catalog, but this is not always the case in local library catalogs. If the headings list isn’t browsable in your local catalog, you can 1) visit the LC Authorities website to find the correct headings, or 2) do a keyword search in your local library catalog, scroll the results to find a relevant item, look at its full surrogate record, choose one of the subject headings listed on the record, and click on your chosen heading to see all the items in the catalog tagged with that heading.
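Because a precoordinated string packs several facets into one heading, a catalog can match it against many facet combinations by indexing the heading word by word. Here is a minimal sketch using the "Women in computer science—Developing countries" heading discussed earlier in this section; the matching function is a simplification invented for illustration.

# A precoordinated heading matched against facet keyword combinations.
heading = "Women in computer science—Developing countries"
heading_words = set(heading.lower().replace("—", " ").split())

def matches(facets):
    """True if every word of every facet appears in the heading."""
    return all(set(f.lower().split()) <= heading_words for f in facets)

print(matches(["Women", "Computer Science"]))                          # True
print(matches(["Women", "Developing Countries"]))                      # True
print(matches(["Women", "Computer Science", "Developing Countries"]))  # True
print(matches(["Men", "Computer Science"]))                            # False

One heading thus answers all three of the facet combinations listed earlier, which is the retrieval payoff of precoordination.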
ADDITIONAL TIPS FOR CONTROLLED VOCABULARY SEARCHING
Author-Supplied Keywords
PsycInfo and many other research databases include an author-supplied keyword field in their surrogate records. In this field are words and phrases that scholarly journal editors ask authors to add to their manuscripts when they submit them to the journal for review. Authors don't choose them from a controlled vocabulary. Instead, they list keywords based on their articles' contents and their knowledge of their field's jargon. In some databases, the author-supplied keywords are included in the subject index to increase recall in subject searches.
Dynamic Term Suggestions
Some search systems respond with dynamic term suggestions while you enter your terms into the database's online thesaurus or enter your search statements into the system's search box. These can be useful to end users who are having trouble identifying good keywords or topics for a homework assignment. For the search expert, however, it's probably best to ignore them and go about your more methodical strategy of identifying and using authorized subject terms.
Multiple Databases
Occasionally, you'll search multiple databases on a single aggregator's platform. When you do, you'll have to be deliberate about selecting index terms from each database's thesaurus. As we have seen with the ERIC and APA thesauri, different scholarly disciplines use different vocabularies in subject-specific databases.
Narrower and Broader Terms
Don’t be fooled into thinking that narrower terms have fewer postings than broader terms. This is not true. For example, the index term “Major Depression” produces a whopping 110,000 results, in comparison to only 13,000 for its broader term, “Affective Disorders.” Populating the Search Box Automatically If you know all the subject descriptors to use for the facet Depression, you can use the field label DE (for descriptor) to specify searching only in the subject field index, parentheses to tell the system that everything inside the parentheses should be searched in the subject field index, and quotation marks to tell the system to search for phrases. Here’s what you’d need to type in the search box: DE (“anaclitic depression” OR “dysthymic disorder” OR “endogenous depression” OR “late life depression” OR “postpartum depression” OR “reactive depression” OR “recurrent depression” OR “treatment resistant depression” OR “depression (emotion)” OR “major depression” OR “sadness”) Or you could just check the boxes for the relevant index terms in the thesaurus and let it populate the search box. Choosing descriptors from a database’s online thesaurus saves time and effort. Quick and Dirty That said, sometimes the information seeker at your elbow doesn’t have the time or patience to work through a subject headings list or thesaurus or watch you do it. You can go the quick-and-dirty route without compromising your commitment to controlled vocabulary. Do a keyword search representing the query facets, perhaps limiting them to the title or abstract field in databases that index full texts, then skim the results with the information seeker, find one they like, and use its subject headings or descriptors in the next search. Good enough, under the circumstances.
Search History As you become adept at using databases and their search languages, it may be tempting to input the entire search statement at once. That can be perilous, however. Search systems are programmed to process Boolean operators in a certain order. EBSCO databases process the AND first, then the OR, while WorldCat processes operators from left to right. If your search statement includes both operators, you’ll need to be careful to enter them correctly, perhaps using parentheses around terms with AND or OR between them to tell the system to process that operation first. Entering search statements for facets one at a time, however, is easier for you and less confusing for the user you are assisting. It also gives you more flexibility since the system will create separate results sets for each facet, which you can then combine using the Boolean AND operator, as we did in the query about the use of humor to treat depression. As the user provides feedback about results and the search develops, you can also use those sets in different combinations or decide not to use one of them at all if the query takes a turn in another direction. Featured in textbox 6.1 are the changes searchers make to ongoing searches. If you find yourself repeatedly making one or more of these changes, you might want to use the automatic-entry and separate-sets methods.
TEXTBOX 6.1. Changes Searchers Make to Ongoing Searches 1. Correcting misspelled search terms 2. Fixing logic errors due to missing parentheses or incorrect Boolean operators 3. Limiting search terms to specific fields 4. Adjusting proximity operators 5. Adding related search terms that the searcher identifies in relevant results 6. Omitting terms from search statements that have an adverse effect on results
7. Eliminating search statements combination to increase results
from
the
final
logical
QUESTIONS Conduct online searches for one or more of these negotiated queries. Choose search terms from the database’s controlled vocabulary, selecting them from the database’s online thesaurus or list of subject headings or entering them directly using the search system’s controlled vocabulary searching language or drop-down menu. If you don’t have access through your local library, check your state library’s website for the databases. You can also use the web-based ERIC for queries since it will likely retrieve some records. Suggested controlled vocabulary search formulations for these queries conclude this chapter. 1. For the ERIC database: I’m a new school bus driver, and I want to know more about handling discipline problems on school buses. 2. For the PsycInfo (or ERIC) database: Are teens whose parents divorce likely to develop eating disorders? 3. For the ERIC database: When adult learners engage in conservation education, does their behavior change? 4. For the PubMed database: New technology devices to help ALS patients communicate with family, friends, and caregivers. 5. For the NoveList database: Manga about dating, for teenage readers.
SUMMARY
Controlled vocabulary is a tool for increasing the precision of your searches. By designating preferred terms for topics, controlled vocabularies, at their simplest, eliminate the ambiguities and inconsistencies of natural language. More sophisticated ones incorporate a syndetic structure indicating hierarchical and lateral relationships among terms. They may take the form of subject heading lists used in library catalogs or of thesauri used in research databases. In the subject fields of surrogate records are controlled vocabulary terms representing the topics the sources indexed in the database are about, not just topics that they mention in passing. The database system indexes the subject fields, making the terms searchable. When you search using an authorized subject term and limit the search to the subject field index, you are building precision into the set of results the system will retrieve. Working from a database’s online thesaurus, you are able to select index terms and the system will populate the search box with your selected terms and automatically limit the search to the subject field index. The database also numbers each set of results so you can reuse them. By working with each facet separately, you can use the numbered sets to combine the different facet results with the Boolean AND operator. Searchers can also enter phrases (designated by quotation marks) or words in controlled vocabulary terms directly into a system’s search box, by using either the system’s drop-down menu next to the search box to limit the search to the subject index or by including the system’s searching language in the search box, such as subject: autoinstructional aids. Controlled vocabulary searching can help users with unfocused queries. Search for controlled vocabulary terms that satisfy one or two facets the information seeker is sure about, then invite them to filter results using one or more of the terms available on the database’s results screen. Facet analysis and logical combination are implicit in controlled vocabulary searching of library catalogs because of the precoordination that is built into Library of Congress Subject Headings. All you have to do is identify a Library of Congress subject heading that covers all of the facets in the user’s query, conduct a subject search using it, and assess the relevance of the library materials to which the LCSH is assigned. For unfocused queries, start by searching a main heading, then work with
the user to assess results and find a more focused heading to use in a subsequent search. When your chosen database has a thesaurus, use it. The publisher has gone to great lengths to develop and maintain the thesaurus: establishing each authorized term after deciding which of several synonyms should be the authorized term for a concept; linking term variants and synonyms to the authorized term; characterizing relationships between authorized terms as broader, narrower, or related; and retiring outdated terms and adding new ones when necessary. Also, the human indexers who apply the thesaurus have gone to great lengths to identify what books, articles, and other publications are about and to add the correct subject headings or terms to their surrogate records. When your chosen database doesn’t have a thesaurus, you may miss the tools it offers for increasing the precision of results. In your work as an information intermediator, you can’t rely entirely on controlled vocabulary searching. Not every database uses a controlled vocabulary, and not every topic has an authorized subject term. In such cases, you’ll use the kinds of search techniques described in chapters 7 and 8.
REFERENCES “50 Years of ERIC, 1964-2014.” 2014. https://eric.ed.gov/pdf/ERIC_Retrospective.pdf. American Psychological Association. 2022a. “APA PsycInfo®.” Accessed November 30, 2022. http://www.apa.org/pubs/databases/psycinfo/fact-sheet.pdf. American Psychological Association. 2022b. “Thesaurus of Psychological Index Terms®.” Accessed January 23, 2022. https://www.apa.org/pubs/databases/training/thesaurus. NoveList. 2019. The Secret Language of Books: A Guide to Story Elements. Accessed November 30, 2022. https://www.ebscohost.com/promoMaterials/NoveList-Guide-toStory-Elements.pdf.
SUGGESTED TRAINING MATERIALS Database publishers offer training specific to the products on their websites. Some also have YouTube channels. A few are listed here. You can browse them for videos about subject searching using thesauri, and for other topics you’ll learn about as we move through the following chapters. APA Publishing Training, https://www.youtube.com/c/APAPublishingTraining/featured. For example, view APA PsycInfo Advanced Search, https://www.youtube.com/watch?v=OWXK7jwAAEc. EBSCO Tutorials, https://www.youtube.com/c/EBSCOSupportTutorials. For example, view Browsing Subject Terms in EBSCOhost Databases —Tutorial, https://www.youtube.com/watch?v=7BfukkrDoQc&t=38s. National Library of Medicine, https://www.youtube.com/user/NLMNIH. For example, Use MeSH to Build a Better PubMed Query, https://www.youtube.com/watch?v=uyF8uQY9wys&t=5s. NoveList, https://www.youtube.com/channel/UCUAwZIC_tfHsSMoIfys1Y9Q. For example, use NoveList’s Story Elements, https://www.youtube.com/watch?v=nwq4oAh-wEo. OvidWoltersKluwer, https://www.youtube.com/user/OvidWoltersKluwer/featured. For example, view Ovid Term Finder Demo, https://www.youtube.com/watch?v=WN36_rrslnc. ProQuest® Training, https://www.youtube.com/user/proquesttraining/featured. For example, view ProQuest Thesaurus, https://www.youtube.com/watch?v=iz0wGUfgDIQ.
ANSWERS 1. I’m a new school bus driver, and I want to know more about hand discipline problems on school buses. ERIC
Facets School Buses Student Behavior To combine sets
Choosing Terms for Search Statements (in EBSCOhost) DE (“school buses” OR “busing” OR “bus transportation”) DE (“student behavior” OR “discipline” OR “discipline problems” OR “aggression” OR “bullying” OR “antisocial behavior” OR “sexual harassment”) Choose “Search History.” Checkmark set numbers for these two facets, and click on the “Search with AND” button.
2. Are teens whose parents divorce likely to develop eating disorde PsycInfo
Facets Eating Disorders
Choosing Terms for Search Statements (in EBSCOhost) Click the “Thesaurus” tab, and enter eating disorders. Checkmark the “Explode” box under “Eating Disorders,” forcing PsycInfo to select all narrower terms. Choose “Select term, then add to search using OR.” Divorce Click the “Thesaurus” tab, and enter divorce. Select relevant index terms, such as “Divorce,” “Child Custody,” “Divorced Persons,” and “Marital Separation.” Click on these index terms to browse and select their broader, narrower, and related terms, such as “Parental Absence,” “Mother Absence,” and “Father Absence.” Choose “Select term, then add to search using OR.” Adolescent Choose a relevant value from the “Age“ cluster to filter s results. To combine Choose “Search History.” Checkmark set numbers for sets the first two facets, and click on the “Search with AND”
button. Then choose “adolescence 13–17 yrs” from the “Age” cluster. 3. When adult learners engage in conservation education, does t behavior change? ERIC
Facets Conservatio n
Adult Education Behavior Change
To combine sets
Choosing Terms for Search Statements (in ProQuest) mainsubject.exact(“conservation education” OR “energy education” OR “environmental education” OR “energy conservation” OR “sustainable development” OR “recycling”) mainsubject.exact(“adult learning” OR “adult students” OR “adult programs” OR “adult education” OR “adult development” OR “continuing education”) mainsubject.exact(“behavior change” OR “behavior patterns” OR “behavior” OR “attitude change” OR “attitudes” OR “attitude measures” OR “community attitudes” OR “motivation”) Choose “Recent searches.” In the search box, enter set numbers for the three facets combined with the Boolean AND operator: si AND s2 AND s3.
4. New technology devices to help ALS patients communicate with fam friends, and caregivers. PubMed
Facets ALS New Technology Devices
To combine sets
Choosing Terms for Search Statements (in PubMed) “amyotrophic lateral sclerosis”[Mesh] Select these two unsubdivided MeSH: “Self-Help Devices”[Mesh:NoExp] OR “Communication Aids for Disabled”[Mesh]. Select also these two MeSH bearing these subdivisions: instrumentation, methods, standards, supply and distribution, therapeutic use, trends, utilization. Click “Advanced Search.” Into the “Builder,” enter set numbers for the two facets combined with the Boolean
AND operator: #1 AND #2. 5. Manga about dating, for teenagers. NoveList
Facets dating
Choosing Terms for Search Statements (in NoveList) On the advanced search screen, enter dating in the first search box and use the drop-down menu to select SU Subject. manga On the advanced search screen, enter manga in the second search box and use the drop-down menu to select GN Genre. To combine sets Leave the default AND between the two boxes. On and filter results the results page, check the box labeled “Teen” in the Audience cluster.
7
Free-Text Searching Free-text searching allows searchers to use any words and phrases in their search statements, not just those that come from a database’s controlled vocabulary. Most people aren’t familiar with controlled vocabulary, and most people’s first resort when seeking information is their favorite web search engine, where natural language is king. Consequently, we can think of free-text searching as the default, the first (and perhaps only) attempt at finding needed information. Once you’ve become adept at using controlled vocabulary for subject searches, it may seem odd to consider keyword or key phrase searching as a useful endeavor. Since not all databases use a thesaurus and since not all thesauri have a preferred term for all topics, free-text searching remains important. But it doesn’t have to be just a matter of throwing some keywords into a search box. Chapter 7 offers techniques to make free-text searching a useful alternative when preferred subject descriptors aren’t available and when simple keyword searching retrieves too many irrelevant results. Sophisticated free-text searching can help you find needed information, but it’s definitely an acquired skill worth practicing in order to balance recall and precision. Free-text searching is an especially fruitful approach for searching queries that explore obscure or cutting-edge topics for which there is no consensus about the right terminology to apply to the object, event, or phenomenon. For example, there was no Medical Subject Headings (MeSH) term for COVID-19 for months after the pandemic began. A couple of days after the World Health Organization announced the disease’s official name on February 11, 2020, the National Library of Medicine (NLM) added a placeholder, in the form of a supplementary concept record, for COVID-19 in MeSH, and the heading was officially established later that year (National Library of Medicine 2020). During
the interim, NLM recommended using a strategy when searching the PubMed database that would create as much recall of potentially relevant publications as possible. The recommended search statement included a number of synonyms with the Boolean OR between them: 2019-nCoV OR 2019nCoV OR COVID-19 OR SARS-CoV-2. For the first months of the pandemic, free-text searching was the only way to find scientific research papers on the topic of the novel coronavirus and the disease it causes. For today’s cutting-edge topics, free-text searching remains the only method available until preferred subject terms for those new topics are authorized.
FREE-TEXT TECHNIQUES Using a controlled vocabulary has its strengths and weaknesses, and the same holds true for free-text searching. Using descriptors in subject searches builds precision into results, as chapter 6 explained. Sometimes subject searching may be too precise, however, leaving out relevant items because of the way the indexer applied the guidelines for assigning descriptors to articles or the way the indexer judged the aboutness of the article. Free-text searching builds large-scale recall into results, on occasion retrieving items that are relevant but that were not tagged with the subject descriptor being searched. With all of those relevant results comes a lot of irrelevant material, however (Tenopir 1987). Three approaches to free-text searching to add to your repertoire are designed to capitalize on its strengths and mitigate its weaknesses: 1. Combining controlled vocabulary and free-text searching 2. Using fielded searches for keywords and phrases 3. Modifying words and word order with truncation, wildcards, proximity connectors, and adjacency operators These approaches will take different forms depending on the type of database you use, the features programmed into the search system, and
the information seeker’s objectives. Of major significance is whether the database contains only surrogate records or also includes full texts whose words are indexed and thus searchable. Retrieval recall in a database of surrogate records will be on a smaller scale, simply because there are fewer words indexed from records than from full texts. Combined Controlled Vocabulary and Free-Text Searching All searching begins with a facet analysis and logical combination, and both can guide you as you determine whether and how to combine subject descriptors with natural-language words and phrases. This is where working with one facet at a time and then using the system’s search history function can be especially helpful as you construct sets of results. For example, an information seeker interested in investing in environmentally friendly products may ask for a lot of information about the future of electric vehicles (EVs). The facet analysis and logical combination is Future AND Electric Vehicles. EBSCO’s Business Source Ultimate is a good database to search for this topic, since the seeker needs to understand what business and finance experts are thinking about EVs. The search history in figure 7.1 shows three searches, the first for the descriptor electric vehicles in the subject field, the second for the keyword “future” in the default search box, and the third a search using OR with four synonyms for future. In business and investing, there is a specific term, futures, that defines a particular type of financial transaction. Since the EBSCO system automatically searches singular and plural forms of keywords, the term “future” is in quotation marks to prevent the plural, with a different meaning, from being in the retrievals.
Figure 7.1 Search history showing three sets of results in Business Source Ultimate. By permission of EBSCO Publishing, Inc. The OR search statement with its synonyms retrieves ten times more results than searching the single word “future,” displaying the power of natural-language free-text searching throughout a database’s full text for maximum recall. When the first and third sets are combined, there are 7,230 results that are about electric vehicles and that mention at least one of the words for the concept future, as shown in figure 7.1. Even the most motivated investor can’t be expected to read that many articles, especially when a lot of them aren’t relevant. At this point in the search, it would be wise to negotiate the query in the context of these results and add another keyword or subject descriptor that makes the investment facet explicit. Before taking that step to make the results more precise, it would be wise to question whether the subject descriptor electric vehicles is too precise. Are there narrower descriptors for related terms, such as electric automobiles or electric trucks? If there’s an article about the future of electric trucks, would it have been tagged with the narrow descriptor electric trucks instead of the broader electric vehicles? And is the seeker interested in electric automobiles and trucks or do they want more sweeping assessments of the EV industry as a whole? Crafting a search that uses subject descriptors along with keywords can make you aware of the strengths and weaknesses of each. The advantage of the combination may also become obvious as you see how the precision of controlled vocabulary can mitigate the expansive recall of natural-language searching as well as how free-text searching can
retrieve some relevant results that the use of a focused subject descriptor misses. Fielded Searching We’ve been limiting our descriptor searches to the subject or descriptor field. Other indexed fields in the surrogate records can be used to target a topic even when the database doesn’t have a controlled vocabulary, or when an indexer didn’t tag relevant records with the descriptor you used in a search. The ultimate recall technique is keyword searching through all the surrogates and full-text sources in a database. Changing the field where keywords and phrases are searched will yield fewer results, with a higher proportion being relevant to the user’s query. For research queries, the obvious fields to use are title and abstract. A long wellwritten title may include words signaling the topic covered in the article. An abstract includes more words describing the article’s topic, but not the overwhelming number of words in the full texts of all the items indexed in a database. When databases began adding full-text sources in the 1980s, researchers and practitioners began to investigate the utility of different search techniques. Although much of the search literature from the 1990s is out of date, mainly because search systems have greatly improved, some of those early articles provide insights that remain helpful. In a summary of researchers’ findings regarding full-text medical databases, for example, Ojala (1990) noted that natural-language searching limited to the title field retrieved less than 10 percent of the relevant full-text articles, while searching limited to the abstracts field retrieved between 12 and 15 percent of the relevant full texts. These percentages may have changed over time, but the fact remains that keyword searching in a small pool of words will yield fewer results, and thus fewer relevant results, than searching in a large pool. Interestingly, Ojala noted that some of the relevant items retrieved in the keyword search of abstracts were not retrieved in the subject search using a descriptor for the topic. Ojala reported that the researchers’ explanation was that searchers worked hard to account for all the synonyms and variant spellings they could think of when they conducted natural-
language searches, while they relied solely on descriptors when they conducted subject searches. Search systems in full-text databases don’t always search full texts by default. The ProQuest search system does, but the EBSCOhost system doesn’t. On the EBSCOhost platform, you can select the checkbox below the search boxes to “Also search within the full text of the articles,” or you can use the drop-down menu to the right of the search box and select the field labeled “TX,” or you can input the label “TX” inside the search box at the beginning of your keyword or phrase search statement: TX “electric vehicles” Modifying Words and Word Order The Boolean OR can be used to include not only synonyms but also variant spellings, such as plurals and British terms, and variant endings following word stems. Rather than using OR between all the variant spellings you can think of, you can use wildcard symbols to modify words and truncation symbols to add endings such as -ing, -ed, and -er to word stems. Wildcard and truncation symbols (or operators or devices, as they are sometimes called) save the searcher from having to construct long OR statements to include simple spelling differences. But even thinking about how to use wildcards and truncation takes a bit of effort, so it’s worth it to check the help screens in a database to see if its search system automatically searches common word variants. For example, the ProQuest system automatically searches for plurals, British spellings, and related terms such as bigger/smaller. If you want only one of the variants you have to put it in quotation marks to force the system to search only the characters, in order, enclosed in the quotation marks. For example, a search for the simple keyword puppies is interpreted by the system as a search for puppies or puppy, and if you don’t want the singular form included you’ll need to put quotation marks around the word “puppies.” The EBSCO advanced search screen includes checkboxes for applying related words and for applying equivalent subject descriptors to keyword searches. The searcher who uses such handy features should still evaluate the spellings and terms added to their original search to ensure nothing unnecessary is included and nothing necessary is missing.
The location of words in an abstract or full text affects their meaning and thus the relevance of retrievals. A search for two keywords connected by the Boolean AND may work well when abstracts are being searched, since the two words will be somewhere in the short summary of an article. But when full texts are being searched the two keywords may be many paragraphs away from each other in an article that’s about one of those keywords but only casually mentions the other. Proximity and adjacency operators allow you to specify how near the words should be to each other and in what order the words should appear, respectively. Check the database help screens before you craft a search using proximity and adjacency operators, since some systems may have built-in functionality. For example, in EBSCOhost databases, the advanced search boxes have a default proximity operator, N5, so the system automatically retrieves keywords within five words of each other, in any order.
FREE-TEXT SEARCH EXAMPLE The following example guides you through the process of crafting a free-text or natural-language search conducted in a research database that indexes scholarly publications. The negotiated query is “the entrepreneurial activities of women whose livelihoods depend on farming.” Your facet analysis and logical combination result in two facets combined by the Boolean AND operator: Entrepreneurship AND Farm Women. A straightforward search using terms from the thesaurus as subject descriptors and then as keywords demonstrates the importance of freetext searching for greater recall of relevant items. EBSCOhost’s Academic Search Ultimate database offers full texts indexed from thousands of journals in many different disciplines. It’s huge. It uses an extensive thesaurus created by EBSCO and accessible at the “Subject Terms” tab in the blue band above the search boxes on the advanced search screen.
The most precise search would be DE “WOMEN farmers” AND DE “ENTREPRENEURSHIP”, with the DE label limiting the search to the subject field, but for purposes of illustration, the search includes other descriptors to increase recall. The relevant descriptors for the first facet are “women farmers,” “women in agriculture,” “women coffee growers,” and “women dairy farmers.” For the second facet, they are “entrepreneurship,” “business enterprises,” and “self-employment.” Here is the search statement: DE “WOMEN farmers” OR DE “WOMEN in agriculture” OR DE “WOMEN coffee growers” OR DE “WOMEN dairy farmers” AND DE “ENTREPRENEURSHIP” OR DE “BUSINESS enterprises” OR DE “SELF-employment” In this massive database, this search retrieves seven results, and a quick look at the surrogate records shows they are all relevant. But the small number makes the expert searcher wonder what might be missing. To find out, here is the broader search statement using the descriptor terms as natural-language keywords instead: ”ENTREPRENEURSHIP” OR “BUSINESS enterprises” OR “SELFemployment” AND “WOMEN farmers” OR “WOMEN in agriculture” OR “WOMEN coffee growers” OR “WOMEN dairy farmers” This free-text search using keywords and phrases yields twenty-three results, sixteen more than the subject search using the database’s controlled vocabulary. To see the difference, the NOT operator can be used to eliminate the set retrieved by the subject search from the set retrieved by the keyword search. The only reason NOT is used here is to help us evaluate the effects of the two different approaches to searching. Among those sixteen free-text search results are articles with these apparently relevant titles that were not in the subject search results: “Effects of Award Incentives and Competition on Entrepreneurship Development of Women Farmers in North West
Province, South Africa” and “Microfinance Intervention in Poverty Reduction: A Study of Women Farmer-Entrepreneurs in Rural Ghana.” The small number of results in these searches might indicate the need for a search statement revision to achieve greater recall. This is where the use of fielded searching and truncation, proximity, and adjacency operators can be useful. Here is one possible revision of the previous free-text search statement: TX (ENTREPRENEUR* OR (BUSINESS N2 enterpris*) OR (SELF W1 employ*)) AND TX (WOM#N N4 (farm* OR agriculture OR grow*)) The search statement for the Entrepreneur facet asks the system to look through every indexed word in the database for the words entrepreneur, entrepreneurs, and entrepreneurship; the word business and the words enterprise, enterprises, and enterprising with two or fewer intervening words; and the word self before the words employees, employed, employing, and employment with no more than one word between them. Any item that includes any of them will be retrieved. The retrieval set numbers 564,822 results, as shown in figure 7.2. The search statement for the Women Farmers facet asks the system to look through every indexed word in the database for the words woman or women within four words of the words farm, farmer, farmers, farmed, farming, farmwomen, farmworkers, agriculture, grow, grows, grower, growers, growing, and growth. The results number 52,900. Combining the two sets with the Boolean AND yields 6,258 results. The much greater recall comes from changing the search field to all text and adding the truncation operator at the end of word stems to include different word endings. What precision there is in this search statement comes from limiting word order and proximity and combining the enormous sets with the Boolean AND. With so many results, a lot will be irrelevant, but some relevant items not found in the previous attempts may be included. Skimming the first one hundred records, there’s one possibly relevant article that wasn’t in the previous search: “Analysis of Extent of Credit Access among Women Farm-Entrepreneurs Based on Membership in Table Banking.” Rather than continuing to skim
records, it would be wise to use the filters on the left side of the results screen and to consult with the information seeker to decide how to proceed.
Figure 7.2 Results sets showing high recall of free-text searches in fulltext sources indexed in Academic Search Ultimate. By permission of EBSCO Publishing, Inc.
HOW FREE-TEXT INDEXING AND SEARCHING WORK This demonstration of how free-text indexing and searching work uses the sample database of seven surrogate records in table 7.1. First, the system parses words from all fields in our sample seven-record database, defining a word as a string of alphabetical, numeric, or alphanumeric characters separated by spaces, tabs, paragraph breaks, or symbols. If a word bears an apostrophe, then the system deletes it and closes up the trailing letter; for example, world’s becomes worlds and life’s becomes lifes. For hyphenated words, the system deletes the hyphens and closes up the words into one. Second, the system numbers the words parsed from fields. Third, it places each word in its inverted index for free-text searching, omitting stopwords (i.e., frequently occurring articles, conjunctions, prepositions, and single-character
words) and registering this information about each indexed word in the index: The record number in which the word resides The field in which the word resides (using the field labels TI for title, DW for descriptor word, and IW for identifier word) The word-position number from the field(s) in which the word occurs Table 7.1 displays the sample database’s inverted index for free-text searching. It is limited to indexed words from title, descriptor, and identifier because including words from all fields would have resulted in a very long list. Although the system doesn’t index stopwords, it retains their word-position number. An alternative approach is to treat stopwords as if they don’t exist, assigning their word-position number to the word that follows. When searchers enter their queries, the sample database’s search system uses this data to produce retrievals: (1) record number, (2) field name, and (3) word-position number. Let’s say you want to retrieve records on “campus life.” These search statements use the free-text searching language of the EBSCOhost search system: campus w0 life. First, the system searches the inverted index for free-text searching, looking for records bearing the word campus, and retrieves record 3. It follows up with a search of the inverted index for records bearing the word life. It retrieves record 3. This is hopeful because record 3 bears both words. Second, the system checks whether both words occur in the same field. Yes, they both occur in the title field. Last, the system checks word-position data to see whether the word campus occurs before the word life and whether these two words are adjacent to one another. The word campus is word 2 in record 3’s title. The word life is word 3 in record 3’s title. For word adjacency, the first word in the phrase must have a lower word-position number than the second word in the phrase, and subtracting the word-position number for the first word from the word-position number for the second word must equal 1. That’s precisely what happens here. Record 3 is retrieved! Perhaps you are thinking to yourself that this same idea can be expressed in a different way, such as the phrase “life on a college campus.” This phrase presents the retrieval situation in which the two
words campus and life are three words apart and the word order is switched, life preceding campus. To search for the adjacent words campus and life with no intervening word and without regard to word order, enter: campus n0 life. Table 7.1. Sample Database’s Inverted Index for Free-Text Searching Word Rec. no., field, pos. no. activities 1 dw 15 adirondack 2 iw 1 are 6 ti 7 being 5 dw 9 beliefs 4 dw 15 big 4 ti 7 campus 3 ti 2 college 3 dw 1; 4 dw 3; 4 dw 9; 5 dw 4; 5 ti 8; 6 dw 1; 6 ti 14 confront 4 ti 5 consciousne 5 dw 12 ss conservation 7 dw 4 cooperation 7 dw 12 correlates 5 ti 2 critical 4 dw 1 cultural 3 dw 3 detection 2 ti 2 development 5 dw 3 differences 3 dw 13 diversity 3 dw 5 drinking 6 dw 3; 6 ti 12 ecology 1 dw 1 education 1 dw 5; 1 dw 21; 2 dw 2; 2 dw 6; 3 dw 10; 3 dw 15; 4 dw 6; 4 dw 14; 7 dw 2 educational 1 dw 2; 3 dw 7 environment 3 dw 8; 7 dw 5
environment al esteem ethics experience factors faculty field grades greening groups help higher hiking how hypothesis instructional intermediate international learning life lifes Word materials methods mountains natural naturalistic new observation observation al outdoor
1 dw 4; 2 dw 1; 7 dw 1; 7 dw 6 5 7 5 4 4 1 1 7 6 4 3 2 4 5 1 1 7 1 3 4 1 1 2 1 2 2 2 1
dw 19 dw 8 dw 7 dw 8; 5 dw 15; 6 dw 5 dw 10 dw 6; 1 dw 8 dw 13 ti 2 dw 8 ti 3 dw 9; 4 dw 5 iw 3 ti 1; 6 ti 6 dw 16 dw 10 dw 12 dw 11 dw 14; 1 dw 19 ti 3 ti 6 Rec. no., field, pos. no. dw 11 dw 23 iw 2 dw 16 dw 3 iw 5 dw 4 dw 18
1 dw 20; 2 dw 5
patterns pluralism problems psychologic al questions raising reflections related religion religions religious
5 3 7 5
dw dw dw dw
11 4; 3 ti 9 10 10
4 ti 8 5 dw 13 3 ti 4 6 ti 9 3 dw 11; 4 dw 11; 5 dw 1; 7 dw 3 7 ti 6 3 ti 8; 3 dw 12; 3 dw 14; 4 dw 7; 4 dw 13; 5 dw 14; 6 dw 4 religiousnes 6 ti 1 s resource 1 ti 4 resources 1 dw 3; 1 dw 17 schoolyard 1 ti 7 self 5 dw 18 social 6 dw 6; 6 ti 4 spiritual 5 dw 2; 5 ti 4 spirituality 3 dw 16; 3 ti 6; 6 ti 2 squirrels 1 ti 1 standards 7 dw 7 struggle 5 ti 5 student 3 dw 6; 5 dw 6 students 3 dw 2; 4 dw 4; 4 ti 4; 5 dw 5; 6 dw 2; 6 ti 15 studies 1 dw 7; 4 dw 12 support 6 dw 7; 6 ti 5 teaching 1 dw 22; 1 ti 3 testing 5 dw 17 they 6 ti 8 thinking 4 dw 2
transformin g trips underage well wildlife woodland world worlds years york your
3 ti 1 1 6 5 2 2 7 7 5 2 1
dw 9 ti 11 dw 8 dw 7 ti 1 dw 9 ti 5 ti 9 iw 6 ti 6
The next search statement retrieves records containing both phrases: “campus life” and “life on a college campus”: campus n3 life. Of course, you can enter n6, n8, and so on, but eventually so many intervening words between campus and life will result in a different meaning because the two words may be separated by one or more sentences. When you search using free-text keywords, you must think of all the possibilities for the ways in which the phrases you have in mind could be said and then formulate a search statement using your chosen system’s searching language that retrieves all of them. Let’s work on another free-text search. Let’s say you’re interested in the idea of religious education. Think of all the possibilities for how this idea could be phrased in written text: religious education religious and moral education education for religious purposes education for religious communities In these phrases, the words religious and education are adjacent, separated by one or two intervening words, and either preceding or following one another. Here is a search statement that retrieves all four: religious n2 education
Let’s check this statement in our seven-record database’s inverted index for free-text searching to determine which records it retrieves. Here are these two words’ indexing entries: education: 1 dw 5; 1 dw 21; 2 dw 2; 2 dw 6; 3 dw 10; 3 dw 15; 4 dw 6; 4 dw 14; 7 dw 2 religious: 3 ti 8; 3 dw 12; 3 dw 14; 4 dw 7; 4 dw 13; 5 dw 14; 6 dw 4 Records 3 and 4 are common to both entries. The system ignores all indexing entries except for those in records 3 and 4, which leaves these entries: education: 3 dw 10; 3 dw 15; 4 dw 6; 4 dw 14 religious: 3 ti 8; 3 dw 12; 3 dw 14; 4 dw 7; 4 dw 13 Fields common to both entries are DW. The system ignores all indexing entries except for those in the DW fields of records 3 and 4, which leaves these entries: education: 3 dw 10; 3 dw 15; 4 dw 6; 4 dw 14 religious: 3 dw 12; 3 dw 14; 4 dw 7; 4 dw 13 The search statement calls for nearness—that is, word order doesn’t matter, so the subtraction of word-position numbers can be negative or positive, and it cannot exceed +2 or -2. The system compares 3 dw 10, the first record 3 entry for education, to 3 dw 12, the first record 3 entry for religious. Subtracting 10 from 12 equals 2. Record 3 is retrieved! The system ignores remaining record 3 entries and goes on to record 4 entries. The system compares 4 dw 6, the first record 4 entry for education, to 4 dw 7, the first record 4 entry for religious. Subtracting 6 from 7 equals 1. Record 4 is retrieved! That’s how free-text indexing and searching work. When field labels are absent from search statements, many search systems default to searching all fields. Add one or more field labels to your search statement to force the system to search the fields specified. In fact, you can conduct a controlled vocabulary search by adding subject/descriptor field labels to your search statements (see the second example below). Search statement examples with field qualification are:
ti (underage w0 drinking) retrieves underage adjacent to and preceding drinking in the title field dw (religious w3 education) retrieves religious up to three words away from and preceding education in the descriptor field dw, ti (teaching n2 resource) retrieves teaching up to two words away from resource in the descriptor field, title field, or both Check table 7.1’s inverted index to determine which records these queries retrieved (answers conclude this section): A. B. C. D. E. F.
religious w0 factors problems n3 world (college n1 student) AND (self w0 esteem) (outdoor OR wildlife OR nature) AND (field w1 trips) conservation n1 standards (environment OR environmental) n3 (education OR standards OR conservation)
Now that you know how Boolean search systems construct their inverted file indexes, you will be able to conduct searches effectively and efficiently. Comparing your free-text search statements and retrievals, you will start to develop an understanding of why some free-text retrievals aren’t relevant. Answers: A. B. C. D. E. F.
Records 4, 5, and 6 Record 7 Record 5 Record 1 None Records 1, 2, and 7
FULL-TEXT NEWS DATABASES
In this section we shift from a focus on databases that index journals and other publications for scholarly content to a focus on databases that index newspapers and other journalistic reporting outlets. News stories can be essential sources for a wide variety of research and reference questions, but news databases can be a bit challenging to search. They typically do not have a thesaurus, and they typically do include full text. That’s a combination that can yield many false drops—results that match your search statement literally but have nothing to do with your actual query. Two news databases that include US and foreign sources are Access World News and Nexis Uni. A third important resource is Chronicling America, a freely available searchable collection of newspapers from all regions of the country going back to the eighteenth century. Each of these database interfaces has a different look and feel. The following examples of free-text searching in each one highlight techniques for balancing recall and precision when searching through millions of words. Access World News Because Access World News indexes more than twelve thousand news sources from around the globe, it’s wise to talk with the information seeker about any geographic elements of their query. One of the best first steps for increasing precision is to limit to a specific region, state, or even a single newspaper. The advanced search screen of Access World News offers multiple ways to limit to geographic areas. Clicking on a blue box labeled “USA,” as done in the following example, limits results to US news sources.
Figure 7.3 Results page for simple keyword search in the Access World News database. Courtesy of Readex, Inc. Figure 7.3 shows the number of results—more than 1.4 million—for the simple keyword search farm women. There are two reasons why the results are so numerous. The Boolean AND is the default between words in a single search box, and the full text of all the news stories indexed in this database are searched. The first result is a false drop; the literal words are there, but not in the sense we mean them. There’s not an option to check a thesaurus for the preferred term for farm women, but the use of adjacency and proximity operators can be helpful by forcing the system to retrieve results where the two words appear in a certain order and within a few words of each other. Quotation marks used to force a phrase search are also helpful. The search “farm women” OR “women NEAR3 agricultur*” yields 24,000-plus results (not 1.4 millionplus). Given the ways in which the articles found in Academic Search Ultimate referred to women farmers, it’s possible that even with so many results, the use of only two phrases in the search statement may have eliminated some relevant stories. At the same time, 24,000-plus is a lot, and probably includes many irrelevant stories. As always, once you begin using AND with other facets, the numbers will get smaller and the results will get better, as shown in figure 7.4.
Another technique that can increase precision is limiting the search to the first paragraph or two of the news stories in the database. For databases indexing newspapers and other news sources, it’s important to understand the specifics about news publishing. Journalistic articles usually put the most important information in the lead (or lede) at the beginning of the news story. Consequently, one of the searchable fields in the Access World News search interface is “first paragraph/lead.” Other fields are also particular to newspapers, such as the “headline” and the “byline” fields, which correspond to title and author fields in other types of databases. Even without a controlled vocabulary or a subject field, the structure of news stories and the ability to limit terms to the first paragraph or headline can help create a more precise set of results. Figure 7.5 shows a search limited to the lead/first paragraph of all the articles from the “USA” sources indexed by Access World News. Since the search is limited to a single paragraph at the beginning of each article indexed in the database, the search terms can be modified for maximum recall without resulting in the kind of overwhelming retrievals generated by a search through the full texts. The term woman includes the wildcard to force the system to find singular and plural forms of the word, since Access World News does not automatically search both. The OR operator is also used to broaden the search. There are 3,292 results for this search. Limiting the search to the headline field rather than the first paragraph/lead field would retrieve far fewer results. Changing to a search of all text would retrieve far too many results. While the lead/first paragraph approach appears to be a happy medium, in some situations it might be advisable to do a free-text search using an expansive search statement and then, in consultation with the researcher, filter using elements other than facets, such as limiting to the current year, a single newspaper, or a single section of many newspapers.
Figure 7.4 Results page for a keyword search using Boolean and proximity operators and truncation in Access World News. Courtesy of Readex, Inc.
Figure 7.5 Results page for a broad free-text search limited to the “lead/first paragraph” field in Access World News. Courtesy of Readex, Inc. In addition to news stories being structured in a consistent manner, newspapers themselves often have a dependable structure, with a news
section, a features section, sports pages, calendars of community events, letters to the editor, opinion/editorial pages, and obituaries, among others. It can be difficult to search many papers at once by section, however, since they may use different names. One newspaper may call their editorial pages “Opinion” while another calls them “OpEd.” One newspaper may call their human-interest section “Features” while another may call it “Today” or even, in historical papers, “Women’s Pages.” Figure 7.6 shows a search for the phrase women farmers limited to the features section, retrieving fifty results. Even without the facet Entrepreneur in the search, the first result appears to be relevant. The number of results is small (which isn’t necessarily the same as relevant) because the stories are all features, a journalistic genre that typically includes longer human-interest stories that aren’t considered timesensitive news. Limiting this way makes sense for some topics and for some information needs. We could be missing some good articles, however, since the phrase women farmers, which has to be somewhere in the feature story, may not have been used in other relevant articles. And there could be relevant information in the articles published in the news section or in essays published in the opinion/editorial (op-ed) section that this fielded search eliminated. Nexis Uni Nexis Uni indexes the full text of legal information, including court cases and laws; business information, including company profiles and financial data; and worldwide news content from newspapers, magazines, wire services, broadcast media transcripts, newsletters, and blogs. The default is to search all categories, but the interface makes it easy to switch to searching only news sources. The system has two search modes, referred to as (1) natural language and (2) terms and connectors. Using the natural-language search function allows you to input the information seeker’s question in the search box as a question, and the system will identify the keywords and try to interpret the concepts behind the words. If you prefer, rather than inputting a question, you can input keywords and phrases just as you would in any free-text search in any other database’s search box. In contrast, the terms and connectors search requires you to use tools such as Boolean
logic, proximity operators, and other techniques to specify the relationships among search terms.
Figure 7.6 Results for a free-text search limited to the features sections of US news sources indexed in Access World News. Courtesy of Readex, Inc. Nexis Uni also provides some controlled vocabulary features, although they are the product of automated processes rather than human expertise on the part of subject matter experts and indexers. While these controlled vocabulary features can be helpful, there is no thesaurus or subject headings list to work from. A controlled vocabulary term search looks like this: term(word) with topical words or phrases, including proper names, in the parentheses. Once you’ve entered a term search, you can skim the results, find something relevant, open its full record, and scroll to the bottom to see the list of subjects for that item. Each subject has a relevance score next to it, based on a Nexis algorithmic calculation of how relevant the item is given the search statement. You can choose subject terms from this list for a revised search. Not all items include a list of subject terms, however. Although the Nexis search system uses Boolean operators (with the slight variation that its NOT operator is AND NOT), its fielded searching and proximity operators are more useful for overcoming the challenges
of searching through the vastness of full texts. Proximity operators include the following: Adjacency: pre/1 (word order matters) and w/1 (word order doesn’t matter) Proximity: w/x (within x number of words in any order) Within the same sentence: w/s Within the same paragraph: w/p In the same field (Nexis calls fields segments): w/seg Negative versions of these are not pre/1, not w/x, not w/s, not w/p, and not w/seg. For exact phrases, use quotation marks. Use the adjacency and proximity operators between facets, loosely setting the number of words between 9 and 15, a range that roughly approximates the length of a sentence in a newspaper or magazine. Still too many results? Tighten the operator with a number from 5 to 8. Like EBSCOhost, Nexis automatically searches for singular, plural, and possessive forms. The system’s unlimited truncation symbol for use at the end of a word stem is an exclamation point (!), and its wildcard symbol is an asterisk (*). Demonstrated here is a search for the topic “the impact of screen time on teens.” This query has three facets: (1) Impact, (2) Screen Time, and (3) Teens. Notice that this query’s facet analysis includes the Impact facet. Chapter 6 downplays such a facet in controlled vocabulary searches, where the aboutness of an article is signified by the descriptors the human indexer assigns to it. Free-text searching of full texts involves the matching of words and phrases in search statements with the occurrences of words and phrases in texts. That the system is able to identify matches between the words and phrases in search statements and texts does not necessarily mean that retrieved sources are about the matched terms. Searchers who want to establish relationships between one or more conditions and effects should add relationship facets to their search statements when searching a full-text database without a controlled vocabulary. For this example, we’ll begin with a term() search and work from there. The first search is term(impact) AND term(screen time) AND term(teens), as shown in figure 7.7. There are only five records retrieved, including duplicates. Such a small number in such a huge
database suggests that precision has defeated recall. Acting on the idea that a relevant item may lead to better words and phrases for a new term() search, you can look at the full record (including the entire text) of an article, and see the subjects and their relevance scores determined by the Nexis algorithm, as shown in figure 7.8. A few ideas for synonyms are in the list, including “adolescents” and “digital addiction.” For some topics, beginning a search using the term() feature may be useful, in particular for proper names of companies, large associations and organizations, geographic locations, and people. When searching for information about topics that don’t involve proper names, it may be wiser to craft a careful search using the Nexis terms and connectors mode.
Figure 7.7 Small number of results in Nexis Uni for a search using only one term for each of three facets. Reprinted from Nexis Uni with permission from LexisNexis. Copyright 2022 LexisNexis. All rights reserved.
Figure 7.8 Algorithmically assigned subject terms for a full-text article indexed in Nexis Uni. Reprinted from Nexis Uni with permission from LexisNexis. Copyright 2022 LexisNexis. All rights reserved. The Nexis Uni landing page offers a basic search interface, but in many cases it is better to click on the “Advanced Search” link under the search box. The advanced search interface offers a second search box with a drop-down menu between the two boxes for choosing a Boolean operator. The advanced search screen also makes the proximity operators and other features obvious with some built-in instruction for ease of use. A tab above the search box offers the choice of searching news rather than the entire collection of legal and business information. In the Nexis terms and connectors mode, the logical combination uses proximity operators to cast a wide net for the different keywords representing the facets. The search is: (internet OR digital) w/3 addict! AND (adolescen! OR teen!) AND regulat! pre/3 (emotion! OR anxiety) As seen in figure 7.9, there are forty-six results. Search terms are highlighted in the first result, which seems relevant to the topic. This seems to be a case in which a carefully crafted collection of Boolean and proximity operators and truncation works well with the right terms, and
where leaving out less productive terms, such as screen time, makes for a better set of results. The Nexis system presents one list of results for a search, and it might be helpful to think of it as a set. On the left side of the results screen are filters, including an option to narrow results using the “Search Within Results” search box. Rather than creating a new set of results for the new keywords, the system searches only within the first list of results, thus creating a subset. This can be a useful iterative process of assessing results, adding a search-within term, assessing the new smaller set of results, adding another search-within term, and so on, eventually winnowing the final list of results to highly relevant items. The searcher can start with the first two facets, check retrievals to see if they are on track, and then enter the third facet using the search-within feature. Chronicling America
Figure 7.9 First result for a free-text search using Boolean and proximity operators and truncation in Nexis Uni. Reprinted from Nexis Uni with permission from LexisNexis. Copyright 2022 LexisNexis. All rights reserved. More than 19 million pages from 3,667 newspapers are searchable in the openly accessible Chronicling America database at the Library of Congress website. The home page offers a single search box, to the left of which are drop-down menus for limiting results to newspapers
published in a particular state and for stories published during particular time periods. The advanced search screen uses plain, easy-tounderstand language and a clear design. Even novice uses can create searches using Boolean and proximity operators and apply limits so that retrievals are from their chosen sources, for their chosen dates, and in their chosen language. Filters for state, name of newspaper, and dates are at the top. It’s not possible to limit a search to a section of the newspaper—a lot of historical newspapers weren’t divided into sections —but you can use the check box to specify if you want the search to be limited only to the front page or to a specific page number. The Boolean operators are represented, even though they are not called that. The search box labeled “with any of the words” is the Boolean OR, while the search box labeled “with all of the words” is the AND. The third box is for phrase searching, so no quotation marks are needed. There’s also a box for specifying the nearness of search terms to each other with a minimum separation of 5 words between terms. A student or historian interested in silver mining in Arizona during the latter half of the nineteenth century might try the search in figure 7.10. Limits are for front-page stories published between 1849 and 1890 and the search entered into the form is the equivalent of (mines OR miners OR mining) AND silver AND (bisbee N5 tombstone). Search terms are highlighted, making the relationships among the search words and phrases fairly obvious, which can help when gauging relevance and revising the search. The Library of Congress also offers a list of topics that received a lot of news coverage in their day, which could be extremely useful for students looking for help with history topics, especially if they are required to use primary resources such as news reporting published at the time the events occurred. The fact that some newspapers in Chronicling America go back to the eighteenth century means that today’s keywords may not match those of the past. Historians and others accustomed to historical research will be able to think of alternate terms and spellings with little trouble, but students will need help. The National Endowment for the Humanities provides guides and lesson plans for K–12 teachers, along with advice about how to handle sensitive content that may appear in search results (Teacher’s Guide 2022).
FREE-TEXT SEARCHING TIPS
This chapter's examination of free text covers the basic functionality that you can expect most systems to have. Be prepared for free-text search functionality to vary across systems, especially in terms of the syntax and symbols for proximity operators and truncation, the inclusion of implicit and explicit truncation, and the abbreviations for field labels.
Figure 7.10 Chronicling America advanced search screen. Source: Library of Congress, https://chroniclingamerica.loc.gov/
Free-text searching of full-text databases can be powerful for finding obscure information, since broad recall can be expected, especially if many OR statements, wildcards, and truncation devices are used in the search statement. Don't let enormous sets of results intimidate you. Many ways to refine a massive set of results for greater precision are available, including fielded searching; the use of phrase, adjacency, and proximity operators; and the application of source and date limiters. The ultimate balance between recall and precision may be the combination of controlled vocabulary (when available) and free-text techniques in a single search. But, as with all online searching, it depends on the
information seeker’s purpose, the database’s features, and the creative interventions of the expert intermediator.
QUESTIONS
Conduct online searches for one or more of these negotiated queries. To generate free-text search terms, check your selected database's controlled vocabulary, if there is one, to find subject descriptors you can use as keywords and phrases and to identify used-for terms. Leaving relevant controlled vocabulary terms out of your free-text search statements will produce fewer relevant retrievals than the database has to offer. In databases with no controlled vocabulary, try doing a quick-and-dirty search to get a sense of how the topic is discussed in natural-language discourse. More sources for free-text search terms are in textbox 7.1. Once you've identified words and phrases to use, incorporate wildcard and truncation devices and search all fields or all text for greatest recall. After you've built recall into your search statement(s), try some techniques for winnowing, such as adjacency and proximity operators and fielded searching, so that results are more likely to be relevant to the user's query. Suggested free-text search formulations for these queries conclude this chapter.
TEXTBOX 7.1. Sources of Free-Text Search Terms
The terminology that the information seeker uses to express his or her interests during the reference interview
Use references in the database's thesaurus
Authorized subject terms in the database's controlled vocabulary
Use references and authorized subject terms in another database in the same discipline or field of study
Use references and authorized subject terms in general or multidisciplinary controlled vocabularies such as Library of Congress Subject Headings or Sears List of Subject Headings for small- to medium-sized libraries
Captions for relevant classes or codes found in Library of Congress Classification, the North American Industry Classification System, or others
A database record for a relevant source that you find online or your client has in hand, especially its abstract and author keywords
Your knowledge of how people write, ranging from the vernacular (for sources like popular magazines and newspapers) to serious scholarship (for sources like journal articles, conference proceedings, and research reports)
1. For the ERIC database: I’m a new school bus driver, and I want to know more about handling discipline problems on school buses. 2. For the PsycInfo (or ERIC) database: Are teens whose parents divorce likely to develop eating disorders? 3. For Access World News, Nexis Uni, or Chronicling America: What sorts of societal changes are caused by pandemics?
SUMMARY
Free-text searching is the everyday kind of keyword approach to finding information, but with a professional twist: the use of sophisticated techniques designed to retrieve results that use the keywords in the same sense and with the same meaning as the information seeker uses them. As with a controlled vocabulary search, it requires a facet analysis and logical combination process. From there, the next step is to think of
useful synonyms, variant spellings, and how close together words should be to have the meaning intended in the query. Selecting keywords and phrases may involve consulting a thesaurus, not to use the preferred subject descriptors as descriptors, but rather to make sure the words are present in the items retrieved, even in those items not tagged with the official descriptor. The syndetic structure of the thesaurus can also help identify synonyms listed in the used-for references. Searching with controlled vocabulary terms weights the results toward the precision side of the scale, while free-text searching tips them in the direction of recall. In many instances free-text searching is the only option, as when a controlled vocabulary term hasn't yet been designated for a new topic or a database uses no controlled vocabulary or only a very simple one. Being able to compensate for the precision that controlled vocabulary offers, and that free-text searching lacks, is the key skill needed for efficient and effective free-text searching. The intermediator's ingenuity in identifying keywords and phrases, coupled with their mastery of a search system's features and filters, will lead to successful searches with or without the assistance of a controlled vocabulary.
REFERENCES
National Library of Medicine. 2020. "New MeSH Supplementary Concept Record for Coronavirus Disease 2019 (COVID-19)." NLM Technical Bulletin, no. 432: https://www.nlm.nih.gov/pubs/techbull/jf20/brief/jf20_mesh_novel_coronavirus_disease.html.
Ojala, Marydee. 1990. "Research into Full-Text Retrieval." Database 13, no. 4: 78–80.
Teacher's Guide. 2022. "Chronicling America: History's First Draft." https://edsitement.neh.gov/teachers-guides/chronicling-and-picturing-america.
Tenopir, Carol. 1987. "Searching by Controlled Vocabulary or Free Text?" Library Journal 112, no. 19: 58–59.
SUGGESTED READING
Badke, William. 2011. "The Treachery of Keywords." Online 35, no. 3: 52–54.
Sosulski, Nicolette Warisse. 2017. "Six Words." Reference and User Services Quarterly 57, no. 1: 26–28.
ANSWERS
1. I'm a new school bus driver, and I want to know more about handling discipline problems on school buses.
ERIC
Facets | Choosing Terms for Search Statements (in ProQuest)
School Buses | "bus" OR "buses" OR "busses" OR "busing" OR "bussing"
Student Behavior | student[*1] near/behav* OR disciplin* OR aggressi[*2] OR bully[*3] OR (antisocial pre/0 behavior) OR (sexual[*2] near/2 harass[*4]) OR fight* OR taunt* OR fistfight* OR slugfest
To combine sets | Choose "Recent Searches." Checkmark set numbers for these two facets, and click on the "Search with AND" button.
Note | Truncating bus retrieves too many nonrelevant hits (e.g., busy, bush, and author names beginning with this stem), so enter bus terms with quotation marks to control what the system searches.
2. Are teens whose parents divorce likely to develop eating disorders?
PsycInfo
Facets | Choosing Terms for Search Statements (in EBSCOhost)
Eating Disorders | (eating n0 disorder*) OR bulimi? OR anorexi? OR (eating n0 binge) OR hyperphagi? OR pica OR (kleine w0 levin w0 syndrome) OR (appetite n2 disorder)
Divorce | divorc* OR (broken w0 home*) OR (marital n2 conflict) OR (marital n0 separat*) OR (marriage n2 problem*) OR (child w0 support) OR (joint w0 custody) OR (custody n2 child*) OR (parent n2 absen*) OR (mother n2 absen*) OR (father n2 absen*)
Adolescent | teen* OR youth* OR (high w0 school*) OR (middle w0 school*) OR adolescen*
To combine sets | Choose "Search History." Checkmark set numbers for the three facets, and click on the "Search with AND" button.
3. For Access World News, Nexis Uni, or Chronicling America: What sorts of societal changes are caused by pandemics?
Because these are extremely large full-text databases, experiment with tools for precision rather than recall. For example, search a selection of newspapers for a defined geographic region, use tight proximity operators, and limit the search to headline and lead paragraph fields. In Nexis Uni, be sure you are searching "News" and not "All," and try this in a single search box: hlead(pandemic) AND hlead(societal changes). Although you can use a proximity operator instead of AND, limiting each term to the headline and lead paragraph retrieves around three hundred results. That's a manageable number to scan for additional terms to use in a revised search. If you search Chronicling America for this question, consider the terms you might need to use instead of societal change, since that term may not have been in wide use when these historical newspapers were being published.
8
Web Search Engines
At this point, you may have a keen sense of the difference between database search systems and the web search engines you were accustomed to using when you began reading Online Searching. Although commercial search systems and web search engines both help information seekers find what they are looking for, what's being searched is distinctly different. Material in licensed databases has been vetted by an editorial or peer review process, unlike much of the self-published material on the web. Databases accessed at the library website often incorporate a controlled vocabulary for names and subject terms to help disambiguate natural-language search terms; web search engines don't. The subscription databases index copyrighted material not freely available on the web and not included in search engine results, with some exceptions. They are free of advertising and clickbait. In contrast to a library's subscription databases, which may add new material once a day, once a week, or even less often if publishers' embargoes are in place, websites are constantly in flux, with pages being added, deleted, updated, and revised every minute of every day. Small wonder, then, that web search strategies and the types of results retrieved will vary. Search engine retrievals come with a twist: the presentation of results ranked by relevance, as determined by an algorithm. Google was the pioneer in presenting users with relevance-ranked results, and that innovation changed search into something comprehensible to everyone with access to the web. The idea behind relevance ranking is that algorithms can compensate for gaps in users' skills by factoring in search contexts, links followed, and information-seeking
patterns to determine the order of results, promising to take the burden of evaluating the relevance of a long list of web links off the user. Typical searches yield high recall, as evidenced by the millions of results matched when a few keywords are entered; relevance ranking introduces a form of precision. Although chapter 12 focuses on using database search results to improve searches, this chapter includes a discussion of the search engine results page (SERP), since it is central to the Google, Bing, and DuckDuckGo approach. After an overview of how search engines work, this chapter describes how to apply many of the search techniques you are already familiar with to the web environment and presents some methods for leveraging the unique qualities of web searching.
INDEXING
Search engines deploy web crawlers, or bots, to index web pages. Web crawlers such as Googlebot, Bingbot, and DuckDuckBot locate publicly available web pages and create an index of every word on them along with the words' locations. The index may include other elements as well, such as how recently pages were updated, which can be useful when timeliness matters, as with sports scores, breaking news, and stock prices. When you search the web, the search engine goes to this index to identify matches and runs its algorithms to determine which results to present to the searcher and in what order. The indexing process is not so different from the construction of an inverted index for a commercial database. An important difference, however, is the web's constant change and web crawlers' constant updating to keep up with the changes. A search in the morning will likely retrieve results different from those of the same search done the night before, or even minutes before. Google has more than one index for its various services, including Google Books, Google Images, and Google Scholar. Because the
pages of copyrighted books and articles are not openly available on the web, Google cannot display entire books and articles unless they are in the public domain, issued under an open-access license, or freely offered by the copyright holder. Results from Google Books and Google Scholar searches may be surrogates (citations with snippets showing where your terms appear) or sources providing the full text of public domain books and government publications as well as the full text of open-access scholarly material. Google has also compiled a collection of facts from openly available and licensed content that it calls the Knowledge Graph. All of the Knowledge Graph material is indexed, so when you do a Google search for people, organizations, places, things, and other facts, the results may be from the Knowledge Graph, even though some of the content is not openly available on the web. The Knowledge Graph functions as a kind of almanac, useful for ready-reference questions.
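To make the idea of "an index of every word along with the words' locations" concrete, here is a toy inverted index in Python. It is a deliberately tiny sketch of the data structure with two invented pages; production indexes store far more per entry, such as update recency and link information.

from collections import defaultdict

# Two invented "pages" stand in for the crawled web.
pages = {
    "example.gov/a": "silver mining in arizona",
    "example.gov/b": "arizona silver prices fall",
}

# word -> list of (url, position) pairs: the inverted index.
index = defaultdict(list)
for url, text in pages.items():
    for position, word in enumerate(text.split()):
        index[word].append((url, position))

# At query time the engine looks terms up here rather than rescanning
# every page, then applies its ranking algorithms to the matches.
print(index["silver"])   # [('example.gov/a', 0), ('example.gov/b', 1)]
print(index["arizona"])  # [('example.gov/a', 3), ('example.gov/b', 0)]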
THE SEARCH ENGINE RESULTS PAGE
Google's PageRank algorithm, patented in 2001, changed the world of search by presenting results in order of relevance, calculated in part by the quality and quantity of web pages linking to a page. It wasn't just a matter of retrieving web pages whose text matched your search words but of weighting the retrieved pages by how often other high-quality web pages linked to them. The assumption was that the creators of web pages were selective about the pages they linked to, and their links functioned as a kind of vote of confidence that could increase the relevance of the page for the information seeker. A page that matched the search terms and that was linked to by many different web pages that themselves were also linked to many other pages would be presented high up on the
results page, in ranked order by relevance, as Google defined it. Users developed trust in the ranking system, in part because they could see that Google had found millions of pages but was presenting only ten of them on the screen, suggesting that the search engine was capable of "precise comprehensiveness," as Siva Vaidhyanathan has called it (2011, 49–51). In the years since, Google has revised its proprietary algorithms many times to keep up with changes in web content and use. No matter how search engines calculate relevance, the focus is on the SERP. Figure 8.1 shows the top of a Google search results page, and figure 8.2 shows the top of a Bing results page for the same search, women farm owners. The terms in bold type on the Google SERP provide clues that keywords are automatically truncated (stemmed) so that farming and farmers are included in the results. Both Google and Bing SERPs include at least one link to a government website and offer related questions and answers. These seem to be reputable sources and potentially useful results. There are malevolent and misleading sites on the web, but these search results demonstrate that a great deal of authoritative and useful material is there too. These particular results reflect the nature of the search, a straightforward set of keywords about a noncontroversial topic coming from a networked computer that has issued similar kinds of searches in the past. Try the search yourself from your customary device and compare the results. Google, Bing, and DuckDuckGo all offer advanced search interfaces, and if you are surprised by that it's only because they are not advertised, promoted, or highlighted. The assumption seems to be that if the relevance algorithms are working correctly, users don't need to know fancy operators or other search techniques. With the focus on SERPs, search engines are in the business of improving results, not necessarily of improving searches or the skills of searchers. This focus has led to a new industry, called search engine optimization (SEO), the efforts that businesses expend to push their web pages to the top of the SERP. According to one SEO expert, "Not Google, not Bing, nor any other major search engine is in the
business of providing organic [unpaid] listings” (Davies 2019). Unpaid,
Figure 8.1 Google search engine results page.
Figure 8.2 Bing search engine results page. Microsoft and Bing are trademarks of the Microsoft group of companies.
organic search results are there to give people satisfactory information so they will keep using the search engine and thus keep seeing ads as well. Unlike Google and Bing, DuckDuckGo doesn't track users. It sells ads based on the keywords used in searches, rather than selling finely targeted ads that only a well-defined set of individuals will see (DuckDuckGo 2022). DuckDuckGo partners with Bing, meaning many of the same results will appear on their respective SERPs, though not in the same order. Although results won't necessarily be identical, they will be similar, with Bing displaying more targeted ads on a flashier page. Figure 8.3 shows the DuckDuckGo SERP for the women farm owners search. For the intermediator helping individuals find factual, authoritative information, the need to identify and interpret the misleading or inaccurate sources in a search engine's list of results adds to the complexity of the search process. The focus in the remainder of this chapter is on using Google, but most of the techniques presented here can also be adapted for use with different search engines. Google continues to be used much more frequently than Bing and DuckDuckGo, but that doesn't mean it's used as efficiently or effectively as it could be.
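Returning to the link-counting idea behind PageRank: the Python sketch below implements a bare-bones version of the calculation described above, on an invented four-page web. Real rankings blend this kind of link analysis with many other signals, so treat it as a conceptual illustration only.

# Each page's score is assembled from the scores of the pages linking
# to it, so a link acts as a weighted vote of confidence.
links = {            # invented page -> pages it links to
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
pages = list(links)
rank = {p: 1 / len(pages) for p in pages}
damping = 0.85       # the customary damping factor

for _ in range(50):  # iterate until the scores settle
    new_rank = {p: (1 - damping) / len(pages) for p in pages}
    for page, outlinks in links.items():
        share = damping * rank[page] / len(outlinks)
        for target in outlinks:
            new_rank[target] += share
    rank = new_rank

# Page "c" comes out on top: three different pages link to it.
for page, score in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))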
Figure 8.3 DuckDuckGo search engine results page.
Just because search engine makers focus on presenting results doesn't mean you have to. You should instead focus on crafting the best possible search up front so that the results are as refined as possible before you evaluate them and revise your search statement if needed. In a book titled The Joy of Search (2019), Google's Senior Research Scientist for Search Quality and User Happiness, Daniel M.
Russell, presents case studies demonstrating how to work between the search box, the SERP, and the links the SERP offers to find needed information. Writes Russell: “Those who are fluent in search and retrieval not only save time but also are far more likely to find higher-quality, more credible, more useful content” (5). Depending on the information seeker’s query, Google, Google Images, or Google Scholar may be useful. Not as useful is Google Books. Although it’s possible to find the full text of public domain works along with snippets from copyrighted books and links to bookstores and libraries that have the book, a good alternative, particularly for older publications, is the HathiTrust collection at hathitrust.org, with its classic library interface and robust metadata (Fagan 2021). For readers’ advisory queries, a better choice is a web service such as Goodreads, the commercial database NoveList, or the many “BookTube” channels on YouTube.
GOOGLE ADVANCED SEARCH
Although Google's home page is famous for its minimalism, the search engine offers a separate advanced search screen accessible from the "Settings" link in the lower-right corner of the home page or by using the URL https://www.google.com/advanced_search. The advanced search screen provides a form with some brief instructions. You can use it not only to construct a more precise search than is usual when using the single search box on the home page, but also to learn how to do more sophisticated searching using that single search box. The advanced search form, shown in figure 8.4, offers boxes at the top for typical searches; to the right of each box is a tip for how to achieve the same search using only the home page search box. Above the search boxes is the command "Find pages with," followed by each box with its own separate function. For example, the first
box can be used to “Find pages with all these words,” while the second is for “this exact word or phrase,” the third is “any of these words,” and the fourth is “none of these words.”
Figure 8.4 Google advanced search form.
The fifth box makes it easy to find a range of numbers. Any combination of the boxes can be used. Figure 8.5 shows a search for population data about poverty levels by county or city from 2010 to 2020. Clicking on the "Advanced Search" button activates the search. On the results page, the search is shown in the single search box at the top (figure 8.6). That is the search statement you could have input manually rather than using the advanced search form.
Figure 8.5 Google advanced search for data about poverty levels.
Figure 8.6 Google results for the poverty-level data search.
Farther down on the advanced search screen are filters. Beginning with the command "Then narrow your results by" are boxes with drop-down menus that make it possible to limit results by language, geographic region, and recency. The box labeled "terms appearing" is more or less equivalent to fielded searching, with the default "anywhere in the page" and a drop-down menu offering the option to limit results to the title, text, or URL of the page or in links to the page. Other filters allow for the automatic elimination of sexually explicit pages; for limiting by file type, such as Adobe Acrobat (.pdf) or Microsoft Word (.doc); and for usage rights, including licensing that waives some of the restrictions of copyright. There's also a box for limiting to a specific domain, such as .org, .edu, and .gov. For the poverty levels search, figure 8.7 shows how to filter the results to include only material from government websites and only Excel spreadsheets. The results will be from the U.S. Census Bureau, but also from other federal agencies as well as state agencies, since the .gov domain includes governments at all levels. Without limiting by file type, results would have included web pages, pdfs, and other material that can be read but not manipulated. Limiting results to .xls files means the data can be downloaded, sorted, excerpted, added, subtracted, and so on, depending on the user's purpose. This search—using the all, exact, any, and none boxes and limiting the results to government data in spreadsheets—can be input in a single search box. When using the one box, there's no need to nest the OR statement inside parentheses to make sure it's processed first: county OR city "poverty level" site:gov filetype:xls. Limiting results to filetype:pdf will find easily printable material, such as guides, handbooks, health-related brochures, and posters. Students looking for science fair examples and teachers looking for science fair guidelines from other schools and organizations can put the .pdf limiter to good use. Figure 8.8 shows the search guide OR guidelines "science fair" filetype:pdf and the first few results. Notice that the results tally to around 237,000. A basic Google search for science fair guidelines tallies around 498,000,000 results. The first on the results page is a series of short videos, "How to Do a Science
Fair Project,” on the website of the NASA Jet Propulsion Laboratory at the California Institute of Technology. The second result is a series of web pages explaining the scientific method and how to apply it in a science fair project. Both of these are useful for students in the classroom, but they don’t address the needs of students and teachers looking for something that can be printed out and referred to as work on projects proceeds. Google’s algorithms do their work, however, so that some pdfs and even Word documents are on the first page of results, interspersed with videos and web pages. Adding the filetype limiter, though, reduces the clutter of other file types on the SERP.
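Because every advanced search form ultimately produces a single query string, such strings are easy to compose programmatically, for example when generating many related searches. The Python sketch below builds the science fair query and a search URL from it; the only thing assumed about Google here is the standard q parameter, and the terms are the ones used above.

from urllib.parse import urlencode

# Components of the search, mirroring the advanced search form's boxes.
any_of = ["guide", "guidelines"]   # "any of these words" -> OR
exact = "science fair"             # "this exact word or phrase"
filetype = "pdf"                   # file type filter

# Assemble the single-box equivalent of the form.
query = " OR ".join(any_of) + f' "{exact}" filetype:{filetype}'
print(query)   # guide OR guidelines "science fair" filetype:pdf

# URL-encode it into a search URL that can be opened in a browser.
print("https://www.google.com/search?" + urlencode({"q": query}))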
Figure 8.7 Google search results for .gov sites and .xls files.
Figure 8.8 Google search results for science fair guides limited to pdf files.
The advantage of finding resources from the federal government is that they are, for the most part, in the public domain. They can be freely printed, altered, and used without asking for permission. Adding the domain limiter site:gov yields results from all levels of government, so you will need to identify among the SERP links the ones from federal agencies and perhaps from your own state library or other relevant agency. It may also help to limit results to a particular science-oriented federal agency, such as:
site:nasa.gov for the National Aeronautics and Space Administration
site:noaa.gov for the National Oceanic and Atmospheric Administration
site:usgs.gov for the United States Geological Survey
site:nih.gov for the National Institutes of Health
The federal government's own portal, usa.gov, indexes web pages and publications from federal, state, and other government agencies. The usa.gov search engine makes it easy to retrieve material on a wide variety of topics, much of it in the public domain. A basic search box is on the upper-right side of the home page. Scrolling down on the page leads to a set of links in alphabetical order by topic, from "About the U.S." to "Voting and Elections." Browsing the links may be the best approach when you are helping users unfamiliar with the breadth of government resources and how to craft searches to find precisely what's needed. The usa.gov search box uses the Bing search engine. A basic query, plutonium research, in the usa.gov search box yields results from several different federal agencies, including, among others, a news release about stockpile safety from the Los Alamos National Laboratory, historical documents in the collection of the U.S. Department of Energy Office of Scientific and Technical Information, and a profile of plutonium on the website of the Centers for Disease Control and Prevention's Agency for Toxic Substances and Disease Registry. On the results page are filters for images and videos, but there is no indication of the total number of results or pages of results. Using the Bing search engine directly at bing.com yields many of the same results as well as many not issued by government science agencies and not in the public domain. The usa.gov results page is an unadorned list of links with snippets, in contrast to the Bing results page serving up results arranged in a bento-box presentation with selected images and videos displayed along with web page links and snippets. Running down the right side of the Bing results screen are the Wikipedia entry on plutonium, clickable photos of plutonium researchers, and similar topics other Bing users have searched. Filters include links to "school," for items at the institution where the searcher studies or works, as well as images, videos, maps, news,
and shopping (although, as you can imagine, the shopping results for the plutonium research query are pretty weird: one is "Night Plutonium Light Up" boots with platform soles!). Bing indexes the web, and usa.gov indexes government websites. Bing invests huge amounts of its revenue—earned by selling ads displayed with search results—in adjusting its search algorithms and SERP, while the federal government collects no ad revenue and spends its tax revenues elsewhere. Depending on the information need, the predictability and precision offered by usa.gov can eliminate the noise inherent in a search of the wider web. Using the usa.gov search box means you won't have access to all the filters available on the Bing SERP, so you may have to tailor your search query to include the types of material you seek. For example, the query plutonium maps filetype:pdf yields results that include the word map, such as a technical report that suggests a future project to identify plutonium reserves on a world map, but that doesn't itself include a map. The images filter on the usa.gov results page can be used in this case to display maps, but you may have to do some extra work to track down the metadata for maps that have titles and legends but lack complete citation information. In this example, using Bing to search plutonium site:gov and then using the maps filter doesn't work well. Using Bing, the query should be the same as in a usa.gov search, plutonium maps filetype:pdf site:gov, with the addition of the domain limiter. The Bing search results number 5,725. Doing the same search using Google, the results number 49,100, but the actual display is truncated to only five screens after eliminating duplicate or very similar results. The order of the top links differs on the Bing, Google, and usa.gov results pages, an indication of the differences in their indexing bots and their search and ranking algorithms. And, of course, there are no ads on the usa.gov SERP. Google's search algorithm automatically includes synonyms, antonyms, and variant spellings, so it's not necessary to use truncation symbols or wildcards. The automatic inclusion of such variations increases recall, so it's important to build some precision into your search statements. Using quotation marks around phrases
is one of the easiest and most obvious ways to do that. Although it's not shown on the advanced search form, Google does offer a basic proximity operator, AROUND(x), where x is the maximum number of words between two search terms. The search "poverty level" AROUND(5) homelessness will retrieve results in which the phrase poverty level is within five words before or after the word homelessness. Somewhat similarly to the commercial database search systems that make it possible to filter by publication names or document types, the Google source feature limits a search to a particular news source that's indexed in Google News. To find Mardi Gras parade coverage in the New Orleans newspaper the Times-Picayune, search mardi gras parades source:times-picayune. Since the New Orleans Times-Picayune newspaper website is nola.com, an alternative is to search mardi gras parades site:nola.com. For news sources not included in Google News, the domain limiter can be used. For the business angle on Mardi Gras, try searching mardi gras site:marketplace.org to find coverage broadcast on Marketplace, the American Public Media program. In fact, a Google search including the site limiter can be used for all kinds of websites and is especially useful when a website's own search box doesn't function well.
GOOGLE IMAGES
Although images are retrieved in a basic Google search, you can also do a basic Google search and then use the Images filter on the results screen. For many information needs, however, it's best to go straight to images.google.com. As Googlebot crawls the web it looks for images embedded in websites whose html codes include image tags. If Googlebot identifies a website with images, the specialized Googlebot Images crawler may go through the website to index them if they are in a format that Google can handle, such as .jpg
and .png. Google crawlers look for text in image titles and captions, in the words of image URLs, and in the text surrounding an image on a website in order to generate the labels that appear under each image on the Google Images results page (Google Images Best Practices 2022). In a regular Google search, a few images may be displayed near the top of the results page, among the highest-ranking web page links with snippets. In a Google Images search, the results page shows image thumbnails, the only text being small titles and short URLs, making it easy to skim for the images that seem most relevant to the task. Using the Google Images search engine eliminates the clutter of non-images on a basic Google SERP. Google Images has its own advanced search screen at images.google.com/advanced_image_search. As seen in figure 8.9, the first four search boxes are the same as in the regular Google advanced search form: all, exact, any, and none. Although it is possible to filter by geographic region and site, as with regular Google, the rest of the filters are specific to images, including size; aspect ratio; image colors; image type such as photo, clip art, or animated; and file formats such as .jpg and .png. A final filter allows you to limit results to works under Creative Commons licenses, which may yield images the information seeker is free to use without requesting permission from the copyright holder as long as credit is given to the person or entity holding the copyright. For many information needs, an obvious search statement is all that's necessary to find relevant images with the sought-after attributes. For someone seeking an image of an American brand of electric car, an easy Google Images search would be ford OR gm OR "general motors" "electric cars". Above the thumbnails on the results page is a "Tools" tab; clicking on it will display some of the same filters offered on the advanced search form: size, color, type, time, and usage rights (figure 8.10). Between the filters and the thumbnails are suggested topics to refine the results. For the American electric cars search, these suggestions include "Tesla," "luxury," "hybrid," and "charging" as well as other subtopics of possible interest to the user.
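Because the crawler can only label an image with the text it finds around it, well-written alt text and captions matter. As a rough illustration of what an image crawler has to work with, this standard-library Python sketch pulls image sources and alt text out of an invented HTML fragment; real crawlers are vastly more elaborate.

from html.parser import HTMLParser

class ImgAltParser(HTMLParser):
    """Print each image's source and its alt text, if any."""
    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr = dict(attrs)
            print(attr.get("src"), "->", attr.get("alt", "(no alt text)"))

# An invented page fragment with one well-described image.
page = """<figure>
  <img src="mine-1885.jpg" alt="Silver mine near Tombstone, 1885">
  <figcaption>Miners at work.</figcaption>
</figure>"""
ImgAltParser().feed(page)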
Figure 8.9 Google Images advanced search form.
Figure 8.10 Google Images results for "electric cars" search.
There are also not-so-obvious searches that can lead to authoritative information and research articles. One example is a search for statistical tables about a phenomenon. Statistical tables can be informative on their own, but they can also lead to the authoritative information and research articles that contain them. If a public library user is wondering how common diabetes is among teenagers, an image search may provide the answer at a glance. The search statement diabetes by age table OR chart site:gov OR site:edu will yield image results that are graphics providing visualizations of the data, available on the websites of either government agencies or educational institutions. Adding the form of visualization that's needed as a keyword helps make the results more relevant for the information seeker's purpose. Adding the domain limiter helps ensure that noncommercial information is retrieved. That search statement may be enough for the user wondering about the prevalence of juvenile diabetes. For advanced students and researchers, clicking on the image will open a box to the right that shows the image in a larger size along with a "visit" button that
will go to the image in its context, which may be an article, book, web page, or other type of publication. The box will also include a citation that links to the full document. With this strategy, it’s possible to use Google Images as a route into research reports and other forms of fact-based publications that you know will include tables, charts, figures, diagrams, and other graphic illustrations (depending on the keywords you use in the search) alongside the text.
GOOGLE SCHOLAR
The more common route to research material is Google Scholar. Many academic libraries include Google Scholar on their list of databases accessible at their websites. That's because Google Scholar indexes articles published in peer-reviewed journals as well as preprints (not yet reviewed) and other publications stored in open-access repositories. In addition to research articles, Google Scholar indexes dissertations and theses, technical reports, scholarly books, government research publications, and conference proceedings. Martín-Martín et al. (2018) estimate that about 50 percent of Google Scholar results are available on an open-access basis, which can be invaluable for information seekers not affiliated with academic libraries and thus not able to use their subscription databases. The amount of open-access material will continue to increase, especially since the White House Office of Science and Technology Policy issued guidance to grant-making government agencies to make the resulting research publications freely available immediately on publication (Nelson 2022). Prior to the new policy, agencies issuing research grants and publishers could place embargoes on publications, keeping tax-funded research behind paywalls for months.
Google Scholar has agreements with publishers allowing them to index copyrighted articles. For paywalled articles, Google Scholar can display citations and snippets with links that go to the publisher's website, where users can read the abstract but must pay for access to the full article. University students and faculty with access to their library's subscription databases can find those articles in the databases without having to pay fees for them. But it's not necessary to leave Google Scholar, sign in to a university database, and do the search all over again to get access to full texts. Instead, searchers can use Google Scholar settings to identify themselves as eligible to access paywalled full-text articles. To do so, they can click on the three horizontal lines in the upper-left corner of the Google Scholar home page, then click "Settings," then "Library Links." There they will be able to input their university's or college's name. Once they've changed the setting, they will see links not only to open-access versions of the article (if there are any) but also links to "Full-Text @ [University Name] Library." Instead of having to leave Google Scholar and sign in to a database at their university library's website, they can simply click on the link on the Google Scholar SERP for access. One more thing to note about Google Scholar is the pace of updates. As with many commercial databases, new articles are added frequently. Some websites restrict how often Googlebot can crawl through them, however, and that makes for considerably longer wait times for updating items that Google has already crawled. By now you know to expect that Google Scholar has its own advanced search form, findable by clicking on the three horizontal lines in the upper-left corner of the home page. The advanced search form offers the customary four search boxes, for all, exact, any, and none. It allows a rudimentary form of fielded searching, where you can choose to limit the search to the titles of articles or search full texts. Although there are no filters per se, there are three additional search boxes for refining results, with the AND implied between all the search boxes. These three boxes are for author names, journal names, and publication date ranges. They can be
used in conjunction with the topic-oriented search boxes above them, or they can be used separately. If a doctoral student needs to find all the articles written by an influential researcher in their field of study, they can use only the author name box to retrieve them. Since Google Scholar doesn’t rely on a name authority list to disambiguate author names, it’s possible to retrieve results by a researcher whose name is the same as another researcher’s in an entirely different field. There is a way to disambiguate after the fact, however, on the SERP. Above the list of citations and snippets on the results page is a link to “user profiles,” which goes to a list of authors with the same or similar names, along with their university affiliations and areas of research. Researchers themselves create their profiles, not only to distinguish their work from others’ but also to list their articles and to track the citations to their articles from other researchers’ publications to document the impact their research is having. The advanced-search box for journal names, labeled “Return articles published in,” can be used alone, if a researcher needs to skim the kinds of articles that a particular journal in their field publishes. The journal name box is also useful for discovering if a particular journal’s articles are indexed in Google Scholar. Do put quotation marks around journal names so the search engine will look for the phrase rather than the individual words. Although it is possible to search by journal name, Google Scholar claims not to index journals but to index articles and recommends searching known articles to see if they are retrieved, indicating that a journal is included in Scholar. The claim not to index journals is accurate, if you consider the different approach taken by commercial databases that typically do index journals. Checking a commercial database’s list of indexed publications will also tell you when indexing of the journal began, either with the first issue or at some later point, and when full text began to be added. This is the kind of information about journal coverage that is difficult if not impossible to discover from Google Scholar. When using Google Scholar it is more likely that you will be working with queries posed by researchers and students
investigating particular topics and questions, rather than the general public. You'll want to consider the same things you do after conducting a reference interview and determining the search query. It might be best to use Google Scholar after using subscription databases, where you will have determined not only good natural-language keywords for a topic but also the controlled vocabulary terms. For example, you and an information seeker needing scholarly articles on the use of drones in combat in Afghanistan may have discovered that Academic Search Ultimate uses "drone warfare" as a subject descriptor. Figure 8.11 shows results from an advanced search for the topic combining keywords and subject descriptors used as keywords to increase recall, with drone warfare in the exact phrase box and afghanistan kabul in the OR box. Leaving the default as "anywhere in the article" retrieves more than four thousand results. Switching to "in the title of the article" retrieves only four. One way to get a better balance of recall and precision with this search is to add keywords to the search but limit to article titles. The idea is to use article titles as a kind of proxy for subject and to add keywords to the search that we might expect authors to use in their titles. Figure 8.12 shows how the search has been revised right inside the search box; we've used the advanced search form only to get started and to see how the form translates our input into the search language of Google Scholar. The search allintitle: afghanistan OR kabul "drone warfare" OR drone tells the search engine to look only in the article titles for the phrase drone warfare or the words drone or drones and retrieve the ones that also have either the word Afghanistan or the word Kabul or both in the article title as well. There are seventy-five results.
Figure 8.11 Google Scholar advanced search form results.
The advanced search form has limited utility, but when you see how it translates your query into the Google Scholar search language, shown in the box above the list of retrievals on the SERP, you can see the other fields or tags that may also be used for greater precision. For example, using the advanced search form to limit results to items where the keywords and phrases are in the article title is translated as the field/tag label "allintitle." Although Google Scholar does not itself rely on a subject thesaurus, it does retrieve articles based on the subjects assigned to them in a commercial database. And it also indexes the words in the full text of articles, including their abstracts. Taking a pointer from the Google Images search and adding a keyword for charts, tables, figures, or other graphics can sometimes be useful, depending on the information seeker's expressed need for research that includes data visualizations.
Figure 8.12 Google Scholar search limited to titles.
Many people—including librarians, researchers, K–12 teachers, and students at all levels—may shy away from web search engines for finding quality information and data to support serious projects. At the other end of the spectrum are users who consider themselves experts at web searching and who think everything is on the web. In the middle are confident and skilled information intermediators who continually deploy and update their expertise, honing techniques that work well no matter which subscription or open-access resource is being used, knowing that some techniques overlap and others are specific to the type of system being used. Search engines are invaluable at helping information intermediators answer ready-reference questions and track down known but elusive items. Use them to find all kinds of factual information about people, places, things, organizations, events, and occurrences.
QUESTIONS
Use the queries you have been working with in previous chapters to help you develop a keener sense of how web searching for authoritative information differs from database searching. Choose one or two of the following queries and develop a search statement for it, using the best keywords, phrases, limiters, and other techniques you can think of. 1. I’m a new school bus driver, and I want to know more about handling discipline problems on school buses. 2. Are teens whose parents divorce likely to develop eating disorders? 3. I’d like to read about the real-life experiences of American fighter pilots in World War II. 4. When adult learners engage in conservation education, does their behavior change? 5. New technology devices to help ALS patients communicate with family, friends, and caregivers. 6. The role of manga in helping teens deal with social situations, such as dating. 1. Run the search using Google Scholar. Record the number of results. How relevant to the topic are they? Compare the relevance of items on the first page to that of items on the eighth or tenth page. On the first two results pages, how many are openly available and how many are only in paywalled databases or journals? 2. Now use Google Images to discover a graphic element (a chart, table, figure, illustration, etc.) relevant to the topic. How did your search statement change? How relevant to the topic do the results seem to be? Choose one thumbnail that appears to be authoritative and informative about the query. Identify the article or web page where the graphic element appears and consider how authoritative it is. Did your Google Images search seem to retrieve useful articles that weren’t on the first couple of pages of Google Scholar results? To what extent do you think an image search can be useful for scholarly queries?
3. Finally, use plain-old Google to find information about your chosen query. How did your search statement change (if it did)? Characterize or categorize the first couple of pages of results: What is the mix of ads, images, and scholarly and popular links? How relevant does the first page of results seem? Are any of the results on the first one or two pages the same as the ones from the Scholar or Images searches?
SUMMARY
Information intermediation doesn't begin and end with the sophisticated, structured search systems that provide access to authoritative information. The web includes a bounty of useful and reliable information amongst the dreck. Perhaps surprisingly, some of the same search tactics used in structured databases offered by libraries can retrieve scholarly articles, data visualizations, and other forms of information from reputable sources on web search engines. The advanced search forms for Google and its Scholar and Images subsets can help you get started as you craft a search designed to offset the tyranny of too much information with some techniques for precision. Because the forms show the search in Google's language in the search box on the SERP, they also help you learn some shortcuts for creating searches that retrieve highly relevant results, as judged by the human information seeker. You'll have to watch out for ads on SERPs as well as misinformation and downright dangerous advice and theories posted by the ill-informed and the malicious. The same kinds of assessments you would make of any set of results apply even more on the web.
REFERENCES
Bush, Daniel, and Alex Zaheer. 2019. "Bing's Top Search Results Contain an Alarming Amount of Disinformation." Stanford Freeman Spogli Institute for International Studies, December 17. https://fsi.stanford.edu/news/bing-search-disinformation.
Davies, Dave. 2019. "How Search Engines Crawl & Index: Everything You Need to Know." Search Engine Journal, October 8. https://www.searchenginejournal.com/search-engines/crawling-indexing/.
DuckDuckGo. 2022. "Advertising and Affiliates." https://help.duckduckgo.com/duckduckgo-help-pages/company/advertising-and-affiliates/.
Fagan, Jodi Condit. 2021. "Google Books." The Charleston Advisor 22, no. 4 (April 1): 33–39.
Google Images Best Practices. 2022. https://developers.google.com/search/docs/advanced/guidelines/google-images.
Martín-Martín, Alberto, Rodrigo Costas, Thed Van Leeuwen, and Emilio Delgado López-Cózar. 2018. "Evidence of Open Access of Scientific Publications in Google Scholar: A Large-Scale Analysis." Journal of Informetrics 12, no. 3: 819–41.
Nelson, Alondra. 2022. "Memorandum for the Heads of Executive Departments and Agencies: Ensuring Free, Immediate, and Equitable Access to Federally Funded Research," August 25. https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-Access-Memo.pdf.
Russell, Daniel M. 2019. The Joy of Search: A Google Insider's Guide to Going Beyond the Basics. Cambridge, MA: MIT Press.
Vaidhyanathan, Siva. 2011. The Googlization of Everything (And Why We Should Worry). Berkeley: University of California Press.
SUGGESTED READING
Russell, Daniel M. 2019. The Joy of Search: A Google Insider's Guide to Going Beyond the Basics. Cambridge, MA: MIT Press.
Dan Russell's Home Page & Site, https://sites.google.com/site/dmrussell/.
ANSWERS
Since search statements and results vary, this answer discusses only the first query: I'm a new school bus driver, and I want to know more about handling discipline problems on school buses. From this discussion, though, you should be able to see a good example of how using Google Scholar, Google Images, and plain-old Google results in different retrievals. 1. Run the search using Google Scholar. Record the number of results. How relevant to the topic are they? Compare the relevance of items on the first page to that of items on the eighth or tenth page. On the first two results pages, how many results are openly available and how many are only in paywalled databases or journals? I looked up the subject descriptor in ERIC and found "school buses" but not "school bus drivers." For the search "school buses" driver discipline OR punishment, there are 7,090 results. The first results are highly relevant, with titles such as "Video Monitoring Devices on School Buses" and "Behavior Management Interventions for School Buses." On the eighth page of results, very little looked relevant; many of the results are older, out-of-date publications. On the first page, five results
are openly accessible, two of them at ed.gov (the U.S. Department of Education website) and one at nih.gov (the National Institutes of Health website). On the second page, three are openly accessible. Assuming our new school bus driver does not have access to paywalled articles, we can change the search to include the domain limiter site:edu; retrievals will be mainly from open-access repositories maintained by universities. 2. Now use Google Images to discover a graphic element such as a chart, table, figure, illustration, etc. that is relevant to the topic. How did your search statement change? How relevant to the topic do the results seem to be? Choose one thumbnail that appears to be authoritative and informative about the query. Identify the article or web page where the graphic element appears and consider how authoritative it is. Did your Google Images search seem to retrieve useful articles that weren’t on the first couple of pages of Google Scholar results? To what extent do you think an image search can be useful for scholarly queries? After a few tries, the search statement changed dramatically, to simply “school bus” behavior. When you think about it, school buses have drivers and students, so adding those as keywords narrowed the results too much. Keywords for types of graphics also hindered recall. The simple search I finally used was broad, but listed across the top of the SERP were a number of refinements, including the word chart. Clicking on that provided better results. Above all the organic unpaid listings were ads for different kinds of bus safety charts available from retailers like Walmart and Amazon, and some of the organic images were for charts available for purchase. One useful and freely available item on the SERP was a decision tree showing different paths to take for different kinds of behavior problems on buses. Clicking on that thumbnail opened a window to the right showing that it was a slide from a school district training slideshow. Clicking on the “visit” tab led to the complete set of thirty-one slides on the topic of positive behavior interventions and supports (PBIS), which was freely downloadable. This is an authoritative source
for the topic, but not the type of material that Google Scholar indexes. For this particular topic, Google Images may not be as helpful as it might be for more scholarly works that include charts and graphs generated from research data. 3. Finally, use plain-old Google to find information about your chosen query. How did your search statement change (if it did)? Characterize or categorize the first couple of pages of results: What is the mix of ads, images, and scholarly and popular links? How relevant does the first page of results seem? Are any of the results on the first one or two pages the same as the ones from the Scholar or Images searches? I began with the same search statement used in question one, but after seeing the results I changed it to "school bus drivers" discipline OR punishment. The types of formats were markedly different, with no links to texts showing up until I scrolled down on the first SERP. The first result is a YouTube video titled "Student Management and Discipline for Bus Drivers," and the video itself has been indexed so the user can easily jump to one of the "10 key moments in this video." More videos are listed, followed by suggestions for other ways to search the topic. Finally, there are links to helpful-sounding web pages at all kinds of domains, including .com and .org. None of the first ones appeared in the Scholar or Images results, but on the second page is a relevant link to an open-access journal article, which likely was in the Scholar search results. Toward the bottom of the SERP are more suggestions for related searches, including one for school bus behavior chart that leads to many of the same graphics found using Images.
9
Known-Item Searching
A known-item search is a request for a specific source that you know exists. The item can be a journal or magazine or an article published in one; a book in any format; a government document; a corporate annual report; conference proceedings or a single conference paper; a data set or a data visualization; a technical report; a blog or blog post; a podcast; a film or video; audio or a transcript from a television or radio program; a newsletter or zine; a manuscript or archival collection; a still image; an oral history interview; or an advertisement. Known-item searching involves finding one source among the billions of sources in the vast and varied information universe. Subject searching is based on the assumption that someone somewhere has published something on the information seeker's topic. Known-item searching is based on the knowledge that a specific source exists and it's just a matter of finding it. The information seeker's purpose, uncovered in the reference interview, will help you identify whether their quest is for a known item. They may need to find a source that someone else referred to, the complete citation for an item they've used in the past, or the full text of an article whose citation they have. They may have an open-access preprint, which is an early version of an article, and need to find the final peer-reviewed version published in a journal. They may have a citation to an older article that they found in a database whose full-text sources don't go back that far. They may know an old ERIC document or government report exists on microfiche and want to find a digital copy instead. Your decision to conduct either a subject or a known-item search is the result of typecasting the user's
query and choosing a database, steps 3 and 4 of the online searching process. Both steps help you craft a search as an input to the retrieval system so that the search system produces relevant results or, even better, one single result for the known item. This chapter provides an in-depth examination of the two most common types of known-item searches, for authors and for titles. When only the name is known, an author search can be used to retrieve material issued by individuals and by many types of entities, including government agencies, organizations, businesses, and places. Title searching applies to article, report, blog, media, book, periodical, and series titles. These last two, periodicals and series titles, can be used for a kind of subject search as well, allowing the user to search a journal or book series title and then browse issue by issue or book by book in the series to identify material in the subject area of the journal or series. When the seeker knows a word or two of the author name and a word or two of the title, you can apply techniques from subject searching, fielded searching and the Boolean AND, for a very precise search. Other searchable fields, such as journal name, publisher, or year of publication, can also be added to the search when those elements are known. After presenting some techniques for assisting the user with citation completion and full-text fulfillment, this chapter concludes with a discussion of the use of the digital object identifier (DOI) to locate a known item.
AUTHOR SEARCHES Author searches come in two basic flavors: 1. Users want to read, scan, listen to, look at, or watch a particular source that a person wrote, edited, illustrated, photographed, scripted, painted, performed, sculpted, or produced. There may
be more than one author/creator involved, and it’s possible that the author is a group or other entity, as is the case with the Brookings Institution and the annual reports, newsletters, and guides it produces. 2. Users want to scan a list of sources that a particular person wrote, edited, illustrated, or otherwise created because they like what the person creates, and they want to find more of their work. Here we’ll call it an author-bibliography search to distinguish it from the author search. This can sometimes be part of a reader’s (or listener’s or viewer’s) query for fiction, poetry, plays, movies, and music. For nonfiction, it can serve as a kind of subject search, similar to searching for articles in a journal by journal title but beginning from an author name search rather than a title. Impetus for Author and Author-Bibliography Searches The impetus for author and author-bibliography searches is an author name that is known to the user from past experience or others’ recommendations. Names can be inaccurate or ambiguous, but a well-executed reference interview can help you find the correct name. To find one or more works connected with the name that interests the user, you must get the name and more information about it from the user. Be attuned to the following: Whether the user seeks a particular title written, edited, illustrated, or created by this person or entity. This is the first information you should get from the user. If the user knows the title, then several avenues are open to you for finding the source: title, author, or a combination of the two. Because research demonstrates that people are more successful searching for titles than for authors when both are known (Kilgour 2004; Lipetz 1972; Wildemuth and O’Neill 1995), consider conducting a title search first. If that proves
unsuccessful, use the advanced search screen and try a word from the author name in the author field AND with known words from the title in the title field. The user’s confidence that the name is correct or clues about how it might be incorrect so that you can experiment with various names in your author searches. Characteristics of the person behind the name (i.e., economist, chef, feminist, conservative politician) so you know which database(s) to search. Where the person got the name so you can retrace their steps to find the complete and correct name. Problems Searching for Author Names and Proper Nouns Because author names are proper nouns, this discussion of author searches can be generalized to searches for proper nouns generally, such as names of persons, organizations, places, programs, and projects. People’s names change because of marriage, divorce, remarriage, Anglicization, stage names, pseudonyms, gender transitions, and deliberate changes of name (legal or otherwise). Authors themselves can supply variations of their own names, such as sometimes including and sometimes excluding a middle initial. Some publishers apply strict editorial rules about the names on their title pages, such as using the first and middle initials followed by a single or hyphenated surname. Names of organizations change because of mergers, acquisitions, buyouts, breakups, Anglicization, rebranding, adoption of acronyms, and the decisions publishers make about how organization names appear on their title pages. Family names and names of places change. Names of projects, programs, legislation, roads, buildings, monuments, brands, governments and government agencies, and other proper nouns change. Some may come to be known by nicknames, shortened forms, or acronyms. They may also adopt one-word names that are words used in everyday language (e.g., Apple, Amazon, Word). Names also bear numbers that could be
written out or represented as Roman or Arabic numerals. Because proper nouns also figure prominently in subject searches, all of this applies to subject searches for people and entities, too. Authority control is the editorial process used to maintain consistency in the establishment of authorized index terms. When library catalogers and database publishers practice name-authority control, they assume the burden of establishing authorized names for persons, corporate bodies, and families; linking all the unused, synonymous names to the authorized names; and building a syndetic structure into their databases to refer searchers from unused names to authorized names. Library catalogers maintain a record of their authority-control decisions in the Library of Congress Name Authority File (LCNAF). When you think a name may pose a problem, search LCNAF for alternate names (https://authorities.loc.gov). If you aren’t familiar with the MARC (machine-readable cataloging) format, LCNAF records may be difficult to understand, so display authority records using LCNAF’s “Labeled Display.” For example, in a catalog or other database governed by authority control, you can expect it to respond to your entry of the name babe didrikson with a reference to the database’s authorized name: “Use Zaharias, Babe, Didrikson, 1911– 1956.” Figure 9.1 shows the authority record indicating the form to use and listing variations that are the same person but are not the authorized version of her name. The problem of multiple names is not limited to women’s name changes due to marriage or divorce. Check LCNAF for these men whose authority records list multiple names: John Creasey, Pope Francis I, Prince, Barack Obama, and Stephen King. Depending on the author name being sought, use the “Personal name heading,” “Corporate name heading,” “See also,” and “Variants” fields to formulate the search statements you enter into relevant databases. LCNAF includes names of organizations, places, projects, programs, titles, and proper nouns generally. Other name authority files are the Getty Thesaurus of Geographic Names (TGN) at http://www.getty.edu/research/tools/vocabularies/tgn/ and the Union List of Artist Names (ULAN) at
http://www.getty.edu/research/tools/vocabularies/ulan/. Biographical databases, such as Marquis Who’s Who and American National Biography Online, may provide cross-references to the many names a person is known by.
Figure 9.1 Record for Babe Didrikson in the Library of Congress Name Authority File. Source: Library of Congress Name Authority File, https://id.loc.gov/authorities/names.html Authority control solves the problem of multiple names for an author. It doesn’t help with misspelled author names or variants. To be comprehensive about retrieval, formulate search statements bearing as many variant forms of the name as you can think of. Consider initials, initials only, missing name elements, punctuation, symbols, double letters, e before y at the end of a name, and other variations. Whether you are conducting a known-item search for works by an author or a subject search for works about the author, consider all the possibilities, and then formulate your search statements accordingly. If you suspect a misspelled name, try Google, Bing, or DuckDuckGo because these web search engines suggest corrections that you can cut and paste into another search system. Often they can detect and correct users’ misspelled author/creator names: illustrated books by gerry craft: all three search engines retrieve results using the corrected spelling of the author and illustrator’s name: illustrated books by jerry craft
flaco jiminez’s last album: all three search for the corrected name with and without the accent mark over the first e: flaco jimenez’s last album JVN podcast: all three search for the spelled-out name jonathan van ness podcast In academic publishing, the ORCID number (Open Researcher and Contributor ID) is improving the accuracy of manual and automatic approaches to authority control. ORCID is a nonprofit, member-supported organization working toward “a world where all who participate in research, scholarship, and innovation are uniquely identified and connected to their contributions across disciplines, borders, and time” (ORCID 2022). Participating academic, research, and professional institutions encourage anyone involved in scholarship to register free for an ORCID identifier that persists despite name changes throughout one’s professional career. Millions of researchers around the world and in all disciplines have created profiles. Web of Science, Scopus, and Dimensions offer ORCID identifier searches in addition to author name searches. Selecting Relevant Databases As soon as you establish the user’s interest in a particular name, ask for clarification, including, for an individual author, the person’s affiliation, discipline, or reputational attributes. For an organization, ask what kind of organization or business it is, what type of work or business it does or did, or what it is known for. Consult Wikipedia or another encyclopedia article to learn more about the proper nouns that figure into user queries. If the user wants everything written by the person, they may be conducting an author-bibliography search—a subject search in disguise. Such searches may be for names of modern-day researchers connected with academic institutions, government agencies, nonprofit laboratories, research centers, and think tanks or for names of writers whose literary works interest the user.
Search for researchers using your favorite web search engine to find the researcher’s web page, where their curriculum vitae is posted and provides you with a comprehensive list of their publications. Follow up with author searches of large general-interest and multidisciplinary databases such as Google Scholar, your academic library’s discovery system, Scopus, and Web of Science and then one or more discipline-based databases. Search for authors of literary works in your library’s catalog. For a comprehensive list of books, films, collected plays and poetry, and sound recordings, search WorldCat and suggest that the user request relevant items that your library doesn’t own through interlibrary loan. Depending on the user’s query, try resources such as Granger’s World of Poetry, ProQuest’s Literature Online or Latino Literature databases, Gale’s LitFinder, Alexander Street’s Black Drama or Asian American Drama, or EBSCO’s NoveList. For the patron wanting to watch a dramatist’s performing in a play, a poet reading their work, or a musician singing their own songs, consider searching the names of authors/creators on YouTube or Vimeo. Representing the Author Search as an Input to the System Formulating search statements for both author and authorbibliography searches is the same. Differences between the two pertain to the user’s handling of results. For the former, users are searching for one particular item. For the latter, users want all or selected works by this author. Although some search systems allow users to browse author names in alphabetical order by surname, it is more common for author fields to be word-indexed for searching that results in higher recall. With alphabetical browsing, you can scan the first element of the string for the desired author surname and then check first and middle names. Scanning is quick and almost effortless because you are likely to quickly spot the name you want in the list of alphabetized author names. Alphabetical browsing has several different names:
Author index Author/Creator (last name first) Author (last name, first name) Browse authors Look up authors The Library of Congress catalog browse function’s drop-down menu lets you scan author names in alphabetical order by last name. For the library patron seeking novels by someone with the last name Lessing, browsing offers a list where they can find the desired author, Doris Lessing, along with a see-also reference to her pseudonym Jane Somers, as shown in figure 9.2. University and public libraries typically offer word-indexed author fields, and a search for only the last name will yield many irrelevant results, unless the name is very distinctive. Browsing the Library of Congress Online Catalog by last name can be a helpful first step.
Figure 9.2 Library of Congress catalog browse for last name Lessing. Source: Library of Congress Catalog, https://catalog.loc.gov/ Other government catalogs also provide a browse feature. For example, the navigation bar on the Catalog of U.S. Government Publications web page offers a browse link that allows you to scan alphabetical listings by surnames of individuals or by the first word of corporate/agency names. Corporate in this case means a collective entity rather than an individual author or coauthor of a
work. For federal agencies as authors of their own publications, including defunct agencies, begin your browse with United States followed by the first word of the agency name, as shown in figure 9.3. The results will be in the form of an alphabetical list by agency name. If you leave off United States at the beginning, results will be from a variety of programs, projects, and other entities whose names begin with the first word you input in the corporate/agency browse box. Some scholarly databases also provide a browse-like function that can help identify the correct author. For example, the National Library of Medicine’s PubMed advanced search builder provides a drop-down menu with the option to search by author last name. As shown in figure 9.4, using the author-last search option allows you to check the author index to identify the correct author. In most commercial databases, the author field is word-indexed. You can input the author’s name in any order and get results, but some of the results may not be by the intended author. Most research databases don’t provide ready access to their inverted author index. The fields drop-down menu will list author name, but the search will be for words in the author field index. However, there may be another route elsewhere on the interface.
Figure 9.3 Catalog of U.S. Government Publications browse for United States agencies. Source: Catalog of U.S. Government Publications, https://catalog.gpo.gov/
Figure 9.4 Searching by author last name in the PubMed advanced search builder. Source: PubMed, https://pubmed.ncbi.nlm.nih.gov/ In EBSCOhost databases, click on “More” in the navigation bar above the search boxes and select “Indexes.” Choose “Author” in the “Browse an Index” box, enter the inverted form of the name, and choose from the alphabetical list of names by clicking the little box next to the author name you want (figure 9.5). You can launch the search from the index by clicking on the “Add” button to input the name into the search box and then clicking on the “Search” button.
Figure 9.5 Browsing the author index in Academic Search Ultimate and selecting names to add to the search box. By permission of
EBSCO Publishing, Inc. In ProQuest databases, choose “Author–AU” from the fields dropdown menu, which exposes the “Look up Authors” link under this menu. Click on this link, enter the inverted form of a name into the pop-up search box, enable the “Begins with” radio box, and select the “Find” button. For the most complete list to choose from, limit the names you search via alphabetical browsing to surnames and the initial of the first name so that ProQuest displays forms of names that are fuller than your entered elements. Click on all the boxes that seem to be the sought-for author, then click the “Add to Search” button without changing the default from the Boolean OR. Searches for names can confuse information seekers. They aren’t always sure whether the person or entity figures into their search as an author or a subject. Consider this chat session: USER
18:5 5 LIBRARIA 18:5 N 7
How do I find a journal on a specific person? The title is Philosophical Studies, the person is Aristotle. Under the first box type in Aristotle, and then in the second box type in Philosophical Studies.
Unfortunately, this chat is doomed from the start because the librarian doesn’t bother to find out what the user really wants. Consider the negotiation that is needed to help this user. You have to determine whether the user actually has a citation in hand for a journal article about Aristotle published in the journal named Philosophical Studies. If so, conduct a citation verification search. If not, search the Ulrichsweb Global Serials Directory (if you have access to it) to see if there really is a journal named Philosophical Studies and which databases index its articles, and then proceed accordingly. Most likely, the student’s instructor gave the class a directive to use sources from scholarly journals. Unsure what to do, the student poses a question to the librarian in a way that elicits a response that remains faithful to the professor’s directive about using sources from scholarly journals. If this is the case, you might advise the student to search a database that specializes in
philosophy, choose “Subject” from the fields drop-down menu, enter Aristotle into the search box, and review the filters on the results page for a particular aspect about Aristotle that is of interest.
TITLE SEARCHES Title searches are used under the following circumstances. 1. Users want to read, scan, listen to, look at, or watch the source. 2. Users want to scan a source’s contents because they know that it has published, issued, or broadcast information in the past on their topic of interest. They think that, by browsing the source’s contents, they might find more like it. Ultimately, this is a subject search in which the user browses the source’s content to find more like a piece read, scanned, listened to, looked at, or watched in the past. Impetus for the Title Search The impetus for the title search is an exact or not-quite-exact title that the user knows exists. Here are sample scenarios that describe how the source’s existence may have come to the user’s attention by a variety of routes: Someone else, such as a colleague, instructor, friend, or relative, recommended the title. The user found the title in the footnotes or bibliography of a book, a journal article, an encyclopedia entry, or credits from a film or television program. Several times the user has noticed this title in bibliographies, suggested readings, or retrieved sources, which are clues that it
merits closer scrutiny. In such scenarios, the exact title may not make the trip intact from its point of origin to your interaction with the user. Maybe the person who suggested the title got it wrong; the citation was incorrect; the user omitted, added, or transposed title words; or the user was relying on their memory of a search conducted in the near or distant past. As a result, always be skeptical about the titles users give you. To find the known item that the user wants, you must get its title and more information from the user. Be attuned to the following: The user’s confidence in the title’s correctness or how it might be incorrect so that you can experiment with various titles and title searches Where the user got the title so you can retrace the user’s steps to find it The title’s genre so you know which database to search The discipline that characterizes the article title to give you direction for choosing a database in this same discipline or a multidisciplinary database that is likely to index it Here’s a reference interview from a chat transcript in which the user describes her desire for a particular title: USER
15:0 Hi . . . I contacted you last week and someone 9 gave me an online e-book name to help me write a practicum report. I am looking for guidelines in writing practicum reports. I did another search and couldn’t find anything. Do you happen to know of the online book I can access? LIBRARIA 15:1 Hi. Searching now. Are you an Ed[ucation] student? N 0 USER 15:1 Yes, I’m getting my MSED in FACS to be a teacher. 2 LIBRARIA 15:1 . . . Still searching . . . N 5
USER
15:1 I am still going to look as well. The funny thing is 6 the title didn’t have the word practicum in it. I think that’s why I can’t find it. Thanks anyway for any help you can give me. LIBRARIA 15:2 Still looking. Checking with the Education Librarian N 0 ... The user’s uncertainty about the title makes this an exceedingly difficult search. You will conduct a known-item search for this title, but you’ll find that it eventually morphs into a subject search since the user provides so little information about the title, even though she used the e-book a week earlier. Selecting Relevant Databases for Title Searches Find out what the user knows about the desired title, and use this information to select a database. Search your library’s catalog to find titles of monographs, journals, annual conference proceedings, films, musical compositions, and maps. If you still can’t find the title, follow up with searches of the WorldCat database, which lists the holdings of member libraries worldwide. If the title is a journal article, conference paper, or unpublished working paper, search your library’s everything discovery system or Google Scholar (launching Google Scholar through your library’s database hub to avoid the paywall). If you come up empty-handed, follow up with searches in database and journal aggregators such as EBSCOhost, Gale, ProQuest, SpringerLink, and ScienceDirect and in multidisciplinary databases such as JSTOR, Scopus, or Web of Science. Representing the Title Search as an Input to the System Search systems may offer one or more types of title searches, including alphabetical browsing of the search system’s inverted title index and free-text searches of words and phrases. Alphabetical
browsing is an option when title fields have been phrase-indexed, meaning the search system will look for your input character by character. If you know the title’s initial words, choose alphabetical browsing because you are likely to quickly spot the desired title in the alphabetical list of retrieved titles. Alphabetical browsing is a standard feature of classic library catalogs, but it has become a threatened species in most other systems because of the popularity of free-text searches with relevance-ranked results (Badke 2015). Free-text title searching is possible when title fields are wordindexed rather than or in addition to phrase-indexed. The Library of Congress Online Catalog offers the option to browse the collection. Clicking on the “browse” link on the home page takes you to a page with a drop-down menu with options for browsing titles, authors, and subject headings, among other items. The default is “TITLES beginning with (omit initial article),” signaling that the user will be doing a phrase search and that results will be in alphabetical order. Figure 9.6 shows the results page for a title search beginning with sky islands. The catalog lists results alphabetically by title. The title beginning with search is almost effortless because it requires a minimal amount of data entry and finding the title on the first page of listed titles is a recognition task that takes a few seconds.
Figure 9.6 Browsing the Library of Congress Online Catalog by title. Source: Library of Congress Catalog, https://catalog.loc.gov/ Rather than browsing, you can do a keyword search using words from the title that the seeker remembers and limit the search to the title field index. Word-indexed title fields make it possible to find titles even when the seeker doesn’t remember the first word, all of the words, or the exact order of the words in a title. Most library catalogs provide a keyword search box with a drop-down menu for limiting the search to the title field index. Search systems may be programmed to include more than just the book title, however. Some library catalogs index the titles of all the chapters in a book, for example. Consequently, using keywords known to be in a book’s title may yield more results than the single surrogate for the desired book if there are other books containing chapters whose titles have the same words in them.
Research databases are likely to offer a drop-down menu that lets you limit your search to the journal name field or the document/article title field. For the former, choose “publication,” “publication title,” “journal name,” or “source” from the fields dropdown menu, and enter title words into the search box. For the latter, choose “document title,” “article title,” or “title” from the fields dropdown menu, and enter title words into the search box. Few research databases have alphabetical browsing, and when they do, it’s usually limited to the titles of journals, not the individual articles in journals.
CITATION VERIFICATION AND FULL-TEXT FULFILLMENT For the researcher or student who has an incomplete citation for an article, the EBSCOhost system provides a citation-matching search form, which you can find under the “More” link in the navigation bar at the top of the advanced search screen. It offers a search box for each element of a citation that the user knows. Figure 9.7 shows how you might search for a complete citation when all you know is the name of the journal, a keyword or phrase likely to be in the article’s title, and the probable year the article was published. Ovid databases also offer a citation-matching feature. To find it, click on “Journals” in the navigation bar, then on the “Find Citation” link under the journal name search box. For the scholar who remembers that vitamin K was in the title and the article was in “one of the chemistry journals, maybe Chemical Transactions or Chemical Proceedings,” one possible approach is shown in figure 9.8. Only the first word of the journal name is input and the default setting to truncate is left in place. With only a few results, it’s easy to browse through them for the desired item.
In databases that don’t provide a separate search page for citation matching, you can use the advanced search interface in any database that provides one and use the fields drop-down menus for the known elements of a citation. For example, using the advanced search screen in the Access World News database, you can help a researcher working on a documentary about the explosion of the Challenger space shuttle in 1986. The researcher seems to remember reading a story in Texas Monthly magazine about the local reaction to the government’s investigation of the accident. Figure 9.9 shows how you can use the information the seeker remembers to find the complete citation and the article itself. The ability to search full text in a database that indexes the magazine that published the story makes this an efficient search.
Figure 9.7 Citation-matching form with journal name, title phrase, and year of publication in the labeled search boxes. By permission of EBSCO Publishing, Inc.
Figure 9.8 Form for the “Find Citation” feature on the Ovid platform. Permission granted courtesy of Wolters Kluwer/Ovid.
Figure 9.9 A search for a known item using the advanced search screen in Access World News. Courtesy of Readex, Inc. When the seeker has a citation but needs the full text, the easiest approach is to input what you have (e.g., author name, title
words, publication name) in the library’s discovery search box and then scan the list of surrogates for links to the source. Searches may fail because of the ways in which citation data are represented in the surrogate record and indexed in the database. For example, databases might represent journal titles, conference proceedings, and book and anthology titles in full, acronym, and/or abbreviated forms. For example, here are PsycInfo’s indexed forms for the journal Psychiatric Quarterly: psychiat. quart. psychiatric quarterly psychiatric quarterly supplement Much the same happens to indexed forms for the Journal of the American Helicopter Society in the Compendex database: j am helicopter soc j amer helicopter soc journal of the american helicopter society If a citation verification or full-text search fails, the problem may be that your search statement doesn’t accommodate the variant forms of a journal’s title. You need to perform alphabetical browsing of the database’s journal-name index, scan for variant forms of the desired journal title, select them, and combine their results into one set using the Boolean OR operator. In EBSCOhost and ProQuest databases, look for the “Publications” tab in the navigation bar on the advanced search screen, and in Gale databases look for “Publication Search.” Unfortunately, not all search systems offer alphabetical browsing of journal-name indexes, so you might have to guess at variant forms, enter them directly, and/or use truncation to get the job done. Journal titles may change over time as well, and that can have a big impact on locating an older article published when the journal had a different name. Search Ulrichsweb for the journal name, then open the “Title History Details” tab for a list of previous names.
The Journal Run Marcia Bates (1989) suggests browsing issue by issue through the entire run of a periodical, or a selected time period depending on the topic, as an effective retrieval strategy when used with journal that is core to the subject being studied. Browsing issue by issue makes it possible to have perfect recall, since you know the exact contents of the issues and which ones are relevant to the seeker’s topic, but offers too much precision, since relevant articles from other journals are missed. Kacem and Mayr (2018) have found that information seekers typically initiate a journal browsing approach at the end of a session of subject searching, when they have developed an understanding of the available literature on their topic and have identified the key journals publishing most of the articles in their results sets. The journal run approach can also be used to find the full text of an article, especially for older articles that are indexed in databases but whose full text is not included. It’s a good last resort in such cases and in cases in which link resolvers don’t function properly. Navigate to your library’s website and scan it for a link to online journals. The link may be hidden under a drop-down menu, and the terminology used for the link will vary from library to library. Examples are: Digital journals list E-journals E-journals A–Z Electronic journals list Journal titles Journals Online journals Clicking on the online journals link takes you to a search box, into which you type the journal’s name. The system might give you a choice between alphabetical browsing of the database’s inverted title index or a free-text search of words in journal titles. Skim the
retrieved journal titles, and in response to your selection of the desired title, the system displays links for one or more full-text providers. Some providers supply full runs that go back to the journal’s first volume and issue published many years ago, and others are limited to volumes in a specific range of years. Make sure the provider supplies the desired journal for the years and volumes that interest the user. An alternate approach to finding online journals and printed backfiles in the stacks is a search of the library’s catalog. Use the fields drop-down menu to select the journal title search. If there’s no such menu, switch to the catalog’s advanced search screen, and look for it there. The catalog will link to the journal online and display the call number that indicates where printed issues can be found on the shelves. For incomplete citations to older articles, the user may have to browse issue by issue to find the needed item. Because browsing individual issues takes time, effort, concentration, and perseverance, the journal run approach, whether for subjects or known items, is for serious, motivated researchers and students. It is not for the casual inquirer who needs a handful of refereed journal articles to write a paper that is due next week. If searching the library website’s journal-titles index or the library catalog doesn’t yield the desired journal title, your library probably doesn’t subscribe to it. Make sure the seeker has a complete citation before referring them to the interlibrary loan service for the full text. In a public library, a reader may have an interest or hobby they’d like to learn more about by browsing magazines devoted to that interest or hobby. For example, a high school student interested in all things automotive might want to access their public library’s OverDrive app. In OverDrive, magazines can be sorted by title, popularity, and release date, among other options. On the left side of the OverDrive interface are filters, including one for subjects, where the student can click on the “cars and motorcycles” category. The magazine browse facilitates serendipitous discovery, a desirable, if often invisible, library service.
DIGITAL OBJECT IDENTIFIERS The digital object identifier (DOI) is a unique number assigned to digital items including books, articles, graphics in articles, movies, and other information resources (Paskin 2010). It takes the form of a string of numeric and alphabetic characters separated by a slash and preceded by the acronym DOI. The DOI assigned to an object is permanent, and even if the object moves, link resolvers make it possible to continue to access the object using its DOI. The system began to be used in 2000 and today more than five thousand publishers, science and data organizations, and other entities assign DOIs (DOI 2022). The unique identifiers appear in the margins of journal articles, book chapters, tables, and figures; in footnotes and bibliographies; and in surrogate records in databases. Figure 9.10 shows search results that include a DOI, in small type, for each individual item. If you have a source’s DOI, you are a few heartbeats away from finding the actual source, so make it a point to ask users whose queries involve known items whether they have a page from the actual source or another source that references the source they want. If they do, check it for a DOI. You can input the DOI in your library’s everything search box, and if the source is not retrieved move on to http://www.doi.org/index.html or to a web search engine, or simply input the entire string beginning with https://doi.org/ in your browser’s address bar. Using the DOI will retrieve the source’s surrogate and provide the source, if it is an open-access object, or enough information so that you can access it via your library’s website or order it via interlibrary loan.
Figure 9.10 OSTI.gov search results including DOIs. Source: U.S. Department of Energy Office of Scientific and Technical Information, https://www.osti.gov/
QUESTIONS Keep a record of the steps that you took to find answers to these questions, and then compare your steps with those described in the answers. 1. A professor has the first page of an essay in an anthology. The essay is titled “The Political Structure of the Federal Reserve System” and the author is Michael D. Reagan. The professor needs the full citation, including the names of the anthology’s
2. 3.
4.
5.
editors; the book’s title, publisher, and year of publication; and the page numbers for the essay. A user who has read all the library’s books by Dan Chernenko wants to know whether the library has more recent books by this author. A student is preparing a speech about media depictions of masculinity and wants to show a short clip of an old television commercial for Marlboro cigarettes featuring the Marlboro Man on horseback. They say that cigarette commercials were outlawed in 1970, so the commercials would have aired in the 1960s. Use the Internet Archive, https://archive.org/, to find one. You may also find some on YouTube. A user needs the full citation for a source titled “Competitive Sports Activities for Men,” which is about sports for elderly men suffering from memory problems. The user had trouble downloading the source, so they copied and pasted the source’s text into a three-page Word document and the only metadata on it is the title. A PhD student in the university’s political science department who’s studying how presidents communicate with their constituents wants to compile a complete bibliography of books and articles by Kathleen Jamieson.
SUMMARY When you conduct a search for a known item, you expect to retrieve a source that you and the person you’re assisting know exists. Examples are a journal article, book, conference paper, blog, organization, film, television commercial, journal, or other material. This chapter spotlights the two most common known-item searches: for author names and for titles. Although the seeker may need only to complete an incomplete citation or to locate the full source when
they have a complete citation, author or title searches can sometimes be subject searches in disguise. In such cases, the user wants to see a list of an author’s works to find more works like those the user knows about. Or the user wants to skim a particular periodical’s issues because the periodical has published relevant information in the past and the seeker assumes that browsing issues will reveal relevant articles. Known-item searching can be made more difficult because of the original citing source’s published inaccuracies; the faulty memories of the individuals recommending material to their students, colleagues, and family members; and the incomplete information users provide. Nevertheless, adapting the subject search techniques described in previous chapters, including the use of Boolean operators and fielded searching, and adding the specific known-item approaches introduced in this chapter can lead to efficient identification and location of elusive information.
REFERENCES Badke, William. 2015. “What Happened to Browse?” Online Searcher 39, no. 2 (March/April): 71–73. Bates, Marcia J. 1989. “The Design of Browsing and Berrypicking Techniques for the Online Search Interface.” Online Review 13, no. 5: 407–24. DOI. 2022. “Key Facts on Digital Object Identifier System.” https://www.doi.org/factsheets/DOIKeyFacts.html. Kacem, Ameni, and Philipp Mayr. 2018. “Analysis of Search Stratagem Utilisation.” Scientometrics 116, no. 2: 1383–400. Kilgour, Frederick G. 2004. “An Experiment Using Coordinate Title Word Searches.” Journal of the American Society for Information Science & Technology 51, no. 1: 74–80.
Lipetz, Ben-Ami. 1972. “Catalog Use in a Large Research Library.” Library Quarterly 41, no. 1: 129–39. ORCID. 2022. “About ORCID.” https://info.orcid.org/what-is-orcid/. Paskin, Norman. 2010. “Digital Object Identifier (DOI) System.” In Encyclopedia of Library and Information Sciences, 4th ed., edited by John D. McDonald and Michael Levine-Clark, 1325–31. Boca Raton, FL: CRC Press. Wildemuth, Barbara M., and Ann L. O’Neill. 1995. “The ‘Known’ in Known-Item Searches: Empirical Support for User-Centered Design.” College & Research Libraries 56, no. 3 (May): 265–81.
SUGGESTED READINGS Sprague, Evan R. 2017. “ORCID.” Journal of the Medical Library Association 105, no. 2 (April): 207–8.
ANSWERS 1. Find the anthologized essay titled “The Political Structure of the Federal Reserve System” by the author Michael D. Reagan. The professor needs the full citation, including the names of the anthology’s editors; the book’s title, publisher, and year of publication; and the page numbers for the essay. More recent surrogate records in library catalogs include a field where the titles and authors of the separate chapters published in anthologies are listed. For a book published in the 1960s or earlier, that may not be the case. A first approach for this query is to search the author and title in WorldCat, but no book is
retrieved, indicating that the record for the anthology in question does not include a list of the essays in the anthology. An article with the same author and title that was published in 1961 in the American Political Science Review is retrieved, however. The professor would prefer to cite the version they have the page for, so move on to a university library discovery system and input both author name and title. The results will include articles and books, but not necessarily the one from the 1960s. (This title seems to have been reprinted several times!) If the university library includes full-text books from the Internet Archive, you may find the book that way. If not, your next step is to search the Internet Archive directly, inputting the author name and putting the essay title in quotation marks to force a phrase search. On the results page, use the sort function to list results by publication date. Since the Internet Archive shows the book cover along with the metadata for each item, the professor may recognize the sought-for book by its look as well as by its publication date. Another option is to use Google Books, and on the results page use the time period drop-down menu and input a customized date range for 1960–1969. There’s only a snippet view inside the correct book, but once you have the book’s title, you will be able to use one of the other methods above to discover the page numbers where the essay appears. Citation: Reagan, Michael D. 1966. “The Political Structure of the Federal Reserve System.” In Money and Finance: Readings in Theory, Policy and Institutions, edited by Deane Carson, 202–12. New York: Wiley. 2. Find more recent books by Dan Chernenko. Check LCNAF for this author’s authority record (http://authorities.loc.gov/). Dan Chernenko is a pseudonym for Harry Turtledove, an author who writes under several names: Dan Chernenko, Mark Gordian, Eric G. Iverson, and H. N. Turteltaub. Follow up in your library’s catalog with author searches for each name, possibly limiting
your search by date; show the user your library’s holdings for each; and suggest pursuing those that are of interest. 3. Locate television commercials featuring the Marlboro Man. On the Internet Archive home page, click on the little television icon above the search box to go to the “TV News Archive” search page. Go for recall rather than precision, inputting only the words marlboro commercials. 4. A user needs the full citation for a source titled “Competitive Sports Activities for Men.” Based on what the user tells you about the source, you should establish a pecking order of databases likely to provide a citation. This appears to be a journal article, so search your library’s discovery system first, then Google and Google Scholar (launching the latter through your library’s website). If your searches fail, continue searching, choosing such database aggregators as EBSCOhost, Gale, and ProQuest. Also consider such current newspaper databases as Nexis Uni, ProQuest News & Current Events, and Access World News, in case the author of the article is quoted and the article title mentioned. A title search of all EBSCOhost databases and a Google search of the title phrase “competitive sports activities for men” retrieves the full citation: Tolle, Ellen. 2008. “Competitive Sports Activities for Men.” Activities Directors’ Quarterly for Alzheimer’s and Other Dementia Patients 9, no. 3: 9–13. 5. Compile a complete bibliography of books and articles by Kathleen Jamieson. If you don’t know who she is, ask the information seeker, but also search the name using your favorite web search engine to find variants, including middle initial, variant names, or variant name spellings. Check LCNAF also. Use the library’s discovery system to find books and articles, but also search subject-specific databases in the fields of communication studies and political science. The author may also have a website or an ORCID page with a list of her publications.
10
Assessing Research Impact This chapter focuses on the impact of the research publications retrieved in topic-oriented searches. Many databases now include at least a few indicators of impact, which can be helpful hints to information seekers about the credibility and usefulness of their search results. More to the point, indicators of impact are used to assess faculty and other researchers on the basis of their publications and related contributions to the body of knowledge in their field. Government agencies and private foundations that award grants to fund research projects evaluate grant proposals on a number of criteria before awarding the funds. A common criterion is the ability of the researchers proposing the project to carry it out, and their reputation among other researchers in the same field represents one factor in the assessment. One way to gauge researchers’ capabilities is to assess their research impact, and one of the main ways to do that is to consider how many other scholars have cited the publications of the researchers seeking funding. Research impact is also important for university faculty working toward lifetime tenure and promotion from assistant to associate and then to full professor. Award committees consider research impact when making decisions about recipients, as do publishers and professional associations making appointments to editorial positions or other roles requiring evidence of scholarly achievements. The databases, publishers, and services that help researchers document their impact on their disciplines and research fields are described in this chapter. Citation indexes make the footnotes,
endnotes, and sources listed in research articles searchable, making it possible to generate a cited-by set of results and some bibliometric analysis of the results. In most databases, an article’s footnotes, endnotes, and sources are referred to as references, while subsequent publications citing the article are referred to as citations. Some databases also include altmetric analysis of authors that takes into account evidence of impact beyond the bibliometric. Citation databases can be used for other purposes as well, including finding articles on a particular topic using the search techniques that by now are quite familiar to you. In addition to documenting influence and impact, the elements that go into the creation of bibliometric and altmetric tabulations bolster the information seeker’s quest for authoritative information about a topic: disambiguation of author names, information sharing and linking across platforms and publishers, and timely updates of article content and citations. Methods used to indicate author influence and impact can help the information seeker judge the credibility of the sources found and can lead to related material without the need to reformulate a search strategy. Although the focus is on scholarly publications, we’ll also consider impact in the context of popular works such as fiction, memoir, and comics.
IMETRICS The term iMetrics stands for all the methods scholars in various fields have developed to quantify and analyze information and knowledge in published form (Milojević and Leydesdorff 2013; Maltseva and Batagelj 2020). Among these are scientometrics for analyzing knowledge production in the sciences, cybermetrics and webometrics for the internet, and other metrics that have emerged with developments in information and knowledge distribution. The
method that has been around the longest and that remains important for gauging research impact is bibliometrics, with scholarly citation counts as the basis for quantifying the impact of an article, a journal, or an author. Altmetrics, a set of alternative measures meant to provide a holistic assessment of scholars and their work, extends traditional bibliometrics by including: mentions and discussions of a research article or a researcher on social and mainstream media; downloads of works in open access repositories and journals; external funding such as grants; a researcher’s peer reviewing and related editorial work; and acknowledgment of and reference to authors and their works in policy documents and discussions. Bibliometrics As originally conceived by Eugene Garfield (1955), indexing of citations allows scientists to avoid citing out-of-date or erroneous articles by finding later papers that update or correct and cite earlier papers. A scholar could use the backward-chaining technique to identify earlier and earlier works to finally arrive at the one expressing the new idea or method that was developed further in subsequent works. A scholar could also use forward-chaining, searching for the subsequent works that cite a book or article. A scientist could generate a list of all the articles citing their own work, thus building a case for their impact on their field of endeavor. Garfield further suggested that tracking others’ use of one’s ideas, methods, and findings could help a scientist understand their own work more deeply and help them discover other researchers investigating similar topics. Garfield and his colleagues released the first Science Citation Index (SCI) in the mid-1960s. They also created citation indexes for the arts and humanities and for the social sciences, which, with SCI,
make up the core collection in the Web of Science database available at many university libraries. Further, Garfield (1996) used citationindexing data to identify the most-cited science and social science articles from 1945 to 1992, calling them “citation classics.” To identify important journals that should be included in the citation indexes, Garfield and a colleague developed what they called the journal impact factor (JIF). It involves counting the number of times articles published in a particular journal are cited in the two years after the articles are first published. Garfield and his colleagues created an annual publication, Journal Citation Reports (now a subscription database) that librarians could use to identify highly cited journals they might add to their collections. Over time, the JIF began to be applied to the evaluation of faculty. The higher the JIF of the journals in which a faculty member published their articles, the more positively the faculty member was evaluated. Concerned about the inadequacy of the JIF as an indicator of the quantitative and qualitative impact of an individual researcher’s work, a group of scholarly publishers and editors meeting in 2012 issued the San Francisco Declaration on Research Assessment (DORA). More than twenty-one thousand individuals and organizations worldwide have signed DORA, which calls on funders, employers, peer reviewers, and other stakeholders to stop using only the JIF as the basis for decisions about individuals and instead use a variety of methods to evaluate a broader set of research-related activities. Similar expressions of concern about the misapplication of metrics and suggestions for improving their use have come from researchers, policy experts, scientometricians, and others (Hicks et al. 2015; Wilsdon 2015). Nevertheless, McKiernan et al. (2019) found that 40 percent of research-intensive universities in the United States and Canada still used the JIF when doing performance reviews of individual faculty members and when considering faculty for promotion and tenure. Quantification based on citation tallies also became entrenched in considerations of individuals’ research impact, despite criticism (Bornmann and Daniel 2008). E. J. Hirsch (2005) proposed the hindex, a single number representing the maximum value h of papers
a scholar has published that have been cited h times. At its simplest, the h-index means that a scholar has an index of 4 if four of their papers have been cited by others at least four times. The Web of Science and Scopus databases and Google Scholar can be used to find a scholar’s h-index. In the sciences, where millions of dollars in research funding are at stake, using the JIF or h-index can provide a useful at-a-glance measure that may encourage funders to take a closer look at researchers’ proposals. In the arts, humanities, and social sciences, where published research is less cited than in the sciences, such calculations may be less useful. Additionally, several studies have documented that women authors are routinely less cited than men, calling into question the fairness of using a single metric to signal research impact (Dion, Sumner, and Mitchell 2018; Chatterjee and Werner 2021; Zhang et al. 2021). Cited-by numbers tell only part of the story of scholarly reputation and research impact, even in the sciences, where bibliometrics have been entrenched for decades. Altmetrics Article-level bibliometrics are based on researchers citing the published work of other researchers. There can be quite a time lag before the first citations begin to be recorded. Researcher A becomes aware of researcher B’s article, cites it in a manuscript, submits the manuscript to a journal for review, receives an acceptance with some suggested revisions, and submits a final, revised version to the editor, who posts it on the journal’s online website. Then the article is indexed in databases and Google Scholar, and the citation counting proceeds. More immediate are altmetrics: indicators of people’s interactions with the written products of academic inquiry, scholarship, and research. These indicators are varied, ranging from views of a journal article’s abstract on the publisher’s website to tweets referencing something substantive in the article, to name only two possibilities. Fueling altmetrics are transactional, content, and usage data pertaining to the written
products of academic inquiry, scholarship, and research and generated by users of social media, open-access archives, search systems, and online reference managers such as Mendeley. Data science companies, such as Altmetric.com and Plum Analytics, market almetrics services to publishers and database vendors as well as to universities pursuing higher rankings and research funding agencies making difficult decisions about competing proposals. Their services involve mining data for references to scholars and to academic sources; matching references to indexed sources in a single database or across a platform; categorizing references according to the types of interactions users have with them; and scoring sources based on both interaction types and volume of interaction types. A database that is enhanced with altmetrics from Plum Analytics (2022) can produce an altmetrics report for a retrieved source that tabulates the following: Usage. Number of clicks, downloads, views, library holdings, and video plays, which signal people are reading the source. Captures. Number of bookmarks, favorites, readers, or watchers, which indicate people want to come back to the source. Mentions. Number of and links to blog posts, comments, reviews, Wikipedia articles, or news media reports, which demonstrate people are engaging with the source. Social media. Number of and links to likes, shares, tweets, or +1s, which measure the “buzz” around a source based on attention paid to the source and its promotion. Citations. Number of bibliographic, patent, clinical, or policy citations, which are traditional indicators of a source’s use in research and its broader societal impact. A database that is enhanced with data compiled and analyzed by the company Altmetric will display a summary page for a retrieved source that includes a distinctive Altmetric donut-shaped badge. A score is shown in the donut hole and encircled by color-coded bands
indicating the number of different outlets where the item was discussed, and a legend for each color is provided. The Altmetric summary page also includes tabs for the following:
news
blogs
policy documents
Twitter
Facebook
Wikipedia
Google+
research highlights
Dimensions database citations
Selecting a tab takes you to a page listing the items and providing links to the full texts of sources if they are available. For example, the "policy documents" tab will display citations for items published by the World Bank, European Union, United Nations, and other governmental and nongovernmental organizations, along with links to the full text of the documents that cite the retrieved source. The "Dimensions" tab displays items citing the retrieved source that are indexed in Dimensions, a database that links citations and other aspects of research impact across platforms.
Metrics in Databases
The huge multidisciplinary databases Dimensions, Scopus, and Web of Science feature highly developed metrics, although each presents them a bit differently. Google Scholar also offers bibliometrics—lists of cited-by references—and links to the author's ORCID page, where you can view the author's h-index scores, a list of their publications, links to the articles that have cited their publications, and links to coauthors and their work. On a smaller scale, bibliometrics and links to ORCID profiles can be found in collections offered by publishers, such as Taylor & Francis Online. Some open-access publishers,
whose journals are listed and indexed by the Directory of Open Access Journals, include iMetrics of one type or another on their sites. Some subject-specific databases, such as PsycInfo, offer cited-by links to articles indexed in the database, as well as links to authors' ORCID profiles.
DIMENSIONS
As a relative newcomer and a deliberate disruptor of the citation index and analytics business, Digital Science's Dimensions database operates a bit differently from Elsevier's Scopus and Clarivate's Web of Science (Thelwall 2018). Where other databases offer bibliometric and altmetric data, Dimensions links to additional forms of impact. Dimensions provides links not only to publications and their related citations, but also to related datasets, grants, patents, clinical trials, and policy documents. Digital Science updates the database by adding entities that have unique identifiers. One source of such records is Crossref, an association of publishers who share their publications in a database so that the DOIs can be used to link related works to each other. Digital Science also provides links to author profiles with ORCID numbers as well as items with PubMed identifiers. Also added are grant opportunity descriptions from grant-making agencies and organizations, and Digital Science mines acknowledgments sections of publications to identify relationships between grant-making organizations and the authors receiving their grants (Hook, Porter, and Herzog 2018). Digital Science offers a tiered pricing plan, allowing researchers free access to publications, citations, and datasets at https://app.dimensions.ai/discover/publication. An institutional subscription is needed to access information about grants, patents, clinical trials, and policy documents linked to publications. The free version does not require registration, but users who register can use
additional features such as keeping a list of favorite publications, exporting data for analysis, and linking their publications to their own ORCID profile. One of the most striking things about the Dimensions interface is its commitment to recall and ranking (figure 10.1). The indexed data is massive; in mid-2022, the database included more than 128 million records representing publications, including 32 million open-access publications, almost 12 million datasets, around 6 million grants, more than 146 million patents, about 700,000 clinical trials, and almost 750,000 policy documents. The interface does provide a search box at the top of the screen, with some limited field searching and the usual Boolean, truncation, proximity, and related functionalities. But in look and feel, the home page is essentially a results screen. Not a search results page, but instead, a list of the publication records in the database auto-loading down the middle of the screen, with the display defaulting to most-recent records first. A link near the top of the page allows the user to display datasets rather than publications. To the left of the results list are multiple filters for refining and winnowing. To the right of the results list are research subject categories and the number of publications (or datasets) in each; a chart depicting citation counts; a list of researchers beginning with the most cited; and a list of source titles (journal name, open-access repository, book), beginning with the most cited. Winnowing results for greater precision can be challenging, with no controlled vocabulary and author disambiguation that falls short of what may be achieved as development continues. The emphasis on recall and ranking reflects the purpose of the database, with its focus on linking and the determination of a scholar's research impact based on the amount and type of material linked. According to Hook, Porter, and Herzog (2018), "The aim of the database is not for a user to find a single citation or a particular grant (although that should be possible). The aim is to allow a user to quickly survey a field or to place an entity in context with ease" (3). They offer as an example the scholar who learns of a new area of research at a conference and turns to Dimensions to learn more. The scholar
inputs a few keywords and looks through the results to discover who’s publishing in the area and which institutions they’re affiliated with, who’s giving and getting grants for that kind of research, and which policymaking organizations are taking notice. Additionally, for universities and research centers with paid subscriptions, the database offers analytics documenting the institution’s research impact based on the overall bibliometrics and altmetrics of its faculty.
Figure 10.1 Dimensions home page. By permission of Digital Science & Research Solutions Ltd.
Given the way the creators of Dimensions have conceptualized its utility, established and new research faculty and graduate students are better served by its approach than undergraduate students are. As one reviewer put it, "Efficient and productive search success with Dimensions in its current level of development relies, in a small but notable part, upon the level of the searcher's a priori knowledge of the topic" (Duffy 2021). Rather than looking for information on a topic, users familiar with a research area can look for a known item or a known author and let the links around them lead to new publications and people. Although the Dimensions literature downplays known-item searching, it can be a good route in, especially since there are no search tips or other cues on the home page to help users wanting to research specific topics.
Finding a known author in Dimensions will reveal the references for and citations to their work, categories of the subjects they research, altmetrics indicating the attention they’ve received, the datasets they’ve created and used for their publications, their patents (if the user’s institution is a subscriber), and so on. Datasets themselves do not have metrics in Dimensions, but there are links to the repository where a dataset is accessible, and many repositories do track views and downloads. Although some also track citations, the citing of datasets separately from the articles associated with them has yet to become a standard practice (Khan, Thelwall, and Kousha 2021). To see how much influence an article published in Nature in March of 2017 has had, do a search for it as a known item in Dimensions, using the radio button under the search box to limit the search to the title and abstract field and putting the title in quotation marks so it’s searched as a phrase: “global warming and recurrent mass bleaching of corals”. If needed, filter the results to publication year 2017 to display the one correct record (figure 10.2). The results page gives the full reference for the article along with its abstract, followed by all the references in the article and then all the citations to the article by others. Down the right side of the screen are a Dimensions badge summarizing the publication metrics, an Altmetric badge summarizing the attention the article has attracted, a list of the agencies that funded the research, broad subject categories, MeSH terms, a link to the full text of the article on the publisher’s page (which may require payment or allow you access through your institution), and a link to the article’s record in PubMed.
Figure 10.2 Full record including publication metrics in the Dimensions database. By permission of Digital Science & Research Solutions Ltd.
The Dimensions badge for this article indicates that 1.7K (1,700) other items have cited it. Clicking on this badge leads to a "details" page indicating that 829 of the citations to the work have been published in the last two years, meaning the article was still being discussed and used by other researchers years after it was published. The details page also shows a field citation ratio (FCR) of 282.11. The text explains that "Compared to other publications in the same field, this publication is extremely highly cited and has received approximately 282 times more citations than average" for articles in the same research area that are about the same age. The details page also gives the relative citation ratio (RCR), which applies only to articles that received funding from the National Institutes of Health and which is a measure of how much an article has been cited when compared to other articles in the same research area. As with the FCR, anything over a score of 1 indicates above-average levels of use by other scholars working in the same research area; the coral bleaching article's RCR of 47.03 is high.
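At their core, both ratios are field- and age-normalized divisions: the article's citation count divided by the average count for comparable articles, so that 1 means average and anything above 1 means above average. The short Python sketch below illustrates the general idea only; the actual FCR and RCR calculations used by Digital Science and the National Institutes of Health involve more elaborate normalization, and the cohort average used here is invented for illustration.

    def citation_ratio(article_citations: int, cohort_average: float) -> float:
        # Citations relative to the average for articles in the same
        # research field and of about the same age; 1.0 means average,
        # and anything above 1.0 means above average.
        return article_citations / cohort_average

    # Hypothetical cohort average chosen so the ratio echoes the FCR
    # reported for the coral bleaching article.
    print(round(citation_ratio(1700, 6.03), 2))  # 281.92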
SCOPUS
Elsevier's Scopus database includes more than 84 million records, 18 million of them representing open-access items, 1.8 billion cited references, and 17 million author profiles. It also includes records for about 250,000 textbooks, monographs, edited volumes, and reference books (Scopus 2022). Scopus calculates journal impact using its CiteScore methodology. CiteScores are available for free on the Scopus website (https://www.scopus.com/sources.uri). To compare the two databases, repeat in Scopus the same search done in Dimensions. A search for "global warming and recurrent mass bleaching of corals" as a title in Scopus yields the one correct result. Click on the title to see the detailed, feature-filled record shown in figure 10.3. In a column on the right side of the "details" page is a list of all of the cited-by documents, that is, the articles that have cited the known item we retrieved. Scopus offers the following information:
Figure 10.3 Detailed record in the Scopus database. Copyright © Elsevier B.V. All rights reserved. Scopus® is a registered trademark of Elsevier B.V.
number of articles indexed by Scopus that have cited the article (1,499);
field-weighted citation impact, which takes into account different citation practices and frequency in different scholarly disciplines (61.99);
total number of views of the abstract in Scopus and clicks on the link to go to the publisher's website for the full text (693).
Clicking on the link labeled "More metrics" takes you to a page showing citations by year in chart form, along with options to customize the chart by changing the date range and eliminating self-citations and citations from books (figure 10.4). A summary of PlumX metrics shows counts of citations and views on other platforms.
Figure 10.4 Scopus and PlumX metrics for an article indexed in Scopus. Copyright © Elsevier B.V. All rights reserved. Scopus® is a registered trademark of Elsevier B.V.
Clicking on the link labeled "View PlumX details" takes you to a page that breaks down each PlumX metric category into finer subcategories (figure 10.5). Important to note on this page are the links to the full text of many of the alternative sources, including
tweets, blog posts, news stories, and policy documents. Including PlumX metrics provides a deeper understanding of the reach of this article and the influence of its authors. Beyond that, the search for research impact can lead students and other researchers to government documents, working papers, and other policy texts that can be scattered and difficult to find. For example, our search for this known item, a scholarly research article about the mass bleaching of corals, leads, through PlumX, to the full text of a working paper on geospatial practices related to sustainable development by the United Nations Economic and Social Commission for Asia and the Pacific as well as a series of fact sheets produced by Australia’s Great Barrier Reef Marine Park Authority. Although not all links lead to full texts, they do provide complete publication information, which may lead to a subsequent known-item search, depending on the information seeker’s project.
Figure 10.5 Complete PlumX metrics for an article indexed in Scopus. Copyright © Elsevier B.V. All rights reserved.
WEB OF SCIENCE
Until Elsevier launched Scopus in 2004, Web of Science was the only multidisciplinary suite of citation indexes available. Web of Science is useful for finding articles on a research topic. But it's known for its ability to help document a researcher's impact through its cited-reference search, which makes it possible to retrieve a known item and see all of the articles in the database that cite it. Acquired by Clarivate in 2016, Web of Science has been upgraded and now includes ways to see not only which articles have cited the known item you retrieved, but also how often it's mentioned and where it's mentioned in the citing articles (Clarivate 2021). Web of Science also includes author profiles featuring Author Impact Beamplots that provide more context on an author's research publication impact, and how their research has been received by other researchers over time, than the simple h-index provides (Szomszor 2021). Figure 10.6 shows an author's beamplot, which can be displayed after searching for the author's name in Web of Science. The beamplot visualization takes into account factors such as the research field and the typical citing behavior in that field as well as the time period during which publications were issued and then cited. Along with the visualization are the customary metrics for the citation network that includes the author's work: h-index, total number of publications, total times cited, and number of articles citing the author's work.
TAYLOR & FRANCIS ONLINE
Figure 10.6 Web of Science beamplot. By permission of Clarivate.
The Taylor & Francis Online database indexes some 4.8 million articles published in peer-reviewed journals issued by Taylor & Francis and Routledge. The home page offers a basic search box and a link to an advanced search form. Scrolling down on the home page takes you to an alphabetical list of subjects covered in the database that can be browsed. Clicking on one of the subjects, such as Bioscience, leads to a results page listing all the available journals and articles about the subject. In the case of Bioscience, the list includes 424,766 articles and, at a separate tab, 237 journals. Changing from the default display, order of relevance, to the most-cited ranking allows you to discover the most influential articles and authors in a broad subject field. Filters to the left of the results list make it easy to narrow results by article type, broad subject area, publication year, or journal title, and there's a search box for a search-within approach to narrow results. A surrogate record for an article lists several options: the user can look at the figures and data from the article, the references to sources listed in the article, citations by other authors to this article, and metrics. Clicking on the "metrics" link takes you to a page
summarizing the bibliometrics and altmetrics for the article, including the number of views (meaning downloads of the pdf or accessing of the html text) since its publication date. Three tallies of cited-by data are also provided: one from Web of Science, one from Scopus, and one from Crossref. An Altmetric.com badge showing the article's attention score is also displayed. Other publishers also provide some metrics for the articles they publish. In the open-access arena, searching for articles on the Directory of Open Access Journals site can lead to articles that include a selection of metrics, depending on the journal. For example, a search for matcha health benefits yields six articles, one of which is titled "Health Benefits and Chemical Composition of Matcha Green Tea: A Review." Clicking on the link labeled "read online" takes you to the article page in the journal Molecules, which is published by a scholarly organization in Switzerland. On the article page is the complete reference, the abstract, and buttons that allow you to view the full text, download the article as a pdf, view only the figures from the article, and export the article's citation. Metrics are at the bottom of the page and include the number of citations in Crossref, Scopus, PubMed, PubMed Central, and Web of Science, as well as Google Scholar (figure 10.7). All have links to their lists of items citing the Molecules article (except Web of Science, which is paywalled). There's also a chart showing full-text and abstract views over time and a link to more information about the journal, where you can find its impact factor, the number of articles it has published, and the number of its articles cited more than ten times.
Figure 10.7 Metrics for an article published in the open-access journal Molecules.
Open-access articles are indexed by Google Scholar and appear on results pages along with paywalled articles. Results include the complete reference for each result along with a snippet of content, the number of citations to the article and a link to each one, and links to related articles. If an author has a Google Scholar profile, their name will be a clickable link to their profile displaying a list of their publications ranked from most to least cited, the total number
of citations to all their work, their h-index, and Google Scholar's i10-index (the number of works with at least ten citations). Also shown is a bar graph depicting the number of citations by year. A simple path to an author's Google Scholar profile is to search their name in Google. The result will be a page about them, as shown in figure 10.8, with a short biography, links to their publications, and links to videos of their recorded speeches and scholarly presentations, among other things. On the right side of the screen is a link to their profile in Google Scholar, where, along with their h-index and i10-index, you will find a list of their publications with the number of citations to each one and a cumulative total—in this case, more than 28,000 citations to all of the author's publications. Not only can this approach be used to find out about a scholar's research productivity and impact, it's also a way to find references to important work on a topic if the user has the name of an important scholar working on the topic. It's also a great way to find an author bibliography, if that's what the information seeker needs. Clearly, the big three research metrics databases—Dimensions, Scopus, and Web of Science—offer the most elaborate metrics for gauging the impact of researchers and their publications. Nevertheless, metrics of one sort or another can often be found in subject-specific databases and on publisher and journal websites. While not as comprehensive as the multidisciplinary databases designed to showcase research impact, subject-specific databases, scholarly publishers, and individual journals providing even limited bibliometric and altmetric data can help information seekers evaluate the impact of an article and the reputation of its authors.
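Both the h-index discussed earlier in this chapter and Google Scholar's i10-index can be computed from nothing more than a list of per-publication citation counts. Here is a minimal Python sketch; the citation counts are made up for the example.

    def h_index(citations):
        # Largest h such that h publications have at least h citations each.
        ranked = sorted(citations, reverse=True)
        return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)

    def i10_index(citations):
        # Number of publications with at least ten citations.
        return sum(1 for count in citations if count >= 10)

    counts = [48, 22, 15, 10, 9, 6, 4, 1]  # hypothetical citation counts
    print(h_index(counts))    # 6: six papers have at least six citations each
    print(i10_index(counts))  # 4: four papers have at least ten citations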
Figure 10.8 Google page for Naomi Oreskes.
Once you start thinking about publications from the perspective of metrics rather than topics, you may start noticing other databases that feature indicators of impact for different kinds of material, including fiction. In a readers' advisory database, for example, popularity can be interpreted as impact, and reviews and readers' reactions to them can reveal a reviewer's influence. Online data makes it possible to assess impact and value beyond the "bestseller" designation.
GOODREADS
Goodreads is a book-oriented social media site whose reviews are created by registered participants. The work of readers and reviewers who contribute to Goodreads can be considered data to include in author metrics. PlumX calculates the number of readers and reviews of each scholarly book from Goodreads data, with the number of readers displayed in “Captures” and the number of reviews displayed in “Mentions” (Parkhill 2014). Kousha, Thelwall, and Abdoli (2017) analyzed the utility of Goodreads reviews for assessing the broader impact of nonfiction books beyond scholarly citations. They found that books in the arts, humanities, and social sciences were more likely to have reviews on Goodreads than books in the sciences. They concluded that scholarly authors use such reviews for self-assessment and not as metrics for academic performance evaluations, since it’s possible for authors and their friends to create accounts under pseudonyms and post positive reviews on the site. As a site whose content is created largely by its registered users, Goodreads offers two ways of looking at impact aside from scholarly altmetrics. The first is the impact of its reviewers and librarians, who can be considered authors since they are writing reviews of books. The second is the impact of the Goodreads Choice Awards, an annual list of the year’s best books based on the votes of registered users. In their computational analysis of Goodreads, Walsh and Antoniak (2021) point out that Amazon, the owner of Goodreads, uses a ranking algorithm that elevates reviews attracting the most likes and comments. This results in those highly ranked reviews being more visible and getting even more likes and comments. Taking that into account, it can nevertheless be useful to consider the impact of a particular reviewer and that reviewer’s contributions over time when evaluating whose taste to trust and why to trust it. Although not as thorough and systematic as the metrics used to judge scholarly work for professional academic purposes, the number of book reviews written by a Goodreads reviewer—and the number of friends following that reviewer—can be interpreted as a
gauge of influence. Users can read the opinions of popular reviewers specializing in a particular genre to discover new books to read. At the other end of the spectrum from the single influential reviewer are the crowdsourced determinations of best reading in the form of the annual Goodreads Choice Awards, which were based on close to five million votes in 2021. Clicking on the Choice Awards and scrolling down leads to a list of the winning books, sorted by categories such as fantasy, graphic novels and comics, history and biography, and humor. Clicking on the "view results" link leads to a list of nominated books ranked in order of the number of votes received. From there, clicking on the book title opens the full record for the item, including complete bibliographic information, a summary, the number of ratings and reviews on Goodreads, and the number of people registered on the site who are reading or plan to read the book. The categories that the book fits in are links for related books. Although there's no badge with an impact or attention score, many of these features are equivalent, in the popular reading context, to the scholarly citation databases that rank works by most cited (Goodreads votes), attention altmetrics (ratings, reviews, and reading), and lists of related works categorized into research areas (genres).
QUESTIONS
1. Use one of the citation databases discussed in this chapter to search an author name or article title. Identify the available metrics for the author or the article. Record the number of citations, the h-index, the altmetric score, and other indicators of research impact. What sorts of visualizations are available (badges, charts, tables, etc.)? What are the options for exporting data?
2. Using the same author name or article title you used in question 1, answer the same questions for a second citation database discussed in this chapter. What differences do you notice between the two databases? For example, are the cited-by numbers the same or does one database display more citations for the author or article? Is the h-index the same? See if you can figure out why there are different numbers for the same item in the two different databases.
3. Choose a book reviewer or librarian on Goodreads and look at their profile. What elements shown on their profile do you consider indicative of their influence over readers on the site? How useful do you think these elements are for you when you are looking for something to read, something to suggest to others, or something to add to your library's collection?
SUMMARY
Databases, scholarly publishers, and peer-reviewed journals offer iMetrics, which indicate the impact of research through bibliometrics (quantifying and linking citations to publications by other scholars) and altmetrics (quantifying and linking citations to news media, blogs, social networking sites, and policy documents). While information seekers doing searches for topics can find such metrics helpful for interpreting the credibility and usefulness of items in their sets of results, the real purpose behind the creation and presentation of iMetrics is to help researchers document their impact in the highly competitive world of academic life, from landing tenured positions to attracting grants in support of research projects. Three enormous multidisciplinary databases—Dimensions, Scopus, and Web of Science—have developed and continue to refine the metrics that indicate how a research publication has been received by other researchers and the larger public. The ability of
researchers to showcase their contributions to knowledge production in their fields affects their ability to secure tenure, win grants, and attract professional opportunities. Recognizing the utility of and demand for such measures, smaller subject-specific databases, scholarly publishers, and both paywalled and open-access journals offer at least some metrics. Researchers themselves recognize the importance of presenting comprehensive accounts of their impact and reputation and put in the extra time needed to keep their ORCID, Google Scholar, and other profiles up to date. Academic librarians have added support for faculty and other researchers by offering instruction and assistance to help them identify and interpret the metrics that represent their research impact. Some metrics are also available to gauge the influence of authors and readers who are not in the academy. Book-oriented social networking sites such as Goodreads that can be useful for readers’ advisory queries can offer data about how books are received in the form of reviews, readers’ professed intentions to read them, and registered users’ votes for best books of the year. As metrics continue to evolve, the information intermediation role must include a commitment to staying up-to-date on developments and learning how to apply them to benefit researchers, information seekers, and general readers.
REFERENCES
Bornmann, Lutz, and Hans-Dieter Daniel. 2008. "What Do Citation Counts Measure? A Review of Studies on Citing Behavior." Journal of Documentation 64, no. 1: 45–80.
Chatterjee, Paula, and Rachel M. Werner. 2021. "Gender Disparity in Citations in High-Impact Journal Articles." JAMA Network Open 4, no. 7. https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2781617.
Clarivate. 2021. "The New Web of Science Is Here." https://videos.webofsciencegroup.com/watch/eXyGvogVpnULkQFCqW7QNR.
Dion, Michelle L., Jane Lawrence Sumner, and Sara McLaughlin Mitchell. 2018. "Gendered Citation Patterns across Political Science and Social Science Methodology Fields." Political Analysis 26, no. 3: 312–27.
Duffy, Jane C. 2021. "Dimensions." The Charleston Advisor (July): 9–12.
Garfield, Eugene. 1955. "Citation Indexes for Science: A New Dimension in Documentation through Association of Ideas." Science 122: 108–11. http://www.garfield.library.upenn.edu/papers/science_v122v3159p108y1955.html.
Garfield, Eugene. 1996. "Citation Indexes for Retrieval and Research Evaluation." Consensus Conference on the Theory and Practice of Research Assessment, October 7. http://www.garfield.library.upenn.edu/papers/ciretresevalcapri.html.
Hicks, Diana, Paul Wouters, Ludo Waltman, Sarah De Rijcke, and Ismael Rafols. 2015. "Bibliometrics: The Leiden Manifesto for Research Metrics." Nature 520, no. 7548: 429–31.
Hirsch, J. E. 2005. "An Index to Quantify an Individual's Scientific Research Output." PNAS 102, no. 46 (November 15): 16569–72.
Hook, Daniel W., Simon J. Porter, and Christian Herzog. 2018. "Dimensions: Building Context for Search and Evaluation." Frontiers in Research Metrics and Analytics 3: 23.
Khan, Nushrat, Mike Thelwall, and Kayvan Kousha. 2021. "Measuring the Impact of Biodiversity Datasets: Data Reuse, Citations and Altmetrics." Scientometrics 126, no. 4: 3621–39.
Kousha, Kayvan, Mike Thelwall, and Mahshid Abdoli. 2017. "Goodreads Reviews to Assess the Wider Impacts of Books." Journal of the Association for Information Science and Technology 68, no. 8: 2004–16.
Maltseva, Daria, and Vladimir Batagelj. 2020. "iMetrics: The Development of the Discipline with Many Names." Scientometrics 125, no. 1: 313–59.
McKiernan, Erin C., Lesley A. Schimanski, Carol Muñoz Nieves, Lisa Matthias, Meredith T. Niles, and Juan P. Alperin. 2019. "Meta-Research: Use of the Journal Impact Factor in Academic Review, Promotion, and Tenure Evaluations." eLife 8. https://doi.org/10.7554/eLife.47338.
Milojević, Staša, and Loet Leydesdorff. 2013. "Information Metrics (iMetrics): A Research Specialty with a Socio-Cognitive Identity?" Scientometrics 95, no. 1: 141–57.
Parkhill, Mariane. 2014. "PlumX Adds Further Book Support with Goodreads Metrics." https://plumanalytics.com/plumx-adds-further-book-support-with-goodreads-metrics/.
Plum Analytics. 2022. "PlumX Metrics." https://plumanalytics.com/learn/about-metrics/.
Scopus. 2022. Scopus (fact sheet). https://www.elsevier.com/__data/assets/pdf_file/0017/114533/Scopus-fact-sheet-2022_WEB.pdf.
Thelwall, Mike. 2018. "Dimensions: A Competitor to Scopus and the Web of Science?" Journal of Informetrics 12, no. 2: 430–35.
Walsh, Melanie, and Maria Antoniak. 2021. "The Goodreads 'Classics': A Computational Study of Readers, Amazon, and Crowdsourced Amateur Criticism." Journal of Cultural Analytics 4: 243–87.
Wilsdon, James. 2015. The Metric Tide Executive Summary: Report of the Independent Review of the Role of Metrics in Research Assessment and Management. https://www.ukri.org/wp-content/uploads/2021/12/RE-151221TheMetricTideFullReportExecSummary.pdf.
Zhang, Lin, Gunnar Sivertsen, Huiying Du, Ying Huang, and Wolfgang Glänzel. 2021. "Gender Differences in the Aims and Impacts of Research." Scientometrics 126, no. 11: 8861–86.
SUGGESTED READINGS AND VIEWINGS
Dimensions YouTube Channel. https://www.youtube.com/@dimensions8879/featured.
Lasda, Elaine M. 2020. "What's in a Metric? Data Sources of Key Research Impact Tools." Online Searcher 44, no. 1: 57–59.
Scopus YouTube Channel. https://www.youtube.com/c/Scopus_Elsevier.
Web of Science Training. https://www.youtube.com/user/WoSTraining.
ANSWERS
1. In Dimensions, I found an article titled "A Survey of Autonomous Driving: Common Practices and Emerging Technologies," by Ekim Yurtsever et al., published in IEEE Access in January 2020 (https://doi.org/10.1109/access.2020.2983149). Within two years of its publication, it had received 299 citations, as shown in its Dimensions badge. With an FCR of 221, it is considered highly cited when compared to other publications in the same field and of roughly the same age. The Altmetric badge shows a score of 24, putting it in the top 25 percent of research publications scored by the company, and the amount of attention it received in two years put it in the top 90 percent. It doesn't seem to have many mentions in traditional or social media, but 935 readers added it to their Mendeley library.
2. In Scopus, the article has 228 citations by other articles indexed in the database, putting it in the 99th percentile. Its FWCI is
39.76, meaning it is highly cited for an article of its age and in its research field. It has a total of 58 views in Scopus. According to the PlumX metrics, it has 196 citations on Crossref and the same number of Mendeley readers who added it to their library, 935, as Dimensions reported. Since Scopus counts only the citing articles it indexes, the cited-by figure is smaller than it is in Dimensions. Researchers will probably want to use Dimensions for total counts of citing articles, since that figure will usually be larger than any count limited to a single database indexing a curated collection of journals. 3. When evaluating a reviewer on Goodreads, look at how many people follow them, their ratings, and how highly they are ranked. If they link to their Twitter, Bookstagram, and YouTube accounts, take a look at those to see the sorts of things they post and who their followers are. But also consider qualitative information, such as how well written and thorough their reviews are, the manner in which they answer questions and comments, and, of course, the genres they read and review. If you work in readers’ advisory, you can compare your library patrons’ comments about particular books (and perhaps your own reading of those books) to the Goodreads reviewer’s comments. This can give you a sense of the similarities and differences regarding what appeals to each reader/reviewer and help you determine if a particular reviewer can be relied on to introduce you and your patrons to other books that may also be appealing.
11
Search Strategies
The definition of search strategy is a "plan for the whole search" (Bates 1979, 206). Thinking through the complexity of the topic and the information seeker's purpose will help you consider the whole search as you plan your approach. You may already have a single search strategy that you use over and over. This chapter provides other options, making explicit the features and uses of five search strategies. These strategies should suffice for almost all online searches that you conduct in Boolean and web-based search systems. When choosing a search strategy, there are several factors to take into account. Some are straightforward, such as whether the user's query is single- or multifaceted and whether the search system has a particular functionality. Others are more complex, requiring your judgment about the importance of the query's facets relative to one another and the user's overall grasp of the topic. The examples offered in this chapter will help you choose the right search strategy for the job. Just be aware that you may begin with one strategy and shift into another as circumstances warrant. As with all online searching, remain flexible no matter which "plan for the whole search" you choose at the beginning.
BUILDING BLOCK SEARCH STRATEGY
The building block search strategy is for multifaceted subject queries. It requires the searcher to develop each facet of the query separately and complete a subsearch for each one. Then the searcher assembles the individual subsearches in a logical manner. Figure 11.1 diagrams this strategy using blocks to represent sets of results for the individual facets. The top block represents the results of a logical combination of two of the facet sets, but any number of sets can be combined in whatever ways serve the information seeker's purpose. The building block search strategy comes in two editions: (1) the buffet edition for Boolean search systems and (2) the à la carte edition for web-based systems. These names are analogies for the ways in which searchers represent search terms in their search statements. The buffet edition is reserved for Boolean systems, equating the buffet to the database, the buffet tables to the query's facets, and the various foods that you choose from the buffet tables to the several search terms that make up your search statements for each facet. Choosing several food items per buffet table is like choosing a controlled vocabulary term and its relevant broader, narrower, and related terms from the database's thesaurus to construct search statements for each of the query's facets. When using web search engines, the à la carte edition involves entering search statements composed of one search term per facet. Use the most salient search term per facet. For most queries, this means entering the names of the query's facets. The building block search strategy has several benefits: It produces a clear search history that is easy to follow while you are conducting the search, to review and understand later,
and to explain to the user. The search history may read like the actual negotiated query. The results address all aspects of the topic that interest the user. Once the searcher scripts a search that conforms to the building block search strategy, executing it requires less judgment on the part of the searcher, and thus, this strategy appeals to aspiring intermediary searchers who are less confident about their ability to make spur-of-the-moment adjustments online. The drawbacks of the building block search strategy include the following: The searcher enters and combines search statements for all facets of the query, when, in fact, fewer facets may be needed to produce relevant results. The searcher has no idea how many results will be retrieved until the search’s final moment, when sets for each facet are combined using the Boolean AND.
Figure 11.1 Building block search strategy.
The building block search strategy deserves emphasis as it is generally a good strategy for multifaceted subject searches. You could apply this strategy to all multifaceted queries, but you will encounter queries for which there are better strategies, and they deserve your consideration.
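The buffet edition's logic can be summarized mechanically: OR together the synonymous terms within each facet, then AND the facet sets together. The Python sketch below builds such a search statement, using the video game violence query discussed later in this chapter; the facet terms are illustrative examples rather than prescribed controlled vocabulary, and real systems differ in their syntax for phrases and operators.

    def building_block_query(facets):
        # OR the synonyms within each facet, then AND the facets together.
        blocks = []
        for terms in facets:
            quoted = [f'"{t}"' if " " in t else t for t in terms]  # quote phrases
            blocks.append("(" + " OR ".join(quoted) + ")")
        return " AND ".join(blocks)

    print(building_block_query([
        ["video games", "computer games"],       # facet 1
        ["violence", "aggression"],              # facet 2
        ["adolescents", "teenagers", "youth"],   # facet 3
    ]))
    # ("video games" OR "computer games") AND (violence OR aggression)
    #     AND (adolescents OR teenagers OR youth)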
CAN'T LIVE WITHOUT THIS FACET FIRST SEARCH STRATEGY
The can't live without this facet first search strategy is also for multifaceted subject queries. Over the years, it has been known by other names: most specific facet first, lowest-posted facet first, successive fractions, or the big bite strategy (Meadow and Cochrane 1981). It requires the searcher to assess the facet analysis for a query and determine which facet must be represented in the results to be considered relevant. It is a way of identifying the one essential element of a topic. The name for this strategy is particularly vivid, and it has caught on with students more readily than its other names. The most specific facet is one that is not likely to suffer from any vagueness of indexing. It may be a proper noun, such as a piece of legislation, a person's name, or a concept that is limited in the ways in which it can be expressed using natural language, free text, or controlled vocabulary. The lowest-posted facet is the one with the fewest results, meaning it may not turn out to be essential for the search. You may be able to draw on your online searching experience to determine which facet is likely to be the lowest posted in the
database you've chosen. With successive fractions or the big bite approach, the set of search results becomes smaller at each step in the search process. The can't live without this facet first search strategy works in the same way no matter what it is called. Figure 11.2 depicts the can't live without this facet first search strategy. A query's facet analysis may yield several facets, but the searcher has identified one that they can't live without. That is, the user would be dissatisfied with every item in the final result set if this facet were not represented. The searcher builds a set of results for this facet first. In a Boolean search system, the searcher crafts a search statement for the essential facet, using several synonymous terms combined with the OR operator. The searcher also assesses the number of results and, if few are retrieved, ponders whether any of the other facets are necessary. Making the decision to continue, the searcher crafts a search statement for one of the optional facets, using several synonymous terms combined with the OR operator, and then combines the set of results using the Boolean AND operator with the set of results for the essential facet. The searcher's choice of which optional facet to search for second can be based on which facet they think will produce relevant results. The searcher combines sets for the essential facet and the second optional facet and again assesses the number of results, deciding whether more facets are necessary or whether the results are sufficient.
Figure 11.2 Can't live without this facet first search strategy.
Taking into account results for the can't live without this facet first search and the subsequent optional facets may compel the searcher to end the search before all facets have been incorporated. Here's an example using the negotiated query "What effect does video game violence have on adolescents?" The query's facets are: Video Games AND Violence AND Adolescents. Which facet is this query's essential one? If you chose Video Games, you are right! Build a set of results for the Video Games facet first. A search for the exact subject descriptor "video games" in ProQuest's ERIC retrieves around 2,500 results. This number is still too many to review manually, so continue the search, building a set of results for one of the optional facets. Which facet would you tackle next—Violence or Adolescents? If you are unsure, think about the results that will be produced by the Boolean AND combination of this query's essential facet and the two optional facets:
Video Games AND Violence Video Games AND Adolescents Which combination’s results would the information seeker rather have? It might be wise to use the combination bearing the Violence facet because its results would more likely address the topic. Even though the Adolescents facet isn’t introduced into the combination, reviewing results to determine whether they are or aren’t about adolescents would be an easy and straightforward task. In contrast, using the Adolescents facet in the first optional facet search could retrieve results not relevant to the topic, such as which video games adolescents prefer, educational video games for adolescents, the impact of excessive video-game playing on adolescents’ academic performance, and much more. Combining the essential Video Games facet and the secondary Violence facet, input as an exact subject descriptor, reduces ERIC results to fewer than 150. When searching Boolean systems, keep a running total of your results each time you combine search statements using the Boolean AND (or possibly NOT) operator, just in case a combination retrieves a significantly lower number of results. In many systems, you can rely on the search history feature to keep a running total for you. Knowing that the two facets Video Games and Violence yield fewer than 150 results should make the searcher pause and ponder whether the third facet, Adolescents, is necessary. Scanning the first page of results reveals several titles bearing the words youth or adolescents in article titles or journal titles. Does it make sense to stop here, casting a wide net and advising the user to scrutinize results for research that studies adolescents? Should the searcher continue searching, entering terms for the Adolescents facet? What information could the user provide to help you make this decision? Of course, it’s easy enough to go ahead and input the final facet in the third search box and see what happens. If there are too few results, you know you have the next-to-last search results to go back to.
When searching reference databases, the can't live without this facet first strategy is the way to go. For example, if the information seeker asks in what year zebra mussels were discovered in the Great Lakes, the facets and logical combination are: Zebra Mussels AND Great Lakes. Choose a subject-oriented encyclopedia or dictionary to answer this query, starting with a single word or phrase to represent its can't live without this facet, Zebra Mussels. Entering the search statement zebra mussels into the online McGraw-Hill Encyclopedia of Science and Technology retrieves a three-page entry entitled "Bivalvia." Scanning it for a discussion of zebra mussels yields this sentence: "In 1985, zebra mussels (genus Dreissena) native to the Caspian Sea area in Eurasia, were discovered in Lake St. Clair between Ontario and Michigan [and] . . . were probably transported from Europe in the ballast water of a freighter entering the Great Lakes" (2007, 144). A follow-up search in Wikipedia using this same strategy confirms the theory about ballast water, but the source used for this information is from 1988. To reconcile these two different dates, search usa.gov, where the second result links to the National Invasive Species Information Center's page stating that zebra mussels were added to the Nonindigenous Aquatic Species database managed by the U.S. Geological Survey in 1988. Another ready-reference query answerable by an encyclopedia involves who the speaker preceding Martin Luther King Jr. was on the day King gave his "I Have a Dream" speech and what the speaker said. The facets and logical combination are: Speakers Before AND Martin Luther King AND I Have a Dream. Try Wikipedia for this query's can't live without this facet, I Have a Dream. Wikipedia's entry exactly matches this facet's name. The entry doesn't mention the speaker preceding Dr. King, but it names and provides a link to the event at which the speeches were given. Clicking on the event's link, "March on Washington for Jobs and Freedom," takes the user to a list of the speakers and displays the program, revealing Rabbi Joachim Prinz as the speaker preceding Dr. King. The entry bears a link to external audio files, and clicking on it
produces a listening guide that summarizes each speech’s content, including Rabbi Prinz’s speech. Except for lengthy encyclopedia articles, entries in reference sources are usually brief, so using the can’t live without this facet first strategy is especially efficient for targeting an entry in a reference source. You just have to put time into reading the entry or following its links to other entries you might not have considered. Often, the key to answering reference questions is identifying the correct source for answering the question. Choosing sources devoted to a particular subject means you can search for only the most distinctive facet; in effect, the source has already limited results to the other facet or facets. For example, a user wants to know where African American circus performers wintered during the early twentieth century. Instead of representing this query’s Twentieth Century and African Americans facets in search terms, you can include them automatically by selecting the Black Historical Newspapers and the African American Newspapers databases and by limiting results to early twentieth-century publication dates. All that’s left is searching for the Circus facet, using the facet name and the names of famous circus performers and troupes (if known) as search terms. Reading the retrieved articles may reveal where they spent the off-season. The benefits of the can’t live without this facet first search strategy are: It requires less time and effort on the part of the searcher. It permits the search to be completed at an earlier point in time than the building block search strategy does. It retrieves relevant sources that may be missed when all of the query’s facets are represented in the search. The drawbacks of the can’t live without this facet first search strategy are: Sometimes the query has no obvious can’t live without this, most specific, or lowest-posted facets that make this strategy
possible. Determining the lowest-posted facet may be difficult in the absence of such search aids as thesauri, especially if the searcher lacks experience searching databases. Failing to represent all facets in the search may result in many postings that the user must scan to find the relevant ones.
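Expressed procedurally, the strategy builds the essential set first and ANDs in optional facets only while the result count remains too large to review. The Python sketch below is schematic: run_search is a stand-in for whatever search system is in use (no real system exposes this function), and the cutoff of 200 results is an arbitrary illustration of "few enough to scan."

    def cant_live_without_first(run_search, essential, optional, small_enough=200):
        # Start from the essential facet; AND in optional facets one at a
        # time, stopping as soon as the result set is small enough to review.
        statement = essential
        count = run_search(statement)
        for facet in optional:
            if count <= small_enough:
                break  # few enough results; skip the remaining facets
            statement = f"({statement}) AND ({facet})"
            count = run_search(statement)  # keep a running total at each AND
        return statement, count

    # With the chapter's ERIC example, run_search("video games") might report
    # about 2,500 results, and ANDing in "violence" drops the count below 150,
    # so the Adolescents facet may never need to be entered.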
PEARL GROWING SEARCH STRATEGY
The pearl growing search strategy is a series of searches conducted to find relevant search terms to incorporate into follow-up searches (Markey and Cochrane 1981). It is the most interactive of all the search strategies, requiring your full attention as you:
scan the first set of results for relevant terms,
assign the terms to the appropriate facets,
make on-the-fly decisions about representing them using the database's controlled vocabulary or free-text terms,
formulate search statements, and
combine these statements to produce relevant results.
Useful for Boolean systems and web search engines, the pearl growing search strategy works well when you have little time to prepare for the search in advance. The pearl growing search strategy is not only a series of searches, but also a series of search strategies. You may start with the can't live without this facet first strategy, transition to pearl growing to retrieve more results, and conclude with the building block strategy as a result of finding several relevant search terms per facet through pearl growing.
To initiate the pearl growing search strategy, enter a search that is faithful to the à la carte edition of the building block search strategy. Do so regardless of your chosen search system. Enter the most salient search term for each facet; for many queries, these are the names of the query’s facets. Review results; identify relevant terms in the title, subject, author-supplied keywords, and abstract fields of surrogate records; distribute them into their respective facets; and formulate everything into search statements. If you fall short of relevant terms, skim full texts for more. Then follow up with a search using the building block or can’t live without this facet first strategy and the search terms you gathered from surrogates and full texts. Figure 11.3 depicts the pearl growing search strategy as a set of results for a central term, such as a keyword, and the additional results that can be retrieved by mining the first set for descriptors and synonyms. The pearl can keep growing with additional keywords, related and narrower subject terms, and classification codes.
Figure 11.3 Pearl growing strategy.
Let's walk through an online search using the pearl growing search strategy. The negotiated query is "Is religious practice a necessary ingredient to prevent alcoholics from relapsing?" This query has three facets and its logical combination is: Religion AND Alcoholism AND Prevention. Choose EBSCOhost's PsycInfo database, and conduct a natural-language search using this query's three facet names as keywords. Such a search enlists the à la carte edition of the building block search strategy. The EBSCOhost system gives you several ways to gather search terms from this search's 122 results:
Clicking on the "Subjects: Major Headings" and "Subjects" links in the list of filters on the left side of the results page to see
subject descriptors for the primary topics covered in the set of results
Clicking on the "Classification" filter's link
Scanning full surrogate records to see titles, author-supplied keywords, and abstracts
Following up on the most frequently occurring subject descriptors by browsing them in the APA Thesaurus of Psychological Index Terms, linked in the APA PsycInfo navigation bar, to identify relevant broader, narrower, and related terms.
Table 11.1 shows relevant search terms collected from the "Major Headings" and "Subjects" filters, as well as keywords from the titles, abstracts, and other fields on the surrogate records. It distributes them into facets and identifies them as SD (subject descriptors) or CC (classification code captions).
Table 11.1 PsycInfo Subject Descriptors and Classification Captions Extracted from Results Using the Pearl Growing Search Strategy
Facet: Religion
  religion (SD, CC)
  spirituality (SD)
  religious beliefs (SD)
Facet: Alcoholism
  drug & alcohol usage (legal) (SD)
  alcoholism (SD, CC)
  alcohol intoxication (SD)
  alcohol abuse (SD)
  alcohol drinking patterns (SD)
Facet: Prevention
  health & mental health treatment & prevention (CC)
  drug & alcohol rehabilitation (CC)
  prevention (SD)
  sobriety (SD)
  drug rehabilitation (SD)
  drug abuse prevention (SD)
  alcohol rehabilitation (SD)
  alcoholics anonymous (SD)
There are so many relevant PsycInfo subject descriptors and classification captions for each facet that it might be overkill to use them all; as always, consider the level of recall needed for the information seeker’s purpose. Other queries, such as those involving new or obscure topics, might present more of a challenge, particularly when identifying subject descriptors. In such cases, the searcher may choose search terms from titles, author-supplied keywords, abstracts, and possibly full texts. With the experience of the initial à la carte search, you know that there is plenty of information on the alcoholism topic. Follow up with a full-fledged Boolean search using table 11.1’s subject terms and the buffet edition of the building block search strategy. Figure 11.4 shows the search history for the follow-up search. When the three sets are combined with AND, the results include such relevant titles as “The Role of Spirituality in Treatment Outcomes Following a Residential 12-Step Program,” “Spirituality during Alcoholism Treatment and Continuous Abstinence for One Year,” and “Alcoholics Anonymous: Cult or Cure?” The pearl growing search strategy is often used as a follow-up to searches yielding small sets of results. A perfect example is the eating disorder query, a search topic that consistently produces few relevant results for keyword and subject descriptor searches. You should be able to double this query’s relevant results using the pearl growing search strategy, and then put your new relevant results to work finding even more relevant results using follow-up citation,
cited-reference, author-bibliography, and find-like searches. Such extensive searching is characteristic of the type of assistance given to a doctoral student, faculty member, or other researcher who is investigating a new topic for which there is little in the published literature or conducting a comprehensive search for a literature or systematic review. These are the benefits of the pearl growing search strategy:
Figure 11.4 Results sets for controlled-vocabulary searches as part of a pearl growing strategy. By permission of EBSCO Publishing, Inc.
It has the potential to find additional relevant results for searches conducted in Boolean systems and with web search engines. It can be used alone or with other search strategies to find relevant sources. Following up with more than one search can increase the size of results sets using searches governed by the building block search strategy or can't live without this facet first strategy. When conducting a pearl growing search from scratch, little presearch preparation is needed, beyond entering facet names using natural-language search terms. These are the drawbacks of the pearl growing search strategy: You must be a confident searcher, knowledgeable about online searching, and thoroughly familiar with the search system's
capabilities to accomplish the on-the-spot thinking and search formulation that the pearl growing search strategy requires. Following fruitless paths could result in greater use of the search system and more time spent online than you or the end user had expected. For example, if you head down an unproductive path, you might have to save results, ponder them offline, and then return to the search. Using relevant results to find additional ones can get complicated, requiring the full spectrum of the searcher’s expertise as well as the information seeker’s knowledge about the subject matter. Experience, patience, and perseverance are required for conducting effective pearl growing searches.
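Because pearl growing is inherently iterative, it can be summarized as a harvest-and-requery loop. The following Python sketch is schematic rather than executable against any particular system: get_results and extract_terms stand in for running a search and for scanning surrogate records for descriptors and keywords, neither of which is a real API.

    def pearl_growing(get_results, extract_terms, facets, rounds=2):
        # facets maps each facet name to a growing set of search terms,
        # seeded with one salient term per facet.
        for _ in range(rounds):
            statement = " AND ".join(
                "(" + " OR ".join(sorted(terms)) + ")" for terms in facets.values())
            for record in get_results(statement):
                # Mine titles, subject descriptors, and abstracts for new
                # terms and distribute them into the facets they belong to.
                for facet, new_terms in extract_terms(record).items():
                    facets[facet].update(new_terms)
        return facets

    # Seeded with the chapter's alcoholism query:
    # pearl_growing(get_results, extract_terms,
    #               {"Religion": {"religion"}, "Alcoholism": {"alcoholism"},
    #                "Prevention": {"prevention"}})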
SHOT IN THE DARK SEARCH STRATEGY

The shot in the dark search strategy is for single-facet queries. The Free Dictionary gives two definitions for the phrase "shot in the dark": (1) "a wild unsubstantiated guess," and (2) "an attempt that has little chance at succeeding." Let's examine single-facet queries to determine what is going on with this strategy's name. Some queries that are candidates for the shot in the dark search strategy consist of one word:

fashion
ADHD
censorship
Oracle
Others are expressed in several words that form a phrase so familiar and commonplace that it has come to represent a single idea, concept, or theme. Examples are:

capital punishment
graphic novels
cerebral palsy
global warming

Scanning these queries should leave you puzzled, wondering what it is about each one that interests the user. You might even say to yourself that these queries are impossible, that even if you searched for one of these, there would be so much information that the user couldn't possibly read it all, and that the reference librarian didn't conduct a thorough interview to find out which aspects of these topics interest users. And you would be right. Such single-facet subject queries are impossible to answer. When you detect a single-facet subject query, negotiate the query, and if the user insists on sticking with the single-facet query and is in a hurry, do a quick-and-dirty search in a relevant database. It's quick because no one has given the topic much deliberation, and it's dirty because the set of results will have a lot of irrelevant material. Nevertheless, the results page may help the user identify an angle on the topic.

Single-facet known-item queries can be answered more easily. Before proceeding with them, you have to continue the negotiation and find out exactly what the user wants. Doing so helps you determine which database has the potential to find answers. Examine the queries listed earlier. Ask the user for more specifics about aspects of the topic that are of interest. If the user's interest in cerebral palsy involves the major charitable association for this disease, then this is a known-item search, sending you to the web, where you might consider limiting your search to the .org domain. If instead the seeker wants to know about the disease, then this is a subject search appropriate for the friends strategy. The Oracle query is vague. In the reference interview, find out whether it means Oracle Inc. or the oracle that past and present
cultures consult for advice or prophetic opinions. If the former, then ask the user to tell you more about their interest in Oracle. Perhaps they want to learn about Oracle database products, read trade news about Oracle in the business electronics industry, find advice regarding how to get a job at Oracle, or something else.

When you enter names or proper nouns into the Google, Bing, or DuckDuckGo search boxes, the official websites of well-known people, organizations, places, programs, and projects are at or near the top of the list. It may be that the Wikipedia page about the entity will prove helpful for background information and for ideas about narrowing the topic. The entity's own website can also inspire new directions for the search.

Adding a facet to provide context for a single-facet query that is a proper noun may be the only way to find relevant information when the noun is a common word or phrase, especially in databases that do not include lists of authorized headings or descriptors that are proper nouns. When you are searching for information about a company in a database, it's good to check the subject thesaurus, since some descriptors are the authorized names of companies. For example, the Gale OneFile Business database subject guide search, linked in the navigation bar above the search box, presents a list of descriptors that begin with the word Oracle and include more information to indicate whether it's the database management system, the e-commerce software, the US company, or other related entities. The subject guide shows that there are more than 20,000 records tagged with the descriptor Oracle America Inc., and a dropdown arrow below that subject term lets you narrow the search (figure 11.5). The "Oracle America Inc." heading subdivided by "Advertising" has 130 postings, for instance.
Figure 11.5 Company name and results by topic in the Gale OneFile: Business subject guide. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.

In EBSCOhost's Business Source Ultimate, go to "More" in the navigation bar, then select "Indexes," and use the "Browse" dropdown menu to select "Company Entity." Input Oracle to see a list of all the company entities in alphabetical order beginning with the word oracle; as shown in figure 11.6, Oracle America Inc. has 55 postings. Farther down on the screen, the multinational parent company Oracle Corp. is listed with 9,645 postings.
The benefits of the shot in the dark search strategy are:

It acknowledges that this strategy is most successful when applied to known-item queries bearing one facet, making it easy to recognize which queries are likely to succeed as single-facet searches.
It encourages shifting single-facet subject queries to the friends search strategy, described next, where they have a greater chance of success as a result of that strategy's tools to help users further develop their queries.

The drawbacks of the shot in the dark search strategy are:

It can waste time on quick-and-dirty searches whose results add to the user's confusion rather than helping them learn more about their topic to refine their subject query.
Single-facet queries for known items that are one-word titles or common author names might yield lots of results, requiring searchers to introduce a second facet to reduce results to a manageable size.
GETTING A LITTLE HELP FROM YOUR FRIENDS SEARCH STRATEGY
Figure 11.6 Browsing company entity field index for Oracle. By permission of EBSCO Publishing, Inc.

When the reference interview reveals a single-facet subject query, this is a sign that more negotiation is necessary to discover exactly what the information seeker wants. If the seeker has a hard time articulating their information need, call on your "friends" to help the user out. Friends can be system features or specialized reference databases that have been designed with this very purpose in mind—helping users whose interests are undefined, unsettled, or vague. Which friend you recommend to the user depends on your assessment of the situation and the available sources, whether paywalled information resources at the library website or freely available web resources. Consider how our example queries would benefit from the friends that follow.

Dictionaries and Encyclopedias
If users cannot articulate their topics, it may be because they don't know much about them. Recommend a dictionary or encyclopedia that discusses broad-based topics at a basic level, defines topics, summarizes state-of-the-art knowledge, and cites sources to go to for more in-depth information. These sources may be general or disciplinary, depending on which treatment you think matches your user's understanding of and technical expertise with the topic. Several dictionaries and encyclopedias may be accessible on a single platform from a single vendor, such as the Wiley Online Library, with a feature allowing you to limit results to reference works, and the Gale Virtual Reference Library, where you can limit results to its General Reference Collection. On the web, the freely accessible site https://www.encyclopedia.com/ yields results from a variety of authoritative reference sources. The information seeker will learn about the concepts and issues related to their topic, but they also may find some citations to further information. The items cited may be a bit dated, so a follow-up search for more recent periodical or news articles may be needed.

Topic Finders and Filters

When users are having trouble finding a focus for their single-facet subject query, a database topic finder or hot topics feature is useful to get the search started. Gale databases offer a link to the "Topic Finder" in the navigation bar. You can input a single keyword or phrase and see a clickable visualization of the broad topic and its subtopics discussed in the articles indexed in the database. Clicking on one of the tiles in the visualization opens a window listing articles about that topic. In figure 11.7, the "Topic Finder" visualization of a broad search for cybersecurity includes the subtopic artificial intelligence. Other database search interfaces may include a clickable list of currently popular articles or trending topics; these can be especially helpful when the seeker has been inspired to find out more about something trending in the news or on social media.
Browsing

For children still learning basic concepts, vocabulary, and spelling, and for adults new to a particular topic, browsing can be a productive path. Rather than inputting a keyword into a search box, the information seeker can start at a broad category link and drill down through subcategories and sub-subcategories to narrow down their query and to get ideas about key concepts, words, and phrases to use in a search. Figure 11.8 shows a browse visualization that leads from the main category "Plants" in the Gale in Context: Elementary database for K–12 students. The results include a "words to know" section for younger students and a "main ideas" section for older students. The free Encyclopaedia Britannica at https://www.britannica.com/ has a "Browse" link in its navigation bar as well, or the searcher can click on one of the categories listed under the search box to begin drilling down through a subject category.

The benefits of the getting a little help from your friends search strategy are:

It acknowledges that users might not be able to articulate their queries and provides them with tools to further develop their thoughts about their topics.
Figure 11.7 Gale OneFile: Business “Topic Finder” visualization of the subject cybersecurity, with the subtopic artificial intelligence selected. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.
Figure 11.8 Visualization of a browse starting with the category "Plants" in the Gale in Context: Elementary database. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.

Its tools organize knowledge in systematic ways that users can explore to further develop their queries.
The drawbacks of the getting a little help from your friends search strategy are:

Not all search systems and databases are equipped with friends-like tools, such as filters and topic finders.
Because friends tools may require a little extra time, know-how, effort, or perseverance on the part of users, some might not want to use them, preferring speed over strict relevancy.
CHOOSING A SEARCH STRATEGY

When you choose a search strategy, five factors help you make the decision:

1. If you are working with a subject search involving more than one facet, try the building block search strategy's buffet edition in a Boolean-based search system or the à la carte edition on a web search engine. If you have little time to prepare in advance, begin with the à la carte edition. Then extract relevant search terms from the results, enhance your search statements with them, and move on to the can't live without this facet first strategy or one of the two building block editions for more relevant results. Pearl growing undergirds this multiple-strategy approach.

2. If one facet must be present for results to be even marginally relevant, choose the can't live without this facet first strategy. In a Boolean system, this means first building a set of results for the essential facet, evaluating search results, and entering one or more remaining facets if needed. Keep the can't live without this facet first strategy in mind when consulting reference sources. Their entries are usually so short that reading them may be quicker than populating your search statement with search terms representing multiple facets. Plus, the more requirements there are in your search statements, the less likely it is that reference sources will produce any results.

3. If searches from your building block search strategy or can't live without this facet first strategy retrieve few results, extract relevant search terms from the few results and enhance your search statements with them. This is the pearl growing search strategy.

4. If the seeker's query bears one facet involving a known item, use the shot in the dark search strategy. Try searching citation fields of surrogate records, such as author, title, or publication title.

5. In the case of a one-facet subject query, begin with the getting a little help from your friends strategy, which enables the user to further develop and refine the topic.

Right now, search strategy selection might look complicated, but in time, you will find yourself moving between strategies with ease.
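If it helps to see these decision rules compressed into one place, the following Python sketch restates them schematically. The function, its parameters, and its return strings are invented for this illustration; actual strategy selection rests on the searcher's judgment and the negotiation with the user, not on a lookup.

    def choose_strategy(facets, known_item=False, essential_facet=None,
                        few_results=False):
        """Schematic restatement of this section's five guidelines."""
        if len(facets) == 1:
            # Guidelines 4 and 5: single-facet known items get the shot
            # in the dark strategy; single-facet subject queries need the
            # friends strategy's topic-development tools.
            return "shot in the dark" if known_item else "friends"
        if few_results:
            # Guideline 3: mine the few relevant results for new terms.
            return "pearl growing"
        if essential_facet:
            # Guideline 2: build a set for the must-have facet first.
            return "can't live without this facet first"
        # Guideline 1: multifacet subject queries start with the building
        # block strategy (buffet in Boolean systems, a la carte on the web).
        return "building block"

    # The sleep query from question 7 below:
    print(choose_strategy(
        ["sleep", "academic performance", "college students"],
        essential_facet="sleep",
    ))  # -> can't live without this facet first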
QUESTIONS

The following are negotiated queries. Conduct a facet analysis and logical combination for each query, taking into consideration the suggested databases. Then choose one or more search strategies, and provide a rationale that describes why you chose the strategy or strategies. Carry out the search in a suggested or related database, evaluate the results, and discuss what you might do next for more relevant results.
1. ERIC database: A teacher who teaches social studies at the high school level wants instructional materials for a unit on cultural awareness.
2. Agricultural and Environmental Sciences database: Agent Orange.
3. Gale in Context: Elementary (if not available at your public library's website, try your state library's website): A fourth grader needs to write a report on an animal but can't decide which animal to study.
4. Nexis Uni: Recent news about and sources written by Naomi Shihab Nye for a university committee preparing to offer her an honorary doctorate.
5. Infoplease, https://www.infoplease.com: How many vice presidents have been from Texas?
6. Web search engine of your choice: Making money as a travel blogger.
7. PsycInfo: Do college students who get more sleep perform better academically than those who don't?
SUMMARY

This chapter presents five search strategies that should suffice for almost all online searches that you conduct in Boolean and web search systems. Defined as a "plan for the whole search" (Bates 1979, 206), a search strategy requires the searcher to take stock of the number of facets in the query; the importance of these facets relative to the user's interests; the closeness of these facets with respect to the database's discipline; the system's or search engine's functionality; and the user's knowledge and understanding of the topic and its broader subject area. Determining which search strategy to use demands "all the knowledge one can gain about online searching systems and all one can learn about indexing
vocabularies and the conventions practiced in . . . data base construction” (Meadow and Cochrane 1981, 133). For the foreseeable future, you might feel comfortable sticking with the building block search strategy for the majority of subject queries. When you’re ready to expand your repertoire, this chapter’s search strategy examples indicate what to select based on the number of facets in negotiated queries. As your confidence grows, experiment with other strategies, particularly the can’t live without this facet first and pearl growing search strategies, because they will enable you to increase your efficiency and effectiveness as an expert online searcher.
REFERENCES

Bates, Marcia J. 1979. "Information Search Tactics." Journal of the American Society for Information Science 30 (July): 205–14.

Markey, Karen, and Pauline A. Cochrane. 1981. Online Training and Practice Manual for ERIC Data Base Searchers. 2nd ed.

McGraw-Hill Encyclopedia of Science & Technology. 2007. 10th ed., s.v. "Bivalvia," by Michael A. Rice. New York: McGraw-Hill.

Meadow, Charles T., and Pauline A. Cochrane. 1981. Basics of Online Searching. New York: Wiley.
ANSWERS

1. Find instructional materials on cultural awareness for high school social studies. ERIC database.
Facets and logical combination: High School AND Social Studies AND Cultural Awareness AND Instructional Materials
Buffet edition of the building block search strategy: This query requires all four facets for results to be relevant. In the ERIC thesaurus, there are at least a handful of descriptors for representing each facet. If results are too few, follow up with the pearl growing search strategy. (A hedged sketch of one possible search statement follows these answers.)

2. Agent Orange. Agricultural and Environmental Sciences database.
Single facet: Agent Orange
Shot in the dark strategy: The user's query bears one facet for a subject search. Single-facet subject queries are impossible to answer because there is so much information available for them. You can try a quick-and-dirty phrase search, but it's probably best to divert to the friends strategy.
Getting a little help from your friends strategy: In this case, the database's friendly thesaurus will show you the subject descriptor for this query's lone facet. Run the search, and then show the user the subject-related filters on the results page to identify a subtopic.

3. An animal. Gale in Context: Elementary.
Single facet: Animal
For the student who doesn't know which animal to study, the getting a little help from your friends strategy makes sense, especially if your friend is a list of categories to browse. Begin with the "Animals" category, then let the student make choices. For example, they may click on "Fish and Sea Creatures," then "Starfishes." A number of sources will be offered, including an overview that indicates the alternate name "Sea Stars." Those sources may be sufficient, but if the student wants or needs more, try searching for the term sea star using your favorite web search engine, then choose results appropriate to the child's educational level, such as the National Geographic Kids site or a fact sheet provided by the Seattle Aquarium.
4. Recent press about and sources written by Naomi Shihab Nye. Nexis Uni.
Single facet: Naomi Shihab Nye
Shot in the dark strategy: The user's query bears one facet, the individual's name. Search for newspaper articles limiting her name to the byline field. For articles about or at least mentioning her, search her name in the full text.

5. How many vice presidents have come from Texas? Infoplease.
Facets and logical combination: Vice Presidents AND Birthplace
Can't live without this facet first strategy: Because you've chosen Infoplease, a reference database, this strategy is the way to go, starting with the essential Vice Presidents facet. Infoplease responds with a list of annotated results, including one referencing "state of birth."

6. Making money as a travel blogger. Web search engine.
Facets: Travel, Blogging, Income
Logical combination: Let the search engine algorithm do its job.
À la carte edition of the building block search strategy: Successively enter search statements bearing one term per facet. Some to try are travel blogging income, travel blogger earnings, and travel bloggers money.

7. Are college students who get enough sleep more successful academically than students who don't? PsycInfo database.
Facets and logical combination: College Students AND Sleep AND Academic Performance
Can't live without this facet first strategy: Start with the Sleep facet because it must be present for the user to consider results even marginally relevant. Then proceed with Academic Performance. Add the College Students facet only if the first two sets of results are not relevant or are too many in number. If results are too few, follow up with the pearl growing search strategy.
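As promised in answer 1, here is a hedged sketch of a buffet-style search statement for the cultural awareness query, written with EBSCOhost-style DE field tags. The descriptors shown are plausible ERIC terms but are offered as illustrations only; confirm each one in the ERIC thesaurus and add synonyms with OR where recall falls short.

    DE "Cultural Awareness" AND DE "Social Studies" AND
    DE "Instructional Materials" AND DE "High Schools"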
12
Working with Results

The seventh and final step of the online searching process involves working with search results. The foundation is understanding what you're seeing on the search results page and assessing the characteristics of the results to gauge their relevance. Search tactics come to the fore the moment the reference interview begins and continue until the search ends. They are important tools not only for launching a search but also for revising search statements in response to your assessment of each set of results as the process proceeds. For the researcher, the hard work of reading, comprehending, and using the information you've helped them find begins as the search itself is ending. But even here, you can offer some methods to help them cite, manage, and update results efficiently. Chapter 8 discusses working with the search engine results page. This chapter helps you learn to understand and assess results retrieved in databases, to use preliminary results to revise search tactics for better results, and to manage results for efficient use and reuse.
UNDERSTANDING SEARCH RESULTS
Taking a close look at search results screens can reveal much about each individual retrieved item and the set of results as a whole. If you think of your initial search statement as an informed theory about what the user needs and what the database offers, you can think of the results screen as the outcome that confirms your theory or indicates where it needs adjusting. Many of the tools for making adjustments are conveniently located on the results screen.

Some search systems display the most recent results first while others display relevance-ranked results based on the database vendor's proprietary algorithm. Systems usually offer a way to change the default results if you prefer to display them in a different order. No matter what order the results are displayed in, the intermediating information professional and their customer or client will need to assess results for their relevance to the specific topic and purpose of the search.

Commercial subscription databases often present results as a list of brief records—excerpts of the complete records—with a limited number of records on each search screen. The brevity of the records and their limited number make it easy for the searcher to skim to get a perspective on how the search is going and whether anything helpful can be mined, either for use by the information seeker or to craft a follow-up search.

Figure 12.1 shows the first page of search results for the keyword search sports AND concussions in the Gale General OneFile database. Many state libraries make this database available to public library users and high school students, and many academic libraries offer it to college students along with the more research-oriented Gale Academic OneFile database. With such a wide range of users and topics, it is especially important to understand the results pages so you can help others understand them as well. General OneFile indexes a variety of different types of information, including popular and trade magazines; some scholarly journals; news, images, and videos; and some reference-oriented books. Most of the indexed articles are full text, but some retrieved results will be surrogate records giving only the citation or citation and abstract. All of these characteristics of the database come into play when you are assessing search results.
Figure 12.1 Search results in the Gale General OneFile database. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.

The first page of General OneFile search results lists the number of each type of item retrieved, with only the magazine stories shown in the list of brief records. (In General OneFile's companion database, Academic OneFile, the default is to display journal articles rather than magazine stories.) To see other types of items, it's necessary to click on the link to the desired type. Twenty brief records are displayed per page. Brief records include the citation—the item's title, author names (if there are any), magazine name, date, and page numbers where the article appears—as well as the number of words in the story and a label indicating if it is an article, a brief article, a book review, an editorial, or a cover story, among other designations.

The first record in the results list in figure 12.1 would be easy to misinterpret if you considered only the title, "Concussion Book." You might assume the database is offering an e-book about concussions. The tags for this item show it is a brief article of 166 words that was published in a periodical with the name PN–Paraplegia News. This suggests it is a book review rather than an e-book, but it isn't tagged as a book review. Instead, it is a short announcement that the book has been published. The article may be helpful only to let the information seeker know such a book exists, and they may decide to look for it in the library catalog. For the student seeking an authoritative article from a journal, magazine, or newspaper, this particular item on the results screen doesn't fit their purpose.

Some brief records are marked as including citations only or citations with abstracts, and those usually include a resolver link for finding the full text in another of your library's subscription databases. Brief records representing full-text magazine and newspaper stories include the first few lines of the abstract, if there is one, or of the text, if the full text is in the database. The title given in each brief record is a clickable link that will take you to the full record for the item. In some cases, the full record is the full text. In other cases, the full record is not much more than the brief record, the citation of the item, and the full abstract, if there is one. Because many articles in General OneFile are quite short, some with displayed word counts of fewer than one hundred words, you'll want to make sure the information seeker knows the difference between an abstract and a full article. Students writing research papers need to understand that an abstract merely summarizes the full source and is in itself an incomplete and insufficient source.

In addition to the article title, the name of the periodical or newspaper in which the article was published is a clickable link on the brief record. Clicking on a publication name will take you to a record representing the periodical or newspaper, with links to issues in newest-first order. To see what's in a particular issue, click on the date. This can be useful for those who want to skim articles issue by issue for the full run of the periodical and for known-item searching when the name of the magazine, journal, or newspaper is known.

To the right of the list of brief records are filters for publication date, subjects, document type, and publication title, as well as a search-within feature for adding a keyword or phrase for the system to find in the set of already retrieved results. Each filter must be clicked on to see the range of dates, the list of subject terms, the various document types, and the list of publication titles. Full
records, whether they consist only of citations and abstracts or include the full text, offer additional links. A box titled "Explore" to the right of a full record display suggests "More Like This," with links to a few stories or articles on the same or a closely related topic, and it also provides "Related Subjects" in the form of clickable descriptors.

In Gale databases, the brief record screen offers titles and snippets to help the user make choices and get easy access to the full text, when it's available. The database is intended for a more popular, less academic set of users, signaled by the word "General" in its name, and it is easy to get from a brief title to a full text, especially for magazine and newspaper stories. But it's really the full record screen, where the full text is found, that offers the user other paths into the indexed material by linking to similar articles and by displaying subject descriptors that can be clicked on to find more articles on the topic.
Figure 12.2 Brief records and filters on the Academic Search Ultimate results page. By permission of EBSCO Publishing, Inc.

Figure 12.2 shows the brief records results page in a research-oriented database on another vendor's platform. EBSCOhost's Academic Search Ultimate indexes research articles published in peer-reviewed scholarly journals as well as articles from other periodicals, research reports, books, and videos. It is designed for college and university students and researchers. Its results page offers some of the same features as Gale's General OneFile. It displays brief records, and the user must click on the article title to see the full record, just as in General OneFile. The default is to display fifty brief records with the newest first. A drop-down menu lets you change to a relevance-ranked list or sort oldest first or by author name or publication title. There are also options for displaying detailed full records rather than brief records and for displaying fewer than fifty records per screen. While Gale results page filters are shown to the right of the brief records list, the EBSCOhost results page filters are to the left of the results list. EBSCOhost offers more filters, such as language and geography, than are offered on the Gale General OneFile results screen. In both the Gale and EBSCO systems, the results page lets you see at a glance how many of each document type (e.g., scholarly journal articles, magazine stories, news) are in the set of results. Unlike the General OneFile display of magazine stories first, Academic Search Ultimate offers a single results list that includes all document types.

For developing a further understanding of the set as a whole, EBSCOhost provides a filter showing the subject descriptors used in the set and the number of items in the set tagged with each descriptor. The checkbox next to each subject term makes it easy to limit results to only those assigned an especially relevant descriptor, similar to a search-within approach but with known descriptors. The list of descriptors and their occurrences in the set can also give you a sense of the proportion of the different topics present in the set, which may offer clues about what topics are missing and which need more or less emphasis when you revise your search statement. The frequency of a word or phrase denoting the topic in titles and headlines can suggest ways to formulate a free-text search. The recurrence of authors' names may offer a clue about who the experts on a topic are, perhaps prompting an author-name search in another database or an open-access repository. Similarly,
the recurrence of a journal name may indicate the need to try a high-recall search statement but limit results to that one journal.
ASSESSING SEARCH RESULTS

For information seekers, especially students working on research projects, understanding and assessing search results may involve new and still-forming skills. Their assessments may include considerations beyond the topical relevance of the articles in sets of results, such as whether full texts can be quickly downloaded or whether an article is too heavily laden with disciplinary jargon or theoretical concepts. Librarians specializing in helping students develop their information literacy have designed checklists that can be used to guide the assessment of search results and particularly the evaluation of individual items that a student is considering using for a paper or project. Sarah Blakeslee (2004) coined the acronym "CRAAP" as a memory aid for the five elements that students should consider as they are evaluating a source such as a journal article, magazine story, or news item. The acronym stands for:

Currency
Relevance
Authority
Accuracy
Purpose

Critics of the CRAAP approach and similar checklists have pointed out that the evaluation of websites requires different knowledge and skills than the evaluation of well-established document types, including books, articles, and research reports that have been scrutinized by editors and, in the case of scholarly material, by academic peers in the same field of research. While the traditional
checklists are helpful for students new to evaluating individual articles, their criteria do not transfer well to web pages. Wineburg et al. (2020) studied the ways in which students assessed websites and found that 66 percent did not understand that a "news" story they read was published on a satirical site. They also studied students using a cloaked site whose publisher was not revealed but that claimed to be nonpartisan. The creator was an industry stakeholder pushing a policy beneficial to them. Wineburg and his colleagues found that 85 percent of the students failed to check other websites or sources, taking at face value what the site said about itself. In a blog post on the School Library Journal website, Joyce Valenza (2020) discussed those research findings and wrote, "Well before they reach the university, it's time to put down the checklists and the score cards. . . . In many cases the students failed using approaches we've promoted."

Much of the web remains a collection of self-published websites and web pages, where material hasn't passed through even a basic vetting and editing process. In such cases, students will need to develop skills and habits that help them evaluate web-based information and find credible sources (Caulfield 2017; Cooke 2018). Mike Caulfield (2019) has created the acronym SIFT to stand for the process users should undertake when evaluating web-based information:

Stop and ask yourself what, if anything, you know about the website before reading what's on it.
Investigate the website to discover what individuals or organizations publish and maintain it.
Find better information.
Trace statements back to the original post or publication.

Information intermediators can help students investigate who's behind a website, find better information, and locate the original source. What the student finds and fails to find while SIFTing through web search results on their own can be elicited when the information intermediator conducts a reference interview, and the
student's experience can fuel a revision of search tactics for better results.

An advantage of using an established subscription database is that it indexes articles that have already undergone a vetting process, whether by scholars who reviewed their peers' work and recommended improvements or by editorial staff at a magazine or newspaper. For such work, there's a built-in layer of scrutiny surrounding the authority and accuracy of a published article, although information seekers should nevertheless stay alert to what authority and accuracy mean in the specific context of their project. A lot of material on the web has also gone through a vetting process, including research reports and data from federal scientific agencies such as NASA, newspaper and news magazine sites, and final versions of open-access articles published in scholarly journals. For web-based information from trusted sources (it's worthwhile to make sure you've landed on the real site and not a spoof), the same sorts of evaluation criteria used to assess an article published in a scholarly or trade journal apply.

Reading a full-text source to assess its value for the project at hand is too time-consuming during an active intermediated search. It can be productive to do a short technical reading focused on the sections of sources that are the most important for an overall understanding. For research sources, this means skimming:

The title and abstract, to gauge what the article is about
The introduction, to understand the purpose and significance of the research
The methods section, to find out how the authors designed and conducted the research project
The findings and conclusions, where the researchers state their most important insights

Magazine and news articles are more varied and often don't have subheadings clearly labeling each section. At the very least, users should skim the source for this information:
The title, abstract, and introduction
Headings and subheadings, to understand the organization of the material presented and the main topics included
The conclusion, which may summarize the whole article or underscore the most important points

Currency may matter more in some fields than others, and teachers and instructors may specify the use of articles published within the last three years if the topic is in a science, technology, engineering, or medical area. Consequently, students may prefer using the publication date filter rather than additional topic-centric refinements. For other fields, such as geology and history, older publications may be just as relevant and useful as newer ones. Students may need help interpreting the relative importance of topic and timeliness.

Often users are seeking sources that correspond entirely with their interests and perspectives. Even with the best of search strategies and the most pointed assessment of results, it is rare to find such sources. Instead, seekers have to read promising sources to determine whether and how the sources address their interests. Suggest to users that their chosen topics and their relevance assessments of retrieved sources are shaped by their exposure to new information, which continually shifts their points of view, right up to the moment they complete a project and even beyond. They will need to synthesize what they learn into a paper, speech, or project that fulfills the requirements of an academic assignment, into answers that address their questions, or into a decision that enables them to take their next steps. In the course of synthesis, gaps may become apparent and new questions may arise, leading the seeker back to the library, the information professional, the database, and the web.
SEARCH TACTICS

Search tactics are the strategic moves searchers make to further the ongoing search (Bates 1979). To execute a search tactic, the searcher may wield one or more system features. Search tactics are cross-cutting: searchers apply them when it is appropriate to do so at any step in the process and no matter what search strategy is being used. Table 12.1 categorizes selected search tactics into types, defines them, and lists steps of the search process where the searcher is most likely to perform these tactics. The table draws on classic and still pertinent search tactics articles by Marcia Bates (1979; 1987) and Alistair G. Smith (2012); the notations (B) and (S) indicate tactics are from Bates or Smith. The table groups tactics into five categories, with a short definition for each type:

Search monitoring
File structure
Search formulation
Search term
Evaluation

Under each category are the related tactics. Some of these will be familiar from previous chapters, but they are brought together with new tactics in a single list for quick reference.

Search Monitoring

Understanding and assessing results are essential elements of search monitoring that help keep the search on track and efficient. As the search unfolds, check with the information seeker to gauge how relevant the results seem to them, and correct any errors or missteps. Keep track of how you've searched and notice patterns in the results that can help you revise the search. Weigh the costs and
benefits of continuing the search given the results already retrieved and the seeker's purpose.

Table 12.1. Search Tactics

Search Monitoring: Tactics to keep the search on track and efficient
Check: Check your search to make sure it reflects the negotiated topic. (B)
Correct: Correct logical and strategic errors. (B)
History: Keep track of where the search has been, where it is now, and where it is going. (B: Record)
Pattern: Alert yourself to familiar patterns so you can respond in ways that were successful in the past or by improving on them in the present. (B)
Weigh: Weigh current and anticipated actions with respect to costs, benefits, effort, and/or efficiency. (B)

File Structure: Tactics for navigating from the database search system's file structure to individual sources and information in those sources
Find: Use the browser's find feature in search of information in a source. (S)
Hubspoke: Navigate from links in a source to additional sources, including from a source's cited references to these actual sources. (S) (B: Bibble)
Snatch: Search using a noncontent search term that retrieves only what the user wants. (S: URL)
Stretch: Use a database, its sources, or information in the sources for other than their intended purposes. (B)

Search Formulation: Tactics to aid in conceptualizing, assembling, and revising the search formulation
Block: Block one or more facets and/or search terms. (B)
Broaden: Broaden one or more facets and/or their search terms.
Exhaust: Search all of the query's facets and/or search terms. (B)
Field: Restrict searches to occurrences of search terms in one or more fields.
Precision: Retrieve only relevant retrievals.
Rearrange: Alter the order of terms or search statements. (B)
Recall: Retrieve all relevant retrievals.
Reduce: Search for some, not all, of the query's facets and/or search terms. (B: Cut)
Specify: Enlist search terms that are as specific as the desired information. (B)

Search Term: Tactics to aid in the selection of search terms for the query's facets
Anticiterm: Search for a wildcard term in a phrase. (S)
Auto-Term Add: Toggle on the system's automatic addition of statistically co-occurring terms to add them to the search.
Citing: Search for sources citing an older source (i.e., forward-chaining).
Cluster: Search using one or more co-occurring values from accompanying clusters.
Contrary: Search using terms or facets that are opposite in meaning. (B)
Fix: Apply truncation to search terms, or let the system do it automatically. (B)
Neighbor: Search for terms in the same alphabetical neighborhood. (B)
Nickname: Search for a term's acronyms or initialed or nicknamed variants.
Pearl Growing: Search using relevant values from retrievals. (B & S: Trace; S: Backlink)
Proximate: Designate how close together search terms should be. (S: Phrase)
Pseudonym: Search using pseudonyms for names.
Relate: Search using terms that are not in a hierarchical relationship but are related to the facet. (B)
Space & Symbol: Search using one- or two-word variants of the term and/or by taking hyphens, slashes, and other symbols into consideration. (B)
Spell: Search using a term's variant spellings. (B)
Sub: Search using terms that are hierarchically subordinate to the facet. (B)
Subdivide: Search using one or more subdivisions appended to a term.
Super: Search using terms that are hierarchically superordinate to the facet. (B)

Evaluation: Tactics to aid in the evaluation of retrievals
Audition: Cite page elements, such as graphics, layout, spelling, grammatical errors, etc. (S)
Authorship: Use criteria that refer to author credentials, reputation, affiliations with named or unnamed organizations, sponsors, publishers, etc.
Content: Use criteria that refer to the source's subject content.
External Noncontent: Use criteria that refer to external, noncontent characteristics of sources (e.g., altmetrics, citing references, format, genre, personal experience, recommendations from other people, reviews).
Impetus: Refer to the impetus that gave rise to the search.
Internal Noncontent: Use criteria that refer to internal, noncontent characteristics of sources (e.g., currency, cited references, detail, features, length).
Source: Adapted from Bates (1979; 1987) and Smith (2012)
File Structure

Bates (1979) uses the term file structure for all of the ways a database organizes information and makes it retrievable, including the surrogates representing each item, the fields that are indexed, the thesaurus where subject descriptors can be found and added to search boxes, and search system features and functionality. For example, Bates recommends applying your knowledge of the file structure to help you retrieve a literature review or bibliography on the seeker's topic. This approach works well in a subject-specific encyclopedia whose entries cite the sources used. It can be adapted to citation databases indexing article references and to subject-specific databases indexing full texts, where the phrase literature review can be combined with topic keywords and/or subject descriptors. The tactic also works well in e-book collections. It is especially efficient when an item's citations are live links to the actual sources.

You can also stretch beyond the customary uses of search systems. The original author of Online Searching recalls wanting to retrieve literature reviews on a subject, but the database didn't have a controlled vocabulary, a document type or field for literature reviews, or useful results from a free-text search that included the phrase literature review. Consequently, she searched for lengthy journal articles (twenty-five or more pages long) with lengthy bibliographies (fifty or more cited references) and achieved the objective of finding literature reviews related to her research topic.

Search Formulation Tactics

In previous chapters, you have already wielded certain search formulation tactics to conceptualize, assemble, and revise a search. These tactics involve ways to broaden a search for the highest possible recall of potentially relevant results, such as using the Boolean OR between synonyms, free-text searching of full-text sources using natural-language keywords, and eliminating one of the
facets from a search. Other tactics involve ways to narrow a search, including using all of a query's facets in the search statement, applying presearch limits and postsearch filters, using subject descriptors, and forcing phrase searching.

Search Term Tactics

Search term tactics aid in the selection and revision of keywords, phrases, and descriptors for the query's facets. For some concepts, using search terms that are opposite in meaning will aid retrieval of relevant material. For the information seeker researching divestment from fossil fuels, a productive tactic for one facet might involve placing the Boolean OR between opposites: divestment OR investment.

When you consult a database's thesaurus, it allows you to browse, display, and select a term's narrower terms, broader terms, and related terms. You may also think in terms of hierarchies or sets and subsets. For example, a search with Central America as one of its facets might yield too few results. Greater recall can be achieved by searching for all the countries that fall under the geographic designation Central America: belize OR "costa rica" OR "el salvador" OR guatemala OR honduras OR nicaragua OR panama. You might even include a lower rung on the hierarchy by adding the names of major cities, such as managua OR "san josé".

Another approach is to browse the database's field indexes, where terms are listed in alphabetical order, to facilitate identification of variant forms of author names; corporate bodies, including government agencies; titles; and other elements. If there's no name field index, try a free-text search for an author's pseudonyms, initials, or nicknames, and for corporate bodies, try an acronym or popularization of the formal name.
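For searchers who script their preparation, here is a minimal Python sketch of assembling such an OR group mechanically. The helper is hypothetical, and it assumes a system that treats quoted strings as phrases; adjust the quoting and operators to the syntax of the system at hand.

    def or_group(terms):
        # Quote multiword terms so the system searches them as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        return "(" + " OR ".join(quoted) + ")"

    # The sub tactic: expand Central America to its subordinate countries.
    print(or_group([
        "belize", "costa rica", "el salvador", "guatemala",
        "honduras", "nicaragua", "panama",
    ]))
    # (belize OR "costa rica" OR "el salvador" OR guatemala OR
    # honduras OR nicaragua OR panama)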
enter words that have hyphens, dots, slashes, spaces, and other symbols; when in doubt, enter the variants with the Boolean OR between them: coffeehouses OR coffee-houses OR “coffee houses.” The spell tactic can also mean using truncation and wildcard symbols in place of long OR statements to account for plurals, British vs. American spellings, and word stems with different endings. Use the proximate tactic by enclosing phrases in quotation marks and by using adjacency and proximity operators to determine the order of words and their nearness to each other in the text. In citation databases, you can work from a relevant item in a set of results to find articles that are cited by the relevant item’s author or later articles that cite the original relevant item. In Academic Search Ultimate, for instance, the results page allows you to click on a limiter for “References Available” to display only the articles citing other authors’ works. After activating that limiter, each brief record in the results list will include a notation of the number of references in the article; clicking on it will display a list of all the references. Similarly, some databases allow searchers to discover more recent articles that cite an article they’ve deemed relevant. In the PubMed database, for example, clicking on the article title in a brief record will take you to the full record for the item, where the cited-by option on the right side of the screen displays subsequent articles that have cited the original article since its publication. Cited and cited-by references can help the seeker find relevant information when subject searching or when there’s a single known item on the topic. Evaluation Tactics In table 12.1, all evaluation tactics except for Audition (which is from Smith) come from an empirical study of the criteria searchers use to evaluate information from the web and from licensed databases (Markey, Leeder, and Rieh 2014). Audition describes such page elements as graphics, layout, spelling, and grammar that give searchers clues about a source’s credibility. Searchers typically enlist
the Authorship tactic to judge a source’s credibility, referring to any number of things about authors and their credentials, reputation, affiliations with named or unnamed organizations, sponsors, publishers, and so on. When searchers execute the two tactics internal noncontent and external noncontent, they rely on aspects of sources, such as currency, cited references, altmetrics, and format, to help them assess relevance and/or credibility. To judge a source’s relevance, searchers consider its subject or topical content and the impetus that gave rise to the search.
MANAGING RESULTS

Once you've used the tactics necessary to hone the search statement to retrieve the most relevant results, you can introduce the seeker to a few readily available tools to help them manage results for use and reuse. Because users are likely to experience shifts in their topic as they review surrogate records, their second pass through retrieved sources allows them to be more decisive, eliminating saved surrogates that they deem no longer promising and downloading full texts. Users have at least three options for managing search results:

Save downloaded full texts in a folder on their personal computers named for their project so they will be able to find them later.
Use the database's save function, if there is one, to keep results in an online folder only they have access to.
Use reference management software, where they can store citations, abstracts, and keywords, as well as downloaded full texts and notes.
These are not mutually exclusive options, and some users may want to have full texts on their computers as well as readily accessible online. Using a database's save function is convenient, but losing access to the database because the user is no longer affiliated with the institution means also losing access to their folder of retrieved items. The same is true for separate reference management software: if an academic library cancels its subscription to the software, students and faculty will no longer have access to their saved citations and texts. An individual can instead sign up to use the basic free service with limited storage space offered by some reference management services that also offer premium, multi-user subscriptions to libraries. For example, the free version of Elsevier's Mendeley allows you to create and save lists of sources for different topics or projects. Another alternative is the freely available open-source software Zotero.

Major database vendors such as Elsevier, EBSCO, Clarivate, and Gale offer many features on search results pages that make it possible to save, organize, share, and export results. Figure 12.3 shows a results page from ScienceDirect for the keyword search groundwater management Arizona, with the first two results selected for downloading or exporting. Choosing the exporting function provides options to send the selected results to the subscription-based RefWorks citation management service or to export in the .ris format for easy import into Zotero, Mendeley, or another reference management service. Zotero and Mendeley can also automatically format the user's references into a consistent style for adding to research papers. Both support research collaborations and student group work since users can share their Zotero and Mendeley references with others using the same apps.

Figure 12.4 shows a results page from Gale's General OneFile for the keyword search concussions AND helmets. Clicking on the title of one of the brief records leads to the full record, usually including the full text of the item. In addition to the functions represented in the top-right corner of the screen, small icons above the full text allow saving to Google Drive or Microsoft OneDrive; emailing the reference, full text, or pdf; or downloading or printing the article. At
the end of the full record is the complete citation for the source in a choice of four styles for easy cutting and pasting into a document. Below that is the option to export the item to the user's choice of reference management systems, as well as to export it in the .ris format.
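Since several of these options rely on the .ris format, it is worth knowing what a .ris file looks like: a plain-text record of tagged fields, one per line, that begins with TY (the reference type) and ends with ER. The record below is a minimal sketch with invented bibliographic values; real exports carry more tags, and tag usage varies somewhat from vendor to vendor.

    TY  - JOUR
    AU  - Doe, Jane
    TI  - An Invented Article Title for Illustration
    JO  - Hypothetical Journal of Examples
    PY  - 2022
    VL  - 15
    IS  - 3
    SP  - 101
    EP  - 119
    DO  - 10.1000/example-doi
    ER  -

Because the format is plain text, a reference manager such as Zotero or Mendeley can parse these tags on import and map them to its own citation fields.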
Figure 12.3 First two results selected for downloading or exporting. Source: ScienceDirect

The major vendor platforms also provide personal storage space for users who create a free account through their institutional affiliation. Users can then save results in folders they label and organize for their projects. Figure 12.5 shows a results page for the search "environmental refugees" in EBSCOhost's Political Science Complete. The page listing the retrieved brief records includes a small folder icon to the right of each item; clicking on it saves the item to the folder, which can be named and saved and then added to in subsequent search sessions involving the same topic. Clicking on the title in a brief record takes the user to the detailed record, with
options for managing the item to the right. Options include saving to Google Drive, emailing the citation or full text, downloading the pdf, generating a citation in the preferred style for copying and pasting into a document, exporting in the .ris format for use in a reference management program, creating a note to add to the item, and creating a DOI or other permanent link that can be copied and pasted into a document.
Figure 12.4 Beginning of full record for a search result in the General OneFile database. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.
Figure 12.5 Tools for managing results displayed on the right side of a detailed record page. By permission of EBSCO Publishing, Inc.

In addition to making it easy for users to save and reuse their search results, database platforms make it easy to save and reuse the search statements that generated those results. Saved searches can be rerun when you log back in to the database where they are stored in the registered user's folder. Another option is to create an alert to prompt the system to send an email when new items matching your saved search statement are added to the database. Figures 12.6, 12.7, and 12.8 show these functions on the EBSCOhost, Gale, and NewsBank platforms, respectively, using this search: ("global warming" OR "climate change") AND ("nuclear energy" OR "nuclear power plants"). All of these terms were used as subject descriptors in the EBSCOhost Political Science Complete database and in the Gale Academic OneFile database. If search alerts cease to offer new results, it could be that the thesaurus term has changed. In that case, rewrite your search statement to replace the outdated subject descriptor with the newly designated one (Cody 2016). In NewsBank's Access World News database, the terms were searched as keywords limited to the lead or first paragraph of news stories.

As seen in figure 12.6, in the Political Science Complete database on the EBSCOhost platform, the first results page, which displays brief records for the first 50 of 69 total results, includes a button labeled "share." The "share" drop-down menu can be used to add results to a folder and to add the search itself to a folder for reuse during subsequent search sessions in the database. The drop-down menu also includes the option of having the system alert the user via email or an RSS feed when new items matching the search are available.

The Academic OneFile database on the Gale platform, as seen in figure 12.7, includes a search alert option that allows the user to specify whether to send notifications by email or to an RSS feed and how often to update the search and send the alert. Although no personal folder can be created and used inside the database, it is possible to sign in to a Google account, where items can be stored
on Google Drive. An information seeker can download as many as fifty articles per session by using the checkbox next to each desired article and then clicking on the download icon near the top of the screen. This method saves a zip file of all the selected articles to the user’s hard drive as text files. Many users would prefer instead to open each brief record and download the article as a pdf. The results page of the Access World News database on the NewsBank platform (figure 12.8) includes the option to save the search to a folder; you must register with the system to be able to take advantage of this feature. There’s also a “Create Alert” option that lets you specify both the email address to which new records retrieved from your saved search can be sent and the frequency of alerts. Since this is a news database and some users are following breaking stories, it’s possible to get alerts as many as four times a day. The way in which you work with results depends on the user’s project and purpose, just as the search process does. While information seekers and intermediaries should routinely evaluate results for ideas about additional ways to search and for relevance to the project, they may also find it helpful to use some of the tools available for managing references.
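Vendors implement alerts differently and keep the details proprietary, but the underlying logic is simple: rerun the saved statement on a schedule and report only what is new. The following Python sketch is a conceptual illustration only, with hypothetical stand-in functions, not any platform’s actual code:

```python
# Conceptual sketch of a search alert: rerun the saved search statement on a
# schedule and notify the user only about records not seen in earlier runs.
# run_search and send_email are hypothetical stand-ins for platform internals.

def run_alert(saved_statement, seen_ids, run_search, send_email):
    results = run_search(saved_statement)           # rerun the stored search
    new_items = [r for r in results if r["id"] not in seen_ids]
    if new_items:
        send_email(new_items)                       # or push to an RSS feed
    return seen_ids | {r["id"] for r in results}    # remember what's been seen
```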
Figure 12.6 Options to add the search statement to a folder and send a notification when new items matching the search are added
to the database in EBSCOhost. By permission of EBSCO Publishing, Inc.
Figure 12.7 Search alert feature at the top of the results screen on the Gale platform. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.
Figure 12.8 Access World News search alert feature. Courtesy of Readex, Inc.
QUESTIONS

1. Search a science-related topic of your choice using two of the following resources: Public Library of Science (PLOS One), Gale Science in Context, ScienceDirect, and PubMed. What postsearch clusters do they give you to filter retrievals? Which filters might be most useful for an undergraduate student working on a term project and which for a faculty member engaged in research?

2. Conduct a high-precision search in a discipline-based database for a topic that is very familiar to you, perhaps one that you researched for a course recently. Review the first two pages of results, assessing relevance on a 0-1-2 scale for not relevant, possibly relevant, and relevant, respectively. Exchange your device with a classmate, describe your search topics to each other, and assess each other’s search results. Compare your assessments. Whose relevance assessments are higher? Which one of you is a more critical judge of relevance, and why? Then debrief with the whole class.

3. You conduct a search of the ERIC database on children’s safety using social-networking sites and the internet and retrieve two sources: ED558604 (“Cyberbullying among Children and Teens: A Pervasive Global Issue”) and ED573956 (“Young People and E-Safety: The Results of the 2015 London Grid for Learning E-Safety Survey”). Enter these two accession numbers into the ERIC database and download their full texts. Skim each one, paying attention to the author’s affiliation, the research methods used, the purpose of the research, and whether a literature review and list of sources used are included. Decide which of the two is the more high-quality source. Then score each source using these questions: Currency: Does this topic require current information, and if it does, is the source’s information current? (Rate: 0 = not current, 1 = somewhat current, 2 = very current)
Relevance: Does the information relate to the query? (Rate: 0 = not relevant, 1 = possibly relevant, and 2 = relevant) Authority: Who wrote the source? What are the author’s credentials, and do they qualify the author to write on this topic? (Rate: 0 = not qualified, 1 = somewhat qualified, 2 = qualified) Accuracy: What claims does the author make? Do they support the claims with evidence? (Rate: 0 = no evidence, 1 = some but not enough evidence, 2 = enough evidence) Purpose: Is the author’s purpose to inform, teach, sell, entertain, or persuade? (Rate: 0 = entertain or sell, 1 = persuade, 2 = inform or teach) Calculate scores for each source. To what extent is this use of the CRAAP test helpful? Which criteria are superfluous, and what might be added to strengthen this approach? 4. Finding yourself at an impasse while conducting a search is not unusual. The following are examples of typical impasses. Consider each, and then select one or more search tactics from table 12.1 that you would exercise to break the impasse. Tell why you chose the tactic(s): My results address the topic, but they seem overly broad. This person wrote much more than I’m retrieving here. Why does this nonrelevant topic appear so frequently in my results? I know I’ve chosen the right database, so why can’t I find anything on this topic? I’m working with a four-faceted search, and I didn’t expect it to produce thousands of relevant results. Despite plenty of controlled vocabulary terms for this facet, it seems so underrepresented in this database. This legislation was pivotal. Why am I finding so little about it?
SUMMARY This chapter helps you understand and assess results retrieved in commercial databases, use results to revise search tactics for better results, and manage the results as well as the search statement that generated them. Search tactics describe the many ways online searchers can maneuver during the ongoing searching process. There are five tactic types: search monitoring, file structure, search formulation, search terms, and evaluation. Such tactics can help you revise searches that yield disappointing results so as to retrieve more relevant items. By helping information seekers understand their results and revise searches tactically based on the assessment of results, you help them succeed at the immediate task of finding relevant information for a particular purpose. Libraries’ subscription databases offer tools for assessing, saving, and reusing results. Some also allow the search statement itself to be saved for reuse on subsequent logins or to generate an automatic alert informing the user of newly available results that match the search statement. By helping information seekers cite results correctly, save searches and search results, and find new relevant results through automated search alerts, you help them succeed at the longer-term goal of developing their knowledge about their topic.
REFERENCES

Bates, Marcia J. 1979. “Information Search Tactics.” Journal of the American Society for Information Science 30, no. 4: 205–14. https://pages.gseis.ucla.edu/faculty/bates/articles/Information%20Search%20Tactics.html.
Bates, Marcia J. 1987. “How to Use Information Search Tactics Online.” Online 11, no. 3 (May): 47–54.

Blakeslee, Sarah. 2004. “The CRAAP Test.” LOEX Quarterly 31, no. 3: 4. https://commons.emich.edu/loexquarterly/vol31/iss3/4.

Caulfield, Mike. 2017. “Web Literacy for Student Fact-Checkers.” https://open.umn.edu/opentextbooks/textbooks/454.

Caulfield, Mike. 2019. SIFT (The Four Moves). https://hapgood.us/2019/06/19/sift-the-four-moves/.

Cody, Alison K. 2016. “PsycINFO Expert Tip: Updating Search Alerts When a Thesaurus Term Changes.” https://blog.apapubs.org/2016/03/01/psycinfo-expert-tip-updating-search-alerts-when-a-thesaurus-term-changes/.

Cooke, Nicole A. 2018. Fake News and Alternative Facts: Information Literacy in a Post-Truth Era. Chicago: American Library Association.

Markey, Karen, Chris Leeder, and Soo Young Rieh. 2014. Designing Online Information Literacy Games Students Want to Play. Lanham, MD: Rowman & Littlefield.

Smith, Alistair G. 2012. “Internet Search Tactics.” Online 36, no. 1: 7–20.

Valenza, Joyce. 2020. “Enough with the CRAAP: We’re Just Not Doing It Right.” School Library Journal blog, November 1. https://blogs.slj.com/neverendingsearch/2020/11/01/enough-with-the-craap-were-just-not-doing-it-right/.

Wineburg, Sam, Joel Breakstone, Nadav Ziv, and Mark Smith. 2020. “Educating for Misunderstanding: How Approaches to Teaching Digital Literacy Make Students Susceptible to Scammers, Rogues, Bad Actors, and Hate Mongers.” Working Paper A-21322, Stanford History Education Group, Stanford University, Stanford, CA. https://purl.stanford.edu/mf412bt5333.
SUGGESTED READING
Ivey, Camille. 2018. “Choosing the Right Citation Management Tool: Endnote, Mendeley, RefWorks, or Zotero.” Journal of the Medical Library Association 106, no. 3: 399–403. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6013132/.
ANSWERS 1. Search a science-related topic of your choice using two of the following resources: PLOS One, Gale Science in Context, ScienceDirect, and PubMed. What postsearch clusters do they give you to filter retrievals? Which filters might be most useful for an advanced undergraduate student working on a term project and which for a faculty member engaged in research? PLOS One: The search results page allows users to filter by journal, subject area, article type, author, and publication date. An undergraduate might find the broad subject categories useful for limiting their results to the area of their own major. A faculty member might use the “publication date” filter to find recent research, and would also be well-served by creating an alert using the button in the upper-right area of the results screen. Filtering for a specific journal may help with a known-item search. Gale Science in Context: From the results page, users can filter by publication date, subjects, document type, publication title, newspaper sections, Lexile reading level, and content level. These last two are intended to make it simple to adjust results for high school students, who would most likely have access to the database via their school or state library. Both high school and undergraduate students can benefit from using the “Topic Finder” linked on the results page to help them visualize their topic and find ways to narrow it down to a particular aspect indexed in the database. Faculty members probably would
choose databases more focused on scholarly material in their specific subject area. However, Gale Science in Context does index PLOS One and other specialized PLOS journals, and its greater number and variety of filters might prove useful.

ScienceDirect: Several ways to refine a search are offered on the left side of the results page, including limiting to journals the library subscribes to, publication years, article type, publication title, subject areas, and access type (for limiting to open access). The openly searchable sciencedirect.com website does not include the “subscribed journals” and “access type” filters, nor does it offer search alerts or the ability to export to a reference manager. Students new to a subject field may select an encyclopedia entry from the “article type” filter, and those whose assignments require citing recent scholarly articles can use the “publication year” and “article type” filters. A faculty member starting a new research project might use the “article type” filter to select only review articles, which typically include long lists of citations.

PubMed: The first option for limiting results is a publication year slider that helps users visualize the quantity of retrieved publications along a timeline and limit results to particular years. “Text availability” is the next filter, and it includes the ability to limit to free full texts, which is important for users who are accessing the free PubMed database and do not have an institutional affiliation that would allow them access to paywalled articles. Other filters include “article attribute,” “article type,” and “publication date,” with radio buttons for one, five, and ten years, as well as custom ranges. Clicking on the “more filters” link adds the ability to limit results by article type and provides checkboxes for a wide variety of material from addresses to webcasts. The “species” and “sex” filters are binary, the former distinguishing human from nonhuman and the latter male from female, while the filters for language and for the age ranges of the people the publications discuss offer many options. Undergraduates can use the filters to identify full-text articles published in the last year while researchers might use
the “article type” filter to select only systematic reviews or the “article attribute” filter for those reviews that have associated data.

2. Conduct a high-precision search and review the first two pages of retrievals, assessing relevance. When students discuss their experiences in class, you can tally results for the whole class with respect to who is the most positive judge of relevance. Query originators tend to know exactly what is and isn’t relevant because of their experience using results to complete the assignment connected with the query. Students who didn’t initiate the query bring a more positive assessment to the results because they don’t have these constraints and they tend to defer to the judgment of the originator.

3. Skim two ERIC documents and decide which is the more high-quality source. The criteria you used are from a longer version of the CRAAP test (Blakeslee 2004). After doing your own assessment, participate in a class discussion of the advantages and disadvantages of using this approach, whether for yourself or for helping students evaluate sources.

4. Select one or more search tactics that you would apply to break the impasse. Tell why you chose the tactic(s). My retrievals address the topic, but they seem really broad. SUB tactic, adding narrower search terms to one or more facets. SUBDIVIDE tactic, adding subdivided forms of CV terms that express aspects of the unsubdivided main heading to one or more facets. This person wrote much more than I’m retrieving here. NEIGHBOR tactic, browsing the database’s alphabetical author index and choosing variant forms of this person’s name. PSEUDONYM tactic, checking LCNAF and biographical databases for pseudonyms. Why does this nonrelevant topic constantly pop up in my retrievals? BLOCK tactic, using the NOT operator to eliminate specific search terms or entire facets of search terms.
I know I’ve chosen the right database. Why can’t I find anything on this topic? Increase retrievals by adding more search terms to the formulation. Tactics for adding synonymous search terms are SPECIFY, RELATE, SPACE & SYMBOL, and PSEUDONYM; narrower is SUB; and broader is SUPER. Alternatively, let the system choose terms by turning on its automatic vocabulary assistance (AUTO TERM ADD tactic). Consider the NEIGHBOR or CONTRARY tactics to increase search terms. Execute the PROXIMATE and FIX tactics to loosen up word-proximity operators and truncation. If you’ve identified at least one relevant retrieval, use its controlled vocabulary and free-text terms for PEARL GROWING, and click on the retrieval’s “find-like” link or its “citing reference” link (CITING tactic). Why aren’t my retrievals expressing the _________ facet? Execute the HISTORY tactic, reviewing the terminology of your previous search statements and the Boolean logic that connects them. Look for search logic errors and CORRECT them. If your search terms aren’t faithful to the rule of specific entry, then engage the SPECIFICITY tactic, substituting search terms that are as specific as the desired information. Check the form of your search terms, executing the SPELL tactic to correct spelling errors and enter variant forms; the NEIGHBOR tactic to browse alphabetically for these forms; the SPACE & SYMBOL tactic to enter space and symbol variants; the NICKNAME tactic to enter acronym, initialism, or nickname variants; and the PSEUDONYM tactic to enter pseudonyms for names. I’m working with a four-faceted search, and I didn’t expect it to produce thousands of relevant retrievals. Execute the HISTORY tactic, reviewing previous search statements and logic. Check the search logic to make sure it is CORRECT; make sure the search statements that combine sets for each facet use the Boolean AND or NOT operators. If they do, execute the CLUSTER tactic, limiting retrievals by one
or more nonsubject clusters, such as “language,” “publication date,” or “peer-reviewed.” Despite plenty of controlled vocabulary terms for this facet, it seems so underrepresented in this database. Consider the CONTRARY tactic for adding subject terms that are the opposite in meaning. Consider also one or more of the FIELD, PROXIMATE, and FIX tactics for adding free-text terms to the formulation. This legislation was pivotal. Why am I finding so little about it? Consider the NICKNAME, NEIGHBOR, SPELL, or SPACE & SYMBOL tactics for representing variant forms of this legislation’s name. (This is an impasse you might experience when searching any type of proper noun, such as a company, organization, place, event, or test.)
13
Performing a Technical Reading of a Database’s Search System

A systematic approach to familiarizing yourself with a database involves performing a technical reading of the database and its search system. A technical reading can be an efficient method for learning about databases because it focuses on the essentials of understanding indexed content and search system functionality. Methods include consulting the vendor’s website for product information and instruction; studying the database search and results screens for tips and features; reading the help screens and about pages while in the database; and trying searches and working with results to experience how the search system functions and what the retrieved content looks like. Consulting different libraries’ LibGuides and tutorials explaining and demonstrating vendor platforms and individual databases may also be helpful. A comprehensive technical reading involves answering nine questions regarding the following factors:

1. the database’s relevance to the query
2. use of Boolean operators
3. default search fields
4. browsability of field indexes
5. presence of a controlled vocabulary
6. creation of reusable search statements and results sets
7. symbols for proximity operators and other functions
8. availability of presearch qualifiers
9. availability of postsearch clusters or filters

By putting this technical reading into action, you are able to make informed decisions about which databases to use and how to use them to achieve the aims of information seekers. This chapter takes you through the process of answering the nine questions in a comprehensive technical reading and concludes with a discussion of the elements of a truncated reading when time is short. Ideally, you will develop your professional expertise by conducting a thorough technical reading using all nine factors for each database you are responsible for understanding and using. As new databases become available at your institution and as familiar databases get updated with redesigned interfaces and enhanced search systems, you will need to update your knowledge by conducting additional technical readings. Devoting some time to such self-directed learning will make you more efficient at choosing and using databases when you are working with information seekers. Although you probably won’t retain everything your comprehensive technical reading reveals, you’ll be able to do a quick, abbreviated reading while engaged in a reference or research transaction. If you have to do a truncated version of a technical reading while you’re intermediating a search, it will go a lot faster if you have already familiarized yourself with the database by doing a thorough, systematic reading during time set aside for professional development.
IS THIS DATABASE RELEVANT?
Database relevance is contingent on the information seeker’s query and purpose. At the very least, you want to match the database’s subject coverage with the user’s topic of interest. For student assignments at all educational levels, this can be easy: just ask the student for more information about the class requiring the assignment. Use your reference interview skills to clarify the impetus for the query. For example, if a user’s query is about abortion, ask whether the medical, legal, political, historical, or other aspects of the topic are of most interest, and then choose a database that addresses the topic from the desired perspective(s). Consider also the database’s intended audience. Recommend databases to users that are in keeping with their levels of sophistication vis-à-vis their topics of interest. A high school student working on a science fair project about groundwater will require a different database than a professional hydrologist seeking the literature on groundwater flow modeling. In addition to considering the subject and educational level of the information seeker, you should also consider the five attributes covered previously:

1. source type (surrogate or full-text source)
2. genre (text, media, numeric and spatial data, or a combination of these)
3. selection principle (subject-specific, multidisciplinary, genre- or form-specific)
4. form (reference or research)
5. editorial control (mediated or unmediated)

Each one will have an impact on the relevance of the database, given the seeker’s project and preferences.
WHAT BOOLEAN OPERATORS ARE AVAILABLE AND WHERE CAN THEY BE USED? Knowing which Boolean operators are available, how they are expressed in a search statement, and which, if any, are defaults embedded in the search, thesaurus, and results interfaces can increase your search efficiency. A straightforward way to determine what’s available is to examine the system’s interfaces. The advanced search screen may make it obvious by offering separate search boxes with drop-down menus offering a selection of Boolean operators between each box. Most will offer AND, OR, and NOT (in some systems the operator may be expressed as AND NOT), but some will offer only AND and OR. Search tips near the boxes may show example search statements with Boolean operators. It’s common for search systems to be programmed to recognize the space between two words in a search box as the Boolean AND, making it unnecessary to type the operator between the words. In contrast, some systems are programmed to read the space between words in a single box as a proximity operator with a preset number of words. In a sense, the AND is implicit in any search using a proximity operator, which tells the system to find term x AND term y but only with a specified maximum number of intervening terms between them. For a search, however, it’s important to know whether the space between terms in a search box functions as a simple AND or as an AND with proximity parameters that generate a more precise result. Most college and university students see a database’s advanced search interface when they log in. That’s because the university library has chosen the advanced search screen as the default interface. State, public, and school libraries usually choose the basic search interface instead, with its single search box. The single
search box may recognize the same Boolean and proximity operators available in the advanced search interface, but it may be designed to search additional fields to increase the number of results for users to choose from. For this reason, a search in a basic search box may yield more results than the same search using the advanced search boxes. In other words, the basic search is modeled after the single box, broad recall approach of a web search engine. But just as a search engine offers an advanced search form where the Boolean and proximity search techniques are explicit, most commercial databases offer basic and advanced search interfaces. Finding the advanced search screen, if it’s not the default, can be a good way to start your technical reading. Figures 13.1 and 13.2 show the basic and advanced search screens in the Gale database OneFile: News. The basic search screen offers a single box with a few tools below it. The advanced search screen provides three separate search boxes, one drop-down menu listing Boolean operators, and a second drop-down menu listing the fields that are indexed. The query, ethics for research in [insert a commonplace discipline, field, or profession] can be used to verify that the system supports Boolean searching. Enter these three separate search statements in the following order:
Figure 13.1 Basic search screen in Gale OneFile: News. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.
Figure 13.2 Advanced search screen in Gale OneFile: News. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.

ethics
ethics AND research
ethics AND research AND [insert discipline]

Use the search history feature to compare the sets of results. A Boolean search system will retrieve fewer results for successive search statements. Searching these three statements retrieves thousands of results for the single-word query, hundreds for the two-facet query, and less than a hundred for the final query. Next, test whether the space between keywords is processed as an AND. The first search, ethics research, retrieves fewer results than the second search, ethics AND research, which explicitly includes the Boolean AND. We can conclude that the space does not function as an implicit AND. But the results are a bit mystifying: Is the system searching two words with only a space between them as
a phrase? Remembering that some systems are programmed to render a space as a proximity operator specifying a default number of words, make a guess and give it a try: ethics n7 research. This search supplies too many results. The final search, ethics n4 research, confirms that the space between words in the advanced search box is the proximity operator specifying at most four words between our keywords, in any order; the results for this search match those of the first search. A test using the single basic search box shows that the same search statement ethics research retrieves more than twice as many results as the query in the advanced search box. Same database, same search statement, different results. To determine why, use the advanced search screen and the NOT operator to understand the difference between the search defaults in the basic and advanced boxes. Use the NOT operator to eliminate the set with fewer results from the set with more results; the set of these results tells you how many from the basic search are not included in the advanced search. Although it’s not possible to discern exactly what’s happening because commercial databases consider algorithms and other details proprietary, it’s clear that the basic search default is designed for greater recall while the advanced search default is designed for greater precision. In EBSCOhost databases, the space between keywords stands for the n5 proximity operator by default, as stated on their help screens. As with the Gale interface, a search for two keywords separated by a space will retrieve more results than the same keywords in quotation marks, which will force a phrase search and fewer results than a simple AND search. When you scrutinize results to understand how the search boxes are working, be aware that some systems search metadata fields that are not displayed in the database records. When you can’t figure out why a particular record was retrieved for the search you did, it could be because the database doesn’t display every field that is searched. A major factor contributing to the success of a Boolean search is knowing the processing order of the operators. Clicking on the help icon on a database search or results screen may reveal this. Using
the help link at the bottom of the Gale Virtual Reference Library leads to the information that the system processes the AND first and then the OR. Nexis Uni’s help screen offers more detail, indicating that the order of processing is OR; then proximity operators, with the smallest number processed first; then AND; and finally, AND NOT. The fact that Gale processes the OR last and Nexis processes the OR first means you’ll need to tailor your search statements accordingly. Almost all Boolean search systems recognize parentheses to designate the order of processing. Both Gale and Nexis search systems will understand this nested Boolean search statement: (ethics OR morals) AND “medical research,” which commands the system to process the OR search statement first and process the AND statement last. When the help screen doesn’t provide information about processing order, some experimentation with searches will make it clear. The nested search retrieves nineteen results in the Gale Virtual Reference Library. Deleting the parentheses retrieves more than three thousand results. If we didn’t know from reading Gale’s help screen that the AND is processed before the OR, we could discern it from this experiment. In the non-nested search, the system processed morals AND “medical research” first, then combined that result, using OR, with the set of all records mentioning ethics. In effect, the search was processed as if this search statement had been entered: ethics OR (morals AND “medical research”). Nothing in this search requires the results to include the word morals or the phrase medical research, but all of the results do have to have the word ethics in them. Most of the three-thousand-plus results don’t mention medical research at all. Boolean operators are also useful in other places, such as in a subject thesaurus where you can select a subject descriptor and add it to the search box along with the AND or OR operators. The Boolean AND is also implicit in the search-within function available on some database results pages, where a set of results can be winnowed further by inputting another keyword or phrase.
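The effect of processing order is easy to demonstrate with sets. The following Python sketch is a toy model, not any vendor's retrieval engine: each set holds the IDs of records containing a term, and the two groupings of the same three terms produce very different result sets.

```python
# Toy model of Boolean processing order. Each set holds the IDs of records
# containing a term; set operators stand in for OR (|) and AND (&).
ethics = {1, 2, 3, 4, 5, 6}          # records mentioning "ethics"
morals = {4, 7}                      # records mentioning "morals"
medical_research = {2, 4, 7, 8}      # records with the phrase "medical research"

# (ethics OR morals) AND "medical research" -- the OR processed first
nested = (ethics | morals) & medical_research
print(sorted(nested))                # [2, 4, 7]: all mention medical research

# ethics OR (morals AND "medical research") -- the AND processed first
and_first = ethics | (morals & medical_research)
print(sorted(and_first))             # [1, 2, 3, 4, 5, 6, 7]: dominated by "ethics"
```

In miniature, this mirrors the nineteen-versus-three-thousand contrast described above: without parentheses, every record mentioning only ethics is swept into the results.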
WHAT ARE THE DEFAULT SEARCH FIELDS? When you enter a search statement, systems search surrogate records, full texts, or both. Exactly which parts of the records are searched will affect results. If you know which fields of the records a system is programmed to search as the default, then you’ll have a better idea of why the retrieved results look the way they do and how you might want to change from the default search to a more customized approach. When you use the advanced search interface for news in the Nexis Uni database, the system looks for your keywords in all of the indexed fields, including full texts. If you search the word summer, the system will retrieve records with the word in the title, full texts, and even the bylines of stories written by reporters whose first name is Summer. It’s more common for a search system to be programmed to look for keywords in the fields most related to the topic of the item, including title, author-supplied keywords, subject descriptors, and abstracts, as is the case with EBSCO databases that index scholarly journals and popular and trade magazines. The Gale system default search identifies matches in the title, subject, and keyword fields and in the full texts, although this may vary a bit depending on the nature of the material indexed. It can be difficult to discover which fields are searched by default in any given database. If your searches of a database consistently retrieve larger numbers of results than you’d expect, there’s a good chance the system is programmed to search many of the indexed fields, including full texts, by default. Using the drop-down list of fields next to the search box is an easy way to limit the search to fewer, more targeted fields.
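A simple way to picture the effect of default fields is to model records as labeled fields and vary which fields a keyword search consults. This Python sketch is illustrative only; real systems index far more fields, some of them undisplayed.

```python
# Illustration of default search fields: the same keyword retrieves more
# records as more fields are included in the default search.
records = [
    {"title": "Summer reading programs", "byline": "Jo Park",
     "text": "Public libraries expand youth services."},
    {"title": "Urban heat islands", "byline": "Summer Lee",
     "text": "New measurements of city temperatures."},
    {"title": "Snow removal budgets", "byline": "A. Cruz",
     "text": "Cities finalize winter plans each summer."},
]

def search(recs, keyword, fields):
    kw = keyword.lower()
    return [r for r in recs if any(kw in r[f].lower() for f in fields)]

print(len(search(records, "summer", ["title"])))                    # 1
print(len(search(records, "summer", ["title", "byline", "text"])))  # 3
```

The second search retrieves the reporter named Summer along with the topical records, just as in the Nexis Uni example above.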
WHICH FIELD INDEXES CAN BE BROWSED? Browsing the indexed terms in a database’s record fields enables you to determine the exact values used to express ideas, concepts, objects, events, and names. In databases devoid of authority control, index browsing is mandatory for gathering all variant forms, particularly of authors, journal titles, publisher names, and author affiliations. The library catalog is one of the few remaining database types in which authority control is the rule rather than the exception. For all other databases, compensate by browsing indexes. Unfortunately, browsing is not always available, or when it is, it may not have been applied to the field that interests you. To determine whether a search system offers browsing, open its advanced interface, and look for a link called “Indexes,” “Browsing,” “Browse Indexes,” or “[field name] begins with.” In another implementation, you select a field and the system displays a prompt, such as “Look up [insert field name],” “Start of [insert field name],” or “[insert field name] begins with,” to which you respond by entering a title, author name, publisher, or other information. Enter as much text as you think is necessary for the system to open its alphabetical display of your selected index at a location where your desired entry is filed. For example, entering three or four letters might suffice for finding the unique name of a company, but you might have to enter the full name of a company with a common name to find the right one. Some indexes include qualifiers to distinguish between entries. Personal-name indexes may include qualifiers for birth and death dates, occupations, gender, and/or nationality so you can distinguish the John Brown that interests you from the many other John Browns that are indexed in the database. Some databases offer a list of publications indexed, giving you the ability to browse journal, magazine, and newspaper names
alphabetically. In some databases the advanced search interface will include a menu above the search boxes with a link to the list of publications indexed in the database. In Access World News, at the top of the screen is a link to an A–Z Source List where you can browse the news publications indexed in the database and use checkboxes to search only in your selected publications. Similarly, in the advanced search interface in Nexis Uni, limit to news, then scroll down to search a word included in the name of the newspaper or other news source. An alphabetical list of all the news outlet names with your word in them will appear. The source browse is also available in the basic search screen’s guided search area, where you can limit your search to a specific publication.
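Conceptually, a browsable index is just a sorted list of field values that the system opens near your entry. The Python sketch below models the “begins with” behavior; the names and qualifiers are invented examples.

```python
import bisect

# Model of a browsable personal-name index: "begins with" input drops the
# display into the sorted list so nearby variant forms become visible.
author_index = sorted([
    "brown, john (architect)",
    "brown, john, 1800-1859 (abolitionist)",
    "brown, john a.",
    "browne, j. r.",
    "browning, robert",
])

def browse(index, begins_with, window=4):
    start = bisect.bisect_left(index, begins_with)
    return index[start:start + window]

for entry in browse(author_index, "brown, john"):
    print(entry)
```

Because the display is alphabetical rather than a keyword match, the searcher sees the qualified variants side by side and can pick the right John Brown.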
IS THERE A CONTROLLED VOCABULARY, AND HOW IS IT USED? To find out whether items in a database are tagged with preferred subject terms, you can do a quick search and then look at a surrogate record to see if there’s a field for subjects. Another option is to look for a link or tab named “Thesaurus,” “Subjects,” or a similar label on the advanced search interface. Once inside the thesaurus, browse to identify subject headings or descriptors as well as their broader, narrower, and related terms. If a term you were looking for turns out not to be a preferred subject term, look for the use reference to find the correct descriptor. In EBSCOhost databases that use subject descriptors, the thesaurus offers a browse box where you can input a term. The default is “Term Begins With,” meaning the thesaurus will display your word at the top of a list of descriptors in alphabetical order. If you change from the default to
“Term Contains,” as shown in figure 13.3, you will see a list of all the descriptors that have your word anywhere in them or the authority record, including the scope note and used-for descriptors. If you decide to launch a search from the thesaurus, notice what the default Boolean operator is. In EBSCO, the default is OR. You can use the checkboxes to select descriptors, then click the “add” button. The database search box will contain the descriptor and its broader, narrower, and related terms you selected, with the Boolean OR between them. You can also launch a search from the thesaurus for the different facets using the Boolean AND. Select a descriptor for one facet of the query, change the default from OR to AND, and add it to the search box, then select a descriptor for another facet, and add it to the same search box. When you’ve constructed the search statement, click the search button to retrieve records in the database. To access Gale’s Academic OneFile thesaurus from the advanced search screen, click on the “Subject Guide Search” link. Input a term in the search box to see a list of subject headings in alphabetical order, with the number of database records tagged with each heading (figure 13.4). Clicking on the subject term will automatically retrieve all the records tagged with the term. You can also use the subdivisions drop-down menu to see the different aspects of the topic along with the number of records tagged with each. Another option is to choose the link labeled “Related Subjects” under the main subject heading to see broader, narrower, and related terms. Although you cannot construct a Boolean search statement from within the thesaurus, it’s easy enough to retrieve all the records under one subject heading and then use the search-within feature on the results page to add other facets. Your technical reading reveals that Academic OneFile uses a thesaurus but that the ability to launch a complex search statement from it is rather limited. When you intend to do subject searching for a query with multiple facets, be prepared to jot down the headings you will combine with the Boolean AND and OR.
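One way to internalize the pattern of thesaurus-launched searching is to look at the statement it builds: descriptors for the same facet joined with OR, and facets joined with AND. The Python sketch below assembles such a statement; the "DE" field code follows EBSCO's convention for descriptors, but codes and syntax vary by platform, so treat this as an illustration rather than portable syntax.

```python
# Assemble a faceted search statement from thesaurus selections:
# OR within a facet, AND between facets.
facets = [
    ["global warming", "climate change"],
    ["nuclear energy", "nuclear power plants"],
]

def build_statement(facets, field_code="DE"):
    groups = (" OR ".join(f'{field_code} "{term}"' for term in facet)
              for facet in facets)
    return " AND ".join(f"({group})" for group in groups)

print(build_statement(facets))
# (DE "global warming" OR DE "climate change") AND
#     (DE "nuclear energy" OR DE "nuclear power plants")
```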
Figure 13.3 List of subject descriptors containing the term gender. By permission of EBSCO Publishing, Inc.
Figure 13.4 Gale Academic OneFile subject guide search for toxins. Cengage Learning Inc. Reproduced by permission. www.cengage.com/permissions.
DOES THE SEARCH SYSTEM STORE SEARCHES AND RESULTS FOR REUSE? To determine whether your selected database supports storage of your work for reuse, look for a link or tab entitled “Search History,” “Previous Sets,” or “Previous Searches.” Typically, combining sets involves clicking on the “Search History” link; clicking the checkboxes
to select set numbers; and choosing the Boolean AND, OR, or NOT operators from a nearby drop-down menu to combine the sets. If the system numbers each set of results, you can enter set numbers in the search box rather than retyping the terms. Different systems have different ways of designating set numbers. Examples include using the pound sign (#) before the set number and using the letter s before the set number:

#1 AND #2
(s1 OR s2 OR s3) AND s4

Also look for icons for search alerts, folders where you can store search statements and results, and an export function allowing you to add results to a reference management program.
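Behind the scenes, each numbered set is simply a stored collection of record IDs, which is why sets can be combined without retyping terms. A minimal Python sketch of the idea, with invented set contents:

```python
# Model of search-history sets: numbered sets of record IDs combined with
# Boolean operators instead of retyped search terms.
history = {
    "s1": {10, 11, 12, 13},    # e.g., results for "global warming"
    "s2": {12, 13, 14},        # e.g., results for "climate change"
    "s3": {13, 15},            # e.g., results for "greenhouse effect"
    "s4": {11, 13, 15, 16},    # e.g., results for "nuclear energy"
}

# (s1 OR s2 OR s3) AND s4
combined = (history["s1"] | history["s2"] | history["s3"]) & history["s4"]
print(sorted(combined))        # [11, 13, 15]
```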
WHAT PROXIMITY OPERATORS AND OTHER FREE-TEXT SEARCH DEVICES ARE AVAILABLE? Most systems accept quotation marks to bind word strings into phrases. There are two basic proximity operators, one where word order matters and a second for when word order doesn’t matter. These operators allow you to specify how many intervening words can separate your search words. Each search system has its own syntax for proximity operators, so check the system’s online “help” resource for details. Typically, search systems default to retrieving simple singular, plural, and possessive forms of your search terms but not always. At the very least, systems offer unlimited right-hand
truncation so you can retrieve words and their many different endings. Expect each search system to have its own syntax for truncation. A search system that offers truncation, or word stemming, often will offer a wildcard device you can use for within-word variants such as the singular woman and plural women. Check the database help information to discover whether some of these operations are built in. Some systems will do automatic word stemming and some will search common variants without your having to input them.
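The behavior of these devices can be emulated with ordinary pattern matching. The Python sketch below is a toy model, not any vendor's syntax: it treats * as unlimited right-hand truncation, ? as a single within-word wildcard, and a proximity test as "within n intervening words, in any order."

```python
import re

# Toy emulation of free-text search devices. Symbols vary by vendor; here
# * is right-hand truncation, ? a one-character wildcard, near() a proximity test.

def term_regex(pattern):
    escaped = re.escape(pattern).replace(r"\*", r"\w*").replace(r"\?", r"\w")
    return re.compile("^" + escaped + "$")

def near(text, a, b, n):
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    # True when a and b occur with at most n intervening words, in any order
    return any(abs(i - j) - 1 <= n for i in pos_a for j in pos_b)

terms = ["woman", "women", "womanhood", "educate", "education", "educator"]
print([t for t in terms if term_regex("wom?n").match(t)])    # ['woman', 'women']
print([t for t in terms if term_regex("educat*").match(t)])  # ['educate', 'education', 'educator']

print(near("research on the ethics of clinical trials", "ethics", "research", 4))                   # True
print(near("ethics guidelines shape how universities conduct and report research", "ethics", "research", 1))  # False
```

Notice that the wildcard pattern wom?n matches woman and women but not womanhood, exactly the distinction the within-word device is designed to make.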
WHICH PRESEARCH QUALIFIERS ARE AVAILABLE? Many database advanced search screens offer a variety of ways to limit a search. The drop-down menus next to search boxes allow you to search terms in specific field indexes, but there are additional presearch qualifiers at other points on the screen. Figure 13.5 shows the advanced search interface for the National Agricultural Library’s AGRICOLA database, where you can limit results to a particular date range, place, media type, and more. An obvious presearch qualifier to use routinely is language, limiting results to only English or to English and Spanish or other combinations, depending on the user’s abilities and purpose. Presearch qualifiers are database specific, so don’t expect all to be available everywhere. Some databases don’t have presearch qualifiers, and other databases have so many qualifiers that you could use them to set up the entire search. PsycInfo is especially generous, giving you twenty presearch qualifiers to choose from. Choosing its classification codes “2953 Divorce & Remarriage” and “3260 Eating Disorders” along with “Adolescence” from the “Age Groups” presearch qualifier is a quick way to conduct a search on
the relationship of parents’ divorcing and remarrying to the development of eating disorders among teenagers. You may come to prefer the efficiency of selecting a database’s presearch qualifiers for subjects, if available, over building retrieval sets using controlled vocabulary and free-text search statements. Using subject qualifiers saves time because you can select one term for each facet instead of browsing the thesaurus and selecting multiple subject terms or manually entering multiple keywords and phrases. Presearch qualifiers don’t always have values that are sufficiently specific for representing every facet in a multifaceted query, but as you continue your technical reading of various databases you’ll begin to see where and how you can use subject qualifiers.
Figure 13.5 AGRICOLA advanced search screen with presearch qualifiers. Source: National Agricultural Library, https://agricola.nal.usda.gov/vwebv/searchAdvanced EBSCOhost’s advanced search page offers numerous presearch qualifiers as well. The selection of qualifiers varies with the database. For example, the RILM Abstracts of Music Literature offers the usual EBSCOhost presearch options, including search modes such as Boolean and phrase searching; expanders that tell the
system to search related words and equivalent subjects; and language, publication year, and document type limiters. The database indexes, organizes, and describes publications about music, and so it includes a presearch filter for major topics such as ethnomusicology, computer and electronic composition, and dance. As with the PsycInfo example, you can use the qualifiers without inputting search words. This may be helpful for known-item searching in cases in which you know the document type, the range of years within which the item was published, and the broad category of the topic. For example, if the query is for a book on baroque music published between 2000 and 2005, you can select “book” in the document type filter, “baroque” in the major topic filter, and input the range of years in the “publication date” box. Given the topic, if the seeker reads only English, the “language” filter should also be used. Not all databases will have such elaborate presearch qualifiers. A resource useful for reference questions and for helping students begin a research project is the collection of Oxford Research Encyclopedias online. The system is set up to present entries in alphabetical order for browsing, but with more than seventeen thousand encyclopedia articles in the database, it’s probably wiser to use the search box for most queries. A single search box on the left allows you to limit the search to selected field indexes, and to add another search box with your choice of Boolean operators between the boxes. Under the boxes are broad subject areas to check off to limit results. This is similar to the subject or discipline categories that can be selected on the advanced search page of the JSTOR database, in addition to the commonly available presearch qualifiers such as item type, language, and publication date. Oxford Research Encyclopedias and JSTOR are multidisciplinary databases, and so the ability to limit results to broad subject areas can be useful. Presearch qualifiers are most useful for the experienced searcher whose query is well defined and who has some experience with subject-specific databases. Most of the presearch qualifiers also appear as clusters or filters on results pages. In many
circumstances, it’s best to wait until you’ve crafted and performed the first search and are assessing results before you decide which filters to use.
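Presearch qualifiers amount to filtering the collection before any keyword matching occurs. The Python sketch below models the baroque music example described earlier in this section; the records and field names are invented for illustration.

```python
# Model of a search run through presearch qualifiers alone: document type,
# publication-year range, language, and a broad topic narrow the set
# without any keyword input.
records = [
    {"type": "book", "year": 2003, "lang": "English", "topic": "baroque"},
    {"type": "article", "year": 2003, "lang": "English", "topic": "baroque"},
    {"type": "book", "year": 1998, "lang": "German", "topic": "baroque"},
    {"type": "book", "year": 2004, "lang": "English", "topic": "dance"},
]

def qualify(recs, doc_type, years, lang, topic):
    lo, hi = years
    return [r for r in recs
            if r["type"] == doc_type and lo <= r["year"] <= hi
            and r["lang"] == lang and r["topic"] == topic]

print(qualify(records, "book", (2000, 2005), "English", "baroque"))
# [{'type': 'book', 'year': 2003, 'lang': 'English', 'topic': 'baroque'}]
```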
WHICH POSTSEARCH CLUSTERS ARE AVAILABLE FOR FILTERING RESULTS? Postsearch cluster types and names vary from system to system. Elements of search results are clustered together and presented on the results page with the tally of results for each item in the cluster. Clusters help give a clearer picture of the set displayed on the results page, where they can be used to filter out some of the results. These postsearch tallies of results may reveal ideas users hadn’t thought of previously or describe an aspect of their topic that piques their interest. The availability of filters benefits users whose search topics are unfocused, diffuse, or broad. The science.gov results page provides filters for time period, topics, and authors (figure 13.6). It also includes a visualization categorizing the topics covered in the set of results, as shown in figure 13.7. The user can click on one of the topics in the visualization to see those results; in the figure, there are seventeen results in the zebra mussels set that are about the Great Lakes ecosystem.
Figure 13.6 Science.gov results page with date and topic clusters for filtering results. Source: Science.gov, https://www.science.gov/
Figure 13.7 Postsearch visualization of results in science.gov. Source: Science.gov, https://www.science.gov/
The same database platforms that provide search-page qualifiers for document type, language, publication year, and other common elements will provide those qualifiers as results-page clusters for filtering. In general, the results page may have more subject-related clusters than the search page does. Postsearch filters may include categories for subjects, controlled vocabulary, major subjects, academic discipline, and classification codes. A detailed subject descriptor cluster may be available on the results page for filtering, but not on a search page. A thorough technical reading will reveal which filters are available on the search page and which are available on the results page.
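Under the hood, clusters are tallies of field values across the current result set. A minimal Python sketch of how such tallies are produced, with invented records echoing the zebra mussels example:

```python
from collections import Counter

# Postsearch clusters as field-value tallies over the current result set.
results = [
    {"year": 2021, "topic": "Great Lakes ecosystem"},
    {"year": 2022, "topic": "Great Lakes ecosystem"},
    {"year": 2022, "topic": "ballast water"},
    {"year": 2023, "topic": "Great Lakes ecosystem"},
]

for field in ("year", "topic"):
    tally = Counter(r[field] for r in results)
    print(field, tally.most_common())
# year [(2022, 2), (2021, 1), (2023, 1)]
# topic [('Great Lakes ecosystem', 3), ('ballast water', 1)]
```

Clicking a cluster value then simply restricts the display to the records that contributed to that tally.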
PERFORMING A COMPREHENSIVE OR TRUNCATED TECHNICAL READING A comprehensive technical reading asks you to scrutinize the availability and implementation of all nine features in each database you use. An efficient way to pace yourself through a series of technical readings is to concentrate on one database vendor at a time. Learn all you can about all the databases your library subscribes to on the EBSCOhost platform, for example, then move on to the Gale databases. Exploring each database on a single platform will reinforce your understanding of the functions and features common across the platform. This may help you identify the distinctive aspects of the individual databases on the platform. Let your intellectual curiosity lead you to experiment with different features for different kinds of searches. Reflect on your experience
and think about the circumstances that would make a particular feature in a particular database useful. If you are rushed for time, it makes sense to limit your technical reading to features that enable you to answer most queries that come to you. All readings begin with database relevance, so consider the database’s subject coverage, document types, and sources to determine its fit for the query at hand. In a truncated technical reading, the features you pay the most attention to will depend on the nature of the information seeker’s query. If the seeker is a monolingual student needing peer-reviewed articles for a research paper, look for a thesaurus of subject terms to work from and see whether presearch and postsearch qualifiers let you limit results to peer-reviewed full texts and by language. If it’s a professor needing to complete a citation, look at the drop-down list of indexed fields to limit the search to elements the professor already knows and find out how to limit results by publication year, if applicable. As you are intermediating between library users and databases, make it a habit to look at all the options displayed on the search and results pages, scrolling down if necessary to see the whole screen. Notice all the links, icons, tips, and examples above, below, and to the sides of the boxes on the search screen and the list of brief records on the results screen. Even if you can’t do or haven’t yet done a comprehensive technical reading, you can bring your full attention to the options offered as you are retrieving and evaluating results and revising searches.
QUESTIONS Both of these exercises can be accomplished as a small-group activity or as an assignment in which students work alone and then share their findings with the class.
1. Choose two different databases on a single vendor’s platform and do a comprehensive technical reading of each one, including a consideration of each of the nine factors. Note where and how you learned about the databases, such as by reading help screens, watching a video tutorial, experimenting with searches, and other methods. Make a detailed table representing your findings. 2. Choose two similar databases on two different vendors’ platforms and do a comprehensive technical reading of each one. Make a table highlighting their key similarities and differences.
SUMMARY When you’ve got free time to explore a database over which you want to achieve a larger measure of expertise and mastery, use this chapter’s technical reading of a database’s search system based on nine factors. A thorough technical reading will familiarize you with aspects of the database and its search system, including the subject coverage and types of records and publications indexed in the database; what the searchable fields are and which are included in a default search; whether a controlled vocabulary is in use; and the availability of presearch qualifiers and postsearch clusters for filtering results. If you conduct a comprehensive technical reading of each database available to information seekers at your institution, you will understand that some features are so common you can expect to find them in every database. You will also learn some surprising tricks for making the most out of each search system’s functionality. It’s unrealistic to think you can memorize all the features, filters, and syntax you’ll uncover in your technical readings, but you’ll have a head start on every search you do because you’ll know what to look for and where to look for it no matter which platform you’re using.
Taking the time to work through a series of systematic technical readings will serve you well later on when you’re working with an information seeker, not because you’ll remember everything you’ve learned about the database you’re using and its search system, but because you’ll know in general what to expect and how to find out more if needed. In such circumstances, a truncated technical reading can be an efficient and effective way to get the most out of the search as it unfolds.
ANSWERS 1. Your table may include a comparison of presearch qualifiers, postsearch filters, and how the space between keywords in a basic search box functions. Note the fields listed in the dropdown menu of a search box and investigate the ones you don’t recognize. You may also want to explore whether there’s a thesaurus, how elaborate it is, and if you can launch a search from it. 2. Notice the different look and feel of the search interfaces and the sorts of search tips, drop-down menus, and presearch qualifiers each one offers. Check out the links on the navigation bar, if there is one. On the results screen, consider the default order of results and how to change it. Study the filters, including which database has more and what the additional ones do. Which filters in each do you think might be the most useful in typical searches?
14
Interacting with Library Users At every step of the online searching process, intermediation with an information seeker may present teachable moments. Recognizing and acting on those opportunities to help seekers learn more about information resources and how to find them requires attentiveness to the seeker’s questions and comments. Other cues may also be present, such as body language if the interaction is in person or “how did you find that” questions if you’re in a chat encounter. Apart from the teachable moments that occur during the search process, librarians and other information professionals working in a variety of environments may also design and teach short “one-shot” instructional or training sessions; work with teachers, instructors, and professors in specific face-to-face or online courses; or teach a semester-long course to increase students’ information literacy. This chapter reviews the steps of the online searching process covered in the previous chapters with attention to the teaching and learning that can occur at each step. Information intermediaries as well as information seekers learn things during a search, including, obviously, the answer to their question or the identification of sources for their research project. Less obvious is the learning that relates to the formulation and revision of search queries; the ways in which information is organized, described, and represented in different kinds of resources; the features and filters that web search engines and database search systems make available; and the options for displaying and working with results. The advantage of organizing this chapter around the search process is that it reviews and reinforces your own understanding while also offering some
insights about search-related teaching and learning. We’ll revisit the search process, but this time through the lens of searching as learning.
SEARCHING AS LEARNING Recent research on searching as learning has tended to focus on web searching and on possible improvements to search engines to support learning while searching. Drawing on earlier research in the field of library and information science, von Hoyer et al. (2022) identify the interplay between technical factors—the design and function of the interface and the information retrieval system—and human factors, including the searcher/learner’s developing knowledge, information need, purpose, and search skills. One of the earliest researchers to investigate the relationship between searching and learning, Carol Kuhlthau, did much of her work before the advent of the web, yet her research continues to be salient. Her early work (1985; 1989) addressed the relationship between searching and learning by documenting that university students searching for information to complete an assignment developed deeper understanding of their topics and of the strategies for researching their topics over the course of their college years. Librarians can help students in the search process by growing students’ information literacy. The American Library Association (1989) issued a report on the subject, asserting that “To be information literate, a person must be able to recognize when information is needed and have the ability to locate, evaluate, and use effectively the needed information.” A decade later, the Association of College and Research Libraries (ACRL) (2000) issued its Information Literacy Competency Standards for Higher Education, which served as guidance for academic librarians designing training and instruction for student information seekers. ACRL standard two
(of five) focused on the ability to identify useful information retrieval systems, craft sophisticated search strategies and revise them as needed, and evaluate and organize the information found for subsequent use. These remain important concepts and skills, and librarians continue to help students acquire them. But as information resources grew more numerous, varied, openly accessible, and at times questionable, academic librarians began to rethink information literacy. In 2016, ACRL formalized its Framework for Information Literacy for Higher Education, which includes a section on “searching as strategic exploration.” This section states something you may have discovered by working your way through this book: “Searching for information is often nonlinear and iterative, requiring the evaluation of a range of information sources and the mental flexibility to pursue alternate avenues as new understanding develops” (22). The report discusses the contexts that surround searches, including the level of experience (if any) the seeker has with searching and the depth of knowledge (if any) the seeker has about the topic they’re pursuing. Information literacy approaches operate from the perspective of learning to search, with broader lessons in context and concepts along the way. Although experts at search, librarians who design information literacy instruction are not necessarily subject-matter experts, and so they learn about research topics from course instructors and from the students themselves. Conversely, information seekers operate from the perspective of searching to learn. They may want to learn a fact or how to do or make something. They may want to research a topic out of personal interest or because it’s required for a class or job assignment. Learning the answers is the point of the search. Along the way, they also have opportunities to learn how to find the answers. Rieh et al. (2016) refer to this reciprocal relationship between learning to search and searching to learn as “searching as a learning process emphasizing the interplay between cognitive learning and search activities” (23; emphasis in original).
TEACHING DURING THE REFERENCE INTERVIEW

Reference interviews conducted via email, chat, and phone aren't necessarily conducive to the teachable moment. The absence of visual cues can make clear communication challenging. Inserting an information-literacy tip that deviates from answering the question may confuse the user. You may prefer to stay on topic when conducting a reference interview via these modes of communication. Nevertheless, you can walk the user through the steps you take to answer the question and, as with all reference questions, cite the sources you are consulting. If you sense an impatient user whose sole interest is the answer to the question posed, perhaps the only information-literacy tip you can insert into the exchange is an invitation before signing off to contact you or your colleagues with subsequent questions. In-person reference interviews are more suited to teaching users about online searching and information-literacy concepts and skills generally. You will still have to judge how far afield you can go. Many elements can impede the occurrence of teachable moments, such as others waiting for your assistance; an impatient user who is pressed for time; or a user who is too preoccupied with a mobile device, a friend, or a crying child to give you their full attention. During a face-to-face reference interview you can talk to users about what they are seeing on the screen and you may turn the keyboard over to them to further the search. It's also wise to get into the habit of saying what you are doing as you are doing it, rather than silently typing and looking at the screen. It can feel awkward at first, but saying out loud what you're thinking and doing during a search is a subtle yet effective way to teach without seeming to, and it can lead to a moment when the seeker asks to learn more. Consider making a practice of leaving each and every user who visits you in person with one information-literacy tip above and
beyond the information that satisfies their need. Following are a few ideas to get you started. Choose one or more based on your assessment of the particular situation.

- As you begin the reference interview, be prepared to learn a bit about the seeker's subject matter (from the seeker and, later, from early search results) so you can provide more effective guidance. For a research query, this may involve the two of you taking a quick look at a subject-specific encyclopedia or dictionary for basic concepts and themes.
- Explain the advantages of using the library's catalog, discovery system, or individual subscription database as you are launching the one relevant to the seeker's query. Explaining while launching applies whether you're handling a reference, research, or known-item query.
- Point out the document-type presearch qualifiers or results-page filters for limiting to peer-reviewed, scholarly, or journal articles if that is what's needed. This may include a short explanation of what peer review means and why it matters.
- Show users how to apply classification or subject clusters to explore aspects of broad topics or to put into words ideas that they may be struggling to articulate. If the time seems right, state in one sentence what a subject heading or subject descriptor is and why they're useful.
- For users engaged in research projects, explain whether the database search results they are viewing are surrogate records and/or full-text sources. Point out that an abstract in a surrogate record serves as a summary of a much longer full text and that it can be useful for discovering additional keywords to search and for deciding whether the full text is relevant enough to download and read.
- Let users know that the library offers phone, email, and chat services if they need additional help in the future and can't make it to the brick-and-mortar library. This may include giving them the library's email address and phone number, or pointing out the ask-a-librarian link on the library's home page. If the reference transaction is virtual, invite them to consider visiting the library in person for subsequent questions if you deem it advisable.
SUSTAINED TEACHING EVENTS

A well-established vehicle for helping information seekers is point-of-use or point-of-need instruction. It can take the form of a laminated poster near a print resource that highlights tips for using the resource. More likely in today's world, it takes the form of online directions for using online resources. Many academic librarians use Springshare's LibGuides to create point-of-need web pages that highlight resources and tips for using them, often tailoring them for specific courses. Teaching faculty can provide links from their online courses to the relevant LibGuides. Librarians can use Springshare's LibWizard to create self-paced tutorials with quizzes for students to test their knowledge and surveys to capture students' perceptions of their own learning (Brady and Kromrie 2022). Other well-established instructional methods include one-shot in-person sessions tailored to specific college courses; modules for college courses taught online; training workshops, usually conducted in a computer lab; and in-service training, which is a sustained event that should provide you with ample opportunity for teaching attendees about online searching. All involve helping participants become more information literate as they learn new concepts and skills. Post your teaching materials online so that attendees can consult them later and share them with their friends. Take your business cards to one-shot workshops and workshop series, tell users that it is your business to help them, and invite them to contact you when they get stuck. Promote future information-literacy events, and urge users to complete your evaluation forms, telling
them that you and your colleagues incorporate their input into future events. Also establish a site where you and your colleagues share your lesson plans and teaching materials with each other so that you can reuse content and coordinate your efforts.

Course Documentation

Formalize your preparation for sustained teaching events by creating documentation such as a training class outline or course syllabus, lesson plans, and teaching materials. An outline or syllabus gives everyone—prospective registrants, attendees, your colleagues, and yourself—an overview of your workshop, series of workshops, or half- or full-semester course. Publish your outline or syllabus on the open web, and draw from its contents to generate publicity statements. Use it as you are developing lesson plans and materials to guide your decisions about what to cover and what to omit. Lesson plans are detailed descriptions of what you expect attendees to learn, how you will deliver content, and what methods you will use to check for attendees' understanding (Lierman and Santiago 2019; Milkova 2014). Preparing lesson plans will help you design and prepare course materials and activities to aid learning. Table 14.1 describes the specific contents of the outline or syllabus, lesson plans, and teaching materials; designates their target audiences; and suggests others who would benefit from access to your documentation.

Table 14.1. Course Documentation

Syllabus
Contents: Vital information (course specifics including instructor's name and contact information, course name and number, location, and related information); Description (course content including topics covered and benefits of taking the course); Organization (how the course will be conducted); Learning objectives (bullet-point list of what attendees will learn as a result of taking the course); Schedule of topics (a two-column table, with class dates in the left column and the topics scheduled for each date to the right); Materials (a two-column table, with class dates on the left and readings, equipment, and other items needed for each class); Grading (for credit-bearing courses, describes graded assignments in a four-column table with assignment names, brief descriptions, distribution dates, and due dates); Expectations (rules of conduct for participants, rules for attendance and communication)
Target audience: Your institution's library users, prospective registrants, and course attendees
Access: Your institution's library users, prospective registrants, course attendees, you, and your colleagues

Lesson Plans for Each Workshop or Class Period
Contents: Learning objectives (describes what you want attendees to learn in this particular workshop or class period); Materials (enumerates materials, e.g., lecture files, handouts, instruments, mobile devices, desktop and mobile apps); Agenda (describes content and content-delivery methods); Time periods (estimates how much time is devoted to major agenda items); Evaluation strategies (describes how you will check for attendees' understanding)
Target audience: You
Access: You and your colleagues

Teaching Materials for Each Interaction
Contents: A wide range of material, including notes, outlines, slides, instructional videos, demonstrations, games, journals, role-playing scenarios, case studies, simulations, concept maps, LibGuides, and so on
Target audience: You and course attendees
Access: You, course attendees, and your colleagues

Despite your many years of being a student on the receiving end of a teacher's instruction, you may find it daunting to be the one doing the teaching. Enroll in an information-literacy course as you work on a master's degree in library and information science or undertake an internship with a librarian whose duties include information-literacy instruction. The idea is to learn how people learn; study best practices for partnering with teachers, instructors, and professors to incorporate library use instruction into classes; and get actual experience drafting a syllabus, lesson plans, and teaching materials and perhaps even presenting an instructional session. One insight you'll gain from such efforts is that taking responsibility for teaching others is a great way to cement your own understanding of
online searching concepts and skills as you prepare to teach a workshop, class, or course.
LEARNING ABOUT SEARCHING

To learn about online searching, you are taking a for-credit, graduate-level course, and your performance, including formulating searches that retrieve relevant information, is being evaluated. Few if any end users will do the same. Using the seven steps of the online searching process as an outline, here are considerations and suggestions for teaching users at sustained information-literacy events.

Step 1: Conducting the Reference Interview

Decades ago, the library catalog became the first online information-retrieval system available to end users, and ever since, end users have preferred self-service over intermediation (De Rosa et al. 2005; Mischo and Lee 1987). When they reach an impasse in satisfying their information needs, end users consult classmates, friends, parents, or someone they think has expertise on their topics of interest (Head and Eisenberg 2010; Rieh and Hilligoss 2008; Thomas, Tewell, and Willson 2017). Librarians are likely to be the last people end users consult for help (Thomas, Tewell, and Willson 2017). In fact, many people don't think librarians are there to help them (Head and Eisenberg 2010; Ross, Nilsen, and Radford 2009; Thomas, Tewell, and Willson 2017). Show students that you and your colleagues are there to help them by offering face-to-face and chat assistance, by creating
LibGuides and tutorials, and by working with instructors and teachers to incorporate all aspects of searching into their classes. Tell them they can reach you through their preferred medium: chat, email, phone, face-to-face reference, workshops, library classes, and their online courses. You also want them to know that they have to be forthcoming about their queries so that you are able to develop a full understanding of the information they want. Emphasize that the reference interview is a challenging form of communication for both searchers and librarians as they discuss the information that's needed and how to find it. The conversation should help the intermediator see things from the seeker's perspective, and it can help the seeker clarify their information need for their own understanding. To quell people's anxieties, remind them that you aren't there to judge, evaluate, or gossip about their information needs; you are there to provide the right answers or at least the sources that lead to the answers. And acknowledge that they can teach you about their interest and their motivation while they express their topic and refine it as the conversation continues.

Step 2: Selecting a Database

For years, Google has been where most people get started on a research project (Asher 2011; De Rosa et al. 2005; Griffiths and Brophy 2005; Holman 2011; Lamphere 2017; Mizrachi 2010; Oxford University Press 2017; Perruso 2016). It's convenient, as you can use the Google search engine on your smartphone and search on the go (Connaway, Dickey, and Radford 2011; Kvavik 2005). It's easy, as you can type whatever comes to mind, and the search engine ranks algorithmically relevant results at the top of the list (Fast and Campbell 2004; Pan 2007). It's reassuring because what the search engine finds (and it always finds something) usually includes a source at an understandable reading level, neither overly technical nor too simple, so that you can get an overview of your topic (Asher 2011; Lawrence 2015; Oxford University Press 2017).
It's a confidence builder since you don't need anyone's help to find something seemingly useful (Head and Eisenberg 2009; Lawrence 2015). It's comprehensive, as you search the whole internet in one fell swoop (Boss and Nelson 2005; Joint 2010). It's dependable, as it's always available (Connaway, Dickey, and Radford 2011). When it comes to selecting a database, expect end users to come to you after they have exhausted the top-ranked results at their favorite search engine. Promote your library's website as the source of reliable and trusted information. If users want a Wikipedia-like experience that will provide them with easy-to-understand information, refer them to the encyclopedias at your library's website, such as World Book and Encyclopedia Britannica, or aggregated encyclopedia databases, such as Oxford Reference and the Gale Virtual Reference Library. Point out that disciplinary experts have written these encyclopedia entries and cited seminal sources that they should consult for more information. General multidisciplinary databases, such as Academic OneFile, General OneFile, General Reference Center Gold, ProQuest Research Library, and Academic Search Ultimate, may be the next step up from encyclopedias, yielding sources that don't necessarily require mastery of disciplinary knowledge to understand. News databases, such as Nexis Uni, Access World News, and ProQuest Newsstand, also have understandable content for generalists. College students are likely to receive directives from their instructors regarding the use of scholarly or peer-reviewed sources, so bring to their attention the presearch qualifiers that enable them to limit results to those kinds of sources. Because users can't count on every database having such qualifiers, instruct them on how to differentiate between scholarly and nonscholarly sources listed on results pages. When academic instructors require students to search for information in subject-specific databases, show students where to look on the library's website to identify and access databases by subject as well as where to find LibGuides for different academic disciplines.
Step 3: Typecasting the User's Query

Introduce students to typecasting, whether their query can be categorized as ready reference, research, or known item. A ready-reference query can often be answered by a web search, but it just as often can be answered, or the web results verified, by an authoritative dictionary, directory, or encyclopedia accessible on the library's website. For topic- or subject-based research, the user wants information about an event or series of events, person, place, organization, idea or philosophy, or other topics, whether for personal, professional, or educational purposes. For known items, the user wants a specific source they know exists, such as a book, an article, a film, an ad, or an image. Help seekers categorize their question so that both of you can identify ways to answer it. In a class or workshop, give users sample queries to typecast, letting them work with classmates so that they can practice typecasting together. Eventually, they'll conduct subject or known-item searches for these queries. The latter are closed-ended searches for one particular source using non-subject attributes such as author name, words in the title, publisher, publication year, DOI, or other unique identifiers. Some known items come with so little information that subject searches are necessary. Generally, subject searches are open-ended, with answers that are not from a single source but instead from the user's synthesis of multiple sources.

Step 4: Conducting the Facet Analysis and Logical Combination

Few people using web search engines browse past the first results page (Asher 2011; Pan 2007). When web researchers run an experiment by switching the order of Google's ranked results so that relevant ones are lower down or supply a test database with no relevant retrievals for the query posed, users may still choose the top retrievals or any results at all, even though they aren't as
relevant (Pan 2007; Matysek and Tomaszczyk 2021). Another researcher concludes that users “seemed to see the keywords more as concepts rather than strings of letters to be matched” (Holman 2011, 25). Asked how Google ranks results, a user replied, “Somebody that works at Google . . . comes up with the answers” (boyd 2015). Users surveyed in an Oxford University Press study (2017) suggested that systems respond with useful information because they understand their queries. Underlying such research findings is a lack of understanding of how search engines index web pages and how their proprietary algorithms rank results. Search engines look for matches of user-entered search terms in indexed strings of alphanumeric characters, taking into account word proximity and many other statistical properties of words in texts. Many search engine users understand that their initial queries are weak, and when they see the first few results and deem them irrelevant, they reformulate their queries (Zhang, Abualsaud, and Smucker 2018). Searchers who begin with good queries may find relevant results but still reformulate their queries for more or better results. The learning that takes place is indicated by a general search becoming more specific and by the user’s relevance criteria changing as they learn more about the topic and about the published literature on the topic (Vakkari 2016). Students may have intuitive or experiential understanding that more carefully chosen keywords can retrieve more relevant results. But without knowledge of the sophisticated information organization that underlies subscription databases and without a well-thought-out plan for the search, they may waste time trying and retrying to find the best words for the best results. Facet analysis is crucial to online searching. When users get the facet analysis right, their chances of finding relevant information increase substantially. One factor that prevents them from getting it right is their failure to fully express their interests. Perhaps their lack of knowledge about the discipline underlying their topics affects their ability to fully specify things, or they haven’t given their topics much thought, or they are self-conscious about revealing their true interests. Emphasize to users that they must be forthcoming about
their interests in the statements they enter in search systems or in the conversations they hold with librarians in reference interviews. Reading minds isn't something search systems or human beings are able to do. Failing to specify one or more facets means that results aren't likely to include the missing facets; if they do, it's purely by chance. When you talk about the facet analysis with users, you might want to substitute main idea or key concept for facet because some users won't know what facet means. Use your judgment, inserting terminology you feel your users will understand. Define the facet analysis for users. A facet is a separate aspect (i.e., idea, element, notion, part, component, object, or entity) of a query. It represents one idea. It can be expressed in one word (e.g., automobile, ivory, Norway, bumblebees), or it can be expressed in a phrase that represents one idea (e.g., lung cancer, zebra mussels, controlled burn). Advise users to conduct the facet analysis before they start searching. If they have trouble conducting the facet analysis, invite them to write down their queries, choose the one prompt that "feels right" to them, and complete the sentence with their query:

I want to know whether _________.
I am interested in _________.
I need information on _________.
I want to know about _________.
I am researching _________.
I need to find _________.
Table 14.2 shows how six users responded to these prompts, writing their queries on the blank lines. Before proceeding with the next step, ask users to review what they’ve written. Does it really express their interests or the aspect of their interests that is most important for their purposes? Give them a moment to revise what’s there, or perhaps choose another prompt to complete with their queries.
Table 14.2. Using Prompts to Elicit Queries from Users

I want to know whether genetically modifiable crops affect people's health.
I am interested in the collapse of the Soviet Union.
I need information on cyanobacteria as a source of alternative energy.
I want to know about going to college. Does it change people's behavior?
I am researching why one should not take life too seriously.
I need to find the book Narrow Road to the Deep North.

Next, instruct users to dissect their queries into single words and phrases that express the main ideas that interest them. Two or three are usually enough for students who haven't approached search in this way before. The easiest way for them to do this is to underline single words or phrases in their sentences. Table 14.3 shows such underlined words and phrases for the six queries.

Table 14.3. Inviting Users to Dissect Their Queries into Main Ideas (underlined words and phrases shown here in brackets)

I want to know whether [genetically modifiable crops] affect people's [health].
I am interested in the [collapse] of the [Soviet Union].
I need information on [cyanobacteria] as a source of [alternative energy].
I want to know about [going to college]. Does it [change people's behavior]?
I am researching why one should [not take life too seriously].
I need to find the book [Narrow Road to the Deep North].
Finally, ask users to scrutinize their queries. If their underlined phrases exceed two words, ask them to simplify them. For example, if an underlined phrase is an adjectival phrase made up of three or more words, challenge them to reconceive it as two big ideas or
reduce it to two words without losing meaning or sacrificing specificity. If one of their underlined phrases is a prepositional phrase, suggest that they consolidate their interests into a single noun or two-word adjectival phrase. In table 14.4, three users have revised their facet analyses under their original underlined words and phrases: (1) breaking up "genetically modifiable crops" into "genetic modification" and "crops," (2) restating "changing people's behavior" as "behavior change," and (3) restating "not take life too seriously" as "conduct of life."

Table 14.4. Advising Users to Break Up or Restate Wordy Big Ideas (underlined phrases in brackets; revised facets listed beneath)

I want to know whether [genetically modifiable crops] affect people's [health].
  Revised: genetic modification; crops; health
I am interested in the [collapse] of the [Soviet Union].
I need information on [cyanobacteria] as a source of [alternative energy].
I want to know about [going to college]. Does it [change people's behavior]?
  Revised: college; behavior change
I am researching why one should [not take life too seriously].
  Revised: conduct of life
I need to find the book [Narrow Road to the Deep North].

If users underline only one word or phrase, ask them to think about their interests more deeply and describe in written or spoken words what is only a vague notion inside their heads. If they still come up empty-handed, then that's fine. Their query is a candidate for the friends strategy and searches in a database that features postsearch clusters. Additionally, the conduct of life query might benefit from clusters in a psychology, sociology, or religion database. Getting users to specify their queries, break them into facets, and transform them into search statements can be difficult to do. You and the participants in your workshops or classes may want to try
the keyword generator that uses a three-step process to script user queries into Boolean search statements (University of Texas at Austin 2018).

Step 5: Representing the User's Query as an Input to the Search System

Since libraries introduced end users to online catalogs in the early 1980s, researchers have analyzed the queries users enter in search systems and concluded that users fail to use these systems' searching languages or fail to use them correctly (Bates, Wilde, and Siegfried 1995; Bloom and Deyrup 2012; Holman 2011; Jansen, Spink, and Saracevic 2000; Turner 2011). Search engines and database systems now automatically do many of the things expert intermediary searchers have been doing for decades to generate high recall of relevant results: looking for search terms that are in close proximity; truncating keywords; searching variant spellings and plurals; and finding keywords in titles, abstracts, and author-supplied keyword fields that, aside from the subject field itself, are the most likely to represent the subject of the article. Programming search systems for greater recall removes the user's burden of having to learn and apply searching languages. Consequently, your instructional sessions can emphasize using the facet analysis to craft the search rather than learning searching languages. In table 14.4, users' underlined words and phrases or their restated words and phrases are the search terms that they should enter in their selected database(s). If the search system offers separate search boxes connected with the Boolean AND operator, advise the user to enter one underlined or restated word or phrase per search box. If the user runs out of search boxes, rows can be added, or the user can begin the search with the most important main ideas. If results are too broad, too numerous, or both, entering the remaining main ideas may help. Even better than entering these remaining terms may be browsing the results clusters in search of subjects or classification captions that are satisfactory
representations of one or more of the remaining elements of the topic and applying them. In figure 14.1, the user enters the search terms collapse and soviet union in the single box on the usa.gov search interface to retrieve government publications about their topic. The user may consider adding quotation marks around the search terms. Doing so may stop search systems from automatic stemming, which retrieves plural and singular forms and forms bearing different word endings (e.g., -ed, -ing, -ation). In this case, adding quotation marks around the search term "soviet union" may be warranted to limit retrieval to this particular place only. In a search for the going to college query, it makes sense to omit quotation marks from the restated search term behavior change so that retrieval extends to such phrases as changes in behavior, behavioral changes, and changing one's behavior.
Figure 14.1 Simple Boolean search using AND. Source: USA.gov, https://www.usa.gov/
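For readers who like to see the mechanics behind this advice, here is a minimal sketch in Python of the quoting behavior described above. The stemmer and the matching logic are deliberately crude stand-ins for illustration only, not a description of how USA.gov or any real search engine works:

    # Toy illustration: unquoted terms are stemmed before matching, while a
    # quoted query must appear as an exact, adjacent phrase.

    def stem(word):
        """Crude suffix stripping, standing in for a real stemmer."""
        for suffix in ("ation", "ing", "ed", "es", "e", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word

    def matches(document, query, exact_phrase=False):
        """Return True if every query term (or the exact phrase) occurs."""
        doc_words = document.lower().split()
        if exact_phrase:
            query_words = query.lower().split()
            n = len(query_words)
            return any(doc_words[i:i + n] == query_words
                       for i in range(len(doc_words) - n + 1))
        doc_stems = {stem(w) for w in doc_words}
        return all(stem(w) in doc_stems for w in query.lower().split())

    doc = "changes in behavior after starting college"
    print(matches(doc, "behavior change"))                     # True, via stemming
    print(matches(doc, "behavior change", exact_phrase=True))  # False, phrase absent

Note how quoting behavior change would hide this relevant document, which is exactly why the text above recommends omitting the quotation marks for that query.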
If the database offers a single search box, show students how to switch to the advanced interface; the advanced search form for the National Agricultural Library's AGRICOLA database is shown in figure 14.2. If no advanced search screen is available, the user should enter search terms with the AND operator between the terms that represent each different element of the topic: genetic modification AND crops AND health. If results are relevant, then the user should pursue them. If the system produces too few or no relevant results, then advise them to reformulate the query. You can gauge whether this presents a good opportunity to introduce the Boolean OR. If that seems too complex for the audience you are working with, brainstorm other keywords. For instance, someone might suggest using genetically modified or GMO instead of genetic modification. Each search term can be tried separately, if you want to avoid explaining the Boolean OR, the use of parentheses for nesting, or the different order in which Boolean operators are processed in different systems. Reformulated searches using synonyms in a series of queries could include the following:

"gmo" AND crops AND health
"genetically modified" AND crops AND health
"genetically modified" AND corn AND health

If there aren't enough relevant results and you know you're using a database appropriate for the topic, you, as an expert, would turn to browsing the database's thesaurus (if it has one) to find subject descriptors. For many participants, learning what a thesaurus is and when and how to use it can be invaluable. Do consider what's involved before you add it to your outline or syllabus, however. Think of the actions that must take place: spotting a link to the database's thesaurus on the search system's interface; browsing the thesaurus for appropriate search terms; interpreting term-relationship designations; understanding which of the listed terms are authorized descriptors; and choosing the descriptors that most closely match the seeker's intentions. Furthermore, thesauri are not standardized
across all databases; they designate term relationships in different ways, and they are implemented differently in search systems.
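To give a feel for what a thesaurus encodes, here is a minimal sketch in Python. The relationship labels (BT, NT, RT, USE) follow common thesaurus conventions, but the entries themselves are invented for illustration:

    # A toy thesaurus fragment. USE points from an unauthorized term to the
    # authorized descriptor; BT/NT/RT are broader, narrower, related terms.
    thesaurus = {
        "alternative energy": {"USE": "renewable energy"},
        "renewable energy": {"BT": ["energy sources"],
                             "NT": ["solar energy", "wind power"],
                             "RT": ["energy policy"]},
    }

    def authorized(term):
        """Follow USE references until an authorized descriptor is reached."""
        entry = thesaurus.get(term, {})
        while "USE" in entry:
            term = entry["USE"]
            entry = thesaurus.get(term, {})
        return term

    print(authorized("alternative energy"))  # renewable energy

The lookup is trivial once the relationships are stored; the hard part, as the text notes, is that every vendor stores and displays them differently.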
Figure 14.2 Advanced search screen with three boxes for terms. Source: National Agricultural Library, https://agricola.nal.usda.gov/vwebv/searchAdvanced

Whether you should teach end users about the role of the thesaurus in online searching is a judgment call because of the complexity that the thesaurus adds to the searching process. Before making the commitment, look for openings in class discussions with undergraduate and graduate students to introduce the thesaurus in the subject-specific database you are demonstrating and they are using. If students respond well, then you may want to include thesaurus use in your classes for students majoring in areas that have a subject-specific database with a thesaurus. The sixth query in tables 14.2 to 14.4 names a known item. Remind users about all the thinking they did to typecast their queries (step 3). Advise them that entering search terms for a query typecast as a known item should be limited to one facet at a time,
either the most specific facet or the one likely to produce fewer retrievals in a search of their selected database. The choice between book and narrow road to the deep north is simple: the former retrieves thousands of books and the latter retrieves the one book entitled Narrow Road to the Deep North. Show users how advanced interfaces allow them to restrict searches to certain fields on the database record. In this case, it makes sense to restrict the search to the title field because the query bears a book's title. If the system offers a choice between browsing or searching the field that interests them, show users how browsing displays titles in the alphabetical neighborhood of their entered title, allowing them to scan for and select their desired title just in case the title they entered isn't exactly right.

Step 6: Entering the Search and Responding Strategically

Recommend the à la carte edition of the building block search strategy for multifaceted topics. This means entering one salient search term per facet. The à la carte edition of the building block search strategy isn't much different from how users conduct searches by inputting keywords in a search box. À la carte adds discipline and deliberation to the search-term selection and entry tasks, and you can help users with this search strategy in the following ways (a short sketch of the resulting search statements follows this list):

- Advise users to pause and think about their interests in a systematic way instead of rushing to enter keywords into the search system at hand.
- Urge users to be forthcoming about the full scope of their interests.
- Ask users to express their interests in a sentence. Then ask users to extract the main ideas from the sentence.
- Lastly, work with users to repackage their complex main ideas into distinctive facets and/or consolidate wordy ones into shorter adjectival phrases or simple nouns.
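Here is the promised sketch: a small Python function that turns a facet analysis into a Boolean search statement, ANDing the facets together and ORing synonyms within a facet. The synonym lists are illustrative, and real systems vary in the syntax they accept:

    # Combine facets with AND; combine synonyms within a facet with OR.
    # Multiword terms are quoted so they are searched as phrases.

    def build_statement(facets):
        """`facets` is a list of lists: one inner list of terms per facet."""
        groups = []
        for synonyms in facets:
            terms = ['"{}"'.format(t) if " " in t else t for t in synonyms]
            group = " OR ".join(terms)
            # Parentheses keep ORed synonyms together, since systems differ
            # in the order in which they process Boolean operators.
            groups.append("({})".format(group) if len(terms) > 1 else group)
        return " AND ".join(groups)

    # The genetically modified crops query from table 14.4:
    facets = [["genetic modification", "genetically modified", "gmo"],
              ["crops", "corn"],
              ["health"]]
    print(build_statement(facets))
    # ("genetic modification" OR "genetically modified" OR gmo)
    #     AND (crops OR corn) AND health

The à la carte edition corresponds to passing a single term per facet; the ORed synonyms come into play when a first attempt retrieves too little.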
The pearl growing search strategy helps users whose searches produce too few results to find additional results. To execute this strategy, users must scrutinize the results they're able to find, looking for synonyms for their search terms. For example, retrievals for the crops and cyanobacteria queries use the terms transgenic plants and renewable energy instead of genetic modification and alternative energy. Substituting the former terms for the latter terms is likely to increase the number of relevant results, because the former are the controlled vocabulary terms that the database uses to refer to the latter. Three other approaches for increasing retrievals are find-like searches, backward-chaining, and forward-chaining. Find-like isn't available in all search systems and databases, but when it is, it requires a minimum of effort on the user's part. They identify a relevant source and click on the accompanying find-like link to trigger the search system to find more like it. In some systems, like articles are listed in a column next to an article displayed on the screen, making it easy to choose an article without having to search for it. In a citation database, such as Scopus or Dimensions, clicking on an article's references initiates backward-chaining to find earlier works on the same topic, while clicking on the list of articles that have cited it initiates forward-chaining to identify subsequent publications. Find-like and backward- and forward-chaining make it possible to find information on a topic using a single relevant article as an entry point rather than crafting and entering a search statement.

Step 7: Displaying Results and Responding Tactically

Much of the information users retrieve through their library's website has been vetted by librarians who choose the databases listed there and by publishers who decide on the content added to databases. Databases are costly, so librarians consider many factors when making subscription decisions (Johnson 2018). Librarians also evaluate the books, audiobooks, films, and other materials they
purchase for the library. Helping them make purchase and subscription decisions are various trade publications, such as Publishers Weekly, Choice, Booklist, and Kirkus Reviews, which are dedicated almost entirely to reviews of newly published resources. When your library's users choose vetted resources, evaluating their credibility might take a backseat to assessments of their relevance. Asked to provide evidence for why a source is relevant, users point to specific subject matter that the source contains or to the overlap between their topic of interest and a particular source's contents (Markey, Leeder, and Rieh 2014). When users have difficulty assessing relevance, they usually don't have a firm grasp on their topic of interest (Zhitomirsky-Geffet, Bar-Ilan, and Levene 2017). That relevance is a moving target, subject to users' knowledge about a topic and their exposure to newly retrieved sources, is built into the berrypicking model of information retrieval. Automated relevance ranking is based on proprietary algorithms that change regularly. As a consequence, it's helpful to advise users to notice the order in which results are displayed, whether newest first or relevant first. And help them see the value in assessing more than the first few results. Finally, talk to users about their unspoken desire to find the perfect source that matches exactly what they want to say. Rarely does such a source exist. Instead, users must synthesize the information they find and draw on qualified others for insight, knowledge, and understanding to complete an assignment, form an opinion, make a decision, or take decisive action. When users display results, explain the function of surrogate records in a database. Briefly identify the information on the surrogate: the metadata elements that are used to construct the citations for the sources they use in term projects; the words and phrases shown in the title, author-supplied keywords, abstracts, and subjects fields that can be used in a reformulated query; and how to download the citation and full text of an item in their set of results. If the full text is a print book or other tangible item in the library's physical collection, help them to interpret location information such as call numbers or sections organized by genre or author names.
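For those who want to picture a surrogate record concretely, here is a minimal sketch in Python. The field names and the sample record are hypothetical; every database defines its own metadata schema:

    from dataclasses import dataclass, field

    @dataclass
    class SurrogateRecord:
        title: str
        authors: list
        year: int
        source: str          # journal, book, or other venue
        abstract: str = ""   # summary; a place to mine extra keywords
        subjects: list = field(default_factory=list)  # descriptors for requerying
        doi: str = ""        # unique identifier for known-item lookups

        def citation(self):
            """Assemble a rough citation string from the metadata elements."""
            return '{}. {}. "{}." {}.'.format(
                ", ".join(self.authors), self.year, self.title, self.source)

    record = SurrogateRecord(
        title="Cyanobacteria as a Source of Renewable Energy",
        authors=["A. Researcher"],
        year=2020,
        source="Journal of Example Studies",
        subjects=["renewable energy", "cyanobacteria"],
    )
    print(record.citation())

The point of the sketch is simply that the same stored elements serve several purposes at once: building citations, suggesting new search terms, and identifying the item for retrieval.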
Figure 14.3 PubMed results screen with filters for refining the search. Source: PubMed, https://pubmed.ncbi.nlm.nih.gov/

The results page itself should be explained as well, since it may have a lot of information to interpret, as shown in figure 14.3. Clusters for filtering results, when they are available, provide users with a means of refining the search using non-subject criteria such as date of publication, language, or genre. Subject filters can help students find articles tagged with subject descriptors and classification codes (if these are used in the database) without necessarily needing to know how subject indexing and classification enhance knowledge organization and retrieval. Benefiting most from results page filters are seekers with one-facet queries. Encourage them to browse subject and classification clusters for interesting subtopics, especially if their research is for a course paper.
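Conceptually, the filters on a results page like the one in figure 14.3 simply discard records that fail the criteria the user has applied. A minimal sketch, with hypothetical field names and records:

    def filter_results(records, after_year=None, doc_type=None, language=None):
        """Keep only records that satisfy every filter the user applied."""
        kept = []
        for r in records:
            if after_year is not None and r["year"] < after_year:
                continue
            if doc_type is not None and r["type"] != doc_type:
                continue
            if language is not None and r["language"] != language:
                continue
            kept.append(r)
        return kept

    results = [
        {"title": "Example article", "year": 2021,
         "type": "peer-reviewed article", "language": "English"},
        {"title": "Example review", "year": 2015,
         "type": "book review", "language": "English"},
    ]
    # Only the 2021 peer-reviewed article survives these two filters:
    print(filter_results(results, after_year=2018, doc_type="peer-reviewed article"))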
TEACHING TIPS

Eventually, you will find yourself teaching library users in a variety of venues, and you will want to do your best. Here are teaching tips to keep in mind as you prepare for the big moment:

Be the epitome of organization. Compose a syllabus for half- and semester-long classes and workshop series. Distribute an agenda that enumerates major class-period activities. If you lecture, begin with an outline that summarizes the content you expect to cover. Review your lectures in advance, making sure you haven't crammed too much into the allotted time period and your content is ordered logically.

Know your audience. Do some research about your audience in advance. When an instructor invites you to a class, interview the instructor in advance to learn about the course, such as the assignment that is the impetus for your presentation to students and the instructor's special instructions about the sources and databases students are expected to use. When preparing for a workshop, check the evaluation forms that workshop attendees completed the last time the workshop was offered to see who they were and their suggestions for improvements.

Don't overwhelm users. Limit your teaching objectives to what you can accomplish in the time allotted, leaving ample time for questions, evaluations, and minimal deviations from your lesson plan. Librarians always want to be comprehensive, and that tendency can mean overwhelming users with too many relevant resources. If you try to include too much information, you'll speak faster, which will make it harder for students to comprehend what you are trying to convey. If faculty assign you more than you can cover, be frank with them, negotiating how much and what you can cover in the allotted time period.

Substitute active learning for lecturing and demonstrations. Online searching is natural for active learning because users do online searching. They interact with an online system that produces results instantaneously, allowing them to experiment in a penalty-free environment and save what they do for postsearch debriefings and analyses. Whether you are teaching young people or older adults, devise situations in which they experience online searching firsthand and have opportunities to share their observations about their searches and search results with others.

Be patient. If you ask attendees a question or ask for a volunteer to share their online searching experiences, wait for an answer. People hesitate because they know they'll feel bad if they give a wrong answer. Sometimes it takes everyone in the class a minute to process the question and develop what they want to say. Get comfortable with the silence while you wait for the first person to speak.

Admit when you don't know the answer to someone's question. Promise to follow up with the answer at the next class meeting or via email. If you don't have time to follow up, give everyone one minute to find an answer right then and there, or in a credit-bearing course, offer extra credit to the first person who messages you the right answer. Always commend the inquirer whose question has you stumped.

Expect the unexpected. The database, search system, or user-authentication system may be down just when you need it. The projector may break. A fire drill may interrupt. When you are planning the session, workshop, or class, give some thought to back-up plans before you need them. If you're planning a live demonstration of a website, for instance, you can prepare a slideshow with a few screenshots you can use if technical issues make a live demo impossible.

Reflect on the assistance and instruction you offer others. Consider how you greet them, how you elicit what they really want, how you put your knowledge of the information universe to work to answer their questions. Review reference interviews that you think went really well or really badly. With regard to the latter, be honest with yourself about what you did, what you wish you would have done, and how you'll respond more effectively to a similar situation in the future. Ask students and workshop attendees to complete brief evaluations to help you identify what works and what you need to work on.

Ask your colleagues. It's impossible to know everything about online searching, teaching, and information literacy. When it comes to interpersonal interactions, group dynamics, in-class policies, and the like, check with your more experienced colleagues because they've probably encountered much the same challenges and experimented with and instituted solutions that work. When you are new to the job, your colleagues will expect you to ask questions and seek their advice. You may want to ask a trusted colleague to sit in on one of your sessions and offer feedback on the organization and presentation of the material.

Prepare in advance, and practice, practice, practice! Preparing well in advance gives you time to create and update material and to become adept at using instructional technology. Time your practice sessions so you'll know you'll be able to cover the material in the allotted time. Take advantage of every opportunity to make presentations so that you become accustomed to speaking to groups, expressing yourself extemporaneously, answering questions, handling the unexpected, and much more. Channeling nervousness into preparation can be a positive way to handle the jitters. Consider the nervousness you feel before teaching as a form of energy that you can tap into when you enter the physical or virtual classroom.
QUESTIONS
1. The head of the public library where you work has asked you to develop an hourlong workshop that teaches high school students the basics of using the Gale General OneFile and Gale Academic OneFile (or similar) databases to find sources for term papers and speeches. Their assignments usually require three to five authoritative sources, not including encyclopedias. List the main points you want them to leave the workshop knowing. In other words, what are the learning objectives? Explain your reasoning regarding the learning objectives you listed. How many did you list? How likely is it that you can cover the information sufficiently in a one-hour workshop? Imagine being in the workshop and realizing about halfway through that you're going to run out of time before you cover everything. What would you leave out and why?

2. You did such a great job on the workshop for high school students that the head of the public library is asking you to teach a two-hour in-service workshop for middle and high school teachers, focusing on the same databases. To what extent would your learning objectives change for this group? If one of the teachers contacted you a few weeks later for help with a lesson plan that includes having students search for information, what would you hope the teacher remembered from the workshop so you wouldn't have to repeat it at this follow-up interaction?

3. For the first two questions, consider where your learning objectives fit in the seven-step online searching process. How much detail about the steps do you think is necessary to accomplish the learning objectives? At what point might you risk overwhelming the participants with too much information?
SUMMARY
The seven-step online searching process is the framework for this chapter's recommendations about searching to learn and learning to search. The search process is replete with opportunities to learn, for both the information seeker and the intermediating librarian. The learning begins with the very first step in the process, the reference interview, when the librarian learns about the seeker's information needs and the seeker learns about the resources that may be useful. Stay alert for teachable moments when a seeker signals readiness to learn more about a resource or an approach you suggest for finding sources and answers. Always close the interview with an invitation to contact you or your colleagues anytime in the future, either for follow-up help or for assistance with a new query. Sustained teaching events, such as the one-shot information-literacy session, a module in an online course, the standalone workshop or workshop series, and half- or full-semester courses, allow for more detail and depth than do spontaneous teachable moments during a reference encounter. Such events may warrant the preparation of an outline or syllabus, lesson plans, and teaching materials. They require planning and practice, but are invaluable for participants. Don't be surprised if you find you understand the search process more fully after teaching it, or parts of it, to others.
REFERENCES

American Library Association. 1989. Presidential Committee on Information Literacy: Final Report. http://www.ala.org/acrl/publications/whitepapers/presidential.
Asher, Andrew D. 2011. "Search Magic: Discovering How Undergraduates Find Information." Paper presented at the American Anthropological Association meeting, Montreal, Canada, November. http://www.erialproject.org/wp-content/uploads/2011/11/Asher_AAA2011_Search-magic.pdf.
Association of College and Research Libraries (ACRL). 2000. Information Literacy Competency Standards for Higher Education. https://crln.acrl.org/index.php/crlnews/article/view/19242/22395.
ACRL. 2016. Framework for Information Literacy for Higher Education. https://www.ala.org/acrl/sites/ala.org.acrl/files/content/issues/infolit/framework1.pdf.
Bates, Marcia J., Deborah N. Wilde, and Susan Siegfried. 1995. "Research Practices of Humanities Scholars in an Online Environment: The Getty Online Searching Project Report No. 3." Library and Information Science Research 17 (Winter): 5–40.
Bloom, Beth S., and Marta Deyrup. 2012. "The Truth Is Out: How Students REALLY Search." Paper presented at the Charleston Library Conference, Charleston, SC, November 7–10.
Boss, Stephen C., and Michael L. Nelson. 2005. "Federated Search Tools: The Next Step in the Quest for One-Stop-Shopping." The Reference Librarian 44, nos. 91–92: 139–60.
boyd, danah. 2015. "Online Reflections of Our Offline Lives." On Being. Podcast audio. April 9. https://soundcloud.com/onbeing/danah-boyd-online-reflections-of-our-offline-lives.
Brady, Jennifer M., and Susan Kromrie. 2022. "Creating a Self-Paced Library Orientation and Information Literacy Module: Providing Access to Library Resources at the Point of Need." College & Research Libraries News 83, no. 7 (July/August): 301–5.
Connaway, Lynn Silipigni, Timothy J. Dickey, and Marie L. Radford. 2011. "If It Is Too Inconvenient I'm Not Going After It: Convenience as a Critical Factor in Information-Seeking Behaviors." Library & Information Science Research 33, no. 3: 179–90.
De Rosa, Cathy, Joanne Cantrell, Janet Hawk, and Alane Wilson. 2005. College Students' Perceptions of Libraries and Information Resources: A Report to the OCLC Membership. Dublin, OH: OCLC.
Fast, Karl V., and D. Grant Campbell. 2004. "'I Still Like Google': University Student Perceptions of Searching OPACs and the Web." Proceedings of the ASIS Annual Meeting 41: 138–46.
Griffiths, Jillian, and Peter Brophy. 2005. "Student Searching Behavior and the Web: Use of Academic Resources and Google." Library Trends 53, no. 4: 539–54.
Head, Allison J., and Michael B. Eisenberg. 2009. "What Today's College Students Say about Conducting Research in the Digital Age." http://www.projectinfolit.org/uploads/2/7/5/4/27541717/2009_final_report.pdf.
Head, Allison J., and Michael B. Eisenberg. 2010. "How College Students Evaluate and Use Information in the Digital Age." http://www.projectinfolit.org/uploads/2/7/5/4/27541717/pil_fall2010_survey_fullreport1.pdf.
Holman, Lucy. 2011. "Millennial Students' Mental Models of Search: Implications for Academic Librarians and Database Developers." The Journal of Academic Librarianship 37, no. 1: 19–27.
Jansen, Bernard J., Amanda Spink, and Tefko Saracevic. 2000. "Real Life, Real Users, and Real Needs: A Study and Analysis of User Queries on the Web." Information Processing and Management 36, no. 2: 207–27.
Johnson, Peggy. 2018. Fundamentals of Collection Development and Management, 4th ed. Chicago: American Library Association.
Joint, Nicholas. 2010. "The One-Stop Shop Search Engine: A Transformational Library Technology? ANTAEUS." Library Review 59, no. 4: 240–48.
Kuhlthau, Carol C. 1985. "A Process Approach to Library Skills Instruction." School Library Media Quarterly 13, no. 1: 35–40.
Kuhlthau, Carol C. 1989. "Information Search Process: A Summary of Research and Implications for School Library Media Programs." School Library Media Quarterly 18, no. 5: 19–25.
Kvavik, Robert. 2005. "Convenience, Communications, and Control: How Students Use Technology." In Educating the Net Generation, edited by Diana G. Oblinger and James L. Oblinger, 7.1–7.20. https://www.educause.edu/ir/library/pdf/pub7101.pdf.
Lamphere, Carly. 2017. "Research 3.0." Online Searcher 41, no. 3 (May/June): 30–33.
Lawrence, Kate. 2015. "Today's Students: Skimmers, Scanners and Efficiency-Seekers." Information Services & Use 35, nos. 1–2: 89–93.
Lierman, Ashley, and Ariana Santiago. 2019. "Developing Online Instruction According to Best Practices." Journal of Information Literacy 13, no. 2: 206–22.
Markey, Karen, Chris Leeder, and Soo Young Rieh. 2014. Designing Online Information Literacy Games Students Want to Play. Lanham, MD: Rowman & Littlefield.
Matysek, Anna, and Jacek Tomaszczyk. 2021. "In Quest of Goldilocks Ranges in Searching for Information on the Web." Journal of Documentation 78, no. 2: 264–83.
Milkova, Stiliana. 2014. "Strategies for Effective Lesson Planning." Center for Research on Teaching and Learning, University of Michigan. http://www.crlt.umich.edu/gsis/p2_5.
Mischo, William H., and Jounghyoun Lee. 1987. "End-User Searching of Bibliographic Databases." Annual Review of Information Science & Technology 22: 227–63.
Mizrachi, Diane. 2010. "Undergraduates' Academic Information and Library Behaviors: Preliminary Results." Reference Services Review 38, no. 4: 571–80.
Oxford University Press. 2017. "Navigating Research: How Academic Users Understand, Discover, and Utilize Reference Resources." https://global.oup.com/academic/content/pdf/navigatingresearch.pdf.
Pan, Bing. 2007. "In Google We Trust: Users' Decisions on Rank, Position, and Relevance." Journal of Computer-Mediated Communication 12, no. 3: 801–23.
Perruso, Carol. 2016. "Undergraduates' Use of Google vs. Library Resources: A Four-Year Cohort Study." College & Research Libraries 77, no. 5 (September): 614–30.
Rieh, Soo Young, and Brian Hilligoss. 2008. "College Students' Credibility Judgments in the Information-Seeking Process." In Digital Media, Youth, and Credibility, edited by Miriam J. Metzger and Andrew J. Flanagin, 49–72. Cambridge, MA: MIT Press.
Rieh, Soo Young, Kevyn Collins-Thompson, Preben Hansen, and Hye-Jung Lee. 2016. "Towards Searching as a Learning Process: A Review of Current Perspectives and Future Directions." Journal of Information Science 42, no. 1: 19–34.
Ross, Catherine Sheldrick, Kirsti Nilsen, and Marie L. Radford. 2009. Conducting the Reference Interview: A How-to-Do-It Manual for Librarians. New York: Neal-Schuman.
Thomas, Susan, Eamon Tewell, and Gloria Willson. 2017. "Where Students Start and What They Do When They Get Stuck: A Qualitative Inquiry into Academic Information-Seeking and Help-Seeking Practices." The Journal of Academic Librarianship 43: 224–31.
Turner, Nancy B. 2011. "Librarians Do It Differently: Comparative Usability Testing with Students and Library Staff." Journal of Web Librarianship 5, no. 4: 286–98.
University of Texas at Austin. 2018. "How to Generate Search Terms." https://apps.lib.utexas.edu/apps/libraries/key/nlogon/.
Vakkari, Pertti. 2016. "Searching as Learning: A Systematization Based on Literature." Journal of Information Science 42, no. 1: 7–18.
von Hoyer, Johannes, Anett Hoppe, Yvonne Kammerer, Christian Otto, Georg Pardi, Markus Rokicki, Ran Yu, Stefan Dietze, Ralph Ewerth, and Peter Holtz. 2022. "The Search as Learning Spaceship: Toward a Comprehensive Model of Psychological and Technological Facets of Search as Learning." Frontiers in Psychology 13 (March): 1–19.
Zhang, Haotian, Mustafa Abualsaud, and Mark D. Smucker. 2018. "A Study of Immediate Requery Behavior in Search." In Proceedings of the 2018 Conference on Human Information Interaction & Retrieval, 181–90.
Zhitomirsky-Geffet, Maayan, Judit Bar-Ilan, and Mark Levene. 2017. "Analysis of Change in Users' Assessment of Search Results over Time." Journal of the Association for Information Science and Technology 68, no. 5: 1137–48.
SUGGESTED READING

Badke, William. 2015. "Ten Things We Really Should Teach about Searching." Online Searcher 39, no. 3 (May/June): 71–73.
Wong, Melissa A. 2019. Instructional Design for LIS Professionals: A Guide for Teaching Librarians and Information Science Professionals. Santa Barbara, CA: Libraries Unlimited.
ANSWERS

1. The head of the public library where you work has asked you to develop an hourlong workshop that teaches high school students the basics of using the Gale General OneFile and Gale Academic OneFile (or similar) databases to find sources for term papers and speeches. List the main points you want them to leave the workshop knowing.

First, I'd want high school students to understand how the kinds of material they will find in these databases differ from the kinds of material they find on the web. They need to know that these databases contain the full text of articles published in newspapers and periodicals that are not available for free on the web. Second, they should be able to identify the different facets of their chosen topics. Third, they should be able to identify at least one keyword for each facet of their topic. Fourth, they would learn where to input their keywords and run the search. Fifth, they would learn how to access a full text and cite the source if they use it in their paper. The second and third objectives could be developed during the session by letting students work briefly on identifying facets and keywords. If that took too long, however, I might not worry too much about showing them exactly where to input keywords, since they are
familiar with search boxes and search buttons from using web search engines. 2. To what extent would your learning objectives change if you were doing a two-hour session for high school teachers? I would have the same objectives as for the students, with an additional one for the teachers: how to assess results. This would include understanding the difference between a newspaper, a popular magazine, a trade publication, and a scholarly journal. It would also include the ability to identify these different document types in sets of search results. The other part of assessing results I would include is skimming relevant results for other keywords to use in a revised search and/or using the subject filters on the results page. At that point, it would be useful to explain the difference between subject descriptors and natural-language keywords. 3. For the first two questions, consider where your learning objectives fit in the seven-step online searching process. My objectives involve part of step 4, conducting the facet analysis, but not creating the logical combination. High school students can do one search at a time without necessarily needing to use the Boolean operators. My objectives also involve step 5, but at the basic level of identifying useful keywords. Step 6 is represented in the objective to help teachers learn to identify additional keywords to use in reformulated searches, and step 7 is related to the workshop’s coverage of accessing full texts and learning how to cite sources.
15
Online Searching Now and in the Future
A number of factors impact the effectiveness and efficiency of online searching. Some are the product of search systems, some involve the expertise of the searcher, and some are entangled in the larger world of politics, business, science, and society. The search for useful information requires a complex set of knowledge and skills. It's important to understand how sources are rendered discoverable through indexing. It's necessary to master the use of features and functions built into search systems as you translate a query into a search statement that will retrieve relevant results. It's imperative that information seekers and intermediators assess results for relevance and revise search statements when necessary.
Chapter 15 considers some of the developments that are shaping search. One of the most pronounced is the emphasis on recall over precision. Another is the acceleration of open access to research publications and data. Open access is a product of scientific and other research practices, but it also drives research as data and findings are freely shared and widely distributed. A third development is the continuing scourge of misinformation, disinformation, and malinformation and the critical need to judge sources not only on their relevance but by their truthfulness, accuracy, and verifiability. Running throughout online searching—no matter how it continues to develop—is the tension surrounding relevance, a judgment that shifts over time with the system, the seeker, the intermediator, and the search process itself. The chapter concludes with an acknowledgment that this dynamic and challenging search landscape requires us to keep learning.
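Because this chapter repeatedly weighs recall against precision, it may help to restate the two measures formally. These are the standard information retrieval definitions rather than anything specific to one system:

```latex
\text{recall} = \frac{|\,\text{relevant items retrieved}\,|}{|\,\text{relevant items in the database}\,|},
\qquad
\text{precision} = \frac{|\,\text{relevant items retrieved}\,|}{|\,\text{items retrieved}\,|}
```

For example, if a database holds 100 records relevant to a query and a search retrieves 50 records, of which 25 are relevant, recall is 25/100 = 0.25 and precision is 25/50 = 0.5. A system tuned for recall surfaces more of the 100 relevant records, at the cost of burying them among irrelevant results.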
THE REIGN OF RECALL
Ever since Google introduced relevance-ranked results as the solution to the problem of retrieving overwhelming numbers of web pages, search systems have been moving toward retrieving more rather than fewer results. Even systems that order results by newest first rather than relevance favor recall over precision. Users don't have to know anything about information organization, search tools, or document types to get results. If the first results aren't useful, it's easy to try other keywords, and hard to imagine there are better ways to search. Examples of recall-oriented systems include the following:
library discovery systems that retrieve results from many nonobvious sources,
search system defaults designed to increase the number of results retrieved,
the display of brief records or snippets of items on the search screen before a search has been done, and
the addition of browser extensions to include open-access versions of articles.
Library Discovery Systems
Recall reigns in the web-scale discovery systems whose everything search boxes are featured on academic library home pages. Although the search box on a public library home page usually retrieves results only from the library catalog, the search box on an
academic library home page now retrieves results from many other sources in addition to the catalog. Knowing that most students begin their searches on Google, libraries have implemented web-scale discovery systems akin to one-stop shopping for academic information. Users enter their keywords into a Google-like search box, and the system searches an index of millions of items from the library’s catalog, e-book collections, licensed databases, institutional repositories, and the like. The results are surrogate records representing books, scholarly journal articles, trade and popular magazines, and other types of information. Depending on where the record comes from, it may include the titles of all the chapters in a book, an abstract of an article, and/or the full text of the item. The development of web-scale discovery systems is intended to help library users who don’t know where to start searching to find scholarly, professional, and educational information. The assumption is that it’s better to retrieve something, even something off topic, rather than nothing. Some systems label the mix of results clearly, but even with clear labels results can be confusing, especially for those whose only search experience is with Google. For students new to research, it can be difficult to distinguish between a record representing a book review, the book itself, or an article discussing the same topic as the book. Conversely, the longtime academic library user accustomed to searching the library catalog and one or two subject-specific databases may find the home page search box needlessly comprehensive and have difficulty finding where on a complex website to search only the catalog or only their favorite database. Libraries have seen an increase in downloads of full-text articles due to the installation of discovery systems (Levine-Clark, McDonald, and Price 2014). Undergraduate students seem well served by the ready retrieval of results, while graduate students seem to prefer discipline-specific databases that yield more precise results (Lundrigan, Manuel, and Yan 2016; Lee and Chung 2016). A midway point between the high recall of the library discovery system and the precision of a single subject-specific database is the ability to use a
single database vendor's platform to search its multiple databases at once.
Subject Indexing and Phrase Searching
Another indicator of the emphasis on recall involves subject indexing. Subject fields used to be phrase-indexed; if you knew the correct subject heading or multi-word descriptor, you'd type it in and the system would look for it character-by-character. In most systems, the subject field is now word-indexed, meaning that all records tagged with subject descriptors that contain one of the words in your search statement will be retrieved. For example, in the Political Science Complete database, inputting the word identity in the search box and using the drop-down menu to select the subject field will retrieve all records tagged with the descriptors identity (psychology), identity politics, LGBTQ+ identity, and group identity. If you're using Political Science Complete to research "identity politics," you'll have to be more precise by putting quotation marks around the phrase when using it as a subject descriptor. In some databases, even quotation marks no longer ensure that the system will search character-by-character from the opening quotation mark to the closing one. For example, the EBSCOhost search system is programmed to ignore stop words even if they are part of a phrase defined by quotation marks. A scholar specializing in the history of print culture will find that inputting "history of the book" in the default search box yields thousands of unexpected results. The high number of results comes from a combination of the system treating stop words as stop words even when they are enclosed in quotation marks and the system inserting an automatic N5 proximity operator between each word. In Academic Search Ultimate, the system is programmed to process the phrase search "history of the book" as history N5 book, yielding much higher recall than a strict phrase search statement would, but less than using the AND operator between the two words. (A toy sketch at the end of this section mimics both the word-indexing and the phrase-rewriting behaviors.) The phrase is a descriptor in this particular database, and launching the search from the
thesaurus will retrieve the most precise results, 582 as opposed to tens of thousands otherwise. Not all default search boxes are programmed to loosen the functionality of quotation marks placed around phrases for greater recall or treat a space between words as a proximity operator rather than the Boolean AND for greater precision (but not necessarily greater topical relevance). How subject indexes are constructed makes a difference as well. It used to be possible to assume that a search for a multi-word subject descriptor in the subject index would retrieve only the records that had been tagged with that exact subject descriptor. It used to be possible to assume that a search system recognized a space between two words as the Boolean AND. Now it's best not to assume anything about what a search system is programmed to do or, for that matter, how ranking algorithms determine the order of the items on the search results page. Help screens available while you are logged in to a database will likely tell you how search boxes, indexes, and other features function, or you can experiment a bit with searches and evaluate results to understand how retrieval works in a particular database.
Results on the Search Page
While Google maintains its minimalist single search box interface, Bing has opted for an interface replete with trending news and popular, quirky, or sensational stories; stock prices; sports scores; weather reports; lots of photographs, videos, and graphics; and ads. Databases more suited to scholarly research occupy a similar spectrum, from the minimal basic interface offering a single search box to the multiple boxes and presearch qualifiers on the advanced search screen. The latter is exemplified by databases such as PubMed and Taylor & Francis Online, both of which keep the emphasis on the search box but list trending or popular articles farther down on the page. At the extreme end of this spectrum is Dimensions, where the interface is a list of results, filters, and categories with a search box at the top. Some vendors
make it possible for libraries to choose which search interface to display, with school and public libraries opting for the basic search screen with a single box and academic libraries more often choosing the advanced search screen with multiple boxes and presearch qualifiers. Another way of thinking about these interfaces is to consider the relationship of searching to browsing. On the Dimensions interface, browsing is at least as important as searching. Since Dimensions is designed to help serious researchers find others in their areas of interest, browsing and clicking can be educational as users explore the structure of categories and their subcategories, decide which categories are relevant and should be revisited, and identify scholars to follow and cite. Consider also the interfaces that combine searching and browsing in databases designed for readers, such as NoveList and Goodreads. Users can learn about categories of literature as they browse for books to read and revisit their favorite categories by browsing or use them as qualifiers or filters in a keyword search. In effect, browsing can help users find authors and works that may not turn up in sets of search results because of the way the search statement was crafted. In that sense, browsing functions as a way to increase recall. But it also allows for precision as the user drills down through the subcategories to home in on narrower and more focused material. Interfaces have traveled quite a distance since Google introduced a search screen so spare that new users would see all the white space and wait for the (nonexistent) rest of the screen to load before attempting a search (Bosker 2012). The search interface that includes lists of articles on trending topics and paths for finding material, such as articles listed under subject categories, has, in a sense, already recalled items for display at the place where most users expect to do a search. Such an interface can by turns be efficient or time-consuming, depending on the information seeker’s particular quest and familiarity with their topic. But notice how helpful it is, when confronted with a crowded interface, to possess search skills and knowledge. An undergraduate
who can’t think of a topic for a term paper may appreciate being able to work from a list of trending news and popular articles. The information intermediary may even help a student decide what topic to research for an assignment. It may also be necessary to help a student recognize the difference between news stories and research articles and advertising. To move past the trending articles and undertake a more deliberate effort to find material, it’s helpful to understand search basics, such as choosing the best database, keywords, and Boolean operators. Filters, although meant to help users winnow high-recall results, present their own challenges. For document type, wording can be a bit confusing, especially on platforms that use the shorthand “academic journals” to mean articles in journals and “magazines” for articles published in magazines. There may be a tab with finer document-type distinctions, but the user needs to know what is meant by document types and which types to select for their project. Without understanding the basic concepts and jargon, an information seeker won’t know which filters to use, such as whether to browse the link labeled “Research Categories” or the one labeled “Researchers.” Even though it takes some effort to learn how to craft a good search statement, it can also take some effort to browse effectively and efficiently.
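Before leaving this section, here is the toy sketch promised earlier. It is only a caricature: the records, the stop-word list, and the rewriting rule are invented, and commercial systems such as EBSCOhost implement far more elaborate, proprietary logic than the quoted-phrase-to-N5 rewriting mimicked here:

```python
import re

STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the"}

# A handful of invented surrogate records, each tagged with subject descriptors.
RECORDS = {
    1: ["identity (psychology)"],
    2: ["identity politics"],
    3: ["group identity"],
    4: ["political participation"],
}

def words(text):
    """Lowercase a string and strip punctuation, keeping word tokens."""
    return re.findall(r"[a-z0-9+]+", text.lower())

def word_indexed_search(word):
    """Word indexing: retrieve every record whose descriptor contains the word."""
    return sorted(r for r, descriptors in RECORDS.items()
                  if any(word.lower() in words(d) for d in descriptors))

def phrase_indexed_search(phrase):
    """Phrase indexing: retrieve only records tagged with that exact descriptor."""
    return sorted(r for r, descriptors in RECORDS.items()
                  if phrase.lower() in [d.lower() for d in descriptors])

def rewrite_quoted_phrase(phrase, n=5):
    """Mimic dropping stop words and joining survivors with a proximity operator."""
    kept = [w for w in words(phrase) if w not in STOP_WORDS]
    return f" N{n} ".join(kept)

print(word_indexed_search("identity"))               # [1, 2, 3]
print(phrase_indexed_search("identity politics"))    # [2]
print(rewrite_quoted_phrase("history of the book"))  # history N5 book
```

In the word-indexed case, a single search word sweeps in every descriptor containing it, which is exactly why the subject search for identity described earlier retrieves records tagged with several different descriptors.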
OPEN-ACCESS RESULTS
Perhaps it isn't obvious, but another way to increase recall is to make sure that open-access versions of articles are included alongside paywalled versions in sets of search results. Such is the case with Google Scholar, where results provide links to paywalled versions and to open-access versions if they exist. Sometimes open access is the only option. Some commercial databases index open-access journals whose articles are open from the start and have
never been paywalled. Additionally, preprints or other forms of open-access research are uploaded to institutional and disciplinary repositories for easy discovery in the repository, in library discovery systems that include the library's institutional repository, and in Google Scholar results. To make sure information seekers find existing open-access versions of articles, browser extensions such as those offered by Unpaywall and Lean Library can be installed on personal and library computers. When Unpaywall finds an open-access version of an article, it will display an icon of an open lock, while Lean Library displays a pop-up alert when it finds an open-access article. When libraries install such browser extensions on their public-access computers, they make millions of open-access articles available to students and researchers and they save users from having to make an extra effort to locate open-access resources. Research studies have indicated that somewhere between 28 and 47 percent of scientific articles are published in some form of open access (Piwowar et al. 2018). A study comparing the percentages of open-access material indexed in the Dimensions and Web of Science databases found that Dimensions had a somewhat higher percentage, in part because its index includes more material from non-Western countries where open access has been more common (Basson et al. 2022). Open access to research papers will no doubt increase over time as governmental bodies continue to mandate that articles resulting from government-funded research be made freely available as soon as they are published. In 2022, the White House Office of Science and Technology Policy announced that all research publications funded by federal government agencies must be made freely available upon publication (Nelson 2022). This policy replaced one allowing publishers to embargo such publications for up to a year. The policy requires that the datasets underlying the research also be made publicly available. In effect, many more articles will be published at a high level of open access (gold), characterized by journal publishers making articles freely available on their websites (Clarke & Esposito 2022).
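The same open-access lookups the browser extensions perform can also be scripted. The sketch below uses Unpaywall's public REST API; the endpoint and the is_oa and best_oa_location fields follow Unpaywall's documentation at the time of writing, and the contact email is a placeholder you would replace:

```python
# A minimal sketch: ask Unpaywall whether an open-access copy of a DOI exists.
import requests

def find_open_access(doi, email="you@example.org"):
    """Return the URL of the best open-access copy of a DOI, or None."""
    resp = requests.get(f"https://api.unpaywall.org/v2/{doi}",
                        params={"email": email}, timeout=10)
    resp.raise_for_status()
    data = resp.json()
    if not data.get("is_oa"):
        return None
    best = data.get("best_oa_location") or {}
    return best.get("url_for_pdf") or best.get("url")

# Example: the Piwowar et al. (2018) study cited in this chapter.
print(find_open_access("10.7717/peerj.4375"))
```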
These open-access policies and publishing practices are closely tied to the open data and open science movement. As figure 15.1 indicates, open science depends on, among other things, openly available data and publications, the ability to measure research impact, technologies that make it easy for scientists to share their work, and public engagement (Ramachandran, Bugbee, and Murphy 2021). As the availability of open research publications and open data continues to expand, librarians will need to stay apprised of changes in researchers' sharing practices and the impact this may have on database selection and, more broadly, on organizing, indexing, and distributing scholarly knowledge. The availability of open-access material in journals and databases has become a feature that publishers advertise as a selling point, a further indicator that we can expect the proportion of open publications to increase. For example, Springer Nature (2021) reported it had published 1 million gold open-access research and review articles, making it the first publisher to reach that number. Those articles had been downloaded 2.6 billion times between 2016 and 2021.
Figure 15.1 Open science. Source: Ramachandran et al. 2021
In a commercial subscription database where the publisher has invested deeply in organizing and structuring information, developing and updating retrieval functions, and enhancing presearch qualifiers and postsearch filters, sophisticated search techniques will retrieve precise results that include paywalled and openly accessible publications. Open-access repositories and other database providers without the huge profits of the largest vendors typically don't have the same level of resources to invest and consequently don't offer
the same level of information organization and search sophistication. The lack of such functionality means recall will prevail over precision even within a single database, which can be a greater or lesser problem depending on the context. Some providers conceptualize their databases as collections because of the nature of the material indexed, and that also makes a difference in the organization of the material and the search features available to users. JSTOR is a good example. Although JSTOR functions as a database that indexes journal articles, it also has a number of open-access collections and provides a separate search box for them. JSTOR's open-access and public-domain content includes not only journals and e-books but also collections of material from library and museum partners, curated research reports from 140 policy institutes, and a digitized collection of alternative newspapers and periodicals, including prison newsletters (JSTOR 2022). Using the JSTOR collection of digitized magazines and newsletters, you can search for a particular magazine by putting the phrase "lesbian tide" in the search box; the results are instructive. The academic content filter to the left of the search results shows that 65 are journal articles and 53 are book chapters containing the phrase. Below that, the primary-source content cluster shows there are 218 serials; many of these are issues of The Lesbian Tide, a magazine that serves as a primary historical source for studying the LGBTQ movement of the mid-twentieth century. Some of the 218 serials are other publications that mention The Lesbian Tide. On this results screen, there are no filters for subject. Clicking on an issue of the magazine takes you to the digitized version, and to the left of that screen are links to the creators/authors whose work is published in the issue as well as links to the two JSTOR collections that the magazine is part of: "Feminist" and "LGBT." Clicking on the "Feminist" collection link takes the user to a search box for retrieving material from that collection's 2,857 items. In contrast to JSTOR's collection-oriented combination of searching and browsing, the open version of WorldCat displays a more traditional library approach to information organization. A
search for “lesbian tide” yields records for the magazine in print, microfilm, and online formats, as well as for an archival collection of material about feminists in Los Angeles (where the magazine was published), a commercially produced collection of primary sources, and books. There are a number of separate records for the print and microfilm formats, and clicking on the serial title for each one will display the cataloging metadata for the magazine itself (not individual issues), including the following Library of Congress Subject Headings: Women—Periodicals. Lesbian activists—California—Los Angeles—Periodicals. Lesbianism—Periodicals. Lesbians—California—Los Angeles—Periodicals. Lesbians—Periodicals. Neither JSTOR nor WorldCat represents the one best way to make the magazine findable. In JSTOR’s open-access version of The Lesbian Tide, it’s not possible to do a search for other magazines grouped under the heading “Lesbianism–Periodicals,” but it is possible to browse for similar magazines in the “Feminist” collection. Browsing can lead to serendipitous discovery, and it can be especially helpful to instructors who craft assignments that use browsing to help students learn new information and see the connections among primary sources. A traditional librarian might be disappointed that the lack of subject headings in a certain kind of database makes a certain kind of access impossible. But it’s more productive to accept that a lot of freely available collections and open-access repositories do not use all of the traditional cataloging and indexing techniques and to use your search knowledge and skills to quickly assess what is available for discovery in any given system. Ubiquitous Metrics
As the proportion of open-access publications indexed in databases increases, it may not be surprising that the largest database publishers and vendors have diversified their business models to encompass metrics and analytics services. Articles published in open-access journals, as well as preprints and other versions of otherwise paywalled articles submitted to repositories, typically have higher citation rates than articles behind paywalls (Piwowar et al. 2018). If subscription databases are indexing a lot of material findable elsewhere for free, and if that findable and free material gets more notice, then it makes sense that database publishers might want to develop research reputation and impact services to go along with their traditional retrieval services. Clarivate (2020) makes the relationship between open access and research impact clear, noting that its Web of Science core collection of databases provides access to 14 million open versions of articles and 1.7 million researcher profiles. Elsevier, the publisher of Scopus, ScienceDirect, and Mendeley, emphasizes saving researchers time and helping them document their impact. For example, a researcher who logs in to Mendeley will see items recently added by others in their field and can create and update their own profile in Mendeley Showcase (Elsevier 2022). As badges, pop-up score boxes, little metrics icons, and similar devices intrude on results screens, it’s important to understand what the scores mean. Although the metrics are designed for researchers, librarians can help users interpret the available ones as one gauge of an article’s importance to a topic they are researching. Other assessments of an article’s importance or relevance depend on the user’s purpose and interpretation. Colorful scoring icons can present a visual distraction, so it may be necessary to ignore them in order to focus on assessing the results in ways that serve the information seeker’s purpose.
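As one concrete illustration of what such scores measure, here is a small sketch of two metrics defined in this book's glossary, the h-index and the journal impact factor. The arithmetic follows those glossary definitions; the function names and sample numbers are invented:

```python
def h_index(citation_counts):
    """Largest h such that h papers have each been cited at least h times."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def impact_factor(citations_this_year, citable_items_prior_two_years):
    """Citations received this year to a journal's previous two years of
    articles, divided by the citable items it published in those years."""
    return citations_this_year / citable_items_prior_two_years

# Five papers cited 10, 8, 5, 4, and 3 times give an h-index of 4:
# four papers have at least four citations each, but the fifth has only 3.
print(h_index([10, 8, 5, 4, 3]))  # 4
# A journal cited 500 times in a year, with 200 citable items published in
# the previous two years, has an impact factor of 500 / 200 = 2.5.
print(impact_factor(500, 200))    # 2.5
```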
INFORMATION DISORDER
Let's take a moment to consider the larger context: habits that information seekers and intermediators can cultivate to improve the probability that their searches will be successful:
Cultivate an awareness of local, regional, national, and international current events.
Listen to, watch, and read accurate news.
Be willing to question received information and know what questions to ask.
Maintain a curious and open mind.
Understand that needed information most likely exists but that it might take some effort to find it, and that effort might include contacting someone knowledgeable for help.
The information intermediator may not have much control over the factors that shape the abilities and expectations of information seekers. Nevertheless, awareness of both search skills and contextual practices can help you make decisions about what and how to search given the seeker's purpose and topic; when to inject some instruction into the transaction; how far to go when helping users interpret results; and at what point to cease intermediation. When information seekers don't have the time, attention span, dedication, or knowledge to craft smart search statements, evaluate results, and revise searches for better results, they can be influenced by misinformation, disinformation, and malinformation. Wardle (2019) groups all three under the term information disorder. As figure 15.2 shows, misinformation can include mistakes made in a publication as well as misinterpretations of information by readers, viewers, and listeners. Disinformation, like propaganda, intentionally misleads and manipulates the audience. Malinformation includes deliberate attempts to damage the reputation of others through the publication of private information and/or the alteration of
authoritative content. Cooke (2021) points out that the racism that has pervaded, overtly and covertly, all forms of media should be understood as malinformation because of the harm it causes. Her books on libraries’ responsibilities to serve diverse populations (2016) and to address the manifestations of information disorder by helping students to develop literacy in all media (2018) suggest some of the ways in which information intermediators can and should develop critical cultural literacy. A sense of urgency regarding source credibility emerged in the wake of the misinformation and disinformation that accompanied the 2016 US presidential election. Facebook came under fire as the source of much of the mis- and disinformation that grabbed headlines during the campaign’s last three months (Silverman 2016). The results of a Pew survey revealed that everyday Americans contributed to the problem as a result of knowingly or unknowingly sharing a fake news story (Barthel, Mitchell, and Holcomb 2016). Naming “post-truth” the 2016 Word of the Year, the Oxford English Dictionary (2017) defined it as “relating to or denoting circumstances in which objective facts are less influential in shaping political debate or public opinion than appeals to emotion and personal belief.” About the post-truth era, Johnson (2017) adds that “it’s about picking the facts that you want to be true” (15). Laybats and Tredinnick (2016) elaborate further: We are living . . . in an age where politics no longer functions through rational discourse. The facts of the matter are of secondary importance to free-floating opinion. Instead, truth is replaced by demonstrative arguments that appeal to the electorate on a more visceral and emotional level. . . . It is characterized by a willful blindness to evidence, a mistrust of authority, and an appeal to emotionally based arguments often rooted in fears or anxieties. (204)
Figure 15.2 Components of information disorder. Source: Wardle 2019
Special projects have promise for developing best practices, standards, and strategies. Examples are The Trust Project, developing transparency standards so people can easily assess the news stories they read; News Literacy Project, giving middle and high school students tips, tools, and resources to learn how to tell fact from fiction; and Project Look Sharp, providing lesson plans, materials, training, and support to help teachers integrate media literacy into their classroom curricula. Linguistics faculty at the Open University have developed short animated videos on skillful interactions with social media (Seargeant and Tagg 2018). Marwick et al. (2021) offer a syllabus of themes, readings, audiovisual resources, and case studies that recognize history, identity, and legacy media as factors contributing to the current information disorder. Nonpartisan fact-checkers can provide insights on the latest trends, issues, and gone-viral sensations: FactCheck at the University of Pennsylvania; PolitiFact, a Pulitzer Prize–winning website of the
Tampa Bay Times; Snopes, a regular winner of various “Best of the Web” awards and recipient of two Webby Awards; and Retraction Watch for scientific papers. Responding to calls for state-mandated media literacy at all education levels is Washington State (Batchelor 2017; Jacobson 2017; McGrew et al. 2017; Padgett 2017). Illinois and New York State have mandated civic education at the high school level (Batchelor 2017; Colglazier 2017). Among the trends contributing to information disorder are: The decline of traditional news outlets (Quint 2017; Rochlin 2017). For the last two decades, newspapers across the country have failed due to lost revenue, and network news has been plagued by double-digit drops in believability ratings. With fewer active outlets, less news written by professionally trained journalists is reaching everyday people. The rise of social media outlets for news (Gottfried and Shearer 2016). A Pew survey reported that, by the time of the 2016 US election, the majority (62 percent) of Americans were getting their news from social media. Social media news is easy to access, free, and available at any time. The news that people read on their social media feeds is a curious amalgamation, some pieces passed on to them automatically by the profile that the outlet has on file for them and other pieces passed on by friends who want to share. On social media outlets, popularity determines what counts as news (Laybats and Tredinnick 2016; Ohlheiser 2018; Ojala 2017). In traditional media, headlines and story placement are indicators of a story’s importance. Social media has changed all that. Algorithms detect the buzz surrounding a story, the story passes from obscurity to trending, more people see the story in their newsfeeds and pass it on to friends, and very quickly the story goes viral. That a piece reaches viral status may be due to most people’s outrage and disbelief, but social media algorithms can’t interpret what’s behind the likes, shares, and comments that a piece receives.
The filter bubbles and echo chambers that create personalized worlds, where individuals are exposed to information from a select set of resources (Badke 2016; Laybats and Tredinnick 2016; Pariser 2011; Quattrociocchi, Scala, and Sunstein 2016; Spohr 2017). People want their opinions to be validated, so they are aided and abetted by personalization algorithms, and they see mainly or only news and other information that doesn't challenge their beliefs. If they want different perspectives, they have to actively look for them.
The ease of monetizing websites by publishing fake news (Kennedy 2017; Rochlin 2017; Seargeant and Tagg 2018). Making pennies per click from web-search and advertising services might not seem like a lot, but pennies quickly become dollars when your content is clickbait, provocative content that piques people's curiosity to the extent that they have to click on it. When you click on and share stories with your friends, website owners and advertisers make money. A particularly egregious case of monetizing malinformation is Alex Jones and his InfoWars broadcasts; he has made a fortune selling products on his site. In 2022, the parents of a child killed in the Sandy Hook Elementary School shooting won their defamation lawsuit against Jones, who had fueled a conspiracy theory that the shooting was staged so that the government could take away gun ownership rights. Jones's fans harassed and threatened the parents of Sandy Hook victims for years. The court ordered Jones to pay $4.1 million in compensatory damages and $45.2 million in punitive damages to the parents (Sweeny 2022).
While politics has fueled much of the current information disorder, deliberate misrepresentations have a long history. Information disorder affects spheres other than politics. Marketing firms and businesses providing products and services can be damaged by misinformation, disinformation, and malinformation spread on social media (Di Domenico et al. 2021). Scientists themselves can be among the culprits spreading misleading statements by working with industries to deny the damage caused
by smoking, pollution, and global warming (Oreskes and Conway 2010). Social media can distribute, and even generate, extremism, as has been the case with Facebook’s placement of ads connected to searches for white supremacist groups and its automatic generation of business sites for such groups mentioned in users’ profiles (Tech Transparency Project 2022). For information intermediators and seekers, the message is clear: understand how technology and social media create and distribute spurious and harmful information, and be alert to scams, lies, misrepresentations, and all forms of information disorder. Although librarians have a treasure trove of information literacy strategies to share with users, addressing the many and changing manifestations of information disorder is a core responsibility. Evaluating sources is only part of the picture. It’s also important to understand how ranking algorithms work, how search systems collect data about people, and how that data is stored and used. It means learning about clickbait techniques and how online advertising works on different platforms. We need to pay attention to how power operates through misinformation, disinformation, and malinformation. Librarians and information specialists, fact-checkers, journalists, educators, and researchers have a stake in understanding how information is manipulated and taking action within and across their professions to lessen the harm.
STUDENTS OF SEARCHING
So much about information seekers is variable: their knowledge of the topic that interests them; their ability to articulate their information need to someone else; their motivations; their history as readers; and the time and effort they are able or willing to put into finding the information they need or want.
Much about information resources is also variable. At the library's website, the databases that your institution's budget can support to satisfy the information demands of its user base may be many or few. Database search systems offer a wide variety of functions for searching, browsing, filtering, and selecting and managing results. At times, your access to a familiar database will end or the search system of an information resource you know well will be revamped or redesigned. The information industry itself is changing constantly. Just when you've mastered a search system, the vendor changes its interface or searching functionality. Or your library cancels a contract with a vendor for a better deal and different content from another vendor (as happened to one of this book's authors). Changes are disruptive and disconcerting for experts and end users alike. It's your job to keep up with changes, assessing what the vendor has done and updating what you say about the search system and its databases to users. Stay a student and keep learning! To avoid surprises and prepare for changes in advance, keep up to date on information-industry news and developments:
Read trade publications, such as Information Today and Online Searcher.
Attend conferences specific to online searching, such as Electronic Resources & Libraries, Computers in Libraries, and Internet Librarian.
Visit database vendor information booths at the annual meetings of organizations such as the American Library Association, the Special Libraries Association, the Medical Library Association, and the Association of Independent Information Professionals.
Participate in online discussion groups, such as ERIL-L and Liblicense-L.
Follow database publishers and vendors on social media sites.
Take advantage of free training offered by publishers and vendors, including live and recorded webinars, video tutorials,
and other media such as newsletters and blogs; and share what you learn with colleagues. You have prepared by taking a graduate-level course about online searching and by working your way through this book. You have unique experiences and expertise. Your commitment to the broad field of library and information science is also a commitment to your own continuing education throughout your career. We must also be aware of our tendency to think we know more than we do after conducting a bit of research (Ballantyne and Dunning 2022). When confronted with scientific information and data that run counter to their opinions, conspiracy theorists tell each other to “do your own research.” Yet studies have documented the human tendency to become overconfident in our knowledge and abilities early on in whatever research process we initiate. Ballantyne and Dunning offer this advice: “If you are going to do your own research, the research you should do first is on how best to do your own research.” By working through this book and taking related graduate courses, you are doing exactly that. Your knowledge, skill, and critical thinking as an information intermediator will help others to do that as well.
REFERENCES
Badke, William. 2016. "Evidence and the Doubter." Online Searcher 40, no. 2: 71–73.
Ballantyne, Nathan, and David Dunning. 2022. "Skeptics Say: 'Do Your Own Research.' It's Not That Simple." The New York Times, January 3. https://www.nytimes.com/2022/01/03/opinion/dyor-do-your-own-research.html.
Barthel, Michael, Amy Mitchell, and Jesse Holcomb. 2016. "Many Americans Believe Fake News Is Sowing Confusion." Pew
Research Center. https://www.pewresearch.org/journalism/2016/12/15/many-americans-believe-fake-news-is-sowing-confusion/.
Basson, Isabel, Marc-André Simard, Zoé Aubierge Ouangré, Cassidy R. Sugimoto, and Vincent Larivière. 2022. "The Effect of Data Sources on the Measurement of Open Access: A Comparison of Dimensions and the Web of Science." PLOS One 17, no. 3: e0265545. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0265545.
Batchelor, Oliver. 2017. "Getting Out the Truth: The Role of Libraries in the Fight against Fake News." Reference Services Review 45, no. 2: 143–48.
Bornmann, Lutz, and Rudiger Mutz. 2015. "Growth Rates of Modern Science: A Bibliometric Analysis Based on the Number of Publications and Cited References." Journal of the Association for Information Science and Technology 66, no. 11 (November): 2215–22.
Bosker, Bianca. 2012. "Google Design: Why Google.com Homepage Looks So Simple." Huffington Post (March 28): https://www.huffpost.com/entry/google-design-sergey-brin_n_1384074?ref=technology.
Clarivate. 2020. "Advancing Your Institution's Open Research Mission." https://clarivate.com/webofsciencegroup/wp-content/uploads/sites/2/2020/10/Advancing_your_institutions_open_research_mission_Infographic-1.pdf.
Clarke & Esposito. 2022. "Zero Embargo." The Brief 45 (August). https://www.ce-strategy.com/the-brief/zero-embargo/.
Colglazier, Will. 2017. "Real Teaching in an Era of Fake News." American Educator 41, no. 3: 10–11.
Cooke, Nicole A. 2016. Information Services to Diverse Populations: Developing Culturally Competent Library Professionals. Santa Barbara, CA: Libraries Unlimited.
Cooke, Nicole A. 2018. Fake News and Alternative Facts: Information Literacy in a Post-Truth Era. Chicago: ALA Editions.
Cooke, Nicole A. 2021. "Tell Me Sweet Little Lies: Racism as a Form of Persistent Malinformation," August 11.
https://projectinfolit.org/pubs/provocation-series/essays/tell-me-sweet-little-lies.html.
Di Domenico, Giandomenico, Jason Sit, Alessio Ishizaka, and Daniel Nunan. 2021. "Fake News, Social Media and Marketing: A Systematic Review." Journal of Business Research 124: 329–41.
Elsevier. 2022. Mendeley Reference Manager. https://www.mendeley.com/reference-management/reference-manager/.
Gottfried, Jeffrey, and Elisa Shearer. 2016. "News Use across Social Media Platforms 2016." http://www.journalism.org/2016/05/26/news-use-across-social-media-platforms-2016/.
Jacobson, Linda. 2017. "The Smell Test: In the Era of Fake News, Librarians Are Our Best Hope." School Library Journal 63, no. 1 (January): 24–28.
Johnson, Ben. 2017. "Information Literacy Is Dead: The Role of Libraries in a Post-Truth World." Computers in Libraries 37, no. 2 (March): 12–15.
JSTOR. 2022. "Open and Free Content on JSTOR and Artstor." https://about.jstor.org/oa-and-free/.
Kennedy, Shirley Duglin. 2017. "All the News That's (Un)Fit." Information Today 34, no. 1 (January/February): 8.
Laybats, Claire, and Luke Tredinnick. 2016. "Post Truth, Information, and Emotion." Business Information Review 33, no. 4: 204–6.
Lee, Boram, and EunKyung Chung. 2016. "An Analysis of Web-Scale Discovery Services from the Perspective of User's Relevance Judgment." The Journal of Academic Librarianship 42, no. 5: 529–34.
Levine-Clark, Michael, John McDonald, and Jason S. Price. 2014. "The Effect of Discovery Systems on Online Journal Usage: A Longitudinal Study." Insights 27, no. 3: 249–56.
Lundrigan, Courtney, Kevin Manuel, and May Yan. 2016. "'Pretty Rad': Explorations in User Satisfaction with a Discovery Layer at Ryerson University." College & Research Libraries 76, no. 1 (January): 43–62.
Marwick, Alice, Rachel Kuo, Shanice Jones Cameron, and Moira Weigel. 2021. Critical Disinformation Studies: A Syllabus. Center for Information, Technology, & Public Life (CITAP), University of North Carolina at Chapel Hill. https://citap.unc.edu/critical-disinfo.
McGrew, Sarah, Teresa Ortega, Joel Breakstone, and Sam Wineburg. 2017. "The Challenge That's Bigger Than Fake News: Civic Reasoning in a Social Media Environment." American Educator 41, no. 3: 4–9, 39.
Nelson, Alondra. 2022. "Memorandum for the Heads of Executive Departments and Agencies," August 25. https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-Access-Memo.pdf.
Ohlheiser, Abby. 2018. "Algorithms Are One Reason a Conspiracy Theory Goes Viral; Another Reason Is You." Intersect, Washington Post, February 22. https://www.washingtonpost.com/news/the-intersect/wp/2018/02/23/algorithms-are-one-reason-a-conspiracy-theory-goes-viral-another-reason-might-be-you/?utm_term=.af5a8e8d3310.
Ojala, Marydee. 2017. "Fake Business News." Online Searcher 41, no. 3 (May/June): 60–62.
Oreskes, Naomi, and Erik M. Conway. 2010. Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming. New York: Bloomsbury Press.
Oxford English Dictionary. 2017. Modified version published March 2022. s.v. "Post-Truth."
Padgett, Lauree. 2017. "Filtering Out Fake News: It All Starts with Media Literacy." Information Today 34, no. 1 (January/February): 6.
Pariser, Eli. 2011. The Filter Bubble: What the Internet Is Hiding from You. New York: Penguin Press.
Piwowar, Heather, Jason Priem, Vincent Larivière, Juan Pablo Alperin, Lisa Matthias, Bree Norlander, Ashley Farley, Jevin West, and Stefanie Haustein. 2018. "The State of OA: A Large-Scale Analysis of the Prevalence and Impact of Open Access Articles." PeerJ 6: e4375. https://doi.org/10.7717/peerj.4375.
Quattrociocchi, Walter, Antonio Scala, and Cass R. Sunstein. 2016. "Echo Chambers on Facebook," June 15. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2795110.
Quint, Barbara. 2017. "Honesty in the Digiverse." Online Searcher 41, no. 2 (March/April): 25–26.
Ramachandran, Rahul, Kaylin Bugbee, and Kevin Murphy. 2021. "From Open Data to Open Science." Earth and Space Science 8, no. 5: e2020EA001562. https://doi.org/10.1029/2020EA001562.
Rochlin, Nick. 2017. "Fake News: Belief in Post-Truth." Library Hi Tech 35, no. 3: 386–92.
Seargeant, Philip, and Caroline Tagg. 2018. "The Role of Information Literacy in the Fight against Fake News." Information Literacy (blog), CILIP Information Literacy Group, February 15. https://infolit.org.uk/the-role-of-information-literacy-in-the-fight-against-fake-news.
Silverman, Craig. 2016. "This Analysis Shows How Viral Fake Election News Stories Outperformed Real News on Facebook." Buzzfeed News, November 16. https://www.buzzfeed.com/craigsilverman/viral-fake-election-news-outperformed-real-news-on-facebook.
Spohr, Dominic. 2017. "Fake News and Ideological Polarization: Filter Bubbles and Selective Exposure on Social Media." Business Information Review 34, no. 3: 150–60.
Springer Nature. 2021. "Major Milestone Reached as Springer Nature Publishes One Million Open Access Articles." https://group.springernature.com/gp/group/media/press-releases/archive-2021/major-milestone-reached-one-million-oa-articles/19919044.
Sweeny, JoAnne. 2022. "Massive Verdict against Alex Jones Isn't Just Vindication. It's a Warning." Think, August 6. https://www.nbcnews.com/think/opinion/alex-jones-lawsuit-verdict-sandy-hook-case-scare-conspiracy-theorists-rcna41828.
Tech Transparency Project. 2022. "Facebook Profits from White Supremacist Groups," August 10. https://www.techtransparencyproject.org/articles/facebook-profits-white-supremacist-groups.
Wardle, Claire. 2019. "First Draft's Essential Guide to Understanding Information Disorder." https://firstdraftnews.org/articles/information-disorder-the-techniques-we-saw-in-2016-have-evolved/.
Glossary A&I database. An abstracting and indexing database that indexes surrogate records bearing complete citation information along with an abstract summarizing the source’s content. abstract. A concise and accurate summary of a source’s contents. adjacency operator. A proximity operator incorporated into a search statement to retrieve texts in which the search words are adjacent to one another. The operator also specifies whether word order matters. almanac. “A collection of facts, statistics, and lists” (Smith and Wong 2016, 477). altmetrics. Indicators of the reach and impact of a researcher’s publications beyond citations by other scholars, such as discussions on social media platforms, mentions in traditional media, references in policy documents, saves to reference management systems, and views and downloads on publishers’ websites. AND. The Boolean operator that is inserted into search statements to tell the system to retrieve only the surrogates and sources that contain all the terms connected with the operator. article-level metric. A measure that is used to evaluate the impact of an article in scholarly journals. See also h-index.
author. A person, corporate body, or family responsible for creating a source. author-bibliography search. A subject search in which users want to scan a list of sources that a particular person or entity wrote, edited, illustrated, or created because they like the creator’s work and they want to find more like it. authority control. The editorial process used to maintain consistency in the establishment of authorized index terms for names, titles, subjects, and other phenomena. authority file. A database of index terms, usually for names, titles, and/or subjects, that are authorized for use in the known-item and subject-heading fields of surrogate records. See also authority record. authority record. An entry that displays an index term’s syndetic structure and entry vocabulary, plus such related information as a scope note, the date that the index term was authorized in the controlled vocabulary, and a history note. See also authority file. authorized term. See index term. author keywords. The subject words and phrases that journal editors ask authors to add to their manuscripts when they submit them to the journal for review. Author-supplied keywords do not comply with the rule of specific entry or draw from a controlled vocabulary. automatic vocabulary assistance. A recall-enhancing feature of search systems that searches for terms related to user-entered terms based on statistical relationships of term occurrences in texts. backward-chaining. See bibliography scanning.
basic index. When a search statement fails to specify an index or field, the search system defaults to the system’s basic index, usually made up of all fields or all subject-rich fields. bento box. Partitioning the display of results into categories by type of resource or service. bibliographic database. See surrogate database. bibliography. A systematic listing of references or citations, usually organized alphabetically by author name, and restricted in coverage by one or more features, such as subject, publisher, place of publication, or genre. bibliography scanning. Finding relevant results among the citations listed in a source’s footnotes or bibliography. Also called backward-chaining. bibliometrics. The statistical analysis of the written products of academic inquiry, scholarship, and research, such as journal articles, books, dissertations, theses, and conference papers, usually based on the number of times a work has been cited by another work. biography. An account of a person’s life, often supplemented with one or more other appropriate genres (e.g., bibliography, catalog, discography, filmography, etc.) that reports their accomplishments. Boolean logic. The systematic ways in which search systems produce results in response to search statements bearing Boolean operators. Boolean operators. See AND, OR, and NOT. Boolean search systems. Search systems governed by Boolean logic to produce results in response to user-entered search statements.
bound phrase. Enclosing a search term in quotation marks, parentheses, or brackets or connecting its individual elements by hyphens or underscores to indicate to the search system to process it as a phrase. Which symbol to use depends on what the system has been programmed to recognize. broader term. A hierarchical relationship between two controlled vocabulary terms in a thesaurus that expresses either a whole-part or genus-species relationship, the broader term designating the whole or the genus. browsing. The act of scrutinizing a display of entries, such as an alphabetical index, classified list, cluster array, thesaurus, list of topical categories, or set of results by a user with the intent of selecting one or more entries to further the search. catalog. A special type of index bearing surrogate records that describe sources contained in a collection, library, or group of libraries and that are organized according to a formal scheme or plan. cataloger. See indexer. citation. A reference to a source that gives just enough identification information for a person to find the source in a collection, such as a library or a database, or in a publication. citation count. The number of times a publication has been cited in other publications. citation index. A database that makes it possible to identify the works that have cited a scholarly article or book. citation verification search. A known-item search that verifies the citation data the user has in hand for a source or completes it for citation purposes.
cited references. See references. cited-reference searches. Finding the sources that have cited an older source since the beginning of its publication. Also called forward-chaining. citing references. Sources that have cited the source in hand since its publication. See also cited-reference searches. classification. A system for sorting like items and assigning them to classes and sub-classes, denoted by numeric or alphanumeric codes. Examples: Library of Congress Classification; APA PsycInfo Classification Categories and Codes; JEL (Journal of Economic Literature) Classification System for economics literature; and Inspec Classification for engineering-related subjects. See also North American Industry Classification System (NAICS) and Standard Industrial Classification (SIC) System. classification captions. Broad-based topical headings that make up a classification’s outline. In some databases, indexers assign such captions (or codes representing the captions) to surrogates in ways that are similar to their assignment of controlled-vocabulary terms from a thesaurus so that search systems can index the captions (or their codes) to facilitate subject searching. closed-ended questions. Questions that librarians ask users during the reference interview to elicit yes or no answers or short answers from them. clusters. See postsearch clusters. command-line interface. Allows users to interact with a search system by entering commands that instruct the system to perform certain operations. commercial databases. See licensed databases.
controlled vocabulary. A carefully selected list of preferred terms, phrases, and codes that indexers assign to surrogate records to describe a source’s intellectual contents and to facilitate online searching. controlled vocabulary searching. Utilizing search-system features for browsing, selecting, or directly entering a database’s preferred terms in the form of words, phrases, or codes, ultimately for the purpose of producing high-precision results. corporate body. “Organizations or group of persons that are identified by a particular name and that acts or may act as an entity” (Chan and Salaba 2016, 745). COUNTER-compliant data. Database usage statistics that conform to Project COUNTER standards developed by librarians and the publishers and vendors of electronic resources. credible. Whether the information at hand is trustworthy and written by a domain expert on the topic. database. A systematically organized collection of data or information. A database may contain texts, media, spatial and numeric data, or a combination of these. database aggregators. Search services that host databases from a variety of database publishers. Such services may also host databases that they themselves publish. database publishers. Nonprofit and for-profit publishers that employ professional staff to select database content, organize it, and add it to the database. Some publishers index database content and add search and retrieval services, and other publishers license database aggregators to do it for them. descriptor. See index term.
dictionary. See discipline-based dictionary; language dictionary. digital libraries. Research databases that provide access to actual sources across a wide range of genres, including texts, media, and numeric and geospatial data. direct entry. The searcher’s manual entry of search terms and searching language into the system’s search box. directory. A collection of entries for persons and organizations bearing contact information and other potentially useful information, such as age, gender, and occupation for persons and founding date, number of employees, and contact person name for organizations. discipline-based dictionary. A collection of entries for concepts, events, objects, and overarching topics in a discipline, subject, or field of study, along with definitions and short explanations. document representation. The information, whether surrogate, actual source, or both, that a search system indexes, retrieves, and displays. DOI. A unique digital object identifier assigned to a source, providing persistent access even if the source moves from one web location to another. domain expert. A person who has earned credentials (e.g., degree, certification, license, experience) that represent mastery of a discipline, subject, field of study, practice, trade, or profession. double posting. A database-indexing practice in which systems index the words and phrases in surrogate record fields multiple times to maximize the searcher’s chances of retrieving information. encyclopedia. A collection of entries for concepts, events, objects, or overarching topics in a discipline, subject, or field of study that provides background information, definitions, detailed explanations,
and current issues and trends and includes bibliographical references to seminal sources. encyclopedic. A term used to describe a database that covers a wide range of disciplines, subjects, or fields of study. Also referred to as multidisciplinary. end user. A person who uses library resources and services, excluding the library staff who provide access to library resources and services. Also known as everyday people. entry vocabulary. Synonyms that link users to authorized index terms in a controlled vocabulary. Also called cross-references. See also references, use references. expert intermediary searcher. A person (usually a librarian) who has received special training in online searching from search-system representatives and/or faculty in schools of library and information science and who continues to hone their knowledge and skills through the practice of searching. facet. A word or very short phrase that represents a single concept or idea. A facet can also be a word or phrase that is a data element in a citation, such as a title or author name. facet analysis. An analysis of the user’s query in which the objective is to express it in no more than a handful of big ideas, major concepts, or facets that should or should not be present in results. facets. See postsearch clusters. fact. Something that exists now or something known to have existed in the past, such as an object, event, situation, or circumstance.
fair linking. The practice by which search systems give equal consideration to all results in the search and display process. false drop. A retrieval that matches the search statement's criteria, but due to multiple meanings or the context of its matching words and phrases, is irrelevant to the search. federated search system. A search system that dispatches the user's search statement to a set of disparate databases; merges each database's results into a succinct response, with duplicates handled in an efficient manner; and presents results to the user, along with functionality for sorting them in various ways. field. In a surrogate record, the separate sections representing the separate metadata elements that describe a work. The title field contains the work's title, the author field contains the work's author's name, etc. field label. The full or abbreviated name of a field that the user can choose from a search system's select-a-field drop-down menu or enter directly into a search statement for the purpose of restricting retrieval to this field (see the example below). fields drop-down menu. The drop-down menu in a search system's basic or advanced interface bearing field names that the user chooses for the purpose of restricting results to the selected field. filters. See postsearch clusters. filter bubbles. See personalization. form. The structure of a database. forward-chaining. See cited-reference searches.
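The field label entry can be illustrated with a hypothetical search statement (the TI label for the title field is common, though exact syntax varies by system):

TI("information literacy")

Because the title field label precedes the parenthesized phrase, the system restricts retrieval to items whose titles contain the phrase information literacy, rather than matching the phrase anywhere in the record.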
free-text searching. Natural-language keyword searching used to find obscure information or to retrieve high-recall results. full text. Complete written sources, both print and online. full-text aggregator. See journal aggregator. full-text database. A systematic organization of values (e.g., words, phrases, numbers, or codes) contained in a source database’s full-text sources, along with the pointers, references, or web addresses that the search system uses to retrieve the full texts in which the values occur. full-text fulfillment search. Finding a full text for a desired source. full-text publisher. See journal publisher. full-text searching. Used in a source database to expand the search beyond its surrogate records to include every indexed word in the database. genre. The nature of the sources contained in a database—what they are as opposed to what they are about. handbook. “A handy guide to a particular subject, with all of the critical information that one might need” consolidated into a single source (Smith and Wong 2016, 478). high-posted searches. Searches that produce many results.
h-index. An article-level metric, formulated by physicist Jorge E. Hirsch, that is used to evaluate the impact of journal articles. An author with an index of h has published h papers, each of which has been cited by other papers h times or more.
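A worked example with hypothetical citation counts may help. Rank an author's papers from most cited to least cited:

citations per paper: 25, 12, 8, 5, 3 → h = 4

Four papers have been cited at least 4 times each, but there are not five papers cited at least 5 times each (the fifth has only 3 citations), so the h-index is 4.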
history note. Information about an index term’s representation in a controlled vocabulary, such as changes over the years, special instructions for searching this term online, and the range of years an unused term was in use. hits. The number of results that a search statement retrieves. Also called postings. iMetrics. See altmetrics; bibliometrics. impact factor. A metric used to evaluate the impact of a journal. It is calculated by determining the number of times the journal’s articles are cited by other journals over a two-year period, divided by the total number of citable pieces published in the journal over the same period. implicit operators. Boolean or proximity operators that a search system may be programmed to insert automatically between search words without their being visible in the search box. imposed query. A query the user poses to a reference librarian that comes from someone else, typically a teacher, family member, boss, neighbor, friend, or colleague. in-depth query. Negotiated queries that usually require subject searches and produce multiple results from which users must synthesize answers to their questions. index. A systematic organization of values (e.g., words, phrases, numbers, or codes) contained in a database’s surrogate records or full-text sources, along with the pointers, references, or addresses that the search system uses to retrieve the surrogates and/or fulltexts in which the values occur. index term. A controlled vocabulary term for a name, subject, or title that is authorized for indexers to assign to controlled vocabulary
fields of surrogate records and for searchers to use in controlled vocabulary searches of online databases. See also subject descriptor. indexer. A person who assigns controlled vocabulary terms to surrogate records to represent the names, subjects, or titles pertaining to sources. indicative abstract. A summary that functions like a table of contents, describing a source’s range and coverage and making general statements about the source. indicative-informative abstract. A summary that is part indicative of the source’s more significant content and part informative of its less significant content. information need. The user’s recognition that what they know is inadequate or incomplete to satisfy an overarching goal. information retrieval system. See search system. informative abstract. A summary that functions as a substitute for a source, detailing its quantitative or qualitative substance. inquiry. Same as query but includes requests that do not necessarily involve online searching. institutional repository. A combined search system and online database that a learning institution, such as a college, university, or laboratory, supports, where institution members (e.g., faculty, students, researchers, or administrators) archive digital materials that are the products of their teaching, research, and/or service activities. integrated library system (ILS). A computer-based information system that automates a library’s important functional operations, such as acquisitions, cataloging, circulation, interlibrary loan, public
access, and serials control, and its public services operations, such as the online public access catalog (OPAC). intermediary searcher. See expert intermediary searcher. journal aggregator. A search service that delivers full texts in the journals that it publishes and in the journals that other journal publishers outsource to it. journal holdings record. The listing of the copies of a journal that the library has stored on its bookshelves or online. Listing journal holdings by year, volume, issue, and supplement, the record includes the names of the journal publishers or journal aggregators that supply full texts. journal publisher. A publisher that specializes in the publication of one or more scholarly or trade journals. When journal publishers offer search services to their journals that include full-text-fulfillment searches, they become database publishers; however, due to high costs, the majority of journal publishers outsource search services and full-text fulfillment to journal aggregators. journal run. A subject search in which users want to scan multiple issues of a journal because it has published one or more relevant articles on the topic they seek and they want to find more like them. keywords. The natural-language words and phrases that users enter into search systems to express their queries. The keywords users enter don't necessarily arise from a facet analysis and logical combination, and they vary in form, ranging from single words and phrases to sentences, questions, and even whole paragraphs. known-item search. A request for an actual source that you or the user know exists. language dictionary. A collection of entries for acronyms, proper nouns, phrases, or words giving definitions, etymology, foreign-
language equivalents, grammar, orthography, pronunciations, regionalisms, synonyms, usage, visual imagery, and/or written-out forms. lexicographer. The person who develops and maintains a thesaurus or classification system. LibGuide. An easy-to-use content-management system marketed to librarians, who use it to author web-based resource pages for users, putting users a click away from recommended resources on a hot topic, discipline, genre, theme, current event, etc. library catalog. The physical or virtual index that users search to access a library's collection, consisting mostly of surrogate records for monographs and serial titles. See also online public access catalog (OPAC). Library of Congress Name Authority File (LCNAF). See authority file. licensed databases. Databases that database publishers, database aggregators, journal publishers, and journal aggregators license to libraries for a subscription fee. Because authentication is required, access to these databases is limited to individuals affiliated with the subscribing institution. Also referred to as commercial or subscription databases. link resolver. A software product that processes the citation data embedded in an openURL to determine whether the library holds or owns the actual source itself, and when it does, facilitates the retrieval and display of the source back to the user. literary warrant. The principle that a lexicographer establishes an index term for a topic and adds it to the controlled vocabulary only when enough domain experts have written on the topic.
literature review. An evaluative report of what is known about a subject, theme, current event, issue, etc., that strives to be comprehensive in certain ways; for example, by covering a certain range of years, a certain genre, or a certain methodology. See also systematic review. logical combination. The addition of Boolean operators to the facet analysis to indicate to the search system how it should combine facets during retrieval. main entry. When main entry governs the display of results, the results page contains a list ordered alphabetically by author and, when authorship is diffuse or anonymous, by title. major index terms. Index terms that indexers assign to surrogate records when the sources are specifically about the subjects described by the major index term. manual. "A convenient guide to a particular procedure, typically with step-by-step instructions" (Smith and Wong 2016, 478). media. Information packages that people experience with their visual, tactile, or auditory senses. metrics. See altmetrics; bibliometrics. narrower term (NT). In a thesaurus, the term in a hierarchical relationship between two controlled vocabulary terms that designates the part in a whole-part relationship or the species in a genus-species relationship. nearby operator. A proximity operator that the searcher incorporates into a search statement to retrieve texts in which the search words occur within a specified number of words of each other and the order of the search words does not matter. Usually expressed as N# or NEAR#.
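A hypothetical search statement illustrating the nearby operator (the N# form shown is typical of EBSCOhost-style systems; other systems write NEAR/# or similar):

climate N3 adaptation

This statement retrieves texts containing "climate change adaptation" as well as "adaptation to rapid climate change," because in both cases no more than three words intervene between the search words and word order does not matter.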
negotiated query. The librarian’s understanding of what the user wants as a result of conducting a reference interview with the user. nested Boolean logic. The inclusion of parentheses or brackets in a search statement to tell the search system in what order to process the operators. North American Industry Classification System (NAICS). A system for classifying industries in the United States, Canada, and Mexico by authorized code numbers called NAICS codes. See also Standard Industrial Classification (SIC) System. NOT. The Boolean operator that tells the search system to eliminate all the surrogates and sources that include the term following the NOT operator from the set of results. Online public access catalog (OPAC). The online index that users search to access a library’s collection, consisting mostly of surrogate records for books, serial titles, and media the library holds or provides access to. See also library catalog. open access. Unrestricted access to scholarship on the web. open-ended questions. The questions that librarians ask users during the reference interview to elicit anything but a yes or no answer or short answer. openURL. A standard for encoding citation data for actual sources into a URL that can be passed to a link resolver that processes the citation data to determine whether the library holds or owns the actual source. open web. Publicly accessible websites and web pages, where anyone with a computer, internet connection, and web browser can search, retrieve, display, and publish information.
OR. The Boolean operator inserted into a search statement to tell the system to retrieve items that contain any of the terms in the search statement. ORCID. A unique identifier assigned to a scholar to facilitate the accurate linking of the scholar's works to their profile. An ORCID identifier disambiguates like names and persists even if a scholar's name changes. paywalls. The restrictions publishers implement to prevent free access to information resources such as databases and publications. pdf (portable document format). A file format that displays text, media, or numeric data like the printed page so that a person can read, view, print, and/or transmit the file electronically to others. pearl growing. The practice of scrutinizing results, both surrogates and the actual sources themselves (when available), to find relevant terms to incorporate in a follow-up search that retrieves additional relevant sources. peer review. The systematic evaluation of scholarship by domain experts in a discipline, subject, or field of study. personalization. A relevance-enhancement technique that web search engines perform algorithmically, using personal information that they glean from the web to influence results for a search and populating search results with items specific to the person's interests. postcoordination. The searcher's deliberate combination of words into search statements after the search system has extracted words from texts into its searchable indexes. postings. The number of results that a search statement retrieves. Also called hits.
postsearch clusters. Nonsubject aspects, such as date of publication and language, and subject aspects, such as major subjects, regions, and age groups, that searchers choose after the system produces results for their search statements. Also called facets, filters, limits, or refinements. precedence of operators. Rules that govern the order in which Boolean and proximity operators are processed. See also nested Boolean logic. precision. A measure of the extent to which a search yields relevant results. Precision is calculated by dividing the total number of relevant results your search retrieves by the total number of results your search retrieves (see the worked example below). precoordination. "The combination of individual concepts into complex subjects" before conducting a search for them (Chan and Salaba 2016, 750). presearch qualifiers. Nonsubject aspects, such as date of publication and language, and subject aspects, such as regions and age groups, that searchers choose to limit results at the same time they enter their search statements. proximity operator. An operator in a search system that specifies two criteria that must be met for retrieval of surrogates and/or full texts to occur: (1) how close the words should occur in the text, and (2) whether word order matters. query. The user's immediate expression of their information need. recall. A measure of the extent to which a search retrieves all the relevant results on the topic in the database. Recall is calculated by dividing the total number of relevant results your search retrieves by the total number of relevant results in the database. Used conceptually rather than literally, since knowing the true number of total relevant results in a database is impossible.
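The precision and recall entries can be made concrete with a small worked example using hypothetical numbers. Suppose a search retrieves 50 results, 40 of them relevant, from a database estimated to contain 100 relevant items in all:

precision = 40 ÷ 50 = 0.80
recall = 40 ÷ 100 = 0.40

The two measures tend to trade off: broadening the search to capture more of the 100 relevant items usually pulls in more irrelevant results as well, lowering precision.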
reference databases. Databases filled with facts and meant for quick look-ups. reference interview. A conversational exchange between a reference librarian and a library user, in which the user is likely to describe something they don't know, requiring negotiation between the two so that the librarian is able to determine what the user really wants. reference management system. A system that stores the user's sources and provides automated tools to help the user manage the sources and cite them in written works. references. Sources that are cited in a publication, usually in footnotes, endnotes, or a bibliography. See also bibliography scanning. related term. A controlled vocabulary term in a thesaurus that is coordinate to another controlled vocabulary term. Because both terms are at the same level in the hierarchy, they are not related hierarchically. Also called an associative relationship. relationship designator. See role. relevance. The user's perception that the information at hand has the potential to answer their question or will contribute to satisfying their information needs. relevance feedback. The search system's utilization of one or more results to find ones like them. relevance-ranked results. Results that systems rank algorithmically in order of likely relevance based on such factors as the number of times they match user-entered search terms in results, the number of times such terms are posted in the database, and the proximity of such terms in results. Search engine ranking algorithms may include factors such as how many and which
websites are linked to and from a site or page, how often a term is used on a page, and other factors. research databases. Databases that index sources that must be read, analyzed, and synthesized to answer in-depth questions. resolver links. Links that trigger the release of a retrieval’s citation data to link resolvers, software that automatically queries the library’s other databases to determine whether they contain the actual source and, when they do, retrieves and displays the actual source to the user. results. The set of items indexed in a database that match the search statement input in the search box(es) and so are retrieved and displayed. Also called retrievals. role. A word or phrase that describes the part played by a person, corporate body, or family in a source’s creation. Examples are author, calligrapher, editor, illustrator, photographer, and videographer. saved search. Search-system function that stores a search formulation permanently or for a specified time period for later reuse. See also search alert. scholarship. The process of sharing new discoveries, theories, ideas, information, and data. scope note. In selected authority records in a controlled vocabulary, an index term’s definition and/or explanatory information about the term’s proper usage, such as clarifying an ambiguous term or restricting the term’s application. search alert. A saved search that the search system automatically executes according to the searcher’s instructions, sending newly retrieved sources that meet the criteria on a regular basis. See also saved search.
search box. Usually a single-line box into which the user enters a search statement and then activates the accompanying search button or magnifying-glass icon. search engine. See web search engine. search engine optimization (SEO). Techniques used to place a company, product, or other entity as high up in the results ranking as possible. search history. Search system function that displays the user’s previous search statements and the number of results for each. Can be used to combine results sets without having to input the search statement again. searching language. System-provided instructions and controls that the searcher wields to tell the system which operations to perform and how to perform them. search statement. An expression of the negotiated query that the expert intermediary searcher or knowledgeable end user formulates to be consistent with the search system’s searching language and controlled vocabulary so as to retrieve relevant results. search strategy. “A plan for the whole search” (Bates 1979, 206). search system. A computer program that indexes and stores surrogates and/or full-text sources, prompts users to enter search statements that represent their queries, and processes these statements in ways that enable it to respond with results in the form of surrogates and/or sources that have the potential to satisfy people’s queries. search tactic. “A move to further the search” (Bates 1979, 206). search tools. Features and functions that aid intermediaries and information seekers with online searching, including search boxes
with drop-down field menus, presearch and postsearch filters, thesauri, field indexes, and help screens. sets. Temporary storage bins for search results. source. "A distinct information or artistic creation" (Tillett 2003, 11). Also called a work. Online Searching uses source to refer to the texts, media, and numeric and geospatial data that databases index and search systems retrieve. source database. A database that contains and can search the actual sources themselves, including full texts, media, and numeric and geospatial data. specificity. A principle that governs the indexer's assignment of index terms from a controlled vocabulary to surrogates that are as specific as the source's subject matter. Standard Industrial Classification (SIC) System. A system for classifying US industries by authorized code numbers called SIC codes. See also North American Industry Classification System (NAICS). stemming. See truncation. subject. A topic or area of knowledge that is the content of a source or that interests a person. subject descriptor. The authorized and preferred term for a subject or topic, typically used in licensed research databases rather than library catalogs. See also thesaurus. subject heading. An authorized subject word or phrase added to the surrogate record to indicate what the source is about. Typically used in library catalogs and less frequently in indexes. subject-rich fields. In a surrogate database, fields whose contents indicate the topics covered such as title, descriptor, author keywords, and abstract fields.
subject search. Formally, a search statement that includes subject headings or descriptors and specifies that the search be limited to the subject field index. Informally, any search that involves research about a topic. subscription databases. See licensed databases. surrogate. A record representing a source that is a full text, media, or numeric or geospatial data. At a minimum, a surrogate is a citation that contains just enough information to enable a person to find the source in a collection such as a library or database. More comprehensive are surrogates bearing subject index terms and/or an abstract that summarizes the source's intellectual contents. Also called catalog records or database records. surrogate database. A database that contains summary versions of the actual sources; it does not contain the sources themselves. See also A&I database. syndetic structure. The thesaurus network of controlled-vocabulary-term relationships: broader terms, narrower terms, related terms, and use and used-for references. synonym. A term with the same or similar meaning as one or more other terms. In a controlled vocabulary, one of the terms is authorized as the preferred subject descriptor and the other(s) designated as used-for reference(s). systematic review. A rigorous literature review, usually in the health sciences, that is based on a clearly articulated research question, identifies all relevant published and unpublished studies, assesses each study's quality, synthesizes the research, and
interprets and summarizes research findings. See also literature review. technical reading of a database. A methodology for searchers to quickly and efficiently familiarize themselves with a database and the system they use to search it. technical reading of a source. Reading only those portions of a source that are the most important for understanding overall content. texts. Written documents. thesaurus. A controlled vocabulary that lists authorized subject index terms (descriptors) in a syndetic structure that expresses the broader and narrower relationships between these terms, and includes related terms and cross-references from synonyms. title. The word(s) that serves as a label for a source but is not necessarily a unique identifier for the source. truncation. The use of a symbol, such as a question mark, asterisk, or exclamation point, to tell the search system to retrieve variant endings of a word stem. In systems that use the asterisk, for example, librar* retrieves library, libraries, librarian, and librarianship. Stemming may be specified using a truncation symbol or it may be an automatic feature of the system. typecasting. Scrutinizing the user's query to determine whether a subject search or a known-item search is warranted. uniform resource locator (URL). An address for a website, document, or other resource on the web. unused synonyms. See used for references. used for references. In an authority record, a list of unused synonyms for the record's authorized name, subject, or title.
use references. In a controlled vocabulary, a synonym for or variant of a name, subject, or title that guides the user to the authorized index term. See also entry vocabulary. Venn diagrams. Visual representations of Boolean expressions in which the circles are facets and the overlap between two or more circles demonstrates relationships between facets that should be, may be, or shouldn’t be present in search results, signified by the AND, OR, and NOT operators, respectively. web-scale discovery system. A library’s Google-like search interface that retrieves results from multiple electronic resources, including the catalog and licensed databases. Also referred to as a library discovery system. web search engine. A computer program that indexes websites and web pages, provides a search interface, processes searches, and rank-orders the results using an algorithmically determined relevance. WSD system. See web-scale discovery system. yearbook. A review of trends, issues, and events pertaining to a topic, place, or phenomenon in a particular year.
REFERENCES
Bates, Marcia J. 1979. "Information Search Tactics." Journal of the American Society for Information Science 30, no. 4: 205–14. http://pages.gseis.ucla.edu/faculty/bates/articles/InformationSearchTactics.html.
Chan, Lois Mai, and Athena Salaba. 2016. Cataloging and Classification: An Introduction, 4th ed. Lanham, MD: Rowman & Littlefield. Smith, Linda C., and Melissa A. Wong. 2016. Reference and Information Services: An Introduction, 5th ed. Santa Barbara, CA: Libraries Unlimited. Tillett, Barbara. 2003. “FRBR: Functional Requirements for Bibliographic Records.” Technicalities 23, no. 5: 1, 11–13.
About the Authors Karen Markey is a professor emerita in the School of Information at the University of Michigan. Her experience with online searching began with the earliest commercial systems, Dialog, Orbit, and BRS; and the first end-user systems, CD-ROMs and online catalogs. It now centers on today’s web search engines and proprietary search systems for accessing surrogate and source databases of full texts, media, and numeric and spatial data. Since joining the faculty at the University of Michigan in 1987, she has taught online searching to thousands of students in her school’s library and information science program. Her research has been supported by the Council on Library Resources, Delmas Foundation, Department of Education, Institute of Museum and Library Services, National Science Foundation, and OCLC. She is the author of six books, more than a dozen major research reports, and more than one hundred journal articles and conference papers. Cheryl Knott is a professor in the School of Information at the University of Arizona. Her experience with online searching began in 1988 when she went to work as a reference and instruction librarian at the University of Texas, when access to online databases involved dialing in via an external modem. For two decades she has taught online searching in undergraduate courses designed for end users and graduate courses designed for aspiring librarians. Her book, Find the Information You Need! Resources and Techniques for Making Decisions, Solving Problems, and Answering Questions (2016), is
written for undergraduates. Her research interests center on access to information, broadly construed. In addition to publishing widely in scholarly journals, she is the author of Not Free, Not for All: Public Libraries in the Age of Jim Crow (2015), which won the Eliza Atkins Gleason Book Award from the Library History Round Table of the American Library Association and the Lillian Smith Book Award from the Southern Regional Council.