Known item search
by Birger Hjørland
(This is a version of an article printed in Library Resources & Technical Services, see colophon)
Table of contents:
1. Introduction
2. Definitions and meaning of KIS
3. Examples and characteristics of KIS
4. The functions of the KIS concept
4.1 Bibliographic verification (or bibliographical validation)
4.2 Descriptive cataloging versus subject cataloging
5. Kinds of metadata suited for KIS
5.1 Terms for individual concepts vs. general concepts
5.2 Referring vs describing
5.3 Kinds of properties
5.4 Which kinds of metadata are best suited for KIS
6. Conclusion
Acknowledgments
Endnotes
References
Colophon
Abstract:
This article looks at the concept of known item search (KIS) and considers it in relation to library
practices. The author critically examines previous research on KIS and argues that the concept is
important because it is categorically different from subject search and because it is assumed in
processes such as bibliographic verification and descriptive cataloging. The article further discusses
which kinds of metadata best serve KIS and argues that the traditional distinction between descriptive
cataloging and subject cataloging is a fruitful point of departure for describing the metadata needed
for respectively known item searches and subject searches.
1. Introduction
Known item search (KIS), the search for a particular thing or → document, is a common activity performed in library catalogs, bibliographies, databases, and search engines, as well as in everyday life (e.g., finding a song of which just a fragment is remembered) [1]. It is a concept that seems easy to understand, and it is often regarded as a rather trivial problem in → library and information science (LIS), where the main focus has been → subject searching (also termed “topic searching”) [2].
Because KIS puts other demands on both search systems (including document → descriptions) and on search strategies, however, it is an important concept in its own right. Kan and Poo wrote:
How important is known item search? In the setting of an → OPAC [online public access catalog], it is very important. Larson [1991] points at the long term decline of subject searching in OPACs, in which known item search accounts for a growing proportion of library catalog searches, up to 50%. However, supporting these types of searches has largely been ignored by the information retrieval community, whose focus has been on topical search (e.g., TREC bakeoff competitions [3]). While these efforts have improved the state of the art for topical search, we see a need to support better known item query detection and retrieval. (Kan and Poo 2005)
This quote is not meant to claim that KISs are more important than subject searches but rather to illustrate that KIS is important enough to deserve special attention in LIS.
In library science, the purposes of library catalogs have been discussed since Charles A. Cutter, who in 1876 presented the following “objects” [4]:
- To enable a person to find a book of which either:
(a) the author is known
(b) the title is known
(c) the subject is known
- To show what the library has:
(d) by a given author
(e) on a given subject
(f) in a given kind of literature
- To assist in the choice of a book:
(g) as to its edition (bibliographically)
(h) as to its character (literary or topical)
The first of Cutter’s objectives is about enabling KIS, while 2e is about subject searching. Lee, Renear, and Smith discussed these rules from the perspective of KIS, writing:
In the 1876 edition of Cutter’s Rules for a Printed Dictionary Catalogue (p. 10), we find that the first objective of a catalog is “1. To enable a person to find a book of which either (A) the author, (B) the title, (C) the subject is known”. It is interesting that an interpretation of this allows subject as an access point for known-item search. However in the later literature, “title” and “author” dominate as the major attributes used for conducting a known-item search, and the mention of “subject” as an access point becomes harder to find. In some cases, known-item searches are even considered to be equal to the aggregation of “title” and “author” searches (Cooper & Chen, 2001). However, a few authors do consider other attributes such as publisher (Swanson, 1972; Hjørland, 1997), series (Hjørland, 1997), subject (Wildemuth & O’Neill, 1995) as the types of information used for known-item searches. (Lee, Renear and Smith 2007, 12)
In spite of Cutter’s explicit mention of subject searches, library and information scientist Pauline A. Cochrane (1983, 3) described what she considered a paradigm shift in library science: “Common wisdom since Cutter’s time has been that most users of the library want a catalog where they can find a particular item, a known item”.
Although this example may misinterpret Cutter, it is reflective of a tendency in library cataloging theory and practice to prioritize descriptive cataloging. The opposite tends to be practiced in scientific subject databases such as Chemical Abstracts, MEDLINE, PsycINFO, and Web of Science, which tend to have inferior descriptive data: for example, they have less author authority data or, as in the Web of Science, do not provide original titles for non-English articles. Such descriptive elements are better developed in library cataloging, but, as argued by Hjørland (2023), this represents a problematic tendency in library cataloging practice to prioritize known item searches, confirming Cochrane’s view.
The priorities between KIS and subject searching have therefore not been uniform in library and information science, and different kinds of searches have not always been clearly distinguished in objectives for document representation. Important research has been carried out in relation to KIS, and the main purpose of this article is to address the main conceptual and theoretical issues of this topic.
2. Definitions and meaning of KIS
Lee, Renear, and Smith (2007, 1) found it surprising that despite its central status in LIS over a long period, the concept “KIS” has received practically no systematic discussion. The authors further demonstrated “that this apparently simple notion is actually quite complex and varied, and moreover, that there is hardly a single feature ordinarily associated with it that can confidently be said to be an essential part of the concept”. The article pointed out that it is not required that the searcher “know” the searched document to exist or that it really exists, because a bibliographical reference can be to a non-existent document (a so-called bibliographical ghost). A famous example of such a ghost is a highly cited, non-existing paper attributed to information scientist Gerald Salton, which he never wrote [5]. The term known item therefore has misleading associations. A person may be searching for a document about which very little is remembered or known with a reasonable degree of certainty, and which may even have a doubtful existence. Therefore, when information scientist Michael Buckland (2024) wrote: “Known items generally have names, addresses, or distinguishing physical features”, this is only true when the searcher knows some of these characteristics. There are easy as well as difficult cases of known item searches, and in the difficult cases, such characteristics may be unknown.
Dictionary for Library and Information Science (Reitz 2004) defined:
[K]nown-item search: A search in a library for a specific work, as opposed to a search for any work by a known author or for works on a particular subject. [6]
ISO 5127 defined:
[K]nown item retrieval: search and retrieval (3.10.2.01) for a specific item present in the searcher’s mind from the start on.
Buckland (2024, 1) suggested:
Known item search is ordinarily understood to mean a search where the searcher has a specific item in mind and either has an address for it or else believes (or hopes) that sufficient clues, such as author surname and/or title words, will enable that particular document to be found. It is, in effect, a citation search with, commonly, an incomplete or uncertain citation. [7]
Buckland (2024, 1) also wrote that the opposite of “known item search” is a “not-known item search”, writing:
Known item search is traditionally distinguished from subject search. Strictly, this is an incomplete view because the logical complement of a known item search has to be a not-known item search [8]. Subject search is a common kind of not-known item search in a library context, but it is not the only kind. The many other possibilities include, for example, searches by → genre, bestsellers, and banned books [9].
In a library or other bibliographic context known item searches are often not, in fact, for a particular known item but, more loosely, for any instance of a particular known edition or of an instance of any edition of a particular known title. This is a departure from the pure case of search for a unique, particular document [10].
In his article, Buckland elaborated on the distinction between “particulars” and “specimens”. A particular document may be an exemplar (i.e., an individual copy) of a book with notes or marked passages, while a specimen may be any copy of a given edition of a book [11]. Normally just specimens are meant in relation to known item search, but in some instances, the user may want a particular, unique document. In this connection, Lee, Renear, and Smith drew attention to IFLA’s Functional Requirements for Bibliographic Records (FRBR), which later developed into the IFLA → Library Reference Model (LRM) (see Žumer 2018), which provides a model describing the relations between “work”, “expression”, “manifestation” and “item” (Lee, Renear, and Smith 2007, 9). FRBR/LRM thus enable users to distinguish four kinds of known items. In the example of Shakespeare’s Hamlet, the known item may mean (1) work: the tragedy by that title written by Shakespeare; (2) expression: a Danish translation of Hamlet; (3) manifestation: a particular Danish edition of Hamlet; (4) item: a specific exemplar of a particular Danish edition of Hamlet (in libraries that have multiple copies of the same book, items are often given an individual barcode or RFID (radio frequency identification) tag).
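The four FRBR/LRM levels can be illustrated with a small data structure. The following Python fragment is only a minimal sketch: the class names follow the four entities named above, but the field names, the function copies_satisfying, the edition labels, and the copy identifiers are hypothetical and are not taken from FRBR/LRM or from any catalog. It shows how the set of copies that would satisfy a known item search shrinks as the search moves from the work level down to the item level.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    barcode: str  # hypothetical copy identifier (barcode or RFID tag)


@dataclass
class Manifestation:
    label: str
    items: list[Item] = field(default_factory=list)


@dataclass
class Expression:
    label: str
    manifestations: list[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    creator: str
    expressions: list[Expression] = field(default_factory=list)


def copies_satisfying(work, expression=None, manifestation=None, barcode=None):
    """Return the barcodes of all copies that satisfy the search.

    A filter left as None means that any value is acceptable at that level.
    """
    found = []
    for expr in work.expressions:
        if expression is not None and expr.label != expression:
            continue
        for man in expr.manifestations:
            if manifestation is not None and man.label != manifestation:
                continue
            for item in man.items:
                if barcode is None or item.barcode == barcode:
                    found.append(item.barcode)
    return found


# Invented holdings for illustration only.
hamlet = Work("Hamlet", "Shakespeare", [
    Expression("English original", [
        Manifestation("English edition A", [Item("E1")]),
    ]),
    Expression("Danish translation", [
        Manifestation("Danish edition A", [Item("D1"), Item("D2")]),
        Manifestation("Danish edition B", [Item("D3")]),
    ]),
])

print(copies_satisfying(hamlet))                                   # work level
print(copies_satisfying(hamlet, expression="Danish translation"))  # expression level
print(copies_satisfying(hamlet, manifestation="Danish edition A")) # manifestation level
print(copies_satisfying(hamlet, barcode="D2"))                     # item level
```

In this invented example, a search at the work level is satisfied by any of the four copies, whereas a search at the item level is satisfied by exactly one.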
What then, is a KIS? We saw above that Buckland (2024, 7) suggested it to mean a search where the searcher has a specific item in mind and has either an address or clues expected to be sufficient to find it. If having “a specific item in mind” includes having a bibliographic reference (whether sufficient or insufficient, and whether a ghost or not) then this part of Buckland’s definition seems okay. The other part of the definition “has either an address or clues expected to be sufficient to find it” excludes searching for something without any idea of whether one’s clues are sufficient to find it, which, however, is probably often the case (e.g., when a name is on the “tip of the tongue” (ToT), cf. example 5 below) [12].
3. Examples and characteristics of KIS
1. A researcher looks up a bibliographical reference in an OPAC to order and to obtain a copy of the document referred to. Normally a basic set of bibliographic data is considered essential in both bibliographical references and OPACs, for example, author’s last name, publication year, title of book or journal, publisher’s name (in the case of books), volume, first page, and article title (in the case of journal articles) and, when available, the digital object identifier (DOI) [13]. Looking up a combination of a few of these essential data will, if the library has or provides access to the document, in the overwhelming majority of cases lead to an unproblematic match, and the document can be ordered. Kilgour (2001), for example, prescribes how informed library users can issue effective known item queries by including the author’s surname and specific words from the title of the item. Normally, the redundancy in the bibliographical data (both the user’s reference and the library’s OPAC) is high, meaning that if a first attempt fails, another combination of such essential data will probably succeed.
2. If the document is not found as described in point 1, it may be (a) because there are errors in the user’s bibliographical data, (b) because there are errors in the library’s bibliographical data, (c) because the library does not have the document, or (d) because the reference is a bibliographical ghost [14]. Normally the best procedure is first to verify the user’s data, e.g., by searching in Google, WorldCat, or other more comprehensive bibliographical databases. If verification succeeds, the document can be ordered or, if it is not in the library, an interlibrary loan may be requested.
A number of special problems exist, including:
a. A common problem in KIS is spelling errors (in user queries or in databases, e.g., when input is made by optical character recognition) or spelling variations (e.g., “color” and “colour”). Now that search engines support fuzzy searching and approximate string matching, such errors have been reduced in search engines and some databases. Unfortunately, many OPACs have not yet successfully addressed spelling search errors.
b. Another problem is ambiguity in publication years. Publishers have a tendency to provide wrong publication dates for their books (e.g., books published in the fall are given the following year’s publication date) [15].
3. In some cases, the user’s reference is very problematic. This author got the following reference from ChatGPT-4o and has not been able to verify it: “Sayre, Kenneth M. ‘Cybernetics and the Philosophy of Mind: The Neglected MacKay-McCulloch Exchange.’ Kybernetes, vol. 38, no. 9, 2009, pp. 1539-1555.” The journal Kybernetes exists, and the volume, issue, and year match each other in the existing journal, but the article is not on the specified pages. There exists a book by Sayre, Cybernetics and the Philosophy of Mind, but without the subtitle and not containing the sought information about MacKay. Google searches of both “The Neglected MacKay-McCulloch Exchange” and “MacKay-McCulloch Exchange” returned zero hits. A return to ChatGPT-4o, indicating the error and asking for a correction, provided the same reference, but now in issue 7/8, which is also wrong.
This example shows the user’s options in difficult cases: systematically use every bit of information in the given reference (and systematically exclude every other bit of information) in order to verify the reference, and try to obtain further information about the sought item [16]. If the reference contains a reasonable amount of bibliographical data and cannot be verified, it must be considered a bibliographical ghost; if it does not contain a reasonable amount of bibliographical data, it must be considered lost (at least for now).
4. KISs are not always initiated by bibliographical references, but also by user memory about documents or about some contents of a document. For example, one may have heard about “the 20 percent rule” (or was it “the 25% rule”?) in library classification and expect it to be part of a library’s written guidelines and therefore search for it. Or, one may have a more or less vague memory from prior reading about some information that could be relevant for a present argument and try to recall where this was read and then to retrieve that document. In such cases, the KIS resembles a subject search, in which the remembered information is used as input and search criteria, and there is the possibility that more than one document fulfils the user’s need.
5. The phenomenon known as tip of the tongue (ToT) is relevant to many cases of KIS: the searcher cannot remember the relevant term for something but has a partial memory of it and a feeling that it is likely to be remembered soon. ToT is studied in many fields, particularly in psychology (see e.g., Brown 1991). There are very few studies (e.g., Arguello et al. 2021 and Bhargav, Sidiropoulos and Kanoulas 2022) relating this term to document searches in databases, however, and these do not seem to distinguish clearly between a general failure to retrieve documents and the cases in which the searcher feels almost able to remember the term needed to retrieve a document. ToT has also been used to discuss recall of non-textual items, e.g., music.
6. Some examples are due to library users’ lack of knowledge about the library catalog. Dwyer, Gossen and Martin (1991) found that more problems were associated with periodical articles than with monographs [17]. For example, many requests were based on article titles when journal titles should be used. This issue is related to the problem that some documents (e.g., Educational Resources Information Center documents) are not cataloged in the OPAC, but rather identified in a separate database, even if the library holds them.
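Two points from the examples above, the redundancy of essential bibliographic data (example 1) and tolerance of spelling variation (example 2a), can be combined in a short illustration. The Python fragment below is a minimal sketch under stated assumptions, not a description of how any actual OPAC works: the catalog records, the reference, the function names, and the similarity threshold of 0.85 are all invented for illustration. The approximate matching uses SequenceMatcher from Python's standard difflib module; real systems may use quite different techniques.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar(a, b, threshold=0.85):
    """Approximate, case-insensitive string match (threshold is an assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def find_known_item(reference, catalog, threshold=0.85):
    """Try pairs of fields in turn and return the first pair that yields hits.

    This exploits redundancy: if one field of the reference is wrong,
    another combination of fields may still identify the record.
    All fields, including the year, are compared as strings.
    """
    fields = [f for f in ("author", "title", "year") if reference.get(f)]
    for combo in combinations(fields, 2):
        hits = [
            record for record in catalog
            if all(similar(str(reference[f]), str(record[f]), threshold) for f in combo)
        ]
        if hits:
            return combo, hits
    return None, []


# Invented catalog records for illustration only.
catalog = [
    {"author": "Smith", "title": "Colour in the catalogue", "year": "1999"},
    {"author": "Jones", "title": "Indexing by colour", "year": "2001"},
]

# The user's reference misspells the author and uses a spelling variant in the title.
reference = {"author": "Smyth", "title": "Color in the catalog", "year": "1999"}

combo, hits = find_known_item(reference, catalog)
print(combo)  # ('title', 'year')
print(hits)   # the single record by Smith
```

In this invented case the combination of author and title fails because of the misspelled surname, but the combination of title and year succeeds, and the spelling variant “color”/“colour” is absorbed by the approximate comparison.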
Based on a small set of queries, Kan and Poo (2005, 93-94) provided some general characteristics of KIS (as opposed to subject searches):
- They are longer, and often copied from a syllabus or a web search.
- They contain determiners: in English titles, determiners (such as “the”, “an”, and “a”) are often parts of book titles and are thus also prevalent in known item queries. In contrast, most area or unknown item searchers do not type determiners into search boxes as many know that such words are often ignored by OPACs.
- They contain proper nouns, including names of authors and editors and names of things which may appear in document titles.
- They contain mixed case, for example exactly matching a title’s orthographic case (whether or not the OPAC is case insensitive).
- They contain certain advanced operators, such as specifying terms for the author and the title fields.
- They contain keywords such as “journal”, “course”, and “textbook”. These usually connote the desired type of resource, rather than a keyword search for the word. Similarly, many titles in libraries but few subject headings consist of these words.
These characteristics of KIS are, as already noted, based on a small sample of requests. However, even if studies are performed on large samples, such characteristics will only be indicative: some KISs may not conform to certain rules or statistical patterns. Nonetheless, they are important because, as suggested by Kan and Poo, they may provide a basis for improved search interfaces that may be helpful for users.
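As an illustration of how such characteristics might inform a search interface, the following Python fragment turns the listed features into simple signals. It is only a sketch and not the method of Kan and Poo: the function names, the word lists, the field-operator syntax (“author:” and “title:”), and both thresholds are assumptions made here for illustration. The proper-noun and mixed-case checks are crude and overlapping proxies.

```python
import re

DETERMINERS = {"the", "a", "an"}
RESOURCE_WORDS = {"journal", "course", "textbook"}
# Hypothetical field-operator syntax; actual OPAC syntaxes differ.
FIELD_OPERATOR = re.compile(r"\b(author|title):", re.IGNORECASE)


def known_item_signals(query, long_query_words=5):
    """Return one boolean per characteristic; all cut-offs are assumptions."""
    words = re.findall(r"[A-Za-z']+", query)
    lowered = [w.lower() for w in words]
    return {
        "long": len(words) >= long_query_words,
        "determiner": any(w in DETERMINERS for w in lowered),
        # Crude proxy for proper nouns: a capitalized word after the first word.
        "capitalized_later_word": any(w[0].isupper() for w in words[1:]),
        "mixed_case": any(c.isupper() for c in query) and any(c.islower() for c in query),
        "field_operator": bool(FIELD_OPERATOR.search(query)),
        "resource_word": any(w in RESOURCE_WORDS for w in lowered),
    }


def looks_like_known_item(query, min_signals=3):
    """Indicative only: some known item queries will show few of these signals."""
    return sum(known_item_signals(query).values()) >= min_signals


print(looks_like_known_item("The History of the Printed Book Smith"))  # True
print(looks_like_known_item("information retrieval evaluation"))       # False
```

A count of signals is used rather than a single rule because, as noted above, such characteristics are only indicative of known item queries.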
4. The functions of the KIS concept
4.1 Bibliographic verification (or bibliographical validation)
One function of the term known item search relates to the concept of bibliographical verification. In libraries, bookstores, and databases, many requests for documents contain errors, and the documents therefore cannot easily be found. Bibliographic verification is admittedly easier in the online catalog than in the card catalog, but this is a difference of degree, not a categorical difference. If, for example, the author or title in a search or request is wrong or misspelled, a first conclusion may be that the required document is not in the library. Rather than providing this answer to the users, the library may start a verification process: examining the request for errors (or examining whether the reference is a bibliographic ghost), correcting the errors, and obtaining the document (if not from the library’s own stacks, then potentially from an interlibrary loan).
Bibliographical verification is the process of confirming the accuracy and completeness of bibliographic information for a given source. This involves checking details of a basic set of essential bibliographic data, such as author name, title, and publication date, and thereby verifying or falsifying the existence of a document about which such data have been given. The staff working with this task in large libraries used to be trained in bibliographical verification (often in relation to interlibrary loan), and a textbook has been written on this (Bruhns 1999, in Danish). The verification process was often an algorithmic procedure based on national bibliographies, catalogs from large libraries such as Library of Congress, and other bibliographical tools. The point here is that such verification processes are known item searches, and that they are very different from subject searches performed in libraries, such as helping students and researchers find books, articles, and other documents for their theses and papers.
The above is written in the past tense because today libraries no longer tend to perform verification processes in the same formalized ways [18]. This does not make known item searches and bibliographical verification needless concepts, however, because it is still important to distinguish them from subject searches in order to optimize both kinds of search processes.
An important implication of this issue of verification is the need for researchers and students to know about essential bibliographic data. These data are required for readers to obtain the documents to which the references refer. This is often done by teaching a specific referencing style or standard, for example, the Chicago Manual of Style (2024) or the ANSI/NISO Z39.29-2005 standard. Such styles develop over time. For example, today it is mostly required that references to journal articles include the article’s DOI, which has contributed to facilitating KIS (also in OPACs when these are integrated with discovery services that support DOI searching).
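The verification process described in this section can be sketched as a simple decision procedure. The Python fragment below is a hedged illustration only and does not reproduce the procedure of any library or textbook: the function names, the rule that three agreeing fields suffice, the source name, and all records are assumptions invented for this example. It checks a reference against external bibliographical sources, returns the corrected record if one is found, and otherwise flags the reference as a possible bibliographical ghost.

```python
def agreeing_fields(reference, record):
    """Fields on which the reference and a record agree (case-insensitive)."""
    return {
        name for name, value in reference.items()
        if str(record.get(name, "")).strip().lower() == str(value).strip().lower()
    }


def verify(reference, local_catalog, external_sources, min_agree=3):
    """Verify a reference; min_agree is an arbitrary assumption."""
    for source_name, records in external_sources.items():
        for record in records:
            if len(agreeing_fields(reference, record)) >= min_agree:
                held = record in local_catalog
                action = "order from local stacks" if held else "request interlibrary loan"
                return {"status": "verified", "source": source_name,
                        "corrected": record, "action": action}
    return {"status": "unverified", "source": None, "corrected": None,
            "action": "treat as possible bibliographical ghost"}


# Invented data for illustration only.
record = {"author": "Smith", "title": "Colour in the catalogue",
          "year": "1999", "publisher": "Example Press"}
external_sources = {"union catalog (hypothetical)": [record]}
local_catalog = []  # the local library does not hold the book

# The user's reference gives the wrong year but is otherwise correct.
user_reference = {"author": "Smith", "title": "Colour in the catalogue",
                  "year": "2000", "publisher": "Example Press"}
result = verify(user_reference, local_catalog, external_sources)
print(result["status"], "|", result["corrected"]["year"], "|", result["action"])

ghost = {"author": "Nobody", "title": "A paper never written", "year": "2009"}
print(verify(ghost, local_catalog, external_sources)["action"])
```

In the first case the erroneous year is corrected from the external record and an interlibrary loan is suggested; in the second case no source confirms the reference.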
4.2 Descriptive cataloging versus subject cataloging
The dichotomy between known item search and subject search is related to the dichotomy between descriptive cataloging and subject cataloging. Reitz (2004) emphasizes the difference between the latter two processes in the following definition:
Descriptive cataloging: The part of the library cataloging process concerned with identifying and describing the physical and bibliographic characteristics of the item, and with determining the name(s) and title(s) to be used as access points in the catalog, but not with the assignment of subject headings and genre/form terms. In the United States, Great Britain, and Canada, descriptive cataloging is governed by → Anglo-American Cataloguing Rules (AACR2) [and its successor Resource Description and Access, RDA].
In relation to the part of the library cataloging process concerned with classification and indexing, Reitz (2004) defined:
Subject analysis: Examination of a bibliographic item by a trained subject specialist to determine the most specific subject heading(s) or descriptor(s) that fully describe its content, to serve in the bibliographic record as access points in a subject search of a library catalog, index, abstracting service, or bibliographic database.
One reason for the differentiation between descriptive and subject cataloging is that generalist librarians in major libraries were trained in the standards mentioned by Reitz and typically performed the former, while subject specialists typically performed the latter.
Therefore, as reported by Hjørland (2023), large libraries used to have separate departments for descriptive and subject cataloging, staffed with general librarians and subject librarians [19]. A similar separation can also be found in subject bibliographical databases such as MEDLINE, and these two library processes have their parallels in the field of → bibliography, where a distinction exists between descriptive bibliography, which describes documents as physical objects, and subject bibliography, which compiles and characterizes documents, emphasizing their subject [20]. Descriptive bibliography is primarily based upon knowledge about techniques of book production, whereas subject bibliography requires subject knowledge [21].
It is too simple to say that descriptive cataloging serves known item searches, while subject cataloging serves subject searches, although overall this is the case. Whereas a subject assignment to a document is generally a bad tool for verification (further described below), many descriptive data are often useful for subject searches (e.g., searches using words from document titles). Nonetheless, known item searches and subject searches make different demands regarding the prioritization of metadata, and this implies that known item search is a concept which requires its own aim to be considered in developing bibliographic databases.
5. Kinds of metadata suited for KIS
The author has already presented the concept of essential bibliographical data for KIS in the first of the examples of KIS. The present section focuses on discussing three dichotomies suggested by Buckland (2024) for understanding KIS and it ends with an overall conclusion about metadata suited for KIS.
5.1 Terms for individual concepts vs. general concepts
Buckland (2024, 4) discusses the relation between known item search versus subject search on the one hand and individual concepts versus general concepts on the other. Concerning individual concepts, Buckland, citing indexing theorist Robert Fugmann (1982), wrote: “’individual concepts’ are persons, institutions, and towns, all with proper names, and which occur in single or very few instances”. Buckland seems to suggest that individual concepts somehow correspond to, or are appropriate for, supporting known item searches. Before we discuss this, it can be mentioned that individual concepts (e.g., the name of a person) may be indexed by general terms such as biography [22], anamnesis (medical history of an individual person), case reports, etc. These examples demonstrate that general concepts are also developed in order to facilitate communication and retrieval of information about individual concepts considered from different perspectives and interests.
Concerning the use of individual concepts for known item searches, Buckland (2024, 5) wrote:
Fugmann rightly stresses the use of proper names to refer to individual concepts, but proper names may also be used to describe (dispositively). Authors’ names are ordinarily associated with known item searches for particular books
While it is often true that author names are known when items are sought, this need not be the case, nor is it always the case that other proper names, such as journal titles, or any other individual concepts for that matter, are known. Many kinds of known item search occur when authors have a vague memory of a relevant quote they have formerly read and are now trying to retrieve. In such cases, general concepts often are the only available clues [23].
5.2 Referring vs describing
Buckland (2024, 5) suggested that known item search corresponds to the process of referring, while subject search corresponds to the process of describing:
The difference between naming what is wanted in a known item search and specifying what is desired in a not-known item search corresponds to the distinction between referring and describing [24]. Referring indicates directly; describing indicates indirectly by specifying characteristics which may in turn indicate appropriate targets. In a traditional digital database one looks up the name of a record of interest in the appropriate table, with possibly a data dictionary to resolve any ambiguity. In a full-text search one searches using descriptors, closely related terms, and vocabulary control which, one hopes, will indicate a small enough set to allow selection of any one or more suitable items without missing other, more suitable items.
Let us exemplify Buckland’s claims. In a known item search, author names, journal titles, or specific (combinations of) terms may be looked up in order to see if the item searched for can be recognized, possibly after further specifications, and the task thus solved. In a subject search a combination of terms or other subject access points is looked up to see if the set of items thus retrieved seems relevant and satisfactory in relation to recall and precision [25]. If not, the process continues with modified concepts using so-called “recall devices” and “precision devices”, until the task is considered solved. In both cases, what is done is to look up what a certain combination of subject access points is referring to. It is difficult to describe information searching as a descriptive process, because the relevant documents are unknown and therefore impossible to describe. It is better to say that the searcher lists a set of terms describing criteria, which the documents must fulfill in order to be relevant.
5.3 Kinds of properties
In known item searches versus subject searches there are no differences in the properties of the documents sought for: in principle, these documents, and therefore their properties, are the same [26]. Differences in properties are not specific to the items sought, but rather in the way the search processes are performed and occasionally in the databases used. In relation to the present article, an important issue to clarify is the nature of the data most relevant for known item searches in databases as distinct from those most relevant for subject searches.
Buckland (2024, 3-4) discussed the distinction between material and non-material properties:
Material properties are the physical attributes, the “brute facts” of a document, such as a title as printed, the author’s name as given, and its literal text as well as physical features such as its height, pages, binding, and other objective characteristics. Its non-material properties are any imaginable characteristics other than its material properties, including ownership, topics discussed, point of view, copyright status, genre, and the language of the text.
In this quote “the author’s name as given” is considered a material property of a book, but in Table 2 (2024, 4), exemplifying the book République by Bodin (Paris, 1580), the property that it is authored by Jean Bodin is considered a non-material property. This is somewhat confusing, and here it is suggested instead to distinguish the kinds of data obtained by respectively descriptive and subject cataloging, as described by Reitz above.
- Data obtained by descriptive cataloging: the physical and bibliographic characteristics of the item, and the name(s) and title(s) to be used as access points. Other points can be added, such as tables of contents, and, in citation indexes, the reference lists of the documents catalogued.
- Data obtained by subject analysis: assigned classification notations, subject headings, genre/form terms, and notes about the contents.
Although both categories might in some circumstances serve known item searches, I shall here argue that data obtained by subject analysis is relatively unhelpful because of the nature of subject analysis. A given subject analysis (and the resulting metadata) represents one individual’s view of what the document is about, and we know from inter-indexer consistency studies that inconsistency is an inherent feature of subject indexing, rather than a sporadic anomaly [27]. Whereas there is a fair chance that a person remembers some of a document’s physical or bibliographic characteristics, or (parts of) its title or the author’s name, the same is not the case with a classification code or a subject heading which is not a part of the document itself, but is something that somebody has assigned to a bibliographical record. This corresponds to the finding by Lewis (1987): “Searching for known items by subject is very inefficient, but can be successful when other approaches fail”.
Our conclusion is that although a basic set of descriptive data (as provided by recognized reference style guides) is often fully adequate, there may be difficult cases, in which a broader set of descriptive data are needed, even including subject metadata. We can say, through a modification of a quote by Buckland: “We conclude that we are unable to say confidently of any bibliographical data that it could not be relevant for KIS” [28]. This does not mean, however, that it is impossible to prioritize bibliographical metadata for KIS.
5.4 Which kinds of metadata are best suited for KIS
We have seen that Kilgour prescribed how informed library users can issue effective known item queries, by including the author’s surname and specific words from the title of the item. Such a simple procedure resolves a very large share of known item searches, but not all. We have also considered how scholarly norms of bibliographical referencing, for example the Chicago Manual of Style, prescribe essential sets of metadata, which are meant to guarantee findability of the documents referred to, and we have seen that such norms develop over time, and today include the DOI for journal articles. This may be considered the essential knowledge about metadata for KIS. Still, however, there are difficult problems, which cannot be solved by such essential sets of metadata. We may fear that these problems will increase because of problems with hallucinations in systems like ChatGPT, as has been exemplified above.
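The kind of query Kilgour described (author surname plus selected title words) can be illustrated with a minimal sketch. It is not Kilgour's own procedure or any catalog's actual implementation: the record layout, the function names `tokens`, `known_item_match` and `search`, and the tiny in-memory `CATALOG` (built from three entries in the reference list of this article) are assumptions made only for illustration.

```python
import re

# Hypothetical in-memory catalog; the records are taken from this article's
# reference list, the field names are an assumption for this sketch.
CATALOG = [
    {"author": "Kilgour, Frederick G.",
     "title": "Known-Item Online Searches Employed by Scholars Using "
              "Surname plus First, or Last, or First and Last Title Words"},
    {"author": "Buckland, Michael",
     "title": "Information and Information Systems"},
    {"author": "Buckland, Michael",
     "title": "Known Item Search and Subject Search"},
]


def tokens(text):
    """Split a string into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def known_item_match(record, surname, title_words):
    """True if the surname occurs in the author field and every
    requested title word occurs in the title field."""
    author_tokens = tokens(record["author"])
    title_tokens = tokens(record["title"])
    return (surname.lower() in author_tokens
            and all(word.lower() in title_tokens for word in title_words))


def search(surname, title_words):
    """Return the titles of all catalog records matching the query."""
    return [record["title"] for record in CATALOG
            if known_item_match(record, surname, title_words)]
```

In this toy catalog, `search("Buckland", ["known", "subject"])` isolates a single record, whereas the surname alone (`search("Buckland", [])`) returns both Buckland entries. This illustrates why the title words are needed to identify one item, and also why such exact matching fails as soon as the user misremembers a word.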
While we have concluded above “that we are unable to say confidently of any bibliographical data that it could not be relevant for KIS”, we have also claimed that this does not mean that it is impossible to prioritize bibliographical metadata for KIS. This becomes, however, much more difficult beyond what is considered the essential set prescribed by referencing norms. It has been argued above that, contrary to Buckland’s suggestions, the dichotomies between individual/general concepts, referring/describing and material/non-material properties may not be important. The further development of metadata for this purpose may be based on studies of different kinds of documents in a way related to the ways in which documents are studied in the field of descriptive bibliography. (About bibliographical traditions, including descriptive bibliography, see Hjørland 2024; 2025.)
6. Conclusion
KIS is generally considered the easiest and the most successful kind of document searching in OPACs. Slone (2000, 763), for example, wrote that query formulations for KIS seem a natural state for searchers and that 88% of searchers were successful. KIS is, however, also a very frequently used kind of search, and some databases, such as WorldCat, are primarily used for known-item searches (see Wakeling et al. 2017). We have claimed that library cataloging, in contrast to scientific bibliographical databases, has prioritized KIS higher than subject searches. However, KIS often encounters greater problems when performed on the Web (see, for example, Dixon et al. 2010). Which strategies can be used by the library community to improve KIS?
One point is to reconsider the metadata in library catalogs. Seymour Lubetzky provided the important principle of functional library cataloging in which the purposes, functions, and values of the different kinds of metadata need to be carefully explored [29]. There is a need for updated investigations and considerations for cataloging of all kinds of information resources. More obviously, there is a need to provide techniques such as fuzzy spelling/spell-check techniques, already common in search engines (cf. Willson and Given 2010). It seems obvious to focus such efforts on databases such as WorldCat, which are mostly intended and used for KIS.
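A fuzzy spelling technique of the kind mentioned above can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not a description of how WorldCat or any search engine works: the `SURNAME_INDEX` list, the function name `suggest_surnames` and the similarity cutoff of 0.75 are hypothetical choices; only `difflib.get_close_matches` is an existing library function.

```python
import difflib

# Hypothetical author index; a real catalog would draw this from its
# authority file or from the indexed author field.
SURNAME_INDEX = ["Buckland", "Hjørland", "Kilgour", "Lubetzky", "Svenonius"]


def suggest_surnames(query, index=SURNAME_INDEX, cutoff=0.75):
    """Suggest indexed surnames that closely resemble a possibly
    misspelled query, comparing case-insensitively."""
    by_lowercase = {name.lower(): name for name in index}
    close = difflib.get_close_matches(
        query.lower(), list(by_lowercase), n=3, cutoff=cutoff
    )
    # Map the lowercase matches back to the indexed spelling.
    return [by_lowercase[match] for match in close]
```

A misspelled query such as `suggest_surnames("Lubetsky")` then points the user to the indexed form "Lubetzky" instead of producing zero hits. The cutoff governs the trade-off: a lower value tolerates more severe misspellings but also suggests more unrelated names.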
Kan and Poo (2005) provided a set of characteristics that distinguish KIS from subject searches. Based on such characteristics, machine learning was used, together with language modelling and machine translation evaluation techniques, to automatically identify KIS among other online enquiries. The authors found that this approach has the potential to streamline the interfaces of both OPACs and digital libraries in support of KIS. This too seems to be a way forward.
Acknowledgments
Thanks to the editors, Rachel Scott and Michael Fernandez, and to three anonymous reviewers for careful and detailed suggestions, which have been very helpful for improving the article. Also, thanks to the two anonymous reviewers in the second round of review.
Endnotes
1. KIS is also known as KIR (known-item retrieval). The term “navigational queries” for specific documents (as opposed to “informational queries”) was used by Khabsa, Wu and Giles (2016).
2. A subject search can be defined as the search for documents which are potentially relevant for a certain task, for example writing an academic paper on a given subject/topic, the existence and identity of which are unknown to the searcher. (Concerning the concept of relevance, see Hjørland 2010.)
3. Larson here has an endnote 4 referring to Harman (1996).
4. Cutter (1876). Cutter here uses the term “object” in the sense described in the Oxford English Dictionary: “I.2.a. A goal, purpose, or aim; the end to which effort is directed; the thing sought, aimed at, or striven for”.
5. Bibliographic ghosts (or phantoms) are references that refer to non-existing documents. About the example by Salton (1975) mentioned in the text, see Dubin (2004). The extremely hyped generative AI system ChatGPT-4o provides many bibliographical ghosts and other kinds of what is often called “hallucinations”, but should rather be called “fabrications” and “falsifications” (cf. Emsley 2023).
6. Entry “known-item search”. Reitz’s “definition” continues by including advice on how to perform KIS.
7. Already Buckland (1979) criticized the dichotomy between KIS and subject searches. On p. 145, he wrote: “In other words a ‘known item’ search may, in fact, be an indirect and disguised ‘subject’ search for specific information not necessarily unique to the document used”.
8. Buckland (2024) here has an endnote 3: “For example, Michael Buckland, Information and Information Systems (New York: Greenwood, 1991), 105 no. 3; Birger Hjørland, Information Seeking and Subject Representation (Westport, CT: Greenwood [1997]), 14, 20”.
9. Buckland (2024) here has an endnote 4: “Keyword searching is commonly used in subject searches, but not always, and keyword searches are not always subject searches. So the distinction in process between keyword search and other forms of search is different from the distinction in purpose between known item search and subject search and its examination would require a different paper”.
10. Other categories of search include “fact retrieval”, the search for specific information rather than for documents containing this information and “area search [which] is one in which a person uses the on-line system to identify the area of the library where a group of subject or author related books are located”, cf. Slone (2000, 762).
11. Oxford English Dictionary provides, among others, the following definition of the noun specimen: “4.a. A single thing selected or regarded as typical of its class; a part or piece of something taken as representative of the whole”. (In this article “specimen” is used, for example, about a single copy of a book representing all books published in the same edition of that book.) In relation to Buckland’s distinction between particular documents and specimens, different kinds of metadata are required for each of these categories. Individual documents are not normally considered in library services, but in special cases, such as rare book collections, detailed descriptions as known from the field of descriptive bibliography are needed. For an introduction to the different fields of bibliography, including descriptive bibliography, see Hjørland (2025). Concerning the description of individual copies of books, see Reed (2017).
12. Lee, Renear, and Smith (2007, 3) made a distinction between operational and conceptual definitions of a KIS, where operational definitions confuse the conceptual issue with the issue about how to perform KIS.
13. ISBN, on the other hand, is not normally considered an essential element in reference styles for academic writing, but it is a standard element in library catalogs. The fact that hardback and paperback versions get different ISBNs is important for the book trade, but not for scientific communication and library users. The researcher identification ORCID (Open Researcher and Contributor ID) was established in 2009 as a collaborative effort by publishers of scholarly research in order to resolve the author name ambiguity problem in scholarly communication. Many journals now require an ORCID, and it is now used by databases such as Web of Science, but it seems not (yet) to be demanded in academic reference manuals, or to be used by library catalogs.
14. Behnert and Lewandowski (2017) investigated the reasons that known-item searches in discovery systems resulted in zero hits and identified the following reasons: (1) item in stock, but query incorrect (e.g., containing spelling errors), (2) item not in stock, (3) item in stock, but incomplete or erroneous metadata, (4) query is ambiguous or not understandable.
15. For example, in the colophon of Szostak (2022) the publication year is 2023, but the book was out, catalogued by the Royal Library in Denmark and borrowed by me in 2022.
16. Bates (1979) provided an overview of twenty-nine “information search tactics”. She did not explicitly discuss strategies for known item searches, but Nicolaisen (2023) found that the combination of two of Bates’ tactics, “EXHAUST” (extension of a query) and “REDUCE” (shortening of a query), provides the best results for known item searches. Nicolaisen’s suggestion is to expand the search request with information that is supposed to match the document sought, and, if that document is not found, then reduce the search elements in order to remove potentially defective items and thereby increase the probability of success. In this process, it is helpful to have knowledge about which kinds of errors are common in bibliographical records and an understanding of why such errors occur (e.g., the confusion of family names and given names in documents by Chinese authors).
17. Dwyer et al. (1991, 235) wrote: “While it is tempting to suggest that more and better bibliographic instruction would increase the accuracy and efficiency of patrons' searching, it may be futile to try to provide instruction to the entire student body and/or faculty at an institution if many of them will be such infrequent users of the catalog that they will forget what they were taught before they come into the library again. It would probably be more effective to target instruction in the use of the online catalog and periodical printout to faculty and students who are just preparing to embark on research”.
18. Possible reasons for the diminishing role of verification in libraries may be: (1) that verification has become easier in the digital environment so that few such requests are received by the libraries; (2) that the absence of direct requests (which formerly often were forms completed in writing) in the online context implies that the users’ needs in this respect are not effectively communicated to the libraries; (3) that library administrators have downgraded this service, because they believe it has become unnecessary or (4) that the general library policy has changed towards making such tasks the users’ own responsibility.
19. Hjørland (2023, 1545, note 11) wrote that a strong tendency since about 2000 has been to save libraries’ own descriptive as well as subject cataloging and replace these with imported data. Therefore, departments for descriptive cataloging and subject classification have mostly disappeared today.
20. National Library of Medicine, for example, distinguishes descriptive and subject cataloging processes: “MMP [Metadata Management Program, formerly the Cataloging and Metadata Management Section] is responsible for review and development of cataloging policies for descriptive and subject cataloging and classification of all print, audiovisual, and electronic resources and applying them to resources acquired for the NLM collection” (retrieved January 25, 2025 from ). National Library of Medicine wrote: “A prospective indexer must have no less than a bachelor's degree in a biomedical science” (National Library of Medicine 2018): “Frequently asked questions about indexing for MEDLINE: Who are the indexers, and what are their qualifications?” . However, “As of April 2022, all journals indexed for MEDLINE are done by automated indexing, with human review and curation of results as appropriate. MeSH indexing for MEDLINE was done completely by human indexers until 2011” ().
21. Concerning descriptive bibliography and subject bibliography see Hjørland (2025).
22. The term biography is also used about non-human entities such as libraries and towns, for example, Widener: Biography of a Library (Battles 2004) and Jerusalem: the Biography (Montefiore 2011).
23. If literature is sought about an individual concept, such as “Copenhagen”, many documents may exist, and this is therefore a subject search. If a user is seeking a particular document about Copenhagen, the term “Copenhagen” is therefore insufficient, which is why other concepts, whether individual or general, have to be included in the search.
24. Buckland here has an endnote 15 referring to Strawson (1950).
25. Hjørland and Kyllesbech Nielsen (2001, 251-2) explained that “[h]ypothetically, it may be relevant to limit a subject search according to the name of a publisher, a journal, or even a language code. Subject data are not strictly limited to specific kinds of data; under specific circumstances any kind of data may serve to identify documents about a Subject”.
26. If a subject search is performed, and a number of potentially relevant documents have been selected, these documents may subsequently be looked up in a library catalog, which is a known item search process. If there are errors in some of their bibliographical descriptions, this may make a further verification process necessary. The point here is that there are no differences in the properties of documents found in subject searching and documents found in known item searching: they are in principle the very same documents.
27. On inter-indexer consistency studies see Hjørland (2018, 614-5).
28. The original quote is in Buckland (1991, 50, italics in original): “We conclude that we are unable to say confidently of anything that it could not be information”.
29. Svenonius and McGarry (2001, 48) wrote: “Studies was a landmark in the history of Anglo-American cataloging. To begin with, it was notable for the approach it took. This was a systematic approach, which took its departure from the assumption that before describing a book it is necessary first to be aware of the objectives that description is to serve. Only then it is clear what is and what is not to be included in a bibliographic record. Only with an awareness of the objectives is it possible to evaluate existing rules and to make proposals for change”.
References
ANSI/NISO Z39.29-2005 (R2010). Bibliographical References. National Information Standards Organization. Available at: .
Arguello, Jaime, Adam Ferguson, Emery Fine, Bhaskar Mitra, Hamed Zamani, and Fernando Diaz. 2021. “Tip of the Tongue Known-Item Retrieval: A Case Study in Movie Identification”. In CHIIR '21: Proceedings of the 2021 Conference on Human Information Interaction and Retrieval, 5-14. .
Bates, Marcia J. 1979. “Information Search Tactics”. Journal of the American Society for Information Science 30, no. 4: 205-214. .
Battles, Matthew. 2004. Widener: Biography of a Library. Harvard College Library.
Behnert, Christiane and Dirk Lewandowski. 2017. “Known-Item Searches Resulting in Zero Hits: Considerations for Discovery Systems”. Journal of Academic Librarianship 43, no. 2: 128-134. .
Bhargav, Samarth, Georgios Sidiropoulos, and Evangelos Kanoulas. 2022. “'It's on the Tip of My Tongue': A New Dataset for Known-Item Retrieval”. In WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, 48-56. .
Brown, Alan S. 1991. “A Review of the Tip-of-the-Tongue Experience”. Psychological Bulletin 109, no. 2: 204–223. .
Bruhns, Svend. 1999. Bibliografisk verifikation. 2nd. ed. Center for Bibliographical Studies.
Buckland, Michael. 1979. “On types of Search and the Allocation of Library Resources”. Journal of the American Society for Information Science 30, no. 3: 143-147. .
Buckland, Michael. 1991. Information and Information Systems. Greenwood.
Buckland, Michael. 2024. “Known Item Search and Subject Search”. Library Resources & Technical Services 68, no. 3: [1-9], quote p. 4. .
Chicago Manual of Style. 2024. Eighteenth edition. The University of Chicago Press.
Cochrane, Pauline A. 1983. “A Paradigm Shift in Library Science” (Guest editorial). Information Technology and Libraries 2, no. 1: 3-4.
Cooper, Michael D. and Hui-Min Chen. 2001. “Predicting the Relevance of a Library Catalog Search”. Journal of the American Society for Information Science and Technology 52, no. 10: 813-827. .
Cutter, Charles A. 1876. Rules for a Printed Dictionary Catalogue (Government Printing Office, 10). Available at ).
Dixon, Lydia, Cheri Duncan, Jody Condit Fagan, Meris Mandernach, and Stefanie E. Warlick. 2010. “Finding Articles and Journals via Google Scholar, Journal Portals, and Link Resolvers: Usability Study Results”. Reference & User Services Quarterly 50, no. 2: 170–181. .
Dubin, David. 2004. “The Most Influential Paper Gerard Salton Never Wrote”. Library Trends 52, no. 4: 748-764.
Dwyer, Catherine M., Eleanor A. Gossen and Lynne M. Martin. 1991. “Known-Item Search Failure in an OPAC”. RQ 31, no. 2: 228-236.
Emsley, Robin. 2023. “ChatGPT: These are not Hallucinations – They’re Fabrications and Falsifications”. Schizophrenia 9, 52. .
Fugmann, Robert. 1982. “The Complementarity of Natural and Indexing Languages”. International Classification 9, no. 3: 140–4. Reprinted in Theory of Subject Analysis: A Sourcebook, ed. Lois M. Chan, Phyllis A. Richmond and Elaine Svenonius. Libraries Unlimited, 1985: 392–402.
Harman, Donna K. (ed.). 1996. The Fourth Text REtrieval Conference (TREC-4). US Government Printing Office. (NIST Special Publication 500-236). Available at .
Hjørland, Birger. 1997. Information Seeking and Subject Representation: An Activity-Theoretical Approach to Information Science. Greenwood Press.
Hjørland, Birger. 2010. “The Foundation of the Concept of Relevance”. Journal of the American Society for Information Science and Technology 61, no. 2: 217-237. .
Hjørland, Birger. 2018. “Indexing: Concepts and Theory”. Knowledge Organization 45, no. 7: 609-39. DOI: 10.5771/0943-7444-2018-7-609.
Hjørland, Birger. 2023. “Description: Its Meaning, Epistemology, and Use with Emphasis on Information Science”. Journal of the Association for Information Science and Technology 74, no. 13: 1532-1549. .
Hjørland, Birger. 2024. “Bibliography (Field of Study)”. In ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, .
Hjørland, Birger. 2025. “Bibliographical Foundations of Information Science: A Review Essay”. Journal of Documentation 81, no. 1: 128-146. .
Hjørland, Birger and Lykke Kyllesbech Nielsen. 2001. “Subject Access Points in Electronic Retrieval”. Annual Review of Information Science and Technology 35: 249-298.
ISO 5127: 2017(E). Information and Documentation: Foundation and Vocabulary. 2nd edition. International Organization for Standardization.
Khabsa, Madian, Zhaohui Wu and C. Lee Giles. 2016. “Towards Better Understanding of Academic Search”. In JCDL '16: Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries: 111–114. .
Kan, Min-Yen and Danny C. C. Poo. 2005. “Detecting and Supporting Known Item Queries in Online Public Access Catalogs”. In Proceedings of the 5th ACM/IEEE Joint Conference on Digital Libraries. June 7 to June 11 2005, 91-99. .
Kilgour, Frederick G. 2001. “Known-Item Online Searches Employed by Scholars Using Surname plus First, or Last, or First and Last Title Words”. Journal of the American Society for Information Science and Technology 52, no. 14: 1203-1209. .
Larson, Ray R. 1991. “The Decline of Subject Searching: Long-Term Trends and Patterns of Index Use in an Online Catalog”. Journal of the American Society for Information Science 42, no. 3: 197-215. .
Lee, Jin Ha, Allen Renear and Linda C. Smith. 2007. “Known-Item Search: Variations on a Concept”. In Proceedings of the American Society for Information Science and Technology 43, no. 1: 1–17. .
Lewis, David W. 1987. “Research on the Use of Online Catalogs and Its Implications for Library Practice”. Journal of Academic Librarianship 13, no. 3: 152-156.
Montefiore, Simon Sebag. 2011. Jerusalem: the Biography. Weidenfeld & Nicolson.
Nicolaisen, Jeppe. 2023. Sådan finder du videnskabelig litteratur: databaser og informationssøgning [How to Search Scientific Literature: Databases and information Searching]. Hans Reitzels Forlag.
Reed, Marcia. 2017. “Provenance of Rare Books”. In Encyclopedia of Library and Information Sciences 4th ed. Eds. McDonald, John D and Michael Levine-Clark. CRC Press, 3766-3773.
Reitz, Joan M. 2004. Dictionary for Library and Information Science. Libraries Unlimited, Western Connecticut State University. Digital edition: ODLIS. Online Dictionary for Library and Information Science. .
Slone, Debra J. 2000. “Encounters with the OPAC: On-Line Searching in Public Libraries”. Journal of the American Society for Information Science 51, no. 8: 757–773. .
Svenonius, Elaine and Dorothy McGarry, eds. 2001. Seymour Lubetzky. Writings on the Classical Art of Cataloging. Libraries Unlimited.
Strawson, Peter F. 1950. “On Referring”. Mind 59, no. 235: 320–44.
Swanson, Don R. 1972. “Requirements Study for Future Catalogs”. Library Quarterly 42, no. 3: 302-315.
Szostak, Rick. 2022. Integrating the Human Sciences: Enhancing Cohesion and Progress across the Social Sciences and Humanities. Routledge.
Wakeling, Simon, Paul Clough, Lynn Silipigni Connaway, Barbara Sen and David Tomas. 2017. “Users and Uses of a Global Union Catalog: A Mixed-Methods Study of WorldCat.org”. Journal of the Association for Information Science and Technology 68, no. 9: 2166–2181. .
Wildemuth, Barbara M. and Ann L. O'Neill. 1995. “The ‘Known’ in Known-Item Searches: Empirical Support for User-Centered Design”. College and Research Libraries 56, no. 3: 265-281. .
Willson, Rebekah and Lisa M. Given. 2010. “The Effect of Spelling and Retrieval System Familiarity on Search Behavior in Online Public Access Catalogs: A Mixed Methods Study”. Journal of the American Society for Information Science and Technology 61, no. 12: 2461–2476. .
Žumer, Maja. 2018. “IFLA Library Reference Model (IFLA LRM): Harmonisation of the FRBR Family”. Knowledge Organization 45, no. 4: 310-8. .
Version 1.0 published 2025-02-05
Last edited 2025-02-10
Article category: KO in different contexts and applications
This article is a version of an article accepted for Library Resources & Technical Services. In online version 1.0, a table of contents and section numbers have been added, and the full references have been removed from the endnotes and replaced with a list of references according to the referencing system used by IEKO. How to cite it:
Hjørland, Birger. 2025. “Known Item Search (KIS): Theoretical and Practical Considerations”. Library Resources & Technical Services. July 2025 (further information and DOI to be added). Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/kis .
To quote text edited in a later version, you should save it in the Wayback Machine and cite the saved version.
©2025 ISKO. All rights reserved.