ISKO

Patrick Wilson

by Howard D. White

Table of contents:
1. Introduction
2. Consultants and aids
3. Modeling information seekers
4. Bibliographical control
5. Reimagining cataloging
6. Subject indication
6.1 Subjects of entire writings
6.2 Subjects of parts of writings
7. The Catalog vs. The Encyclopedia
8. Utility indexing and citation indexing
9. An ideal information system
10. The view from R&D
11. Trustworthy communication
12. Conclusion
Wilson’s oeuvre
Other references
Colophon
Abstract:
During 1965-2001, Patrick Wilson brought the acuity of a professional philosopher to library and information science (LIS) and became a major theorist in many aspects of knowledge organization (KO). This article, an extensive critical introduction to his thought, reflects the view that much of his work is of permanent value. He can be read for well-informed critiques of the instruments by which writings are organized for retrieval — the bibliographical side of KO. He can also be read for shrewd accounts of personal knowledge and behavior with respect to societal information systems — the social-epistemological side of KO. Indeed, in his work the two sides converge. One of his themes is the preferability of human consultants over bibliographies and catalogs for answering questions. He thus writes at length about the social organization of possible consultants and their degrees of cognitive authority in communicating what they know. Another theme is the desirability of indexing writings not only by subject but also by their possible utility in helping individuals. For that, however, he saw little hope. A third theme is ideal information systems. Broadly, he can be read for his clarifications of concepts on both sides of KO, such as bibliographical control, relevance, subject indeterminacy, information needs, information overload, librarians’ roles, and LIS as a field.

1. Introduction

Patrick Wilson (1927-2003) artfully brought rigorous attention to certain fundamental problems of knowledge organization (KO) in library and information science (LIS). His post-doctoral career spanned the 1960-2000 epoch in which the human-literature interface became the human-computer-literature interface. Although his work largely antedates the Web and Google, his analytical abilities, informed by very wide reading, are such that many of his works are likely to last. That is because he excelled at describing people’s situations vis-à-vis information services that the latest technology does not necessarily improve. He used philosophical reflection and thought experiments rather than empirical techniques to arrive at these descriptions, but in so doing he drew extensively on empirical research by others. A recurrent strategy of his is to characterize ideal services as a way of revealing the shortcomings of actual ones. Throughout his works he is hard on librarians insofar as their professional literatures hold out false promises for their services, but he is equally hard on information scientists insofar as their professional literatures rest on glib assumptions about what their algorithms or hypothetical systems will do. In both fields he undertook to deflate unwarranted claims and to temper even warranted claims with modesty. He was of a pragmatic and skeptical turn of mind.

Figure 1: Patrick Wilson

Wilson’s background was unusual among information scientists, many of whom come from the sciences or engineering. His bachelor’s and Ph.D. degrees, both from the University of California at Berkeley, were in philosophy. He had as well a bachelor’s degree in library science from Berkeley and experience in various Berkeley library jobs. These included part-time map cataloging while in school and then professional positions in reference librarianship (1953) and South Asian studies librarianship (1954-1959). In the latter position he published three large bibliographies (Wilson 1956; 1957a;b) while also writing a dissertation in the Anglo-American tradition of concept analysis (Wilson 1960a). He subsequently taught philosophy during 1960-1965 at the University of California at Los Angeles, and his first publications — on J. L. Austin, W. V. O. Quine, and aesthetics — appeared in philosophy journals (Wilson 1960b; 1965; 1966a). Given his earlier jobs, however, he was uniquely qualified to do something new — that is, to analyze as a philosopher what he had learned as a bibliographer (Wilson 1998a, 307-308). His conclusions, moreover, could be extended to all organizers of writings and thence to libraries and information services in the wider context of their users and non-users.

After transferring from UCLA to the faculty of Berkeley’s library school in 1965, he taught cataloging and published his first book, a treatise on bibliographical control called Two Kinds of Power (Wilson 1968; hereafter 2KoP). 2KoP’s forceful abstractions have won it many admirers (e.g., Smiraglia 2007; 2014), but its immediate forerunner was concrete and practical: a long, multidisciplinary, multilingual bibliography on South Asian science (Wilson 1966b). His major creative period was 1968-1983, during which his three books and most influential papers appeared, but he also developed many fresh ideas in the papers and book reviews of 1984-2001 (e.g., his analysis of copyright in Wilson 1990). A conference honoring his contributions to LIS was held in Sweden in 1993 (Olaisen et al. 1996). In a late memoir that is the best short account of his intellectual life (Wilson 1998a), he calls himself, dryly, “a bibliographer among catalogers”. A long, fascinating set of interviews he gave in an oral history project (Wilson 2000a) is titled Patrick G. Wilson, Philosopher of Information: An Eclectic Imprint on Berkeley’s School of Librarianship, 1965-1991. He was dean of that school (now the School of Information) during 1970-1975 and its acting dean during 1989-1991. In 2001 the American Society for Information Science and Technology gave him its highest honor for career achievement, the Award of Merit. His acceptance speech (Wilson 2001a) brilliantly distills the range of problems that attracted him.

What follows moves freely across his writings to extract themes and sometimes to contest points that bear on KO. Responses to his work by later writers are selectively cited but not discussed. Neither are his 25 book reviews (with one exception), but they appear in the references. A superb writer, Wilson elaborates and qualifies his ideas in considerable detail, and his arguments and occasionally amusing examples can be read for pleasure. The suasion of his style is largely lost in the present overview, but his own prose will often be quoted (italics in the quotations are his). His preferred form “bibliographical” has been adopted here, except when he or others use “bibliographic”. He also used “he” and “a man” in the old-fashioned way to stand for both sexes.

In its well-established narrower sense, KO deals principally with describing documents and organizing the descriptions for retrieval — that is, with products long associated with LIS (Hjørland 2016; Zeng 2008). But Hjørland (2003; 2008) and Andersen and Skouvig (2006) argue for a broader interpretation of KO — one that goes beyond the bibliographical concerns of LIS to relate knowledge to persons, groups, practices, and institutions in society. This sense has much in common with the field of social epistemology (Goldman and Blanchard 2018), which examines who knows what and how they know it. In Wilson, the two conceptions of KO converge. He wrote, for example (2KoP, 118), “The use of bibliographical instruments is frequently a stupid activity, as is, I suspect, known more or less clearly to many scholars, and provides an excellent reason why they should not do more of it”. The present account portrays both the bibliographical side and the social-epistemological side of his writings, with emphasis on their fusion (see also Hjørland 1996; Munch-Petersen 1996; Andersen 2004; Furner 2010).

2. Consultants and aids

An early non-philosophical work of Wilson’s hints at his subsequent thought. The first words of his “Introduction” to South Asia: A Selected Bibliography on India, Pakistan, Ceylon (1957a, 1) are: “If one intended to read only one book on India, that book should be Nehru’s Discovery of India, an inside view of Indian history and civilization by its most prominent spokesman”. The “Introduction” is in fact a three-page bibliographical essay in which Wilson briefly states what various titles are good for or why they might interest the reader. However, the basis for these recommendations is a 41-page, single-spaced, largely unannotated list of publications assigned to form classes (e.g., Periodicals) or broad subject headings (e.g., History - Kashmir) in the manner of a library catalog. The “Introduction” thus seems an attempt to superimpose on the aridly impersonal bibliography the face of a well-read advisor — someone concerned with the uses of publications as well as formulaic descriptions of them.

This evokes Wilson’s distinction in 2KoP between consultants and aids. Regarding a subject literature, he says (116), the consultant or advisor is “able to say where to start, and whether starting was worthwhile, whether one might expect to find much or little of value and where one might expect to find it. He would be able to understand our purposes, and make reasonable suggestions, if not specific recommendations, about the best ways of attaining them; but he might also suggest that our purpose was unattainable, or that no textual means would be likely to be of much value”. The consultant, in other words, has read or read about a fair number of items in the literature — knows them from the inside, so to speak — and can assess and prioritize them on an inquirer’s behalf, possibly including the one best thing to read. Suppose, for example, a reader complained that the Nehru book is too long. A responsive consultant could identify its most pertinent parts for that reader or name a shorter but still reputable history of India.

The aid, a relative outsider, is much more limited in such dealings. For instance, if a student in 1960 had wanted a book on the history of Kashmir, the aid might produce the titles under that heading in Wilson’s (1957a) bibliography, but could do little more than that, having formed no opinions about them. The aid (2KoP, 117) “can discover for us answers to bibliographical questions if the answers can be got immediately or mediately from bibliographical instruments […]. He is one who can do those things which can be done on the basis of knowledge of the specifications of bibliographical instruments, a minimum of general knowledge, and the specific instructions of the person he is aiding”.

“Bibliographical instrument” is Wilson’s generic term for tools such as free-standing bibliographies, printed or digital library catalogs, indexes, guides to literatures, and journals of abstracts. The “specifications” of such instruments (59-62) he defines as (a) the domain (i.e., the set of writings) from which their contents are drawn, (b) the principles for selecting their contents, (c) what counts as a listable unit in them, (d) how these units are routinely described, and (e) how the descriptions of the units are organized (cf. Bates 1976). (More on specifications later.) The aid might know instruments in this sense better than the consultant does, but that is not enough to make the aid more helpful.
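The five specifications can be pictured as a simple record. The following is a minimal Python sketch under stated assumptions, not anything Wilson himself proposed: the class name `InstrumentSpecification`, its field names, the helper `unstated`, and all example values are hypothetical and chosen only to mirror items (a) through (e) above.

```python
from dataclasses import dataclass, fields
from typing import Tuple


@dataclass(frozen=True)
class InstrumentSpecification:
    """Hypothetical record of the five specifications (a)-(e)."""
    domain: str                            # (a) set of writings drawn upon
    selection_principles: Tuple[str, ...]  # (b) grounds for including items
    listable_unit: str                     # (c) what counts as one entry
    description_rules: Tuple[str, ...]     # (d) how each unit is described
    organization: str                      # (e) how descriptions are arranged

    def unstated(self) -> Tuple[str, ...]:
        """Return the names of specifications the maker left empty."""
        return tuple(f.name for f in fields(self) if not getattr(self, f.name))


# Illustrative values only; they are not taken from any real instrument.
stated = InstrumentSpecification(
    domain="publications held by one hypothetical library",
    selection_principles=("in English", "on a given region"),
    listable_unit="separately published book or periodical",
    description_rules=("author", "title", "date"),
    organization="form classes, then broad subject headings",
)

# An instrument whose maker stated only the listable unit.
silent = InstrumentSpecification("", (), "book", (), "")
```

Calling `stated.unstated()` yields an empty tuple, while `silent.unstated()` names the four specifications a user would have to guess at.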

Wilson regards consultants and aids as ideal types that real people only approximate. The consultant resembles a scholar or subject expert in a given field; the aid, a librarian with bibliographical instruments and a collection in that field. As Wilson knew, there are scholar-librarians who can serve as consultants in their areas of expertise (he was one himself). However, deep subject knowledge is not generally presumed in librarians; it is not part of their professional image, so to speak. Librarians are front-line specialists in textual metadata. They are trained to acquire, create, and use sources in which writings are characterized, but which do not directly answer most non-bibliographical questions. Accordingly, when librarians seek to teach their potential customers what they know (informally or in classes), they discuss sources in which answers to questions might be sought, should the need arise. By contrast, consultants might use their own expertise to simply answer the question, obviating further inquiry. Wearing another hat, consultants might also synthesize research results for others, thereby conserving their reading time, which is another skill not ordinarily expected of librarians.

How society organizes potential consultants is discussed at length in Wilson’s 1977 book, Public Knowledge, Private Ignorance (PKPI), under the headings “Specialists in Knowledge” and “The Social Organization of Knowledge”. Not surprisingly, the availability of helpful information or advice on various matters is shaped most strongly by occupational structure: people know what their jobs require them to know. Within this structure, librarians are prepared to give help of three limited kinds (100-107):

Bibliographical assistance. Staff in special libraries may search literatures (as in Wilson 1992a) and prepare bibliographies for researchers, but service at this level is generally reserved for the fortunate few. Neither public nor academic libraries are staffed to give such time-consuming help to their numerous, relatively unsophisticated users. Rather, librarians in these settings (and most others) simply refer their users to existing bibliographical tools or to areas of the collection where self-service may be productive.

Question answering. Librarians do try to answer some non-bibliographical questions directly. That is, for customers with questions about specific matters of fact, they will search for answers in ready-reference tools such as almanacs and atlases. Nowadays, of course, people look up their own answers on the Web, and even when Wilson was writing, librarians’ ready-reference services were hardly over-used. But beyond the shallow nature of these services, Wilson notes that, as a rule, librarians were not prepared to vouch for the accuracy of their answers. They searched only until an answer had apparently been found and almost never tested it for correctness across more than one source. Independent checks thus quite often showed their answers to be wrong — something neither they nor their customers had suspected. At most, librarians consulted works they deemed authoritative and then attributed those sources in their replies. This failure to assume responsibility for the quality of their answers casts doubt on their professional status. “People talk lazily of libraries as storehouses of information,” writes Wilson (1998a, 311), “but they contain at least as much misinformation as information, and the problem is to tell the one from the other”.

Selection assistance. On many occasions, people would welcome trustworthy advice on what to read or where to find the best information. Consultants steeped in particular writings can usually perform this service for people better than bibliographical aids. Unfortunately, consultants like this are often undiscoverable (or, if found, unavailable). Librarians, by contrast, are easy to find, and if they could complement their collections with dependable recommendations, many people would benefit. However, librarians are not and cannot be universal experts, with detailed, accurate subject knowledge across many fields. What a librarian can do, Wilson notes (PKPI, 105) “is to produce others’ recommendations — reviews, lists of recommended readings, lists of standard or canonical writings. He may be able to say, like an assistant in a book store, that one title is very popular, another has been well reviewed, and another has apparently a good scholarly reputation. Again, the librarian avoids making an independent judgment on the accuracy and trustworthiness of a text; he reports the views of others, or gives the patron a collection and lets the patron do the deciding”.

Librarians do sometimes recommend other persons as sources of information, but here, too, they typically act more as aids than as consultants. “If I ask to be referred to a personal information source,” writes Wilson (107), “I do not expect to be referred to an arbitrary source, but to the best, or at least a good, source. I do not want a list, say, of doctors or lawyers; I can find that in the telephone book. I want to be told which is a good one. Even if there is only one agency or personal source for some sort of information, I want to know whether it is any good or whether I would do better to avoid it. This sort of advice is not, so far as one can tell from published literature, offered by libraries”.

So, are consultants always more effective? No, because they cannot guarantee their advice either — cannot guarantee that it will produce successful outcomes. In his chapter on “Reliability” in 2KoP, Wilson points out that, except for relatively simple problems, there are no clear tests of success in advising readers (126): “If my adviser tells me that a certain work is worth examining, and I do look at it but find nothing in it to my purpose, the outcome is perhaps no success but neither is it a ‘failure’ (except, perhaps, on my part), and does nothing at all to discredit the advice”. A recent illustration: after confessing an “embarrassing” inability to read poetry, the author Amy Chua (2018) says, “A good friend gave me Edward Hirsch’s How to Read a Poem, which I read and still have on my shelf, but it didn’t work”. Or take Wilson’s own claim that Nehru’s book is the best introduction to Indian history. Suppose someone begins it and gives up; is this a failure on Wilson’s part? No, his advice remains justifiable. On the other hand, suppose someone enjoys and learns from the book; does satisfaction with it prove Wilson right? Not altogether; another choice might have been even better. Wilson (1978a, 20-21) observes of subjective satisfaction in general: “There are obvious reasons why one should take care to see that users of information systems are satisfied, but it is not obvious that their satisfaction should be the goal of the system; rather, it is the satisfaction of their needs and wants that should be the goal”. He means satisfaction that is logically, not psychologically, related to fulfilling needs or wants, because the psychological kind may be illusory.

More broadly, who is competent to evaluate Wilson’s advice? Some experts on Indian history might agree with him, but others might favor other introductions; for any consultant, these evaluations are matters of opinion, not fact. Unless success can be measured by some objective test of utility, all we can hope for in grappling with superabundant writings are consultants’ best guesses on what to read.

3. Modeling information seekers

The foregoing account illustrates Wilson’s fusion of social and bibliographical themes. Stated tersely:

Instruments such as bibliographies and catalogs undoubtedly have their uses, but for many questions, persons are preferable sources of answers, if such persons can be found.
Characterizations of writings by their potential utility to us are preferable to neutral bibliographical descriptions of them. But then someone must evaluate writings for that purpose.
While people frequently have questions that writings can answer, most people do not want long lists of possible things to read. They want the one best thing to read, which again involves critical evaluation.
In all of these matters, what we would ideally like from information services, including those in libraries, is not what we can routinely get.

The notion of the one best thing to read, such as Nehru on India, is contextualized in Wilson’s discussion of library users and non-users in any large population (PKPI, 94-99). Here he posits a variable called studiousness. This is “the number of sources [i.e., full-text documents] one is prepared to use together in relation to a single decision problem”. Individuals who are unwilling or unable to study any document in relation to their problems are studious to degree zero. Individuals “prepared to study a single source, but no more” are studious in the first degree. For them, even two documents are too many (94-95): “If we use two sources together and both tell us the same thing, the second source has added nothing except, perhaps, a degree of confirmation. If the two tell us different things, however, the work of the additional job of comparison, reconciliation, and decision of which to believe is added”. Finally, “those willing to use together any number of documents” are “studious in the nth degree”, where n designates the tolerable number of documents.

Since each new document adds effort to reaching a decision, degrees of studiousness are distributed very unevenly across the population. Wilson imagines a falloff in which the largest group of people would be at degree zero, the next largest group would be at the first degree, and then the group frequencies would sharply decline rightward as the count of documents to be studied increased. Wilson does not put it this way, but it appears that, were actual data available, the frequencies would form a reverse-J or power-law curve of the sort common in LIS.
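The shape of such a falloff can be illustrated numerically. The sketch below is purely hypothetical: Wilson reports no data, and the function name `studiousness_shares`, the shifted power-law form, and the exponent `alpha` are assumptions made only to show what a reverse-J distribution over degrees of studiousness might look like.

```python
from typing import List


def studiousness_shares(max_degree: int, alpha: float = 2.0) -> List[float]:
    """Share of a population at each degree 0..max_degree.

    Assumes weights proportional to (degree + 1) ** -alpha, a shifted
    power law, so that degree zero is well defined. Both the form and
    the exponent are illustrative assumptions, not empirical findings.
    """
    weights = [(degree + 1) ** -alpha for degree in range(max_degree + 1)]
    total = sum(weights)
    return [w / total for w in weights]


shares = studiousness_shares(max_degree=5)
for degree, share in enumerate(shares):
    # Print a crude bar chart: one '#' per two percentage points.
    print(f"degree {degree}: {'#' * round(share * 50)}")
```

Under these assumptions the largest group sits at degree zero, the next largest at the first degree, and each further degree holds a smaller share, which is the pattern Wilson imagines.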

The studiousness variable can be used to partition any society’s members. Potential consultants on what to read can be defined as persons who have already shown themselves to be studious to various degrees in certain literatures. A library’s potential customers can be imagined as zero-book, one-book, and multi-book people — the first being mostly unreachable and the last frequently made up of aspiring or actual knowledge workers and decision-makers. (While Wilson’s oeuvre deals most often with knowledge workers, he also supervised Elfreda Chatman’s (1983) dissertation on the working poor, many of whom are in the zero-book category. She became known for her writings on how several such groups seek and use information.) In this context it is the one-book people — those studious in the first degree — who interest Wilson as he considers potential users of libraries as information centers. These users rarely if ever need what he calls “the complete library” — that is, a large set of deep collections accessed through complex bibliographical instruments. Instead, they need relatively small collections of readable “single-package” works that deal with commonplace problems and that can be accessed through browsing. But this, Wilson observes, is precisely what typical bookstores also offer. Bookstores do not eliminate the need for public libraries, but they put continual pressure on them to justify their economic existence.

If people have practical decisions to make (Wilson’s example is, “Should I sell my car and use public transportation?”), why would they not benefit from having a large array of well-indexed collections at their disposal? The PKPI chapter on this question — “Access to the Complete Library” — answers largely in terms of mismatches (88-93). The user may lack the right search terms to find items relevant to the decision. Topically-organized bibliographies may not align well with it. On-topic items may be low in quality. Items that would be jointly useful if found may be hard to find because they are topically dissimilar. Also, many items in the complete library may be written in technical vocabularies or foreign languages the user does not understand or understands only with difficulty. If the language of the items is understood, the user may still lack enough background to evaluate what is claimed in them. More generally, the user must be asking the right question and willing to commit non-negligible time and effort to it. And lastly, the content of the library items must be reasonably accurate and not false. Wilson’s compact presentation of these problems amounts to a rationale for avoiding large libraries whenever possible. As such, it contributes to basic information behavior theory.

Libraries and reading can be integrated in a general model of personal information systems (36-39). Most adults have internal images of the world that are more or less well developed, but whose particulars are continually updated by information from various sources. Wilson assigns these sources to three systems:

The monitor system. Everyday means of monitoring our surroundings are, first, observation and, second, communication with others. For example, we might routinely check certain places, talk to certain persons, and follow certain media reports. An inventory of our monitoring systems at a given time would list these sources, the topics associated with them, and the frequency with which they are used. Our habits are shaped by the perceived utility and quality of the information they supply.

The reserve system. We also know of non-monitored sources we can turn to if needed. We value having these potential sources in reserve even if that need never comes. For the great majority of people, libraries and the items they hold, such as reference works or databases, fall in this category. (So would Web reference sites.)

The advisory system. Wilson emphasizes the third system because it can supply not only information but counsel on what to do in problematic situations. While both persons and writings may qualify in the advisory role, persons are much more important, he says (38), because they “can fit advice to the circumstances of the particular case and the particular time, as documentary sources cannot with any exactness. Documentary advice must be more or less impersonal, directed to circumstances of a given type. Whether our own circumstances fit the type is exactly what one needs to know but cannot find out from documentary sources”. And again (40):

We can converse with people and (often) get quick answers. We can ask them, in effect, to reorganize what they know to bring it to bear on a problem and to select from their stock of knowledge the things that we should know. We can ask them to use our problems, our interests, and our capacities as bases for the selection, organization, and presentation of part of their stock of knowledge, or as bases for the giving of definite advice. They are supple and adaptive sources of information, as documentary sources are not. Anything a personal informant or adviser might tell us could be part of a documentary record, but documents do not reorganize themselves and rewrite themselves on demand to fit new questions.

4. Bibliographical control

For Wilson, stocks of knowledge in the traditional philosophical sense of “true, warranted beliefs” reside only in people’s heads (PKPI, 4). But writings can represent people’s knowledge — and their non-knowledge as well, such as their opinions, conjectures, fantasies, and false beliefs. Moreover, the same writing will frequently mix knowledge with non-knowledge, and because the two are rarely flagged as such in texts, the differences between them are by no means necessarily apparent. “Since there is no mark by which we humans can recognize the truth when we see it, we have invariably to make do with the best opinion we can get, the best attested opinion” (2KoP, 27). Thus, to call bodies of writings in their entirety “public knowledge” is to mislead (PKPI, 4-5). Yet by common consent, innumerable writings are worth reading. They pass the test for public knowledge when that is defined not as absolute truth, but as (5) “the view of the world that is the best we can construct at a given time, judged by our own best procedures for criticism and evaluation of the published record”.

Given that people with one or more degrees of studiousness often read for practical purposes, they naturally seek the writings that will advance these purposes the most. As noted, librarians have traditionally tried to assist them by providing instruments that characterize published writings of all sorts. Taken jointly, these characterizations bring writings under what librarians call bibliographical control. In 2KoP, Wilson reimagines such control as two kinds of power that a reader might have.

Exploitative control is the power to obtain the best textual means to an end. In its ideal form, the wielder of it (25) “has merely to say what he wants the writings for, and is then provided with whatever will suit that purpose best, whatever it is”. In practice, exploitative control depends on evaluating texts for their potential to help specific readers. Consultants may attempt to do this; more often, readers will attempt it themselves, by considering texts for virtues such as intelligibility, accuracy, adaptability, and scholarship. Simultaneously, readers must consider the utility of texts in light of their own interests, knowledge, and capacities. Their concluding step is to decide how well the texts have actually served them (22-23).

Descriptive control is the power to line up populations of writings that meet an evaluatively neutral description — neutral in the sense that no one has appraised their likelihood of helping the reader, or even how well they actually fit their descriptions. In its ideal form, the wielder of this power “can have summoned up every writing that fits his arbitrary description so long as the applicability of the description can be discovered without any consideration of virtues or vices or utilities” (25). As examples of neutral descriptions, Wilson gives (22) “authored by Hobbes”, or “discusses the doctrine of eternal recurrence”, or “contains the word ‘fatuity’”. Writings with these features we can imagine being retrieved through explicit term-matches in bibliographical indexes or full-text databases. Perhaps the most familiar implementations of descriptive control are instruments that characterize items by their genres, authors, titles, dates, and subjects. As a matter of policy, the most desirable expansion of such control for Wilson (147-148) would be greater revelation of the subject matter in writings.
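Descriptive control in this sense can be sketched as filtering a population of writings by a neutral predicate. The Python example below is a toy under stated assumptions: the `Writing` record, the helper names `authored_by`, `contains_word`, and `line_up`, and the three placeholder writings are all invented for illustration and do not describe any real catalog or database.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Writing:
    author: str
    title: str
    text: str


# A neutral description: a test that needs no judgment of value or utility.
Description = Callable[[Writing], bool]


def authored_by(name: str) -> Description:
    return lambda w: w.author == name


def contains_word(word: str) -> Description:
    target = word.lower()
    return lambda w: target in re.findall(r"[a-z]+", w.text.lower())


def line_up(population: List[Writing], description: Description) -> List[Writing]:
    """Summon every writing that fits the description, in original order."""
    return [w for w in population if description(w)]


# Placeholder population; the texts are made up, not quotations.
POPULATION = [
    Writing("Hobbes", "Toy writing 1", "A placeholder sentence about sovereignty."),
    Writing("Anonymous", "Toy writing 2", "A placeholder sentence on the fatuity of lists."),
    Writing("Hobbes", "Toy writing 3", "Fatuity, again, in a placeholder sentence."),
]

by_hobbes = [w.title for w in line_up(POPULATION, authored_by("Hobbes"))]
with_fatuity = [w.title for w in line_up(POPULATION, contains_word("fatuity"))]
```

Nothing in `line_up` appraises whether a retrieved writing would help the reader, which is the neutrality Wilson has in mind; exploitative control would require an evaluative step that this sketch deliberately omits.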

Descriptive control may be identical with exploitative control when retrieving neutrally described texts is an end in itself — for example, if I want the first edition of a play, or a book that, by virtue of full-text indexing, contains a textual string that I supply. But, in general, such control is weak at identifying writings by personalized function — Wilson’s ideal when both function and personalization are taken seriously. Suppose I want to remedy my shyness, and I look for “self-help” books on that problem. For me at least, the promise of any book so labeled may be deceptive; as in Amy Chua’s case, it does not help me help myself. Or suppose I want the best introduction to economics I can find, and I look for textbooks with “economics” and “introduction” in the title (26). It hardly needs saying that this is not a surefire route to what, for me, would be the most suitable introduction — the one best thing to read.

Exploitative control has its parallel in the services of consultants; descriptive control, in the services of aids. The better power to have, obviously, is exploitative control. Wilson comments (26): “The only reason for wanting the ability to line up a population in arbitrary ways is that one lacks the other power, and has oneself to attempt discovery of the best textual means to one’s ends by scrutiny of members of various neutrally described classes of the population”. Exploitative control is not imaginary; readers frequently do find the best textual means to ends. For instance, they find guides that not only lay out the steps for doing something, but that lead to success in doing it — e.g., kitchen recipes, statistical algorithms, parts catalogs, avionics manuals. However, in incalculably many cases, exploitative control exists only as an ideal. What is more, the “best textual means”, even if found, may be unrecognizable as such (30). The situation resembles that of readers with respect to consultants: success in achieving goals is not guaranteed, if only because readers vary so much in the qualities they bring to text-based endeavors. Wilson nevertheless equates consultants with exploitative control (149), since it is obvious “that the use of bibliographical apparatus is not an activity engaged in for its own sake, that it is an activity that people will avoid so far as they can, and that it is in general more pleasant, more efficient, and quicker to ask a question of a person likely to know the answer than laboriously to seek the answer in catalogs and bibliographies”.

Both kinds of power can be assessed on certain dimensions (34-39): the populations they would serve, how reliably they can be exercised, the extent of writings they cover, their versatility in meeting demands of different sorts, and the nature of items supplied under them, from vague sets of unlocated titles at one end, to copies of full texts for personal ownership at the other. Wilson concludes his argument by imagining, as a rhetorical device, omnipotence on these dimensions (39-40):

If I had the greatest conceivable degree of exploitative control, I would be able to have the best means to my and everyone else’s ends supplied instantaneously, effortlessly, with absolute reliability, the supply consisting of the most suitable copy or performance in the bibliographical universe. If I had the greatest conceivable degree of descriptive control, I could have supplied, under analogous conditions, items satisfying or fitting any neutral or non-evaluative description whatever.

A fantasy, of course, but it jolts us into thinking about actual bibliographical instruments in terms of the powers they give. Take, for instance, a large library’s online catalog. How does it perform on the dimensions of exploitative control? Of descriptive control? Are its objectives even stated? What would be feasible advances in its capabilities? How is it linked to other bibliographical tools? By what criteria can its successes and failures be judged? Pursuing questions of this sort, one sees the relevance of knowing the general rules by which the catalog was constructed. One wants its specifications, which is why Wilson argues that makers of bibliographical instruments should state them. He himself did this to some extent (Wilson 1956, v-vi; 1966b, vii-x), but the practice is far from universal.

Recall that, in Wilsonian specifications, the set of documents considered for inclusion in an instrument is called a domain. Bibliographers create instruments by selecting documents from a domain on grounds such as their language, form class, subject matter, time of publication, and audience level. Then the bibliographers’ specifications of domain and selection principles, if trustworthy, imply that they have included all the documents in the domain that met their selection criteria, and that further searches over that domain are not needed. Explicit specifications thus increase the powers that bibliographical instruments provide by licensing certain inferences. For example, Wilson distinguishes on this basis between an inconclusive literature search and a search that is a negative success (2KoP, 58-59). A negative success occurs when we can infer that documents meeting our criteria do not exist, because bibliographers have established that fact through prior searches based on their specifications. A search is inconclusive when we cannot tell whether documents remain to be discovered, because bibliographers have not stated their procedures, leaving us up in the air.

Two further examples: specification of the units listed in the instrument (e.g., “books and articles only”) allows us to infer that other potentially valuable items must be found elsewhere. Specification of the routine descriptions of items (e.g., by author, date, and so on) allows us to infer that the absence of a descriptive feature (e.g., date) means that that feature is absent in the document and not simply omitted by accident. Wilson’s remarks on specifying how bibliographical instruments are organized by subject will be taken up in Section 6 and Section 7.
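
The distinction between a negative success and an inconclusive search can be sketched in code. The following is only an illustration under stated assumptions: the `Instrument` class, its `stated_scope` field, and the `search` function are hypothetical names invented here, not anything Wilson specifies, and the sketch assumes that a stated scope is trustworthy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instrument:
    """A hypothetical bibliographical instrument."""
    entries: list                        # descriptions of listed documents, as dicts
    stated_scope: Optional[dict] = None  # None when no specifications are stated


def search(instrument, query):
    """Classify a search as found, negative success, or inconclusive."""
    hits = [e for e in instrument.entries
            if all(e.get(k) == v for k, v in query.items())]
    if hits:
        return "found", hits
    scope = instrument.stated_scope
    # With a stated scope that the query falls inside, an empty result
    # licenses the inference that no matching document exists in the domain.
    if scope is not None and all(query.get(k) == v for k, v in scope.items()):
        return "negative success", []
    # Otherwise we cannot tell whether documents remain to be discovered.
    return "inconclusive", []


catalog = Instrument(
    entries=[{"language": "English", "form": "book", "subject": "saddles"}],
    stated_scope={"language": "English", "form": "book"},
)
unspecified = Instrument(entries=list(catalog.entries))
query = {"language": "English", "form": "book", "subject": "stirrups"}

print(search(catalog, query)[0])      # negative success
print(search(unspecified, query)[0])  # inconclusive
```

The two instruments list the same entries; only the presence of explicit specifications changes what the searcher may infer from an empty result.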

Bibliographers and librarians essentially do the same thing, says Wilson (1998a, 309), in that both groups search and analyze files of writings, select items for inclusion in new contexts, describe the items, and organize the descriptions. Their only real difference is that bibliographers make virtual collections of documents, whereas librarians make actual ones. But when Wilson first taught at Berkeley, ideas like these were not routinely part of library school courses. In his words: “my aim was to end the isolation of cataloging and classification instruction from questions of policy and alternative practices, to try to prevent students (and teachers) from thinking of the subject matter as just technical routine to be mastered and get them to think of it as a central part of a very large, complex system of bibliographical organization”. As “a bibliographer among catalogers”, he was familiar with the latter’s tendency to focus on detailed lore about current practices, a sort of myopia he opposed. His own mind was stocked with examples of librarians’ follies; for instance, as a Berkeley librarian he had been assigned to make entries for a labor-intensive catalog of maps that no one ever used (305), yet he also knew of valuable books that were not findable because they appeared in monographic series, and library policy at the time was to catalog the series by name rather than the books (2KoP, 61). Thus, his implicit question to the makers of any instrument is always: “Why are you doing this in this way? What purpose does it serve?” By stressing the design and critical evaluation of bibliographical instruments, and not solely their maintenance, he was performing the philosopher’s job of teaching his readers how to think.

5. Reimagining cataloging

Wilson’s proposals for library catalogs were visionary in their time, and some still are. These instruments put explicit descriptions of published items in specific arrangements or give the descriptions specific points of access. Naturally this triggers policy questions like “What items should a catalog cover?” and “How should the items be described?”. Well into Wilson’s career, the answers presupposed print technology, publication in book format, and card catalogs. Strictly speaking, books merely package content; they are not identical with what is packaged. Yet books were the unit cataloged (and remain so), partly because of their visibility and tangibility in collections. In breaking with this past, Wilson prepared his readers for new possibilities of computerization.

The first chapter of 2KoP is called “The Bibliographical Universe”. The items of this universe are not books but intangible writings (or recorded sayings), decoupled from any particular storage medium (6-11). A given writing may be regarded as a → work (a linguistic composition of any length judged more or less complete by its producer), as a text (an abstract string of linguistic symbols in a certain sequence), as an exemplar (the union of a text with a durable storage medium), and as a copy (a reproduction of an exemplar). The set of copies made from a single exemplar is an edition. Applying this chain to typical books is straightforward; for practical purposes, there is only one edition — one work, one text, one exemplar, and one set of copies. But for other publications, the chain is much more complex: many valuable works consist of families of texts that vary among themselves; the texts appear in different exemplars; editions made from diverse exemplars proliferate across locations; significant intertextual ties exist among different works, and so on. The cataloger attempts to bring order to this complexity so that works are discoverable and copies of them are findable. But note that library users typically want a copy of a work; its medium of storage (e.g., in a journal or a book) is secondary. Thus, the work-text-exemplar-copy chain opens up works of any length to cataloging; they need not be published as books to qualify. Note, too, that Wilson’s 1968 work-text-exemplar-copy chain anticipates the work-expression-manifestation-item chain of the computer-oriented → Functional Requirements for Bibliographic Records, or FRBR (Coyle 2016), which did not appear until the 1990s.
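
As a rough illustration, the work-text-exemplar-copy chain can be modeled as a small data structure. This is a minimal sketch, not Wilson's own formalization or the FRBR model: the class names, fields, and helper functions are assumptions made here, and the Macbeth holdings are invented sample data.

```python
from dataclasses import dataclass


@dataclass
class Work:
    """A linguistic composition judged more or less complete by its producer."""
    title: str
    author: str


@dataclass
class Text:
    """An abstract string of symbols in sequence; one work may have several."""
    work: Work
    label: str  # hypothetical tag distinguishing variant texts


@dataclass
class Exemplar:
    """The union of a text with a durable storage medium."""
    text: Text
    medium: str


@dataclass
class Copy:
    """A reproduction of an exemplar."""
    exemplar: Exemplar
    location: str


def edition(exemplar, copies):
    """Return the set of copies made from a single exemplar."""
    return [c for c in copies if c.exemplar is exemplar]


def copies_of_work(work, copies):
    """Return every copy of a work, whatever its text or storage medium."""
    return [c for c in copies if c.exemplar.text.work is work]


macbeth = Work("Macbeth", "William Shakespeare")
text_a = Text(macbeth, "text A")
text_b = Text(macbeth, "text B")
single_volume = Exemplar(text_a, "printed book")
anthology = Exemplar(text_b, "printed anthology")
holdings = [
    Copy(single_volume, "Main stacks"),
    Copy(single_volume, "Reserve desk"),
    Copy(anthology, "Main stacks"),
]

print(len(edition(single_volume, holdings)))   # 2
print(len(copies_of_work(macbeth, holdings)))  # 3
```

Searching by work rather than by exemplar is what surfaces the anthology copy, which is the point of treating the work, not the book, as the unit of interest.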

“The Catalog as Access Mechanism” (Wilson 1983a) subverts received ideas by taking them literally. In Charles A. Cutter’s 19th-century dictum, the catalog’s first objective is “to enable a person to find a book of which either the author, the title, or the subject is known”. Taking “find” literally, the card catalog did not suggest the whereabouts of any book not in its assigned place on the shelf. Nor did it lead to books not owned by the library but available through interlibrary loan or some other means. Nor did it lead to texts of the same work held by the library (e.g., Macbeth) if they did not occupy a whole book or were not in foreseeable volumes (e.g., Shakespeare’s Tragedies). For example, while analytics on a catalog card might reveal that Plays of the Supernatural (an imaginary anthology) contains Macbeth, analytics are not access points, and someone who did not already know that anthology by title or editor would not find that text of Macbeth. Why, Wilson asks, should one text of the work be cataloged but not another? Cutter’s second objective for the catalog is “to show what the library has by a given author, or on a given subject, or in a given kind of literature”. Taking literally “show what the library has”, the card catalog did not show authors’ works published, e.g., in serials or as book chapters. It also notoriously failed to show everything the library has on given subjects. For example, under the principle of specific entry (Wilson 1979a), books were assigned the most specific subject heading that covered the entire book, and so a book on, say, political polling would receive a heading indicating that topic. However, someone who assumed that everything the library had on political polling appeared under that heading would not find a book on, say, American political history that had rich material on it. (The catalog’s failures in revealing Cutter’s “kinds of literatures” are taken up below.)

“The Second Objective” (Wilson 1989a) continued to examine books vs. works in light of the increasing capabilities of computers and telecommunications. Those capabilities were creating virtual libraries of e-texts that could be stored, copied, and read anywhere; one no longer had to visit an actual library to obtain copies. This called into question Seymour Lubetzky’s 1953 formulation of a catalog’s two objectives, as phrased by Wilson (7): “the first, to ‘enable the user…to determine readily whether or not the library has the book he wants’; the second, to reveal what works the library has by a given author and what editions or translations of a given work”. The first objective lessens in force as library ownership of physical copies lessens in importance. However, given the priority of works for users, Wilson’s main assertion is that the second objective should actually be the first. That is, publication in book format should no longer be a screening device for determining what writings are cataloged; the same instrument can record and unite an author’s books, articles, book chapters, and papers in an “index-catalog”. (Today’s research libraries, e.g., Berkeley’s, are moving in this direction.) Thus, the work would replace the book as the unit cataloged. The first part of the catalog record would present the work’s author and title in standardized form, describe its content, and give (9) “historical or contextual information relating to its creation. The second part would be open ended, a potentially growing locating record telling us that the work appears in such and such a book, also in such and such a journal, and in such and such a microfiche collection, and so on”.

Importantly, virtual copies of a work would be added to its locating record. Historical or contextual information about a work could define it in terms of sequences of textual states. A finished, stable work emerges from drafts and may also be released in new editions, all of which join its sequence of states. In a growing work (11-12), “parts already completed are stable and new parts are continually added”. In a changing work (e.g., a database), “parts already produced are changed, new parts are added, and old parts are subtracted”; every update thus represents a new copy of the work. This opens new challenges in characterizing unstable works bibliographically, a problem less salient in the days before computers. A last consideration (14-15) involves what “smaller” genres might be cataloged, such as short stories, poems, book reviews, newspaper articles, and letters to the editor. While adding these to the main index-catalog would make it unmanageably large, there is no reason in principle why they could not be cataloged as above and linked to the main file in files of their own.

A more specialized essay on the second objective (Wilson 1989b) responds to Ákos Domanovszky’s proposals for cataloging editions of a work (as in the Lubetzky quote). Briefly, Wilson counter-proposes a policy (347) that would (a) represent distinct works separately, (b) label as identical the different editions of a work that have fully or nearly identical texts, and (c) bring together works that are strongly related by criteria other than textual identity. Item (c) means that works that derive from a core text, such as translations or adaptations of a classic, should be linked to it. This has long been accomplished by giving classics uniform titles and then cataloging derivative works under these titles. Item (b) is much more novel; to this day, catalogers do not label identical texts across editions. Yet many potential readers would like to know that one text of a work is substitutable for another (e.g., the proceedings version of a paper for the journal version), especially if the two are not equally accessible.

Wilson and Robinson (1990) found a state of incompleteness in the Library of Congress headings that identify Cutter’s “kinds of literatures” by form (e.g., directories) or genre (e.g., fiction). Catalogers typically add these as subdivisions to topical headings for works. What is implied, Wilson and Robinson ask, if a work has no form subdivision added to its record? They conclude that there are no “generic” works that cannot be cataloged by form. Rather, there are simply works for which proper form headings are as yet uncreated or unavailable for free assignment. Through a process of elimination, they determine some of these to be composite works, such as non-literary anthologies. Others are “single complete factual discursive” works, such as how-to-do-it guides and book-length introductions-to-something. These and many other potential form headings are already in use by publishers, authors, reviewers, and readers. They are also evaluatively neutral. The question then is why the Library of Congress leaves unnecessary gaps in its repertory of form headings — a question still with us today. (For some critical responses to Wilson’s ideas on cataloging, see Yee 1995 and Svenonius 2000.)

6. Subject indication

Large-scale provision of subject access to writings involves characterizing them with terms that supposedly express their degrees of topical similarity and that also map onto people’s interests. Put differently, the terms label places in pre-arranged schemes such as subject-heading lists, → thesauri, and classification schedules, and writings are assigned to those places in bibliographical instruments.

Describing a hypothetical subject scheme (2KoP, 66), Wilson makes the point, important for KO, that such schemes indeed list subjects, not concepts. In so doing, he distinguishes between understanding the meaning of terms, and using those terms to refer to writings. Subjects are indicated by the act of referring. For example, suppose an imaginary book called Flames tells the story of altar candles. Then by assigning the book to Altar candles in a scheme, one is in effect referring to its subject matter — to things that Flames itself refers to at length. The term Altar candles also has one or more meanings for the scheme’s users (perhaps aided by a scope note), and if one chooses, these meanings can be related to the concept of altar candles. However, Flames is not about how one understands the term or demonstrates that understanding, as if a concept were being analyzed; it is about altar candles in the world. “One can write about concepts,” Wilson says, “but most writings are not about concepts, but about other sorts of things, for instance, water, queens, candles”. Following his logic, it appears that, even in writings about concepts, authors are referring to concepts as their subject matter, and bibliographical terms that echo that fact would simply be subject indicators, not “concept indicators”. This would also hold if concept is merely being used to mean a complex abstract idea, such as “similarity” or “autism” or “democracy”.

The example of Flames has the advantage of seeming very straightforward, but, in Wilson’s view (69), bibliographical instruments that indicate subjects are “the most difficult to make and the least generally satisfactory”. He explains why in several of his works, but especially in 2KoP.

6.1 Subjects of entire writings

The 2KoP chapter “Subjects and the Sense of Position” analyzes the situation of those who assign entire writings to places in subject schemes. These bibliographers (to call them that) have already placed millions of works and continually add more; characterization by subject would therefore seem to pose few difficulties. But how, Wilson counters, do bibliographers decide the subject or subjects of writings so as to assign them to one or more positions? He calls the matter “deeply obscure” and notes that no manual in fact offers rules on how to do it (70-71). Nor are such rules ever likely to be found, because the very notion of → subject (or “topic” or “aboutness”) in writings, while not meaningless, is inherently vague. The common intuition that bibliographers can identify the subject of a writing requires them to choose a labeled position that precisely describes the work as a whole. That choice is far from easy in all cases. As Wilson writes (89), “The notion of the subject of a writing is indeterminate, in the following respect: there may be cases in which it is impossible in principle to decide which of two different and equally precise descriptions is a description of the subject of a writing or if the writing has two subjects rather than one”.

Suppose bibliographers could obtain lists of terms that identified (a) everything the writing explicitly mentions and (b) all its implicit concepts (i.e., abstractions inferred from its text without being mentioned in it). Wilson calls such a list the writing’s Cast of Characters (77-78). But even the Cast of Characters for a writing would not necessarily lead bibliographers to its unique subject; in fact, the Cast would likely contain multiple equally precise descriptions of it, thereby complicating placements. The far more limited information that bibliographers actually work with is still equivocal as to the subject (or subjects) of a writing. The heart of the difficulty is that bibliographers’ guidelines do not link the labeled positions in subject schemes to any consistent set of documentary features. If they did, specifications to that effect could appear in bibliographical instruments, but of course they do not. In contrast to, say, biological classifications of plants and animals, which are feature-based, the signs of aboutness in writings are left to bibliographers’ own judgments, and no feature or set of features in a writing determines what they might infer or wish to express about a work. At most, they operate by in-house conventions and precedents rather than by rules that everyone understands.

Here is Wilson’s main argument verbatim, but recast as bulleted points (90-91):

• If position is assigned on the basis of identification of some determinate feature of writings, we can know that items at a position will share features in common, and in some respect differ from items located elsewhere.
• But what can we predict about what items at a position will have in common, that will distinguish them from items everywhere else, if position is assigned on the basis of identification of subject?
• Of the items at other positions, some might have been assigned to this position if a different method had been employed of identifying subjects;
• items at other positions may resemble some of the items at this position more closely than the items at this position resemble each other, and
• this not because of mistake on the part of the locator, but because of the indeterminacy of the notion of the subject of a writing.
• No single feature, and no cluster of features, set off writings at one position from those at all other positions;
• the rules of assignment prescribe nothing definite, and no confident predictions can be made about what will be found in the writings at a given place.
• So the place has no definite sense.

Wilson’s critique is most applicable to subject classification and cataloging of books → in libraries; the thesaurus-based indexing of journal literatures in the sciences, e.g., medicine, is probably more predictable. Nevertheless, his broad account of subject indeterminacy explains the tendency of bibliographers to assign writings inconsistently (cf. Wilson 1992a, 168). This is a problem hidden by the easy match in the Flames–Altar candles example.

There is no one best way to ascertain subjects. In an analysis that has influenced other writers (e.g., Hjørland 2001; Andersen 2004; Joudrey and Taylor 2018), Wilson describes four methods by which bibliographers might infer where writings should be placed (78-89). All have flaws, and all might yield different assignments for the same work. The four will be briefly paraphrased as directions, with Wilson’s caveats, introduced by “However,” immediately following.

Authorial purpose. Look for authors’ own statements of their primary purpose in writing — the “master plan” that governs the work as a whole. However, works often have more than one purpose, and the main one cannot always be readily identified, especially if the purposes are interlinked. Other works may have purposes that are indefinite or shifting or mischaracterized by the author (e.g., Goodman 2019 notes a misleading subtitle), which clouds placement decisions.

Figure-ground perception. Look for the text’s dominant entities — those foregrounded in the exposition, as opposed to others treated as background. However (83), “Dominance is not simple omnipresence; what we recognize as dominant is what captures or dominates our attention, but we cannot expect that everyone’s attention will be dominated by the same things”.

Reference-counts. Look for the items in the text that names, words, and associated pronouns most frequently refer to — i.e., estimate the counts. However, this does not guarantee that figure and ground will be clearly distinguished (83): “The constantly-referred-to item might be merely a background item, as a history of happenings in Petrograd might mention Petrograd constantly while the action was described in terms of a succession of different persons and their various doings”. Items frequently mentioned in a work — e.g., a person’s relatives — might also be grouped by bibliographers in equally plausible but arbitrary ways. At the same time, an apt subject term for a work might never occur in its text at all — e.g., the phrase “political career” in a work wholly concerned with incidents in someone’s political career.

Unifying rule. Look for a rule that seems to unite the elements of a work into a coherent whole — for example, its principle of inclusion and exclusion or the scope of the questions it answers. Such characterizations may not be made by authors themselves, but they can be inferred. However, this again requires bibliographers to impose their own insights onto authors’ texts, and decisions as to subject placements may again be arbitrary — “a piece of artistry on our part” Wilson says (88), “rather than on the part of the writer”.

The four methods all presume too much reading and cogitation to be feasible; they reflect principled judgments in theory, not what is or ought to be done in practice. Wilson knew full well that real-world bibliographers (91) “do not have time to brood over alternative possibilities, nor do they need, in most cases, to attempt a very precise description of subjects. It is their job to locate items quickly, and the organizational schemes they use are mostly too coarse to allow or require the making of fine distinctions. They find a location which satisfies them, and count this a success”. This imperfect solution still reigns, whether bibliographers are classifying books (under the maxim “mark it and park it”) or cataloging them under one or more subject headings.

6.2 Subjects of parts of writings

In the chapter “Indexing, Coupling, Hunting”, Wilson shifts to ways of making parts of writings — passages of various lengths — retrievable by subject, on the ground that these may be at least as valuable as writings in their entirety (93). He starts with two possible strategies. The first is to divide a writing into paragraphs (or other small stretches) and assign each to a single, finely discriminating subject position. But paragraphs, like entire works, are nebulous to assign, and authors’ inconsistent styles of paragraphing do not help. The second strategy is to greatly increase the subject terms applied to the writing overall — to assign it, that is (94), “to as many positions as we like or can afford”.

The latter set of positions would, at the extreme, be the Cast of Characters for the writing — all its implicit concepts and explicit mentions. Could concepts from the Casts of writings be merged to create a true “concept bibliography”? Wilson rejects the idea as delusory (95). He also rejects the idea that every explicit mention of something might be valuable, giving as cautionary examples “a hundred thousand mentions of Dante” (95) and “all discoverable discussions of the freedom of the will” (137-143). When he wrote, concordances existed, but keyword indexing of full texts by computer was hardly dreamed of. Now, explicit mentions in the Casts of digitized writings yield enormous retrievals. Entering “Dante” in Google currently produces 224 million documents. Entering “freedom of the will” produces 13.2 million. In a sense, the quality of such retrievals depends on how well people use keywords to index their own purposes. Even so, they tend to ignore all but a tiny fraction of Google’s indiscriminate search results, and they may also dismiss the texts (e.g., Wikipedia) that the search algorithm ranks highest.

There is thus still a place for human indexers guided by time-honored criteria for indicating subjects. “Internal criteria”, Wilson writes (98), “are those whose application requires looking at nothing but the writing being judged; external criteria are those whose application requires looking beyond the writing itself”.

His internal criterion for indexers is the apparent importance of discussions within texts. One test of this is perceived indispensability; that is, would removing a discussion greatly affect the text’s overall meaning? If so, the discussion should be → indexed. A perhaps quicker test is simply to observe the page-space devoted to something: the greater the space, the greater its importance. But while the latter test seems sensible, it is not always easy to decide where a discussion begins and ends. First, a subject must be identified, with all the difficulties that poses. Once that is done, direct references to the subject may be visible, but what about the indirect ones? What about passages strongly associated with it by implication? Nor is discussion length a reliable indicator in all cases; something very brief, e.g., a sentence or a few numbers in a table, could be the most retrieval-worthy item in the text. Judgments of textual importance on internal grounds, Wilson says, are essentially aesthetic in nature; they resemble the opinions of editors refining a manuscript for publication.

Judgments on external grounds are not aesthetic but social: indexers should bring out whatever in texts has potential value to readers. Depending on the intended audience, an indexer can highlight quite different aspects of the same text (98): “An indexer who knows the active interests of some group of people will count as important enough to mention whatever he thinks would be seized on by one with those interests”. A reader thus may care nothing about the length or dispensability of a passage as long as it is personally engaging. In the social case, a practical distinction is between indexing for a broad group (e.g., a whole discipline) and indexing for a few specialists or even one individual. The indexer’s understanding of readers’ differing goals and interests would then shape the criteria of importance. For instance, a new text might have one set of implications for the discipline and another for the few specialists, and the two groups would want indexers to respond accordingly.

Ultimately, however, the notion of “importance” is like the notion of “subject” in that it is not linked to any determinate set of features. It therefore cannot be captured in bibliographical specifications or in instructions to indexers; what they do remains an art (100): “We can give long lists of examples of things to look for, but at the end of our list we must say ‘and so forth’, trusting to the wit of the indexer to extend the list, or to see how it could be extended”. In this case, the slipperiness of importance contributes to the inconsistent results that indexers produce. It also implies that the relatively impersonal indexing for a group (e.g., members of a discipline) might be wholly or largely useless for a particular member of that group.

Wilson then analyzes the situation of any individual faced with impersonal subject indexing — that is, anyone searching for writings on a subject in large bibliographical instruments. Such writings are defined by the searcher’s interests. If he or she can find these writings simply by consulting a known subject position, or simply by reading further descriptions of writings at that position (e.g., abstracts, excerpts), all is well. However, searchers ignorant of terms and placements must rely on their knowledge of the world and their inferential powers to make headway. In Wilson’s terms of art, one of their tasks is “hunting” — i.e., trying to predict likely subject positions under which to look. (To convey the difficulties involved, he presents the case of someone searching Dewey Classification positions for items on the history of the stirrup.) Another task is “picking” — i.e., trying to decide whether writings are retrieval-worthy when descriptions of their contents are inadequate.

Because bibliographers know that inferring subject positions is problem-ridden, they usually provide auxiliary tools to facilitate hunting (105-109). They supplement an instrument’s main arrangement of positions with alphabetical or classified indexes. They also explicitly refer searchers from one position to another to remind them where similar writings may be found (e.g., X, see also Y). Wilson calls these latter linkages “couplings”, and he distinguishes three sorts. Analytic couplings show semantic or logical ties between terms (e.g., synonyms, wholes and parts, genera and species). Factual or synthetic couplings link commonplace matters of fact (e.g., Pierpont Morgan and bankers; diamonds and cutting tools). However, the relations revealed by links of these first two sorts are seldom news; one knows many analytic couplings simply by knowing a language, and many factual couplings simply by having a standard mental encyclopedia. Of greater value are what Wilson calls overlap couplings, since they can reveal similar writings occurring in unfamiliar or unexpected positions. His example is the overlap between histories of Sanskrit literature and histories of Indian medicine.

As a “General Rule of Hunting”, Wilson proposes (110) that “Discussions of a thing X are more likely to be found in the context of discussions of a thing Y, the more closely related Y is to X”. But even to guess at X-Y relationships, searchers need background knowledge, and in this they vary greatly. Can bibliographers therefore couple the most closely related subject positions on their behalf? If so, how can closeness be estimated over vast numbers of subjects? Especially, how can valuable overlap couplings be made? Wilson in 1968 nibbled at the edges of certain statistical solutions then available (e.g., Kessler’s bibliographic coupling), but he did not foresee all the powers that computerizing bibliographical texts and then full texts would bring. That is, although he knew about word occurrence counts, he did not foresee the benefits of having co-occurrence counts instantly available in very large databases — counts of co-occurring descriptors, co-citations, and the like. Co-occurrences can show perceived overlaps, and the higher the count, the closer the coupling. For example, books or articles described as histories of Sanskrit literature might be frequently co-cited with books or articles described as histories of Indian medicine, and bibliographers would not need to detect this overlap themselves; it would be automatically created by scholarly citers. The availability of large-scale co-occurrence data does not solve all problems, of course, as Wilson would be the first to note. But he would have to ponder decades of statistical solutions, including automatic term-weighting schemes now standard, if he were writing a new essay on bibliographical control.
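The co-occurrence idea can be illustrated with a minimal sketch. It assumes a toy set of reference lists (the work labels and citing papers below are hypothetical, not drawn from any real database) and simply counts how often two works are cited together; a higher count is read as a closer coupling. It is not a description of how any actual citation database computes such counts.

```python
from collections import Counter
from itertools import combinations

# Toy data (hypothetical): each citing work maps to the set of works it cites.
reference_lists = {
    "citing_paper_A": {"sanskrit_lit_history", "indian_medicine_history", "vedic_grammar"},
    "citing_paper_B": {"sanskrit_lit_history", "indian_medicine_history"},
    "citing_paper_C": {"indian_medicine_history", "greek_medicine_history"},
}

def cocitation_counts(reference_lists):
    """Count how often each unordered pair of works is cited together."""
    counts = Counter()
    for cited in reference_lists.values():
        # sorted() gives each pair a canonical order so (x, y) and (y, x) merge
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

def closest_couplings(work, counts, top=3):
    """Rank other works by how often they are co-cited with `work`."""
    partners = Counter()
    for (a, b), n in counts.items():
        if work == a:
            partners[b] += n
        elif work == b:
            partners[a] += n
    return partners.most_common(top)

counts = cocitation_counts(reference_lists)
print(closest_couplings("sanskrit_lit_history", counts))
```

In this toy case the history of Indian medicine surfaces as the closest partner of the history of Sanskrit literature without any bibliographer having to notice the overlap.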

He did, however, deliver one verdict in his final book review (Wilson 2001b). His main criticism of The Intellectual Foundation of Information Organization by Elaine Svenonius is that, in a world of “self-describing” digital documents, it accepts 150 years of subject organization in libraries as secure. The continuing scarcity of instructions on how bibliographers should use the traditional schemes (204) “ought to raise eyebrows: those secure foundations had little useful to say about the application of subject descriptions? Time, then, to start afresh”.

7. The Catalog vs. The Encyclopedia

Real-world bibliographers apply subject terms inconsistently in part because they lack explicit rules of procedure. Wilson takes up this matter in 2KoP by performing a thought experiment with a pair of imaginary instruments that do have explicit rules. Using distinctive capitalization, he calls them “The Catalog”, which affords descriptive control of writings by subject, and “The Bibliographical Encyclopedia”, which affords exploitative control of writings by utility (65-70). While the two might list the same writings, they would support different kinds of lookups because they are indexed by different rules.

Wilson first asks us to imagine an indexing scheme with many different labeled places (perhaps with interpretive comments added). The place-labels — i.e., indexing terms — can be names or descriptions of anything we like. In both The Catalog and The Bibliographical Encyclopedia (henceforth simply “The Encyclopedia”), writings are indexed by assigning them to places from the scheme. However, to construct The Catalog (66):

Assign an item to a place N, just in case the description that identifies N is a closer description of the subject of the item than is any other description in the list.

Whereas to construct The Encyclopedia (66):

Assign an item to a place N, just in case the primary utility of the item lies in the help it would give to one engaged in the serious study of the thing mentioned by the descriptive label that identifies N.

Under both rules, the places in the scheme are labeled the same, and many writings might be assigned to the same place in both The Catalog and The Encyclopedia. Given their different criteria, however, it is at least conceivable that no writing would occupy the same place in both — that is, every writing would be primarily useful for studying some subject other than its own. More likely, Wilson explains (67), this would occasionally happen because “the utility of a writing, if any, is by no means bound to lie in its contribution to the understanding of its subject. If I am seriously interested in the study of, say, concept formation among young children, I may get no help from the writings whose subject that is, but much help from writings whose subject is chimpanzees”.
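The contrast between the two rules can be sketched as two selection functions applied to the same scheme. The scores below are invented stand-ins for an indexer's judgments (Wilson gives no numerical scores, and the names `subject_fit` and `study_help` are assumptions made only for this illustration); the point is merely that the same place-labels yield different placements under different rules.

```python
# Hypothetical judgments an indexer might supply for one writing: for each
# place-label, how closely the label describes the writing's subject, and how
# much the writing would help someone seriously studying the labeled thing.
# The numbers are invented for illustration only.
judgments = {
    "chimpanzee study": {
        "chimpanzees": {"subject_fit": 0.9, "study_help": 0.4},
        "concept formation in children": {"subject_fit": 0.1, "study_help": 0.8},
    },
}

def catalog_place(writing):
    """The Catalog rule: the place whose label most closely describes the subject."""
    places = judgments[writing]
    return max(places, key=lambda p: places[p]["subject_fit"])

def encyclopedia_place(writing):
    """The Encyclopedia rule: the place where the writing's primary utility lies."""
    places = judgments[writing]
    return max(places, key=lambda p: places[p]["study_help"])

print(catalog_place("chimpanzee study"))
print(encyclopedia_place("chimpanzee study"))
```

The sketch assumes the hard part away: everything depends on where the `study_help` judgments come from.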

To use either system properly, users need to grasp its rules of assignment. Ideally, these would be explained in a specification as to how the bibliography is organized. Uninstructed persons would presumably find The Encyclopedia harder to interpret and use than The Catalog, since the titles of writings (indicating subjects) would more frequently clash with the place-labels (indicating utilities). But The Catalog could also pose serious problems to users, such as guessing the right level of generality for terms in subject searches. Wilson therefore warns that (67-68) “unless we understand the rules of assignment, including the rules that interpret the descriptive labels if there are any, we cannot know what it means about an item that is assigned a particular place, we cannot know what inferences we can draw about it and about the items which are not at its place. So we do not know what we are finding, and what we cannot expect to find, when we see an item at a place”.

When a writing could plausibly go in two or three places, Wilson imagines that the subject indexers of his thought experiment might not always follow the subject rule. Rather, they might arbitrarily switch to the utility rule and put it where it will “do the most good” (67). This is the indeterminacy factor in action.

Indeterminacy can be demonstrated in real-world practice. To adapt an example from White (1992, 103), the Library of Congress Subject Headings is a large, complex scheme with a place labeled “Social Surveys”. Subject catalogers have assigned to it (1) works that discuss techniques for doing surveys, (2) works that assemble re-usable questionnaires and scales, and (3) works that report results of surveys. Jumbling these three distinct genres under one label shows the label’s indeterminate meaning for both catalogers and catalog users (cf. the example of items under the label Economics in 2KoP, 64).

Where, then, might the three jumbled groups of works be placed if the rules for The Catalog and The Encyclopedia were strictly interpreted? The first group comprises methodological items that are on social survey research and that also assist in the study of such research as their primary utility. So they could appropriately be assigned to “Social Surveys” in both instruments. But the second and third groups are not on social surveys as a subject; they are on whatever the questionnaires and scales measure, or whatever the surveys were about. Thus, terms reflecting their actual subjects, such as sexual discrimination or attitudes toward foreigners, would suit them best in The Catalog. By contrast, the questionnaires and scales were used in surveys, and the completed surveys exemplify that form of research. Since their primary utility or function would lie in the study of past or future surveys, assigning them to “Social Surveys” in The Encyclopedia seems appropriate.

The Catalog and The Encyclopedia are again contrasted in Wilson (1978a), although not by those names. There, Wilson describes two systems with identical indexing vocabularies; in one, the documents are grouped by topic; in the other, by most significant use. He illustrates with a biography of Einstein that would be indexed under Einstein as a topic, but as “an example of a new method of biographical investigation” as its most significant use.

In the same paper, Wilson amended the Catalog/Encyclopedia distinction, now claiming that merely indexing a document by subject brings out its initial primary utility — that is, as a source of information on that subject. “Any further use the document has will depend on first putting it to this use, by reading and understanding it, by gathering the information it contains; this is the sense in which its use as information source is its primary use” (21). But if The Catalog does this, The Encyclopedia is worth compiling only if it brings out additional utilities. Wilson says (23) these might be “descriptions of logical relevance of documents to projects or problems”.

Real-world subject catalogers have always implicitly followed a rule that approximates the one for The Catalog. Without reading anything — there isn’t time — they simply re-express or copy key phrases (e.g., title words) from documents in authorized indexing vocabularies. By contrast, the rule for makers of The Encyclopedia is not one that ordinary subject catalogers can readily follow. To do so, they would need to read documents, then exhibit consultant-like knowledge in many topics and sometimes unusual creativity as well. Rephrasing Wilson’s examples:

Assign a writing on chimpanzees to concept formation in children because that insight occurs to you.

Assign a biography of Einstein to new methods of biographical investigation because you are aware it qualifies as such.

Utility indexing as in these examples requires indexers to imagine new functions of writings, and this kind of indexing cannot be routinized over vast bodies of texts in the same way as subject indexing. How, we may ask, can ordinary subject catalogers — or any indexers — be expected to predict the “most significant use” of all the writings they must process under time pressure? Moreover, whose most significant use? Could they ever know enough to judge every document in light of its “logical relevance to problems and projects”? This is rather like expecting them to connect hitherto unconnected literatures, the problem identified in Swanson (1986). Suppose I, as an indexer, read the item on chimpanzees but have no clever ideas on nonobvious uses for it, or I read the Einstein book but am unaware of its contribution to biographical method. By the rule for The Encyclopedia, I would still have to put them somewhere, and here the possibility for idiosyncratic guesses and mistakes seems great: for instance, I might assign a book on Bayesian statistics to “information retrieval” because I am ignorant of its relevance to other fields. Moreover, because my notions of utility would be unexplained, users of The Encyclopedia would have no quick way of learning why I assigned an item to a place. Worse, they could never be sure where to look for something.

Much of the time, Wilson treats subject indexing and utility indexing as if they were equally feasible. Yet he knew they are not, as Wilson (1980a, 18) shows:

It is a great challenge to librarians and bibliographers to provide what I call a “functional approach” to documents (Wilson 1978[a]), and what Swanson calls “problem-oriented access” to literature (Swanson 1980, 112), in which documents are described not, or not merely, as being about such and such a topic but as being of likely use in an inquiry of such and such a sort. I agree with Swanson that hope for major advances in such a direction may be illusory, my reason being that functional or problem-oriented organization of literature requires guessing about future utilities, and people are not very good at doing this.

Indeed, he admitted in Wilson (1983a, 15) that, for librarians to adopt a functional approach to writings in a tool like The Encyclopedia, they would “have to start not with particular books, but with particular questions or problems, and ask about each book, What if anything might this book contribute to solving or clarifying this particular problem?” Since this approach to KO would clearly require impossible amounts of time and manpower, librarians settle for “an instrument that is fatally flawed from the serious user’s point of view”. His bleak summary: “We can’t provide evaluations, and can’t organize materials functionally, in terms of uses to which they can be put rather than topics they’re about”. Wilson’s Berkeley colleagues M. E. Maron and William S. Cooper also proposed models of indexing that required unrealistic predictions from indexers, and he himself politely undermined their work (Wilson 1968, 96; 1978a, 14-15; 1979b).

Thus, while The Catalog has many instantiations in the real world, The Encyclopedia has none. Writings, it appears, could be indexed by their utilities only if the data were a by-product of some other activity and the process could be automated. As it happens, however, there has long been a form of utility indexing that meets these requirements and that also draws on the knowledge and creativity of consultant-like experts rather than aid-like indexers. That is citation indexing.

8. Utility indexing and citation indexing

In The Encyclopedia, a lone indexer predicts the future utility of a work, whereas in citation indexes, citers demonstrate the work’s past utility — its actual use history — in contexts from which various functions of the cited work may be inferred. The same work frequently has multiple citation contexts. Uncited works do not appear in these indexes, of course, but assuming a work is cited in the first place, citation indexes are arguably the richer form of utility indexing. In any case, they are the only systematic form we have.
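Mechanically, a citation index can be thought of as an inversion of authors' reference lists. The following is a minimal sketch under stated assumptions: the records and work labels are hypothetical, and the citation contexts are stored alongside each link purely for illustration (as discussed later in this section, SCI itself did not show them).

```python
from collections import defaultdict

# Hypothetical records: (citing work, cited work, sentence containing the citation).
citations = [
    ("Paper A", "Work X", "Our results support the theory advanced in Work X."),
    ("Paper B", "Work X", "We borrow the measurement scale of Work X."),
    ("Paper B", "Work Y", "Sampling follows Work Y."),
]

def build_citation_index(citations):
    """Invert citing -> cited records into a cited -> [(citing, context)] index."""
    index = defaultdict(list)
    for citing, cited, context in citations:
        index[cited].append((citing, context))
    return dict(index)

index = build_citation_index(citations)
for citing, context in index["Work X"]:
    print(f"{citing}: {context}")
```

Each entry under a cited work records a past use, and the two contexts for "Work X" suggest two different functions (evidence and method). A work that nobody cites never appears as a key.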

As noted earlier, certain writings misassigned to “Social Surveys” by The Catalog rule would be properly assigned there by The Encyclopedia rule. This claim can be linked to citation indexing if we imagine that The Encyclopedia’s labels are followed by explanatory chapters. Then the expert author of the chapter on “Social Surveys” could cite works on social surveys, or used in them, or exemplifying them, or even unrelated to them but helpful in making a point. The prose contexts of these various citations would often suggest “the logical relevance of works to projects or problems”, in this case to social survey research. More generally, they would imply that authors were using (and possibly evaluating) works for specific ends, a kind of exploitative control that others, too, might adopt.

Wilson fully realized the value of authors’ references in KO. Although he devotes most of 2KoP to what he calls the formal bibliographical apparatus, such as free-standing bibliographies and catalogs, he is at pains to note that the informal apparatus of references (i.e., citations) in learned literatures is potentially far more important (58): “Insofar as the parts of the informal apparatus refer to other works and specifically evaluate or reply to or build on other writings, they add links in the complicated network of bibliographical connections, a network the tracing of which in the informal apparatus may be more valuable, if more time-consuming, than any use of the formal apparatus”. Then, in a distinction paralleling that between subject experts and indexers (or consultants and aids), he immediately adds: “…if a man evaluates a work on which he has labored for days or years, his evaluation has a greater prima facie claim to be taken seriously than does that of one who had, by the magnitude of his task, to evaluate quickly and superficially an enormous number of writings”.

Given Wilson’s appreciation of references (seen again in Wilson 1983a, 6; 1983b, 243), it is remarkable that he wrote so little about citation indexing. He identified his “functional indexing” with Swanson’s (1980) “problem-oriented indexing,” but Swanson’s own real-world example of the latter is Eugene Garfield’s citation indexing. Garfield had contrasted his innovation with conventional subject indexing in papers from 1954 onward, and many other authors had joined him in exploring the features of citation networks. In fact, Wilson’s career coincides exactly with the growth of the modern citation-analytic literature, yet he remained aloof from it. He cites a few bibliometricians here and there; he describes two modes of citation retrieval in Wilson (1992a, 156); and he briefly discusses bibliometrics and citation analysis in characterizing LIS (Wilson 1983c, 1996a). But he excluded detailed treatments of citation indexing from his discussions of bibliographical utility. That is a gap, since indexing by citation links (and later by Web links) is the sole major complement to indexing by subject indicators, and analysis of “citances” — the sentences in which citations are embedded — adds to our exploitative control of writings.

In two instances, Wilson used his own experience to show the limitations of topical indexing. These very examples make his silence on citation indexing puzzling. In the social sciences, he notes (1980a, 18):

[W]ork that should be read may not be read for many reasons, including the reason that there was no way one could have discovered it using only bibliographical access systems based, as ours are, on topical indexing — one may be unable to guess the topic of work that would actually be of crucial importance to one's own research. I would have been quite unable to predict the topics of all the works I have found useful in working on this essay and would not have found them through a conventional subject index.

Wilson (1983b, 244) further notes that subject searches may lead to what has been explicitly said about a topic, but are no help to someone interested in what might cast light on it:

In this kind of case, the texts which you are looking for are texts that are functionally related to your question, but that need not be topically related. You want material you can use, and the things you can use may well have topics that are apparently quite unrelated to the topic of your question. For example, I recently came upon a paper on misleading metaphors in linguistics that I find enormously useful in understanding certain problems in information science. No train of see-also references could be expected to connect these topics.

The paper Wilson refers to is Reddy (1979).

It is true that conventional subject indexes would not have led him to the rich array of references with which he supported Wilson (1980a) or to Reddy’s stimulating paper. Yet the functional relationships he perceived are not lost; he himself preserved them. The references in his 1980 paper now lead backward to the earlier works he cited, and the works he cited now lead forward, through citation indexes, to his own 1980 paper. These references are texts that cast light on topics without being on them. The same would hold for the paper on misleading metaphors, had he cited it explicitly. (He did cite it in Wilson 1983d, 11.)

The earliest major article on citation indexing, Garfield (1955, 1123), distinguished between topic and function in a way analogous to Wilson’s:

If one considers the book as the macro unit of thought and the periodical article the micro unit of thought, then the citation index in some respects deals in the submicro or molecular unit of thought. It is here that most [subject] indexes are inadequate, because the scientist is quite often concerned with a particular idea rather than with a complete concept.

Garfield implies that, whereas subject indexing is applied to whole works, citations relate to authors’ discussions in passages, which need not topically resemble the whole work in any way.

Even an army of subject indexers, says Garfield (1955, 1123), could not feasibly index passages. Strikingly, however, “By using authors’ references in compiling the citation index, we are in reality utilizing an army of indexers, for every time an author makes a reference, he is in effect indexing that work from his point of view”. This is the key insight; as noted above, the contexts of citations imply rhetorical functions that cited works perform for citers. The functions reflect citers’ perspectives and may or may not be related to the citing work’s global topic. While this is not exploitative control in Wilson’s strict sense, it contributes to that power in ways that matching a person’s subject request does not. The point is exemplified in Garfield (1955, 1125-1126). Of the 23 papers that had cited Hans Selye’s classic endocrinological paper “The general adaptation syndrome”, none of them appeared under Adaptation in Index Medicus, and none of them is clearly related to Selye’s global topic. Instead, they provide evidence for Selye’s theory from a variety of fields — an extremely valuable kind of functional information that subject indexing and see-also references would have missed. Compare Wilson (1973, 460):

It must be obvious that the concept of evidential relevance is also of central concern in information retrieval. It is clearly a desirable characteristic of an information retrieval system that it be able to provide information that could help one arrive at conclusions or reasoned opinions even in cases where conclusive arguments are unobtainable.

The reasons for Wilson’s reticence on citation indexing can only be guessed. The Science Citation Index (SCI) began in 1964, and he knew it from both reading and personal examination. He also supervised a two-volume dissertation on it by Theodora Hodges (1972). This pioneering work, massively documented and thoroughly Wilsonian in character, reached conclusions moderately favorable to citation indexing and retrieval; in essence, SCI is good but not great. Every scholar interviewed by Hodges valued the familiar network of references to earlier works from a work in hand. But that, she argues, is because such references are embedded in contexts that help to evaluate them. By contrast, SCI shows the later works that cite an earlier one, but not the contexts in which it was cited; users must do further lookups to evaluate the function and worth of each citation. SCI-style retrieval is thus noisy, and the evaluation of functions in multiple contexts is very complicated. Wilson may have thought these conclusions by Hodges made further comment on his part unnecessary.

Then there is the matter of citation counts. Hodges noted the research involving them but found it unconvincing. Wilson apparently shared this staunchly humanist opinion. In Wilson (1980a, 6) he is skeptical about bibliometric counts in general, after earlier dismissing citation counts in particular as a substitute for evaluative judgments by individuals (PKPI, 7):

The scientist who publishes his results presumably wants to influence his colleagues and make a contribution to knowledge. If his work is unread, the first aim is not attained, but the second may still be. [Then in an endnote:] This is one strong reason for resisting the claim that citation counts, that is, counts of the frequency with which a piece of scientific work is referred to in subsequent publications, are an adequate measure of the value of scientific work.

A writing’s citation count roughly indicates its popularity among experts. But in Wilson’s discussion of exploitative control, popularity of this sort is another criterion he rejects — a point relevant here. He gives the example of a man who wants the best books on Cretan history (2KoP, 35):

We might take a request for the best books on Cretan history to be a request for those books that are most highly regarded by, say, “the experts” on Cretan history. If that is the request, it can be filled without any evaluation, for to report on the popularity or standing of a writing among a certain class of men is not itself to evaluate the writing at all.

He might equally have written: “those books that are most highly cited by, say, ‘the experts’ on Cretan history”. Recall that exploitative control involves evaluating a writing as a means to a personal end, and, in that sense, a book with a high citation count might indeed not qualify. However, this is to put logical consistency before pragmatic realism. Decades have passed since Wilson wrote, and bibliographical advice tailored to individuals is still ad hoc and unsystematic, if it is available at all. At the same time, formal and informal reviewing systems daily suggest various utilities of writings to various readerships. Wilson himself recommended the Nehru book to a readership and implied that Reddy’s paper on misleading metaphors might be “enormously useful” to readers outside linguistics. If we turn to citation counts as indicators of the general utility of specific writings — as proxies, that is, for the advice of consultants — we find that Google Scholar currently has a count of about 3,000 for the Nehru book and about 4,000 for the Reddy paper. Armies of citers have thus upheld Wilson’s evaluations from long ago. The citers’ form of utility indexing, moreover, comes as an automatic by-product of everything else they were doing.

9. An ideal information system

Wilson’s (1973) most cited paper, “Situational Relevance”, describes another ideal system — one that generalizes the notion of exploitative control of writings. What he envisions is far from the delivery of bibliographical references that match a user’s topical request. It more closely resembles the “expert systems” that would flower in the 1980s. His own system, an extraordinary one, gives us personalized answers to our dominant questions rather than things to read. Answers are information in the strong sense, says Wilson (1978a, 10), only if they are true, whereas information in the weak sense is merely content, which can include misinformation. In Wilson (1973), the system’s answers have been critically evaluated so as to be as true (or as warranted) as possible. More precisely, they are intelligence, connoting that they have been evaluated for the confidence we can place in them and their appropriateness to our situations.

Wilson’s (1973, 468) basic model is the intelligence supplied by human advisors, with or without computer support. Some of his formulations will be glossed in due course; others will already be familiar:

Once the idea of situational relevance is set forth, and the corresponding idea of significant situationally relevant information introduced, it is immediately apparent that information systems aimed at providing the latter sort of information would be particularly desirable sorts of systems. Such systems, supplying information rather than bibliographic references, on a regular or “standing” basis, providing a personal rather than impersonal approach, yielding information selected on the basis of logical relations to our concerns rather than on the basis of subject matter, taking into account one’s state of knowledge, perhaps operating in a “tutorial” mode, modifying or reformulating information so as to be comprehensible and acceptable to us (and hence of course also capable of misleading and misinforming us, like any other tutor), would be of enormous power and utility. As noted, commercial and military intelligence systems aim to deliver this sort of information, and we rely on friends and colleagues to serve as sources of such information.

Two of Wilson’s other glosses on relevance may be given in brief. In 1968, he is against using “relevant” to describe a document that simply fits a topical description or is satisfactory to the requester. Instead, he wants to preserve the term’s traditional senses of counting for or against a claim, or helping someone to solve a problem; the latter aligns with a document’s being the best textual means to an end (2KoP, 43-53). A decade later, bowing to inveterate usage, he says in Wilson (1978a, 16-19) that in information science “relevant” simply means “retrieval-worthy” and that one way in which documents may meet this vague standard is by being on a topic. (Above, this was also called their initial primary utility.)

His more stringent situational relevance in Wilson (1973) requires that system-supplied information address the concerns and preferences of specific individuals. Concerns are matters to which persons are not indifferent, such as the state of their health or wealth. Persons prefer, that is, one state to another (461): “A feature or aspect of a situation will be said to be of concern to a person if the feature can exhibit any one of several different specific states or conditions, and if the person cares which specific state or condition is the current one”.

Wilson assumes that specific states can be expressed as a set of questions about which state is current. For each question there is a set of answers, called the concern set, with which the system can respond to the question, and these answers must be at least partially rankable by personal preference (i.e., they are not all tied). Jointly the answers in the concern set describe a situation for a person. Every person also has a stock of beliefs about the world that includes beliefs about concerns. Then an answer from the concern set will be directly relevant to a belief if it increases or decreases the belief’s degree of confirmation. If an answer that is not part of the concern set prompts an inference that alters the belief’s degree of confirmation, it is indirectly relevant to the belief.

Wilson’s system is based on inductive logic — the logic of confirmation and disconfirmation by evidence, which admits of degrees of probability. However, it is modeled in part on a question-answering system by his Berkeley colleague William S. Cooper (1971) that defined the relevance of answers in terms of deductive logic — the logic of strict entailment. The following adapts one of Wilson’s own examples:

Belief about a concern: My bank balance today is at least $150.

Preference: I prefer $150 and all higher amounts to all lower amounts.

Question: Is my bank balance at least $150 today?

Directly relevant answer from concern set: Your bank balance today is at least $150.

Confirmation: Answer greatly increases probability of my belief.

Indirectly relevant answer not from concern set: Your check for $175 has just bounced.

Confirmation: Answer greatly decreases probability of my belief.
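The bank example can be restated as a small sketch. The probabilities are illustrative assumptions (Wilson assigns no numbers), as are the class and function names and the third, irrelevant answer; the code only encodes the distinction drawn above between answers inside and outside the concern set and whether they change the belief's degree of confirmation.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    in_concern_set: bool   # does it directly answer the question of concern?
    posterior: float       # illustrative degree of confirmation after the answer

def classify(answer, prior):
    """Label an answer, given the belief's prior degree of confirmation."""
    if answer.posterior == prior:
        return "not relevant"          # leaves the belief's confirmation unchanged
    kind = "directly" if answer.in_concern_set else "indirectly"
    direction = "confirms" if answer.posterior > prior else "disconfirms"
    return f"{kind} relevant ({direction})"

PRIOR = 0.7  # assumed initial confidence that the balance is at least $150

answers = [
    Answer("Your bank balance today is at least $150.", True, 0.99),
    Answer("Your check for $175 has just bounced.", False, 0.05),
    Answer("The bank has repainted its lobby.", False, 0.7),
]

for a in answers:
    print(f"{classify(a, PRIOR):32} {a.text}")
```

Note that the classification depends only on the logical relation between answer and belief, not on whether the person has actually seen the answer.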

For an answer — i.e., an item of information — to be added to someone’s stock of beliefs (or knowledge), the person must learn of it and accept it as true. However, the item may be situationally relevant even if the person is unaware of it, because this is, once more, a matter of logic, not psychology. For example, the bounced check is logically relevant to a concern with one’s bank balance even if one ignores the bank’s alerts. Moreover, the logical relevance of an item of information holds whether the situation is past, present, or future, although one’s concerns and orders of preference will naturally change over time.

Wilson (1973, 467) calls information significant “if it is directly relevant situationally, and if it is new information to the recipient at the time of its receipt”. Novelty thus figures in his account of relevance (as it does in many others in LIS). Significant information also revises beliefs. It must report “a condition that is either higher or lower in preference than the condition previously thought to exist (represents, that is, a change for the better or for the worse)”, or it must report “no change when a change for better or worse had been expected”. Information that is only indirectly relevant may not change one’s view of the situation, but it, too, can be called significant if it changes the “confidence or probability” one assigns to items in the situation description.

Information systems are often said to aim at retrieving items relevant to interests. In Wilson (1973, 464-465) interests and concerns may overlap or switch places, but they are not the same: “situational relevance [to concerns] depends on the existence of preferences about states of affairs; interest depends on, or consists in, wanting to know about a thing, being curious about a thing”. One can be interested in something (e.g., film noir, Zen Buddhism) without preferring that it be one way or another. Wilson (1977, 42) adds that concerns imply a commitment to act, if necessary, to attain a more preferable state. Given such commitments, a situationally relevant information system tells users what they ought to know if actions are to be taken. (The bank customer ought to know about the bounced check.) But absent the commitment to act, interests imply nothing that users logically ought to know. (A movie fan might like to learn about collections of film noir, but that information is not imperative.)

Chapter 2 of PKPI extensively analyzes personal information systems, especially as they pertain to decision-making. Describing “costly ignorance” in this context, Wilson writes (PKPI, 62):

We are sometimes sure that a piece of information would have been crucial in the sense that without it, a decision went one way, but with it, the decision would have gone another way. When the outcome of the more informed decision would have been better from our point of view than the outcome of the less informed decision, a loss has been incurred.

The loss is sometimes literally costly in money, but it generalizes to any concerning matter. On this basis, Wilson sharpens the hazy notion of “information need” in LIS: “Crucial information, lack of which would result in a worse decision, is needed information; information that is lacking but has no such effect is not needed” (PKPI, 63). He can thus define information need causally: information lacking ⇒ poorer decision. If the deliberations preceding a decision can be cast as formal premises, he can also define it logically: information lacking ⇒ better decision does not follow as a consequence. Needs thus characterized are objective, not subjective. Or as he puts it in Wilson (1978a, 19-20), “Questions of need are factual questions about the relation of means to ends. It is worth insisting on this, in opposition to the common view that needs are subjective psychological states”. The latter are wants.
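
The causal definition can be read as a counterfactual test: compare the decision reached with an item of information against the decision reached without it. The following Python sketch is only an illustration of that reading, not anything Wilson proposed. The function names, the toy decision rule, and the payoff figures are all hypothetical.

```python
from typing import Callable, FrozenSet

Info = FrozenSet[str]

def is_needed(item: str,
              known: Info,
              decide: Callable[[Info], str],
              value: Callable[[str], float]) -> bool:
    """Counterfactual test: an item is 'needed' only if lacking it
    leads to a worse decision than having it."""
    decision_without = decide(known - {item})
    decision_with = decide(known | {item})
    return value(decision_with) > value(decision_without)

# Hypothetical toy case: the bank customer from the bounced-check example.
def decide(known: Info) -> str:
    if "check bounced" in known:
        return "deposit funds first"
    return "write another check"

def value(decision: str) -> float:
    # Assumed payoffs: a second bad check incurs a fee.
    payoffs = {"deposit funds first": 0.0, "write another check": -35.0}
    return payoffs[decision]

known = frozenset({"check bounced", "bank opens at nine"})
needed_check = is_needed("check bounced", known, decide, value)
needed_hours = is_needed("bank opens at nine", known, decide, value)
print(needed_check, needed_hours)
```

In this toy case the bounced check counts as needed information because its absence changes the decision for the worse, while the opening hours do not, however much the customer might want to know them.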

Wilson (1978a, 22-23) relates needs and wants to a similar ideal information system, while Wilson (1986) examines them in the context of rule-governed library reference services.

10. The view from R&D

Despite Wilson’s preference for actionable intelligence over bibliographical lists, he was always attuned to groups for whom “the literature” is not merely an interest but a permanent concern. These are, broadly, research and development workers, such as scientists, scholars, and technologists; literature-based professionals; and students aspiring to those fields. In general, R&D workers may be said to want news of significant, situationally relevant writings (bibliographical intelligence, as it were) from their monitor or reserve or advisory systems. Wilson knew, of course, that they rely more on personal exchanges than on the written archive for research-related news, but he also knew they are not indifferent to documents that would advance their projects. At the same time, they must guard against having too many things to read. He thus devoted a series of papers, almost a short book’s worth, to their use of writings (Wilson & Farid 1979; Wilson 1980a; 1983b; 1993a;b; 1995a; 1996b;c). These might be called “the overload papers,” and they are relentlessly deflationary.

Wilson and Farid (1979, 128-132) analyze how individual researchers avoid burdensome reading, given normative expectations. As pragmatic skeptics, Wilson and Farid mostly deny that these norms are — or need be — strictly observed in successful research. Italicizing some terms from their account of norms, researchers should exhibit situational familiarity with the current state of knowledge affecting their projects, and historical familiarity with specific past studies that led to that state. They should exhibit expert situational and historical familiarity with writings by their direct competitors or by those doing parallel work. They should exhibit working or at least nodding familiarity with writings ancestral to theirs, and writings from donor fields that have exported theories, methods, or insights to their own. Not so, according to Wilson and Farid. They conclude that (142) “use of the literature is avoidable in theory and often in practice, except insofar as conventional requirements of scholarship prescribe its use. Its use is neither necessary nor sufficient for acquiring expert situational and historical familiarity with the immediate area of one’s work”. In particular, Wilson and Farid devalue the comprehensive literature search, which supposedly precedes or accompanies any serious scholarly project (cf. Wilson 1983b, 242). Especially for ancestral and donor studies, they say, such searches are misguided because they ramify endlessly. Wilson and Farid (129): “No research worker needs to be familiar with more than a small fraction of the work done by others; nor is the same degree of familiarity always needed or sought”. Researchers know they must cite works essential to their studies, but their further references to the literature are a matter of craft, not obligation.

The downside of being studious in Wilson’s sense is that the more writings one tries to consider in a fixed period, the more problematical reading and integrating them becomes. Yet the problem can be managed — for instance, by assembling teams who jointly know a wider range of writings than any one member (cf. Wilson 1996b, 194-195), or by asking other researchers for summaries of current knowledge rather than bibliographical advice, or by adopting the convention that ignorance of certain literatures is permissible. Another alternative is to read overviews of the literature rather than primary research reports (Wilson 1983b, 242). Librarians could help scientists in the latter regard, according to Wilson and Farid (143), by providing “more reviews, more authoritative critical surveys, more compendious works of reference, more works of haute vulgarisation, more works of synthesis” and by preparing “evaluative rather than simply enumerative bibliographies”. But in the social sciences, writes Wilson (1980a, 18-19), tactics like these will work only if the primary reports are dispensable (as, say, primary documents in history are not). The sole innovative way in which librarians could help social scientists is to assemble collections of materials hitherto scattered, so that they become more convenient to use.

The norm that researchers will have current knowledge of their fields means they should keep up with the literature, whether or not it bears directly on their immediate projects. Wilson (1993a) delves into the value of currency. Conventionally, current knowledge is a desirable (and sometimes mandatory) part of any researcher’s or professional’s social capital. But here again, the norm cannot withstand scrutiny; as a general notion it is vague; if made specific, the definitions are inconsistent from field to field and trail off into indeterminacy. In Wilson’s view (636): “A requirement of keeping up with developments in one’s profession is not unambiguously a requirement to know what is going on today that is new, nor a requirement of deep understanding, nor a requirement of an exact scale [i.e., level of detail] of knowledge, nor a requirement of knowledge of every nook and cranny of the profession, nor is it a requirement to maintain the same level of currency over all parts of the field for which one is responsible”. Moreover, the cognitive impact of any one current work on any one reader is highly uncertain and could be low or even zero. Then why read beyond some comfortable minimum in the struggle to keep up?

A decade earlier, Wilson (1983b) persuasively captured the view of a specialized researcher with regard to reading. It consists of rationing attention through intense self-centeredness (241): “[W]hat others are doing is of interest primarily as it affects one’s own work, and what doesn’t affect one’s own work can be ignored”. This does not mean that researchers will lack broad knowledge of the history and sociology of their fields, but monitoring publications is not the only way of getting that; another is from mentors “by osmosis” (246). The goal is to know enough, by whatever means, to succeed in one’s own projects. If exploratory literature searches are needed, Wilson recommends treating them as “a series of brief raids” (246), conducted with high cognitive flexibility (244):

One’s notions of what one is looking for change in the process of looking. One’s ideas of possible uses change, as one learns more, through successes or failures. One thing leads to another, in unforeseeable ways. And it would often be better to speak of making things useful than of finding them useful. One makes connections, constructs bridges. Spotting potentially useful texts is very much an exercise of imagination and insight.

Wilson (1996b) deals with the problem of overload for solo researchers in the social sciences or humanities as they try to read across specialties or disciplines. They do so in the belief that complex social or cultural phenomena cannot be adequately addressed in single specialties, because multiple specialties contribute relevant information. Wilson distinguishes (193) between “upkeep overload”, caused by the endless stream of new publications in any one field, and “task overload,” caused by the volume and variety of materials that a researcher must master in projects involving two or more fields. Both result in backlogs of reading, which in turn force continual prioritizations of what will be read. If solo researchers try to enter a field new to them in mid-career, they incur steep reading costs in time and effort. Some may create idiosyncratic new specialties out of prior ones, thereby gaining greater say over which writings are relevant and which are not. But either way, the division of their attention across fields will leave large gaps in what they know. Their attempt to extend the range of relevant information, while commendable, cannot enlarge their capacity to read. The only rational way to draw on relevant information from multiple specialties is to form teams — something many solo researchers may be loath to do.

These papers share as backdrop a scientific ideal that greatly preoccupied Wilson, which is that, to be rational, researchers must consider all relevant information in doing their work (Wilson 1993b; 1995a; 1996c). Individually, researchers do not — and cannot — live up to this ideal, because there is simply too much information in literatures for any one person to absorb. Teams are an improvement, but they, too, are cognitively too narrow. The proper unit for asking how well the ideal is met is the many-eyed research specialty: “It is not how some individual is affected but how the specialty as a whole is affected that is in question: it is the group as a whole that has to be persuaded that the information has an appropriate logical or evidential status” (Wilson 1993b, 379). The specialty’s cognitive situation comprises the cognitive situations of all its members, to whom inputs of information may or may not be situationally relevant.

Inputs can be communications from any source, oral or written, informal or formal. But how can they be evaluated? Wilson found a model in efficient market theory from economics. Empirical studies have shown that markets are efficient in the sense of using all available relevant information in setting prices. Wilson correspondingly asks the degree to which members of R&D specialties use all available relevant information in doing research. Given how situational relevance is defined, if a particular input is not new or would not substantially revise beliefs (revise them, that is, objectively across the specialty), then a specialty’s cognitive system can be called adequate as it stands.

In Wilson (1993b) he adopts a hypothesis from efficient market theory in three forms — (a) weak, (b) semi-strong, and (c) strong — and states it both strictly and loosely. Hypothetically, the R&D communication system is efficient in that:

(a) the current cognitive situation in a specialty is adequate to that specialty’s past productions. (Loosely, current opinion fully reflects all prior work in the same field.)

(b) the current cognitive situation is adequate to published information produced in any specialty. (Loosely, current opinion fully reflects all publicly available relevant information produced in any field.)

(c) the current cognitive situation in a specialty is adequate to all information, whether published or unpublished, available in any specialty. (Loosely, current opinion reflects both published and unpublished information available to any worker in any field.)

Wilson (1993b) and its direct continuation, Wilson (1995a), develop four lines of argument against the hypothesis, but, on balance, conclude that the question of efficient communication in R&D remains open. That is, the hypothesis is not refuted, at least in its weak or semi-strong forms, because strong evidence for it also exists. In the following sketches, Wilson’s arguments against the efficiency hypothesis come first; those for it begin with “But” in italics. The first two lines of argument are from Wilson (1993b); the second two, from Wilson (1995a).

Late finds of information. Communication in a specialty is not efficient if researchers repeatedly complain of finding relevant documents too late for them to be of use. Empirical studies have found numerous failures of this sort; they also have found unnecessary duplications of research and cases of being anticipated (i.e., scooped) by other researchers. Almost certainly many more failures along these lines go undetected.

But since cognitive differences among individual researchers are irremediable, not all late finds are equally significant; there must be a threshold. Moreover, even if a late find is very significant for a particular project, it does not greatly matter unless the failure affects multiple projects across a specialty.

The Frame Problem. Communication in a specialty is not efficient if it is impossible, even in theory, to design systems that will bring all “must-read” relevant work to members’ attention. This is a concrete example of the abstract Frame Problem from artificial intelligence — namely, the impossibility of formulating “rules that would specify, given a representation of the world, and given a change in some feature of the representation, what other features must change or at least be reconsidered” (Wilson 1993b, 379). In general, since no communication system can reliably recommend all desirable imports or exports of information among specialties, late discovery or non-discovery of relevant information by specialty members is inevitable.

But items that might be highly relevant to an individual or a team are not crucial imports at the specialty level. At that level, the crucial imports are the broad theories or methods that can be used by everyone. Since members in their entirety monitor various streams of research, it is likely that some of them will discover and publicize widely applicable work, even if more narrowly relevant items are missed.

Overload. Communication in a specialty is not efficient if it identifies far more relevant items than members have time to read. A sign of overload is that researchers’ strategies for managing their reading backlogs always lead to significant omissions.

But if no one can escape overload, then a sensible compromise must be accepted, which is to combine wide browsing with judicious prioritization. A project suffering from too many relevant documents can be redesigned so as to limit required reading.

Deliberate exclusion. Communication in a specialty is not efficient if, for whatever reason, researchers ignore admittedly relevant materials.

But in many specialties, research is routinely deemed successful even though countless relevant items go unread and uncited. There are various justifications (Wilson gives six) for bracketing such work — for example, to make a complex problem manageable. A broader justification is that, although the ideal of using all work relevant to a project may be rational, it is also impossibly demanding and hence wholly impractical.

Wilson (1995a) further implies that the LIS systems designers’ goal of providing all relevant materials and only relevant materials to researchers is defective. In retrieval system evaluations, “relevant” documents are defined as matching the query in topic. It is then presumed that the more matching documents a system retrieves, the better its performance (as measured by recall). It is also presumed that the fewer non-matching documents it retrieves for the same query, the better its performance (as measured by precision). This paradigm, which is with us still, is not well suited to the situation of most researchers, which “is more likely to be a surfeit of relevant information rather than a scarcity” (Wilson 1995a, 50). The counter-proposal in the same passage is to develop “aids in screening, evaluating, and filtering not just to distinguish relevant from irrelevant, but to separate dispensable from indispensable relevant material”. As a principle of design, Wilson’s dispensable/indispensable criterion from a generation ago remains radical. Today’s relevance-ranking techniques as yet scarcely address it. It encapsulates his constant theme that expertly chosen texts in small quantities are what readers need most.
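
The two evaluation measures, and the gap Wilson points to, can be shown in a few lines of Python. This is a minimal sketch under stated assumptions: the document identifiers are invented, and the `indispensable` set stands in for an expert judgment that the standard measures do not capture and that no particular system is claimed to supply.

```python
def recall(retrieved: set, relevant: set) -> float:
    """Share of the relevant (topic-matching) documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0

def precision(retrieved: set, relevant: set) -> float:
    """Share of the retrieved documents that are relevant (topic-matching)."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

# Hypothetical test collection for a single query.
relevant = {"d1", "d2", "d3", "d4"}    # documents matching the query in topic
retrieved = {"d1", "d2", "d3", "d9"}   # what the system returned

# Hypothetical expert judgment, not part of the standard paradigm:
# which relevant documents the researcher cannot do without.
indispensable = {"d2"}

r = recall(retrieved, relevant)
p = precision(retrieved, relevant)
must_read = sorted(retrieved & relevant & indispensable)
print(r, p, must_read)
```

Both scores treat the three relevant hits as interchangeable. The final filter, which depends entirely on the assumed `indispensable` judgments, is the step Wilson's counter-proposal asks system designers to support.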

11. Trustworthy communication

Whatever that theme is called — e.g., authoritative recommendation or individualized advice — it leads to Wilson (1983d), titled Second-Hand Knowledge: An Inquiry into Cognitive Authority (2HK). This is the most cited of Wilson’s three books, and the one classified in the Library of Congress scheme as epistemology rather than LIS. Wilson himself called it a work of social epistemology, greatly elaborating on a term coined by another bibliographical theorist, Margaret Egan (Egan and Shera 1952).

When Wilson wrote, philosophers had traditionally concerned themselves with first-hand knowledge, gained through direct experience. They had ignored knowledge of the extremely common second-hand kind, gained not through direct experience but by taking the word of others that something is the case (2HK, 13-26). What others say, of course, is not necessarily knowledge (or information in the strict sense); what we hear or read must be true or at least well-founded. Second-hand knowledge thus involves questions of truthful and hence trustworthy communication. Those most to be trusted in some matter are the authorities in that matter, and learning who they are, or what they have written, is a social pursuit. With some exceptions, such as cult leaders, their authority will be limited to hazily defined intellectual spheres (as wide as “all genetics” or as narrow as “Antiguan stamps”), and, even within those spheres, the degree to which their words can be trusted will vary (19-20). In any case, their authority will supposedly have been tested and determined over some period by other persons qualified to do so (26-35). The latter may be the authority’s peers within some activity, or they may be assessors and critics from outside the activity (15):

Cognitive authority is influence on one’s thoughts that one would consciously recognize as proper. The weight carried by the words is simply the legitimate influence they have.

People’s influence (21-26) may be justified by their métiers (in which specific knowledge is expected) or by their reputations (among the general public or a particular circle, such as one’s friends). They may have said or written something that is intrinsically plausible (if we know enough about a topic to infer what is plausible). They may have successfully performed tasks relevant to their claims (which is probably the best test). Finally, they may be believed simply because of personal ties (as when a mother believes her son, whatever he says).

Wilson calls such authority “cognitive” to distinguish it from administrative authority (the power to compel behavior). In this context he also distinguishes between an authority and an expert (26-30). The two terms are usually taken as synonymous, but, strictly speaking, expertise is oriented toward content, while authority is social in nature. The expert is simply one who knows a lot about something; the authority shares that knowledge with others. (The last person on earth could be an expert in survival techniques, yet, lacking an audience, would no longer be an authority.) Ordinarily, of course, authorities with expertise will seek to communicate it in a trustworthy way. One test is whether they can apply their knowledge creatively to new questions (16), which might include advising on credible writings in their specialties. They can be wrong in their recommendations, but they can also be pragmatically right (31-32), which sets them apart from those without opinions in the matter.

Since most of what people know, or think they know, consists of beliefs acquired from others through hearing or reading, and since “information” and “knowledge” in LIS usually refer to beliefs obtained by those means, Wilson thought it high time to examine trustworthiness, especially in writings, as a major qualitative variable. LIS, he points out, regularly announces “information” or “knowledge” as its stock in trade yet fails to emphasize truthfulness — the defining feature of those terms — in documentary quality control (171-179). Although social epistemology has become a thriving branch of philosophy, quality in this sense is still not much addressed in LIS, even though the field takes written sources of answers as its purview. Given that we want to read truthful or at least evidence-based accounts of the world, what writings can we trust to provide such accounts, and why should we trust them (165-170)? What roles do institutions (81-83), professions (131-134), the arts (107-112), intellectual fashions (57-71), and factions (90-93) play in determining authority? Who can advise us on the epistemic status of questions (17-18) — that is, on which questions are closed (answerable by knowledge) and which remain open (answerable only by opinion)? These are the sorts of topics Wilson explores. (See also Rieh 2002; McKenzie 2003; Sundin and Johannisson 2004; Rieh and Danielson 2007.)

Today, any developed society has a “knowledge industry” (39-46) — that is, a multitude of learned groups (81-114) that publish claims of fact and value about the world. It also has even more non-learned groups (123-156) that confidently state what is the case in one matter after another. As a rhetorical device, Wilson imagines all such claims put before a jury that could examine which of them are the best-attested — the most authoritative — and why (83-84). The examination is by no means straightforward, nor the results clear-cut. An objection to making trust depend on truthfulness is that some people — astrologers, for instance — gain reputations as authorities not by being truthful but by communicating what the credulous want to hear (34-35). An objection to making trust depend on personal reputation or achievement is that some writings have deservedly high authority — dictionaries and atlases, for instance — yet their authors are not at all well-known (81, 169). Wilson takes up many such complications; his book can be understood as an inquiry into the “-worthiness” part of “trustworthy”. His conclusions, as usual, are skeptical; scare-quotes can always be placed around “authority” of any kind, but in his view some groups, such as natural scientists (82-88), are more worthy of trust than others, such as social scientists (88-94), because, within human cognitive limits, the quality of their evidence is more compelling and their predictions are more reliable (cf. Wilson 1980a, 7-8).

The final chapter of 2HK focuses on libraries as institutions that conceivably might vet and prioritize written claims for various readerships. Wilson imagines, as yet another ideal, a library service that could not only provide authoritative accounts of what is known in various matters but could also rule out the possibility that other writings might be even better. At the utopian extreme, library staff would themselves synthesize such authoritative accounts on behalf of their users (170-171). In real life, of course, they do nothing of the sort; they simply deliver writings whose putative authority is conferred elsewhere — e.g., by scientific or scholarly associations, publishing houses, and review journals (165-169). Although the judgments of these latter institutions may be challenged, libraries are not in the running as a viable alternative. Again, librarians on the whole lack the subject expertise to vet the credibility of texts. Nor is there any societal demand for them to assume this role, or desire on their part to do so (176-179).

Wilson thus retains from his earlier books the theme of librarians’ inadequacy as experts (scholar-librarians excepted). Yet he does not dismiss them entirely. He grants that they do evaluate the cognitive authority of one class of works — the printed or digital reference tools they use in answering brief, closed questions (180-181). This authority is reflected in the lore that is part of their training in coursework or on the job. (For example, they might learn that Gmelin and Beilstein are trusted databases in, respectively, inorganic and organic chemistry.) Librarians moreover perform tasks ancillary to critical evaluations by others, such as building and managing collections, assembling requested writings quickly, and advising on the reputations of sources. Those activities facilitate the in-depth evaluation of texts by persons whose subject expertise is more advanced and specialized. As generalists, librarians may complement the specialists by serving, in a limited way, as “authorities on authorities”. Although unable to judge the trustworthiness of most texts themselves, they can provide information relevant to such evaluations — e.g., facts on authors’ careers, reviews of their work, even their citation counts. Wilson writes (180):

Librarians are in a particularly advantageous position to survey a wide field, to be at least superficially acquainted with the work of many different people, with many books, with many works evaluating and summarizing the state of knowledge in different fields.***[T]hey are in advantageous positions to develop a wide familiarity with reputations, with changing currents of thought, with external signs of success and failure. Along with knowledge of the standing of individuals, they can accumulate information about the standing of particular texts: particularly classics of different fields, standard works, and the like.

Some years later, Wilson (1991, 263) distinguished between cognitive authority gained by doing research (“the kind of authority claimed by practitioners and producers of the literature”) and authority gained by reading the literature produced (“a kind that can be acquired without being a practitioner in the area at all”). While practitioners and producers know how to conduct and evaluate research in an area, they are not necessarily experts on its literature. By contrast, literature experts know the substance and history of specific works, their intellectual or methodological perspectives, their intertextual relationships, their authors’ reputations. Expertise along these latter lines, Wilson asserts, is sufficient for evaluating research.

Literature experts may also act as consultants (Wilson 1991, 263):

I can ask a person who knows a body of literature well “Is there anything there that I should know about?” and hope that, once I have made it clear what my own interests and problems are, the other will be able to make connections between my situation and the literature of their field and steer me toward works that I might otherwise never have heard of. The crucial ability involved is the ability to see, or imagine, indirect or nonobvious relevances — i.e., the possible utility of works that have no obvious connection at all to my interests, which I would never have found by direct search because it would not have occurred to me to search for them. This ability, though marvelous, is not all that rare. Good librarians have it; graduate students may have it, helping faculty members by identifying potentially interesting material in regions unfamiliar to the faculty member.

Note the upgrade in librarians’ status from being mere aids in 2KoP (see also Wilson 1983b, 246).

The strong claim in Wilson (1991) is that librarians can teach students to be literature experts through bibliographic instruction (BI) courses. Departing uncharacteristically from skepticism, he says that BI can enable a student to evaluate the trustworthiness of texts in an area without knowing how to do research in it. For once, he seems too optimistic about what librarians can do, because a BI course alone could not give students these powers. In his account, the BI librarian would choose a specimen literature and present students with its “topography” — e.g., its sub-literatures, its bibliographical works, its genres of publication, its indexing schemes, and its links to other fields (266-267). Students would then do similar topographies in research areas of their choice, an approach that prefigures domain analysis (Hjørland 2002). As such, it seems appropriate only for advanced students with specialized subject interests. Even so, BI librarians could not assign their students the readings necessary to master an area’s subject matter, and only intensive, thoughtful interaction with multiple texts over time could give them cognitive authority as literature experts. It is that interaction, rather than “topographical” knowledge, that enables a librarian, a graduate student, or anyone else to recommend even obvious works to others. To recommend useful nonobvious works requires wide reading plus the ability to see hidden relevances on the fly, and creative insight like that cannot be a goal of BI because it cannot be taught.

Similarly dubious is Wilson’s claim that BI would enable students to evaluate the trustworthiness of texts. A student cannot become a chemist, or referee papers in chemistry, by taking a course in the structure of the chemical literature, nor does a student who domain-analyzes the maser literature thereby qualify as an authority on masers. As noted in 2HK (51-56), insiders in all learned disciplines guard their autonomy as judges of cognitive authority, and they would insist that socialization and experience in their research specialties are necessary to judge the soundness of contributions to them — not mere reading knowledge, let alone a mere BI course. Wilson (1991, 268) admits as much, but urges outsiders to make personal evaluations that simply flout insiders’ opinions. Granted, outsiders with literature expertise sometimes persuasively evaluate research in areas not their own — Wilson mentions the literary critic Frederick Crews on Freudian psychology — and, if they can do it, other talented outsiders can do it too. But, again, a BI course is neither necessary nor sufficient for such assessments; at best it would usefully supplement extensive reading and personal gifts.

12. Conclusion

It remains to add that such doubts pale in comparison to the achievements of Wilson’s thought. A spare diagram of that thought captures its comprehensiveness and its high applicability to KO and LIS:

Knowledge and non-knowledge in minds

Writings and recorded sayings

Bibliography

Reading down, philosophers have always investigated knowledge and non-knowledge, but much less often their problematic representation in writings and recorded sayings, i.e., in the multitudes of texts external to minds. Still less have they investigated problems of intellectual and physical access to those texts through bibliography, i.e., through formulaic writings about writings. Reading up, the justification of bibliography is that it gives us certain desirable powers over texts, which in turn give us certain desirable powers over knowledge (and valuable non-knowledge, such as informed opinions or classic fiction). A vital matter for study, then, is the nature of these powers — particularly their limits and failures as well as their successes and possible improvements. The philosophically unusual inclusion of bibliography in the diagram problematizes its relation to knowledge, allowing Wilson to discuss, for example, how expert consultants complement bibliographical instruments. Indeed, old passages of his lend themselves to a critique of present-day recommender systems, which are attempts to automate the role of consultants through algorithmic operations on bibliographical data. Designers of these systems typically judge their success by how well they return documents that resemble explicit queries. By contrast, Wilsonian consultants recommend documents on the basis of implicit criteria, such as trustworthiness, indispensability, and nonobviousness. Implicit criteria are of course a problem for computation, but Wilson at least prompts designers to explore ways of operationalizing them with explicit data (see, e.g., White 2017; Jin and Saule 2018).

The three levels of the diagram are always simultaneously present in his books. When the interviewer in Wilson (2000a, 134-135) asked him to name his most important publications, he answered [slightly edited]:

Oh, those books. Simply because there are three of them, and they fit together. They’re not independent books in a sense. They’re all facets of a central subject matter. See, from my point of view it turned out that leaving philosophy and coming back to librarianship — even though I didn’t expect it at the time, I had no idea at the time that it was going to work out like this — it turned out that the bibliographical core, the bibliographical problem, bibliographical center, provided a perfect platform from which to look at everything else in the world. A few overly enthusiastic classifiers and catalogers have said in the past, the classifier is in charge of all of knowledge. Well, just in the sense that you’re working with a classification system which tries to cover all of knowledge, and so you enthusiastically think of yourself as being somehow knowledgeable about all of knowledge. Well, in the weakest possible sense you are, I guess. But there is a grain of sense to that position. If you start from the situation of people saying things about the world, trying to find out about the world, lying about the world, concealing facts about the world, and writing it all down, [bibliography] is a good place from which to start asking questions about knowledge and the world of which it’s knowledge of. So it turned out to be a wonderful central position or platform.

Wilson’s bibliographical platform is thus a standpoint for analyzing KO in both librarianship and information science. In Wilson (1983c) he sums up the main intellectual component of LIS as “bibliographical R&D” and calls LIS-style information retrieval not a science but a branch of engineering. In engineering, solutions to many problems are reasonably determinate from the outset (Wilson 1996a, 320). Accordingly, he identified the clearest successes in information retrieval with the computerization of existing bibliographical data. Attempts at creating bibliographical data by computer, such as automating subject classification, he regarded as less successful, probably because the problems involved are less determinate. A metaphor in his ASIST Award of Merit speech nicely captures his own role in this regard (Wilson 2001a, 11, slightly edited):

I’m no engineer myself… I think of much of my work as related to information systems design in the way the study of the properties of materials, materials science, is related to traditional branches of engineering… Our version of materials science is to study problems in the description of content in terms of subject matter and form and function and relevance and utility to find out what can be done easily and what can’t be done easily or at all. And there is even an analogue in our field to the strength of materials that plays such a big role in older branches of engineering. One of the most important properties I’m interested in when I’m looking for arguments or evidence or proofs or persuasive cases is strength and the ability to bear a lot of weight.

At the same time, Wilson (1996a, 321-322) champions the fact that LIS sprawls beyond engineering into the social, behavioral, and human sciences, including his own brand of social epistemology, defined as “the social study of knowledge production and use” (Wilson 2001a, 11). From his bibliographical platform, writings reflecting knowledge production and use are looked at in terms of bibliographical consequences. LIS is unique among fields in regarding bibliographical consequences as a central concern, and Wilson was exceptionally wide-ranging in what he made of this. Take a hypothetical case, based on his interest in cognitive authority. As he implies, although we regularly read to gain knowledge, not every text delivers it authoritatively; some writers, for instance, lie or conceal facts about the world. Should we therefore try to index writings by their general trustworthiness, as is done with monographs in the Human Relations Area Files? Could star-ratings be given for degree of credibility? If not, should we at least link non-fiction books in online library catalogs to their reviews, as is done on Amazon.com? The standard guides for bibliographical descriptions of writings are of course mute on this matter, but it is the sort of problem Wilson might have relished.

Or consider the problem of rapid conceptual change, which he highlighted in Wilson (2001a, 11):

Conceptual change has huge consequences for those attempting to organize knowledge for retrieval and use. Conceptual frameworks get outdated, relevance relations change unpredictably, things fall apart.

As noted in Wilson (1996b), changes in relevance relations place new reading burdens on researchers and impel new social arrangements for coping, such as collaborative teams with members from diverse specialties. But information professionals, too, are affected. Wilson (1983a, 12) foresaw bibliographical consequences for catalogers. Conceptual change as reflected in new books leads to the creation of new subject headings. For older books, these new headings might be better than the headings originally assigned. But systematic review of older books for possible re-cataloging was not conducted then, nor is it now. As a result, “over time the amount of misdescribed material is bound to increase, the accuracy of the subject catalog declines, the quality is gradually degraded. This is something that automatic procedures cannot eliminate. There is no automatic recognition process for misdescribed works”.

For his breadth of vision alone, Wilson is inexhaustibly re-readable.

Wilson’s oeuvre

Wilson, Patrick. 1956. Government and Politics of India and Pakistan, 1885-1955: A Bibliography of Works in Western Languages. South Asia Studies, Institute of East Asiatic Studies, University of California, Berkeley.

Wilson, Patrick. 1957a. South Asia: A Selected Bibliography on India, Pakistan, Ceylon. New York: American Institute of Pacific Relations.

Wilson, Patrick. 1957b. A Checklist of the Writings of M. N. Roy. South Asia Studies, Institute of East Asiatic Studies, University of California, Berkeley.

Wilson, Patrick. 1960a. On Interpretation and Understanding. Ph.D. diss., University of California, Berkeley.

Wilson, Patrick. 1960b. “Austin on Knowing”. Inquiry: An Interdisciplinary Journal of Philosophy 3: 49-60.

Wilson, Patrick. 1965. “Quine on Translation”. Inquiry: An Interdisciplinary Journal of Philosophy 8: 198-211.

Wilson, Patrick. 1966a. “The Need to Justify”. The Monist 50: 267-280.

Wilson, Patrick. 1966b. Science in South Asia, Past and Present: A Preliminary Bibliography. New York: Foreign Area Materials Center, University of the State of New York.

Wilson, Patrick. 1968. Two Kinds of Power: An Essay on Bibliographical Control. Berkeley and Los Angeles: University of California Press. [Chapter V, “Subjects and the Sense of Position,” reprinted in Theory of Subject Analysis: A Sourcebook, edited by Lois M. Chan, Phyllis A. Richmond, and Elaine Svenonius, Littleton, CO: Libraries Unlimited, 1985. 253-268.]

Wilson, Patrick. 1973. “Situational Relevance”. Information Storage and Retrieval 9: 457-471.

Wilson, Patrick. 1977. Public Knowledge, Private Ignorance: Toward a Library and Information Policy. Westport CT: Greenwood Press.

Wilson, Patrick. 1978a. “Some Fundamental Concepts of Information Retrieval”. Drexel Library Quarterly 14, no. 2: 10-24.

Wilson, Patrick. 1978b. Review of Libraries in Post-Industrial Society, edited by Leigh Estabrook. Journal of Academic Librarianship 4: 95-96.

Wilson, Patrick. 1979a. “The End of Specificity”. Library Resources & Technical Services 23: 116-122.

Wilson, Patrick. 1979b. “Utility-Theoretic Indexing”. Journal of the American Society for Information Science 30: 169-170.

Wilson, Patrick. 1980a. “Limits to the Growth of Knowledge: The Case of the Social and Behavioral Sciences”. Library Quarterly 50: 4-21.

Wilson, Patrick. 1980b. Review of New Perspectives for Reference Service in Academic Libraries, by Raymond G. McInnis. Library Quarterly 50: 263-264.

Wilson, Patrick. 1980c. Review of The Vital Network: A Theory of Communication and Society, by Patrick Williams and Joan T. Pearce. Journal of Academic Librarianship 5: 351.

Wilson, Patrick. 1981. Review of Information, Organization, and Power: Effective Management in the Knowledge Society, by Dale E. Zand. Journal of Academic Librarianship 7: 295.

Wilson, Patrick. 1983a. “The Catalog as Access Mechanism: Background and Concepts”. Library Resources & Technical Services 27: 4-17.

Wilson, Patrick. 1983b. “Pragmatic Bibliography”. In Back to the Books: Bibliographic Instruction and the Theory of Information Sources, edited by Ross Atkinson. Chicago: Association of Research Libraries. 5-15. [Pagination here from the reprint in White, Bates, and Wilson, 230-246.]

Wilson, Patrick. 1983c. “Bibliographical R&D”. In The Study of Information: Interdisciplinary Messages, edited by Fritz Machlup and Una Mansfield. New York: Wiley. 389-397.

Wilson, Patrick. 1983d. Second-Hand Knowledge: An Inquiry into Cognitive Authority. Westport, CT: Greenwood Press. [Chapter 6, “Information Retrieval and Cognitive Authority,” reprinted in Knowledge Management Tools, edited by Rudy L. Ruggles. Boston: Butterworth-Heinemann, 1997. 121-144.]

Wilson, Patrick. 1983e. Review of Knowledge and the Flow of Information, by Fred I. Dretske. Information Processing & Management 19: 61-62.

Wilson, Patrick. 1983f. Review of Reading Research and Librarianship: A History and Analysis, by Stephen Karetzky. Journal of Academic Librarianship 8: 361.

Wilson, Patrick. 1984. Review of The Subject in the Dictionary Catalog from Cutter to the Present, by Francis Miksa. Library Quarterly 54: 109-110.

Wilson, Patrick. 1985. Review of Knowledge Structure and Use: Implications for Synthesis and Interpretation, edited by Spencer A. Ward and Linda J. Reed. Information Processing & Management 21: 370.

Wilson, Patrick. 1986. “The Face Value Rule in Reference Work”. RQ 25: 468-475.

Wilson, Patrick. 1989a. “The Second Objective”. In The Conceptual Foundations of Descriptive Cataloging, edited by Elaine Svenonius. San Diego, CA: Academic Press. 5-16.

Wilson, Patrick. 1989b. “Interpreting the Second Objective of the Catalog”. Library Quarterly 59: 339-353.

Wilson, Patrick. 1990. “Copyright, Derivative Rights, and the First Amendment”. Library Trends 39: 92-110.

Wilson, Patrick. 1991a. “Bibliographic Instruction and Cognitive Authority”. Library Trends 39: 259-270.

Wilson, Patrick. 1991b. Review of Envisioning Information and The Visual Display of Quantitative Information, by Edward R. Tufte. College & Research Libraries 52: 382-383.

Wilson, Patrick. 1992a. “Searching: Strategies and Evaluation”. In For Information Specialists: Interpretations of Reference and Bibliographic Work, by Howard D. White, Marcia J. Bates, and Patrick Wilson. Norwood, NJ: Ablex. 153-181.

Wilson, Patrick. 1992b. Review of History and Communications: Harold Innis, Marshall McLuhan: the Interpretation of History, by Graeme Patterson. Library Quarterly 62, no. 2: 232-233.

Wilson, Patrick. 1993a. “The Value of Currency”. Library Trends 41: 632-643.

Wilson, Patrick. 1993b. “Communication Efficiency in Research and Development”. Journal of the American Society for Information Science 44: 376-382.

Wilson, Patrick. 1993c. Review of Change and Challenge in Library and Information Science Education, by Margaret F. Stieg. College & Research Libraries 54: 275-276.

Wilson, Patrick. 1993d. Review of Dilemmas in the Study of Information: Exploring the Boundaries of Information Science, by S. D. Neill. Journal of Education for Library and Information Science 34: 90-91.

Wilson, Patrick. 1994a. Review of The Barefoot Expert: The Interface of Computerized Knowledge Systems and Indigenous Knowledge Systems, by Doris M. Schoenhoff. Journal of the American Society for Information Science 45: 220-221.

Wilson, Patrick. 1994b. Review of The Metaphysics of Virtual Reality, by Michael Heim. College & Research Libraries 55: 87-88.

Wilson, Patrick. 1995a. “Unused Relevant Information in Research and Development”. Journal of the American Society for Information Science 46: 45-51.

Wilson, Patrick. 1995b. Review of Knowledge-Based Systems for General Reference Work: Applications, Problems, and Progress, by John V. Richardson. Journal of the American Society for Information Science 46: 792-793.

Wilson, Patrick. 1995c. Review of Thinking Through Technology: The Path between Engineering and Philosophy, by Carl Mitcham. College & Research Libraries 56: 184-186.

Wilson, Patrick. 1996a. “The Future of Research in Our Field”. In Information Science: From the Development of the Discipline to Social Interaction, edited by Johan Olaisen, Erland Munch-Petersen, and Patrick Wilson. Oslo: Scandinavian University Press. 319-323.

Wilson, Patrick. 1996b. “Interdisciplinary Research and Information Overload”. Library Trends 45: 192-203.

Wilson, Patrick. 1996c. “Some Consequences of Information Overload and Rapid Conceptual Change”. In Information Science: From the Development of the Discipline to Social Interaction, edited by Johan Olaisen, Erland Munch-Petersen, and Patrick Wilson. Oslo: Scandinavian University Press. 21-34.

Wilson, Patrick. 1996d. Review of The Closing of American Library Schools: Problems and Opportunities, by Larry J. Ostler, Therrin C. Dahlin, and J. D. Wallardson. College & Research Libraries 57: 197-198.

Wilson, Patrick. 1996e. Review of Theories of the Information Society, by Frank Webster. College & Research Libraries 57: 487-489.

Wilson, Patrick. 1997a. Review of The Global Information Society, by William J. Martin. Journal of Academic Librarianship 23: 145-146.

Wilson, Patrick. 1997b. Review of Cognition and Complexity: The Cognitive Science of Managing Complexity, by Wayne W. Reeves. Journal of Education for Library and Information Science 38: 232-233.

Wilson, Patrick. 1998a. “Patrick Wilson: A Bibliographer among Catalogers”. Cataloging & Classification Quarterly 25: 305-316.

Wilson, Patrick. 1998b. “Cognition and Complexity - Response”. Journal of Education for Library and Information Science 39: 155-156.

Wilson, Patrick. 1998c. Review of Information Seeking and Subject Representation: An Activity-Theoretical Approach to Information Science, by Birger Hjørland. College & Research Libraries 59: 287-288.

Wilson, Patrick. 1998d. Review of The Scientific Revolution, by Steven Shapin. Library Quarterly 68: 102-103.

Wilson, Patrick. 2000a. Patrick G. Wilson, Philosopher of Information: An Eclectic Imprint on Berkeley's School of Librarianship, 1965-1991. Interviewed by Laura McCreery. Introduction by Howard D. White. Library School Oral History Series and University of California, Source of Community Leaders Series, Regional Oral History Office, The Bancroft Library, University of California, Berkeley. https://oac.cdlib.org/ark:/13030/kt958006vr/?brand=oac4.

Wilson, Patrick. 2000b. Review of The Information of the Image, by Allan D. Pratt. Library Quarterly 70: 135-137.

Wilson, Patrick. 2001a. “On Accepting the ASIST Award of Merit”. Bulletin of the American Society for Information Science and Technology 28, no. 2: 10-11.

Wilson, Patrick. 2001b. Review of The Intellectual Foundation of Information Organization, by Elaine Svenonius. College & Research Libraries 62: 203-204.

Wilson, Patrick, and Mona Farid. 1979. “On the Use of the Records of Research”. Library Quarterly 49: 127-145.

Wilson, Patrick, and Nick Robinson. 1990. “Form Subdivisions and Genre”. Library Resources & Technical Services 34: 36-43.

Other references

Andersen, Jack. 2004. Analyzing the Role of Knowledge Organization in Scholarly Communication: An Inquiry into the Foundation of Knowledge Organization. Ph.D. diss., Royal School of Library and Information Science, Copenhagen.

Andersen, Jack, and Laura Skouvig. 2006. “Knowledge Organization: A Sociohistorical Analysis and Critique”. Library Quarterly 76: 300-322.

Bates, Marcia J. 1976. “Rigorous Systematic Bibliography”. RQ 16: 7-26. (Reprinted in For Information Specialists: Interpretations of Reference and Bibliographic Work, by Howard D. White, Marcia J. Bates, and Patrick Wilson. Norwood, NJ: Ablex, 1992. 117-130. Also in her Information Searching Theory and Practice. Berkeley, CA: Ketchikan Press, 2016. 348-380.)

Chatman, Elfreda A. 1983. The Diffusion of Information among the Working Poor. Ph.D. diss., University of California, Berkeley.

Chua, Amy. 2018. “By the Book: Amy Chua”. New York Times Book Review, 4 February, 8.

Cooper, William S. 1971. “A Definition of Relevance for Information Retrieval”. Information Storage and Retrieval 7: 19-37.

Coyle, Karen. 2016. FRBR Before and After: A Look at Our Bibliographic Models. Chicago: ALA Editions.

Egan, Margaret E., and Jesse H. Shera. 1952. “Foundations of a Theory of Bibliography”. Library Quarterly 22: 125-137.

Furner, Jonathan. 2010. “Philosophy and Information Studies”. Annual Review of Information Science and Technology 44: 159-200.

Garfield, Eugene. 1955. “Citation Indexes for Science: A New Dimension in Documentation through Association of Ideas”. Science 122, no. 3159: 108-111. [Pagination here from the 2006 reprint in International Journal of Epidemiology 35: 1123-1127.]

Goldman, Alvin, and Thomas Blanchard. 2018. “Social Epistemology”. The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta. https://plato.stanford.edu/archives/sum2018/entries/epistemology-social/.

Goodman, James. 2019. “Freed, but Not Free”. Review of Separate: The Story of Plessy v. Ferguson, and America’s Journey from Slavery to Segregation, by Steve Luxenberg. New York Times Book Review, 24 February, 11.

Hjørland, Birger. 1996. “Overload, Quality and Changing Conceptual Frameworks”. In Information Science: From the Development of the Discipline to Social Interaction, edited by Johan Olaisen, Erland Munch-Petersen, and Patrick Wilson. Oslo: Scandinavian University Press. 35-67.

Hjørland, Birger. 2001. “Towards a Theory of Aboutness, Subject, Topicality, Theme, Domain, Field, Content…and Relevance”. Journal of the American Society for Information Science and Technology 52: 774-778.

Hjørland, Birger. 2002. “Domain Analysis in Information Science: Eleven Approaches – Traditional as well as Innovative”. Journal of Documentation 58: 422-462.

Hjørland, Birger. 2003. “Fundamentals of Knowledge Organization”. Knowledge Organization 30: 87-111.

Hjørland, Birger. 2008. “What is Knowledge Organization?” Knowledge Organization 35: 86-101.

Hjørland, Birger. 2016. “Knowledge Organization”. Knowledge Organization 43: 475-484.

Hodges, Theodora L. 1972. Citation Indexing: Its Potential for Bibliographical Control. Ph.D. diss., University of California, Berkeley.

Jin, Haofeng, and Erik Saule. 2018. “Toward Finding Non-Obvious Papers: An Analysis of Citation Recommender Systems”. https://arxiv.org/pdf/1812.11252.

Joudrey, Daniel N., and Arlene G. Taylor. 2018. The Organization of Information, 4th ed. Santa Barbara, CA: Libraries Unlimited.

McKenzie, Pamela J. 2003. “Justifying Cognitive Authority Decisions: Discursive Strategies of Information Seekers”. Library Quarterly 73: 261-288.

Munch-Petersen, Erland. 1996. “Patrick Wilson and the Classics”. In Information Science: From the Development of the Discipline to Social Interaction, edited by Johan Olaisen, Erland Munch-Petersen, and Patrick Wilson. Oslo: Scandinavian University Press. 233-243.

Olaisen, Johan, Erland Munch-Petersen, and Patrick Wilson, eds. 1996. Information Science: From the Development of the Discipline to Social Interaction. Oslo: Scandinavian University Press.

Reddy, Michael J. 1979. “The Conduit Metaphor: A Case of Frame Conflict in Our Language about Language”. In Metaphor and Thought, edited by Andrew Ortony. New York: Cambridge University Press. 284-324.

Rieh, Soo Young. 2002. “Judgment of Information Quality and Cognitive Authority in the Web”. Journal of the American Society for Information Science and Technology 53: 145-161.

Rieh, Soo Young, and David R. Danielson. 2007. “Credibility: A Multidisciplinary Framework”. Annual Review of Information Science and Technology 41: 307-364.

Smiraglia, Richard P. 2007. “Two Kinds of Power: Insight into the Legacy of Patrick Wilson”. In Proceedings of the Annual Conference of CAIS/Actes du congrès annuel de l'ACSI, edited by Kimiz Dalkir and Clément Arsenault. http://www.cais-acsi.ca/ojs/index.php/cais/article/view/735.

Smiraglia, Richard P. 2014. “Wilson”. In his The Elements of Knowledge Organization. Springer International. 9-12.

Sundin, Olof, and Jenny Johannisson. 2004. “Pragmatism, Neo-Pragmatism, and Sociocultural Theory: Communicative Participation as a Perspective in LIS”. Journal of Documentation 61: 23-43.

Svenonius, Elaine. 2000. The Intellectual Foundation of Information Organization. Cambridge, MA: MIT Press.

Swanson, Don R. 1980. “Libraries and the Growth of Knowledge”. Library Quarterly 50: 112-134.

Swanson, Don R. 1986. “Undiscovered Public Knowledge”. Library Quarterly 56: 103-118.

White, Howard D. 1992. “Publication and Bibliographic Statements”. In For Information Specialists: Interpretations of Reference and Bibliographic Work, by Howard D. White, Marcia J. Bates, and Patrick Wilson. Norwood, NJ: Ablex. 81-116.

White, Howard D. 2017. “Bag of Works Retrieval: TF*IDF Weighting of Works Co-Cited with a Seed”. International Journal of Digital Libraries 19: 139-149.

White, Howard D., Marcia J. Bates, and Patrick Wilson. 1992. For Information Specialists: Interpretations of Reference and Bibliographic Work. Norwood, NJ: Ablex.

Yee, Martha M. 1995. “What Is a Work? Part 4. Cataloging Theorists and a Definition”. Cataloging & Classification Quarterly 20, no. 2: 3-23.

Zeng, Marcia Lei. 2008. “Knowledge Organization Systems”. Knowledge Organization 35: 160-182.

Version 1.0 published 2019-05-07
Article category: Biographical articles

This article (version 1.0) is also published in Knowledge Organization. How to cite it:
White, Howard D. 2019. “Patrick Wilson”. Knowledge Organization 46, no. 4: 279-307. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, http://www.isko.org/cyclo/wilson