ISKO

Thesaurus (for information retrieval)

by Stella G. Dextre Clarke

Table of contents:
1. Introduction and clarification of scope
2. What is a thesaurus?
    2.1 Purpose
    2.2 Content and structure
    2.3 Definitions
    2.4 Why the confusion?
3. How a thesaurus is used
    3.1 For post-coordinate indexing and searching
    3.2 Networked uses, especially in the Semantic Web
    3.3 Other uses
4. History of thesaurus development and use
    4.1 Origins
    4.2 Period of ascendancy
    4.3 Systematization
    4.4 Maturity, senescence or rejuvenation?
5. Types and styles of thesaurus
    5.1 Overview
    5.2 The bare minimum
    5.3 Different styles for different communities
    5.4 Electronic thesauri
    5.5 Multilingual vs monolingual thesauri
    5.6 Macro- and micro-thesauri
    5.7 The search thesaurus
6. Performance and evaluation
7. The future of thesauri
8. Further reading
Endnotes
References
Colophon

Abstract:
In the post-war period before computers were readily available, urgent demand for scientific and industrial development stimulated Research and Development (R&D) that led to the birth of the information retrieval thesaurus. This article traces the early history, speciation and progressive improvement of the thesaurus to reach the state now conveyed by guidelines in international and national standards. Despite doubts about the effectiveness of the thesaurus throughout this period, and notwithstanding the dominance of Google and other search engines in the information retrieval (IR) scene today, the thesaurus still plays a complementary part in the organization of knowledge and information resources. Success today depends on interoperability, which is opening up opportunities in Linked Data applications. At the same time, the IR demand from workers in the Knowledge Society drives interest in hybrid forms of knowledge organization system (KOS) that may pool the genes of thesauri with those of ontologies and classification schemes.

1. Introduction and clarification of scope

This article is about thesauri intended for use in information retrieval (IR) [1], rather than literary thesauri, which are generally designed for the different purpose of helping and inspiring the choice of words and phrases in normal discourse. The first edition of Roget's Thesaurus, that very well-known literary thesaurus, came out long before the first IR thesaurus and probably inspired the invention of the latter. For this reason the History section of this article makes some reference to literary thesauri. In other sections, however, the term thesaurus invariably refers to the information retrieval thesaurus.

Encyclopedia of Knowledge Organization