ISKO
Encyclopedia of Knowledge Organization
edited by Birger Hjørland and Claudio Gnoli
Alphabetization
by Wendy Korwin and Haakon Lund

1. Definitions and explanation

Alphabetization [1] is a kind of ordering. The Oxford English Dictionary (Oxford University Press 2018a) defines ordering: 1a: “a. To place in order, give order to; to arrange in a particular order; to arrange methodically or suitably”. Ordering may be understood in two ways:
It is the first of these meanings that is relevant in relation to the term alphabetization. Besides alphabetical order as ordering criterion, other criteria such as chronological or systematic may be used for arranging items in a sequence [3]. Both these meanings of ordering are often used synonymously with sorting [4], although sorting is often preferred for mechanical procedures, such as sorting algorithms [5]. In general, the most common uses of ordered (or sorted) sequences are:
Alphabetization is the process of establishing the alphabetical order of a set of items based on their names or headings [6]. Alphabetical order is the arrangement of items by sorting strings of characters [7] according to their position in a given alphabet. In addition to conventions for ordering letters, other characters such as numbers, symbols, ideograms, logograms, and typographical issues such as lowercase and uppercase letters should be differentiated. The overall term for this is alphanumeric arrangement. Examples:
The process of alphabetizing headings starts by collocating those starting with the first letter in a given alphabet. Headings starting with the second letter are then collocated, and the process repeats through the last letter in the alphabet (in English this is mostly termed the A-Z order). Each collocated group is then arranged according to the second letter in the heading and so on, until the whole string of characters in the heading has been arranged (i.e., exact alphabetical order, cf. below). Alphabetical order has been described as “unnatural and arbitrary” (Weinberger 2007, 26) rather than organic or intuitive. The reasons for this are:
Another issue arises with synonyms, which allow for the same concept to be expressed using different words (and therefore placed in different alphabetical locations). This is dealt with in library and information science by forms of vocabulary control (such as subject headings and → thesauri). This issue will be dealt with in other articles in this encyclopedia. Despite this “unnatural and arbitrary” order, alphabetization has proven itself extremely useful. It is a widespread practice valued for its ability to render large amounts of information readily accessible to users. Alphabetizing is such a firmly ingrained process in many cultures that users may scarcely notice the organizational scheme that helps them browse through record stores or locate icons on their computer desktop [9]. Its history reveals, however, that alphabetization was not an inevitable development, nor was it a practice adopted wholesale from the moment of its invention. Instead, it has existed alongside, and has frequently been combined with or challenged by, other means of arrangement. As we shall see below, alphabetization is often a complex operation that demands much more knowledge than just the 26 letters in the English alphabet and their conventional order.

2. History

The literature on alphabetization is limited; Flanders (2020) is a recent exception. There exist, however, large related literatures on the development of alphabets (e.g. Drucker 1995; 2022), writing systems (e.g. Diringer 1962; Hooker 1990; Daniels and Bright 1996) and, at the broadest level, human symbolic evolution (Lock and Peters 1999). Each specific writing system has its own literature and may pose specific problems to the development of standards for its representation and ordering. Alphabetization requires that letters bear consistent names and, most importantly, a standard order.
In non-phonetic languages like Japanese and Chinese, for instance, alphabetization is less entrenched, as their logographic and syllabic characters support multiple arrangement possibilities. The English term alphabet, on the other hand, embodies the very idea that it labels: the consistency and predictability of A, B, C. Michael Rosen (2015, 395-6) explains that the word itself is constructed from the first two letters of the Greek alphabet, alpha and beta. He writes, “The alphabet is then the ‘alphabeta,’ rather as if we were to call the number system the ‘one-two’. Tracing the route back we go to Latin ‘alphabetum,’ back to the ancient Greek ‘alphabetos’, back to Phoenician ‘aleph’ (‘ox’) and ‘beth’ (‘house’) which were once pictograms. So, incredibly, the word ‘alphabet’ contains within it the whole history of this particular alphabet or ‘ox-house’”. Lloyd Daly, author of the most in-depth study of the history of alphabetization to date (Daly 1967), notes that the practice became possible when the ancient Greeks adopted the Phoenician alphabet, along with its established letter order. Yet, for roughly five centuries, the Greeks found no need to develop alphabetization, relying instead on other forms of classification, or indeed no classification scheme at all, to compile their lists. Daly traces the earliest uses of alphabetization to the end of the third century BCE. On the islands of Kos and Kalymnos, he finds inscriptions recording participants in local cults in which individuals’ names were divided into sections and then arranged according to their first letter. The Alexandrian libraries provided an early occasion to apply the alphabetic principle more broadly, as scholars accumulated and needed to navigate amongst an expanding number of texts. Portions of the Pinakes, Callimachus’ partial library catalogue, classified works by subject and then, most likely, by author. 
As part of their literary study, scholars also produced glosses of words found in various texts. At first, they arranged these lists to reflect the order in which the terms appeared in a given work, but as the glosses grew to unwieldy proportions, they began to arrange them alphabetically by first letter. In spite of these early examples, Daly stresses that adoption of alphabetic order was piecemeal, and favored mainly by scholars rather than public officials. Although he finds evidence that tax rolls and other administrative documents from the Ptolemaic and Graeco-Roman administration of Egypt reflected alphabetization to some extent, he also explains that “for each example cited, there are hundreds of documents where the principle might have been used but was not” (50). One particular gap is found in the administration of ancient Rome, where the alphabetic scheme, although known, was not adopted to organize army rolls or tax ledgers, whose large scale might have benefited from such a system.

2.1 Some challenges of alphabetization

In all the early instances uncovered by Daly, alphabetization was limited to arranging items based on their first letter. Eventually, scholars began to extend the practice to order entries according to their second and third letters, but it is not until the second century AD that Daly finds examples of exact or absolute alphabetical order in Galen’s Interpretation of Hippocratic Glosses [10]. In general, its cumbersome nature prevented absolute order from gaining widespread acceptance until the end of the Middle Ages, in part due to the effort required and in part due to the availability of materials. When compiling a list, an alphabetizer needed to estimate ahead of time the amount of area required to accommodate the number of entries falling under a given letter.
Expanding or combining lists thus required physically fitting new items into pre-allotted spaces, which sometimes resulted in creating sub-lists or squeezing new entries into the margins of existing documents. Until the development of printing, the alphabetic principle was also limited by media. Extensive alphabetization projects depend upon the ability to manipulate entries individually, and this is often done by first composing these entries on provisional cards [11] or slips. Both papyrus and parchment were too valuable to be used so ephemerally, and so until paper became cheaper and more abundant in the late fifteenth century, few efforts were made to apply alphabetization to its full potential. As Geoffrey Martin (2003, 16) writes, alphabetical indexes based on absolute order became common only “as the printed book established itself as an engine of scholarship,” and alphabetization “came into its own as a guide to the contents of the greatly expanded libraries that printing made both possible and necessary to the advancement of learning”. Even in the age of the printed book, however, alphabetization remained one of many arrangement possibilities, and end users still needed to be guided in its application. When Robert Cawdrey published one of the first English dictionaries in 1604, his Table Alphabeticall, he explicitly instructed readers how to use it: “If thou be desirous… rightly and readily to understand, and to profit by this Table, and such like, then thou must learne the alphabet, to wit the order of the letters as they stand, perfectly without booke, and where every letter standeth: as (b) nere the beginning, (n) about the middest, and (t) toward the end” (quoted in Daly 1967, 91). Clearly, Cawdrey could not assume that his early seventeenth-century readers were familiar with the practice of locating information by consulting alphabetically arranged documents. 
There is also a point to be made about alphabetical order in ‘word’ books at a time when spelling was not standardized and much more fluid. Mulcaster’s ‘Elementarie’ (Mulcaster 1582) provides an example. Words such as ‘chalenge’, ‘chauffinch’, ‘chearfull’, and ‘chearie’, are spelled differently in modern English, and therefore fall in different places in the alphabetic sequence. Writers and publishers of encyclopedias have also wrestled with presenting alphabetic schemes to their readers. Etymologically, encyclopedias offer “general education” or “instruction in a circle”, and most early authors sought to structure their works in ways that presented a coherent sum of human knowledge, stressing the internal relations between different fields of inquiry. Alphabetical arrangements, in contrast, disperse conceptually related terms based on the relative happenstance of the order of their letters, severing important connections between associated ideas. Richard Yeo has written about this tension and the ways that, at least since the 1728 release of Ephraim Chambers’ Cyclopaedia, editors have tried to resolve it using tools like subject indexes, cross-references, mixed thematic and alphabetical arrangements, and historical surveys. Chambers opted to combine systematic and alphabetical orders in his work, while the Encyclopaedia Britannica, in 1824, introduced longer historical dissertations on different branches of science to accompany its shorter, alphabetically-ordered entries. The apparent objectivity of alphabetical order also obscures editorial decisions, such as whether one term should fall under the umbrella of another or merits its own treatment. Other practical concerns arise when a recently published volume contains entries that rely on concepts that follow them alphabetically and may not appear in print for years to come. Later entries might also be condensed to meet publication deadlines, space limits, and financial constraints. 
Yeo (1991, 40) points out that in the original Encyclopaedia Britannica, volumes dedicated to the letters A and B were granted 687 pages of text, while the remainder of the alphabet was condensed to occupy only 2,000 pages. Rather than necessarily offering order and ease of use, then, alphabetization is also capable of producing disorder, randomness, and opacity.

3. Some principles of alphabetization

Any arrangement scheme must take all elements of an index entry into consideration. Wellisch (1999) provides a detailed discussion of alphabetical arrangement and presents the following seven rules for ordering characters:
There are two overall basic forms of arrangements of headings, word-by-word and letter-by-letter (see Table 1). These two schemes differ in how they handle spaces and other non-letter characters and typographical forms. Word-by-word arrangement puts “nothing before something”, whereas letter-by-letter arrangement (“all through”) ignores spaces and punctuation between words. Wellisch (1999, 5) writes: “This method [letter-by-letter] is primarily used for the arrangement of headings in dictionaries, because it keeps different spellings of the same term together (for example, ground water, ground-water, groundwater). The application of this method violates, however, the provision of section 3.1, and it is also subject to a number of different interpretations […]. This method is therefore not recommended”.
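The difference between the two arrangements can be sketched as two sort keys. This is a minimal Python illustration, not an implementation of any published filing rule: the function names and sample headings are hypothetical, and treating hyphens and other punctuation as word separators in the word-by-word key is an assumption made for the sketch.

```python
import re


def word_by_word_key(heading):
    """Compare headings one word at a time ("nothing before something")."""
    # Assumption: punctuation such as hyphens counts as a word separator.
    return re.sub(r"[^\w\s]", " ", heading).casefold().split()


def letter_by_letter_key(heading):
    """Ignore spaces and punctuation and compare the remaining characters."""
    return re.sub(r"[\W_]", "", heading).casefold()


# Hypothetical sample headings, chosen only to make the difference visible.
headings = ["Newark", "New York", "Newton", "New Zealand", "Newspaper"]

word_by_word = sorted(headings, key=word_by_word_key)
letter_by_letter = sorted(headings, key=letter_by_letter_key)

print(word_by_word)
# ['New York', 'New Zealand', 'Newark', 'Newspaper', 'Newton']
print(letter_by_letter)
# ['Newark', 'Newspaper', 'Newton', 'New York', 'New Zealand']
```

In the word-by-word key the shorter word "new" files before any longer word beginning with the same letters, which is the "nothing before something" principle; the letter-by-letter key removes the space first, so "New York" is compared as "newyork".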
As shown in Table 1, these two styles of alphabetizing yield very different results, which is of great importance in long indexes. Most users of indexes do not think about the various ways entries may be alphabetized, and if an entry is not found in the place where they expect it, they may assume that the subject is not included. Using standards and orders that work for users is critical. Unfortunately, there are different, non-compatible standards and guidelines, as discussed below.

4. Standards, norms and guidelines

The rules and standards governing the many aspects of alphabetization may be difficult to grasp [12]. This is particularly true with the implementation of well-established national traditions for arranging names and headings in computer programs. This process is often (e.g. in Library of Congress Filing Rules as well as in this article) called “filing rules” (see note 13 about the use of this term in classification research). One challenge of alphabetization in computer software is establishing the method by which a system will encode alphanumeric characters. The encoding of characters in computer systems has been guided by both national and international standards, as well as by proprietary encoding schemes established by the various software houses, e.g. IBM, Microsoft, Apple Computer etc., leading to difficulties in interoperability between different software programs. As an example, uppercase and lowercase letters are ordered separately following the 7-bit character set defined by the American Standard Code for Information Interchange (ASCII) (Table 2; for ASCII and bit see Appendix 1 and Appendix 2). This does not follow the traditional ordering of letters in the English alphabet, where uppercase and lowercase letters do have the same position in the alphabet [14]. In a digital computer (or binary computer), each character is given a unique binary code. This means that an uppercase A has a different code than a lowercase a.
According to ASCII, all uppercase letters appear in order first, followed by lowercase letters. Following this logic, all entries beginning with uppercase letters will be arranged before entries beginning with lowercase letters (which also has effects on → notations). Table 2 illustrates the result of using the ASCII arrangement to encode characters compared to the example in Table 1.
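The effect of ordering by character code can be reproduced in a few lines of Python. This is only an illustrative sketch with hypothetical sample entries; it relies on the fact that Python's built-in `sorted` compares strings by code point, which coincides with ASCII order for these characters.

```python
# Hypothetical sample entries mixing uppercase and lowercase initials.
entries = ["apple", "Zebra", "banana", "Apple", "cherry", "Banana"]

# Plain sorted() compares code points, so every uppercase-initial entry
# (A = 65 ... Z = 90) precedes every lowercase-initial one (a = 97 ... z = 122).
code_point_order = sorted(entries)

# Folding case in the sort key gives uppercase and lowercase the same position;
# entries with equal keys keep their input order because the sort is stable.
case_insensitive_order = sorted(entries, key=str.casefold)

print(code_point_order)
# ['Apple', 'Banana', 'Zebra', 'apple', 'banana', 'cherry']
print(case_insensitive_order)
# ['apple', 'Apple', 'banana', 'Banana', 'cherry', 'Zebra']
```

The second call shows the usual workaround: the stored text is left unchanged and only the comparison key is normalized.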
In the two leftmost columns, the arrangement follows the traditional English alphabetical order according to the guiding principles of word-by-word or letter-by-letter arrangements, with no distinction between uppercase and lowercase letters. In the rightmost column, the order follows the encoding scheme used in the ASCII character set. In addition to encoding schemes, it has therefore been necessary to establish guides or collation [15] rules for how letters should be ordered according to national alphabets. These language-specific rules reflect different cultural traditions for arranging alphabetic characters. To add to the complexity, different institutions (e.g. libraries, publishing houses) also maintain specific traditions for how they arrange names and headings. This impacts the order of books on shelves, the arrangement of book indexes, and the display of search results in an OPAC. Ordering practices have been guided by professional associations like the American Library Association, the Library of Congress, ARMA International (previously the Association of Records Managers and Administrators), as well as by standardizing bodies such as the National Information Standards Organization (NISO) and International Organization for Standardization (ISO), among others. Filing rules differ by the level of human intervention used to determine which part of the heading or name should be used for ordering. This involves an intellectual understanding of the actual meaning of the heading, i.e. to distinguish between a personal name, a place name, a subject etc. and arrange accordingly. 
The example below is taken from the Library of Congress Filing Rules, where headings with identical leading elements [16] are arranged in the following order: person, place, corporate body, subject, title (leading element underlined):

George III, King of Great Britain, 1738-1820

In this example, the leading element is in all cases identical and the list is then arranged according to type of heading. Outside the scope of this article are the standards, rules, and guidelines suggesting what indexes are appreciated in a certain document or information system and how entries or headings should be formulated, e.g. back-of-the-book indexes, algorithmic search indexes, library OPACs etc. Figure 1 provides an overview of numerous standards, guidelines, and rules (the top box represents issues related to indexing, cataloging, and metadata that are beyond the scope of the present article [17]):

4.1 Standards for encoding of alphanumerical characters

Presented below is a selection of US and international standards, mainly governing the encoding of the English written alphabet with later extensions allowing for encoding of alphabets using Latin script. ANSI INCITS X3.4-1986: Information Systems—Coded Character Sets—7-Bit American National Standard Code for Information Interchange (7-bit ASCII) was first published in 1963 and was adapted as the international standard ISO/IEC 646 in 1967. These two standards for 7-bit encoding are only presented here because of their historical importance for the early development of computers and the attempt to standardize the industry. The 7-bit character sets provided space for English alphanumeric letters, resulting in many national variants. To support a wider number of characters, the 8-bit family of encoding standards was developed, the first edition published in 1987 as ISO/IEC 8859. This family of standards is incorporated in ISO/IEC 10646 mentioned below.
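The practical limit of a 7-bit character set can be illustrated with Python's built-in codecs. This is a hedged sketch rather than a statement about any particular system: the helper `encoded_bytes` is a hypothetical name, and ISO/IEC 8859-1 is used here simply as one member of the 8-bit family.

```python
def encoded_bytes(text, encoding):
    """Return the bytes for `text`, or None if the encoding lacks a character."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


word = "Hjørland"

# 7-bit ASCII has no code for the letter "ø", so encoding fails.
ascii_result = encoded_bytes(word, "ascii")

# The 8-bit ISO/IEC 8859-1 set assigns "ø" the single byte 0xF8.
latin1_result = encoded_bytes(word, "iso-8859-1")

print(ascii_result)   # None
print(latin1_result)  # b'Hj\xf8rland'
```

A heading that cannot be encoded cannot be stored, let alone ordered, which is one reason national variants of the 7-bit sets and later the 8-bit family were needed.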
A widely used character set is the Unicode Standard, which was first published in 1991 and whose most recent version, Unicode 11.0, was published in 2018 (Unicode Consortium 2018). Version 11.0 contains a repertoire of 137,439 characters covering 146 modern and historic scripts, as well as multiple symbol sets and emojis [18]. Unicode makes it possible to encode more than 1.1 million characters, thereby providing encoding of all existing alphabets, including letter based as well as ideographic writing systems, but only a fraction of this set is currently in use. The Unicode standard is developed by The Unicode Consortium in tandem with ISO, and the most recent ISO standard is ISO/IEC 10646:2017 Information technology—Universal Coded Character Set (UCS). It corresponds to Unicode 10.0 but excludes some special characters and emoji symbols (see further Wikipedia, “Universal...” 2018). Unicode is currently the most important issue relating to alphabetization, and it may deserve an independent entry in this encyclopedia (see Aliprand 2017 for an encyclopedia article in Encyclopedia of Library and Information Sciences). From a research-oriented perspective, two issues are crucial: (i) Unicode can be implemented by different character encodings and there seems to be a trade-off between the number of bytes used for each character and the space used, and thus the efficiency of the implementation and (ii) philosophical and completeness criticisms. There has been a debate on such issues [19]. Among the issues raised is the relation between characters, graphemes and glyphs as units. Holmes (2003) has suggested that although Unicode is a success, a different approach would have worked much better for encoding text, documents, and writing systems. The attempt to accommodate all the world’s languages in one gigantic codespace means that it cannot take full advantage of the systematic graphical features of various writing systems. 
The criticisms of Unicode seem related to earlier versions and are possibly less relevant to its newer versions. It is, however, important to be open to possible limitations and biases in all kinds of standards and knowledge organization systems.

4.2 Standards and recommendations for the ordering of alphabets

According to Küster (1999, 21) the “ordering of letters is highly dependent on the cultural expectations”. This author thus seems to strive for a multilingual approach to ordering. What might be expected as the correct alphabetical order in English is not the same in, for example, Danish. Besides the letters a to z, the Danish alphabet also comprises the letters æ, ø and å, and the ordering of the Danish alphabet is from a – å, meaning that æ, ø and å are the three last letters in the alphabet. This raises a number of questions about how to treat different national alphabets when dealing with multilingual information and software. These issues are both about securing correct order according to different national traditions and about how to incorporate or express letters from other alphabets in, for example, the English language. Example: according to Wellisch (1999, 3) the Danish letters æ, ø and å should be arranged in the English alphabet as ae, o and a. Needless to say, this would have an effect on the arrangement of characters when following a Danish language-based system compared to an English language-based system, and subsequently also the exchange of information between the two systems. This is not just of ‘academic interest’ but relevant whenever Danish names appear in English reference lists, and of course similarly with every other language. Standards such as BS/EN 13710: 2011 European Ordering Rules. Ordering of Characters from Latin, Greek, Cyrillic, Georgian and Armenian Scripts have been established to normalize this (see also Küster 2006, chapter 17.4).
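The Danish example can be sketched as two sort keys applied to the same list. This is a simplified Python illustration under stated assumptions: the function names and sample headings are hypothetical, only the 29 letters named above are ranked, and further Danish conventions (and any real collation library) are deliberately left out.

```python
import unicodedata

# The Danish alphabet as described above: a to z, then æ, ø, å.
DANISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {letter: position for position, letter in enumerate(DANISH_ALPHABET)}

# Transliteration reported from Wellisch (1999, 3): æ -> ae, ø -> o, å -> a.
ENGLISH_FOLD = str.maketrans({"æ": "ae", "ø": "o", "å": "a"})


def normalized(heading):
    """Use composed characters and fold case before comparing."""
    return unicodedata.normalize("NFC", heading).casefold()


def danish_key(heading):
    """Rank each letter by its position in the Danish alphabet."""
    # Assumption: characters outside the alphabet are skipped in this sketch.
    return [RANK[ch] for ch in normalized(heading) if ch in RANK]


def english_key(heading):
    """File the Danish letters in the English alphabet as ae, o and a."""
    return normalized(heading).translate(ENGLISH_FOLD)


# Hypothetical sample headings.
names = ["Århus", "Ribe", "Odense", "Ærø", "Assens", "Østerbro"]

danish_order = sorted(names, key=danish_key)
english_order = sorted(names, key=english_key)

print(danish_order)
# ['Assens', 'Odense', 'Ribe', 'Ærø', 'Østerbro', 'Århus']
print(english_order)
# ['Ærø', 'Århus', 'Assens', 'Odense', 'Østerbro', 'Ribe']
```

The same six headings end up in different positions under the two conventions, which is the interoperability problem described above when records move between a Danish-based and an English-based system.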
The standards and recommendations mentioned here do not only deal with the ordering of letters but also define collation algorithms. According to Davis, Whistler and Scherer (2018, section 1) the purpose of the Unicode Collation Algorithm (UCA) is: Collation varies according to language and culture: Germans, French and Swedes sort the same characters differently. It may also vary by specific application: even within the same language, dictionaries may sort differently than phonebooks or book indices. For non-alphabetic scripts such as East Asian ideographs, collation can be either phonetic or based on the appearance of the character. Collation can also be customized according to user preference, such as ignoring punctuation or not, putting uppercase before lowercase (or vice versa), and so on. Linguistically correct searching needs to use the same mechanisms: just as "v" and "w" traditionally sort as if they were the same base letter in Swedish, a loose search should pick up words with either one of them. Collation rules have a wide impact on digital systems, from determining the simple alphabetical ordering of letters in an index to influencing how databases and search engines are organized and consequently behave when confronted with a search request submitted by a user. One important function of UCA is therefore to provide a technical solution for implementing filing rules (see below in 4.3) in a software program. It is imperative to underline that the collation algorithm does not prescribe specific rules for how to arrange or file headings; it only governs the technical implementation of filing rules. The international collation standard is ISO/IEC 14651, Information Technology, International String Ordering and Comparison, Method for Comparing Character Strings and Description of the Common Template Tailorable Ordering. It was developed in tandem with UCA. 
Furthermore, Wellisch (1999) and the LC filing rules (Rather and Biebel 1980) prescribe the ordering of the English alphabet and the arrangement of non-English letters into the English alphabet.

4.3 Rules and guidelines for the arrangement of headings (filing rules)

Filing rules guide alphabetization, including the ordering and sorting of library catalogs, indexes, dictionaries, and directories (Wellisch 1999, v). These rules are published by both professional entities and organizations, e.g. national library bodies, library associations, publishing houses etc. With this in mind, only a few important examples of guidelines are mentioned here. Wellisch (1999) published by NISO is an attempt to establish a set of common guidelines. According to the foreword, “this technical report seeks to make the alphanumeric arrangement of headings ‘as easy as ABC’” (Wellisch 1999, v). The American Library Association (ALA) has published ALA Filing Rules (American Library Association 1980) and the Library of Congress has published LC Filing Rules (Rather and Biebel 1980). Both are widely used within libraries, but alas they provide different solutions. For example, the ALA filing rules do not distinguish between types of headings (Bakewell 1972, 166); this differs from the LC filing rules (see this article section 4 for example). The three recommendations above all advise a word-by-word arrangement. Many book publishers follow their own alphabetizing styles. North American publishers often follow the guidelines in The Chicago Manual of Style (University of Chicago Press 2017, 944, §16.58), which call for letter-by-letter alphabetization: “Chicago, most university presses, and many other publishers have traditionally preferred the letter-by-letter system but will normally not impose it on a well-prepared index that has been arranged word by word”.
It is important to note that Chicago’s choice of letter-by-letter alphabetization is in conflict with the word-by-word arrangement recommended by Wellisch (1999) and by both the ALA filing rules and the Library of Congress filing rules. For use within the domain of records and information management ARMA International publishes a set of guidelines (ARMA and ANSI 2005). These guidelines advise a unit-by-unit approach for alphabetical filing, which differs from both letter-by-letter and word-by-word filing.

5. Alphabetic order versus other ordering criteria

In botany, Richards (2016, 66) explains that alphabetical arrangements of plants in herbaria were common by about 1596, but many other criteria were also used, like sorting plants with pleasant flowers from odorous plants and classifying plants according to their similarities and differences. This last principle led to hierarchical and more systematic approaches, for instance, organizing plants into genera and subdividing them into species. But these species and genera were not necessarily what we would see in modern scientific classifications. Sometimes plants were, for example, simply classified as trees, shrubs, or herbs. It is common knowledge that such different ordering principles were standardized by the taxonomy set up by Carl Linnaeus in his Systema Naturae (1735). Today it is the norm that such systematical arrangements are supplemented by alphabetic indexes for the easy location of a specific name. Concerning the organization of knowledge in encyclopedias, Sundin and Haider (2013) write: The encyclopaedias that emerged around the time of the Enlightenment are said to have shifted knowledge’s organizational principle; from the tree of knowledge to the alphabet. Yet despite the success of the alphabetic principle, it has not erased classification endeavours, in fact not even in the beginning.
As Ann Blair [2010] points out, already d’Alembert defended the alphabetic principles in the Encyclopédie at the same time that he provided readers with an image of a tree of knowledge as a supplement to the alphabet. Sundin and Haider then describe how the Swedish electronic encyclopedia Nationalencyklopedin, in addition to its alphabetical arrangement, also uses a Swedish bibliographic classification system, Klassifikationssystem för svenska bibliotek (SAB). However, the authors do not further examine the use of classification systems in contemporary encyclopedias, and although such systems are sometimes provided (e.g. in Encyclopedia Britannica’s “Syntopicon: An Index to The Great Ideas” (1952) followed in 1974 by “Propaedia”, an "outline of knowledge", see Adler 2007), there is little evidence of their use and usefulness over alphabetical arrangements, indexes, and internal references. However, such systematic outlines often form the basis for the overall editing of encyclopedias and the commission of articles. For the user, they may therefore provide a better overview and means to evaluate the coverage of the work. In libraries, there have been controversies about the strengths and weaknesses of alphabetical subject catalogs versus systematic catalogs (see Hanson and Daily 1970 about the history of library catalogs). In The Organization of Knowledge in Libraries and the Subject-Approach to Books, Henry E. Bliss (1933) argues that a systematic subject-approach is required. Any attempt to apply a simple alphabetical subject-approach without a systematic organization of the plurality of knowledge subjects is rejected by Bliss (1933, 301) as a kind of “subject-index illusion” [20]. A mere listing of subjects, as provided by subject headings, would not be able to meet the principle of maximal efficiency that results from the strategies of collocation of closely related classes or subjects and subordination of the specific to the generic. 
This means that a differentiation (analysis) of subjects should only be considered as a necessary first step that needs to be succeeded by an integration (synthesis) of subjects into a well-structured → knowledge organization system, as underlined by Bliss (1933, 104): Analytic division tends to dispersion. But synthesis, either collocative or systematic, places subjects in effectual relation and efficient organization. A collocative synthesis does not, however, forego analysis, which inevitably issues from subdivision; but it collocates the results of analytic subdivision. This is the very nature of systematic classification. It opposes the false theory that disorder and dispersion can be obviated or compensated by an alphabetic key or subject-index. There are different ways of combining alphabetic and systematic order. One example is provided by the so-called “Cutter numbers” used by the Library of Congress, where alphabetic arrangement is a very significant aspect of the classification scheme [21].

6. Conclusion

Research has demonstrated the complexities that may arise from using alphabetization: the apparently simple process can be quite difficult. To order headings and indexes alphabetically is not as straightforward as it may sound, depending on both cultural traditions and different approaches used in different domains or under different circumstances. The implementation of well-established filing rules in computer software has resulted in a number of different proprietary technical solutions established by software companies. What has characterized these has been a lack of interoperability, resulting in incompatible systems. The development of computers and software has been dominated by Anglo-American companies, hence the default “computer” language has been and still is English. This has created a number of difficulties for supporting non-English alphabets, based on both Latin and non-Latin writing systems.
Fortunately, the increase in computational power and decrease in storage cost have led to the development of new standards like Unicode, which can support all known writing systems. Unicode has now gained ground as the “default” standard for encoding characters, compatible with virtually all modern computer software. It now seems possible to support our culturally diverse writing systems and to achieve interoperability between different computer software. However, technical as well as philosophical questions persist: What happens when the most comprehensive standards prove impractical to use? And can any alphabetization standard ever function as a neutral tool, or will it always serve some cultures and domains better than others?

Acknowledgements

This article could not have been written without the engagement and help provided by the editor, Birger Hjørland. The authors also thank two anonymous peer-reviewers for valuable feedback.

Endnotes

1. This entry is about written alphabets only. We are not addressing issues relating to unwritten languages or sign-languages. About the International Phonetic Alphabet see Brown (2013).

2. This first meaning of ordering corresponds to how WordNet 3.1 defines the noun ordering: “S: (n) order, ordering (the act of putting things in a sequential arrangement) ‘there were mistakes in the ordering of items on the list’”. Küster (1999, 21) made a distinction between sorting and ordering that conflicts with the other definitions presented here: “English terminology usually distinguishes between sorting and ordering. Sorting is primarily a service for users to facilitate their access to information by presenting it in a structured and predictable way, e.g. by subdividing the information by subject matter (by having several registers to a book, for instance), having multiple indices in a library etc. Ordering — the arrangement of information in alphabetical sequence — is in most cases an integral part of this procedure”. But, as we saw, the term ordering is not normally limited to alphabetization.

3. Even a random order may be used for some purposes, e.g. statistical sampling.

4. The Oxford English Dictionary (2018b) defines sorting: “9. a. transitive. To arrange (things, etc.) according to kind or quality, or after some settled order or system; to separate and put into different sorts or classes; to classify; to assort”.
5. About algorithmic sorting see, for example, Knuth (1998), Christophersen (2000) and Wikipedia, “Sorting Algorithm” (2008).

6. Wellisch (1999, 2) defines heading: “Any written, printed or otherwise visually displayed item, consisting of one or more words, that is to be arranged among other such items in a known order”.

7. A character is the “smallest possible unit of arrangement: a space, letter, numeral, punctuation mark, or other symbol” (Wellisch 1999, 1). Section 4.1 notes that Unicode has met some difficulties with characters and that glyphs rather than characters may be needed as units in some scripts.

8. In practice library catalogs will mostly apply the principle of uniform titles to ensure that a translation is entered under the original title to keep versions of the same work together.

9. One of the anonymous referees wrote: “Otherwise I thought this was a firm rebuttal of Weinberger and a challenge to the idea that alphabetical order is arbitrary, on that basis almost every ordering principle is, and even ‘natural’ orders need to seek consensus on the sequence (e.g. natural numbers in ascending order, elements in the periodic table by increasing atomic number and weight). What is a ‘natural’ order (such as the elements) may not be familiar to a lay audience in the manner of alphabetical order, and hence completely ineffective for retrieval”.

10. Valerius Harpocration was, according to Keaney (1973), probably the first to use absolute alphabetization.

11. In this context, it seems relevant to mention that it was Carl Linnaeus (1707–1778) who invented the card index (cf. Mueller-Wille 2009). The card index served an important purpose: “Linnaeus had to manage a conflict between the need to bring information into a fixed order for purposes of later retrieval, and the need to permanently integrate new information into that order”.

12. Besides the guidelines mentioned in the section, Chauvin (1977) should be mentioned.

13. In classification research, in particular in the → facet-analytic tradition, the terms citation order and filing order are well established with the following meanings:

14. ASCII-code order is also called ASCIIbetical order. In ASCII all uppercase letters come before lowercase letters; for example, Z precedes a (see the ASCII table in Appendix 1).

15. See also “Collation” in Wikipedia (2018).

16. Headings are split into elements, where an element can consist of one or more words and is identified by punctuation marks etc.; e.g. a person’s name in the form “last name, first name” is split into two elements, with the comma as delimiter.

17. The history of the AACR cataloging rules and the different editions can be seen in Joint Steering Committee for RDA (2009); the latest version of the RDA is published by Joint Steering Committee for RDA (2015). Such rules belong to the field of (descriptive) cataloging (see Joudrey 2017). Publishers’ guidelines such as the Chicago Manual of Style (University of Chicago Press, 2017) are mainly constructed from practical experience, but there is a growing tendency to consider normative guidelines from the perspective of genre- and writing studies, thus contributing theoretical perspectives.

18. Most editions are published in electronic format as well as book form and have an ISBN; however, newer editions are not available in WorldCat or in Amazon but a PDF can be generated from the unicode.org page. Details about the book publication and ordering information of Unicode standards may be found at http://www.unicode.org/book/aboutbook.html.

19. A debate included Goundry (2001) “Why Unicode Won’t Work on the Internet: Linguistic, Political, and Technical Limitations”; Whistler (2001), “Why Unicode Will Work On The Internet”; Peterson (2006) “Unicode in Japan: Guide to a Technical and Psychological Struggle” and Searle (2002) “Unicode Revisited”. There is also, in Wikipedia’s entry about Unicode, a section about this: https://en.wikipedia.org/wiki/Unicode#Philosophical_and_completeness_criticisms.

20. However, despite Bliss’ criticism, the dictionary catalog had many followers, and there was a good deal of opposition to his view, most notably by John Metcalfe (1959).

21. Named after Charles Ammi Cutter, Cutter numbers represent a method of combining classifications and alphabetic order. “Example: Call number:

References

Adler, Mortimer J. 2007. “Circle of Learning”. The New Encyclopædia Britannica, 15th edition. Chicago: Encyclopædia Britannica Inc.
Aliprand, Joan M. 2017. “Unicode Standard”. In Encyclopedia of Library and Information Sciences, Fourth Edition. Edited by John D. McDonald and Michael Levine-Clark. Boca Raton, FL: CRC Press, vol. VII, 4662-71.
American Library Association. 1980. ALA Filing Rules. Chicago: American Library Association.
ANSI INCITS X3.4-1986. 2007. American National Standard for Information Systems. Coded Character Sets. 7-bit American National Standard Code for Information Interchange (7-Bit ASCII). New York, NY: American National Standards Institute.
ARMA: Association of Records Managers and Administrators and ANSI: American National Standards Institute. 2005. Establishing Alphabetic, Numeric and Subject Filing Systems. Lenexa, Kan.: ARMA International.
Bakewell, Kenneth Graham Bartlett. 1972. A Manual of Cataloguing Practice. Oxford, UK: Pergamon Press.
Batley, Sue. 2005. Classification in Theory and Practice. Oxford: Chandos Publishing.
Blair, Ann. 2010. Too Much to Know: Managing Scholarly Information before the Modern Age. New Haven, Conn.: Yale University Press.
Bliss, Henry Evelyn. 1933. The Organization of Knowledge in Libraries and the Subject-Approach to Books. New York, NY: Wilson.
Brown, Adam. 2013. “International Phonetic Alphabet”. In The Encyclopedia of Applied Linguistics Vol. 1-10, edited by Carol A. Chapelle. Hoboken, New Jersey: Wiley-Blackwell, vol. 5: 1-8. DOI: 10.1002/9781405198431.wbeal0565
BS/EN 13710: 2011.
European Ordering Rules: Ordering of characters from Latin, Greek, Cyrillic, Georgian and Armenian scripts. London: British Standards Institution.
Chauvin, Yvonne. 1977. Pratique du Classement Alphabétique. 4e éd. Paris: Bordas.
Christophersen, Hans. 2000. Alphabetisierung auf Computer: Prinzipien, Probleme und eine Lösungsverbesserung. Sorø, Denmark: Rostra. Retrieved from: http://www.rostra.dk/alphabet/alpha%5Fdt.htm
Daly, Lloyd W. 1967. Contributions to a History of Alphabetization in Antiquity and the Middle Ages. Bruxelles: Latomus.
Daniels, Peter T. and William Bright (eds.). 1996. The World's Writing Systems. New York: Oxford University Press.
Davis, Mark, Ken Whistler and Markus Scherer. 2018. Unicode Collation Algorithm (11.0.0). Mountain View, CA: Unicode Consortium. Retrieved from https://www.unicode.org/reports/tr10/.
Diringer, David. 1962. Writing. London: Thames & Hudson.
Drucker, Johanna. 1995. The Alphabetic Labyrinth: The Letters in History and Imagination. London: Thames & Hudson.
Drucker, Johanna. 2022. Inventing the Alphabet: The Origins of Letters from Antiquity to the Present. Chicago, IL: University of Chicago Press.
Flanders, Judith. 2020. A Place for Everything: The Curious History of Alphabetical Order. London: Picador.
Goundry, Norman. 2001. Why Unicode Won’t Work on the Internet: Linguistic, Political, and Technical Limitations. Lake Tahoe, NV: Hastings Research, Inc. (Technical Papers). http://www.hastingsresearch.com/net/04-unicode-limitations.shtml
Hanson, Eugene R. and Jay E. Daily. 1970. “Catalogs and Cataloging”. In Encyclopedia of Library and Information Science vol. 4, edited by Allen Kent and Harold Lancour. New York: Marcel Dekker, 242-305. (Reprinted in later editions of the encyclopedia, including the 2017 edition).
Holmes, Neville. 2003. “The Problem with Unicode”. Computer 36, no. 6, 116 + 114-115 [sic]. DOI: 10.1109/MC.2003.1204385.
Hooker, James T. (Ed.). 1990. Reading the Past: Ancient Writing from Cuneiform to the Alphabet. London: British Museum Press.
Immroth, John Phillip. 1971. “Cutter, Charles Ammi”. In Encyclopedia of Library and Information Science, edited by Allen Kent and Harold Lancour. New York, NY: Marcel Dekker, vol. 6: 380-7.
ISO/IEC 646. 1991. Information Technology, ISO 7-Bit Coded Character Set for Information Interchange. Geneva: International Organization for Standardization and International Electrotechnical Commission.
ISO/IEC 8859. 1999. Information technology, 8-Bit Single-Byte Coded Graphic Character Sets. Geneva: International Organization for Standardization and International Electrotechnical Commission.
ISO/IEC 10646. 2017. Information Technology, Universal Coded Character Set (UCS). Geneva: International Organization for Standardization and International Electrotechnical Commission.
ISO/IEC 14651. 2016. Information Technology, International String Ordering and Comparison, Method for Comparing Character Strings and Description of the Common Template Tailorable Ordering. Geneva: International Organization for Standardization and International Electrotechnical Commission.
Joint Steering Committee for RDA. 2009. A Brief History of AACR. Retrieved 2018-11-04 from http://www.rda-jsc.org/history.html.
Joint Steering Committee for RDA. 2015. RDA: Resource Description and Access, 2015 Revision. London: Facet Publishing.
Joudrey, Daniel N. 2017. “Cataloging”. Encyclopedia of Library and Information Sciences, Fourth Edition. Edited by John D. McDonald and Michael Levine-Clark. Boca Raton, FL: CRC Press, Vol. 2: 723-32.
Keaney, John J. 1973. “Alphabetization in Harpocration’s Lexicon”. Greek, Roman, and Byzantine Studies 14, no. 4: 415–23.
Knuth, Donald E. 1998. “Sorting and Searching”. In The Art of Computer Programming vol. 3, second edition. Boston: Addison-Wesley.
Küster, Marc Wilhelm. 1999. “Multilingual Ordering–the European Ordering Rules”. In Multilinguale Corpora: Codierung, Strukturierung, Analyse: 11. Jahrestagung der Gesellschaft für Linguistische Datenverarbeitung, edited by Jost Gippert and Peter Olivier. Prag: Enigma, 21–33. Retrieved from https://pdfs.semanticscholar.org/509e/4be5fba9ee91e73caf9dc4a4dab58c5d667a.pdf.
Küster, Marc Wilhelm. 2006. Geordnetes Weltbild: Die Tradition des alphabetischen Sortierens von der Keilschrift bis zur EDV: Eine Kulturgeschichte. Berlin: De Gruyter.
Lock, Andrew and Charles R. Peters (eds.). 1999. Handbook of Human Symbolic Evolution. Oxford, UK: Blackwell.
Mackenzie, Charles E. 1980. Coded Character Sets: History and Development. Reading, MA: Addison-Wesley Publishing. Available from https://textfiles.meulie.net/bitsaved/Books/Mackenzie_CodedCharSets.pdf.
Martin, Geoffrey. 2003. “Alphabetization Rules”. In International Encyclopedia of Information and Library Science, second edition, edited by John Feather and Paul Sturges. London: Routledge, 15-17.
Metcalfe, John. 1959. Subject Classifying and Indexing in Libraries and Literature. Sydney: Angus & Robertson.
Mueller-Wille, Staffan. 2009. “Carl Linnaeus Invented the Index Card”. Speech held at the annual meeting of the British Society for the History of Science in Leicester, UK on Saturday 4 July 2009. https://phys.org/news/2009-06-carl-linnaeus-index-card.html.
Mulcaster, Richard. 1582. The First Part of the Elementarie which entreateth chefelie of the right writing of our English tung. London: T. Vautroullier. P. 177-178 reproduced at http://www.bl.uk/learning/images/texts/dict/large1323.html.
Oxford University Press. 2018a. “Ordering”. In Oxford English Dictionary. Retrieved 2018-11-04 from http://www.oed.com.
Oxford University Press. 2018b. “Sorting”. In Oxford English Dictionary. Retrieved 2018-11-04 from http://www.oed.com.
Peterson, Benjamin. 2006. “Unicode in Japan. Guide to a Technical and Psychological Struggle”. Blog post. Retrieved from: https://web.archive.org/web/20090627072117/http://www.jbrowse.com/text/unij.html
Rather, John C. and Susan C. Biebel. 1980. Library of Congress Filing Rules. Washington: Library of Congress: available from Customer Services Section, Cataloging Distribution Service, Library of Congress.
Reitz, Joan M. 2004. ODLIS: Online Dictionary for Library and Information Science. Westport, Conn.: Libraries Unlimited. Digital edition: Western Connecticut State University. Retrieved from https://www.abc-clio.com/ODLIS/odlis_a.aspx.
Richards, Richard A. 2016. Biological Classification: A Philosophical Introduction. Cambridge, UK: Cambridge University Press.
Rosen, Michael. 2015. Alphabetical: How Every Letter Tells a Story. Berkeley, CA: Counterpoint.
Searle, Steven J. 2002. Unicode Revisited. Retrieved from: http://tronweb.super-nova.co.jp/unicoderevisited.html.
Sundin, Olof and Jutta Haider. 2013. “The Networked Life of Professional Encyclopaedias: Quantification, Tradition, and Trustworthiness”. First Monday [Online], 18.6 (2013): n. pag. Accessed 27 Nov. 2018 from https://firstmonday.org/article/view/4383/3686.
Unicode Consortium. 2018. The Unicode Standard. Version 11.0.0. Mountain View, CA: Unicode Consortium. Retrieved from http://www.unicode.org/versions/Unicode11.0.0/.
University of Chicago Press. 2017. The Chicago Manual of Style, Seventeenth edition. Chicago: The University of Chicago Press.
Weinberger, David. 2007. Everything Is Miscellaneous: The Power of the New Digital Disorder. New York: Times Books.
Wellisch, Hans H. 1999. Guidelines for the Alphabetical Arrangement of Letters and Sorting of Numerals and other Symbols. Bethesda, Maryland: National Information Standards Organization. (NISO Technical Report 3 NISO TR03-1999). Digital version: http://www.niso.org/publications/tr/tr03.pdf. Archived by WebCite: http://www.webcitation.org/6pUXHMCDG.
Whistler, Ken. 2001. “Why Unicode Will Work On The Internet”. Blog post, posted by timothy on Saturday June 09, 2001. Retrieved from: https://features.slashdot.org/story/01/06/06/0132203/why-unicode-will-work-on-the-internet.
Wikipedia, the Free Encyclopedia. “Sorting Algorithm”. Retrieved 2018-11-03 from: https://en.wikipedia.org/wiki/Sorting_algorithm.
Wikipedia, the Free Encyclopedia. “Unicode”. Retrieved 2018-11-03 from: https://en.wikipedia.org/wiki/Unicode.
Wikipedia, the Free Encyclopedia. “Universal Coded Character Set”. Retrieved 2018-11-03 from: https://en.wikipedia.org/wiki/Universal_Coded_Character_Set.
Winke, R. Conrad. 2002. “The Contracting World of Cutter’s Expansive Classification”. Library Resources & Technical Services 48, no. 2: 122-129.
Wordnet Search 3.1. “Ordering”. Retrieved 2018-11-04 from http://wordnetweb.princeton.edu/perl/webwn?s=ordering.
Wordnet Search 3.1. “Sorting”. Retrieved 2018-11-04 from http://wordnetweb.princeton.edu/perl/webwn?s=sorting.
Yeo, Richard. 1991. “Reading Encyclopedias: Science and the Organization of Knowledge in British Dictionaries of Arts and Sciences, 1730-1850”. Isis 82, no. 1: 24-49.

Appendix 1: ASCII Table

Appendix 2: Developments in character codes by bits
Version 1.0 published 2019-01-10
This article (version 1.0) is also published in Knowledge Organization. How to cite it: Korwin, Wendy and Haakon Lund. 2019. “Alphabetization”. Knowledge Organization 46, no. 3: 209-222. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/alphabetization
©2019 ISKO. All rights reserved.