Abstract:
This contribution presents the principal features of ontologies, drawing special attention to the comparison between ontologies and the different kinds of knowledge organization systems (KOS). The focus is on the semantic richness exhibited by ontologies, which allows the creation of a great number of relationships between terms. That establishes ontologies as the most evolved type of KOS. The concepts of “conceptualization” and “formalization” and the key components of ontologies are described and discussed, along with upper and domain ontologies and special typologies, such as bibliographical ontologies and biomedical ontologies. The use of ontologies in the digital libraries environment, where they have replaced thesauri for query expansion in searching, and the role they are playing in the Semantic Web, especially for semantic interoperability, are sketched.
As a philosophical discipline, ontology (etymology: τά ὄντα, or ὄντος λόγος) is the branch of metaphysics that studies the nature of being, defines fundamental categories and the structures of reality and tries to classify entities in all spheres of being. In information systems, ontologies are engineering artifacts [1], or shared conceptual schemes that define relevant entities, concepts, objects, relationships between them, and their properties, which are increasingly used to represent a field of knowledge or the structure of systems using an unambiguous and machine-readable language (Guarino, Oberle and Staab 2009, 2).
At the end of the 1950s, ontologies arose in the context of computer science, and studies were developed in the databases community to identify tools useful for defining entities in the creation of more sophisticated databases (Legg 2007, 425). Taxonomies of entities and shared dictionaries of terms, also provided with axioms, came to be used to solve the terminological difficulties that arise in building databases from using different labels to define identical entities, or the difficulties that appear from using different names for the same meaning (Smith 2003, 158-159). Computer scientists became interested in the use of ontologies in the subfields of database management systems, of software engineering and of conceptual modelling. They began to consider the creation of a shared robust ontology of entities advantageous for their aims.
The step from each of these three starting-points to ontology is then relatively easy. The knowledge engineer, conceptual modeler, or domain modeler realizes the need for declarative representations which should have as much generality as possible to ensure reusability but would at the same time correspond to the things and processes they are supposed to represent (Smith and Welty 2001, 3).
Moreover, artificial intelligence developed more computational models to sustain automated reasoning. Languages for knowledge representation with the support of artificial intelligence were developed during the second half of the twentieth century, and the adoption of description logics (Nardi and Brachman 2003; Baader, Horrocks and Sattler 2009) marked a significant evolution for ontologies. Ontology became more closely interconnected with artificial intelligence, which increasingly focused attention on automated reasoning systems and tried to develop sets of terms with axioms to constrain and disambiguate meanings. As Smith and Welty (2001) assert, “the field of knowledge engineering was born”.
The relationships between philosophical and technological ontologies have been the object of important analyses. Poli and Obrst (2010) argue that the philosophical perspective of studying ontologies as systems of categories and the technological perspective of computer science creating artifacts to help software make inferences are not radically distinct. On the contrary, they suggest that the two fields are complementary and should collaborate. The two perspectives depend on each other: technological ontologies aim to create models used by software and need the help of philosophical ontology in describing the real world, whereas philosophical ontologies may benefit from software engineering products. Of course, what a computational ontology should include, or what should be the boundaries of a domain ontology, is a philosophical issue (Poli and Obrst 2010, 11). Moreover, as Sowa (1995) highlighted, philosophical strategies have assumed an important position in practical applications that aim to build object-oriented systems and knowledge bases in artificial intelligence. For instance, Sowa suggested applying top-level categories that follow systems of categories developed by philosophers, such as → Charles S. Peirce and Alfred N. Whitehead.
The relevance of philosophical ontologies in the modern age, as Nicola Guarino and Roberto Poli (1995) stressed, lies in that, in knowledge engineering, the definitions of categories and concepts used in the databases are crucial. As in knowledge engineering the principal aim is to integrate and reuse portions of knowledge bases, the transparency of its commitments to the external world is required. Therefore, ontology as a philosophical discipline that deals with the nature of reality can support this mission.
Since the mid-1990s, the directories of web sites offered by Web service providers have helped users explore the Internet and discover information. From 1994 to 2002, Yahoo! provided a directory of sites based on subject categories and subcategories, similar to a classification system. From 1998 to 2017 DMOZ, thanks to a volunteer editing community, provided a taxonomy of top-level categories and other lower-level categories to organize searching. It has since moved to Curlie (“The Collector of URLs”), the largest Web directory.
One of the goals of the Semantic Web, devised by Tim Berners-Lee (Berners-Lee, Hendler and Lassila 2001), was to make a more fruitful → document (Buckland 2018) search possible and to more easily locate → data (Hjørland 2018c) through the use of stronger and more refined research tools that go beyond a simple search based on the presence of words in titles, abstracts or full-texts and to search semantically based on content. Traditional Web searching by means of keywords leads to unsatisfactory results and noise. However, the most ambitious aim of the Semantic Web was to extend the Web by automatically integrating data and information from many sources online, by implementing software able to perform automatic reasoning (Fensel et al. 2003). In the original concept of the Semantic Web, software agents would have processed contents, found information from different sources, reasoned about data, and produced output.
According to this conceptualization, ontologies have been considered the most suitable tools to go beyond the boundaries of the traditional strategies to find and access information. Their relevance appears in machine-to-machine communication, in the exchange of data among systems and in the possibility of facilitating → interoperability (Zeng 2019) across heterogeneous systems (see section 8).
Albeit simpler, something like the Semantic Web envisioned by Berners-Lee in his first formulation was accomplished in the 2010s with the development of digital assistants such as Siri, Alexa, Cortana and Google Assistant, that is, software agents that are able to interpret human speech. Alexa Voice Service, for instance, is an artificial intelligence technology based in the Cloud, developed by Amazon in 2014. It works as a vocal assistant, able to reply to a vocal order to search the Web, find information and interact with services useful for a smart home. A central computing system receives queries and manages them using natural language processing procedures. The most important feature is the ability to parse queries expressed in many different ways and using different words, to capture the underlying concepts and to reply to what is the most likely request (Hoy 2018, 82-83).
Our first concern lies in offering an accurate definition of ontology. In the context of computer science, ontologies are considered artifacts provided for a purpose. They define primitives, that is, classes, properties and relations among the members of classes that are relevant to model the knowledge of a domain (Gruber 2009). Within the knowledge engineering community, as has been stressed by Guarino and Giaretta (1995), ontologies are considered either a conceptual, semantic-level framework, or a concrete artifact provided for a specific purpose. From this double approach arises terminological ambiguity, and Guarino and Giaretta suggest separating the two levels by using the term conceptualization as an intensional structure to define the semantic structure of a conceptual system, and the term ontological theory to denote the artifact or the base of knowledge to be used and shared, providing always true sentences according to a certain conceptualization (about “conceptualization”, see section 4).
Focusing on ontologies as a philosophical categorical analysis, Grenon and Smith (2004, 143) provide the following definition of ontology:
An ontology is captured by depiction of the entities which exist within a given portion of the world at a given level of generality. It includes a taxonomy of the types of entities and relations which exist in the world under a given perspective.
The focus is on the entities that exist in the world at a given level of granularity. Considering the form of ontologies, Grenon and Smith (2004) distinguished two types of ontologies: SNAP ontologies, that is, ontologies for continuants such as entities, physical, or mental objects; and SPAN ontologies for occurrents, that is, events and processes, etc. In their opinion, an ontology should be indexed according to a perspective and considering the time: SNAP ontologies, for continuants, deal with a single instant in time; whereas SPAN ontologies, for occurrents, deal with a time interval.
On the contrary, Roberto Poli (1996) suggested considering ontologies not a catalogue of the world or a list of objects, but rather an organizational framework for catalogues, taxonomies and terminologies.
An ontology is not a catalogue of the world, a taxonomy, a terminology or a list of objects, things or whatever else. If anything, an ontology is the general framework (= structure) within which catalogues, taxonomies, terminologies may be given suitable organization. This means that somewhere a boundary must be drawn between ontology and taxonomy. (Poli 1996, 313)
This indicates the need for drawing boundaries between, for instance, ontologies and taxonomies. As a matter of fact, in → library and information science (Hjørland 2018a; 2018b), a widespread opinion considers the term ontology as just a new name used in computer science to define tools developed in the field of knowledge organization, such as taxonomies or → library classification systems (Hjørland 2017).
→ Ingetraut Dahlberg (1996, 129) claimed that the computer science community was using the term ontology to mean what knowledge organization has always called “taxonomy” or “classification”, and that “ontology has indeed something to do with classification systems in the sense that what we need to organize are our concepts about reality, about which we face and know of or learn about, thus creating our knowledge units, our concepts and our concept systems”.
A similar opinion was expressed by Dagobert Soergel (1999, 1120), who considered ontologies similar to classifications, asserting that, although libraries and information systems have long been using classification schemes, recently other fields such as artificial intelligence, linguistics and software engineering “have discovered the need for classification, leading to the rise of what these fields call ontologies”. Therefore, in his opinion, ontologies are considered similar to other kinds of KOS.
On the contrary, following Poli’s suggestion (1996, see above), it is essential to accurately distinguish ontologies from taxonomies and other kinds of KOS.
3. Ontologies and knowledge organization systems (KOS)
To offer a presentation and discussion of ontologies, first the relationship between ontologies and KOSs must be addressed. In the broad field of → knowledge organization (Hjørland 2008; 2016a), → knowledge organization systems (Mazzocchi 2018) are tools for describing resources and aiding in the access and retrieval of documents and information. A comprehensive and still useful definition, offered by Hodge (2000, 1), related to the framework of digital libraries, considers KOSs “all types of schemes for organizing information and promoting knowledge management”. Broadly speaking, KOSs range from authority files, classification systems and subject headings to → thesauri and, according to some scholars, to ontologies. As proposed by Marcia L. Zeng (2008, 161), KOSs may be categorized according to their structure and function. Their structures can range from flat to two-dimensional to multiple-dimensional, and their functions include eliminating ambiguities, controlling synonyms, establishing hierarchical and associative relationships, and presenting properties. Based on this categorization, Zeng presents a taxonomy of KOSs including simpler KOSs, such as dictionaries, glossaries and authority files, and more complicated structures, such as classification schemes, subject headings, and thesauri used in libraries and information centres, and, finally, ontologies.
According to Souza, Tudhope and Almeida (2012, 181, based on Souza, Tudhope and Almeida 2010), different KOSs are representations “based on concepts and with different degrees of relationships among them”. In addition to classification schemes, folksonomies, dictionaries, taxonomies, thesauri, data models, etc., their range also includes different kinds of ontologies, from informal to formal, which allow representation of all types of relationships.
Comparably, in the computer science community, Smith and Welty (2001), following a previous report by Welty et al. (1999), present a wide spectrum of artifacts that they classify under the rubric of “ontologies”, as all satisfy Gruber’s definition (see section 4). These range from catalogues, glossaries, thesauri and frame-based systems to more expressive ontologies that use axioms. In this case, the characteristics concern the increasing complexity of the information artifacts, and the artifacts are divided into those that possess the ability for automated reasoning based on formal logic and those that do not (Fig. 1).
McGuinness (2003, based on Lassila and McGuinness 2001) also proposed a particularly broad notion of ontology that ranges from so-called simple ontologies, such as controlled vocabularies, glossaries, taxonomies and thesauri, to complex ontologies, that is, tools that present properties and value restrictions (Fig. 2).
In the literature produced by the aforementioned community, the term lightweight ontology is used essentially to mean simple taxonomies of concepts organized hierarchically based on the genera/species relationship, which can be employed to reach semantic interoperability if users take part in groups that share terminology and concepts (Zhu 2006). Classification systems and taxonomies may be transformed into formal systems written in a formal language instead of being written in natural language. In fact, natural language is ambiguous, leaves room for subjective opinions, and is barely capable of being automated. A formal classification is the formalized copy of a classification, encoded in a language of the family of description logics. As emphasized by Giunchiglia et al. (2006, 85), “a [formal classification] has the same structure as the classification, but it encodes the classification's labels in a formal language, capable of encapsulating, at the best possible level of approximation, their classification semantics”. Therefore, taxonomies, thesauri, faceted classification systems, and Web directories are defined as informal lightweight ontologies, that is, “prototypes of formal lightweight ontologies” (Giunchiglia and Zaihrayeu 2007).
The distinction between lightweight and heavyweight ontologies is that the latter primarily use axioms to model knowledge in order to define the semantic interpretation of the presented entities, rules, and class constraints, and present multiple relations between concepts. Based on highly expressive formal logic languages to specify entities and relationships, heavyweight ontologies enable search engines to make inferences and reason automatically. Description logics provide tools to express propositions about objects, about the attributes that the objects may have in common, and about the relations among objects. Descriptive formalisms include conjunction, negation and concept intersection, value restriction, and existential quantification. The systems based on description logics allow human and automatic agents to draw inferences, that is, to infer new knowledge from a knowledge base.
3.1 The issue of KOS spectra with respect to ontologies
Some scholars have presented spectra of KOSs to explicate their features, often considering a single criterion each time, as Souza, Tudhope and Almeida (2012) highlighted, remarking on their disagreement with this approach, as it neglects alternative criteria. One of the criteria considered is semantic richness, that is, the number of semantic relations between concepts, universals or particulars that KOSs exhibit. In fact, the “semantic staircase” presented by Olensky (2010) (Fig. 3) (and suggested earlier by Blumauer and Pellegrini 2006) shows a spectrum of KOSs based on a “semantic richness” that increases from glossaries to ontologies, with ontologies showing the highest degree of richness and allowing an unlimited set of semantic relations. Ontologies are thus considered the most evolved form of KOS.
On the contrary, in the realm of knowledge engineering, Guarino (2006) emphasizes the concept of “precision” to define formal ontologies in comparison with traditional knowledge organization systems. Precision defines the exactness of the representation of a domain in a formal ontological environment compared to traditional KOSs, and Guarino considers “ontological precision” the key concept to represent the axis along which to arrange the different artifacts (Fig. 4).
In the perspective offered by Marcia L. Zeng (2008) (Fig. 5), like that presented by Olensky, KOSs are arranged in a spectrum with increasing semantic richness. Thesauri, semantic networks and ontologies are presented as members of the category “relationship models”, as they can represent many relationships. Though both thesauri and ontologies present semantic relationships, Zeng identifies the characteristics of ontologies by their ability to present, among major functions, “properties” for each class. Properties (attributes), that is, object-properties (see section 4.1) specify “how the individuals relate to other individuals” (OWL 2 Web Ontology Language Primer 2009). Ontologies, thus, have in particular the characteristic of representing attributes for each class, and differentiate from other kinds of KOS, such as classification systems and thesauri.
Therefore, taking into consideration the relationships with other KOSs, we could suggest the following definition of ontology:
ontologies are a kind of KOS that present the highest degree of semantic richness, as they allow a great number of relations to be established between terms.
Whether we should consider properties as semantic relations that characterize ontologies is an open question. Taking into account the suggestion by Zeng (2008), we should include in the definition of ontologies the distinguishing feature of presenting “properties” that are not completely similar to semantic relationships (associative). Therefore, we instead suggest the following more complete definition:
ontologies are a kind of KOS that present the highest degree of semantic richness, as they allow a great number of relations to be established between terms, and provide attributes for each class.
However, the focus of the discussion is not only the number and the kind of relations allowed to be included in ontologies, but also the underlying view of semantics. Semantics is the study of meaning, and it has spread into the fields of linguistics and logics. In information technology, and notably in the Semantic Web, semantics concerns the possibility of increasing the semantic power of descriptive → metadata, of improving knowledge representation, and thus retrieval on the Web. Consequently, it is important to wonder what kind of semantics is involved in the construction of the Semantic Web. In the Semantic Web, only formal semantics is relevant (Almeida, Souza and Fonseca 2011). Formal semantics encompasses theories that originated from philosophical logic, and is mainly founded on the principle of the “truth-condition” of sentences. It considers the meaning of a sentence equivalent to knowing its truth-condition in order to bypass the ambiguity of natural languages. Formal semantics may be involved in human-oriented systems, as well as in machine-oriented systems. In their explanation, Almeida, Souza and Fonseca (2011) argue that the Semantic Web aims to improve the inferences based on logics; whereas, the realm of meaning is much more comprehensive and complex.
Many scholars claim that ontologies are understandable by machines, in contrast with other systems that are only understandable by humans, and that on this characteristic lies the relevance of ontologies. On the other hand, it is worth noting that this ability has also been shown by thesauri, for instance. Thesauri may be built following a logical principle that allows narrower terms to inherit the characteristics of the broader terms to which they are connected. The command “explode” may be provided within the searching interface and allow users to include all the narrower terms linked to a top term used for searching. The PsycINFO database, for instance, available through EBSCO Discovery Services, can launch a query while adding all narrower and related terms that are part of the semantic object of searching, supported by the Thesaurus of Psychological Index Terms and its hierarchical structure. Also, thesauri have been developed following logical principles that allow computers to process data.
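The “explode” behaviour described above can be sketched as a transitive walk over narrower-term links. The following Python fragment is only an illustration under stated assumptions: it is not the PsycINFO or EBSCO implementation, and the `NARROWER` table, the term names and the function names are invented for the example.

```python
# Hypothetical narrower-term (NT) links; the terms are illustrative only
# and are not taken from the Thesaurus of Psychological Index Terms.
NARROWER = {
    "Emotional States": ["Anxiety", "Depression"],
    "Anxiety": ["Social Anxiety", "Test Anxiety"],
    "Depression": [],
}


def explode(term, narrower=NARROWER):
    """Return the term plus every narrower term reachable from it."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles and polyhierarchy duplicates
        seen.add(current)
        stack.extend(narrower.get(current, []))
    return sorted(seen)


def build_query(term):
    """Combine the exploded terms into a simple Boolean OR query string."""
    return " OR ".join(f'"{t}"' for t in explode(term))
```

Calling `explode("Emotional States")` collects the top term together with all five levels of the toy hierarchy that hang below it, which is the kind of machine processing of a hierarchical structure that the paragraph attributes to thesauri.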
In computer science, the most cited definitions of ontologies are those focused on the notions of conceptualization and of shared meanings, by Thomas R. Gruber, “An ontology is an explicit specification of a conceptualization” (Gruber 1993, 199), and by Rudi Studer, “An ontology is a formal, explicit specification of a shared conceptualization” (Studer, Benjamins and Fensel 1998).
A conceptualization is a synthetic view of the world according to some purposes, a conceptual representation of a specific field of knowledge that represents concepts, entities, objects and relations among them by specifying the links between those concepts, objects, events and entities pertaining to a field of interest. The best strategy to specify a conceptualization in ontologies is to adopt an intensional logic, which considers abstract concepts and relations that remain unchanged even if the world changes, and to state axioms. On the contrary, a conceptualization based on an extensional notion could not fit our needs, because it depends on a specific state of the world (Guarino, Oberle and Staab 2009, 5-6).
Some computer scientists emphasized more the notion of ontologies as artifacts that allow formal modelling of the entities and the relations in a system and are expressed in a formal, machine-readable format that computers can process (Guarino, Oberle and Staab 2009). The meaning of “formal” is crucial. In the conceptualization of the knowledge engineering community, “formal” means that “the expressions must be machine readable, hence natural language is excluded” (Guarino, Oberle and Staab 2009, 8). The informal approach concerns glossaries, hierarchies (and → folksonomies), and thesauri; whereas, the formal approach uses logical languages (first order logics, description logics) to represent ontologies (Guarino, Oberle and Staab 2009, 13). Poli and Obrst (2010, 4) suggest adopting the name “formalized ontologies” to mean the formal codification for the constructs acquired in a logical language such as first order logic or in a description logics-based language, such as OWL.
A clear definition of formal ontologies comes from Sowa (2009):
A terminological ontology whose categories are distinguished by axioms and definitions stated in logic or in some computer-oriented language that could be automatically translated to logic.
It is debatable whether thesauri could also be considered formal structures, as they show formally defined relationships based on internationally shared guidelines, and are processable by computers. Thesauri may be considered an example of a conceptualization formally defined, even though the standards do not always clearly distinguish between concepts and terms, as Dextre Clarke and Zeng (2012) have highlighted. Moreover, thesauri can be used to support the techniques of query expansion, as is usual with ontologies. Computational ontologies present a formal conceptualization of a field of knowledge if they adopt a formal logic language. However, the most important feature of ontologies in relation to thesauri is not the use of formal languages, but rather that they can express all semantic relations as needed.
A considerable debate about the formal aspect of ontologies and the relationships between formal language-based systems and natural language-based systems took place during the first decade of the 2000s. John Sowa’s (2006) point of view is notable for highlighting that the precision of formal languages does not fit the needs of users: “A precise, finished ontology stated in a formal language is as unrealistic as a finished computer system” (Sowa 2006, 204). Unlike formal languages, natural languages present vagueness, ambiguity and flexibility, as words may have different senses according to the different contexts or uses. Following Wittgenstein’s proposal of a multiplicity of “language games” and Peirce’s semiotics along with his view of “interpretant”, Sowa envisaged a modular approach using a dynamic collection of formal ontologies including all possible combinations through “systematic mappings to formal concept types and informal lexicons of natural language terms” (Sowa 2006, abstract). In this case, the richness of natural language would not be lost. Hjørland (2007) stressed the relevance of semantic relations in devising thesauri, taxonomies, classification schemas and ontologies as well as bibliometric maps and bibliographical databases. The relevance lies in the fact that semantic tools cannot be based on neutral criteria or shared meanings; on the contrary, they are biased toward different paradigms represented in the literature of the field that they serve. “Any semantic tool may be more or less in harmony, or in conflict, with the views represented in the literature” (Hjørland 2007, 389).
As is well known, there is no single correct way to model a domain; a domain may be modelled following different perspectives or considering the applications in which the ontology may be used (Noy and McGuinness 2001, 4). In addition, it is worth noting that the description of domains may evolve, and some scholars consider computational ontologies as “dynamic” artifacts that may be implemented and populated manually, semi-automatically or automatically, following updates in the fields of knowledge (Buckner, Niepert and Allen 2011).
The key components of an ontology are classes, instances, relationships, properties (or attributes), restrictions, and axioms. In knowledge representation, concepts are defined as classes or sets of individual objects (Nardi and Brachman 2003). Classes group individuals (instances) that show something in common and they represent sets of individuals. In concept modelling, classes may be used also to denote “the set of objects comprised by a concept of human thinking, like the concept person or the concept woman” (OWL 2 Web Ontology Language Primer 2012, § 4.1).
Relationships are shown specifying class hierarchies, often offering subclass axioms to enable reasoners to make inferences about instances. The basic hierarchical relationship is is-a, for instance, Mother is-a Parent where is-a defines a hierarchy and allows Mother to inherit the properties (or attributes) from Parent. The subsumption is a fundamental mechanism that allows representation of the hierarchical relationship between concepts, along with the whole-part relationship.
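The subsumption mechanism described above can be illustrated with a minimal Python sketch. It assumes a toy hierarchy built around the Mother is-a Parent example; the other classes, the individual `mary` and all function names are hypothetical, and the code only follows is-a links transitively rather than reproducing what a description logic reasoner does.

```python
# Hypothetical class hierarchy: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "Mother": ["Parent", "Woman"],
    "Parent": ["Person"],
    "Woman": ["Person"],
    "Person": [],
}

# Hypothetical class assertions for individuals (instances).
INSTANCE_OF = {"mary": ["Mother"]}


def superclasses(cls, hierarchy=SUBCLASS_OF):
    """All classes that subsume cls, following is-a links transitively."""
    found = set()
    pending = list(hierarchy.get(cls, []))
    while pending:
        parent = pending.pop()
        if parent not in found:
            found.add(parent)
            pending.extend(hierarchy.get(parent, []))
    return found


def is_a(sub, sup):
    """True if sub is subsumed by sup (every class subsumes itself)."""
    return sub == sup or sup in superclasses(sub)


def inferred_types(individual):
    """Asserted classes of an individual plus every class they entail."""
    types = set()
    for cls in INSTANCE_OF.get(individual, []):
        types.add(cls)
        types |= superclasses(cls)
    return types
```

Here `mary` is asserted only to be a Mother, yet `inferred_types("mary")` also yields Parent, Woman and Person, which is the sense in which subclass axioms let a reasoner draw conclusions about instances.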
Properties characterize the instances that make up a class and define the way in which the individuals are related. Properties connect individuals belonging to one class or to different classes; they use restrictions of domain [2] and range [3] to precisely define the class of individuals that can be connected by the property, and restrictions of existential quantification (cardinality) to define the maximum and minimum number of individuals that can be connected. Properties are subdivided into object properties, which connect an individual to another individual, defining how the individuals are correlated, and datatype properties [4], which are used to ascribe data values to objects, such as an age or a role to a person, or a date of publication to a bibliographical resource. To define the values, XML Schema DataTypes (2012) may be used.
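How domain, range and cardinality restrictions narrow the individuals that a property may connect can be sketched as follows. The property `hasAuthor`, the individuals and the helper names are assumptions made for illustration. The sketch also takes a deliberately simplified, closed-world reading in which restrictions are checked against the data; an OWL reasoner would instead use domain and range axioms to infer class membership.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectProperty:
    """An object property with domain, range and cardinality bounds."""
    name: str
    domain_cls: str                 # class the property starts from
    range_cls: str                  # class the property points to
    min_card: int = 0
    max_card: Optional[int] = None  # None means "no upper bound"


# Hypothetical individuals and their classes.
TYPES = {"mary": "Person", "john": "Person", "moby_dick": "Book"}

# Hypothetical property: a book has at least one author, who is a person.
has_author = ObjectProperty("hasAuthor", "Book", "Person", min_card=1)

# Assertions as (subject, property name, object) triples.
TRIPLES = [("moby_dick", "hasAuthor", "john")]


def violations(prop, triples=TRIPLES, types=TYPES):
    """List human-readable problems found for one property."""
    problems = []
    # Every individual in the domain starts with zero recorded values.
    counts = {ind: 0 for ind, cls in types.items() if cls == prop.domain_cls}
    for subj, pred, obj in triples:
        if pred != prop.name:
            continue
        if types.get(subj) != prop.domain_cls:
            problems.append(f"{subj} is not in domain {prop.domain_cls}")
            continue
        if types.get(obj) != prop.range_cls:
            problems.append(f"{obj} is not in range {prop.range_cls}")
            continue
        counts[subj] += 1
    for ind, n in counts.items():
        if n < prop.min_card:
            problems.append(f"{ind} has {n} values, fewer than {prop.min_card}")
        if prop.max_card is not None and n > prop.max_card:
            problems.append(f"{ind} has {n} values, more than {prop.max_card}")
    return problems
```

A datatype property could be modelled in the same way by letting the range name a value type, such as a date, instead of a class of individuals.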
Axioms make statements and definitions that are considered to be true, such as the definitions of classes and subclasses, within ontologies.
Description logics, a family of logic-based modeling languages, allow relationships to be established between concepts, including relationships based on attributes and relationships between classes such as equivalence and disjointness, and allow axioms to be stated. They provide logical operators such as intersection, union, and complement of concepts, along with value restrictions and existential quantification.
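The operators listed above can be made concrete by reading each concept as the set of individuals that satisfy it. The Python sketch below evaluates the constructors in one small, invented interpretation; the individuals, the concepts and the role `HAS_CHILD` are all assumptions. Evaluating concepts in a single finite interpretation is not description logic reasoning, which must hold across all possible interpretations, so the sketch only shows what the constructors mean.

```python
# A tiny, hypothetical interpretation: a finite domain of individuals,
# the extension of each atomic concept, and one role (object property).
DOMAIN = {"ann", "bob", "cat", "dan"}
PERSON = {"ann", "bob", "cat"}
FEMALE = {"ann", "cat"}
HAS_CHILD = {("ann", "bob"), ("bob", "cat")}


def intersection(c, d):
    return c & d


def union(c, d):
    return c | d


def complement(c, domain=DOMAIN):
    return domain - c


def exists(role, c, domain=DOMAIN):
    """Existential quantification: individuals with at least one role
    successor that belongs to c."""
    return {x for x in domain if any((x, y) in role for y in c)}


def only(role, c, domain=DOMAIN):
    """Value restriction: individuals all of whose role successors belong
    to c (vacuously true for individuals with no successors)."""
    return {x for x in domain if all(y in c for (s, y) in role if s == x)}


def disjoint(c, d):
    """Two concepts are disjoint here if they share no individual."""
    return not (c & d)


# Mother defined as: Female AND Person AND (EXISTS hasChild.Person)
MOTHER = intersection(intersection(FEMALE, PERSON), exists(HAS_CHILD, PERSON))
```

In this interpretation `MOTHER` contains only `ann`, the one individual who is female, a person and has a child who is a person.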
So far, we have considered ontologies as based on concepts. However, according to some scholars, the terms concept and conceptualization used in describing ontologies are ambiguous and subject to different interpretations. In their opinions, having concepts as subject matter means to assume that knowledge exists only in the minds of human beings and that it is known through our own concepts. Most important is the realistic perspective embraced by Barry Smith (2004), who underlines that in natural sciences, the ontologies developed in support of research should not be based on concepts, but rather on “the universals and particulars which exist in reality and are captured in scientific laws” (Smith 2004, 73), that is, what we know about reality from the work of scientists.
Smith defines two kinds of ontologies: SNAP ontologies concern all the entities in the universe, everything and its parts; SPAN ontologies concern the entities that happen in successive parts (cf. above). The ontological foundation to classify entities lies in “material universals”, which are not concepts, but real entities to which our concepts correspond and are multiply instantiated (Grenon and Smith 2004, 144). The instantiation is a relation between universals and particulars [5]. The ontologies in natural sciences, indeed, should present the world as including universals (or types), “counterparts in reality of (some of) the general terms used in the formulation of scientific theories”, and particulars (or instances) that exist in time and space as concrete entities and can be depicted on the basis of observation (Smith and Ceusters 2010, 141). Instances are not repeatable, whereas universals are repeatable.
4.2 Similarities and contrasting features of thesauri and ontologies
Since the 1960s, thesauri (Dextre Clarke 2019) have offered a map of the concepts, terms, and relationships between them for any disciplinary field. Based on the linguistic conception of “semantic field”, thesauri constitute a structure that allows the control of synonyms and homonyms. Three kinds of relationships are provided: an equivalence relation between terms (UF, Use For); hierarchical relations (BT, Broader Term; NT, Narrower Term); and associative relations (RT, Related Term) between concepts.
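As an illustration of the three kinds of relationships, the following minimal sketch models a thesaurus as a data structure in which UF entries lead to a preferred term and BT/NT and RT links are recorded reciprocally. The class names, method names and sample terms are hypothetical and do not reproduce any published thesaurus or standard.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    preferred: str
    used_for: set = field(default_factory=set)   # UF: non-preferred entry terms
    broader: set = field(default_factory=set)    # BT
    narrower: set = field(default_factory=set)   # NT
    related: set = field(default_factory=set)    # RT


class Thesaurus:
    def __init__(self):
        self.concepts = {}      # preferred term -> Concept
        self.entry_terms = {}   # non-preferred term -> preferred term

    def add(self, preferred, used_for=()):
        """Register a preferred term and, optionally, its UF entry terms."""
        concept = self.concepts.setdefault(preferred, Concept(preferred))
        for term in used_for:
            concept.used_for.add(term)
            self.entry_terms[term] = preferred
        return concept

    def add_hierarchy(self, broader, narrower):
        """Record a BT/NT pair; the two links are always reciprocal."""
        self.add(broader).narrower.add(narrower)
        self.add(narrower).broader.add(broader)

    def add_related(self, a, b):
        """Record a symmetric RT link."""
        self.add(a).related.add(b)
        self.add(b).related.add(a)

    def resolve(self, term):
        """Return the preferred term for any entry term, or None if unknown."""
        if term in self.concepts:
            return term
        return self.entry_terms.get(term)


# Hypothetical sample content.
thesaurus = Thesaurus()
thesaurus.add("Cars", used_for=["Automobiles"])
thesaurus.add_hierarchy("Vehicles", "Cars")
thesaurus.add_related("Cars", "Roads")
```

The sketch makes visible the point discussed in this section: the RT link carries no further specification of the kind of association it expresses.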
Some scholars claim that the most important feature of ontologies is that they present a formal conceptualization of a field of knowledge as they adopt a formal logic language. However, the most relevant characteristic of ontologies in relation to thesauri is not the use of formal languages, but rather that they express all semantic relations as needed whereas thesauri provide a limited number of pre-defined semantic relations between concepts.
The debate about the role of thesauri in modern information retrieval was recently renewed thanks to the meeting held in February 2015 by the ISKO-UK chapter (“This House believes that the traditional thesaurus has no place in modern information retrieval”). The meeting was followed by the publication of a special issue of Knowledge Organization in 2016 (Dextre Clarke 2016).
Although in the context of computer science statistical and algorithmic power has challenged traditional information retrieval systems in which thesauri played a substantial role, thesauri could still have great potential in bibliographical databases (such as Medline and PsycInfo), as asserted by Hjørland (2016b). One prominent problem is whether a type of KOS may be transformed into another, for instance, a thesaurus into an ontology, and whether thesauri would improve their functionality by providing a larger number of relations in the style of an ontology (Hjørland 2015b, 118-122; 2016b, 150). For instance, thesauri should provide different kinds of semantic relationships by offering more precise “related terms”. As Hjørland (2015b) argues, the characteristics and the form of thesauri with limited relationships have never been discussed or justified in theory or in practice. The RT relationship or, better, the “unspecified semantic relations” in thesaurus construction encompasses antonyms, cause-effect relations, and sequences of facts, which, instead, are offered with more precise definitions in ontologies (Hjørland 2015a, 1369).
In their traditional form, in fact, thesauri have no place in modern information retrieval. However, Hjørland (2016b, 151) suggests an “open approach to any kind of semantic relations useful for a given task in a given domain”, as every field needs specific relationships and thesauri should be grounded on domain-specific characteristics rather than on standardized methods.
In contrast, Tudhope and Binding (2016) underline the relevance of thesauri in the Linked Open Data (LOD) environment, which has overtaken formal ontologies and logic-driven applications for the first time. Semantic Web applications use the SKOS data model (see section 8), based on the RDF data model (2004). Thesauri may be published in LOD format and may be accessible to applications that can process data in RDF. In recent years, some cultural portals have moved from the initial applications based on logics to the use of SKOS vocabularies for browsing collections, as a URI is added for each concept. Mapping and connecting a thesaurus to other thesauri allows us to retrieve results in many languages (Tudhope and Binding 2016, 177).
Information scientists have mostly pursued the reengineering of thesauri in ontologies as a strategy in order to provide structured hierarchies of concepts and connections among them using relations and attributes, with the support of the terminology offered by a thesaurus that is used for an extended time (Soergel et al. 2004).
For instance, the contents of the AGROVOC thesaurus, which provides a standardised multilingual terminology in the agricultural field, have been represented in a way more suitable for searching the Semantic Web, using the ontology modelling language OWL in order to create a structure based on semantic relationships, which will allow automatic inference. The restructuring work started in 2001 (Sini et al. 2008). The conversion of the AGROVOC thesaurus into an ontology model has expanded the NT, BT and RT relations into more defined relations, such as inclusion, spatial inclusion, membership, and inheritance. Moreover, relations such as genus/species and whole/part have been extended, providing more specific semantic relationships, including, for instance, the relations of cause, similarity and difference, and processes (Soergel et al. 2004). The richer types of relationships offered will allow systems to provide access to documents and advanced functionalities such as information discovery and reasoning.
5. Types of ontologies: upper ontologies and domain ontologies
Upper ontologies (top-level or foundational ontologies) represent universal concepts and properties, independent of single scientific fields, such as event, space, time, dimension, substance, phenomenon, identity, process, quantity, etc., on which a general consensus of large user communities should be achieved. The main aim of foundational ontologies is to allow multiple artificial agents to cooperate with each other and with humans. To achieve this, foundational ontologies “negotiate meaning” and help in “establishing consensus in a mixed society” (Gangemi et al. 2002).
In philosophy, Husserl (Logical Investigations, 1900) used the term formal ontology meaning categories that characterize aspects or types of reality; basically, they correspond to upper ontologies in technical and engineering fields (Poli and Obrst 2010, 3). Upper or top-level ontologies may be employed in building specialized concepts of different domain ontologies.
Domain ontologies conceptualize the specific realm of a field of knowledge or a particular task specifying the contents of the general categories provided by a top-level ontology. Domain ontologies offer a model of detailed knowledge, on which there may be substantial sharing of meanings already.
Poli and Obrst (2010, 8) suggest a third kind of ontology, middle ontologies, which may be also defined as “domain-specific upper ontologies” and cover multiple domains. The domain-specific upper ontology presents general enough constructs to encompass an entire science that presents many sub-domains, which, in their own right, may be considered domains.
Suggested Upper Merged Ontology (SUMO) is the upper-level ontology developed in 2000 by the Teknowledge Corporation (at present under copyright of the Institute of Electrical and Electronics Engineers, IEEE), which has been proposed by the Standard Upper Ontology Working Group as a candidate standard for upper ontologies and as a foundational ontology for domain ontologies. This top-level ontology includes general and abstract entities coming from already existing upper ontologies, such as the upper ontologies of John Sowa and of Russell and Norvig (Niles and Pease 2001; Ševcenko 2003). The physical world, which includes objects and processes, is distinguished from the abstract world, which includes classes, relationships, statements, quantities and attributes. In addition to the top-level ontology, SUMO includes a set of domain ontologies, for communication, geography, economics, engineering, etc.
Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE) is a prominent upper-level ontology developed within the international WonderWeb project (2001-2004) of foundational ontologies, by Nicola Guarino at the Laboratory for Applied Ontology (LOA) of ISTC-CNR (Italy). It provides ontological categories that reflect the structures of common languages and of cognitive human activity, with the aim of supporting interoperability among other domain ontologies. The first categorization concerns the distinction between endurants or continuants (physical: objects; nonphysical: social or mental objects) and perdurants or occurrents (events, processes, phenomena, activities, etc.). The distinction is based on the way they are present in time: endurants are completely present in every instant of their existence; perdurants occur in time and are only partially present at any time they are present (Gangemi et al. 2002).
Basic Formal Ontology (BFO) is an upper-level ontology (initially covering biomedicine) that was developed starting in 2002 by Barry Smith and Pierre Grenon. According to Grenon and Smith, philosophically, a good ontology should account for reality, considering reality both as it exists in a moment and as it happens through time. BFO is a formal ontology according to Husserl and presents basic structures of reality. It aims to represent a template for material ontologies (in Husserl’s sense), that is, ontologies of broad domains such as the domain of a society, of organisms, or of physical things. The philosophical dichotomy concerns the modes of existence of the entities during time. On one hand, entities exist that endure (continue) during time, even with some changes, that is, material and immaterial entities. On the other hand, entities exist that perdure in time, such as processes, events and activities (Spear, Ceusters and Smith 2016). BFO presents a two-component structure, representing SNAP (entities) and SPAN (processes): thus, both endurants and perdurants are represented in BFO (Grenon and Smith 2004).
An upper ontology level equipped with an engine for inferences is provided by Cyc, which is the largest formal ontology and the most ambitious of ontology system projects. The upper ontology level (Upper Cyc) defines approximately 3000 concepts related to temporal and spatial relationships, logical concepts and mathematical entities, concepts such as quantity, sets, and groups. The development of Cyc started in 1984, and in 2002 Cycorp released the first open-source version of the ontology, OpenCyc, equipped with a knowledge base of “common sense” and with fundamental concepts related to the fields of science, society, culture, environment and finance, and an engine able to make inferences and rules to allow reasoning about real-world objects. The Cyc technology allows access to data stored in external databases and web sites, the integration of data coming from heterogeneous sources, and the creation of a single consistent set. It is worth noting that the Cyc ontology shows the characteristics of content organization highlighted by Sowa (2006), who stressed the opportunities offered by natural language, its flexibility, and the fact that the meaning of words dynamically evolves, in contrast to the precision and clarity of formal languages. Cyc, thus, is provided with several thousand microtheories, each one focused on a specific field of knowledge or devoted to a particular context. Even Cyc, the most relevant formal ontology, chose the strategy of the multiplicity of modules. As Sowa declared (2006, 212) “The largest ontology project ever attempted began with a globally consistent set of axioms, but later divided it into a multiplicity of independently developed microtheories. That evidence does not prove that global consistency is impossible, but it suggests that a modular approach is easier to implement”.
Whereas top-level (or upper) ontologies represent general entities and properties, domain ontologies represent the concepts and objects of single scientific or applicative fields or of specific tasks. Domain ontologies conceptually describe the vocabulary of terms (and concepts or objects) of a field of knowledge, specifying concepts presented at the top-level and offering detailed knowledge concerning the domain based on the consensus of users, also in order to support automatic reasoning. Task ontologies and application ontologies offer a vocabulary of terms concerning a specific task or a particular activity, in the latter case including the ways used to realize events, processes and actions in a specific application field. Important issues in the development of domain ontologies encompass the coverage of the domain (either a considerable or small number of concepts), the possibility of re-use, and the purpose of the ontology, as a domain may be described following different points of view.
Particular attention must be paid to bibliographic ontologies, which enable the description of entities that belong to the bibliographic set, such as textual publications (e.g., articles, monographs, and series) and web pages, datasets, films etc., and define the relationships among these bibliographic entities (Nurmikko-Fuller et al. 2015; 2016). In addition, bibliographic ontologies have been built to define specific relationships, such as authorship, editorship, and aboutness among entities, as well as the relationships that connect → works (Smiraglia 2019) and their abridgments, adaptations, and translations. They can underline the relationships between a serial and the transformations it may have had over time, such as supplements or successors.
Bibo, the Bibliographic Ontology developed by Bruce D’Arcus and Frédérick Giasson in 2009, is the first OWL ontology that provides main concepts and properties for describing bibliographic entities and citations. Bibo’s properties have been used since 2011 in the BNB Linked Data Platform, which provides access to the British National Bibliography published as linked open data. The Linked Data Service of the Deutsche Nationalbibliothek also has used Bibo since 2010. Bibo includes five principal classes and 34 subclasses, 32 object properties and 20 sub-properties, 20 datatype properties (10 of which are OWL equivalent properties) and 26 sub-properties. Bibo presents a variety of entities related to the bibliographic world, organized into five principal classes: Agent, Collection, Document, Document Status, Event, and 34 subclasses. The classes Document and Collection accommodate most of the bibliographic sub-entities. Among the subclasses of Document, there are Article, Book, Image, Legal Document, Manuscript, Report, Web page, etc. Among the subclasses of Collection, there are Periodical, Series, and Web site.
Even though the developers of Bibo curated sound definitions of the classes that belong to the bibliographic field, it is worth mentioning that this ontology is not very detailed with respect to the properties required in a bibliographic environment. Although it represents the translations of bibliographic resources, the properties that concern derivative, merging, and absorbed resources, which are provided instead by BIBFRAME, are not represented in Bibo. The aforementioned properties refer to two different categories of relationships very relevant in the bibliographic field: derivative relationships, which concern different editions of the same work and works derived from a pre-existing work, and sequential relationships that include sequels of a monograph, the logical continuation of, or the transformation of another work (Tillett 1989; Green 2001; IFLA 2017, 69-78).
FaBiO (The FRBR-aligned Bibliographic Ontology) was developed by Silvio Peroni and David Shotton as part of the project SPAR - Semantic Publishing and Referencing Ontologies, a set of complementary and orthogonal ontologies developed in OWL 2 DL by Bologna University and Oxford University (2009) [6]. SPAR ontologies participate in the Semantic Publishing project, which deals with the use of Semantic Web technologies to describe the different aspects of the publishing domain and semantically linking scientific literature to facilitate discovery. FaBiO (Peroni and Shotton 2012) is based on the → FRBR entities (IFLA 1998; 2009): Work, Expression, Manifestation, Item. The Work class in FaBiO is restricted to entities published or printable: textual publications, such as articles, books, series, and journals, etc., but the entities also include Web pages, datasets, computer algorithms, catalogues, etc.
FaBiO includes seven super-classes, one equivalent class, and 237 subclasses (for a total of 245 classes), only 28 object properties [7], 65 datatype properties [8] and 15 named individuals. FaBiO does not show the properties that concern derivative, merging, and absorbed resources, which are provided instead by BIBFRAME.
FaBiO uses FRBR categories to describe a document considering its different Expressions. For example, an academic paper could be published as a journal article first, later as a paper in conference proceedings, or as a book chapter (Peroni and Shotton 2012, 36; Peroni, Shotton and Vitali 2012). It is worth mentioning here that IFLA (2017), defining the levels of bibliographic description, established a strict connection between Work, Expression, Manifestation and Item: a Manifestation is a Manifestation of an Expression of a Work. Moreover, IFLA (2017) highlighted that a Work consists of the intellectual or artistic creation and that it includes all the various Expressions of the Work (IFLA 2017, 20). FaBiO uses the FRBR categories in order to define the different types of bibliographical entities or objects that belong to the publishing world. The presentation of bibliographical entities in a classified way, offering each of them as a subclass of Work or of Expression, produces some confusion. While the FRBR properties are correctly used in FaBiO to link a Work to an Expression or a Manifestation in abstract, without mentioning any particular bibliographical entity, bibliographical entities are offered each time as sub-class of Work or of Expression. Each bibliographical entity, instead, may be a Work or an Expression, depending on the situation (Biagetti 2018). The reasoning behind the decision is not clear for classifying as a Work, for instance, an essay, report, research paper, and, on the contrary, as an Expression, for example, an article, brief report, chapter, and proceedings paper [9]. An essay, an article, a report, a chapter, may be considered a Work or an Expression depending on the situation.
BIBFRAME 2.0 presents “three core levels of abstraction: Work, Instance, and Item” and “additional key concepts that have relationships to the core Classes” such as Agents, Subjects, and Events (associated with Works or Instances). BIBFRAME’s vocabulary offers 75 classes and 112 subclasses (plus two FOAF classes), 194 properties, out of which 131 are object properties (and sub-properties), and 63 are datatype properties (and sub-properties). In particular, the cataloging resource relationships (general, specific and detailed), mainly sub-properties of the property relatedTo, may be relevant for analysis. BIBFRAME provides properties that concern derivative, merging, and absorbed resources, which, on the contrary, have been less scrutinized in Bibo and FaBiO. BIBFRAME offers sub-properties of the property relatedTo, such as accompaniedBy, which allows definition of a supplement or index added to a resource, and derivativeOf, to express translations and different editions of a work. The replacement of a resource with another, the merging of two or more resources to form a new resource, the continuation of a resource under a new title, and the incorporation of a resource into another, may be described using the property precededBy. Additionally, the property succeededBy allows definition of a resource that supersedes another and the division of a resource into two different resources. These properties are relevant to connect entities that belong to the bibliographic field, and they are lacking in other ontologies, such as FaBiO and Bibo.
In BIBFRAME 2.0, special attention has been paid to derivative, added and merged resources; however, the same attention has not been paid to the properties that belong to continuing resources, such as serials. The recently developed formal ontology PRESSoo (2016), an extension of FRBRoo (IFLA 2016), addresses the problems of absorption, continuation, replacement, separation, and merging, also provided by BIBFRAME 2.0. Moreover, in PRESSoo, the cases of a temporary replacement of a serial with another serial, the reprint of a dead serial as a new monograph, the enhancing of a series by monographs, the launch and the end of a periodical, the issuing rules (for instance, regularity, frequency, etc.), and the partial continuation of a serial, are also added.
The Dublin Core Element Set, developed in 1995, has been maintained by the Dublin Core Metadata Initiative in order to facilitate the description of resources on the Web by way of a limited shared set of elements. It has been described in official documents as “a very simple ontology”, as the up-to-date version of Dublin Core metadata terms (DCMI Metadata Terms) issued in 2019 shows the features of ontological schemas and includes the fifteen terms of the original version as well as the classes, properties, and datatypes that represent the extension vocabularies. Classes and properties are identified by URIs to be used in linked data. The schema, a DCMI Recommendation 2020, shows 55 properties that notably derive from the Elements Refinements of the previous versions, and 20 classes (previously “elements”).
The most significant computational ontological model for the cultural heritage area is CIDOC-CRM (ICOM-CIDOC 2020), suggested by ICOM (International Council of Museums) since 1996 and at present internationally shared. CIDOC is a domain ontology that allows sharing of information among heterogeneous institutions that manage cultural heritage and research; it covers a wide field that includes archaeological sites, museum collections, monuments and the scientific documentation kept in libraries and archives, already described using different metadata systems. The CIDOC ontology is a formal language that can express cultural content, in particular information concerning historical-geographical contexts that different management systems may have in common. It represents a model for integration and sharing and allows different cultural sources to become global resources. Moreover, CIDOC-CRM offers a top-level ontology made up of general classes also applicable to other domains, such as Temporal entities, Period, Activity, Modification, Conceptual object. It was approved as the ISO 21127 standard in 2006 (revised in 2014). The latest version, in progress, is version 6.2.9 of April 2020 (Doerr 2003; 2009).
The model offers an event-centric vision: events connect persons, places, activities, ideas, and objects. The guiding principle for integration of information that comes from different institutional sources is the explicit representation of events in a historical context. Temporal entities assume therefore the central role, as they are directly connected to space and time. The aim of the ontology is to accommodate historical contents and to provide a model that can also represent content and data that contradict each other.
Following monotonic reasoning [10], this ontology allows merging broad knowledge bases without conflict. However, the choice of monotonic reasoning may be a limitation. Actually, abductive and scientific reasoning is oriented towards the revision of acquired knowledge if there are new facts that open a discussion about events previously considered. For instance, p-Logic (probabilistic logics) (Johnson-Laird, Khemlani and Goodwin 2015) is not founded on monotonic logics. The possibility of making mistakes and of correcting the results in light of new elements that contrast with previous statements is offered by non-monotonic logics (Strasser and Antonelli 2015), which allow for conclusions that may go beyond the meanings involved in the premises. Defeasible reasoning (Koons 2014) allows wrong conclusions to follow from a true premise.
In the last draft version (2020), CIDOC-CRM offers 100 classes and 196 properties. Main characteristics of CIDOC-CRM:
based on two fundamental categories that concern the conservation of identity during the time: persistent items (endurants), which maintain their identity beyond the single events, and temporal entities (perdurants), which occur over time;
provides the multiple instantiation that allows an instance of a class to occur at the same time as an instance of other classes. For example, an object may be an instance of E20 Biological Object and at the same time an instance of E22 Man-Made Object.
provides multiple inheritance: a class may be a subclass of two or more super-classes and inherits the properties of different super-classes. For instance, Person is subclass of Actor and of Biological Object and inherits the properties of each one.
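The last two characteristics can be sketched with a small class graph. The following is a minimal illustration, assuming simplified class labels taken from the examples above and purely hypothetical property names (actor_prop, bio_prop, person_prop, made_prop); it does not reproduce the actual CIDOC-CRM class hierarchy or its numbered properties.

```python
# Direct superclasses of each class (simplified, illustrative labels).
SUBCLASS_OF = {
    "Person": {"Actor", "Biological Object"},
    "Actor": set(),
    "Biological Object": set(),
    "Man-Made Object": set(),
}

# Properties declared directly on each class (hypothetical names).
DECLARED_PROPERTIES = {
    "Person": {"person_prop"},
    "Actor": {"actor_prop"},
    "Biological Object": {"bio_prop"},
    "Man-Made Object": {"made_prop"},
}

# Multiple instantiation: one individual is an instance of several classes.
INSTANCE_OF = {
    "specimen_1": {"Biological Object", "Man-Made Object"},
    "person_1": {"Person"},
}


def superclasses(cls):
    """Return every direct and indirect superclass of cls."""
    found = set()
    pending = list(SUBCLASS_OF.get(cls, ()))
    while pending:
        current = pending.pop()
        if current not in found:
            found.add(current)
            pending.extend(SUBCLASS_OF.get(current, ()))
    return found


def class_properties(cls):
    """Multiple inheritance: own properties plus those of all superclasses."""
    props = set(DECLARED_PROPERTIES.get(cls, ()))
    for parent in superclasses(cls):
        props |= DECLARED_PROPERTIES.get(parent, set())
    return props


def instance_properties(individual):
    """Properties applicable to an individual through all of its classes."""
    props = set()
    for cls in INSTANCE_OF.get(individual, ()):
        props |= class_properties(cls)
    return props
```

In the sketch, Person obtains the properties of both Actor and Biological Object through inheritance, whereas specimen_1 obtains the properties of two unrelated classes through multiple instantiation, without any new class being created.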
The strength of CIDOC lies in the great number of properties that allow the definition of the relationships between the entities. The core properties of the ontology permit the definition of the relations between agents, activities and places that meet in a single event. It is necessary to observe that the most relevant properties concern Participation (for instance, P11 and P12), which permits the highlighting of interactions among persons, places, and actions that occurred in one event. The set of relations defined by the concept of Influence (for instance, P15 and P17), and by the concept of Purpose (for instance, P20 and P21) allows the emphasis of the mutual influences between each entity and activity belonging to the universe of discourse of CIDOC.
CIDOC has been widely used in artistic cultural heritage: since 2000, Claros served as a system for searching archaeological and art collections; since 2012 Die Deutsche Digitale Bibliothek allows consultation of the digital collections by way of CIDOC; and Arches allows mapping of information about archaeological sites, historical buildings and areas of cultural relevance [11].
In the biological and medical domains, controlled vocabularies of terms and relations are used to share information, and several domain ontologies have been developed. A consortium of ontology developers, the Open Biological and Biomedical Ontology (OBO) Foundry, was founded in 2001 to define shared principles for developing interoperable ontologies in biomedicine. The principles include the collaborative development of ontologies and the use of a common syntax (RDF/XML); use based on the most prominent models, such as the Gene Ontology; and the provision of open access. The library of ontologies encompasses the BFO upper ontology and domain ontologies such as the Gene Ontology, the Disease Ontology, the Plant Ontology, the Protein Ontology, the Cell Ontology, the Coronavirus Ontology and so on. The Gene Ontology (GO), developed in 1998, describes the biological domain considering three aspects: cellular components, that is, the parts of the cell; biological processes, such as chemical reactions or metabolism; and molecular functions. Thus, the GO consists of three ontologies, each one devoted to one aspect. The GO is a dynamic vocabulary that allows description of the functions and activities performed by the gene products, that is, the macromolecules, across different organisms, enabling the analysis of genomic data. The three ontologies may be used to query a database of gene product functions.
The use of ontologies as tools for knowledge organization provides integrated access to digital objects that can be distributed and managed by different systems. Ontologies allow semantic interoperability, performing a mediation function between the meanings attributed to documents managed by different repositories, each one set up following non-shared strategies for knowledge organization.
Among the problems related to the use of ontologies in systems for managing research functionalities in digital libraries, the selection of the preferable ontology is the most important. In specialized scientific fields, the selection of an ontology does not present relevant problems; on the contrary, in the case of broad digital libraries that manage digital objects concerning a great number of disciplines, it is crucial to choose suitable ontologies, and this may cause differences in the settings of the research functionalities.
Advanced management systems in digital libraries allow the addition of basic functions for research such as the expansion of search terms, the application of such KOSs as classification systems and thesauri, and the inclusion of ontologies and annotations by users in a collaborative environment (Soergel 2009).
Query expansion enhances the results of a search in digital libraries (Efthimiadis 1996). Users may add variant words to the search terms; otherwise, they can manually or automatically add other words to those selected for searching. Historically, manual query expansion procedures have been based on the use of thesauri, especially in bibliographic databases. An example is PsycInfo, a database in existence since 1967 created by the American Psychological Association (APA), whose specialists index documents with the support of the Thesaurus of Psychological Index Terms. The thesaurus helps the users select terms and allows them to expand their query based on broader and narrower terms. Automatic query expansion, on the contrary, is based on the probability calculus of closeness of terms.
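Thesaurus-based query expansion can be sketched as follows. The entries are hypothetical and are not taken from the Thesaurus of Psychological Index Terms; the function names expand_query and to_boolean_query are likewise illustrative, and the Boolean OR syntax is only an assumption about how a retrieval system might accept the expanded query.

```python
# Hypothetical thesaurus entries keyed by preferred term.
THESAURUS = {
    "Anxiety": {
        "UF": ["Anxiousness"],
        "BT": ["Emotional States"],
        "NT": ["Social Anxiety", "Test Anxiety"],
    },
    "Social Anxiety": {"UF": [], "BT": ["Anxiety"], "NT": []},
}


def expand_query(term, relations=("UF", "NT")):
    """Return the search term followed by the terms reached through the
    chosen thesaurus relations, without duplicates and in a stable order."""
    entry = THESAURUS.get(term, {})
    expanded = [term]
    for relation in relations:
        for candidate in entry.get(relation, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def to_boolean_query(terms):
    """Join the expanded terms with OR, quoting each term as a phrase."""
    return " OR ".join(f'"{t}"' for t in terms)


narrower_expansion = expand_query("Anxiety")
broader_expansion = expand_query("Anxiety", relations=("BT",))
query_string = to_boolean_query(narrower_expansion)
```

Expanding through narrower terms and synonyms keeps the search within the scope of the original concept, whereas expanding through broader terms widens it; the choice of relations is left to the user, as in manual expansion.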
At present, query expansion in digital libraries is offered with ontology support to allow for the formal disambiguation of meanings, managed by machines (Bhogal, McFarlane and Smith 2007). The aim is to contextualize terms used in research. The use of general ontologies such as Cyc, or of domain ontologies, allows representation of the terms within their contexts. If the techniques of query expansion are based on the use of a broad domain ontology, the search results can be expanded by hundreds of terms, and the recall level increases. In digital libraries, the use of query expansion techniques can be limited to the use of the synonyms of the terms selected at the beginning, or of the terms belonging to a class; in this case the precision of the search increases (Frosterus and Hyvönen 2009).
Mapping the terms in documents to the terms in ontologies allows searches to be conducted with an ontological basis by expanding the terms for searches using all the terms declared by an ontology to belong to a class. However, query expansion techniques do not allow searching that considers different perspectives or the different points of view by which a monograph debates a topic, in particular in social sciences and humanities. Traditional indexing, instead, highlights these peculiarities.
In addition to the use of domain ontologies, query expansion in digital libraries is usually performed using lexical databases built on semantic networks, such as WordNet, developed for the English language in 1985 by the Cognitive Science Laboratory of Princeton University, under the supervision of George Miller. As Miller and Fellbaum (2007, 210) highlighted, WordNet is not an ontology, but rather a dictionary based on semantic structure. It was built adopting as a model, at least partly, such linguistic thesauri and thematic dictionaries as Roget’s Thesaurus (WordNet 1998), which organizes English terms into semantic fields. In WordNet, nouns, verbs, adjectives and adverbs are organized into synsets, or sets of synonymous terms. Each synset is devoted to a concept and expresses the semantic networks through the relationships of meronymy, hyperonymy, antonymy and hyponymy. WordNet is made up of about 117,000 synsets, sets of synonyms and quasi-synonyms that accommodate about 150,000 words and glosses. Nouns and verbs are organized in semantic sets, and the hierarchies based on the relationships of hyponymy and hyperonymy are made explicit. Adjectives are organized in clusters with a central synset comprising a pair of antonyms (fast/slow; wet/dry) enriched with glosses and examples, and “satellite” synsets, each one devoted to a related concept. At the beginning of the 1990s, Piek Vossen of Amsterdam University began a project to extend WordNet to some European languages such as German, French, Italian, and Spanish.
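The organization into synsets connected by hyperonymy and hyponymy can be illustrated with a toy structure. The synset identifiers, lemmas and links below are hypothetical and are not extracted from WordNet; the sketch only shows how such a network may be traversed.

```python
# Toy synsets: each one groups synonymous lemmas and points to its hyperonyms.
SYNSETS = {
    "animal.n": {"lemmas": ["animal"], "hyperonyms": []},
    "canine.n": {"lemmas": ["canine", "canid"], "hyperonyms": ["animal.n"]},
    "dog.n": {"lemmas": ["dog", "domestic dog"], "hyperonyms": ["canine.n"]},
    "puppy.n": {"lemmas": ["puppy"], "hyperonyms": ["dog.n"]},
}


def synonyms(word):
    """Lemmas that share at least one synset with the given word."""
    result = set()
    for synset in SYNSETS.values():
        if word in synset["lemmas"]:
            result.update(synset["lemmas"])
    return result


def hyperonym_chain(synset_id):
    """Follow the first hyperonym link upwards until a root is reached."""
    chain = []
    current = synset_id
    while SYNSETS[current]["hyperonyms"]:
        current = SYNSETS[current]["hyperonyms"][0]
        chain.append(current)
    return chain


def hyponyms(synset_id):
    """Synsets that name the given synset as a direct hyperonym."""
    return sorted(
        sid for sid, synset in SYNSETS.items()
        if synset_id in synset["hyperonyms"]
    )
```

A query expansion procedure built on such a network would typically add the lemmas returned by synonyms and, where a broader or narrower search is wanted, the lemmas of the synsets found along the hierarchy.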
In 2000, to create guidelines for the development of WordNets in each language and set up a free platform for discussing and sharing information about WordNets for all languages and connecting different WordNets to guarantee interoperability and sharing, Piek Vossen and Christiane Fellbaum founded the Global WordNet Association (GWA).
A prominent tool devised by GWA is the Inter-Lingual-Index, a standard, universal index of meaning for inter-linking the WordNets in different languages. The Inter-Lingual-Index connects the new WordNets to each other and navigates through words with the same meaning in different languages. The Index is a repository of all concepts expressed in one language in the EuroWordNet; it is a significant tool for multilingual information retrieval. Domain ontologies have been developed for each set of concepts, and a top-level ontology has also been developed that shows the general concepts of the WordNets devoted to the different languages, with the aim of guaranteeing interoperability among different WordNets. The top ontology is made of 63 semantic sets that represent the common semantic structure for all the languages managed in the Index and are used to classify about 1024 concepts. The top-ontology distributes the entities in three areas: (1) objects and perceptible matter (classified according to their form, origin, function, and purposes); (2) situations and events (classified according to the type of elements that compose the situations); (3) the entities related to knowledge and mental states.
8. The role played by ontologies in the Semantic Web
In Tim Berners-Lee’s vision, the Semantic Web represents the extension of the traditional Web and transforms it into a network of documents connected to a network of knowledge elements. Ontologies represent the “core” of the transformation, allowing machines to query, reason and manipulate meanings and knowledge.
The RDF data model offers the syntax to describe resources; however, to define the meanings of resources, it was essential to develop a new language: RDFS (2014), or RDF Schema, which defines the vocabulary of the RDF resources in a specific domain. Resource descriptions built using RDF along with RDF Schema may be processed by machines, which might then be able to make inferences and deductions. To achieve this, the use of RDF datastores, such as Sesame or OpenLink Virtuoso, is required. Knowledge bases can store data in RDF, and machines use ontologies and reasoners to query the data through the SPARQL language.
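The kind of inference that RDF Schema makes possible can be sketched in plain Python, without an RDF datastore. The sketch applies only the RDFS subclass rule (a resource typed with a class also belongs to every superclass of that class); the `ex:` names and the in-memory set of triples are assumptions made for illustration, and a real system would hold the data in a triple store and query it with SPARQL.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical triples; the ex: names are invented for this sketch.
TRIPLES = {
    ("ex:Thesaurus", SUBCLASS_OF, "ex:KOS"),
    ("ex:KOS", SUBCLASS_OF, "ex:Resource"),
    ("ex:agrovoc", RDF_TYPE, "ex:Thesaurus"),
}


def infer_types(graph):
    """Apply the RDFS subclass rule repeatedly until no new triple appears."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for s2, p2, o2 in list(inferred):
                if p2 == SUBCLASS_OF and s2 == o:
                    new_triple = (s, RDF_TYPE, o2)
                    if new_triple not in inferred:
                        inferred.add(new_triple)
                        changed = True
    return inferred


def instances_of(graph, cls):
    """List the resources explicitly typed with cls in the given graph."""
    return sorted(s for s, p, o in graph if p == RDF_TYPE and o == cls)


closure = infer_types(TRIPLES)
```

Before inference, a query for instances of `ex:KOS` finds nothing; after inference, `ex:agrovoc` is returned, although no statement in the data says so directly.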
Ontologies are the core of Berners-Lee’s original vision of the Semantic Web. Web agents cannot work without ontologies, in particular in multiagent systems devoted to Web searching where heterogeneous agents cooperate, and systems might use different languages. In this case ontologies constitute a common vocabulary (Costantini and Tocchio 2002).
However, many scholars have criticized the basic ideas on which both the Semantic Web and ontologies are based. Catherine Legg (2007, 438), for instance, underlined that none of the present ontologies deliver a “machine-understandable theory of meanings”. In order to achieve more efficient results, formulating axiomatic assertions and inferencing rules would be needed, and, in doing this, new problems would arise, such as the need to “determine […] formal semantics [of the language used], and the inferential tractability, scalability, and brittleness of applications built using it” (Legg 2007, 438).
Ontologies are considered tools that allow computers to understand information, whereas other kinds of KOS seem not able to achieve this aim, as they are built for humans. However, the functions related to artificial intelligence are not the only relevant considerations. In fact, applications in library and information science may benefit from a richer set of high-level semantic relations that ontologies may provide in order to improve, for instance, subject analysis, in cooperation with philosophers and information scientists.
The SEKT Project (Semantically Enabled Knowledge Technologies, 2002-2006), proposed by an association of 12 European partners, managed by John Davies and financed within the 6th Framework Program of the European Commission, aimed to develop technologies for the Semantic Web and to test methodologies and tools that allow the identification of the meanings of information resources and of the contexts in which these are included, e.g., ontologies in combination with metadata. Moreover, ontologies may be used to manage the user profiles linked to information retrieval systems.
Particular attention must also be paid to the use of ontologies in annotations. The evolution of the Web makes it possible to equip documents with annotations that highlight entities such as persons, events, or places and identify them with persistent identifiers. Semi-automatic, ontology-based annotations may be created by authors while writing the text, or a posteriori by the user community. Indeed, content-related annotations may be created by users who classify the texts with the help of classes defined by ontologies managed by the system.
Another main concern is interoperability (Zeng 2019). It involves the aggregation and the exchange of data, along with the expansion of searching across networks of data repositories. Also, KOSs should support interoperability that is needed in the following situations, as reported by Marcia L. Zeng (2019, 123) quoting from NISO Z39.19-2005 Appendix A 10.1:
Metasearching of multiple content resources using the searcher’s preferred query vocabulary;
Indexing of content in a domain using the controlled vocabulary from another domain;
Merging of two or more databases that have been indexed using different controlled vocabularies;
Merging of two or more controlled vocabularies to form a new controlled vocabulary that will encompass all the concepts and terms contained in the originals; and
Multiple language searching, indexing, and retrieval.
Ontologies support semantic interoperability, by way of which the meanings of terminology may be understood by applications. As underlined by Zeng (2019, 124), “semantic interoperability can be defined as the ability of different agents, services, and applications to communicate (in the form of transfer, exchange, transformation, mediation, migration, integration, etc.) data, information, and knowledge — while ensuring accuracy and preserving the meaning of that same data, information, and knowledge”. The interoperability of thesauri with other kinds of KOS, such as classification systems, subject headings and ontologies, has been recommended by the ISO standard 25964-2: 2013.
Turning attention now to ontologies, it must be noted that upper ontologies come especially into play in semantic interoperability; however, other kinds of ontologies may serve as shared concept schemes to integrate existing vocabularies. As reported by Zeng (2019), quoting from Fritzsche et al. (2017), ontologies as “bridged schemes” are tools used to mediate between specific concepts of ontologies in the same domain such as the Global Agricultural Concept Scheme (GACS) project, whereas “reference ontologies” are not strictly connected to specific use cases of an application but may facilitate integration across systems and sources of data: “Rather than serving as an upper ontology that helps mediate between other ontologies, a reference ontology serves as a means for mapping the terminology of multiple information systems and data to a common set of shared concepts” (Zeng 2019, 134).
Mapping KOS vocabularies is another strategy to achieve semantic interoperability among existing vocabularies. Relations between concepts in vocabularies may be established, but with many challenges, as vocabularies may present different structures and languages, or different vocabularies may reflect different cultures. ISO 25964 recommends two models for mapping: the direct-linked model, which allows different vocabularies to be linked to each other, and the hub structure, which allows many vocabularies to map to a single vocabulary that serves as hub (reported in Zeng 2019, 138-139).
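The difference between the two mapping models can be made concrete with a short Python sketch. The vocabulary names, terms and hub identifiers are hypothetical, and the pair count assumes that, in the direct-linked model, every vocabulary is mapped to every other one.

```python
def direct_link_count(n):
    """Vocabulary pairs to map if each of n vocabularies is linked to every other."""
    return n * (n - 1) // 2


def hub_link_count(n):
    """Mappings needed if one of the n vocabularies serves as hub for the others."""
    return n - 1


# Hypothetical mappings from two vocabularies to concepts of a shared hub vocabulary.
TO_HUB = {
    "vocab_a": {"Cattle": "hub:bovine", "Rice": "hub:rice"},
    "vocab_b": {"Bovins": "hub:bovine", "Riz": "hub:rice"},
}


def translate(term, source, target, to_hub):
    """Map a term from source to target by passing through the hub concept."""
    hub_concept = to_hub[source].get(term)
    if hub_concept is None:
        return None  # the source term has no mapping to the hub
    for candidate, concept in to_hub[target].items():
        if concept == hub_concept:
            return candidate
    return None  # the hub concept has no counterpart in the target


result = translate("Cattle", "vocab_a", "vocab_b", TO_HUB)
```

With five vocabularies, the first model involves ten pairs of vocabularies and the hub model four mappings, which is one practical reason for choosing a hub when many vocabularies must interoperate.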
To improve the interoperability of current KOSs, in 2009 the World Wide Web Consortium developed the SKOS standard (Simple Knowledge Organization System 2009), a concept-centric data model based on RDF that identifies concepts using URIs, in order to make already available knowledge organization systems public on the Web in machine-readable formats and ease the reuse of thesauri, classification systems, and subject headings lists. SKOS uses the RDF data model and allows data to be linked and merged with other RDF data by means of Semantic Web software. SKOS was envisaged as a link between different communities of practice within library and information science. It serves as a connection between LIS communities and the Semantic Web, allowing the reuse of current KOSs such as the Library of Congress Subject Headings and, for instance, the UN Food and Agriculture Organization’s AGROVOC thesaurus, in the Semantic Web environment in a machine-understandable format, using the RDF data model. However, some drawbacks to the transformation into machine-readable format using SKOS have been highlighted, especially with reference to the representation of tables and indexes of classification schemes and the inconsistency of the SKOS model in representing the relationships between topics and classes (Panzer and Zeng 2009).
Ontologies play an essential role in the process of building linked open data (LOD) to enhance the Semantic Web, as they offer a tool to express suitable and semantically qualified relationships in the form of RDF triples (RDF Resource Description and Framework Primer 2004). Ontologies, thus, offer the contents of the object properties that constitute significant links among subjects and objects of RDF triples; data are linked using meaningful connections. The creation of LOD allows the connection of data within the Web and enriches information by interlinking structured data from different sources (Berners-Lee 2006). In the process of interlinking data, frequently used ontologies include FOAF (FOAF Friend Of a Friend Ontology 2000), which enables the definition of biographic profiles and relations among persons and groups, and the Organization Ontology (2014), which allows the expression of organizational structures, including governmental institutions. In the large domain of cultural heritage, the CIDOC-CRM ontology is broadly used, as it provides about 200 properties suitable to describe the attributes of the field. Bibliographic ontologies, such as Bibo (see section 6.1), have been used to provide access to data in LOD format since 2010 by the Deutsche Nationalbibliothek and since 2011 by the British National Bibliography.
An ontology language should describe meanings formally and in a machine-readable way to allow for automated reasoning. Traditional ontology languages are based on first-order logic, as in the case of the Knowledge Interchange Format (KIF), or on description logics. Web-based ontology languages are compatible with Web standards or based on a particular Web standard, such as OWL, which is based on RDF (Kalibatiene and Vasilecas 2011, 126-127). Ontology languages allow the building of ontologies, the encoding of knowledge and the inclusion of rules for processing.
KIF is a knowledge representation language created by the DARPA knowledge-sharing effort in order to allow the interchange of knowledge among different computer programs written in different languages and at different times. Its primary role is not to communicate with human users, but rather to allow computer systems to communicate. Its semantics are based on a conceptualization of the world; that is, it is based on abstract (such as concepts) and concrete objects, fictional objects (such as a unicorn), primitive and composite objects, and the words along with the things they represent, as well as on relationships among them (Genesereth et al. 1992). KIF is a declarative representation language and shows declarative semantics, richness of representation and human readability.
The Resource Description Framework (RDF) is a general-purpose language, a data model for conceptual description defined to represent information about resources (RDF Resource Description and Framework Primer 2004). Resources may be described by statements that specify each thing as an entity that has properties and values, which are identified by URIs. The statements take the form subject-predicate-object, where the subject represents the resource, and the predicate denotes one aspect of the resource and expresses the relationship between the resource and the value of the property (the object). Predicates may be defined by a URI from an ontology; objects may be defined by URIs or by human-readable literals. RDF triples specify the relationships between entities using a propositional structure, but offer only low-level logical expressiveness, as they do not specify the meaning of annotations, do not resolve problems of polysemy and synonymy, and do not allow deductions and inferences on data.
An ontological language was therefore needed that adds logical rules and allows systems to “reason” and make inferences. First-order logic was adopted as a framework to define ontological languages, as it allows the expression of multiple relationships between resources using a multiplicity of operators. Description logics express statements about objects, the relations among them, and the properties that objects may share. They allow the use of formalisms for conjunction, disjunction, negation, existential quantification, value restriction, and number restriction (cardinality). Systems based on description logics allow users (humans and software) to make inferences using algorithms of subsumption, instance, and consistency.
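The subject-predicate-object structure can be sketched in Python as follows. The `example.org` URIs and the data are invented for illustration, while the two predicates are taken from the Dublin Core terms vocabulary; the matching function is a simplification of what a triple store does and performs no reasoning.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


TITLE = "http://purl.org/dc/terms/title"
CREATOR = "http://purl.org/dc/terms/creator"

# Hypothetical resources under example.org; objects are URIs or plain literals.
DOC_1 = "http://example.org/doc/1"
DOC_2 = "http://example.org/doc/2"
AUTHOR = "http://example.org/person/1"

GRAPH = [
    Triple(DOC_1, TITLE, "Handbook on Ontologies"),
    Triple(DOC_1, CREATOR, AUTHOR),
    Triple(DOC_2, CREATOR, AUTHOR),
]


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


works_by_author = match(GRAPH, predicate=CREATOR, obj=AUTHOR)
```

The matching is purely by identity of URIs and strings: a second URI denoting the same person, or a literal spelled differently, would not be recognized as equivalent, which illustrates the limited expressiveness of triples taken on their own.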
Description logics offer operators for intersection, union and complement of concepts to define complex concepts (such as, for instance, persons that are not male), and quantified role restrictions such as value restrictions and cardinality. To produce inferences, algorithms of instance are used to determine whether an individual belongs to a class, algorithms of subsumption are used to establish hierarchical relationships among concepts, and algorithms of consistency are used to analyse logical consistency among concepts.
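The concept constructors and the three inference tasks can be approximated in Python by treating concepts as sets of individuals. This is only a sketch under a strong assumption: it works inside one fixed, finite interpretation invented for the example, whereas a description logic reasoner decides subsumption and consistency with respect to all possible interpretations.

```python
# Hypothetical finite interpretation: a domain of individuals, named concepts
# as sets of individuals, and a role as a set of ordered pairs.
DOMAIN = {"anna", "bruno", "carla", "rex"}
PERSON = {"anna", "bruno", "carla"}
MALE = {"bruno", "rex"}
HAS_CHILD = {("anna", "bruno"), ("anna", "carla"), ("bruno", "carla")}


def intersection(c, d):
    return c & d


def union(c, d):
    return c | d


def complement(c, domain=DOMAIN):
    return domain - c


def value_restriction(role, filler, domain=DOMAIN):
    """Individuals all of whose role values belong to the filler concept."""
    return {x for x in domain if all(y in filler for (s, y) in role if s == x)}


def at_least(n, role, domain=DOMAIN):
    """Individuals with at least n distinct role values (number restriction)."""
    return {x for x in domain if len({y for (s, y) in role if s == x}) >= n}


def is_instance(individual, concept):
    return individual in concept


def is_subsumed_by(c, d):
    """True if every member of c is a member of d in this interpretation."""
    return c <= d


def is_satisfied(c):
    """True if the concept has at least one member in this interpretation."""
    return len(c) > 0


# The complex concept mentioned in the text: persons that are not male.
non_male_person = intersection(PERSON, complement(MALE))
```

In this interpretation `non_male_person` contains "anna" and "carla" and is subsumed by `PERSON`, while the intersection of `MALE` with its own complement is empty, as it would be in any interpretation.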
In 2000, DARPA (Defense Advanced Research Projects Agency) developed the DAML language (DARPA Agent Markup Language program) in the areas of interest of the DARPA High Performance Knowledge Base; subsequently the OIL language (Ontology Inference Layer) was built. The Web Ontology Working Group, in the framework of the W3C Semantic Web Activity, finally developed OWL, the Web Ontology Language (OWL Web Ontology Language Overview 2004). OWL can be used in applications that manage the content of information, the meanings of terms and the relationships among terms. It uses the syntax of RDF, extends RDF Schema, defines classes, instances, hierarchies of classes, individuals and properties, and allows the definition of a greater number of relationships among classes and properties, such as disjointness, cardinality, and symmetry. It presents three sublanguages: OWL Lite, OWL DL (Description Logics), and OWL Full. A second edition, OWL 2 (OWL Web Ontology Language Overview 2009), was developed in 2009.
Restrictions on properties permit software to reason automatically. OWL allows the definition of restrictions on properties using range and domain restrictions (already used in RDF).
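How domain and range declarations support automatic reasoning can be sketched in Python. The property `ex:hasAuthor`, its domain and range, and the data are hypothetical; the function applies the usual RDFS reading, in which a declared domain lets software infer the class of a statement's subject and a declared range the class of its object.

```python
RDF_TYPE = "rdf:type"

# Hypothetical schema: ex:hasAuthor has domain ex:Document and range ex:Person.
DOMAINS = {"ex:hasAuthor": "ex:Document"}
RANGES = {"ex:hasAuthor": "ex:Person"}

DATA = {("ex:doc1", "ex:hasAuthor", "ex:author1")}


def apply_domain_range(triples, domains, ranges):
    """Add the rdf:type statements entailed by domain and range declarations."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in domains:
            inferred.add((s, RDF_TYPE, domains[p]))  # the subject belongs to the domain
        if p in ranges:
            inferred.add((o, RDF_TYPE, ranges[p]))  # the object belongs to the range
    return inferred


entailed = apply_domain_range(DATA, DOMAINS, RANGES)
```

From the single statement in `DATA`, the sketch derives that `ex:doc1` is a document and `ex:author1` a person, although neither fact was asserted.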
OWL permits new kinds of restrictions on properties, for instance in relation to classes: the allValuesFrom restriction requires that, for each individual of the class to which a property is applied, all the values of that property are members of the class declared in the restriction.
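A minimal Python sketch may clarify the effect of allValuesFrom. The class `ex:Periodical`, the property `ex:hasIssue` and the data are invented for illustration; the restriction is encoded here as a plain dictionary rather than in OWL syntax, and the function mimics the inference a reasoner would draw for members of the restricted class.

```python
RDF_TYPE = "rdf:type"

# Hypothetical restriction: for members of ex:Periodical, every value of
# ex:hasIssue is a member of ex:PeriodicalIssue.
RESTRICTIONS = [
    {
        "on_class": "ex:Periodical",
        "on_property": "ex:hasIssue",
        "all_values_from": "ex:PeriodicalIssue",
    },
]

DATA = {
    ("ex:journal1", RDF_TYPE, "ex:Periodical"),
    ("ex:journal1", "ex:hasIssue", "ex:issue_45_3"),
    ("ex:thing2", "ex:hasIssue", "ex:item9"),  # ex:thing2 is not typed as a periodical
}


def apply_all_values_from(triples, restrictions):
    """Infer the class of property values for members of a restricted class."""
    inferred = set(triples)
    for r in restrictions:
        members = {s for s, p, o in triples if p == RDF_TYPE and o == r["on_class"]}
        for s, p, o in triples:
            if s in members and p == r["on_property"]:
                inferred.add((o, RDF_TYPE, r["all_values_from"]))
    return inferred


entailed = apply_all_values_from(DATA, RESTRICTIONS)
```

The restriction is local to the class: the value attached to `ex:thing2` is left untyped, because `ex:thing2` is not known to be a periodical.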
A property cardinality restriction specifies the number of individuals involved in the restriction and allows the definition of the minimum and maximum number of individuals. The mutual exclusiveness of classes is defined using owl:disjointWith, which declares that the extension of one class shares no members with the extension of another class, thereby forming an axiom.
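Cardinality and disjointness can be sketched in Python as checks over a set of triples. The classes, properties and data are hypothetical. The cardinality check assumes a closed world and distinct names for distinct individuals, which is a simplification: an OWL reasoner, working under the open-world assumption, would not treat a missing value as a violation.

```python
RDF_TYPE = "rdf:type"

# Hypothetical data: ex:book2 has no title and is typed with two classes.
DATA = {
    ("ex:book1", RDF_TYPE, "ex:Book"),
    ("ex:book1", "ex:hasTitle", "A title"),
    ("ex:book2", RDF_TYPE, "ex:Book"),
    ("ex:book2", RDF_TYPE, "ex:Person"),
}


def members_of(triples, cls):
    return {s for s, p, o in triples if p == RDF_TYPE and o == cls}


def cardinality_violations(triples, on_class, on_property, min_count=0, max_count=None):
    """Map each member of on_class to its value count when outside the bounds."""
    violations = {}
    for individual in members_of(triples, on_class):
        count = len({o for s, p, o in triples if s == individual and p == on_property})
        too_few = count < min_count
        too_many = max_count is not None and count > max_count
        if too_few or too_many:
            violations[individual] = count
    return violations


def disjointness_violations(triples, class_a, class_b):
    """Individuals asserted to belong to both of two classes declared disjoint."""
    return members_of(triples, class_a) & members_of(triples, class_b)


title_problems = cardinality_violations(DATA, "ex:Book", "ex:hasTitle", min_count=1, max_count=1)
shared_members = disjointness_violations(DATA, "ex:Book", "ex:Person")
```

If `ex:Book` and `ex:Person` are declared disjoint, the double typing of `ex:book2` is a genuine logical inconsistency, which is the kind of error a reasoner reports.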
The most popular editor for building ontologies in OWL is Protégé. Developed by the Center for Biomedical Informatics Research at the School of Medicine at Stanford University in the 1980s, Protégé is software for creating, developing and maintaining ontologies on the Web and in knowledge bases, and in its latest versions it supports the OWL and OWL 2 specifications (Musen et al. 2015). A Web-based version, WebProtégé, is a free, open-source tool that provides support to develop, discuss, edit and share lightweight domain ontologies. The interface allows the definition of classes, class hierarchies, individuals, object properties, data properties and annotation properties, supporting the OWL ontology language (Fig. 6). In addition to the possibility for users to link to entities provided by successful knowledge bases, such as Schema.org, Wikidata, and DBpedia, an important feature is the collaborative functionality that allows the creation of threaded comments (Horridge et al. 2019). WebProtégé currently hosts around 68,000 OWL ontology projects, and the interface allows users to build complex queries and to visualize the subclasses and relations created in ontologies.
Other prominent ontology editors:
NeOn Toolkit is an open source editor developed by the NeOn Foundation within a project funded by the European Commission’s Sixth Framework Programme. It is suitable for heavyweight projects, such as multi-modular and multi-lingual ontologies and for ontology integration.
Fluent Editor, free for academic researchers and a limited number of others, allows the editing of semantic models and complex ontologies that use controlled natural languages. It is OWL compatible, interoperable with Protégé and supports referencing other ontologies.
OBO Edit, an open-source editor, was developed by Berkeley Bioinformatics and Open Source Projects and is funded by the Gene Ontology Consortium; it is optimized for biological ontologies. At present, it is no longer functioning.
This article has provided a presentation of ontologies, drawing attention to their emergence in the context of computer science and their subsequent interconnection with the artificial intelligence that has sustained automated reasoning in the context of the Semantic Web, to share contents and allow agents to make deductions.
The focus of the article, however, has been on the definition of the relations between ontologies and KOSs, which are schemes for organizing information and making retrieval easier. Although some scholars (e.g. Smith and Welty 2001; McGuinness 2003) have posited a single category of KOSs that includes ontologies, the purpose of the article was to clarify the differences between KOSs and ontologies. Machine-processing ability is not what distinguishes ontological artifacts, as thesauri, being built on logical principles, could also allow computers to process data. The real feature that differentiates ontologies from (other) KOSs is the ability to represent a greater number of semantic relations and, as suggested by Zeng, to offer attributes for each class.
The concept of “formal” ontologies expressed in formal, machine-readable formats using logic languages (first-order logic, description logics) has been discussed from a knowledge engineering perspective and also from the view that prefers to maintain the richness of natural languages (Sowa). The alternative views of considering ontologies grounded on concepts (Gruber) or on real entities (Smith) have been highlighted. Finally, the role that ontologies played in the realization of the Semantic Web in the original vision by Berners-Lee, their relevance as semantic interoperability schemes, and their use in the building of linked open data starting from 2006 have been addressed.
The author thanks the editor-in-chief, Birger Hjørland, for the helpful support and the relevant suggestions, Claudio Gnoli for the careful proof-reading and editing, as well as the two anonymous reviewers for their useful advice and recommendations.
1. “The term ontology has a complex history both in and out of computer science, but we use it to mean a certain kind of computational artifact – i.e., something akin to a program, an XML schema, or a web page – generally presented as a document. An ontology is a set of precise descriptive statements about some part of the world (usually referred to as the domain of interest or the subject matter of the ontology). Precise descriptions satisfy several purposes: most notably, they prevent misunderstandings in human communication, and they ensure that software behaves in a uniform, predictable way and works well with other software”. (OWL 2 Web Ontology Language Primer 2009).
2. A domain is a set of classes such that “any resource that has a given property is an instance of one or more classes” (RDFS, https://www.w3.org/TR/rdf-schema/).
3. A range is a set of classes such that “the values of a property are instances of one or more classes” (RDFS, https://www.w3.org/TR/rdf-schema/).
4. “Properties in OWL 2 are further subdivided. Object properties relate objects to objects (like a person to their spouse), while datatype properties assign data values to objects (like an age to a person)” (OWL 2 Web Ontology Language Primer 2009, § 3).
5. “Material universals […] exist in toto at different places and different times in the different particulars which instantiate them […] For instance, the material universal mountain is instantiated by Mont Blanc in France and by Grossglockner in Austria. The two mountains are numerically distinct entities, but it is the very same universal which exists in these two different places” (Grenon and Smith 2004, 144).
6. The twelve SPAR ontologies: FRBR-aligned Bibliographic Ontology (FaBiO); Citation Typing Ontology (CiTO); Bibliographic Reference Ontology (BiRO); Citation Counting and Context Characterisation Ontology (C4O); Document Components Ontology (DoCO); Publishing Status Ontology (PSO); Publishing Roles Ontology (PRO); Publishing Workflow Ontology (PWO); Scholarly Contributions and Roles Ontology (SCoRO); DataCite Ontology (DataCite); Bibliometric Data Ontology (BiDO); Five Stars of Online Research Articles Ontology (Five) (http://www.sparontologies.net/).
7. They are organized into the following groups: Top Object Property, Has embodiment, Has exemplar, Has subject term, Related endeavor, Has part, Has realization, Is embodiment of, Is exemplar of, Is part of, Is realization of, Is representation of, Is scheme of.
8. They are organized into the following super-properties: Top data properties, Has title, Has identifier, Has date.
9. Among the subclasses of Work: artistic work, critical edition, essay, image, reference work, report, research paper, review, and vocabulary; but also dataset, metadata, and grant application. Among the subclasses of Expression there are abstract, article, book, brief report, chapter, comment, conference poster, index, letter, manuscript, metadata document, movie, periodical issue, proceedings paper, report document, repository, and supplement, as well as web content, computer program, database, and e-mail.
10. “Monotonic reasoning is a term from knowledge representation. A reasoning form is monotonic if an addition to the set of propositions making up the knowledge base never determines a decrement in the set of conclusions that may be derived from the knowledge base via inference rules. In practical terms, if experts enter subsequently correct statements to an information system, the system should not regard any results from those statements as invalid, when a new one is entered” (ICOM-CIDOC 2018, XI).
11. From 2000, Claros (Classical Art Research Online Services) served as a system for searching the archaeological and art collections of Köln, Paris, Basel, Heidelberg and Würzburg and the collections of the museums of Athens; it was managed by the Oxford e-Research Centre (OeRC) and is at present no longer functioning. The German national cultural portal, Die Deutsche Digitale Bibliothek (https://www.deutsche-digitale-bibliothek.de/), has since 2012 allowed users to consult the digital collections of a large number of archives and scientific institutions, serving as a conveyor of content to Europeana Collections; it harvests digital resources using the schema LIDO (http://network.icom.museum/cidoc/arbetsgrupper/lido/L/11/), based on CIDOC-CRM, and provides a single point of access with homogeneous functionalities for searching. Arches (https://www.archesproject.org), an open-source software developed by the Getty Conservation Institute and the World Monuments Fund, provides a GIS-based information system for the tangible cultural heritage, which allows the mapping of information about archaeological sites, historical buildings and areas of cultural relevance. The CIDOC-CRM ontology has been adopted for the data architecture, in particular with the aim of modelling the relationships among entities. In archaeology, CIDOC-CRM has also been used in the infrastructure Ariadne (https://ariadne-infrastructure.eu/), funded by the European Commission within the 7th Framework Programme.
Almeida, Mauricio, Renato Souza and Fred Fonseca. 2011. “Semantics in the Semantic Web: A Critical Evaluation”. Knowledge Organization 38, no. 3: 187-203.
Baader, Franz, Ian Horrocks and Ulrike Sattler. 2009. “Description Logics”. In Handbook on Ontologies. 2nd ed., eds. Steffen Staab and Rudi Studer. Berlin: Springer, 21-43.
Berners-Lee, Tim, James Hendler and Ora Lassila. 2001. “The Semantic Web”. Scientific American May: 34-43.
Biagetti, Maria Teresa. 2018. “A Comparative Analysis and Evaluation of Bibliographic Ontologies”. In Challenges and Opportunities for Knowledge Organization in the Digital Age: Proceedings of the fifteenth International ISKO conference, Porto, July 9-11 2018, eds. Fernanda Ribeiro and Maria Elisa Cerveira. Baden-Baden: Ergon, 501-05.
Blumauer, Andreas and Tassilo Pellegrini. 2006. “Semantic Web und semantische Technologien: Zentrale Begriffe und Unterscheidungen”. In Semantic Web, eds. Tassilo Pellegrini and Andreas Blumauer. Berlin: Springer, 9-29. https://doi.org/10.1007/3-540-29325-6_2.
Buckland, Michael. 2018. “Document theory”. Knowledge Organization 45, no. 5: 425-36. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/document.
Buckner, Cameron, Mathias Niepert and Colin Allen. 2011. “From Encyclopedia to Ontology: Toward Dynamic Representation of the Discipline of Philosophy”. Synthese 182: 205-33.
CNR-ISTC. 2001. DOLCE - Descriptive Ontology for Linguistic and Cognitive Engineering, Laboratory for Applied Ontology, Istituto di Scienze e Tecnologie della Cognizione of the CNR, Trento (Italy). Ed. Nicola Guarino. The abridged version: http://www.loa.istc.cnr.it/ontologies/DUL.owl.
Costantini, Stefania and Arianna Tocchio. 2002. “A Logic Programming Language for Multi-agent Systems”. In Logics in Artificial Intelligence. Proceedings of the European Conference, JELIA 2002, Cosenza, Italy, September, 23-26, 2002, eds. Sergio Flesca, Sergio Greco, Nicola Leone and Giovambattista Ianni. Berlin, Heidelberg, Springer: 1-13.
Dextre Clarke, Stella G. 2016. “Origins and Trajectory of the Long Thesaurus Debate”. Special Issue “This House Believes that the Traditional Thesaurus has no Place in Modern Information Retrieval”. Knowledge Organization 43, no. 3: 138-144.
Dextre Clarke, Stella G. 2019. “Thesaurus (for Information Retrieval)”. Available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/thesaurus.
Dextre Clarke, Stella G. and Marcia Lei Zeng. 2012. “From ISO 2788 to ISO 25964: The Evolution of Thesaurus Standards towards Interoperability and Data Modeling”. Information Standards Quarterly 24, no. 1: 20-6.
Doerr, Martin. 2003. “The CIDOC Conceptual Reference Module: An Ontological Approach to Semantic Interoperability of Metadata”. AI Magazine 24, no. 3: 75–92.
Fensel, Dieter, Jim Hendler, Henry Lieberman, and Wolfgang Wahlster (eds.). 2003. Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. Cambridge: MIT Press.
Fritzsche, Donna, Michael Grüninger, Ken Baclawski, et al. 2017. “Ontology Summit 2016 Communiqué: Ontologies Within Semantic Interoperability Ecosystems”. Applied Ontology 12: 91-111.
Frosterus, Matias and Eero Hyvönen. 2009. “Bridging the Search Gap Between the Web of Pages and Web of Data by Combining Ontological Document Expansion with Text Search”. In ICSD International Conference for Digital Libraries and the Semantic Web: Proceedings: Trento, September: 90-104 https://seco.cs.aalto.fi/publications/2009/frosterus-hyvonen-airo-icsd-2009.pdf.
Gangemi, Aldo, Nicola Guarino, Claudio Masolo et al. 2002. “Sweetening ontologies with DOLCE”. In Proceedings of the 13th European Conference on Knowledge Engineering and Knowledge Management 2473: 166-181. DOI: 10.1007/3-540-45810-7_18.
The Gene Ontology Handbook. 2017. Eds. Christophe Dessimoz and Nives Škunca. Methods in Molecular Biology, vol 1446. New York, Humana Press (Springer Nature), https://doi.org/10.1007/978-1-4939-3743-1_21.
Genesereth, Michael R., Richard E. Fikes, Daniel Bobrow, et al. 1992. “Knowledge Interchange Format”. Version 3.0 Reference Manual. Logic Group Computer Science Department, Stanford, Stanford University, https://www.cs.auckland.ac.nz/courses/compsci367s2c/resources/kif.pdf.
Giunchiglia, Fausto, Maurizio Marchese and Ilya Zaihrayeu. 2006. “Encoding Classifications into Lightweight Ontologies”. In The Semantic Web: Research and Applications, Proceedings of the 3rd European Semantic Web Conference, ESWC 2006, Budva, Montenegro, June 11-14. Eds. York Sure and John Domingue, 80-94, http://www.science.unitn.it/~marchese/pdf/P4_eswc06_Encoding.pdf.
Green, Rebecca. 2001. “Relationships in the Organization of Knowledge: an Overview”. In Relationships in the Organization of Knowledge, eds. Carol A. Bean and Rebecca Green. Dordrecht etc.: Kluwer, 3-18.
Grenon, Pierre and Barry Smith. 2004. “SNAP and SPAN: Towards Dynamic Spatial Ontology”. Spatial Cognition and Computation 4, no. 1: 69-103.
Gruber, Thomas R. 1993. “A Translation Approach to Portable Ontology Specifications”. Knowledge Acquisition 5, no. 2: 199-220.
Guarino, Nicola. 2006. “Ontology and Terminology: how can formal ontology help concept modeling and terminology?” in EAFTNordTerm on Terminology, Concept Modeling and Ontology, Vaasa, February 10th, 2006, slide no 13.
Guarino, Nicola and Pierdaniele Giaretta. 1995. “Ontologies and Knowledge Bases. Toward a terminological clarification”. In Towards Very Large Knowledge Bases: Knowledge Building and Knowledge Sharing, ed. Nicolaas J. I. Mars. Amsterdam: IOS Press: 25-32.
Guarino, Nicola and Roberto Poli. 1995. “Editorial: The Role of Formal Ontology in the Information Technology”. International Journal of Human-Computer Studies 43: 623-624.
Guarino, Nicola, Daniel Oberle and Steffen Staab. 2009. “What is an Ontology?” In Handbook on Ontologies, 2nd ed., eds. Steffen Staab and Rudi Studer. Dordrecht etc.: Springer, 1-17.
Hjørland, Birger. 2007. “Semantics and Knowledge Organization”. Annual Review of Information Science and Technology 41: 367-405.
Hjørland, Birger. 2015a. “Are Relations in Thesauri ‘Context-Free, Definitional, and True in All Possible Worlds’?” Journal of the Association for Information Science and Technology 66, no. 7: 1367-73.
Hjørland, Birger. 2015b. “Theories are knowledge organizing systems (KOS)”. Knowledge Organization 42, no. 2: 113-28.
Hjørland, Birger. 2016a. “Knowledge Organization”. Knowledge Organization 43, no. 6: 475-84. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/knowledge_organization.
Hjørland, Birger. 2016b. “Does the Traditional Thesaurus Have a Place in Modern Information Retrieval?” Knowledge Organization 43, no. 3: 145-59.
Hjørland, Birger. 2017. “Classification”. Knowledge Organization 44, no. 2: 97-128. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/classification.
Hjørland, Birger. 2018a. “Library and Information Science (LIS). Part 1”. Knowledge Organization 45, no. 3: 232-54. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/lis.
Hjørland, Birger. 2018b. “Library and Information Science (LIS). Part 2”. Knowledge Organization 45, no. 4: 319-38. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/lis.
Hjørland, Birger. 2018c. “Data (with big data and database semantics)”. Knowledge Organization 45, no. 8: 685-708. Also available in ISKO Encyclopedia of Knowledge Organization, ed. Birger Hjørland, coed. Claudio Gnoli, https://www.isko.org/cyclo/data.
Hodge, Gail. 2000. Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. Washington: The Digital Library Federation, http://www.clir.org/pubs/abstract/pub91abst.html.
Horridge, Matthew, Rafael S. Gonçalves, Csongor I. Nyulas et al. 2019. “WebProtégé: A Cloud-Based Ontology Editor”. In Companion Proceedings of the 2019 World Wide Web Conference (WWW’19 ACM, New York, NY, Companion), May 13–17, San Francisco, CA, USA, 686-89. https://doi.org/10.1145/3308560.3317707.
Hoy, Matthew B. 2018. “Alexa, Siri, Cortana, and More: An Introduction to Voice Assistants”. Medical Reference Services Quarterly 37, no. 1: 81-8, DOI: 10.1080/02763869.2018.1404391.
IFLA. 1998. Functional requirements for bibliographic records: final report. München: K. G. Saur.
IFLA. 2009. Functional requirements for bibliographic records: final report. Approved by the Standing Committee … as amended and corrected through February 2009.
IFLA. 2016. Definition of FRBRoo: A Conceptual Model for Bibliographic Information in Object-oriented Formalism. Version 2.4. Prepared by Working Group on FRBR/CRM Dialogue, eds. Chryssoula Bekiari, Martin Doerr, Patrick Le Boeuf, Pat Riva, https://www.ifla.org/files/assets/cataloguing/FRBRoo/frbroo_v_2.4.pdf.
IFLA. 2017. Library Reference Model. Consolidation Editorial Group of the IFLA FRBR Review Group …, eds. Pat Riva, Patrick Le Boeuf, and Maja Žumer, revised after world-wide review, not yet endorsed by the IFLA Professional Committee or Governing Board.
Johnson-Laird, P. N., Sangeet S. Khemlani and Geoffrey P. Goodwin. 2015. “Logic, Probability, and Human Reasoning”. Trends in Cognitive Sciences 19, no. 4: 201-14.
Kalibatiene, Diana and Olegas Vasilecas. 2011. “Survey on Ontology Languages”. In Perspectives in Business Informatics Research: BIR 2011. Lecture Notes in Business Information Processing, vol. 90. Eds. J. Grabis and M. Kirikova. Berlin, Heidelberg: Springer, 124-41. https://doi.org/10.1007/978-3-642-24511-4_10.
Koons, Robert. 2014. “Defeasible Reasoning”. In The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. Spring Edition.
Legg, Catherine. 2007. “Ontologies on the Semantic Web”. Annual Review of Information Science and Technology 41: 407-51.
Mazzocchi, Fulvio. 2018. “Knowledge Organization Systems”. Knowledge Organization 45, no.1: 54-78. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/kos.htm.
McGuinness, Deborah L. 2003. “Ontologies Come of Age”. In Spinning the Semantic Web. Bringing the World Wide Web to Its Full Potential, eds. Dieter Fensel, James Hendler, Henry Lieberman and Wolfgang Wahlster. Cambridge, MA: MIT Press, 171–96. https://www.researchgate.net/publication/221024668_Ontologies_Come_of_Age.
Miller, George A. and Christiane Fellbaum. 2007. “WordNet then and now”. Language Resources and Evaluation 41: 209-14.
Musen, Mark A. and the Protégé Team. 2015. “The Protégé Project: A Look Back and a Look Forward”. AI Matters 1, no. 4 (June): 4–12. doi: 10.1145/2757001.2757003.
Nardi, Daniele and Ronald J. Brachman. 2003. “An Introduction to Description Logics”. In The Description Logic Handbook: Theory, Implementation, Applications, eds. Franz Baader, Diego Calvanese, Deborah L. McGuinness et al. Cambridge: Cambridge University Press, 1-40.
Niles, Ian and Adam Pease. 2001. “Towards a Standard Upper Ontology”. Proceedings of the 2nd International Conference on Formal Ontology in Information Systems (FOIS-2001), eds. Chris Welty and Barry Smith. Ogunquit, Maine, October 17-19, 2001. http://home.earthlink.net/~adampease/professional/FOIS.pdf.
Nurmikko-Fuller, Terhi, Colleen Fallaw, Jacob Jett et al. 2015. Bibliographic Ontologies Comparative Features Dataset. Champaign, IL: University of Illinois, http://hdl.handle.net/2142/88356.
Nurmikko-Fuller, Terhi, Jacob Jett, Timothy W. Cole et al. 2016. “A Comparative Analysis of Bibliographic Ontologies: Implications for Digital Humanities”. In Digital Humanities 2016: Conference Abstracts. Kraków: Jagiellonian University and Pedagogical University, 639-42.
Panzer, Michael and Marcia L. Zeng. 2009. “Modeling Classification Systems in SKOS: Some Challenges and Best-Practice Recommendations”. Proceedings DCMI International Conference on Dublin Core and Metadata Applications, Seoul, 12-16 October 2009, 3-14.
Peroni, Silvio and David Shotton. 2012. “FaBiO and CiTO: Ontologies for Describing Bibliographic Resources and Citations”. Web Semantics: Science, Services and Agents on the WWW 17: 33–43.
Peroni, Silvio, David Shotton and Fabio Vitali. 2012. “Scholarly Publishing and Linked Data: Describing Roles, Statuses, Temporal and Contextual Extents”. In I-SEMANTICS 2012: Proceedings of the 8th International Conference on Semantic Systems. Held at Graz, Austria, 5-7 September 2012.
Poli, Roberto. 1996. “Ontology for Knowledge Organization”. In Knowledge organization and change: Proceedings of the 4th International ISKO Conference, 15-18 July 1996, Washington, DC. Ed. Rebecca Green. Frankfurt: INDEKS Verlag, 313-19.
Poli, Roberto and Leo Obrst. 2010. “The Interplay Between Ontology as Categorial Analysis and Ontology as Technology”. In Theory and Applications of Ontology: Computer Applications, eds. Roberto Poli, Michael Healy and Achilles Kameas. Dordrecht: Springer, 1-26. doi:10.1007/978-90-481-8847-5_1.
PRESSoo. 2016. Extension of CIDOC CRM and FRBRoo for the Modelling of Bibliographic Information Pertaining to Continuing Resources. Version 1.2, January 2016. Approved by CIDOC CRM-SIG, ed. Patrick Le Boeuf.
Sini, Margherita, Boris Lauser, Gauri Salokhe, Johannes Keizer and Stephen Katz. 2008. “The AGROVOC Concept Server: rationale, goals and usage”. Library Review 57, no. 3: 200-12.
Smiraglia, Richard P. 2019. “Work”. Knowledge Organization 46, no. 4: 308-19. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/work.
Smith, Barry. 2003. “Ontology”. In Blackwell Guide to the Philosophy of Computing and Information. Ed. Luciano Floridi. Oxford: Blackwell, 155-166. https://philpapers.org/archive/SMIO-11.pdf.
Smith, Barry. 2004. “Beyond Concepts: Ontology as Reality Representation”. Proceedings of FOIS 2004: International Conference on Formal Ontology and Information Systems, Turin, 4-6 November 2004, eds. Achille Varzi and Laure Vieu, 73-84. http://ontology.buffalo.edu/bfo/BeyondConcepts.pdf
Smith, Barry and Christopher Welty. 2001. “Ontology: Towards a New Synthesis”. FOIS’01, October 17-19, Ogunquit, Maine, USA: 3-9. https://philpapers.org/archive/SMIOTA-9.pdf.
Smith, Barry and Werner Ceusters. 2010. “Ontological Realism: A Methodology for Coordinated Evolution of Scientific Ontologies”. Applied Ontology 5, nos. 3-4: 139-88.
Soergel, Dagobert. 1999. “The Rise of Ontologies or the Reinvention of Classification”. Journal of The American Society for Information Science 50, no. 12: 1119-1120.
Soergel, Dagobert. 2009. “Digital Libraries and the Semantic Web: A conceptual Framework and an Agenda for Research and Practice”. Invited paper. International Conference for Digital Libraries and the Semantic Web (ICSD) Trento, 10-11 September 2009.
Soergel, Dagobert, Boris Lauser, Anita Liang et al. 2004. “Reengineering Thesauri for New Applications: the AGROVOC Example”. Journal of Digital Information 4, no. 4.
Souza, Renato R., Douglas Tudhope and Mauricio B. Almeida. 2010. “The KOS spectra: a tentative faceted typology of knowledge organization systems”. Proceedings 11th International Conference of the International Society for Knowledge Organization, Rome. Würzburg: Ergon, 122-28.
Souza, Renato R., Douglas Tudhope and Mauricio B. Almeida. 2012. “Towards a taxonomy of KOS: Dimensions for classifying Knowledge Organization Systems”. Knowledge Organization 39: 179–192.
Sowa, John F. 1995. “Top-level ontological categories”. International Journal of Human-Computer Studies 43: 669-85.
Sowa, John F. 2006. “A dynamic theory of ontology”. Formal Ontology in Information Systems: Proceedings of the Fourth International Conference (FOIS 2006), eds. B. Bennett and C. Fellbaum. Amsterdam: IOS Press, 204-13.
Studer, Rudi, V. Richard Benjamins and Dieter Fensel. 1998. “Knowledge Engineering: Principles and Methods”. Data & Knowledge Engineering 25, nos. 1-2: 161-98.
Tillett, Barbara B. 1989. “Bibliographic Structures: the Evolution of Catalog Entries, References, and Tracings”. In The Conceptual Foundations of Descriptive Cataloging, ed. Elaine Svenonius. San Diego: Academic Press Inc., 149-65.
Tudhope, Douglas and Ceri Binding. 2016. “Still Quite Popular After all Those Years: The Continued Relevance of the Information Retrieval Thesaurus”. Knowledge Organization 43, no. 3: 174-9.
WordNet: an Electronic Lexical Database. 1998. Ed. Christiane Fellbaum. Cambridge (Mass), London: MIT Press.
Zeng, Marcia Lei. 2019. “Interoperability”. Knowledge Organization 46, no. 2: 122-46. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/interoperability.
Zhu, Hongwei and Stuart E. Madnick. 2006. A Lightweight Ontology Approach to Scalable Interoperability. MIT Sloan Working Paper 4621-06. Working Paper CISL – Composite Information Systems Laboratory – June.
This article (version 1.0) is also published in Knowledge Organization. How to cite it:
Biagetti, Maria Teresa. 2021. “Ontologies as knowledge organization systems”. Knowledge Organization 48, no. 2: 152-176. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/ontologies