.. raw:: html

   <style>.center {margin-left:20%}</style>

The Structure of a Wordnet
==========================
A **wordnet** is an online lexicon that is organized by concepts.
The basic unit of a wordnet is the synonym set (**synset**), a group of words that all refer to the
same concept. Words and synsets are linked by conceptual-semantic relations, and these links form the
structure of the wordnet.
Words, Senses, and Synsets
--------------------------
**Words** are the basic building blocks of a language. A word has two parts, its form and its
meaning, but in natural languages word forms and word meanings do not match one-to-one: a single
word form may carry many different meanings. We therefore need **senses** as the unit of word
meaning. For example, the word *bank* has at least two senses:
1. bank\ :sup:`1`\: financial institution, like *City Bank*;
2. bank\ :sup:`2`\: sloping land, like *river bank*;
Since **synsets** are groups of words sharing the same concept, bank\ :sup:`1` and bank\ :sup:`2` are members of
two different synsets, although they have the same word form.
Conversely, different word forms may convey the same concept, such as *cab* and *taxi*.
Word forms that share a concept are grouped together into one synset.
.. raw:: html
   :file: images/word-sense-synset.svg

.. role:: center
   :class: center

:center:`Figure: relations between words, senses and synsets`
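The many-to-many relationship between word forms and concepts can be sketched in a few lines of Python. This is only an illustrative model, not the data format of any wordnet: the ``Sense`` class, the ``syn-...`` identifiers, and the gloss for the *taxi* synset are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One meaning of one word form, tied to exactly one synset."""
    word: str       # the word form, e.g. "bank"
    synset_id: str  # hypothetical identifier of the concept it expresses


# Hypothetical synset identifiers mapped to short glosses.
# The first two glosses come from the text; the third is illustrative.
SYNSETS = {
    "syn-bank-institution": "financial institution",
    "syn-bank-slope": "sloping land",
    "syn-taxi": "car for hire",
}

SENSES = [
    Sense("bank", "syn-bank-institution"),
    Sense("bank", "syn-bank-slope"),
    Sense("cab", "syn-taxi"),
    Sense("taxi", "syn-taxi"),
]


def senses_of(word):
    """All senses of a word form: one form, possibly many meanings."""
    return [s for s in SENSES if s.word == word]


def members_of(synset_id):
    """All word forms in a synset: one meaning, possibly many forms."""
    return [s.word for s in SENSES if s.synset_id == synset_id]


print([s.synset_id for s in senses_of("bank")])
print(members_of("syn-taxi"))
```

Here *bank* resolves to two senses in two different synsets, while *cab* and *taxi* resolve to the same synset.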
Synset Relations
----------------
In a wordnet, synsets are linked to each other by various kinds of relations. For example, if
the concept expressed by one synset is more general than the concept expressed by a given synset, then the first is a
*hypernym* of the given synset. As shown in the figure below, the synset with *car*, *auto* and *automobile* as its
members is the *hypernym* of the synset with *cab*, *taxi* and *hack*. Relations like this, which hold at
the synset level, are categorized as synset relations.
.. raw:: html
   :file: images/synset-synset.svg

:center:`Figure: example of synset relations`
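Because a synset relation is stored once on the synset, every member word inherits it. The following minimal sketch illustrates this with the *car* and *cab* synsets from the text; the ``syn-...`` identifiers and the function names are hypothetical.

```python
# Synsets keyed by hypothetical identifiers; values are the member word forms.
SYNSETS = {
    "syn-car": ["car", "auto", "automobile"],
    "syn-cab": ["cab", "taxi", "hack"],
}

# (source synset, relation type) -> list of target synsets.
SYNSET_RELATIONS = {
    ("syn-cab", "hypernym"): ["syn-car"],
}


def related_synsets(synset_id, rel_type):
    """Target synsets of one relation type for a given synset."""
    return SYNSET_RELATIONS.get((synset_id, rel_type), [])


def hypernym_words(word):
    """Word forms in the hypernym synsets of every synset containing `word`."""
    result = []
    for synset_id, members in SYNSETS.items():
        if word in members:
            for target in related_synsets(synset_id, "hypernym"):
                result.extend(SYNSETS[target])
    return result


print(hypernym_words("taxi"))
print(hypernym_words("hack"))
```

Both lookups give the same answer, since the relation belongs to the synset and not to an individual word.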
Sense Relations
---------------
Some relations in a wordnet hold at the sense level. They fall into two types:
relations that link a sense with another sense, and relations that link a sense with a synset.
.. note:: In a wordnet, a synset relation and a sense relation can share the same
   relation type, such as `domain topic <https://globalwordnet.github.io/gwadoc/#domain_topic>`_.
**Sense-Sense**
Sense-to-sense relations capture connections between individual senses, especially between
morphologically related words. For example, *behavioral* is the adjective form of the noun *behavior*,
so it is in the *pertainym* relation with *behavior*. No such relation exists between
*behavioral* and *conduct*, even though *conduct* is a synonym of *behavior* and belongs to the same synset. Here *pertainym*
is a sense-sense relation.
.. raw:: html
   :file: images/sense-sense.svg

:center:`Figure: example of sense-sense relations`
**Sense-Synset**
Sense-synset relations connect a particular sense with a synset. For example, *cursor* is a term in the
*computer science* discipline, so in a wordnet it is in the *has domain topic* relation with the
*computer science* synset. *Pointer*, which is in the same synset as *cursor*, is not such a term and thus
has no such relation with the *computer science* synset.
.. raw:: html
   :file: images/sense-synset.svg

:center:`Figure: example of sense-synset relations`
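The key property of both kinds of sense relation is that they attach to one sense only and are not shared by the other members of its synset. The sketch below models this with the two examples above. Representing a sense as a ``(word form, synset id)`` pair and the ``syn-...`` identifiers are assumptions made for illustration.

```python
# A sense is written as (word form, synset id); all ids are hypothetical.
BEHAVIOR = ("behavior", "syn-behavior")
CONDUCT = ("conduct", "syn-behavior")        # same synset as BEHAVIOR
BEHAVIORAL = ("behavioral", "syn-behavioral")
CURSOR = ("cursor", "syn-cursor")
POINTER = ("pointer", "syn-cursor")          # same synset as CURSOR

# (source sense, relation type) -> list of targets.
# For a sense-sense relation the targets are senses;
# for a sense-synset relation the targets are synset ids.
SENSE_RELATIONS = {
    (BEHAVIORAL, "pertainym"): [BEHAVIOR],
    (CURSOR, "domain_topic"): ["syn-computer-science"],
}


def targets(sense, rel_type):
    """Targets of one relation type for exactly this sense."""
    return SENSE_RELATIONS.get((sense, rel_type), [])


# Sense-sense: holds for "behavior" but not for its synonym "conduct".
print(BEHAVIOR in targets(BEHAVIORAL, "pertainym"))
print(CONDUCT in targets(BEHAVIORAL, "pertainym"))

# Sense-synset: holds for "cursor" but not for "pointer".
print(targets(CURSOR, "domain_topic"))
print(targets(POINTER, "domain_topic"))
```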
Other Information
-----------------
A wordnet should be distributed in a standard format. Two schemas are accepted:
* XML schema based on the Lexical Markup Framework (LMF)
* JSON-LD using the Lexicon Model for Ontologies
A wordnet should also contain the following information:
**Definition**
A definition describes the meaning of a sense or synset in a wordnet. It is given in the language
of the wordnet it comes from.
**Example**
An example clarifies a sense or synset in a wordnet by showing it in use, which helps users
understand the definition.
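To show where a definition and an example sit in the XML form, the following sketch builds a tiny LMF-style fragment with Python's standard ``xml.etree.ElementTree`` module. The element and attribute names follow my understanding of the WN-LMF format and should be checked against the official schema; the identifiers and the definition and example texts are invented for illustration. The fragment also omits attributes that a real, valid file needs.

```python
import xml.etree.ElementTree as ET

# Root and lexicon; all ids and labels are hypothetical.
resource = ET.Element("LexicalResource")
lexicon = ET.SubElement(
    resource, "Lexicon",
    id="example-en", label="Example Wordnet", language="en", version="1.0",
)

# A word (lexical entry) with one sense that points at a synset.
entry = ET.SubElement(lexicon, "LexicalEntry", id="example-en-taxi-n")
ET.SubElement(entry, "Lemma", writtenForm="taxi", partOfSpeech="n")
ET.SubElement(entry, "Sense", id="example-en-taxi-n-1", synset="example-en-0001-n")

# The synset carries the definition and the example.
synset = ET.SubElement(lexicon, "Synset", id="example-en-0001-n", partOfSpeech="n")
# Illustrative texts, not taken from any published wordnet.
ET.SubElement(synset, "Definition").text = "a car that carries passengers for a fare"
ET.SubElement(synset, "Example").text = "she hailed a taxi"

xml_text = ET.tostring(resource, encoding="unicode")
print(xml_text)
```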
**Metadata**
A wordnet has its own metadata, based on the `Dublin Core <https://dublincore.org/>`_, that states
basic information about it. The table below lists all the items in the metadata of a wordnet:
+------------------+-----------+-----------+
| contributor      | Optional  | str       |
+------------------+-----------+-----------+
| coverage         | Optional  | str       |
+------------------+-----------+-----------+
| creator          | Optional  | str       |
+------------------+-----------+-----------+
| date             | Optional  | str       |
+------------------+-----------+-----------+
| description      | Optional  | str       |
+------------------+-----------+-----------+
| format           | Optional  | str       |
+------------------+-----------+-----------+
| identifier       | Optional  | str       |
+------------------+-----------+-----------+
| publisher        | Optional  | str       |
+------------------+-----------+-----------+
| relation         | Optional  | str       |
+------------------+-----------+-----------+
| rights           | Optional  | str       |
+------------------+-----------+-----------+
| source           | Optional  | str       |
+------------------+-----------+-----------+
| subject          | Optional  | str       |
+------------------+-----------+-----------+
| title            | Optional  | str       |
+------------------+-----------+-----------+
| type             | Optional  | str       |
+------------------+-----------+-----------+
| status           | Optional  | str       |
+------------------+-----------+-----------+
| note             | Optional  | str       |
+------------------+-----------+-----------+
| confidence       | Optional  | float     |
+------------------+-----------+-----------+
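
As a rough illustration, the metadata items above map naturally onto a record in which every field is optional. The ``Metadata`` class and its ``as_dict`` method below are hypothetical names for this sketch and do not describe the API of any particular library. Only the field names and types come from the table.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Metadata:
    """Hypothetical container for the metadata items listed in the table."""
    contributor: Optional[str] = None
    coverage: Optional[str] = None
    creator: Optional[str] = None
    date: Optional[str] = None
    description: Optional[str] = None
    format: Optional[str] = None
    identifier: Optional[str] = None
    publisher: Optional[str] = None
    relation: Optional[str] = None
    rights: Optional[str] = None
    source: Optional[str] = None
    subject: Optional[str] = None
    title: Optional[str] = None
    type: Optional[str] = None
    status: Optional[str] = None
    note: Optional[str] = None
    confidence: Optional[float] = None

    def __post_init__(self):
        # The table types confidence as a float, so coerce values such as "0.9".
        if self.confidence is not None:
            self.confidence = float(self.confidence)

    def as_dict(self):
        """Return only the items that were actually provided."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


meta = Metadata(title="Example Wordnet", confidence="0.9")
print(meta.as_dict())
```

Since every item is optional, an empty ``Metadata()`` is valid and ``as_dict`` returns an empty dictionary for it.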