WordNet (PWN) / WordnetPlus (WNP) Dictionary

Introduction

The WordNet Encyclopedic Dictionary (PWN – Princeton WordNet) is one of the most influential innovations in lexicography. Conceived in mid-1965 and developed since 1985 at the Cognitive Science Laboratory of Princeton University, a center for computational linguistics and cognitive psychology, it has become an international reference for the construction of similar works in dozens of languages.

Until then, dictionaries defined the words of a language, explained their different meanings and, essentially, indicated their relationships with other words in terms of synonymy, antonymy and even analogy.

However, these dictionaries left an important use case unresolved: the semantic (ontological) relationships between words. As a simple example, suppose you wanted to find the word for a “car or automobile” that takes patients to hospitals. The word “ambulance” appears in common dictionaries, but not in the definition of “car”. In WordNet you would discover it immediately, along with much more, as will be seen later, because “ambulance” is linked to “car” by a semantic relation.

In summary, in addition to functioning as an excellent traditional dictionary with definitions, examples and synonyms, WordNet identifies the lexical-semantic relationships between the concepts expressed by sets of synonymous words.

However, it covers only words in the grammatical categories Noun, Adjective, Verb and Adverb, as these are the ones that take part in the semantic relationships described above.

The WordnetPlus Dictionary (WNP) adds the following:

  • All other parts of speech:
    • Prepositions, Pronouns, Articles
    • Conjunctions and their types (coordinating, correlative … )
    • Interjections and their usage categorizations (doubt, realization, greeting …)
  • Prefixes:
    • Definition (e.g. acr- => sharp, strong). Example: (acrimonious: marked by strong resentment or cynicism)
    • Semantic relationship of the prefix with the words that define it (e.g. acr- relates to the following definition of strong: not faint or feeble).
  • Suffixes:
    • Definition (e.g. -able => able to be done; liable). Example: (movable: able to be moved)
    • Grammatical category formation rule (e.g. verb => adj as in move => movable)
    • Semantic relationship of the suffix with the words that define it (e.g. -able relates to the following definition of able: having the necessary means or skill or know-how or authority to do something).
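
The prefix and suffix entries described above can be pictured as small records. The sketch below is only an illustration in Python, not the actual WordnetPlus schema: the `Affix` class, its field names and the `resulting_category` helper are assumptions made for this example. Only the sample values come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Affix:
    """Hypothetical record for a prefix or suffix entry (not the real WNP schema)."""
    form: str                               # e.g. "acr-" or "-able"
    definition: str                         # meaning of the affix
    example: str                            # a word built with the affix
    related_sense: str                      # definition the affix is related to
    source_category: Optional[str] = None   # suffixes only: category before the suffix
    target_category: Optional[str] = None   # suffixes only: category after the suffix


ACR = Affix(
    form="acr-",
    definition="sharp, strong",
    example="acrimonious",
    related_sense="strong: not faint or feeble",
)

ABLE = Affix(
    form="-able",
    definition="able to be done; liable",
    example="movable",
    related_sense=(
        "able: having the necessary means or skill or know-how "
        "or authority to do something"
    ),
    source_category="verb",
    target_category="adj",
)


def resulting_category(affix: Affix, category: str) -> Optional[str]:
    """Return the category produced by the formation rule, or None if it does not apply."""
    if affix.source_category == category:
        return affix.target_category
    return None


print(resulting_category(ABLE, "verb"))  # adj
print(resulting_category(ACR, "verb"))   # None
```

Keeping the formation rule as a pair of categories means a prefix, which has no such rule in the list above, simply leaves both fields empty.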

Some Numbers

The database contains around:

  • 160,000 entries (lemmas), including compounds, collocations, phrasal verbs and idiomatic phrases.
  • 180,000 synsets (synonym sets).
  • 210,000 concept definitions.

Organization of the Database

  • Words are the basic unit.
  • Words are classified by grammatical category.
  • Synsets: groups of synonyms that share a single definition and examples.
  • Lexical relations between words from different synsets.
  • Semantic relations between synsets, such as the “kind of” relation.
  • Knowledge structure: the synsets of nouns and verbs are organized hierarchically by their semantic relations of generalization (hypernym) and specialization (hyponym). The structure can be traversed in both directions. Example for the synset containing “car”:
    -entity (synset at the top of the hierarchy)
    --physical entity
    ---physical object
    ----whole
    -----artifact
    ------instrumentality
    -------transport
    --------vehicle
    ---------wheeled vehicle
    ----------self-propelled vehicle
    -----------motor vehicle …
    ------------car, auto
    -------------ambulance
    --------------funny wagon (synset at the end of the hierarchy)
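
The two-way traversal of this hierarchy can be sketched in a few lines of Python. This is a toy model, not WordNet's storage format: each synset is reduced to a single label, the chain is transcribed from the listing above exactly as printed, and the names `CHAIN`, `HYPERNYM`, `HYPONYMS`, `path_to_root` and `descendants` are assumptions made for this example.

```python
from typing import Dict, List

# Chain transcribed from the listing above, most general synset first.
CHAIN = [
    "entity", "physical entity", "physical object", "whole", "artifact",
    "instrumentality", "transport", "vehicle", "wheeled vehicle",
    "self-propelled vehicle", "motor vehicle", "car", "ambulance",
    "funny wagon",
]

# child -> parent links (generalization, hypernym)
HYPERNYM: Dict[str, str] = {
    child: parent for parent, child in zip(CHAIN, CHAIN[1:])
}

# parent -> children links (specialization, hyponym), derived by inversion
HYPONYMS: Dict[str, List[str]] = {}
for child, parent in HYPERNYM.items():
    HYPONYMS.setdefault(parent, []).append(child)


def path_to_root(synset: str) -> List[str]:
    """Walk upward from a synset to the top of the hierarchy."""
    path = [synset]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def descendants(synset: str) -> List[str]:
    """Walk downward and collect every more specific synset."""
    found: List[str] = []
    stack = list(HYPONYMS.get(synset, []))
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(HYPONYMS.get(current, []))
    return found


print(path_to_root("car")[:3])  # ['car', 'motor vehicle', 'self-propelled vehicle']
print(descendants("car"))       # ['ambulance', 'funny wagon']
```

Storing only the upward links and deriving the downward ones keeps the two directions consistent by construction.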

Semantic Relations

The semantic relations in WordNet follow the notion of ontology as used in computer science, that is, the formal representation of categories, properties and relations between concepts within a given domain of knowledge.

Here, the domain is the set of words in the language; the categories and properties refer to their lexical aspects; the relations were defined by the scientists who built WordNet and are summarized below.

  • Words are grouped into synsets; the words in a synset are synonymous with each other, that is, they share a single definition.
  • Semantic relations occur between synsets.
  • Two relations hierarchize the synsets of the grammatical categories:
    • Broader: a given synset leads to another that is hierarchically higher, that is, more general (e.g. tree -> forest; aspirate -> remove).
    • Narrower: a given synset leads to another that is hierarchically lower, that is, more specific (e.g. forest -> tree; remove -> aspirate).
  • Other non-hierarchical relations are defined for all grammatical categories.
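
The grouping of words into synsets can be illustrated with a minimal Python sketch. It is a hypothetical model, not the WordNet or LexSemantic data format: the `Synset` class, `build_index` and `INDEX` are names invented for this example. The words and the two definitions of taste are taken from examples later in this text; where the text quotes no definition, a placeholder string is used.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Synset:
    """Hypothetical synset record: one definition shared by all of its words."""
    category: str            # noun, verb, adjective or adverb
    words: Tuple[str, ...]   # the synonyms grouped in this synset
    definition: str


NOT_QUOTED = "(definition not quoted in this text)"

SYNSETS = [
    Synset("verb", ("taste",), "distinguish flavors"),
    Synset("verb", ("taste",), "perceive by the sense of taste"),
    Synset("verb", ("accept", "take"), NOT_QUOTED),
    Synset("verb", ("choose", "take"), NOT_QUOTED),
]


def build_index(synsets: List[Synset]) -> Dict[str, List[Synset]]:
    """Map each word to every synset that contains it."""
    index: Dict[str, List[Synset]] = {}
    for synset in synsets:
        for word in synset.words:
            index.setdefault(word, []).append(synset)
    return index


INDEX = build_index(SYNSETS)

# One word can belong to several synsets, one per meaning.
for synset in INDEX["taste"]:
    print(synset.definition)

print([s.words for s in INDEX["take"]])  # [('accept', 'take'), ('choose', 'take')]
```

Because semantic relations hold between synsets rather than words, a lookup starts from the word, picks one of its synsets, and only then follows a relation.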

In LexSemantic, pressing the icon shown for a synset (the original) leads to another synset (the obtained) through one of the following semantic relations.

Nouns:

  • Broader – Kind/Instance (hypernym): the original synset is a kind or instance of the obtained synset.
    • Kind: dog is a kind of canine.
    • Kind: ambulance is a kind of car.
    • Instance: renaissance is an instance of historic period.
  • Narrower – Kind/Instance (hyponym): the obtained synset is a kind or instance of the original synset.
    • The examples are the same as for Broader, with the original and the obtained synsets swapped (e.g. canine leads to dog).
  • Broader – Part/Member/Substance (holonym): the original synset is Part, Member or Substance of the obtained synset.
    • Part: engine is part of the car.
    • Member: tree is a member of the forest.
    • Substance: nitrogen is a substance in the air.
  • Narrower – Part/Member/Substance (meronym): the obtained synset is Part, Member or Substance of the original synset.
  • Broader – Topic/Region/Usage Domain: the original synset is a member of the obtained synset Topic/Region/Usage Domain.
    • Topic: plant is a member of the botany domain.
    • Region: Japanese is a member of the Japan domain.
    • Region: harakiri is a member of the Japan domain.
    • Usage: congratulation is a member of the plural domain (usually used in the plural).
    • Usage: livid (furiously angry) is a member of the colloquialism domain.
    • Usage: pint-size (well below average height) is a member of the slang domain.
  • Narrower – Member of Topic/Region/Usage Domain: the obtained synset is a member of the original synset Topic/Region/Usage Domain.
  • Attribute: the original synset (noun) has the obtained synset (adjective) as an attribute and vice-versa.
    • weight has heavy as an attribute.
    • weight has light as an attribute.
    • measure has standard as an attribute.
    • sexual activity has heterosexual as an attribute.
    • commerce has commercial as an attribute.
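
Since each Narrower relation is the Broader relation read in the opposite direction, one view can be derived from the other. The Python sketch below does this for the Part/Member/Substance examples above. The triple layout and the names `HOLONYM_LINKS`, `invert` and `MERONYMS` are assumptions made for this illustration, not WordNet's own representation.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

# (original, kind, obtained): the original is a Part/Member/Substance of the obtained.
HOLONYM_LINKS: List[Tuple[str, str, str]] = [
    ("engine", "part", "car"),
    ("tree", "member", "forest"),
    ("nitrogen", "substance", "air"),
]


def invert(links: List[Tuple[str, str, str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Derive the Narrower (meronym) view: whole -> [(kind, component), ...]."""
    meronyms: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)
    for component, kind, whole in links:
        meronyms[whole].append((kind, component))
    return dict(meronyms)


MERONYMS = invert(HOLONYM_LINKS)

print(MERONYMS["car"])     # [('part', 'engine')]
print(MERONYMS["forest"])  # [('member', 'tree')]
```

The same inversion would apply to the Kind/Instance and Domain pairs, since they are also described as a Broader relation with a Narrower counterpart.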

Verbs:

  • Broader (Hypernym): the obtained synset is a generalization of the original synset.
    • remove is a generalization of aspirate.
    • get is a generalization of receive.
  • Narrower (Troponym): the obtained synset is a specialization of the original synset.
    • aspirate is a specialization of remove.
    • receive is a specialization of get.
  • Broader – Topic/Region/Usage Domain: the original synset is a member of the obtained synset Topic/Region/Usage Domain.
    Note: when the Domain refers to just one word in the synset, it is shown in the Lexical Relation, described below.
    • Topic: freeze is a member of the surgery domain.
    • Region: judder is a member of the United Kingdom domain.
    • Usage: keep_one’s_eyes_peeled is a member of the colloquialism domain.
  • Entailment: the original synset entails the obtained synset.
    • listen entails hear.
    • catch, take in entails listen.
    • taste (distinguish flavors) entails taste (perceive by the sense of taste).
    • purchase entails pay.
    • purchase entails choose, take.
  • Cause: the original synset is the cause of the obtained synset.
    • coerce, force is the cause of act, move.
    • hurt is the cause of ache, smart.
  • Verb Group: the original synset has a similar meaning to, or is derived from, the obtained synset.
    • irritate has a similar meaning to chafe.
    • receive has a similar meaning to accept, take.
    • find (obtain through effort) has a similar meaning to regain (come upon after searching).
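
The entailment examples above form short chains: catch, take in entails listen, and listen entails hear. The Python sketch below follows such links step by step. Treating entailment as chainable is an assumption of this illustration, not something the relation list states, and the names `ENTAILS` and `all_entailed` are invented for the example. Each synset is reduced to its label as written above.

```python
from typing import Dict, List, Set

# original synset -> synsets it entails, taken from the examples above
ENTAILS: Dict[str, List[str]] = {
    "listen": ["hear"],
    "catch, take in": ["listen"],
    "purchase": ["pay", "choose, take"],
}


def all_entailed(synset: str) -> Set[str]:
    """Collect every synset reachable by following entailment links."""
    seen: Set[str] = set()
    stack = list(ENTAILS.get(synset, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ENTAILS.get(current, []))
    return seen


print(sorted(all_entailed("catch, take in")))  # ['hear', 'listen']
print(sorted(all_entailed("purchase")))        # ['choose, take', 'pay']
```

The `seen` set guards against revisiting a synset, so the walk terminates even if the links were to form a cycle.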

Adjectives:

  • Attribute: the original synset (adjective) has the obtained synset (noun) as an attribute and vice-versa.
    • heavy has weight as an attribute.
    • light has weight as an attribute.
    • standard has measure as an attribute.
    • heterosexual has sexual activity as an attribute.
    • commercial has commerce as an attribute.
  • Broader – Topic/Region/Usage Domain: the original synset is a member of the obtained synset Topic/Region/Usage Domain.
    Note: when the Domain refers to just one word in the synset, it is shown in the Lexical Relation, described below.
    • Topic: acidic is a member of the chemistry, chemical science domain.
    • Topic: bimodal is a member of the statistics domain.
    • Region: fly (not to be deceived) is a member of the United Kingdom domain.
    • Region: bashful, blate is a member of the Scotland domain.
    • Usage: billed is a member of the combining form (bill+ed) domain.
    • Usage: dishy is a member of the colloquialism domain.
  • Also see: the original synset is related to the obtained synset, both adjectives.
    • abridged is related to short.
    • covert is related to implicit, invisible, concealed.
  • Similar to: the original synset is similar to the obtained synset and vice-versa.
    • abridged is similar to cut, shortened.
    • covert is similar to furtive, black, clandestine.

Adverbs:

  • Broader – Topic/Region/Usage Domain: the original synset is a member of the obtained synset Topic/Region/Usage Domain.
    Note: when the Domain refers to just one word in the synset, it is shown in the Lexical Relation, described below.
    • Topic: with brio is a member of the music domain.
    • Topic: therefore is a member of the law, jurisprudence domain.
    • Region: langsyne is a member of the Scotland domain.
    • Usage: nearer is a member of the comparative (form) domain.

Lexical Relations

The lexical relations covered by WordNet connect two words that belong to different synsets and are linked by morphological derivation or antonymy. Synonymy is not listed as a lexical relation because it is already represented by the synsets themselves.

In LexSemantic, press the icon in the top bar of the Definition screen: given a word, you get the synsets that contain the words lexically related to it.

Antonym:

It applies to all grammatical categories and represents the opposite meaning of the given word.

  • Noun
    • appearance is an antonym for disappearance.
    • retreat is an antonym for progress.
  • Verb
    • exhale is an antonym for inhale, inspire.
    • rest is an antonym for be active, move.
  • Adjective
    • beautiful is an antonym for ugly.
    • painful is an antonym for painless.
  • Adverb
    • necessarily is an antonym for unnecessarily.
    • retail (at a retail price) is an antonym for wholesale (at a wholesale price).


Indirect Antonym:

It occurs only in adjectives and is obtained by inference: it holds when the related words belong to two synsets linked by the semantic relation Similar to. The antonym, in this case, is “weaker” than a direct antonym.

  • Adjective
    • beautiful (highly enjoyable) is Similar to: pleasant, so it is an indirect antonym for unpleasant.
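
The inference behind an indirect antonym can be sketched in Python as a two-step lookup: first follow Similar to, then take the direct antonyms found there. This is a simplified illustration. Words stand in for whole synsets, only the pairs implied by the example above are loaded, and the names `SIMILAR_TO`, `ANTONYM` and `indirect_antonyms` are assumptions of this sketch.

```python
from typing import Dict, List, Set

# Semantic relation Similar to (labels stand in for adjective synsets).
SIMILAR_TO: Dict[str, List[str]] = {
    "beautiful": ["pleasant"],
}

# Direct antonym links, stored in both directions.
ANTONYM: Dict[str, List[str]] = {
    "pleasant": ["unpleasant"],
    "unpleasant": ["pleasant"],
}


def indirect_antonyms(word: str) -> Set[str]:
    """Antonyms reached through Similar to, excluding the word's direct antonyms."""
    direct = set(ANTONYM.get(word, []))
    inferred: Set[str] = set()
    for neighbour in SIMILAR_TO.get(word, []):
        inferred.update(ANTONYM.get(neighbour, []))
    return inferred - direct


print(indirect_antonyms("beautiful"))  # {'unpleasant'}
print(indirect_antonyms("pleasant"))   # set()
```

Because the result depends on an intermediate Similar to link, it is kept separate from the direct antonyms rather than merged with them.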

Phrasal Verb:

It occurs only in verbs: another word, usually a preposition or adverb, is added to the verb to form a phrase.

  • Verb
    • sleep forms sleep in, sleep out, sleep late.
    • freshen forms freshen up.

Participle of the Verb:

It occurs only in adjectives formed by the past participle of a verb.

Note: In the LexSemantic Definition screen, the icon can be pressed to navigate to the verb in the infinitive.

  • Adjective
    • collected comes from the past participle of collect.
    • beaten comes from the past participle of beat.


Pertainym:

It occurs only in adjectives: the adjective pertains to (is pertinent to) another word.

  • Adjective
    • academic is pertinent to academy.
    • adolescent is pertinent to adolescence.

(Adverb) Derived from Adjective:

  • Adverb
    • scarcely is derived from scarce.
    • essentially is derived from essential.

Related by Derivation (same root):

It occurs in all grammatical categories and indicates a two-way relation between words that share the same root.

  • Noun
    • abstraction is related to abstract (verb).
    • object is related to objectify (verb).
    • concept is related to conceptual (adjective), conceive (verb).
  • Verb
    • conceive is related to concept, conception (nouns).
    • refresh is related to refreshment (noun).
  • Adjective
    • unbearable is related to unbearably (adverb).
    • adolescent is related to adolesce (verb), adolescence (noun).
  • Adverb
    • unbearably is related to unbearable (adjective).
    • rabidly is related to rabid (adjective).

References

WordNet LICENSE

This product uses the database WordNet 3.1 provided by Princeton University. All rights reserved.

THIS SOFTWARE AND DATABASE IS PROVIDED “AS IS” AND PRINCETON UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PRINCETON UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE LICENSED SOFTWARE, DATABASE OR DOCUMENTATION WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS.