AUTHOR=Giouli Voula

TITLE=A model for representing the semantics of MWEs: From lexical semantics to the semantic annotation of complex predicates

JOURNAL=Frontiers in Artificial Intelligence

VOLUME=Volume 6 - 2023

YEAR=2023

URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2023.802218

DOI=10.3389/frai.2023.802218

ISSN=2624-8212

ABSTRACT=Multiword expressions (MWEs) are sequences of words that pose a challenge to the computational processing of human languages due to their idiosyncrasies and the mismatch between their phrasal structure and their semantics, which renders them "a pain in the neck for Natural Language Processing" (Sag et al., 2002). These idiosyncrasies are of a lexical, morphosyntactic and semantic nature (Gross, 1982; 1998a; 1998b; Lamiroy, 2003; Baldwin and Kim, 2010; Constant et al., 2017), namely: non-compositionality, i.e., the meaning of the expression cannot be computed from the meanings of its constituents; discontiguity, i.e., alien elements may intervene between the constituents of the expression; non-substitutability, i.e., at least one of the constituents is lexicalized and therefore does not enter into alternations on the paradigmatic axis; and non-modifiability, i.e., the expressions enter into syntactically rigid structures that pose further constraints on modification, transformations, etc. In terms of meaning, they lie on a continuum of compositionality, which ranges from expressions that are highly analysable to others that are partially analysable or not analysable at all (Nunberg et al., 1994). The paper presents a model for representing verbal multiword expressions (VMWEs) that takes all these inherent idiosyncrasies into account. The model takes the form of a linguistic ontology and is applied to Greek verbal multiword expressions. We focus on the semantics of the lexical entries under scrutiny, drawing also on corpus evidence. In this regard, modelling the semantics of VMWEs is placed at the lexicon-corpus interface.