An analogical proportion (a more precise term for "analogy") is a relation between four elements A, B, C, and D, usually of the same nature. It is a statement of the form "A is to B as C is to D", written "A : B :: C : D". Such a statement means that the relation between A and B is similar to the one between C and D.
The problem of analogical proportion can be split into two key problems: analogy detection and analogy resolution. Analogy detection is, as the name indicates, the problem of deciding whether four elements A, B, C, and D are in analogical proportion. Analogy resolution is the problem of finding the missing element of an analogical equation A : B :: C : ?.
These two problems provide a logical framework to address learning, transfer, and explainability concerns. Such a framework finds useful applications in artificial intelligence and natural language processing.
Morphological analogies are analogical proportions between words, using morphological relations. For example in English, dead:undead::do:undo is a morphological analogy: undead is dead with the prefix un-, and undo is do with the prefix un-. In other words, dead is to undead as do is to undo.
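The dead:undead::do:undo example can be made concrete with a purely symbolic toy. The sketch below is not part of ANNa: it only handles prefix changes, and the helper names (`prefix_change`, `is_prefix_analogy`, `solve_prefix_analogy`) are hypothetical. It illustrates what detection and resolution mean on this kind of example.

```python
import os


def prefix_change(a: str, b: str):
    """Return the (removed, added) prefixes that turn `a` into `b`.

    The longest common suffix of the two words is treated as the shared stem.
    """
    # Longest common suffix = longest common prefix of the reversed strings.
    stem_len = len(os.path.commonprefix([a[::-1], b[::-1]]))
    return a[: len(a) - stem_len], b[: len(b) - stem_len]


def is_prefix_analogy(a: str, b: str, c: str, d: str) -> bool:
    """Toy detection: A : B :: C : D holds if both pairs share one prefix change."""
    return prefix_change(a, b) == prefix_change(c, d)


def solve_prefix_analogy(a: str, b: str, c: str):
    """Toy resolution: apply the A -> B prefix change to C, or return None."""
    removed, added = prefix_change(a, b)
    if not c.startswith(removed):
        return None  # the change observed on (A, B) does not apply to C
    return added + c[len(removed):]


print(is_prefix_analogy("dead", "undead", "do", "undo"))
print(solve_prefix_analogy("dead", "undead", "do"))
```

A rule-based toy like this breaks down as soon as the relation involves suffixes, infixes, or stem changes, which is one motivation for learning the relation from data instead.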
ANNa-MD (ANNa for Morphological analogy Detection) and ANNa-MR (ANNa for Morphological analogy Resolution) are deep learning models designed to tackle morphological analogies. They display competitive performance on analogy detection and resolution across 11 languages.
To transform words into computer-readable elements, we use an embedding model inspired by Kim et al. This embedding model captures the morphological features of a word and expresses them as a vector called a word embedding.
ANNa-MD relies on a convolutional neural network to detect valid morphological analogies using the above-mentioned embedding model (Alsaidi et al. (a)). The classifier is similar to the one by Lim et al. The embedding model and the ANNa-MD classifier are trained together, using data from 11 different languages.
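To give a rough idea of how a convolutional classifier can score a quadruple of embeddings, here is a heavily simplified forward pass in plain Python. It is not the ANNa-MD implementation: the weights are random and untrained, the filter counts and layer layout are illustrative assumptions, and the names `make_weights` and `detect` are hypothetical. The first layer reads each pair (A, B) and (C, D) with shared filters, the second layer compares the two pairs, and a dense sigmoid unit outputs a score.

```python
import math
import random


def relu(x):
    return max(0.0, x)


def make_weights(dim, n_filters1=4, n_filters2=3, seed=0):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    rng = random.Random(seed)

    def rand():
        return rng.uniform(-1.0, 1.0)

    return {
        # Layer 1: each filter is (w_left, w_right, bias) and reads one pair.
        "conv1": [(rand(), rand(), rand()) for _ in range(n_filters1)],
        # Layer 2: each filter reads all layer-1 channels of both pairs.
        "conv2": [([rand() for _ in range(2 * n_filters1)], rand())
                  for _ in range(n_filters2)],
        # Dense layer over the flattened (dim x n_filters2) feature map.
        "dense": ([rand() for _ in range(dim * n_filters2)], rand()),
    }


def detect(a, b, c, d, weights):
    """Forward pass returning a score between 0 and 1 for A : B :: C : D."""
    features = []
    for ai, bi, ci, di in zip(a, b, c, d):  # one row per embedding dimension
        # Layer 1: the same filters are applied to (A, B) and to (C, D).
        ab = [relu(wl * ai + wr * bi + bias) for wl, wr, bias in weights["conv1"]]
        cd = [relu(wl * ci + wr * di + bias) for wl, wr, bias in weights["conv1"]]
        # Layer 2: combine the two pair representations.
        for w, bias in weights["conv2"]:
            features.append(relu(sum(x * y for x, y in zip(w, ab + cd)) + bias))
    w, bias = weights["dense"]
    logit = sum(x * y for x, y in zip(w, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical 3-D embeddings standing in for the output of the embedding model.
weights = make_weights(dim=3)
score = detect((1.0, 0.0, 0.5), (1.0, 1.0, 0.5),
               (0.2, 0.0, 0.9), (0.2, 1.0, 0.9), weights)
print(score)
```

With untrained weights the score carries no meaning; in a trained system the weights would be fitted on valid and invalid analogies so that valid quadruples score close to 1.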
ANNa-MR uses multiple fully-connected layers to resolve a morphological analogical equation, i.e., an analogy with a missing element. Like ANNa-MD, it uses the morphological embedding model we designed. The structure of the model is similar to what is proposed by Lim et al. From the embeddings of three words, ANNa-MR computes the embedding of the fourth word that completes the analogy. The word returned is the existing word closest to the model output in the embedding space.
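The retrieval step described above can be sketched as follows. The 2-D embeddings are hand-picked and hypothetical, Euclidean distance is an assumption (the text only says "closest"), and `predict_embedding` uses the parallelogram rule B - A + C as a stand-in for ANNa-MR's fully-connected layers, which are not reproduced here.

```python
import math

# Hypothetical 2-D embeddings, hand-picked for illustration only.
TOY_EMBEDDINGS = {
    "dead": (1.0, 0.0),
    "undead": (1.0, 1.0),
    "do": (3.0, 0.0),
    "undo": (3.0, 1.0),
    "redo": (3.0, -1.0),
}


def predict_embedding(a, b, c):
    """Stand-in for the regression network: parallelogram rule B - A + C."""
    return tuple(bi - ai + ci for ai, bi, ci in zip(a, b, c))


def closest_word(vector, embeddings):
    """Return the vocabulary word whose embedding is nearest to `vector`."""
    return min(embeddings, key=lambda word: math.dist(embeddings[word], vector))


def resolve(a, b, c, embeddings=TOY_EMBEDDINGS):
    """Solve A : B :: C : ? by predicting an embedding, then retrieving a word."""
    target = predict_embedding(embeddings[a], embeddings[b], embeddings[c])
    return closest_word(target, embeddings)


print(resolve("dead", "undead", "do"))
```

Retrieving the nearest existing word guarantees that the answer is a real vocabulary item, at the cost of never producing a word absent from the vocabulary.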
S. Alsaidi, A. Decker, P. Lay, E. Marquer, P.-A. Murena, M. Couceiro
DSAA, 2021
Analogical proportions are statements of the form "A is to B as C is to D" that are used for several reasoning and classification tasks in artificial intelligence and natural language processing (NLP). For instance, there are analogy-based approaches to semantics as well as to morphology. In fact, symbolic approaches were developed to solve or to detect analogies between character strings, e.g., the axiomatic approach as well as that based on Kolmogorov complexity. In this paper, we propose a deep learning approach to detect morphological analogies, for instance, with reinflection or conjugation. We present empirical results that show that our framework is competitive with the above-mentioned state-of-the-art symbolic approaches. We also explore empirically its transferability capacity across languages, which highlights interesting similarities between them.