Universiteit Utrecht Universiteitsbibliotheek


SIMuLLDA : a Multilingual Lexical Database Application using a Structured Interlingua / Maarten Janssen - [S.l.] : [s.n.], 2002 - Text. - Doctoral thesis, Universiteit Utrecht

NBC: 17.60 : lexicology, lexicography

Keywords: Formal Concept Analysis, Multilingual Lexical Database, lexical gap, automatic translation, computational lexicography, interpretative semantics, WordNet


Abstract:

It is commonly accepted that there are about five to six thousand languages. For many pairs of languages (X,Y), there is no dictionary X->Y or Y->X; there are only dictionaries for the pairs X->English/French/Spanish and English/French/Spanish->Y. There is a clear need for dictionaries that translate between languages without the intervention of a small number of Western European languages with a colonial past. This need can also be defended from a theoretical point of view.
Creating a good-quality dictionary takes a lot of time, and since 5000-6000 languages yield roughly 25-36 million ordered pairs of languages, it is important to have a database that makes it possible to translate directly between pairs of languages. This thesis highlights some problems that play a role in the creation of such a database, attempts to solve some of them, and tries to show that some other problems cannot be solved.
A well-known problem is that words are often hard to match across languages: words from different languages do not cover the same range of meanings, not every word in one language has an equivalent in the other, and so on. This thesis sketches a database in which most of these problems are solved. Crucial to this set-up is the structure of the interlingua, which makes it possible to relate non-corresponding meanings in a structural way. The structure of the interlingua is provided by a logical framework called Formal Concept Analysis. With the set-up proposed in this thesis, it is possible to generate a descriptive translation for words in the source language that lack a direct translation in the target language. This should ease the work of a lexicographer making a dictionary for a new pair of languages.
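
The idea of a descriptive translation over a structured interlingua can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The toy context (`CONTEXT`), the two small lexicons (`LEXICON`), the attribute names, and the function names `extent`, `intent` and `translate` are all assumptions made for illustration. The sketch uses the two standard derivation operators of Formal Concept Analysis and, when the target language has no word for a meaning, falls back to the most specific target word whose attributes are all shared, plus the attributes that are left over.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Toy formal context (an assumption, not data from the thesis):
# interlingual meanings (objects) and their definitional attributes.
CONTEXT: Dict[str, FrozenSet[str]] = {
    "HORSE": frozenset({"horse"}),
    "STALLION": frozenset({"horse", "male", "adult"}),
    "MARE": frozenset({"horse", "female", "adult"}),
    "FOAL": frozenset({"horse", "young"}),
    "COLT": frozenset({"horse", "male", "young"}),
    "FILLY": frozenset({"horse", "female", "young"}),
}

# Hypothetical per-language lexicons: word -> interlingual meaning.
# The French lexicon deliberately has no entry for COLT (a lexical gap).
LEXICON: Dict[str, Dict[str, str]] = {
    "en": {"horse": "HORSE", "stallion": "STALLION", "mare": "MARE",
           "foal": "FOAL", "colt": "COLT", "filly": "FILLY"},
    "fr": {"cheval": "HORSE", "étalon": "STALLION", "jument": "MARE",
           "poulain": "FOAL", "pouliche": "FILLY"},
}


def extent(attrs: Set[str]) -> Set[str]:
    """FCA derivation: all meanings that have every attribute in attrs."""
    return {m for m, a in CONTEXT.items() if attrs <= a}


def intent(meanings: Set[str]) -> Set[str]:
    """FCA derivation: all attributes shared by every meaning in meanings."""
    if not meanings:
        # By convention the empty set of objects shares every attribute.
        return set().union(*CONTEXT.values())
    return set(frozenset.intersection(*(CONTEXT[m] for m in meanings)))


def translate(word: str, src: str, tgt: str) -> Tuple[str, Set[str]]:
    """Return (target word, extra attributes needed to describe the meaning).

    An empty attribute set means a direct translation was found.
    """
    meaning = LEXICON[src][word]
    for tgt_word, tgt_meaning in LEXICON[tgt].items():
        if tgt_meaning == meaning:
            return tgt_word, set()

    # Lexical gap: collect target words whose attributes are all shared
    # by the source meaning, then pick the most specific one.
    src_attrs = CONTEXT[meaning]
    candidates = [
        (w, CONTEXT[m]) for w, m in sorted(LEXICON[tgt].items())
        if CONTEXT[m] <= src_attrs
    ]
    if not candidates:
        raise LookupError(f"no usable target word for {word!r}")
    best_word, best_attrs = max(candidates, key=lambda c: len(c[1]))
    return best_word, set(src_attrs - best_attrs)


if __name__ == "__main__":
    print(translate("mare", "en", "fr"))  # direct translation
    print(translate("colt", "en", "fr"))  # descriptive translation
```

In this toy setting, the pair returned for "colt" could be rendered by a lexicographer as something like "male poulain". How the remaining attributes are verbalised in the target language is left open here.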

