Exploring the Boundaries of Language-Independent Lexicon Models
Blurb. The community responsible for developing lexicons for Natural Language Processing (NLP) and Machine Readable Dictionaries (MRDs) began its ISO standardization activities in 2003. These activities resulted in an ISO standard, the Lexical Markup Framework (LMF).
After selecting and defining a common terminology, the LMF team had to identify the common notions shared by all lexicons in order to specify a common skeleton (called the core model) and understand the various requirements coming from different groups of users.
The goals of LMF are to provide a common model for the creation and use of lexical resources, to manage the exchange of data between and among these resources, and to enable the merging of a large number of individual electronic resources to form extensive global electronic resources.
Individual instantiations of LMF can include monolingual, bilingual or multilingual lexical resources. The same specifications can be used for small and large lexicons, both simple and complex, and for both written and spoken lexical representations. The descriptions range from morphology, syntax and computational semantics to computer-assisted translation. Coverage is not restricted to European languages: the specifications apply to all natural languages.
The LMF specification is now a success and numerous lexicon managers currently use LMF in different languages and contexts.
This book starts with the historical context of LMF, before providing an overview of the LMF model and the Data Category Registry, which provides a flexible means for applying constants like /grammatical gender/ in a variety of different settings. It then presents concrete applications and experiments on real data, which are important for developers who want to learn about the use of LMF.