Dr. Christian M. Meyer

Multilingual Knowledge in Aligned Wiktionary and OmegaWiki for Translation Applications

Abstract. Multilingual lexical-semantic resources play an important role in translation applications. However, multilingual resources with sufficient quality and coverage are rare, because constructing such a resource manually requires substantial effort. In recent years, the emergence of Web 2.0 has opened new possibilities for constructing large-scale lexical-semantic resources. We identified Wiktionary and OmegaWiki as two important multilingual initiatives in which a community of users (“crowd”) collaboratively edits and refines the lexical information. They seem especially appropriate in the multilingual domain, as users from all languages and cultures can easily contribute. However, despite advantages such as open access and coverage of multiple languages, these resources have hardly been systematically investigated or utilized until now. Therefore, the goals of our contribution are threefold: (1) we analyze how these resources emerged and characterize their content and structure; (2) we propose an alignment at the word sense level to exploit the complementary information contained in both resources for increased coverage; (3) we describe a mapping of the resources to a standardized, unified model (UBY-LMF), thus creating a large, freely available multilingual resource designed for easy integration into applications such as machine translation or computer-aided translation environments.

Submitted: 31.07.2012 | Published: 27.06.2013
Incorrect translations of bass proposed by a statistical machine translation system.