Cross-Lingual Textual Entailment and Applications

Mehdad, Yashar

Repository landing page

thesis

oai:eprints-phd.biblio.unitn.it:806

Cross-Lingual Textual Entailment and Applications

Authors: Yashar Mehdad
Publication date: 23 March 2012
Publisher: University of Trento

Abstract

Textual Entailment (TE) has been proposed as a generic framework for modeling language variability. The great potential of integrating (monolingual) TE recognition components into NLP architectures has been reported in several areas, such as question answering, information retrieval, information extraction and document summarization. Mainly due to the absence of cross-lingual TE (CLTE) recognition components, similar improvements have not yet been achieved in any corresponding cross-lingual application. In this thesis, we propose and investigate Cross-Lingual Textual Entailment (CLTE) as a semantic relation between two text portions in dierent languages. We present dierent practical solutions to approach this problem by i) bringing CLTE back to the monolingual scenario, translating the two texts into the same language; and ii) integrating machine translation and TE algorithms and techniques. We argue that CLTE can be a core tech- nology for several cross-lingual NLP applications and tasks. Experiments on dierent datasets and two interesting cross-lingual NLP applications, namely content synchronization and machine translation evaluation, conrm the eectiveness of our approaches leading to successful results. As a complement to the research in the algorithmic side, we successfully explored the creation of cross-lingual textual entailment corpora by means of crowdsourcing, as a cheap and replicable data collection methodology that minimizes the manual work done by expert annotators

Similar works

Full text

Open in the Core reader

Download PDF

Unitn-eprints PhD

oai:eprints-phd.biblio.unitn.i...

Last time updated on 21/05/2016

This paper was published in Unitn-eprints PhD.

Having an issue?

Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.