Chauhan, Shweta and Daniel, Philemon and Saxena, Shefali and Sharma, Ayush (2022) Fully Unsupervised Machine Translation Using Context-Aware Word Translation and Denoising Autoencoder. Applied Artificial Intelligence, 36 (1). ISSN 0883-9514
Fully Unsupervised Machine Translation Using Context Aware Word Translation and Denoising Autoencoder.pdf - Published Version
Abstract
Learning machine translation from monolingual data sets alone is a complex task, as there are many possible ways to associate target sentences with source sentences. Monolingual word embeddings can be linearly mapped onto a common shared space in an unsupervised way through robust learning or adversarial training, but these techniques have fundamental limitations in translating sentences. In this paper, we propose a simple yet effective method for fully unsupervised machine translation that is based on cross-lingual sense-to-word embeddings instead of cross-lingual word embeddings and a language model. We use word sense disambiguation to incorporate the source language context so that the sense of a word is selected more appropriately. A language model that considers target language context in lexical choices and a denoising autoencoder that handles insertion, deletion, and reordering are integrated. The proposed approach eliminates the problem of noisy target language context caused by erroneous word translations. This work takes into account the challenge of homonyms and polysemous words in morphologically rich languages. Experiments on English-Hindi and Hindi-English, evaluated with different metrics, show an improvement of 3 points in BLEU and METEOR-Hindi over the baseline system.
| Field | Value |
|---|---|
| Item Type | Article |
| Subjects | OA STM Library > Computer Science |
| Depositing User | Unnamed user with email support@oastmlibrary.com |
| Date Deposited | 30 Jun 2023 05:25 |
| Last Modified | 16 Sep 2024 10:17 |
| URI | http://geographical.openscholararchive.com/id/eprint/1087 |