Abstract:
Machine Translation (MT) has gained much attention in recent years. It is a sub-field of
computational linguistics that focuses on translating text from one language to another.
An MT approach that works for European languages may not work equally well for Afaan Oromo
because of differences in linguistic structure. MT based on Neural Network (NN) methods has
recently become an alternative to statistical MT. In this thesis, we propose the design and
implementation of a bi-directional MT system for the English and Afaan Oromo language pair. We
carried out our study by exploring the capabilities of the deep learning approach on
Recurrent NN algorithms (a Bi-encoder and decoder Gated Recurrent Unit (GRU) and a Bi-encoder
and decoder Long Short-Term Memory (LSTM), both with attention) and the Transformer model.
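To make the recurrent architectures concrete, the following is a minimal sketch of the first of them: a bidirectional GRU encoder paired with an attentional GRU decoder. It is written in PyTorch for illustration only; the framework, layer sizes, class names, and the simplified attention scorer are all assumptions, not the thesis implementation.

```python
import torch
import torch.nn as nn

class BiGRUEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # A bidirectional GRU reads the source sentence in both directions.
        self.gru = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)

    def forward(self, src):                       # src: (batch, src_len)
        out, hidden = self.gru(self.embed(src))   # out: (batch, src_len, 2*hid_dim)
        return out, hidden.sum(dim=0, keepdim=True)  # merge the two directions

class AttnGRUDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.score = nn.Linear(3 * hid_dim, 1)    # simplified attention scorer
        self.gru = nn.GRU(emb_dim + 2 * hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tok, hidden, enc_out):      # one decoding step
        # Weight every encoder position against the current decoder state.
        h = hidden[-1].unsqueeze(1).expand(-1, enc_out.size(1), -1)
        attn = self.score(torch.cat([h, enc_out], dim=-1)).softmax(dim=1)
        context = (attn * enc_out).sum(dim=1, keepdim=True)
        emb = self.embed(tok).unsqueeze(1)        # tok: (batch,) of token ids
        out, hidden = self.gru(torch.cat([emb, context], dim=-1), hidden)
        return self.out(out.squeeze(1)), hidden   # next-token logits

# Toy forward pass: 4 source sentences of length 12, vocabulary of 8000.
enc, dec = BiGRUEncoder(8000), AttnGRUDecoder(8000)
enc_out, h = enc(torch.randint(0, 8000, (4, 12)))
logits, h = dec(torch.zeros(4, dtype=torch.long), h, enc_out)
```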
A parallel corpus of 21,323 sentence pairs was used for the experiment, with a percentage
split of 80% of the corpus for the training set, 10% for the validation set, and 10% for the
test set. We compared the performance of the three models on the MT task between the Afaan
Oromo and English language pair, using two evaluation metrics: BLEU and perplexity.
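A minimal sketch of the 80/10/10 split follows; the file names, the fixed random seed, and the line-aligned file format are illustrative assumptions, not the thesis pipeline.

```python
import random

# Assumed format: two line-aligned files, one sentence per line.
with open("english.txt", encoding="utf-8") as f_en, \
     open("afaan_oromo.txt", encoding="utf-8") as f_om:
    pairs = list(zip(f_en.read().splitlines(), f_om.read().splitlines()))

random.seed(42)          # fixed seed so the split is reproducible
random.shuffle(pairs)

n = len(pairs)           # 21,323 sentence pairs in the thesis corpus
n_train, n_val = int(0.8 * n), int(0.1 * n)
train = pairs[:n_train]                   # 80% training set
val   = pairs[n_train:n_train + n_val]    # 10% validation set
test  = pairs[n_train + n_val:]           # 10% test set
```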
In terms of BLEU, the score for Afaan Oromo to English MT was 6.48 with the Bi-encoder and
decoder LSTM with Attention model, rose to 15.04 with the Bi-encoder and decoder GRU with
Attention model, and reached 16.40 with the Transformer model. In terms of perplexity, the
Afaan Oromo to English model scored 150.377 with the Bi-encoder and decoder LSTM with
Attention model, improved to 50.805 with the Bi-encoder and decoder GRU with Attention model,
and improved further to 28.188 with the Transformer model (lower perplexity is better).
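As an illustration of how these two metrics can be computed, the sketch below uses sacrebleu for BLEU (the thesis does not name its BLEU implementation, so this library is an assumption) and derives perplexity as the exponential of the mean token-level cross-entropy; the numbers shown are toy values.

```python
import math
import sacrebleu

# BLEU compares model translations against reference translations.
hyps = ["the cat sat on the mat"]              # model outputs (toy data)
refs = [["the cat is sitting on the mat"]]     # one reference stream
print(sacrebleu.corpus_bleu(hyps, refs).score)

# Perplexity: exp of the average negative log-likelihood per target token.
mean_cross_entropy = 3.34                      # e.g. taken from the test loss
print(math.exp(mean_cross_entropy))            # ≈ 28.2, cf. the Transformer's 28.188
```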
The results of the comparison show that the Transformer-based MT performs better than the
other two models in terms of both metrics. The Transformer model also reduces the number of
trainable parameters, memory demand, and training time, and it produced the best results as
sentence length increased, compared with the other proposed models.