Large Language Models and Machine Translation

Tomenchuk, Maryana; Popovych, Kseniia; Томенчук, Мар’яна Василівна

Please use this identifier to cite or link to this item: https://dspace.uzhnu.edu.ua/jspui/handle/lib/68269

Title:	Large Language Models and Machine Translation
Authors:	Tomenchuk, Maryana; Popovych, Kseniia; Томенчук, Мар’яна Василівна
Keywords:	machine translation, large language model, artificial intelligence, lexis, syntax, semantics
Issue Date:	2024
Citation:	Tomenchuk M., Popovych K. Large Language Models and Machine Translation. Věda a Perspektivy. ČR: E24142. (42). 2024. № 11. P. 422–431.
Series/Report no.:	Věda a Perspektivy
Abstract:	This article examines the linguistic peculiarities of machine translation produced by large language models, focusing on its lexical, semantic and syntactic aspects. Because large language models are a rapidly developing phenomenon of the 21st century, understanding their architecture and their relation to linguistics, and to translation in particular, is of great importance. The study is based on ChatGPT, a highly capable large language model that produces human-like output when given a task. By analyzing ChatGPT responses, we investigate its ability to convey linguistic meaning across different aspects of language. The main purpose of the investigation is to analyze linguistic phenomena from different subfields, such as syntax, lexis and semantics, by reviewing GPT translations from the source language into the target language, in our case from English into Ukrainian. The corpus of texts analyzed was taken from English scientific papers published in online scientific journals [11], [13], [15] in various academic fields, such as artificial intelligence, computer science, mathematics, chemistry and biology. Scientific discourse was chosen because academic language is rich in terminology, formality and special structures, which makes it possible to investigate the performance of GPT in different areas. After the model was analyzed, the outputs of ChatGPT were compared with human translations and investigated further. Our investigation has shown that such a model is a powerful tool that can be used in machine translation. Its structure and architecture contribute greatly to the identification of various patterns in text, which imitates the way humans perceive and analyze information.
However, although its responses are very precise and natural, ChatGPT has not yet proved to be a perfect machine that could entirely replace a professional’s work. The results of our investigation show that it tends to make mistakes when dealing with complex structures of a language, in our case English and Ukrainian. In our work, we used descriptive analysis of material on large language models and linguistics to provide relevant data on the topic and to highlight the most common features of academic language translation, and exploratory analysis to find the most common issues that occur when translating such a corpus from English into Ukrainian. Inferential analysis was used to draw conclusions from the investigated data, including highlighting the linguistic aspects of machine translation. We used the comparative method to identify the differences in lexis and grammar between English and Ukrainian.
Type:	Text
Publication type:	Article
URI:	https://dspace.uzhnu.edu.ua/jspui/handle/lib/68269
ISSN:	2695-1584; 2695-1592
Appears in Collections:	Scientific publications of the Department of Applied Linguistics

Files in This Item:

File	Description	Size	Format
S12.pdf		143.07 kB	Adobe PDF