|
|
| (13 intermediate revisions by 5 users not shown) |
| Line 5: |
Line 5: |
| | 30 Chapters(0/30) | | 30 Chapters(0/30) |
| | | | |
| − | [[Machine_Trans_EN_1]] [[Machine_Trans_EN_3]] [[Machine_Trans_EN_4]] [[Machine_Trans_EN_5]] [[Machine_Trans_EN_6]] [[Machine_Trans_EN_7]] [[Machine_Trans_EN_8]] [[Machine_Trans_EN_9]] [[Machine_Trans_EN_10]] [[Machine_Trans_EN_11]] [[Machine_Trans_EN_12]] [[Machine_Trans_EN_13]] [[Machine_Trans_EN_14]] [[Machine_Trans_EN_15]] [[Machine_Trans_EN_16]] [[Machine_Trans_EN_17]] [[Machine_Trans_EN_18]] [[Machine_Trans_EN_19]] [[Machine_Trans_EN_20]] [[Machine_Trans_EN_21]] [[Machine_Trans_EN_22]] [[Machine_Trans_EN_23]] [[Machine_Trans_EN_24]] [[Machine_Trans_EN_25]] [[Machine_Trans_EN_26]] [[Machine_Trans_EN_27]] [[Machine_Trans_EN_28]] [[Machine_Trans_EN_29]] [[Machine_Trans_EN_30]] ... | + | [[Machine_Trans_EN_1]] [[Machine_Trans_EN_2]] [[Machine_Trans_EN_3]] [[Machine_Trans_EN_4]] [[Machine_Trans_EN_5]] [[Machine_Trans_EN_6]] [[Machine_Trans_EN_7]] [[Machine_Trans_EN_8]] [[Machine_Trans_EN_9]] [[Machine_Trans_EN_10]] [[Machine_Trans_EN_11]] [[Machine_Trans_EN_12]] [[Machine_Trans_EN_13]] [[Machine_Trans_EN_14]] [[Machine_Trans_EN_15]] [[Machine_Trans_EN_16]] [[Machine_Trans_EN_17]] [[Machine_Trans_EN_18]] [[Machine_Trans_EN_19]] [[Machine_Trans_EN_20]] [[Machine_Trans_EN_21]] [[Machine_Trans_EN_22]] [[Machine_Trans_EN_23]] [[Machine_Trans_EN_24]] [[Machine_Trans_EN_25]] [[Machine_Trans_EN_26]] [[Machine_Trans_EN_27]] [[Machine_Trans_EN_28]] [[Machine_Trans_EN_29]] [[Machine_Trans_EN_30]] ... |
| | | | |
| | [[Book_projects|Back to translation project overview]] | | [[Book_projects|Back to translation project overview]] |
| Line 18: |
Line 18: |
| | [[Machine_Trans_EN_1]] | | [[Machine_Trans_EN_1]] |
| | | | |
| − | =Chapter 2: On the Realm Advantages And Mutual Development of Machine Translation And Human Translation= | + | =Chapter 2: On the Realm Advantages and Mutual Development of Machine Translation and Human Translation= |
| − | "'论机器翻译与人工翻译的领域优势及共同发展'"
| + | '''论机器翻译与人工翻译的领域优势及共同发展''' |
| | | | |
| | 肖毅瑶 Xiao Yiyao, Hunan Normal University, China | | 肖毅瑶 Xiao Yiyao, Hunan Normal University, China |
| | | | |
| | [[Machine_Trans_EN_2]] | | [[Machine_Trans_EN_2]] |
| | + | |
| | + | =Chapter 3: -Missing-= |
| | + | |
| | + | [[Machine_Trans_EN_3]] |
| | | | |
| | =Chapter 4: A Comparison Between Machine Translation of Netease and Traditional Human Translation—A Case Study of The Economist Articles= | | =Chapter 4: A Comparison Between Machine Translation of Netease and Traditional Human Translation—A Case Study of The Economist Articles= |
| Line 53: |
Line 57: |
| | | | |
| | =9 谢佳芬( Machine Translation and Artificial Translation in the Era of Artificial Intelligence)= | | =9 谢佳芬( Machine Translation and Artificial Translation in the Era of Artificial Intelligence)= |
| | + | 人工智能时代下的机器翻译与人工翻译 |
| | + | |
| | + | 谢佳芬 Xie Jiafen, Hunan Normal University, China
| | + | |
| | [[Machine_Trans_EN_9]] | | [[Machine_Trans_EN_9]] |
| | | | |
| Line 62: |
Line 70: |
| | [[Machine_Trans_EN_10]] | | [[Machine_Trans_EN_10]] |
| | | | |
| − | =Chapter 11 陈惠妮 Study on Pre-editing of Machine Translation - A Case Study of Medical Abstracts= | + | =Chapter 11 Study on Pre-editing of Machine Translation - A Case Study of Medical Abstracts= |
| | | | |
| | 机器翻译的译前编辑研究——以医学类文摘为例 | | 机器翻译的译前编辑研究——以医学类文摘为例 |
| Line 70: |
Line 78: |
| | [[Machine_Trans_EN_11]] | | [[Machine_Trans_EN_11]] |
| | | | |
| − | ===Abstract===
| |
| − | At present, globalization is accelerating and the market demand for language services is rapidly increasing. Machine translation, an important translation method, can greatly improve translation efficiency thanks to its low cost and high speed. However, because of the limitations of machine translation and the differences between Chinese and English, its output is not accurate enough. To balance translation efficiency and translation quality, machine-translated texts require a great deal of manual revision. Medical papers are specialized, distinctive and purposeful, so they require accurate, qualified and professional translation. However, the quality of machine translation is insufficient to meet the high-quality requirements of medical paper translation. Therefore, the introduction of pre-editing can greatly improve the efficiency and quality of machine translation.
| |
| − |
| |
| − | ===Key words===
| |
| − | Pre-editing, Machine translation, Medical texts
| |
| − |
| |
| − | ===题目===
| |
| − | Study on Pre-editing of Machine Translation - A Case Study of Medical Abstracts
| |
| − |
| |
| − | ===摘要===
| |
| − | 在全球化加速发展的今天,市场对语言服务的需求迅速增加。机器翻译作为一种重要的翻译途径,由于其成本低、速度快,可以大大提高翻译效率。然而,由于机器翻译的局限性以及中英文语言的差异,机器翻译的准确性不高。为了平衡翻译效率和翻译质量,机器翻译文本需要大量的手工修改。
| |
| − | 医学论文具有专业性、特殊性和目的性,要求其译文准确、合格、专业。然而,机器翻译的质量较低,无法满足医学论文对翻译的高质量要求。因此,译前编辑的引入可以大大提高机器翻译的效率和质量。
| |
| − |
| |
| − | ===关键词===
| |
| − | 译前编辑;机器翻译;医学文本
| |
| − |
| |
| − | ===1. Introduction===
| |
| − | ===1.1 Definition of Machine Translation===
| |
| − | As Cronin (2013) observed: "Translation is undergoing a revolutionary upheaval. The influence of digital technology and the Internet on translation is continuous, extensive and profound. From the popularity of automatic online translation applications, translation revolution is everywhere" (Cronin, 2013). However, the concept of machine translation was first proposed in the 1930s. Since the 1940s, machine translation technology has evolved from rule-based machine translation (RBMT) to statistical machine translation (SMT) and then to neural machine translation (NMT). Machine translation refers to the automatic translation of text from one natural language into another by computer software or online translation websites. ISO likewise defines it as the process of “using a computer system to automatically translate text or speech from one natural language to another” (Cui 2014:68-73). O'Brien (2002) defines the complementary step of post-editing as "the behavior of modifying errors in machine translation to ensure that the target translation meets certain quality requirements". At the most basic level, machine translation mechanically substitutes words in one language for words in another, but that alone rarely produces a good translation; whole phrases and their closest counterparts in the target language must be recognized. Not all words in one language have equivalents in another, and many words have more than one meaning. Today’s globalized world has a huge demand for translation, which creates new opportunities for machine translation and has made it one of the current research focuses.
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:34, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===1.2 Definition of Pre-editing===
| |
| − | Pre-editing means adjusting and modifying the source text before it is fed into machine translation software so that it better fits the characteristics of that software, thereby improving the quality of the machine translation (Wei Changhong 2008:93-94). Its aims are to improve the recognition rate of machine translation, optimize the quality of the translated output and reduce the workload of post-editing. Because pre-editing involves changes in only one language, it is simpler than post-editing and can improve both quality and efficiency. Good pre-editing helps machine translation run more smoothly, improving the machine readability of the source and the quality of the output. Pre-editing is mostly applied in four situations: first, when the original text is of poor quality and the machine has difficulty recognizing the meaning of its sentences, as with user-generated content with poor readability and translatability (Gerlach et al 2003:45-53); second, when documents need to be published in multiple languages; third, when the original text contains a lot of jargon; and last, when the original text has a corresponding translation memory, since an edited text can better match the content of the translation memory. Pre-editing is a process of identifying problems: expressions or sentences that may cause trouble for machine translation are listed according to the requirements and then edited by a human before the text is submitted. The purpose is to enable the computer to translate better and to improve the translatability of the source text (Slype G V & Guinet J F & Seitz F 1984:115).
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:36, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===1.3 Machine Translation Mode===
| |
| − | According to how they acquire knowledge, machine translation modes can be classified as follows. The first is rule-based machine translation, which relies on bilingual dictionaries and a library of language rules for each language. The quality of translation depends on whether the source text conforms to the existing rules, and the fact that the rules can never be exhaustive is the main obstacle of this model. The second is statistical machine translation. This model relies on mathematical and statistical principles: it uses a corpus to find existing translations corresponding to the translation task, analyzes how often they occur and outputs the most frequent one. Its disadvantage is that it ignores the flexibility of language and the importance of context. The last is the neural network model. It differs from previous translation models in that it uses an end-to-end neural network to translate automatically between natural languages. At present, its translation quality is much higher than that of the previous models.
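
The frequency-based selection described for the statistical mode can be illustrated with a toy sketch. It is only an illustration under stated assumptions: the observed pairs in `OBSERVED` and the function name `most_frequent_translation` are hypothetical and do not come from the study, and real statistical systems combine several probability models rather than a single frequency count.

```python
from collections import Counter

# Hypothetical toy corpus: (source phrase, translation observed for it).
OBSERVED = [
    ("高血压", "hypertension"),
    ("高血压", "high blood pressure"),
    ("高血压", "hypertension"),
    ("糖尿病", "diabetes mellitus"),
]


def most_frequent_translation(source, observed=OBSERVED):
    """Return the translation seen most often for `source`, or None if unseen."""
    counts = Counter(target for src, target in observed if src == source)
    if not counts:
        # No reference in the corpus: the weakness the text attributes to this mode.
        return None
    return counts.most_common(1)[0][0]
```

A phrase that never occurs in the corpus yields no translation at all, which mirrors the reliance on reference material that the text describes.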
| |
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===1.4 Source of Study Abstracts===
| |
| − | To make the study more representative and authoritative, the abstracts are drawn from core domestic medical journals such as the National Medical Journal of China, the Journal of Peking University (Health Science) and the Journal of Third Military Medical University. These journals cover basic medical science, biomedical technology, laboratory medical science and other fields. The pre-editing approaches derived from them should therefore be broadly applicable to the machine translation of medical abstracts.
| |
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===1.5 Selection of Translation Software===
| |
| − | Commonly used machine translation software includes Google Translate, Youdao Translator and Niu Translation, each with its own strengths. Take Google as an example: its neural machine translation system works somewhat like neurons in the human brain, learning from the information it collects and building connections between its artificial neurons. However, it also makes many errors when it lacks sufficient information. In this paper, a contrastive analysis is carried out using Google Translate. On the one hand, since Google is a pioneer in NMT, it is a natural choice of software for translating the medical texts in this paper. Compared with the program known as “Google Brain”, the NMT systems of other translation software are at a relative disadvantage. On the other hand, Google Translate has the largest user base in the world, with more than one billion downloads. Its output is generally more accurate and complete than that of other machine translation software, and it has improved greatly over years of development. In this paper, Chinese medical abstracts are translated automatically by Google Translate, and the output is then compared with human translations and with the published English abstracts. The main purpose is to show that pre-editing improves the quality and accuracy of machine-translated medical abstracts.
| |
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===2. Language Characteristics and Error Division===
| |
| − | Since Chinese and English are two different languages, it is necessary to identify the characteristics of each in order to analyze and understand them better. Tytler (1978: 118-119) set out three principles of translation in his Essay on the Principles of Translation, holding that a translation should not only give a complete transcript of the ideas of the original but also give readers the same feeling as the source text. In practice, errors also arise when translating between the two languages. The following sections therefore clarify the characteristics of each language in order to support more accurate translation.
| |
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===2.1 Language Characteristics of Medical Abstracts===
| |
| − | Chinese and English belong to different language families, so they differ in structure and in the ways their speakers think. When machine translation is used from Chinese to English, the two languages do not map onto each other level for level, and many mistakes arise in the translation process, especially for ESP (English for Specific Purposes) texts such as medical papers. Here, the differences mainly concern the linguistic characteristics of medical abstracts. Medical abstracts are usually either structured or unstructured. Although they differ in form, both describe the purpose, methods and conclusions of the research.
| |
| − |
| |
| − | In the methods section of a medical abstract, several Chinese clauses can be joined with commas even though each conveys different information. English sentences can also carry a great deal of information, but to ensure clarity some modifiers need to be separated out and the sentence restructured. In Chinese-English machine translation, however, too much information is packed into a single sentence, because the machine segments the text at the full stop and not at these commas.
| |
| − |
| |
| − | In addition, subjectless sentences are used in the objective and methods parts of Chinese medical abstracts. A subjectless sentence is one that has no subject, and it is usually employed in two contexts. The first is "needless to say": in Chinese it is common to omit the subject, because Chinese can convey meaning through incomplete sentence structures and speakers understand the sentence even without a subject. The second is "emphasis of action": here subjectless sentences are used to describe behaviors, especially in abstracts of traditional Chinese medicine studies. Such sentences should be avoided in English medical abstracts, because they sound more subjective when describing the study process, whereas the essential requirement of an English medical abstract is objectivity. Another feature is voice. Few passive constructions appear in Chinese abstracts, whereas English favours the passive voice. When translating from Chinese to English, the passive voice should be used to keep the content objective, which is also a basic requirement of medical papers.
| |
| − |
| |
| − | These are the linguistic features of Chinese medical abstracts. Chinese and English medical abstracts differ greatly in sentence structure and expression. These differences may reduce the accuracy of machine translation, so pre-editing is needed to ensure that the source text can be accurately recognized by the machine and fully translated into English.
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===2.2 Error Division of Machine Translation in Translating Abstracts of Medical Papers from Chinese to English===
| |
| − | This paper lists the errors found in the machine translation output after analysis. Some stem from the principles on which machine translation is built, such as statistical MT and NMT. According to their origin, the errors can be studied at two levels. The first is the macro level, which covers mistakes caused by objective factors: not human factors, but the shortcomings of machine translation and the limitations of language. The limitations of machine translation derive from the principles and models it adopts and show up as a reliance on reference sources. Where reference sources are absent, semantic ambiguity and textual incoherence may occur.
| |
| | | | |
| − | However, for ESP texts such as medical papers, the requirements for terms and sentence patterns go far beyond what existing corpora cover. For unfamiliar texts (that is, texts with no counterpart in the corpora), the quality of the output will be relatively low, which is a consequence of the principles of machine translation. As mentioned above, machine translation has evolved over the past few decades from rule-based methods to corpus-based statistical methods and then to NMT, yet limitations remain.
| |
| − |
| |
| − | Mistakes at the micro level are mainly caused by differences in linguistic structure between Chinese and English. Chinese is an implicit language, while English is an explicit one (Lian, 2010). In other words, Chinese expression does not depend on explicit grammatical structure, whereas English does. This may result in mismatches between the original and the machine-translated sentences. Errors of this kind fall mainly into two types: fragmented components and missing components. For a medical paper, such errors are very serious. As a kind of ESP discourse, a medical paper is characterized by fixed, objective and accurate language. To reproduce the characteristics of medical abstracts in translation, it is necessary to avoid errors at the word and sentence levels and to pay attention to logic and consistency.
| |
| − |
| |
| − | Lexical errors here refer to inconsistency in the fixed expressions used to translate terms. Although the machine translates these terms on the basis of a corpus, the corpus may not be adequate for ESP texts such as medical papers, so the established fixed expressions are not always used. This is also the most common mistake in machine translation (terms here include but are not limited to nouns). For ESP texts, and medical texts in particular, fixed expressions and accurate terminology are of great importance.
| |
| − |
| |
| − | In this paper, errors in machine translation are listed out after analyzing. Some are due to the principles, such as statistical-based MT and NMT, employed by machine translation. Based on the original errors, they can be studied from two levels. One is from the macro-level, referring to mistakes caused by objective factors. These are not human factors, but the disadvantage of machine translation and the limitations of language. The limitations of machine translation are derived from the principles and models adopted and manifests itself as a reliance on reference sources. In this case, semantic ambiguity and textual incoherence may occur in the absence of reference sources.
| |
| − |
| |
| − | However, for ESP texts such as medical papers, the requirements for terms and sentence patterns are far beyond the existing corpora. For unfamiliar texts (that is, the corresponding texts cannot be found in the corpora). The quality of output translation will be relatively low, which is determined by the principles of machine translation. As mentioned above, machine translation has evolved over the past few decades from rules-based to corp-based statistics to NMT. However, there still exist some limitations in machine translation.
| |
| − |
| |
| − | Mistakes on micro-level are mainly caused by the variations and differences of linguistic structure between Chinese and English. Chinese is an implicit language, while English is an explicit one (Lian, 2010) To put it in another way, Chinese expression does not depend on language structures, while English does the opposite. This may result in mismatches between the original and machine translated sentences. This kind of error is mainly divided into two types: component fragment and component missing. For a medical paper, such errors are very serious. As a kind of ESP discourse, medical paper has the characteristics of fixed, objective and accurate language structure. In order to reproduce the characteristics of medical abstracts in translation, it is necessary to avoid errors at the level of words and sentences and pay attention to logic and consistency.
| |
| − |
| |
| − | Lexical errors here refer to the inconsistent rendering of fixed expressions when terms are translated. Although the terms are machine-translated on the basis of a corpus, the corpus may be inadequate for ESP texts such as medical papers, so the established fixed expressions are not always used. This is also the most common mistake in machine translation (terms here include but are not limited to nouns). For ESP texts, and medical texts in particular, fixed expressions and accurate terminology are of great importance.
| |
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===3. Approaches Proposed for Pre-editing===
| |
| − | Generally speaking, there is a close relationship between pre-editing and post-editing, both of which aim to convey information and ensure high or publishable translation quality. Proper pre-editing can improve the quality of machine translation in terms of adequacy and consistency.
| |
| − | The complexity of natural language and the arbitrary way in which people use it bring many difficulties to Chinese-English machine translation. Hu Qingping (2005:24) proposed that "the research of translation software and the development of the controlled language are the two directions to improve the quality of machine translation: the former aims at the difficulty in the natural language processing, the latter overcomes the arbitrariness of natural language". Feng Quangong and Gao Lin (2017: 63-68) put forward: "The writing principles of controlled language can be applied to pre-editing of machine translation. Pre-editing based on controlled language can effectively reduce the complexity and ambiguity of source text, improve the identifiability of machine translation (the translatability of source text itself), and thus reduce (fully) the workload of post-editing".
| |
| − | The central task of pre-editing is to transform human-friendly content into machine-friendly content, so words and sentences need to be repositioned or even changed. Based on an analysis of the characteristics of Chinese and English, the sense groups of the Chinese source text should be split before the text is fed into the machine translation software, so that each sentence has a complete structure that the software can easily recognize.
| |
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===3.1 Extraction and Replacement of Terms in the Source Text===
| |
| − | As a branch of applied English, medical English combines knowledge of the English language with medical knowledge. Terms in medical papers have fixed and complex structures. Although Google Translate is corpus-based, it does not render these terms consistently because its corpus is limited. Two steps should therefore be taken before machine translation. The first is to extract terms and create a glossary; the second is to replace the Chinese terms in the source text with the corresponding English expressions from the glossary. The extracted terms should of course be looked up and verified case by case, and only verified terms should be substituted. This method not only ensures accurate term translation but also keeps expressions consistent, especially in long texts.
| |
| − | Example 1
| |
| − | Source abstract: 口腔白斑病癌变相关缺氧应答基因和微小RNA的芯片及表达验证。
| |
| − | Google translation: Microarray detection/Chip detection and expression verification of hypoxia response genes and microRNAs related to oral leukoplakia canceration.
| |
| − | Published translation: Transcriptome array screening and verification of oral leukoplakia carcinogenesis-related hypoxia-responsive gene and microRNA.
| |
| − | Analysis: The “Affymetrix GeneChip” employed in the thesis is a human transcriptome array. In the source abstract, “芯片检测” means “screening” rather than “detection”, so the appropriate translation is “transcriptome array screening”, whereas Google renders it as “microarray detection” or “chip detection”. Term extraction is therefore essential before the source abstract is fed into the machine translation software.
| |
| − | Pre-edited source abstract: 对 oral leukoplakia carcinogenesis 相关 hypoxia-responsive gene 和微小RNA进行 transcriptome array screening 及表达验证。
| |
| − | Translation after pre-editing: Transcriptome array screening and expression- verification of hypoxia-responsive genes and microRNAs related to oral leukoplakia carcinogenesis.
| |
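The substitution step described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `substitute_terms` and the two-entry glossary are hypothetical, the glossary is assumed to contain only terms that have already been verified by hand, and the later restructuring of the sentence (for example adding 对…进行) is still done manually.

```python
import re


def substitute_terms(source: str, glossary: dict[str, str]) -> str:
    """Replace verified Chinese terms with their English glossary entries."""
    if not glossary:
        return source
    # Try longer terms first so a short term never splits a longer one.
    ordered = sorted(glossary, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered))
    # Pad each English term with spaces to set it off from the Chinese text.
    replaced = pattern.sub(lambda m: f" {glossary[m.group(0)]} ", source)
    # Collapse doubled spaces left where two terms were adjacent.
    return re.sub(r" {2,}", " ", replaced).strip()


# Hypothetical glossary built from terms verified in Example 1.
glossary = {
    "口腔白斑病癌变": "oral leukoplakia carcinogenesis",
    "缺氧应答基因": "hypoxia-responsive gene",
}

source = "口腔白斑病癌变相关缺氧应答基因和微小RNA的芯片及表达验证。"
print(substitute_terms(source, glossary))
```

Matching all terms in a single pass, longest first, keeps a short glossary entry from breaking up a longer term that contains it, which matters for consistency in long texts.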
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===3.2 Explication of Subordination===
| |
| − | As mentioned above, machine translation judges subordination in Chinese sentences from certain specific words. Chinese is a paratactic language in which clauses are often connected by implicit logical relationships, whereas English relies on sentence structure, so clauses are closely linked by explicit linguistic forms. To improve the accuracy of machine translation, we can adjust the sentence structure and the position of adjectives and their modifiers.
| |
| − |
| |
| − | Example 2
| |
| − | Source abstract: 回顾性分析南京医科大学附属儿童医院中重度HIE患儿49例及同期就诊的无神经系统症状体征的足月新生儿为对照组31例的头颅磁共振成像(MRI)资料。
| |
| − | Google translation: A retrospective analysis that the brain magnetic resonance imaging (MRI) data of 49 children with moderate to severe HIE in the Children's Hospital of Nanjing Medical University and full-term neonates with no neurological symptoms and signs during the same period were included in the control group.
| |
| − | Published translation: A total of 49 children with moderate to severe HIE admitted to the Children’s Hospital Affiliated to Nanjing Medical University were retrospectively analyzed. Cranial magnetic resonance imaging (MRI) data of 31 full-term neonates without neurological symptoms and signs who visited the hospital during the same period were recruited as the control group.
| |
| − |
| |
| − | Analysis: In the source abstract, both “中重度HIE患儿” and “足月新生儿” come from the Children’s Hospital Affiliated to Nanjing Medical University. The Google translation, however, attaches the hospital only to the 49 children with moderate to severe HIE, an error caused by misjudged subordination. The remedy is first to segment the sentence and then to reconstruct it. Here the Chinese expression “南京医科大学附属儿童医院” is bound up with many modifiers, so these must be cut away and the sentence reconstructed.
| |
| − | The Chinese expression “南京医科大学附属儿童医院” can therefore be moved into a sentence of its own, leaving two short sentences instead of one long sentence loaded with modifiers.
| |
| − | Pre-edited source abstract: 对49例中重度HIE患儿以及31例无神经系统症状体征的足月新生儿的MRI资料进行回顾性分析。这些患者收治于南京医科大学儿童附属医院。
| |
| − | Translation after pre-editing: The MRI data of 49 children with moderate to severe HIE and 31 full-term newborns without neurological symptoms and signs were retrospectively analyzed. These patients were admitted to the Children’s Hospital of Nanjing Medical University.
| |
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===3.3 Explication of Subject through Voice Changing===
| |
| − | Subjectless sentences are quite common in Chinese abstracts, whereas English prefers the passive voice to keep sentences objective and accurate. To accommodate this characteristic of English, it helps machine translation to identify the subject and recast the sentence in the passive voice (Wang Yan 2008). The first step is to find the “doee” (the receiver of the action) in the sentence; the second is to restructure the sentence so that the “doee” precedes the verb; the last is to make the passive relation between the noun and the verb explicit.
| |
| − | Example 3
| |
| − | Source abstract: 回顾性分析2018年1月至2020年12月解放军总医院第七医学中心收治的500例老年髋部骨折患者的资料。
| |
| − | Google translation: A retrospective analysis of the data of 500 elderly patients with hip fractures admitted to the Seventh Medical Center of the PLA General Hospital from January 2018 to December 2020.
| |
| − | Published translation: From January 2018 to December 2020, the data of 500 elderly patients with hip fracture treated in the Seventh Medical Center of PLA General Hospital were analyzed retrospectively.
| |
| − | Analysis: In the Google translation, “回顾性分析” is rendered as a noun phrase, although in the source abstract it describes an action. The mistake arises because the machine cannot recognize Chinese subjectless sentences. The source abstract can be revised to supply a subject. The first step is to find the “doee”, here “500例老年髋部骨折患者的资料”. The next is to place it before the verb, that is, at the beginning of the sentence. Finally, the passive voice should be made explicit in the sentence.
| |
| − | Pre-edited source abstract: 500例老年髋部骨折患者的资料被回顾性分析,他们在2018年1月至2020年12月期间收治于解放军总医院第七医学中心。
| |
| − | Translation after pre-editing: The data of 500 elderly hip fracture patients were retrospectively analyzed. They were admitted to the Seventh Medical Center of the PLA General Hospital between January 2018 and December 2020.
| |
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===3.4 Relocation of Modifiers===
| |
| − | Modifiers, including nominal, adverbial and attributive ones, are used constantly in Chinese medical papers to add information to the subject and object of a sentence. The resulting sentences are complicated, and machine translation has difficulty recognizing their structure when translating from Chinese into English. For the structure to be recognized, the sentence needs to be simplified. The key is to relocate the modifiers so that they stand as independent units. In other words, the modifiers should be placed before or after the main part of the sentence, in keeping with common English usage.
| |
| − | In Chinese, modifiers can usually be placed only in front of the head word, whereas in English their position is very flexible: they may be placed before or after the main part of the sentence with an adjective or a connective noun.
| |
| − |
| |
| − | Example 4
| |
| − | Source abstract: 利用DNA重组技术以pET-28a表达系统在E.coli BL21(DE3) 中重组表达Hepl。
| |
| − | Google translation: Hepl is recombined in E.coli BL21 (DE3) using DNA recombination technology with pET-28a expression system.
| |
| − | Published translation: The recombinant Hcpl protein was expressed by using DNA recombination technology through pET-28a expression system in E. coli BL21 (De3).
| |
| − | Analysis: The Chinese sentence contains two modifiers, “利用DNA重组技术” and “以pET-28a表达系统”. Machine translation has to handle both constituents, so they deserve particular attention. The Google translation above clearly misplaces the two modifiers and renders them inaccurately. For the sentence to be translated correctly, it must be divided so that the machine translation software can recognize the source abstract.
| |
| − |
| |
| − | Pre-edited source abstract: 利用DNA重组技术,Hcp1重组表达在E.coli BL21 (DE3)中, 通过pET-28a表达系统在E. coli BL21 (DE3)中。
| |
| − | Google translation after pre-editing: Using DNA recombination technology, Hcp1 recombination is expressed in E.coli BL21 (DE3), via pET-28a expression system in E. coli BL21 (DE3).
| |
| − | The second approach is to make modifiers independent, that is, to reconstruct them into clauses that modify the head word. Unlike English, Chinese has no subordinate clauses, so in Chinese-English translation redundant Chinese modifiers need to be reconstructed into subordinate clauses to suit English, especially in sentences with multiple modifiers. Otherwise the sentence structure may become confused, for example with the modifiers of a head word scattered across the sentence. To avoid this, the modifiers should be separated from the main part of the sentence and recast as clauses, and sentences should be kept simple and easy to understand.
| |
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===3.5 Proper Omission and Deletion of Category Words===
| |
| − | Category words are commonly used in Chinese. They complement the meaning of other words and include items such as “problem”, “position”, “situation” and “job”. Adding such a word accords with Chinese usage, but in many cases it carries no real meaning and can be omitted in translation. Category words are rare in English, which is one of the differences between the two languages. There are many kinds of category words. In view of the objectivity and fixed structure of medical language, redundant category words should be deleted in Chinese-English translation. In general, the category words most often found in Chinese medical abstracts are "process", "behavior" and "situation". These words hinder conversion between Chinese and English and result in redundancy.
| |
| − |
| |
| − | Example 5
| |
| − | Source abstract: 探讨口腔白斑病癌变进程中的缺氧应答基因及相关微小RNA (miRNA) 的表达。
| |
| − | Google translation: The expression of hypoxia response genes and associated microRNAs in the process of oral leukoplakia cancer was discussed.
| |
| − | Published translation: To study the hypoxia response gene and microRNA (miRNA)expression profiles in the pathogenesis and progression of oral leukoplakia (OLK).
| |
| − | Analysis: In the example above, the Chinese word “进程” is a category word. The expression “癌变” already implies a process, so “进程” overlaps with it in meaning and can be deleted before the text is fed into the machine translation software. In translation its place can be taken by the preposition “during”.
| |
| − |
| |
| − | Pre-edited source abstract: 探讨口腔白斑病癌变中的缺氧应答基因及相关微小RNA (miRNA) 的表达。
| |
| − | Google translation: The expression of hypoxia responsive genes and miRNA in oral leukoplakia cancer was investigated.
| |
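Deleting a redundant category word can likewise be sketched in Python. The table below is a hypothetical illustration rather than a verified linguistic resource: it pairs a head word with the category words that the head word is assumed to imply already, and the name `drop_category_words` is invented for this sketch. A category word is removed only when it directly follows such a head word, so other uses of the same word are left alone.

```python
import re

# Hypothetical table: head word -> category words it is assumed to imply.
REDUNDANT = {
    "癌变": ["进程", "过程"],
}


def drop_category_words(source: str, redundant: dict[str, list[str]] = REDUNDANT) -> str:
    """Delete a category word only where it directly follows its head word."""
    for head, category_words in redundant.items():
        alternatives = "|".join(re.escape(word) for word in category_words)
        pattern = re.escape(head) + "(?:" + alternatives + ")"
        # Keep the head word and drop the category word that follows it.
        source = re.sub(pattern, lambda m, h=head: h, source)
    return source


source = "探讨口腔白斑病癌变进程中的缺氧应答基因及相关微小RNA (miRNA) 的表达。"
print(drop_category_words(source))
```

Restricting deletion to listed head-word pairs is a deliberately conservative choice: removing every occurrence of a word such as “进程” would also strip it from sentences where it carries real meaning.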
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===Conclusion===
| |
| − | After years of development, machine translation has made great progress. Its accuracy has improved greatly in both text recognition and sentence pattern conversion. However, machine translation has its own limitations: it relies on parallel corpora as reference sources for its accuracy. For ESP texts in particular, high-quality machine translation is harder to achieve.
| |
| − |
| |
| − | As a type of research writing, medical abstracts are characterized by fixed language structures, objectivity and accuracy (Qin Yi 2004:421-423). Medical translation must therefore be accurate, objective and understandable to meet the specific demands of the medical paper. Because medicine is an important field of human society, medical paper translation is in great demand, which in turn calls for a great deal of human labor. As machine translation advances, however, medical papers can be translated more efficiently by combining human and machine effort. The improvement and development of machine translation require the joint efforts of computer science, information science, statistics, linguistics and other disciplines to achieve more mature human-computer cooperative translation (Li Yafei, Zhang Ruihua 2019:38-45).
| |
| − |
| |
| − | However, errors can occur in Chinese-English machine translation because of the differences between Chinese and English and the way the machine processes text. Linguistic and grammatical errors can greatly affect the output. After the errors have been classified and identified, several pre-editing approaches are put forward to make machine translation more accurate and readable: extraction and replacement of terms in the source text, explication of subordination, explication of the subject through voice changing, relocation of modifiers, and proper omission and deletion of category words.
| |
| − |
| |
| − |
| |
| − | This paper focuses on pre-editing for machine translation, using medical papers as a case study. It examines the errors that occur in the machine translation of medical abstracts and proposes pre-editing approaches to address them. The quality of machine translation of medical papers improves greatly once the pre-editing methods are applied. However, machine translation is not as flexible and accurate as the human brain, so it is important to combine pre-editing and post-editing with machine translation in order to produce more accurate and more objective translations of medical papers.
| |
| − |
| |
| − | After years of development, machine translation has made great progress. The accuracy of machine translation has been greatly improved in both text recognition and sentence pattern conversion. However, machine translation has its own limitations. In other words, it needs to rely on the parallel corpus as a reference source for improving its accuracy. ESP text, in particular, is harder to get the high quality by machine translation.
| |
| − |
| |
| − | --[[User:Cai Zhufeng|Cai Zhufeng]] ([[User talk:Cai Zhufeng|talk]]) 13:48, 13 December 2021 (UTC) corrected by Cai Zhufeng
| |
| − |
| |
| − | ===References===
| |
| − |
| |
| − | Cui Qiliang崔启亮 (2014). 论机器翻译的译后编辑[J] ''On Post-Editing of Machine Translation''. 中国翻译 Chinese Translators Journal, 035(006): 68-73.
| |
| − |
| |
| − | Feng Quangong, Gao Lin冯全功,高琳 (2017). 基于受控语言的译前编辑对机器翻译的影响[J] ''Influence of Pre-editing Based on Controlled Language on Machine Translation''. 当代外语研究Contemporary Foreign Language Research,(2): 63-68+87+110.
| |
| − |
| |
| − | Gerlach J, et al (2013). ''Combining Pre-editing and Post-editing to Improve SMT of User-generated Content''[M]// Proceedings of MT Summit XIV Workshop on Post-editing Technology and Practice. 45-53.
| |
| − |
| |
| − | Hu Qingping胡清平(2005). 机器翻译中的受控语言[J] ''Controlled Language in Machine Translation''. 中国科技翻译 Chinese Science and Technology Translation, (03): 24-27.
| |
| − |
| |
| − | Lian Shuneng连淑能 (2010). 英汉对比研究增订本[M]''An Updated Version of English-Chinese Contrastive Studies'' . 北京:高等教育出版社Beijing: Higher Education Publishing House. 35-36.
| |
| − |
| |
| − | Li Yafei, Zhang Ruihua黎亚飞,张瑞华 (2019). 机器翻译发展与现状[J]''The Development and Current Situation of Machine Translation''. 中国轻工教育 China Light Industry Education, (5):38-45.
| |
| − |
| |
| − | Qin Yi秦毅(2004),从翻译基本标准议医学英语的翻译[J] ''On the Translation of Medical English from the Basic Standard of Translation''. 遵义医学院学报 Journal of Zunyi Medical College,27 (4): 421-423.
| |
| − |
| |
| − | Slype G V & Guinet J F & Seitz F (1984). ''Better Translation for Better Communication'' [M] . Oxford: Pergamon Press Ltd (U.K.). 90-93
| |
| − |
| |
| − | O'Brien S (2002). ''Teaching Post-editing: A Proposal for Course Content'' [EB/OL]. http://mt-archive.info/EAMT-2002-0brien.pdf.
| |
| − |
| |
| − | Tytler, A. F. (1978). ''Essay on the Principles of Translation''[M]. Amsterdam: John Benjamins Publishing. 118-119.
| |
| − |
| |
| − | Wang Yan王燕 (2008). 医学英语翻译与写作教程[M] ''Medical English Translation and Writing Course''. 重庆:重庆大学出版社 Chongqing: Chongqing University Press. 60-61
| |
| | | | |
| | Written by --[[User:Chen Huini|Chen Huini]] ([[User talk:Chen Huini|talk]]) 04:58, 15 December 2021 (UTC)Chen Huini | | Written by --[[User:Chen Huini|Chen Huini]] ([[User talk:Chen Huini|talk]]) 04:58, 15 December 2021 (UTC)Chen Huini |
| | | | |
| − | =12 蔡珠凤 The Mistranslation of C-J Machine Translation of Political Statements= | + | =Chapter 12 The Mistranslation of C-J Machine Translation of Political Statements= |
| | | | |
| | 机器翻译中政治发言中译日的误译 | | 机器翻译中政治发言中译日的误译 |
| Line 357: |
Line 89: |
| | | | |
| | [[Machine_Trans_EN_12]] | | [[Machine_Trans_EN_12]] |
| − | ===Abstract===
| |
| − | Language is the main way of communication between people. With the continuous development of globalization, the scale of cross-border exchanges is also expanding. However, due to cultural differences and diversity, the languages of different countries and regions are very different, which seriously hinders people's communication. The demand for efficient and convenient translation tools is increasing. At the same time, with the development of network technology and artificial intelligence, recognition technology based on deep learning is more and more widely used in English, Japanese and other fields.
| |
| | | | |
| − | ===Key words=== | + | =Chapter 13 Study on Post-editing from the Perspective of Functional Equivalence Theory= |
| − | machine translation; political statements; mistranslation of C-J machine translation
| |
| − | ===题目===
| |
| − | The Mistranslation of C-J Machine Translation of Political Statements
| |
| − | ===摘要===
| |
| − | 语言是人与人之间交流的主要方式。随着全球化的不断发展,跨境交流的规模也在不断扩大。然而,由于文化的差异和多样性,不同国家和地区的语言差异很大,这严重阻碍了人们的交流。对高效便捷的翻译工具的需求正在增加。同时,随着网络技术和人工智能的发展,基于深度学习的识别技术在英语、日语等领域的应用越来越广泛。
| |
| | | | |
| − | ===关键词===
| + | 陈湘琼, Hunan Normal University |
| − | 机器翻译;政治发言;政治发言中译日的误译
| |
| − | ===1. Introduction===
| |
| − | ===Introduction to machine translation===
| |
| − | Machine translation, also known as automatic translation, is the process of using computers to convert one natural language (the source language) into another (the target language). It is a branch of computational linguistics and one of the ultimate goals of artificial intelligence, and it has important scientific research value. Machine translation also has important practical value: with the rapid development of economic globalization and the Internet, it plays an increasingly important role in promoting political, economic and cultural exchanges. Its development has closely accompanied that of computer technology, information theory, linguistics and other disciplines. From early dictionary matching, to rule-based translation combining dictionaries with the knowledge of linguistic experts, and then to corpus-based statistical machine translation, machine translation has gradually stepped out of the ivory tower as computing power has improved and multilingual information has grown explosively, and it now provides real-time, convenient translation services for general users. (Zhang 2019:5-6)
| |
| | | | |
| − | === C-J machine translation software=== | + | [[Machine_Trans_EN_13]] |
| − | Today's online machine translation software includes Baidu translation, Tencent translation, Google translation, Youdao translation, Bing translation and so on. Google was the first company to launch a machine translation system, and Baidu was the first company to introduce one in China. Tencent and Youdao have also attracted much attention. Machine translation is the process of using computers to convert one natural language into another, and it usually refers to sentence and full-text translation between natural languages. To keep improving translation quality, developers have added artificial intelligence technologies such as speech recognition, image processing and deep neural networks to traditional rule-based, statistical and example-based machine translation. As machine translation is used more widely, cooperation between human translation and machine translation will also increase significantly. What criteria should be used to evaluate the quality of machine translation? In this evolving field, there is an urgent need to clarify which problems have been solved and which remain unsolved. (Lv 1996:3)
| + | =Chapter 14 Machine Translation a Challenge for Human Translators= |
| | | | |
| − | ===The history of machine translation===
| + | Bi bi Nadia, Hunan Normal University |
| − | The research history of machine translation can be traced back to the 1930s and 1940s. In the early 1930s, the French scientist G. B. Artsrouni put forward the idea of using machines for translation. In 1933, the Soviet inventor P. P. Troyanskii designed a machine to translate one language into another and registered his invention on September 5 of the same year; however, because of the low technical level of the 1930s, his translation machine was never built. In 1946, ENIAC, the first modern electronic computer, was born. Shortly afterwards, in 1947, W. Weaver, an American scientist and pioneer of information theory, and A. D. Booth, a British engineer, put forward the idea of automatic language translation by computer while discussing the range of applications of electronic computers. In 1949, Weaver published his memorandum "Translation", which formally put forward the idea of machine translation. Over more than 60 years of ups and downs, machine translation has followed a long and tortuous development path. The academic community generally divides it into the following four stages:
| |
| | | | |
| − | Pioneering period (1947-1964)
| + | [[Machine_Trans_EN_14]] |
| − | | + | =Chapter 15 Machine Translation: Advantage or Disadvantage for the Human Translator= |
| − | In 1954, with the cooperation of IBM, Georgetown University completed the first Russian-English machine translation experiment on an IBM 701 computer, demonstrating the feasibility of machine translation to the public and the scientific community and opening the prelude to machine translation research.
| |
| − | China was not late in starting this research. As early as 1956, the state included it in the national plan for the development of scientific work, under the topic "machine translation, the construction of natural language translation rules and the mathematical theory of natural language". In 1957, the Institute of Linguistics and the Institute of Computing Technology of the Chinese Academy of Sciences cooperated on a Russian-Chinese machine translation experiment, translating nine different types of relatively complex sentences. From the 1950s to the first half of the 1960s, machine translation research was on the rise. The two superpowers, the United States and the former Soviet Union, provided substantial financial support for machine translation projects for military, political and economic purposes, and European countries also paid considerable attention to machine translation research because of geopolitical and economic needs, so machine translation boomed for a time. Although machine translation was only in its pioneering stage in this period, it had already entered an optimistic period of prosperity.
| |
| − | | |
| − | Frustrated period (1964-1975)
| |
| − | | |
| − | In 1964, in order to evaluate the progress of machine translation research, the US National Academy of Sciences established the Automatic Language Processing Advisory Committee (ALPAC) and began a two-year comprehensive investigation, analysis and test. In November 1966, the committee published a report entitled "Language and Machines" (the ALPAC report for short), which comprehensively denied the feasibility of machine translation and recommended stopping financial support for machine translation projects. The publication of this report dealt a blow to the booming field, and machine translation research came to a standstill. During the same period, the ten-year Cultural Revolution broke out in China, and this research largely stagnated there as well. Machine translation entered a depression.
| |
| − | | |
| − | Recovery period (1975-1989)
| |
| − | | |
| − | Since the 1970s, with the development of science and technology and the increasingly frequent exchange of scientific and technological information among countries, language barriers between countries have become more serious. Traditional manual translation was far from meeting the demand, and there was an urgent need for computers to take part in translation. At the same time, the development of computer science and linguistics, especially the substantial improvement of computer hardware and the application of artificial intelligence to natural language processing, promoted the recovery of machine translation research at the technical level. Machine translation projects began to develop again, and various practical and experimental systems were launched one after another, such as the Weidner system, the EUROTRA multilingual translation system and the TAUM-METEO system. In China, machine translation research was put back on the agenda after the end of the ten-year Cultural Revolution, and the "784" project gave it considerable attention. After the mid-1980s, the development of machine translation research in China accelerated further. Two English-Chinese machine translation systems, KY-1 and MT/EC863, were successfully developed, indicating that China had made great progress in machine translation technology. (Chen 2016:5)
| |
| − | | |
| − | New period (1990-present)
| |
| − | | |
| − | With the universal adoption of the Internet, the acceleration of world economic integration and increasingly frequent exchanges in the international community, traditional manual translation is far from meeting the rapidly growing demand. Demand for machine translation has increased unprecedentedly, and machine translation has been given a new development opportunity. International conferences on machine translation research have been held frequently, and China has made unprecedented achievements. A series of machine translation software products have been launched, such as "Yixing", "Yaxin", "Tongyi" and "Huajian". Driven by market demand, commercial machine translation systems have become practical, entered the market and reached users. Since the turn of the century, with the popularization of the Internet, the amount of data has increased sharply and statistical methods have been fully applied. Internet companies have set up machine translation research groups and developed machine translation systems based on Internet big data, such as "Baidu translation" and "Google translation", making machine translation truly practical. In recent years, with the progress of deep learning, machine translation technology has developed further, which has rapidly improved translation quality and made translation in spoken-language and other domains more natural and fluent. (Liu 2014:6)
| |
| − | | |
| − | ===The problem of machine translation at present===
| |
| − | Errors are inevitable
| |
| − | Many people misunderstand machine translation. They think that it deviates greatly from the original and cannot help people solve any problems. In fact, errors are inevitable. Machine translation relies on linguistic principles: the machine automatically recognizes the grammar, calls up the stored lexicon and performs the corresponding translation. Errors arise when grammar, morphology or syntax change or become irregular, as in the line "give me a reason to kill you first" from the film A Chinese Odyssey (Dahua Xiyou), in which the adverbial is placed after the verb. After all, a machine is a machine, with no special feeling for language. How can it feel the lasting charm of "the tenderness of a lowered head, like the shyness of a water lotus"? The meaning of Chinese varies greatly with changes in morphology, grammar, syntax and context. Even many Chinese speakers are left completely baffled, let alone machines. (Liu 2014:3)
| |
| − | | |
| − | Bottleneck
| |
| − | In fact, no matter which method, the biggest factor affecting the development of machine translation lies in the quality of translation. Judging from the achievements, the quality of machine translation is still far from the ultimate goal.
| |
| − | The Chinese mathematician and linguist Zhou Haizhong pointed out in his paper "Fifty Years of Machine Translation" that improving the quality of machine translation first requires solving problems of language itself rather than of programming, and that relying on a few programs alone will certainly not improve it. He also pointed out that machine translation cannot achieve "faithfulness, expressiveness and elegance" as long as human beings do not understand how the brain performs fuzzy recognition and logical judgment of language. This view may reveal the bottleneck restricting translation quality.
| |
| − | It is worth mentioning that the American inventor and futurist Ray Kurzweil predicted in an interview with the Huffington Post that the quality of machine translation will reach the level of human translation by 2029. This claim is still widely disputed in academic circles. (Cui 2019:4)
| |
| − | | |
| − | ===2. Mistranslation in Chinese-Japanese machine translation===
| |
| − | ===2.1 Vocabulary mistranslation in Chinese-Japanese machine translation===
| |
| − | ===2.1.1 Mistranslation of proper nouns===
| |
| − | In political materials, there are often political dignitaries' names, place names or a large number of proper nouns in the political field. The morphemes of such words are definite and inseparable. Mistranslation will make the source language lose its specific meaning.
| |
| − | | |
| − | Japanese translation into Chinese Chinese translation into Japanese
| |
| − |
| |
| − | original text translation by Youdao reference translation original text translation by Youdao reference translation
| |
| − | | |
| − | 朱鎔基 朱基 朱镕基 栗战书 栗戰史書 栗戰書
| |
| − |
| |
| − | 労安 劳安 劳安 李克强 李克強 李克強
| |
| − | | |
| − | 筑紫哲也 筑紫哲也 筑紫哲也 习近平 習近平 習近平
| |
| − |
| |
| − | 山口百惠 山口百惠 山口百惠 韩正 韓中 韓正
| |
| − |
| |
| − | 田中角栄 田中角荣 田中角荣 王沪宁 王上海氏 王滬寧
| |
| − |
| |
| − | 東条英機 东条英社 东条英机 汪洋 汪洋 汪洋
| |
| − |
| |
| − | 毛沢东 毛泽东 毛泽东 赵乐际 趙樂南 趙樂際
| |
| − |
| |
| − | トウ・ショウヘイ 大酱 邓小平 江泽民 江沢民 江沢民
| |
| − |
| |
| − | 周恩来 周恩来 周恩来
| |
| − | | |
| − | クリントン 克林顿 克林顿
| |
| − | | |
| − | The table above lists 18 personal names from the two texts, 7 of which were mistranslated by the machine. As for the causes, "战书", "戰史" and "史書" all exist as common words in Chinese, and "沪" is the abbreviation of Shanghai. Because "战书" and "沪" are ordinary nouns in their own right, they disrupt the machine's choice of target-language words. Apart from katakana interference (トウ・ショウヘイ), most of the mistranslations involve members of the new Standing Committee, which shows that the machine translation system does not update its lexicon in time. Unlike other words, personal names are special. For members of the Standing Committee of the Political Bureau of the CPC Central Committee in particular, the translation of their names is officially standardized. The same holds for the names of other public figures, such as political dignitaries, celebrities and well-known hosts. In addition, the machine translates Japanese surnames written in katakana, such as "タニモト" (谷本) and "アンドウ" (安藤), as "塔尼莫特" and "龙胆". The fact that Japanese is written in a mixture of Chinese characters and kana thus directly affects the quality of Japanese-Chinese machine translation. Japanese speakers also often write pronunciations in katakana. For example, when speaking with Premier Zhu, the Japanese participants often use "ニーハウ" (Hello). The machine, however, recognizes only the two kana "ニ" and "ハ" and transliterates them as "尼哈", ignoring the long sounds after "ニ" and "ハ". (Guan 2018:10-12)
| |
| − | | |
| − | original text Translation by Youdao reference translation
| |
| − | | |
| − | 日美安全体制 日米の安全体制 日米安保体制
| |
| − | | |
| − | 中国共产党第十九次全国代表大会 中国共産党第19回全国代表大会 中国共産党第19回全国代表大会(第19回党大会)
| |
| − | | |
| − | 十八大 十八大 第18回党大会中国特色社会主义
| |
| − |
| |
| − | 中国特色社会主義 中国の特色ある社会主義 第18回党大会
| |
| − | | |
| − | 中国共产党中央委员会 中国共産党中央委員会 中国共産党中央委員会
| |
| − | | |
| − | 中国共産党中央委員会十八届中共中央政治局常委 第18代中国共產党中央政治局常務委員 第18期中共中央政治局常務委員
| |
| − | | |
| − | 十八届中共中央政治局委员 18期の中国共產党中央政治局委員 第18期中共中央政治局委員
| |
| − | | |
| − | 十九届中共中央政治局常委 十九回中国共產党中央政治局常務委員 第19期中央政治局常務委員
| |
| − | | |
| − | 中共十九届一中全会 中国共產党第十九回一中央委員会 第19期中央委員会第1回全体会議
| |
| − | | |
| − | The table above compares the original and translated versions of some proper nouns. As the table shows, the mistranslations mainly involve mismatched numeral + classifier combinations, wrong addition of the case particle "の", missing connecting elements, mistranslated abbreviations, overly literal translation and errors in the written form of Chinese characters. Each is analyzed below.
| |
| − | | |
| − | "十八届中共中央政治局常委", "十八届中共中央政治局委员", "十九届中共中央政治局常委" and "中共十九届一中全会" all contain the classifier "届", which is variously translated as "代", "期" and "回". The meaning becomes vague; it should be translated uniformly as "期". The translations of the last three proper nouns also lack the ordinal prefix "第". The case particle "の" was wrongly added in the translations of "十八届中共中央政治局委员" and "日美安全体制". "中国共产党中央委员会" was not written in Japanese character forms; simplified Chinese characters were used directly.
| |
| − | | |
| − | The full name of "中共十九届一中全会" is "中国共产党第十九届中央委员会第一次全体会议", in which "一" stands for "第一次" and is an ordinal rather than a cardinal number. Machine translation does not produce a correct translation. The fundamental reason is that the machine translation system has no rules for translating such expressions. If corresponding rules were formulated (for example, 中共m1届m2中全会 → 第m1期中央委員会第m2回全体会議), the translation system should be able to translate any plenary session of the CPC Central Committee correctly, whatever its number. (Guan 2018:6-7)
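The rule proposed above can be sketched as a single pattern substitution. This is only an illustration under stated assumptions: the function names `cn_to_int` and `translate_plenum` are hypothetical, the numeral converter handles 1 to 99 only, and nothing here reflects how Youdao or any other system is actually implemented.

```python
import re

# Digits used in session numbers; this sketch covers 1 to 99 only.
CN_DIGITS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
             "六": 6, "七": 7, "八": 8, "九": 9}


def cn_to_int(numeral: str) -> int:
    """Convert a Chinese numeral such as 十九 or 三 to an integer (hypothetical helper)."""
    if "十" in numeral:
        tens, _, ones = numeral.partition("十")
        return (CN_DIGITS[tens] if tens else 1) * 10 + (CN_DIGITS[ones] if ones else 0)
    return CN_DIGITS[numeral]


# Pattern for the rule 中共 m1 届 m2 中全会 taken from the text above.
PLENUM = re.compile(r"中共([一二三四五六七八九十]+)届([一二三四五六七八九十]+)中全会")


def translate_plenum(text: str) -> str:
    """Rewrite every plenary-session abbreviation into the Japanese full form."""
    def render(match: re.Match) -> str:
        session = cn_to_int(match.group(1))   # m1: number of the Central Committee
        meeting = cn_to_int(match.group(2))   # m2: ordinal of the plenary session
        return f"第{session}期中央委員会第{meeting}回全体会議"
    return PLENUM.sub(render, text)


print(translate_plenum("中共十九届一中全会"))
```

For the example discussed above, the sketch yields 第19期中央委員会第1回全体会議, which matches the reference translation in the table. A production system would need many such rules and a fuller numeral parser.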
| |
| − | | |
| − | ===2.1.2 Mistranslation of polysemy===
| |
| − | original text Translation by Youdao reference translation
| |
| − | | |
| − | スタジオ 摄影棚/工作室 直播现场/演播厅
| |
| − | | |
| − | 日中関係の話 中日关系的故事 就中日关系(话题)
| |
| − | | |
| − | 溝 水沟 鸿沟
| |
| − | | |
| − | それでは日中の問題について質問のある方。 那么对白天的问题有提问的人。 关于中日问题的话题,举手提问。
| |
| − | | |
| − | 私たちのクラスは20人ちょっとですが、 我们班有20人左右, 我们班二十多人的意见统一很难,
| |
| − | | |
| − | いろいろな意見が出て、まとめるのは大変です。 但是又各种各样的意见,总结起来很困难。
| |
| − | | |
| − | 一体どうやって、13億人もの人をまとめているんですか。 到底是怎么处理13亿的人的呢? 中国是怎样把13亿人凝聚在一起的?
| |
| − | | |
| − | In the original text, the word "スタジオ" appears four times and is translated as "摄影棚" or "工作室", but the context is a live interview, not a photography or film production site. "話" has several meanings, such as "说话", "事情" and "道理". Following the format of the interview program, Premier Zhu Rongji was invited to answer five questions on the topic of China-Japan relations, so the word "故事" is clearly out of place. The native Japanese word "日中" means "晌午、白天", but it is also the abbreviation of the names of the two countries, and the machine failed to choose correctly according to the context. "溝" refers to the gap between China and Japan, not a "水沟". The first "まとめる" appears together with "意见" and describes how difficult it is to unify everyone's opinions. The second, "まとめている", likewise refers to uniting 1.3 billion people, not to "处理". It is therefore more appropriate to use "统一" and "凝聚", as in the human translation, rather than "总结" or "处理". (Zhang 2019:5)
| |
| − | | |
| − | The mistranslation of polysemous words has always been a difficult problem in machine translation research. Each word may be translated correctly in isolation, yet the result often differs greatly from what the original expresses in context. Zhang Zheng once noted that ambiguity is a common phenomenon in natural language: the same linguistic form may have different meanings, which is also one of the differences between natural and artificial languages. One of the difficulties faced by machine translation is therefore disambiguation (Zhang Zheng, 2005:60). To address this, all the meanings of a polysemous word can be marked, and the appropriate one selected according to its common collocations with other words. At the same time, the machine's ability to recognize context should be strengthened so that translations inconsistent with the current context are avoided. In this way, errors such as "大酱" appearing where a well-known figure such as Deng Xiaoping should be named in the political text above could be avoided. (Wang 2020:7-9)
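The collocation-based selection suggested above might look like the following minimal sketch. The sense inventory `SENSES`, its collocate sets and the function `choose_sense` are all hypothetical and invented for illustration; the input is assumed to be already segmented into tokens, and real systems would use much larger lexicons or statistical models.

```python
# Hypothetical sense inventory: each candidate translation is paired with
# collocates that favour it. The entries are illustrative, not from a real lexicon.
SENSES = {
    "日中": [
        ("中日", {"関係", "問題", "両国", "友好"}),
        ("白天", {"気温", "夜", "仕事"}),
    ],
    "溝": [
        ("鸿沟", {"日中", "両国", "関係", "埋める"}),
        ("水沟", {"道路", "水", "掃除"}),
    ],
}


def choose_sense(word: str, context_tokens: list[str]) -> str:
    """Pick the translation whose collocates overlap most with the context.

    Ties are resolved in favour of the sense listed first.
    """
    context = set(context_tokens)
    best_translation, _ = max(SENSES[word], key=lambda sense: len(sense[1] & context))
    return best_translation


# The interview sentence about "日中の問題" selects the country-pair reading.
print(choose_sense("日中", ["それでは", "日中", "の", "問題", "について"]))
```

Counting shared collocates is the simplest possible scoring; it is used here only to make the idea concrete, and the choice of tie-breaking by list order is likewise an assumption of the sketch.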
| |
| − | | |
| − | ===2.1.3 Mistranslation of multi-category words===
| |
| − | Multi-category words are words with two or more parts of speech, also described as "same word, different classes".
| |
| − | | |
| − | original text Translation by Youdao reference translation
| |
| − | | |
| − | 1998年江泽民主席曾经访问日本, 1998年の江沢民国家主席の日本訪問し、 1998年、江沢民総書記が日本を訪問し、かつ
| |
| − | | |
| − | 同已故小渊首相签署了联合宣言。 かつて同じ故小渕首相が署名した共同宣言 亡くなられた小渕総理と宣言に調印されました
| |
| − | | |
| − | Chinese "同" has two parts of speech, preposition and conjunction, while "故" has many, including noun, verb, adjective and conjunction. This directly affects the machine's judgment at the source-language level. In the table above, "同" is used as a conjunction indicating the other party of a joint act, and "故" is used as a verb meaning "死亡" that modifies "小渊首相". The machine treats "同" as an adjective and "故" as the noun "原因", which leads to a confused structure and unclear semantics in the translation. Unlike polysemous words, multi-category words have at least two parts of speech, and each part of speech often has more than one meaning. Software developers should therefore take multi-category words fully into account, so that the translation system tags the part of speech first and then selects the translation from the context and the role of the word in the sentence. Of course, this function is difficult to implement and will require considerable ingenuity from developers. (Guan 2018:9-12)
| |
| − | | |
| − | ===2.2 Syntactic mistranslation in Chinese-Japanese machine translation===
| |
| − | ===2.2.1 Mistranslation of tenses===
| |
| − | original text:History is written by the people, and all achievements are attributed to the people.
| |
| − | | |
| − | Translation by Youdao:歴史は人民が書いたものであり、すべての成果は人民のためである。
| |
| − | | |
| − | reference translation:歴史は人民が綴っていくものであり、すべての成果は人民に帰することとなります。
| |
| − | | |
| − | The original sentence means "历史是人民书写的历史" or "历史是人民书写的东西". When it is translated into Japanese, the formal noun "もの" must be supplied to follow Japanese usage, and both the machine and the interpreter did this correctly. However, the machine's reading of "的" is biased, which leads to an error of tense. Past, present and future all become "history", so the writing of history is a continuous action. The form "~ていく" both shows that the action is continuous and matches the tense and meaning of the original. The machine's treatment of the preposition "于" is also inappropriate: in the original, "人民" is the recipient of the action, not its purpose. Chinese is an isolating language in which function words play an important role in expressing meaning, so their translation should be a focus of machine translation research. (Zuo 2021:8)
| |
| − | | |
| − | original text:李克强同志是十六届中共中央政治局常委,其他五位同志都是十六届中共中央政治局委员。
| |
| | | | |
| − | Translation by Youdao:李克強総理は第16代中国共産党中央政治局常務委員であり、他の5人の同志はいずれも16期の中国共産党中央政治局委員である。
| + | Mariam Touré, Hunan Normal University |
| − | | |
| − | reference translation:李克強同志は第16期中国共産党中央政治局常務委員を務め、他の5人は第16期中共中央政治局委員を務めました。
| |
| − | | |
| − | The judgment verb "是" in the original plays its most basic affirmative role, but because of the complexity of Chinese, "是" often cannot simply be converted into "だ / である" in translation. This sentence is a judgment about something that has already happened. Compared with the 19th session, the 18th session has become history. This temporal information is implicit, and it is very difficult for a machine to recognize it. "中共中央政治局常委" is the abbreviation of "中共中央政治局常务委员会委员" and denotes a position. In Japanese, positions are usually used with verbs such as "務める" and "担当する". The translator uses the past tense of "務める", which conforms to Japanese usage and handles the tense at the same time. Machine translation, however, mechanically renders the sentence as a judgment sentence and fails to convey the past-time information implied by "十八届". (Guan 2018:4)
| |
| − | | |
| − | ===2.2.2 Mistranslation of honorifics===
| |
| − | Honorific language is a linguistic means of showing respect to the listener. Unlike Japanese, Chinese has no grammatical category of honorifics and no fixed grammatical forms for expressing respect, humility and politeness. Instead, specific words such as "您", "请" and "劳驾" are used to express various kinds of respect or humility. (Yang 2020:5-9)
| |
| − | | |
| − | Original text translation by Youdao reference translation
| |
| − | | |
| − | 女士们,先生们,同志们,朋友们 さんたち、先生たち、同志たち、お友达さん ご列席の皆さん
| |
| − | | |
| − | 谢谢大家! ありがとうございます! ご清聴ありがとうございました。
| |
| − | | |
| − | これはどうされますか。 这是怎么回事呢? 您将如何解决这一问题?
| |
| − | | |
| − | こうした問題をどうお考えでしょうか。 我们会如何考虑这些问题呢? 您如何看待这一问题?
| |
| − |
| |
| − | For example, in handling "女士们,先生们,同志们,朋友们", the machine makes the translation correspond to the original item by item so as to keep the format unchanged, but the result is wrong in both meaning and pragmatic convention. In formal speeches, Chinese tends to address the audience comprehensively and in detail. Japanese, by contrast, usually opens with a general term such as "皆様", "ご臨席の皆さん" or "代表団の方々". In Japanese-Chinese translation, the machine also failed to recognize Japanese honorific forms such as "される" and "お考え". Chinese-Japanese machine translation thus has a low ability to handle honorific expressions. Because the occasions in question are formal, a good translation should not only be fluent in meaning but also conform to the conventions of the target language and suit the setting. (Che 2021:3-7)
| |
| − | | |
| − | ===Conclusion===
| |
| − | In discussing the mistranslations of machine translation, this paper mainly cites those of Youdao translator. When studying the problem, I also used other translation software such as Google and Baidu for parallel comparison and found that they have problems similar to Youdao's. For example, the Chinese ABAC-structure phrase "新世界新未来" is translated by Baidu as "新し世界の新しい未来", which, like Youdao's output, is not rendered as a parallel phrase. Google online translation renders "全心全意" as the completely wrong "全面的に". The sentence "我吃了很多亏", an example of a language-specific expression, is translated by Google as "私はたくさんの損失を食べました": because "吃" and "亏" are not adjacent in the original, the machine cannot recognize that they belong together, treats "亏" as a kind of food and translates the predicate as "食べました". For the Japanese "こうした問題をどうお考えでしょうか", Google's translation is "你如何看待这些问题?", in which "你" cannot reflect the Japanese honorific tone, while Baidu's is "你怎么想这样的问题呢?", which can be understood but is non-standard colloquial language. Google and Baidu also made the same mistakes as Youdao in translating proper nouns; for example, they translated "十七届(中共中央政治局常委)" as "第17回...". In short, mistranslations similar to Youdao's are also common in Baidu and Google. Because of space constraints, they are not listed one by one here. (Cui 2019:7)
| |
| − |
| |
| − | Liu Yongquan of the Institute of Linguistics, Chinese Academy of Social Sciences, pointed out (1997) that machine translation is, in the final analysis, a linguistic problem. Although corpus-based machine translation does not require much linguistic knowledge, language expression is ever-changing, and machine translation based solely on statistical methods cannot avoid the quality problems caused by a lack of language rules. Building on the analysis of lexical, syntactic and other mistranslations, and from the perspective of the characteristics of the source language, this paper summarizes some difficulties that Chinese and Japanese pose for machine translation (Liu 2014:8).
| |
| − | | |
| − | (1) The difficulties of Chinese in machine translation
| |
| − | | |
| − | Chinese is a typical isolating language: relationships between words are expressed through word order and function words. Attention must be paid to how function words such as "的", "在", "向" and "了" are rendered. Some special verbs, such as "是", "做" and "作", are used very widely, and translating them in a way that conforms to Japanese pragmatic habits and expressions is a difficulty in machine translation research. In terms of word order, Chinese is basically "subject→predicate→object". At the same time, Chinese favors parataxis: clauses cohere through meaning rather than through explicit connectives, which increases the difficulty of Chinese-Japanese machine translation. When a complex long sentence contains multiple verbs or modifiers, machine translation usually cannot divide the sentence into its components accurately, and the resulting translation is completely unreadable (Guan 2018:6-12).
| |
| − | | |
| − | (2) Difficulties of Japanese in machine translation
| |
| − | | |
| − | Both Chinese and Japanese use Chinese characters, and most systems produce the target text by mapping Chinese characters to their counterparts. However, kana play an important role in Japanese in determining part of speech and meaning, so the structure and semantics of a sentence cannot be judged from the Chinese characters alone. Japanese vocabulary consists of Sino-Japanese words, native Japanese words and loanwords, and the first two have a great impact on Chinese-Japanese machine translation. For example, "提出" is translated as "提出する" or "打ち出す", and "重视" is translated as "重視する" or "大切にする". Japanese is an agglutinative language and makes heavy use of connectives such as "けど", "が" and "けれども". Some of these mark contrast or progression, but others are merely sequential and need no translation, and translation software often cannot tell the two apart (Che 2021:10).
| |
| − | | |
| − | ===References===
| |
| − | [1]Navroz Kaur Kahlon;Williamjeet Singh.(2021(prepublish))Machine translation from text to sign language: a systematic review[J].Universal Access in the Information Society,1-35.
| |
| − | | |
| − | [2] Cao Qianyu;Hao Hanmei,(2021);Ahmed Syed Hassan.A Chaotic Neural Network Model for English Machine Translation Based on Big Data Analysis[J].Computational Intelligence and Neuroscience,3274326-3274326.
| |
| − | | |
| − | [3]Hwang Yongkeun;Kim Yanghoon;Jung Kyomin.(2021)Context-Aware Neural Machine Translation for Korean Honorific Expressions[J].Electronics,10(13):1589-1589.
| |
| − | | |
| − | [4]Zakaryia Almahasees.(2021)Analysing English-Arabic Machine Translation:Google Translate, Microsoft Translator and Sakhr.
| |
| − | | |
| − | [5](2021)Machine learning in translation[J].Nature Biomedical Engineering,5(6):485-486.
| |
| − | | |
| − | [6]Shaimaa Marzouk.(2021(prepublish))An in-depth analysis of the individual impact of controlled language rules on machine translation output: a mixed-methods approach[J].Machine Translation,1-37.
| |
| − |
| |
| − | [7]Welnitzová Katarína;Munková Daša.(2021)Sentence-structure errors of machine translation into Slovak[J].Topics in Linguistics,22(1):78-92.
| |
| − | | |
| − | [8]Xu Xueyuan.(2021).Machine learning-based prediction of urban soil environment and corpus translation teaching[J].Arabian Journal of Geosciences,14(11).
| |
| − | | |
| − | [9]Chen Bingchang 陈丙昌(2016).機械翻訳の誤訳分析【D】.Error analysis of machine translation.贵州大学.2016(05)
| |
| − | | |
| − | [10]Lv Yinqiu 呂寅秋(1996).機械翻訳の言語規則と伝統文法との相違点.【D】Differences between the language rules of machine translation and traditional grammar.日本学研究.Japanese Studies.1996(00):21-22
| |
| − | | |
| − | [11]Liu Jun 刘君(2014).基于语料库的中日同形词词义用法对比及其日中机器翻译研究【D】.A Corpus-based Comparison of the Meanings of Chinese and Japanese Homographs and Research on Japanese-Chinese Machine Translation.广西大学.(03)
| |
| − | | |
| − | [12]Cui Qianqian 崔倩倩(2019).机器翻译错误与译后编辑策略研究【D】.Research on Machine Translation Errors and Post-Editing Strategies.北京外国语大学.(09)
| |
| − | | |
| − | [13]Zhang Yi 张义(2019).机器翻译的译文分析【D】.Translation analysis of machine translation.西安外国语大学.(10)
| |
| − | | |
| − | [14]Zhang Linjing 张琳婧(2019).在线机器翻译中日翻译错误原因及对策【D】.Causes and countermeasures of online machine translation errors in Chinese-Japanese translation.山西大学.(02)
| |
| − |
| |
| − | [15]Wang Dan 王丹(2020).基于机器翻译的专利文本译后编辑对策研究【D】.Research on countermeasures for post-translational editing of patent texts based on machine translation.大连理工大学.(06)
| |
| − |
| |
| − | [16]Yang Xiaokun 杨晓琨(2020).日中机器翻译中的前编辑规则与效果验证【D】.Pre-editing rules and effect verification in Japanese-Chinese machine translation.大连理工大学.(06)
| |
| − |
| |
| − | [17]Zuo Jia 左嘉(2021). 机器翻译日译汉误译研究【D】. Research on Mistranslation of Machine Translation from Japanese to Chinese.北京第二外国语学院.
| |
| − | | |
| − | [18]Guan Biying 关碧莹(2018).关于政治类发言的汉日机器翻译误译分析【D】.Analysis of Chinese-Japanese Machine Translation Mistranslations of Political Speeches.哈尔滨理工大学.
| |
| − | | |
| − | [19]Che Tong 车彤(2021).汉译日机器翻译质量评估及译后编辑策略研究【D】.Research on Quality Evaluation of Chinese-Japanese Machine Translation and Post-translation Editing Strategies.北京外国语大学.(09)
| |
| − | | |
| − | Web Links
| |
| − | | |
| − | http://www.elecfans.com/rengongzhineng/692245.html
| |
| − | | |
| − | https://baike.baidu.com/item/%E6%9C%BA%E5%99%A8%E7%BF%BB%E8%AF%91/411793
| |
| − | | |
| − | =Chapter 13 陈湘琼 Chen Xiangqiong (Study on Post-editing from the Perspective of Functional Equivalence Theory)=
| |
| − | [[Machine_Trans_EN_13]]
| |
| − | | |
| − | =Chapter 14 Bi bi Nadia(Machine Translation a Challenge for Human Translators)=
| |
| − | [[Machine_Trans_EN_14]]
| |
| | | | |
| − | Bi bi Nadia, Hunan Normal University, China
| + | [[Machine_Trans_EN_15]] |