Machine translation

From China Studies Wiki

Machine Translation - A challenge or a chance for human translators?

Overview Page of Machine Translation

30 Chapters(0/30)



1 卫怡雯(A Comparison Between the Quality of Machine Translation and Human Translation: A Case Study of the Application of Artificial Intelligence in Sports Events)

Machine_Trans_EN_1

2 吴映红(The Introduction of Machine Translation)

Machine_Trans_EN_2

3 肖毅瑶(On the Realm Advantages and Symbiotic Development of Machine Translation and Human Translation)

Machine_Trans_EN_3

4 王李菲 (Comparison Between Neural Machine Translation of Netease and Traditional Human Translation—A Case Study of The Economist Articles)

Machine_Trans_EN_4

Abstract

Machine translation is a subfield of artificial intelligence and natural language processing that studies the automatic conversion of a source language into a target language. Neural machine translation, a newer method based on the sequence-to-sequence model, has raised the quality and accuracy of translation to a new level. As one of the earliest companies in China to invest in machine translation, Netease launched its neural machine translation service in 2017. It uses a neural network structure to encode sentences, imitating the working mechanism of the human brain, and generates translations that are more professional and better suited to the target-language context. Taking articles from The Economist as its corpus, this paper compares Netease neural machine translation with human translation in order to explore the types and causes of common errors and the advantages and challenges of each approach. It closes with a forecast of future development trends and a summary.

Key words

Neural Machine Translation; Human Translation; Contrastive Analysis

题目

有道神经网络机器翻译与传统人工翻译的译文对比——以经济学人语料为例

摘要

机器翻译研究将源语言所表达的语义自动转换为目标语言的相同语义,是人工智能和自然语言处理的重要研究分区。在此基础上,一种基于序列到序列模型的全新机器翻译方法——神经机器翻译的出现让译文的质量和准确度提升到了新的层次。网易作为国内最早投身机器翻译的公司之一,在2017年上线的神经网络翻译采用了独到的神经网络结构,模仿人脑的工作机制对句子进行编码,生成的译文更具专业性,也更符合目的语语境。本文以经济学人内的文章为分析语料,旨在通过对网易神经机器翻译和人工翻译的英汉译文进行对比分析,探究常见错误类型及生成原因,以及各自存在的优势与挑战,最后展望未来发展趋势,并对本文做出总结。

关键词

神经网络翻译;人工翻译;对比分析

1. Introduction

1.1 Neural Machine Translation

Recent years have witnessed the rapid development of neural machine translation (NMT), which has replaced traditional statistical machine translation (SMT) as the mainstream technique and plays a crucial part in many fields, such as business, academia and industry.

Traditional SMT works more like a mechanical system. It consists of several components, including phrase conditions, partial conditions, sequential conditions, primitive models, and so on. Each module has its own function and goal, and the translation is produced by mechanically splicing their outputs together. Its main disadvantage is that the model captures little syntactic and semantic information, so it runs into problems with language pairs that differ greatly in syntax, such as Chinese and English. The result is sometimes unreadable even though it is "word-for-word".

Compared with SMT, an NMT model is more like an organism. All of its parameters can be adjusted and optimized toward the same goal, which makes their combination and interaction more organic and the overall translation better. At its core is deep learning, which imitates the working mechanism of the human brain and uses a neural network structure to model the whole translation process. The model is composed of a large number of "neurons"; each neuron completes a simple task, and a much better translation emerges from their coordinated work.

Because NMT puts more emphasis on context and on the text as a whole, it produces content that is more coherent and comprehensible to readers than traditional SMT, and it has been widely accepted and used in various fields in a very short time.


1.2 Business English Translation

Economic globalization has accelerated markedly, and considerable resources are being poured into business. Business English, a branch of English as a global language, was proposed under the theoretical framework of English for Specific Purposes (ESP). It serves international business activities, a professional field that requires specialized English.

By the general standard, business English can be divided into two categories: English for General Business Purposes and English for Specific Business Purposes. Under this standard, business English is closely tied to formal economic activities, which gives rise to different functional variants, such as legal English, practical writing English, advertising English and so on. In a narrow definition, business English includes at least the following three types: 1) texts such as commercial advertising, company profiles and product descriptions; 2) texts related to cross-cultural communication, ranging from communication between business people to job hunting; 3) texts connected with the world economy, international trade, finance, securities and investment, marketing, management, logistics and transport, contracts and agreements, insurance and arbitration.

The Economist is an international news and business weekly offering clear coverage, commentary and analysis of global politics, business, finance, science and technology. The large amount of terminology and the polysemy in its texts pose a tricky problem for both machine translation and human translation.

2.

3.

4.

5.

Conclusion

References

5 杨柳青

Machine_Trans_EN_5

6 徐敏赟(Machine Translation Based on Neural Network --Challenge or Chance?)

Machine_Trans_EN_6

Abstract

With the acceleration of economic globalization, there is a growing demand for translation services. In recent years, with the rapid development of neural networks and deep learning, the quality of machine translation has been significantly improved. Compared with human translation, machine translation has the advantages of low cost and high speed.

Neural machine translation brings both convenience and pressure to translators. Based on the principles of neural machine translation, this paper will objectively analyze the advantages and disadvantages of neural machine translation, and discuss whether neural machine translation is a chance or a challenge for human translators.

Key words

Neural network; Deep learning; Machine translation; human translation

题目

基于神经网络的机器翻译 --机遇还是挑战?

摘要

随着经济全球化进程的加速,人们对翻译服务的需求也越来越大。近些年来,神经网络和深度学习等技术得到快速发展,机器翻译的质量得到了显著提高,较人工翻译来讲有着成本低,速度快等优势。

基于神经网络的机器翻译给翻译工作者带来便捷的同时,同时也给翻译工作者们带来了一定的压力。本文将会从神经机器翻译的原理出发,客观分析基于神经网络的机器翻译中存在的一些优势与劣势,并以此来探讨机器翻译对于翻译工作者来说到底是机遇还是挑战这一问题。

关键词

神经网络;深度学习;机器翻译;人工翻译;

Introduction

Machine Translation, a branch of Natural Language Processing (NLP), refers to the process of using a machine to automatically translate a natural language (source language) sentence into another language (target language) sentence (Li Mu, Liu Shujie, Zhang Dongdong, Zhou Ming 2018, 2). The natural language here refers to human language used daily (such as English, Chinese, Japanese, etc.), which is different from languages created by humans for specific purposes (such as computer programming languages).

According to statistics, there are about 5,600 human languages in existence. China is a big family of 56 ethnic groups, and some ethnic minorities have their own languages and scripts. Some other countries, owing to their colonial history, have multiple official languages, so official documents often need to be written in two or more languages. In the context of the “One Belt, One Road” initiative, communication across languages has become an important part of building a community with a shared future for mankind. The application of machine translation technology can therefore help promote national unity, communication between different languages and cross-cultural communication.

Although the latest machine translation method, neural machine translation, has advantages such as speed and low cost, machine translation is still far from being as effective as human translation. Li Yao (Li Yao 2021, 39) took Chronicle of a Blood Merchant, a classic work by Yu Hua, in the translation by Andrew F. Jones, together with the versions produced by Baidu Translation, Youdao Translation and Google Translation, as a corpus for a comparative study of translation quality. Starting from the development of machine translation, Jin Wenlu (Jin Wenlu 2019, 82) analyzed the advantages and disadvantages of machine translation and human translation and discussed whether machine translation can replace human translation. To further explore the impact of machine translation on translators, this paper takes neural machine translation, the latest machine translation method, as an example and discusses whether machine translation is a chance or a challenge for translators.

Comparison of different machine translation methods

The development of machine translation methods has gone through four stages: rule-based methods, instance-based (example-based) methods, statistical machine translation and neural machine translation. At present, thanks to the application of deep learning methods, neural machine translation has become the mainstream. Compared with statistical machine translation, neural machine translation has the following advantages:

1) End-to-end learning does not rely on too many prior assumptions. In the era of statistical machine translation, model design made a number of assumptions about the translation process. Phrase-based models, for example, assume that both the source and the target sentence are segmented into sequences of phrases, with some alignment between them. Such assumptions have both advantages and disadvantages. On the one hand, they draw on relevant concepts from linguistics and help integrate human prior knowledge into the model. On the other hand, the more assumptions there are, the more constrained the model is. If the assumptions are correct, the model can describe the problem well; if they are wrong, the model can be biased. Deep learning does not rely on prior knowledge, nor does it require manually designed features. The model learns the mapping from input to output directly (end-to-end learning), which to a certain extent avoids the deviations that assumptions can cause.

2) The continuous space model of the neural network has better representation ability. A basic problem in machine translation is how to represent a sentence. Statistical machine translation regards sentence generation as the derivation of phrases or rules, which is essentially a symbol system in discrete space. Deep learning transforms these traditional discrete representations into representations in continuous space. For example, a distributed representation in real-number space replaces the discrete lexical representation, and an entire sentence can be described as a vector of real numbers. The translation problem can therefore be described in continuous space, which greatly alleviates the curse of dimensionality that affects traditional discrete-space models. More importantly, a continuous space model can be optimized by gradient descent and similar methods, which have good mathematical properties and are easy to implement.

Principles of Neural Machine Translation

Word Representation

Machine translation is a branch of natural language processing, so one of the first things to decide is how to represent the individual words in a sentence. We start by building a vocabulary, also called a dictionary: a list of the words that will be used in our representations. We can then use a one-hot representation for each word in a sentence, in which each word is represented as a long vector. The dimension of this vector is the size of the vocabulary; all elements are 0 except for a single 1, whose position identifies the current word.

For example, “tiger” is represented as [0, 0, 0, 0, 1, 0, 0, 0, 0 …] and “panda” is represented as [0, 1, 0, 0, 0, 0, 0, 0, 0 …]. Counting positions from 0, we can therefore label “tiger” as 4 and “panda” as 1. However, one-hot representation has a critical problem: any two words are isolated from each other, and it is impossible to tell from the two vectors whether the words are related. The key idea that addresses this is a different way of representing words, called word embeddings. In deep learning, what is commonly used is not the one-hot representation just mentioned but a distributed representation, often called a word representation or word embedding. Such a vector looks something like this: [0.123, −0.258, −0.762, 0.556, −0.131 …]. Its dimension is far smaller than that of a one-hot vector.
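The contrast between the two representations can be sketched in a few lines of Python. This is a minimal illustration rather than a real model: the six-word vocabulary, its ordering, the embedding dimension of 3 and the random embedding values are all assumptions made for the example (a trained model would learn the embedding values).

```python
import math
import random

# Toy vocabulary; the words and their order are assumptions for illustration.
vocab = ["apple", "panda", "salad", "students", "tiger", "zebra"]
word_to_index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    """Vector of vocabulary size with a single 1 at the word's index."""
    vector = [0.0] * len(vocab)
    vector[word_to_index[word]] = 1.0
    return vector

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Dense embedding table: one short real-valued row per word.
# Random stand-in values; a real system learns them from data.
random.seed(0)
embedding_dim = 3
embedding_table = [[random.uniform(-1, 1) for _ in range(embedding_dim)]
                   for _ in vocab]

def embed(word):
    """Look up the dense vector for a word."""
    return embedding_table[word_to_index[word]]

# Distinct one-hot vectors are orthogonal, so this is always 0.0:
one_hot_similarity = dot(one_hot("tiger"), one_hot("panda"))
# Dense vectors can express graded similarity between words:
dense_similarity = cosine_similarity(embed("tiger"), embed("panda"))
```

With one-hot vectors every pair of distinct words has similarity 0, whereas dense vectors allow a graded similarity score, which is what lets related words end up close together once the embeddings are trained.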

Word embeddings often let algorithms automatically capture analogies such as "man is to woman as king is to queen", and many others. How word embeddings are learned is a separate question; what matters here is that these dense feature vectors represent words better than one-hot vectors do.

If words are represented as one-hot vectors, certain tasks, such as building language models, may run into the curse of dimensionality (Bengio 2003, 1137–1155). Lower-dimensional feature vectors do not have this problem. In practice, applying high-dimensional feature vectors to deep learning makes the computational complexity almost unacceptable, which is another reason low-dimensional feature vectors are popular. In my opinion, the biggest contribution of word embedding is that it places related or similar words closer together.

Recurrent Neural Network Language Model

Language modeling is one of the basic and important tasks in natural language processing. It is also one that Recurrent Neural Networks (RNNs) do very well. A language model built with RNNs is called a Recurrent Neural Network Language Model (RNNLM).

What is a language model? Suppose we are building a speech recognition system and we hear a sentence that could be either “The apple and pear salad was delicious.” or “The apple and pair salad was delicious.” Which did we hear? As humans, we would judge that we heard the first. That is also what a good speech recognition system should output, even though the two sentences sound almost identical. The way to get a speech recognition system to choose the first sentence is to use a language model that can calculate the probability of each sentence.

For example, the model might calculate 𝑃(The apple and pair salad) = 2.6 × 10⁻¹³, whereas 𝑃(The apple and pear salad) = 4.3 × 10⁻¹⁰. Comparing the two values, the “pear” sentence is more than 1,000 times more likely than the “pair” sentence, which is why the speech recognition system is able to choose the correct one.

What the language model does is tell you the probability of a particular sentence. It is a fundamental component both of speech recognition systems, as just mentioned, and of machine translation systems, which should output only sentences that are likely. The basic job of a language model is to take a sentence as input and estimate the probability of that particular sequence of words.
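To make the idea of "the probability of a sentence" concrete, here is a minimal sketch that uses a count-based bigram model with add-one smoothing. This is a deliberately simpler technique than the RNN-based model discussed in this section; the four-sentence corpus and the `<BOS>`/`<EOS>` marker names are assumptions for illustration, and the resulting numbers are not those quoted above.

```python
from collections import Counter

# Toy training corpus (assumed for illustration; real models use far more text).
corpus = [
    "the apple and pear salad was delicious",
    "the pear salad was fresh",
    "she bought a pair of shoes",
    "the apple was fresh",
]

BOS, EOS = "<BOS>", "<EOS>"
context_counts = Counter()
bigram_counts = Counter()
for line in corpus:
    tokens = [BOS] + line.split() + [EOS]
    context_counts.update(tokens[:-1])                  # how often each word is a context
    bigram_counts.update(zip(tokens[:-1], tokens[1:]))  # how often each pair occurs

# Every token that can follow a context: all corpus words plus the end marker.
vocabulary = {word for line in corpus for word in line.split()} | {EOS}

def sentence_probability(sentence):
    """P(sentence) as a product of add-one-smoothed bigram probabilities."""
    tokens = [BOS] + sentence.split() + [EOS]
    probability = 1.0
    for previous, current in zip(tokens[:-1], tokens[1:]):
        numerator = bigram_counts[(previous, current)] + 1
        denominator = context_counts[previous] + len(vocabulary)
        probability *= numerator / denominator
    return probability

p_pear = sentence_probability("the apple and pear salad was delicious")
p_pair = sentence_probability("the apple and pair salad was delicious")
```

Because "and pear" and "pear salad" occur in the toy corpus while "and pair" and "pair salad" do not, `p_pear` comes out larger than `p_pair`, which is the kind of preference a speech recognizer or translation system relies on.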

How do we build a language model with an RNN? First of all, we need a training set comprising a large corpus of English text, or text in whatever language we want to model. The figure shows the architecture of the RNNLM described by Mikolov (Mikolov 2012). As the picture shows, the RNNLM predicts the probability of the word in the 5th position based on the previous 4 words, so this model is more likely to output the sentence “The students opened their books”.

RNNLM.jpg
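The forward pass of such a model can be sketched as follows. This is only an illustration of the data flow, assuming a simple recurrent update `h = tanh(x + W_hidden · h)`; the six-word vocabulary, the hidden size and the randomly initialised weights are all hypothetical, so the untrained model will not actually prefer "books".

```python
import math
import random

random.seed(0)
vocab = ["the", "students", "opened", "their", "books", "laptops"]
word_to_index = {word: i for i, word in enumerate(vocab)}
hidden_size = 4

def random_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Untrained parameters (assumption: a real RNNLM learns these from a corpus).
W_embed = random_matrix(len(vocab), hidden_size)    # one embedding row per word
W_hidden = random_matrix(hidden_size, hidden_size)  # hidden-to-hidden weights
W_output = random_matrix(len(vocab), hidden_size)   # hidden-to-vocabulary weights

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_word_distribution(context_words):
    """Read the context one word at a time, then score every vocabulary word."""
    hidden = [0.0] * hidden_size
    for word in context_words:
        x = W_embed[word_to_index[word]]
        recurrent = matvec(W_hidden, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(x, recurrent)]
    return softmax(matvec(W_output, hidden))

# Probability of each candidate for the 5th position given the first 4 words.
distribution = next_word_distribution(["the", "students", "opened", "their"])
```

Training would adjust the three weight matrices so that the probability assigned to the word that actually follows each context in the corpus becomes as high as possible.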

Encoders and Decoders

In machine translation, the source-language sentence and the target-language sentence generally differ in length, so ordinary language models cannot meet the needs of machine translation. In 2014, researchers (Sutskever et al. 2014) (Cho et al. 2014) designed new models called sequence-to-sequence models, which are also often called encoder-decoder models.

First, we build a network, called the encoder network, out of RNNs, and feed in the source-language sentence one word at a time. After ingesting the input sequence, the encoder outputs a vector that represents it. We then build a decoder network, which takes as input the encoding produced by the encoder network. The decoder can be trained to output the translation one word at a time until it eventually outputs the end-of-sequence (end-of-sentence) token.

As shown in the figure below, the upper and lower dotted boxes represent the encoder and the decoder of neural machine translation respectively, the S and T sequences represent the source-language and target-language sentences respectively, a special token marks the end of the sentence, and the small circles represent feature vectors and the hidden layers of the neural network.

Encoder and decoder model

However, the model also has problems. The main one is that only a constant-length vector C can be used to represent the entire source-language sentence. In other words, regardless of its length, the source sentence can only be encoded as a fixed-length vector C. For a long sentence, the information carried by C is diluted or even lost. In addition, training on long sequences suffers from vanishing or exploding gradients.
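The encoder-decoder flow and the fixed-length bottleneck can be sketched as follows. This is a minimal illustration with untrained random weights, not the architecture of the cited papers: the token names (`S1`, `T1`, and `<EOS>` as the end-of-sentence marker), the hidden size and the greedy decoding loop are assumptions made for the example.

```python
import math
import random

random.seed(1)
source_vocab = ["S1", "S2", "S3", "S4"]
target_vocab = ["T1", "T2", "T3", "<EOS>"]  # "<EOS>" is an assumed marker name
hidden_size = 5

def random_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Untrained parameters; a real system learns them from parallel text.
E_src = random_matrix(len(source_vocab), hidden_size)
W_enc = random_matrix(hidden_size, hidden_size)
E_tgt = random_matrix(len(target_vocab), hidden_size)
W_dec = random_matrix(hidden_size, hidden_size)
W_out = random_matrix(len(target_vocab), hidden_size)

def encode(source_tokens):
    """Read the source one token at a time; return the final hidden state C."""
    hidden = [0.0] * hidden_size
    for token in source_tokens:
        x = E_src[source_vocab.index(token)]
        hidden = [math.tanh(a + b) for a, b in zip(x, matvec(W_enc, hidden))]
    return hidden

def decode(context, max_length=10):
    """Greedily emit one target token at a time until <EOS> or max_length."""
    hidden = context
    previous = None
    output = []
    for _ in range(max_length):
        if previous is None:
            x = [0.0] * hidden_size
        else:
            x = E_tgt[target_vocab.index(previous)]
        hidden = [math.tanh(a + b) for a, b in zip(x, matvec(W_dec, hidden))]
        scores = matvec(W_out, hidden)
        previous = target_vocab[scores.index(max(scores))]
        if previous == "<EOS>":
            break
        output.append(previous)
    return output

c_short = encode(["S1", "S2"])
c_long = encode(["S1", "S2", "S3", "S4"] * 10)  # 40 tokens
translation = decode(c_short)
```

Both `c_short` and `c_long` have exactly `hidden_size` numbers, even though the second sentence is twenty times longer, which is the bottleneck described above.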

Attention Model

To solve the above problems, researchers (Junczys-Dowmunt et al. 2016) introduced an attention model that can dynamically capture context. The attention model is an improvement on the traditional neural network model. The basic idea is that each target-language word is related to only a few of the source-language words and has little to do with the rest. By using a bidirectional recurrent neural network to improve the representation of the source language, a vector representation containing global information can be generated for each source word.

As the picture shows, in the attention model the encoder uses a forward recurrent neural network to encode the source-language sentence from left to right, generating one set of hidden states, and then uses a backward recurrent neural network to generate another set of hidden states. Finally, the two sets of hidden states at corresponding time steps are concatenated to produce a new set of hidden states (Li Mu et al. 2018, 165), which serve as the source feature vectors. The advantage is that each source-language feature vector contains context information from both its left and its right. The concatenation of the two therefore carries information about the entire source sentence and can be used as the vector representation of the whole sentence.

Attention model.jpg

There are attention weights between the encoder hidden states and the decoder hidden states, and these weights are obtained by comparing the hidden states of the encoder and the decoder. The attention model uses the weights to form a weighted sum of all the encoder hidden states, which gives the source-sentence context vector C’ at that moment. For example, when the attention model is generating the target word “T2”, and “T2” is most relevant to the source word “S1” but has little or no relevance to the other words, the weight between “T2” and “S1” will be large and the other weights will be small. Repeating this step for every target word finally yields the target sentence.
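The weighting step can be sketched as below. The sketch assumes that the "comparison" between decoder and encoder states is a plain dot product followed by a softmax; this is one common choice, not necessarily the scoring function used in the cited work, and the three-dimensional hidden states are hand-picked, hypothetical values.

```python
import math

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(decoder_state, encoder_states):
    """Return (weights, context): one weight per source position and the
    weighted sum of the encoder hidden states."""
    # Assumed scoring function: dot product of decoder and encoder states.
    scores = [sum(d * e for d, e in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Hypothetical encoder hidden states for source words S1, S2, S3.
encoder_states = [
    [1.0, 0.0, 0.0],  # S1
    [0.0, 1.0, 0.0],  # S2
    [0.0, 0.0, 1.0],  # S3
]
# Hypothetical decoder state while generating T2; it resembles the S1 state.
decoder_state_t2 = [4.0, 0.5, 0.5]

weights, context = attention(decoder_state_t2, encoder_states)
```

Here the weight on S1 dominates and the weights on S2 and S3 are small, so the context vector C’ used to generate T2 is built almost entirely from the S1 hidden state.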

Drawbacks of Neural Machine Translation

As a new technology, neural machine translation still has many problems. Recent comprehensive studies mainly cover: improvement of the attention mechanism, integration of priors and constraints, model training and fusion, construction of new models and architectures, and evaluation of neural machine translation. Research on improving the quality of neural machine translation mainly addresses the following problems:

1) Because computational space is limited, not every word can be kept in the word embeddings, so out-of-vocabulary words become a problem when training models.

2) Because language resources are scarce and training corpora are insufficient, translating minority languages is extremely difficult.

3) Because the model is sensitive to sentence length, it tends to produce short outputs, which makes long and difficult sentences hard to translate.

Translation of Out-of-Vocabulary Words

In the decoding process, neural machine translation needs to normalize the probability distribution over the whole target vocabulary, which consumes a lot of time and space and has high computational complexity. To control this overhead, model training generally uses only high-frequency words, limited to between 30,000 and 80,000. All other words, often called out-of-vocabulary words, are uniformly replaced by the <UNK> symbol. The <UNK> symbol has several consequences. The semantic structure of parallel sentences in the training set is damaged, which affects the quality of the model parameters. Source-language sentences are hard to represent correctly, and the risk of ambiguity increases. Last but not least, unknown words appear in the target language, damaging the quality and readability of the output. Moreover, language changes rapidly in the modern information society: old words take on new meanings, new words continue to emerge, and named entities appear frequently. The translation of out-of-vocabulary words is therefore one of the basic topics in neural machine translation research and is of great significance for improving translation quality.
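The vocabulary cut-off can be sketched as follows. The helper names `build_vocabulary` and `replace_rare_words`, the three-sentence corpus and the limit of 4 words are hypothetical; real systems apply the same idea with limits in the tens of thousands.

```python
from collections import Counter

def build_vocabulary(sentences, max_size):
    """Keep only the max_size most frequent words of the training data."""
    counts = Counter(word for sentence in sentences for word in sentence.split())
    return {word for word, _ in counts.most_common(max_size)}

def replace_rare_words(sentence, vocabulary, unk="<UNK>"):
    """Replace every word outside the vocabulary with the unknown-word symbol."""
    return " ".join(word if word in vocabulary else unk
                    for word in sentence.split())

# Toy training data (assumed for illustration).
training_sentences = [
    "the bank raised interest rates",
    "the bank raised rates",
    "the central bank cut rates",
]

vocabulary = build_vocabulary(training_sentences, max_size=4)
encoded = replace_rare_words("the bank raised tariffs", vocabulary)
```

The word "tariffs" never appears in the training data, so it is mapped to `<UNK>` and the model has no way to recover what it meant, which is the loss of information described above.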

Out-of-vocabulary word translation is a difficult issue in current neural machine translation research. There are two ways to alleviate it. One is word-granularity processing, which mainly works by replacing out-of-vocabulary words. Although this method can reduce sentence ambiguity and improve the quality of the translation and of the model parameters, its replacements for low-frequency and polysemous words are not accurate enough, so in certain cases it cannot deal with the problem effectively. The other, currently the most popular, is sub-word and character granularity processing, which mainly addresses word segmentation in paratactic languages and inflection in hypotactic languages by reducing translation granularity and data sparsity. This method needs no separate module for out-of-vocabulary words; it decomposes them into sub-words and characters, which is simple, effective and widely used. However, fine-grained lexical segmentation may change semantic information and increase the number of tokens in a sentence, the risk of ambiguity, and the difficulty of training.
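The decomposition idea can be illustrated with a greedy longest-match segmenter that falls back to single characters. This is a simplified stand-in, not one of the learned segmentation algorithms used in practice, and the function name `segment` and the hand-written sub-word vocabulary are assumptions for the example.

```python
def segment(word, subword_vocabulary):
    """Split a word into the longest known sub-words, left to right,
    falling back to single characters when nothing longer matches."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest candidate first and shrink it one character at a time.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in subword_vocabulary or end == start + 1:
                pieces.append(piece)
                start = end
                break
    return pieces

# Hypothetical sub-word inventory; real systems learn it from the training corpus.
subword_vocabulary = {"trans", "lat", "ion", "un", "able", "s"}

known_word = segment("translations", subword_vocabulary)
unseen_word = segment("untranslatable", subword_vocabulary)
no_match = segment("xyz", subword_vocabulary)
```

A word that was never seen as a whole, such as "untranslatable", is still covered by known pieces, so no `<UNK>` symbol is needed. The cost is visible too: one word becomes four tokens, and a word with no known pieces degenerates into single characters.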

Translation of Minority Languages

According to statistics, there are about 5,600 languages in the world. Current neural network models, built for major languages, will not be able to cope with the growing translation needs of the big data era. It is therefore of great practical value to improve the translation performance of neural network models when resources are scarce. Similar to statistical machine translation, neural machine translation is a data-driven translation model, and its performance depends heavily on the scale, quality and breadth of the parallel corpus. An artificial neural network has a huge number of parameters, and neural machine translation significantly surpasses traditional statistical machine translation only when the training corpus reaches a certain size (Zoph et al. 2016, 1568). In reality, however, apart from some vertical fields in major languages with relatively rich resources, most minority languages and vertical fields still lack large-scale, high-quality parallel corpora with broad coverage. How to use existing resources to alleviate the translation problems of minority languages is therefore a focus of current research. At present, the latest methods for dealing with this problem mainly include zero-resource methods, data augmentation, and diverse learning methods.

The zero-resource method is one effective way to alleviate the problem of neural machine translation for minority languages. It works as follows: suppose there are three languages A, B and C, and we want to translate between A and C, but no parallel corpus exists between them. If the parallel corpus for A and B is sufficient, and so is the parallel corpus for B and C, then B can be chosen as a pivot to realize translation between A and C.
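The pivot idea amounts to composing two translation systems. In the sketch below the two "systems" are toy word-by-word dictionary lookups standing in for trained A-to-B and B-to-C models; the helper names, the dictionaries and the choice of German, English and French as A, B and C are all hypothetical.

```python
def make_word_translator(dictionary):
    """Build a toy word-by-word translator from a bilingual dictionary.
    Words missing from the dictionary are passed through unchanged."""
    def translate(sentence):
        return " ".join(dictionary.get(word, word) for word in sentence.split())
    return translate

def pivot_translate(sentence, source_to_pivot, pivot_to_target):
    """Translate A -> C by going through the pivot language B."""
    return pivot_to_target(source_to_pivot(sentence))

# Stand-ins for two trained systems: A = German, B = English (pivot), C = French.
a_to_b = make_word_translator({"guten": "good", "morgen": "morning"})
b_to_c = make_word_translator({"good": "bon", "morning": "matin"})

result = pivot_translate("guten morgen", a_to_b, b_to_c)
```

No A-to-C data is used anywhere. A practical consequence of this design is that any error made in the A-to-B step is passed on to the B-to-C step, so the quality of the pivot translation depends on both systems.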

The data augmentation method can also effectively alleviate the poor generalization that results from scarce training data. According to the currently available papers, data augmentation techniques used for neural machine translation mainly include back translation and word exchange.

Diverse learning methods, such as meta-learning, transfer learning, multi-task learning and unsupervised learning, are also an effective way to address the shortage of minority-language resources, although the latest results are mostly seen in the latter two.

Although the methods mentioned above can alleviate the translation problems of minority languages to some extent, the breadth and depth of the experiments are still limited. Whether each method is applicable to all language pairs is a subject worthy of in-depth discussion. In addition, compared with major languages, information processing technology for minority languages is often less developed. For some languages, such as Mongolian, even basic problems such as part-of-speech tagging and named entity recognition are still not well solved (Bao et al. 2018, 61). Therefore, while actively developing new training methods, we must also pay attention to improving information processing technology for minority languages.

Translation of Long and Difficult Sentences

Because the training corpus contains too few long sentences, and because recurrent neural networks have difficulty with long-term memory (Li Yachao et al. 2018, 2745), neural network models translate long and difficult sentences poorly. They have a comparative advantage over traditional statistical machine translation only for sentences of up to about 60 words (Koehn, Knowles 2017, 28); beyond this limit, quality drops sharply. The encoder-decoder model based on the attention mechanism can dynamically capture context information, ease the problem of passing information over long distances, and improve the performance of neural machine translation. Even so, because natural language has a complex structure, the attention model cannot properly focus on all the information in the source sentence. In the translation of long and difficult sentences there will therefore be mistranslations such as over-translation and under-translation. Although the fluency of the target language has improved, semantic fidelity remains a concern.

There are two main solutions to the problem of translating long and difficult sentences: one is to improve the model's ability to capture long-distance dependencies; the other is to adopt a divide-and-conquer strategy for long sentences. Both can reduce sensitivity to sentence length to a certain extent, but their effects still need to be improved. Given the complexity and diversity of languages, not every language lends itself to divide and conquer, so the first method is the more promising way to improve the translation of long and difficult sentences in the future.

Prospect of Neural Machine Translation

In the long term, neural machine translation, as a new technology, is in the ascendant and has a promising future. In recent years, especially since 2014, neural machine translation has made great progress and developed rapidly. It is not only outstanding in traditional text translation, but also excellent in image and speech translation. It is a machine translation model with great potential. At present, the following trends are promising, and we believe that neural machine translation will have a more brilliant future as time goes on.

Unsupervised Translation

Recurrent neural networks are suitable for processing sequential data, especially variable-length sequential data, and are the mainstream implementation of traditional neural machine translation. They are typically trained as supervised models: training depends heavily on bilingual or multilingual parallel corpora, and the scale and quality of the corpus directly limit translation performance. In reality, however, most languages have no ready-made parallel corpora, and existing text is not naturally annotated, so corpus processing such as alignment and labeling is expensive. A number of scholars have tried different methods to achieve unsupervised translation, such as the unsupervised cross-language embedding method (Artetxe et al. 2018) and the latent semantic space sharing method (Lample et al. 2018). Unsupervised translation has shown great development potential and is likely to become one of the key research directions in the future.

Multilingual Translation

Barrier-free communication has always been a dream of mankind, and word vector technology offers a possible way to realize this dream of the Tower of Babel. By mapping words into a latent semantic space and describing their features with low-dimensional continuous real-valued vectors (Li Feng et al. 2017, 610), we can not only avoid the curse of dimensionality but also improve the accuracy of semantic representation. More importantly, different languages can share not only the same semantic space but also the same attention mechanism (Firat et al. 2016, 866), which lays a good foundation for multilingual neural machine translation. Whether word embeddings can represent the vocabulary of all languages in practice remains to be verified, but in theory they do play the role of a universal language. How to use word embedding technology to improve existing neural network models so that they truly achieve barrier-free communication is a topic to be explored.

Cross-cultural Communication

What is cross-cultural communication? It covers not only interpersonal communication and information dissemination between members of society with different cultural backgrounds, but also the migration, diffusion and change of cultural elements in global society, and their impact on different groups, cultures, countries and even the human community as a whole. Language is the carrier of culture.

Nowadays, cultural differences cannot be ignored. In my opinion, the quality of translation determines the quality of cross-cultural communication. If different cultures are compared to two lands that have never communicated with each other, then translation is the bridge between them. The width and flatness of the bridge determine how many people on both sides can cross it, and the capacity and flow of the bridge are determined by the meaning that the translation can convey. Only accurate and rigorous translation grounded in an understanding of both cultures can draw in more participants and make them more enthusiastic, and so facilitate the smooth dissemination of culture.

Therefore, the ultimate goal of neural machine translation is to help translate high-quality cultural works, such as classic literature and outstanding domestic animation, so that Chinese culture can go global while absorbing excellent foreign cultures.

Impact on the Human Translation Market

With the development of neural machine translation, what impact has it had, or will it have, on the human translation market? Is neural machine translation a challenge for human translators?

Firstly, the human translation market will not shrink, just as the textile machine expanded the textile market rather than shrinking it. Some applications require only "near" translation, such as the localization of some e-commerce sites; these might have been abandoned because of the high cost of human translation, but they can survive with the help of machine translation. In the process, a new demand has emerged in the human translation market, namely post-editing. When there was only human translation, this demand did not exist; with machine translation, new business is created for human translators. Thus not only the machine translation market but also the human translation market has expanded.

Secondly, high-end human translation will not die. Excellent handmade work is still expensive today, and some translation tasks, such as the translation of poetry and literature, are creative labor. Machine translation is not sensitive to culture: different cultures have their own language systems, so the translations a machine produces may not conform to the values and specific norms of the target culture. Humans can therefore bring their unique advantages to bear on such tasks.

Finally, technological improvements often lead to improvements in efficiency, allowing people to complete their work more efficiently. The steam engine improves the efficiency of moving bricks, but it still requires a driver. The development of technology will create various positions around machine translation, such as post-editor and quality controller. In fact, computer-assisted translation (CAT) has become a compulsory course in many translation schools. Those who master these technologies will have the upper hand in the market.

Conclusion

Let us return to the question posed at the beginning of this article: is neural machine translation a chance or a challenge for human translators? Perhaps we can say that it is both.

This paper has shown that neural machine translation has advanced considerably thanks to critical methods such as deep learning, and that the quality of machine translation has greatly improved. Machine translation now has a place in the translation market. However, in my view, we should not focus too much on the impact of neural machine translation on human translation; what we should really discuss is how to combine the two types of translation service effectively.

As a new model, neural machine translation still has a long way to go. How to improve existing neural network models and make them more intelligent is a major challenge. We should therefore look at machine translation from a dialectical and developmental perspective. Humans do not need to be afraid of technology, but should learn to use it to enhance the efficiency and value of their work.

References

Li Mu et al. 李沐 等. (2018). 机器翻译 [Machine Translation]. Beijing: Higher Education Press 高等教育出版社.

Li Yao 李耀. (2021). 基于语料库的机器翻译文学作品质量研究--以《许三观卖血记》为例 [A Corpus-based Study on the Quality of Machine Translation Literary Works--Taking Chronicle of a Blood Merchant as an example]. 海外英语 Overseas English (18) 39-40+42.

Jin Wenlu 靳文璐. (2019). 机器翻译可以取代人工翻译吗? [Can machine translation replace human translation?]. 智库时代 Think Tank Times (40) 282-284.

Yoshua Bengio et al. (2003). A neural probabilistic language model. Journal of Machine Learning Research (JMLR) (3) 1137–1155.

Mikolov Tomáš. (2012). Statistical Language Models based on Neural Networks. PhD thesis, Brno University of Technology.

Sutskever et al. (2014). Sequence to sequence learning with neural networks.

Cho et al. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.

Junczys-Dowmunt et al. (2016). Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions.

Zoph et al. (2016). Transfer Learning for Low-resource Neural Machine Translation.

Bao et al. 包乌格德勒 等. (2018). 基于RNN和CNN的蒙汉神经机器翻译研究 [Mongolian-Chinese Neural Machine Translation Based on RNN and CNN]. 中文信息学报 Journal of Chinese Information Processing (8) 61.

Li Yachao et al. 李亚超 等. (2018). 神经机器翻译综述 [A Survey of Neural Machine Translation]. 计算机学报 Chinese Journal of Computers (12) 2745.

Koehn, Knowles. (2017). Six Challenges for Neural Machine Translation.

Artetxe et al. (2018). Unsupervised Neural Machine Translation.

Lample et al. (2018). Unsupervised Machine Translation Using Monolingual Corpora Only.

Firat et al. (2016). Multi-way, Multilingual Neural Machine Translation with a Shared Attention Mechanism.

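7 颜莉莉 (Artificial Intelligence and the Cultivation of Translation Talents under the Background of the Belt and Road Initiative)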

Machine_Trans_EN_7

Abstract

In the era of artificial intelligence, AI has been applied to a wide range of fields. In the field of translation, traditional translation models can no longer keep pace with the rapid development and updating of the information age. The development of machine translation has brought structural changes to the language service industry, which poses challenges to the cultivation of translation talents. Against the background of the Belt and Road Initiative, the requirements for the translation literacy of such talents are becoming higher and higher. Artificial intelligence and translation technology can be used to reform the training mode of translation talents so as to better serve the development of the times. This paper explores artificial intelligence and the cultivation of translation talents against the background of the Belt and Road Initiative. The cultivation of translation talents is moving towards comprehensive training; conversely, artificial intelligence and machine translation can also be used to improve teaching modes and teaching content, so that both sides win through cooperation.

Key words

Artificial intelligence; Machine translation; Cultivation of translation talents; Belt and Road Initiative

题目

一带一路背景下人工智能与翻译人才的培养

摘要

进入人工智能时代,人工智能被应用于各个领域。在翻译领域,传统的翻译模式已无法满足信息化时代的飞速发展和更新,机器翻译的发展给语言服务行业带来了结构性改变,这对翻译人才的培养提出了挑战。“一带一路”背景下,对翻译人才的翻译素养要求越来越高,利用人工智能和翻译技术对翻译人才培养模式进行革新,更好为时代发展服务。本文主要探究在一带一路背景下人工智能和翻译人才培养,翻译人才的培养过程中正向对人才的综合性培养,反之也可以利用人工智能和机器翻译完善教学模式和教学内容,在合作中共赢。

关键词

人工智能;机器翻译;翻译人才培养;一带一路

1. Introduction

With the development of science and technology in China, artificial intelligence has greatly improved, and related technologies have been applied to various fields. Intelligent robots, for example, delivered food to quarantined people during the epidemic, making people's lives more convenient. The most controversial and widely discussed application is machine translation. Before the emergence of machine translation, translation was dominated by human translators, covering both written translation and interpreting, with the latter divided into simultaneous and consecutive interpreting. It takes a great deal of time and energy to train a translator. Nowadays, however, the era is developing rapidly and information is updated quickly, so translators must constantly update their knowledge to keep up with the pace of the times. The emergence of machine translation has also posed challenges to translators and to their training. Although machine translation had problems in its early stage, its functions are now constantly improving. In the context of the Belt and Road Initiative, both machine translation and human translation face difficulties. Human translation is clearly still needed; what matters more at present is how to train translators to adapt to these difficulties and how to promote cooperation between human translation and machine translation.

2. Development status of machine translation in the era of artificial intelligence

With the development of AI technology, machine translation has made great progress and has entered people's daily lives. For example, more and more tourists download translation software when traveling abroad, and machine translation now holds a clear advantage in replying to everyday emails and in other translation activities that do not require high accuracy. Translation software commonly used by netizens includes Google Translate, Baidu Translate, Youdao Translate and iFlytek Translate, and even WeChat and other chat applications can translate messages into English instantly. Some companies have also launched translation pens, translation machines and other devices, which enable even people who speak only their native language to carry out basic communication with speakers of other languages. So far, however, machine translation still faces huge problems. Despite its progress, it is highly dependent on corpora and other forms of big-data matching. It does not reach the level of human thinking and cannot deal with translation differences caused by culture and religion. In addition, many minor languages cannot be translated by machine for lack of corpora.

What is more, most corpora concern developed countries such as Britain and France and cover diplomacy, politics, science and technology, while very little material covers ethnicity, culture or religion.

In addition, machine translation can at present be relied on only for daily communication. On important occasions such as large conferences and international affairs, the risk of using machine translation is too high, and professional translators are required. Machine translation therefore still has a long way to go.

3. Challenges in the training of translation talents in universities

The cultivation of translators is oriented to the market. Professors Zhu Yifan and Guan Xinchao from the School of Foreign Languages at Shanghai Jiao Tong University believe that translation talents can be divided into four types: high-end translators and interpreters, senior translators and researchers, compound translators, and applied translators.

As the names suggest, the first two types place high demands on knowledge and quality. Because such translators must face a changing international situation and deal with all kinds of sensitive relations and politically related content, they need flexible cross-cultural communication skills. In addition, for academic works in literature, sociology and the humanities, it is necessary not only to translate the content but also to understand its essence. Translators of this kind therefore need not only humanistic sensibility but also a deep understanding of Chinese and Western culture.

However, society has little demand for this kind of translation; such high-level requirements rarely arise in daily life and work. The greatest demand is for compound translators, who master knowledge in a specific field as well as a foreign language. For example, compound translators in the financial field should not only be good at foreign languages but also master financial knowledge, including professional terms, special expressions and sentence patterns.

When we say that machine translation can replace human translation, we should be referring to the field of compound translation talents. However, although AI technology has enabled machine translation to take part in creative work, this does not mean that compound translation talents will be replaced by machines. The complexity of language and the flexible cross-cultural awareness required in communication make it impossible for machine translation to replace human translation completely.

The last type, applied translators, mostly handle general texts with little technical content and few professional terms, so they are the easiest to replace with machine translation.

Therefore, the author believes that the question facing universities is not only how to train translation talents to cope with the development of machine translation, but also how to apply machine translation in the training process to achieve human-machine integration, so as to better complete translation work.

4. The language environment, opportunities and challenges of the Belt and Road Initiative

During visits to Central and Southeast Asian countries in September and October 2013, Chinese President Xi Jinping put forward the major initiative of jointly building the Silk Road Economic Belt and the 21st Century Maritime Silk Road, which came to be abbreviated as the Belt and Road Initiative.

According to the Vision and Actions on Jointly Building Silk Road Economic Belt and 21st-Century Maritime Silk Road, the Silk Road Economic Belt focuses on three routes: connecting China, Central Asia, Russia and Europe (the Baltic Sea); linking China with the Persian Gulf and the Mediterranean Sea via Central and West Asia; and connecting China with Southeast Asia, South Asia and the Indian Ocean. The 21st Century Maritime Silk Road focuses on two routes: from China's coastal ports to Europe through the South China Sea and the Indian Ocean, and from China's coastal ports across the South China Sea to the South Pacific.

The construction of the Belt and Road conforms to the trends of world multi-polarization, economic globalization, cultural diversity and the spread of information technology in society. It aims to promote economic policy coordination among the countries along the route, to carry out regional cooperation that is broader in scope, higher in level and deeper in substance, and to jointly create an open, inclusive and balanced regional economic cooperation framework that benefits all.

4.1 The language environment of the Belt and Road

The Belt and Road involves a wide range of countries and regions, and their languages and cultures are very complex. To make good use of language, provide good translation services, actively spread Chinese culture to the world, strengthen discourse capacity and tell Chinese stories well, the first step is to understand the language situation of the countries along the Belt and Road.

4.1.1 The most common language in countries along the "Belt and Road"

A wide variety of languages, belonging to nine language families, are spoken in the 65 countries along the Belt and Road. However, the status of English as the leading world language is undeniable. Most of the countries participating in the Belt and Road are developing countries, and many of them teach English as their first foreign language. Especially in Southeast Asian and South Asian countries, English plays an important role in foreign communication, whether as an official language or as the first foreign language. Besides English, other major languages of the Belt and Road countries spoken by more than 100 million people include Russian, Hindi, Bengali and Arabic. A common feature of the countries along the Belt and Road is thus the popularization of English education. English is widely used in international politics, economy, culture, education, science and technology, and plays the role of the most important language in the world.

4.1.2 The complex language conditions of countries along the "Belt and Road"

The languages spoken in countries along the Belt and Road involve nine major language families, and these countries are home to almost all of the world's types of religion. Differences in religious belief also produce differences in the culture, customs and social values behind languages. The languages of some countries along the Belt and Road have also been shaped by historical and present-day factors such as colonization, internal division and immigration.

India, for example, has no national language but more than 20 official languages. It is a multi-ethnic country with more than 100 ethnic groups, and one of the most obvious differences between them is language. India has therefore drawn the boundaries of its states, large and small, largely along linguistic lines: groups that use the same language are placed in one state, and a state with two languages may be divided in two. Indian languages differ not only in word order but also in writing system. Hindi, for example, is spoken by the largest number of people, mainly in the north, with about 700 million speakers, of whom 530 million use it as their first language. It is written in the Devanagari script and belongs to the Indo-European language family. Telugu, in the east, is spoken by about 95 million people, of whom 81.13 million use it as their first language. It is written in the Telugu script, belongs to the Dravidian language family and is quite different from Hindi. As a result, a parliamentary session in India requires dozens of interpreters.

These factors cannot be ignored in the process of translation. Translation must move from linguistic communication to cultural understanding, and from the exchange of texts to the exchange of ideas, so that the bridge of language truly connects peoples and avoids the misreading and misunderstanding caused by differences in language and national conditions.

4.2 Opportunities and challenges of the "Belt and Road"

With the promotion of the Belt and Road Initiative, there has been an unprecedented boom in translation. In previous translation booms in China, most of the work involved translating foreign languages into Chinese and importing foreign cultures into China. This time, however, in the context of the Belt and Road Initiative, translating Chinese into foreign languages has become an important task for translators. As is well known, the countries along the Belt and Road have many different languages and complex cultures, and serving Belt and Road construction has become one of the drivers of reform in the training of Chinese translation talents. Foreign-language universities have already taken action, and many colleges and universities have set up programs in the less commonly taught languages of countries along the route. The promotion of the Belt and Road Initiative has thus brought unprecedented opportunities for human translation. Cultivating diversified translation talents, including translators of less commonly taught languages, is an urgent problem for China. Such talents cannot be trained overnight, and the state needs to reform the training mode from the perspective of strategic language development. Only in this way can we meet the new demand for human translation under the new conditions of the Belt and Road Initiative.

For a long time, the traditional orientation of translation curricula and training goals in colleges and universities was to train translation teachers and translators through courses in translation theory and practice and in literary translation, an orientation that cannot meet the needs of society. In 2007, in order to meet the need of the socialist market economy for application-oriented, high-level professionals, the Academic Degrees Committee of the State Council approved the establishment of the Master of Translation and Interpreting (MTI) degree. Since joining the MTI pilot program, more and more universities have been reforming the curriculum and training mode of the master's degree in translation in order to cultivate translators who meet the needs of society.

Language is an important carrier of culture, and translation is an important channel for exporting culture. The quality of translation output also reflects the cultural soft power of a country. With the rise of China, more and more people are interested in Chinese culture, and the number of learners of Chinese keeps increasing. Against the background of the Belt and Road, excellent translators are urgently needed to spread Chinese culture. With the promotion of the Belt and Road Initiative, mutual learning and cultural exchanges between other countries and China have increased to an unprecedented degree, bringing vigorous opportunities for the spread of Chinese culture. Translators who know less commonly taught languages, as well as multilingual translators, are needed. They should use language not only to convey information but also as a lubricant for communication.

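5. How to cultivate translation talents from the perspective of machine translation

5.1 Literacy requirements for translation talents

5.2 Using artificial intelligence in translation practice

5.3 Application of big data, termbases and corpora

6. Cooperation between machine translation and translation talents for the Belt and Road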

Conclusion

References

8 颜静 (On Machine Translation Under Language Intelligence: An Option and Opportunity for Human Translators)

Machine_Trans_EN_8

Abstract

Artificial intelligence is now sweeping the world, and traditional language research and the language service industry are facing new challenges. This paper reviews and analyzes the development of language intelligence within artificial intelligence and the current state of the language industry in the information age, in order to explain why it is feasible for liberal arts translators to engage in machine translation research and necessary for them to apply machine translation, and thus to offer human translators an option for development in the information age.

Key words

New Liberal Arts; Language Intelligence; Machine Translation; Interdisciplinarity

题目

论语言智能之机器翻译——我们的选择和未来

摘要

当今人工智能的热潮席卷全球,而传统的语言研究和语言服务行业却面临着新的挑战。本文通过梳理分析人工智能中语言智能领域的发展历程和信息时代背景下语言行业的发展现状,对文科译者从事机器翻译研究的可行性和应用机器翻译的必要性进行阐述,为信息时代译者的发展路径提供参考。

关键词

新文科;语言智能;机器翻译;学科交叉

1. Introduction

We are now in an era of "explosion" of information and knowledge, which forces us to find ways of processing it quickly. Language is the manifestation of information, and the tool best able to process language carrying complicated information is the computer. Human beings have no special organ for perceiving language; instead, they take in the visual and auditory symbols of language through sight and hearing and then form linguistic information through processing and abstraction in the brain. Language intelligence therefore belongs to the research category of "cognitive intelligence". In view of this, computer science has taken up the study of language, and the most common research fields are "natural language processing", "language information processing" and "computational linguistics". The three differ, but they share the same goal: to enable computers to understand and express themselves in language, solve language-related problems and simulate human language ability. Among these fields, machine translation is where language intelligence and technology are integrated. Comprehensive research on MT in China started in the mid-1980s. Since the 1990s in particular, a number of MT systems have been published and commercial systems have been launched. In addition, universities in China have carried out research on MT and computational linguistics, developed various experimental translation systems and achieved fruitful results. Research on machine translation involves not only translation models and language models, but also alignment methods, part-of-speech tagging, syntactic analysis, translation evaluation and so on. Researchers must therefore understand the basics of translation and be proficient in English, Chinese or other languages. For this reason, compound talents with knowledge of both computing and language will be increasingly needed in the language industry and in the computer field.

2. Chapter 1 Artificial Intelligence in Rapid Development

At the Dartmouth Conference in 1956, the term "artificial intelligence" appeared for the first time. In the 65 years since, as scientific research has deepened, artificial intelligence has stepped out of science fiction films and novels and moved step by step closer to daily life. Today, autonomous driving, machine translation, chess and e-sports robots, AI-synthesized news anchors, AI-generated portraits and so on have been realized and are widely known. Artificial intelligence has also moved from logical intelligence and computational intelligence to today's cognitive intelligence.

1.1 The Development of Language Intelligence

According to academician Tan Tieniu, "Artificial intelligence is a technical science that studies and develops theories, methods, technologies and application systems that can simulate, extend and expand human intelligence. Its purpose is to enable intelligent machines to listen, see, speak, think, learn and act, that is, to have the following capabilities: speech recognition and machine translation, image and character recognition, speech synthesis and man-machine dialogue, man-machine games and theorem proving, machine learning and knowledge representation, autonomous driving and so on." From these purposes we can see that language plays a vital role in AI. To imitate human intelligence, an advanced form of artificial intelligence analyzes and processes human language by means of computers and information technology. We call this "language intelligence". Language intelligence is not only a core part of artificial intelligence but also an important basis and means of human-computer interaction and cognition; its development will contribute to the progress of AI as a whole and help put AI technologies into practice. It is therefore known as the pearl in the crown of artificial intelligence.

The concept of “language intelligence” was proposed in 2013 at the Beijing Academic Forum on Language Intelligence. However, as mentioned above, the corresponding research direction in the computer field has always been called natural language processing (NLP), and its history is almost as long as that of computers and artificial intelligence: research on artificial intelligence began soon after the computer emerged. Natural language processing generally includes two parts, natural language understanding and natural language generation. Early research on artificial intelligence already involved machine translation and natural language understanding, and the development of the field can be divided into three stages.

The first stage runs from the 1960s to the 1980s. In this period, the common method was to build rule-based systems for lexical, syntactic and semantic analysis, question answering, chat and machine translation. The advantage is that rules draw on human knowledge rather than on data, so a system can be started quickly. The problems are insufficient coverage and the unsolved difficulties of rule management and scalability.

The second stage starts from the 1990s. At this time, statistics-based machine learning (ML) became popular, and many NLP systems began to use statistical methods. The main idea is to use labeled data to build a machine learning system based on manually defined features, and to determine the parameters of the system from the data through learning. At runtime, the learned parameters are used to decode the input data and produce the output. Machine translation and search engines succeeded precisely by making use of statistical methods.

The third stage begins after 2008, when deep learning started to prove effective in speech and image processing. NLP researchers subsequently turned to deep learning. At first, they used deep learning to compute features or to build new features, and then tested the effect within the original statistical learning framework. For example, search engines added deep learning to calculate the similarity between search terms and documents in order to improve the relevance of search results. Since 2014, researchers have tried to conduct end-to-end training directly through deep learning models. Progress has been made in machine translation, question answering, reading comprehension and other fields.

1.2 The Research on Machine Translation

Machine translation is an important research direction in the field of natural language processing. As early as the 17th century, Descartes, the famous French philosopher, put forward the concept of a universal language in which words expressing the same meaning in different languages would be converted into unified symbols. In 1946, Warren Weaver put forward the idea of using machines to convert one language into another, and in 1949 he published the famous memorandum Translation, formally marking the birth of the modern concept of machine translation.

Machine translation has so far gone through four stages according to translation method: rule-based machine translation, case-based (example-based) machine translation, statistical machine translation and neural machine translation. In the early stage of development, because computing power was limited and data were scarce, people usually entered rules designed by translators and linguists into the computer. The computer converts sentences of the source language into sentences of the target language on the basis of these rules; this is rule-based machine translation. Rule-based machine translation is usually divided into three steps: analysis of the source language sentence, transfer, and generation of the target language sentence. Lexical and syntactic analysis of the input sentence produces a syntax tree, which is then converted by transfer rules into a syntax tree of the target language. Finally, the target language sentence is obtained by traversing the leaf nodes of the target language syntax tree.
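The three steps of rule-based translation can be sketched as follows. This is a deliberately tiny illustration, not a description of any real system: the three-word lexicon, the single reordering rule (adjective before noun becomes noun before adjective) and all function names are assumptions made for the example.

```python
# Toy lexicon and a single transfer rule; both are invented for illustration.
LEXICON = {"the": "la", "red": "rouge", "car": "voiture"}

def analyze(sentence):
    """Analysis: build a tiny syntax tree for a determiner-adjective-noun phrase."""
    det, adj, noun = sentence.split()
    return ("NP", [("DET", det), ("ADJ", adj), ("N", noun)])

def transfer(tree):
    """Transfer: reorder ADJ N into N ADJ and translate each leaf with the lexicon."""
    label, children = tree
    det, adj, noun = children
    reordered = [det, noun, adj]
    return (label, [(tag, LEXICON[word]) for tag, word in reordered])

def generate(tree):
    """Generation: read the leaves of the target tree from left to right."""
    _, children = tree
    return " ".join(word for _, word in children)

def translate(sentence):
    return generate(transfer(analyze(sentence)))

print(translate("the red car"))  # la voiture rouge
```

Even this toy shows why such systems are hard to scale: every new construction needs another hand-written rule, and the rules soon start to interact.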

Rule-based machine translation requires professionals to design the rules. When there are too many rules, the dependencies between them become very complex, and it is difficult to build a large-scale translation system. As technology developed, people began to collect bilingual and monolingual data and to extract translation templates and translation dictionaries from them. During translation, the computer matches the input sentence against the translation templates and generates the result from the successfully matched template fragments and the translation knowledge in the dictionary; this is case-based machine translation.
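A minimal sketch of template matching is given below. The two templates, the small dictionary and the function name `translate` are hypothetical; a real case-based system would extract thousands of templates automatically from bilingual data.

```python
import re

# Hypothetical translation templates: a source pattern with one slot, and a target frame.
TEMPLATES = [
    (r"^I like (\w+)$", "我喜欢{0}"),
    (r"^where is the (\w+)$", "{0}在哪里"),
]
# Hypothetical translation dictionary used to fill the slot.
DICTIONARY = {"tea": "茶", "music": "音乐", "station": "车站"}

def translate(sentence):
    """Return a translation if a template matches and every slot word is known, else None."""
    for pattern, target in TEMPLATES:
        match = re.match(pattern, sentence)
        if match:
            slots = [DICTIONARY.get(group) for group in match.groups()]
            if None not in slots:
                return target.format(*slots)
    return None

print(translate("I like tea"))            # 我喜欢茶
print(translate("where is the station"))  # 车站在哪里
print(translate("good morning"))          # None: no template matches
```

The last call illustrates the main weakness of the approach: input that matches no stored template cannot be translated at all.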

With the rapid development of the Internet, it became possible to obtain large-scale bilingual and monolingual corpora, and statistical methods based on such corpora became the mainstream of machine translation. Given a source language sentence, statistical machine translation models the conditional probability of the target language sentence, which is usually decomposed into a language model and a translation model. The translation model describes how well the meaning of the target language sentence agrees with that of the source language sentence, while the language model describes the fluency of the target language sentence. The language model is trained on large-scale monolingual data, and the translation model on large-scale bilingual data. Statistical machine translation usually uses a decoding algorithm to generate translation candidates, then scores and ranks the candidates with the language model and translation model, and finally selects the best candidate as the output. Common decoding algorithms include beam search decoding and CKY decoding.
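The decomposition into a translation model and a language model can be illustrated with a short sketch. The probabilities below are invented for a single imaginary source sentence, and the names `TRANSLATION_MODEL`, `LANGUAGE_MODEL`, `score` and `best_translation` are assumptions for the example; a real system would estimate the probabilities from corpora.

```python
import math

# Invented probabilities for one source sentence (illustration only).
# Translation model: how well each candidate preserves the source meaning.
TRANSLATION_MODEL = {"he go home": 0.40, "he goes home": 0.35, "home he goes": 0.25}
# Language model: how fluent each candidate is in the target language.
LANGUAGE_MODEL = {"he go home": 0.01, "he goes home": 0.20, "home he goes": 0.02}

def score(candidate):
    """Sum of log-probabilities, i.e. the log of the product of the two models."""
    return math.log(TRANSLATION_MODEL[candidate]) + math.log(LANGUAGE_MODEL[candidate])

def best_translation(candidates):
    """Pick the candidate with the highest combined score."""
    return max(candidates, key=score)

candidates = list(TRANSLATION_MODEL)
print(best_translation(candidates))  # he goes home
```

Here the translation model alone would prefer the ungrammatical "he go home", but the language model penalizes it heavily, so the fluent candidate wins once the two scores are combined.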

Statistical machine translation matches translation rules (usually extracted from bilingual data on the basis of word alignments) against the input sentence to obtain translation candidates for fragments of the input. If a fragment has several candidates, the language model and translation model are used to rank them, and only the highest-scoring ones are kept. Translation rules then splice these fragment candidates together into candidates for longer fragments. Fragments can be spliced in two ways: in the original order or in reversed order. The translation model and language model carry different weights in scoring, and these weights are usually tuned on a development data set.
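
The weighted scoring and pruning described above might look like the following minimal sketch. It assumes a log-linear combination of the two model scores, which is a common choice but is not spelled out in the text; the candidate sentences, probabilities and weights are invented numbers.

```python
import math


def score(candidate, weights):
    """Weighted sum of the log-probabilities given by the two models."""
    return (weights["tm"] * math.log(candidate["tm_prob"])
            + weights["lm"] * math.log(candidate["lm_prob"]))


def prune(candidates, weights, beam_size):
    """Rank candidates by score and keep only the best `beam_size` of them."""
    ranked = sorted(candidates, key=lambda c: score(c, weights), reverse=True)
    return ranked[:beam_size]


# Hypothetical candidates for one source fragment.
candidates = [
    {"text": "he is a teacher", "tm_prob": 0.6, "lm_prob": 0.5},
    {"text": "he is one teacher", "tm_prob": 0.7, "lm_prob": 0.1},
    {"text": "he be teacher", "tm_prob": 0.5, "lm_prob": 0.01},
]

# Equal weights: the fluent candidate wins.
kept = prune(candidates, {"tm": 1.0, "lm": 1.0}, beam_size=2)
print([c["text"] for c in kept])  # ['he is a teacher', 'he is one teacher']

# Ignoring the language model changes the ranking.
kept = prune(candidates, {"tm": 1.0, "lm": 0.0}, beam_size=1)
print(kept[0]["text"])  # he is one teacher
```

The second call shows why the weights matter: with the language model switched off, the less fluent candidate is ranked first.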

As computing power improved further, and especially with the rapid development of GPU-based parallel training, methods based on deep neural networks attracted more and more attention in natural language processing. Deep neural networks were first used to train sub-models within statistical machine translation (a neural language model or a neural translation model) and significantly improved its performance. After the encoder-decoder framework and the attention mechanism were proposed, neural machine translation comprehensively surpassed statistical machine translation, and machine translation entered the neural era.
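
To give a feel for the attention mechanism, the sketch below computes attention weights over encoder states and the resulting context vector for a single decoder step. It assumes simple dot-product scoring, which is only one of several scoring functions in use, and the vectors are toy values rather than learned representations.

```python
import math


def softmax(values):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Dot-product attention for one decoder query vector.

    Returns (weights, context): one weight per source position, and the
    weighted average of the encoder states.
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: two source positions, two-dimensional states.
weights, context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(weights)  # the first position receives the larger weight
```

The decoder would recompute these weights at every output step, so each target word can draw on a different part of the source sentence.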

3. Chapter 2 Interdisciplinarity in Irresistible Trend

2.1 The Construction of New Liberal Arts

2.2 The Current Status of New Liberal Arts

4. Chapter 3 Language Service Industry with Machine Translation

3.1 Translation Mode of Man-machine Cooperation

3.2 Translators with More Professional and Diversified Career Path

3.2.1 The Improvement of Translation Ability

3.2.2 The Combination with Other Fields

Conclusion

References

9 谢佳芬 (Machine Translation and Human Translation in the Era of Artificial Intelligence)

Machine_Trans_EN_9

Abstract

With the continuous development of information technology, many industries face competitive pressure from artificial intelligence, and translation is no exception. Artificial intelligence has developed rapidly and been combined with translation, bringing great impact and change to traditional translation, yet machine translation and human translation each have their own strengths and weaknesses. Human translation leads in adapting to the logic and habits of human language and in comprehension, whereas machine translation is more efficient in terms of entry threshold and economic value. In short, machine translation and human translation are complementary rather than antagonistic.

Key Words

Machine Translation; Human Translation; Artificial Intelligence

题目

人工智能时代下的机器翻译与人工翻译

摘要

伴随着信息技术的不断发展,多个行业面临着人工智能的竞争压力,翻译领域也是如此。人工智能技术快速发展并与翻译领域结合,人工智能翻译给传统翻译带来了巨大的冲击和变革,但人工智能翻译与人工翻译存在着各自的优劣特点和发展空间,在适应人类语言逻辑习惯和理解特点的翻译效果上,人工翻译处于领先地位,但在翻译门槛和经济价值上,人工智能翻译的效率则更胜一筹。总的来说,我们要知道机器翻译与人工翻译是互补而非对立的关系。

关键词

机器翻译;人工翻译;人工智能

1. Introduction

1.1 The History of Machine Translation Abroad

The history of machine translation research can be traced back to the 1930s and 1940s. In the early 1930s, the French scientist G. B. Artsrouni put forward the idea of using machines for translation. In 1933, the Soviet inventor Troyansky designed a machine for translating one language into another. [1] In 1946, ENIAC, the world's first modern electronic computer, was born. Soon after, in 1947, the American scientist Warren Weaver, a pioneer of information theory, proposed using computers for automatic language translation. In 1949, Weaver published a memorandum entitled Translation, which formally raised the issue of machine translation. In 1954, Georgetown University, in cooperation with IBM, carried out the first machine translation experiment, translating Russian into English on an IBM 701 computer, which opened the prelude to machine translation research. [2] In 2006, Google Translate was officially released as a free service, setting off a surge of research on statistical machine translation. Franz Och, who joined Google in 2004, led the Google Translate effort. It is thanks to the unremitting efforts of generations of scientists that science fiction has been brought into reality step by step.

1.2 The History of Machine Translation in China

In 1956, research on machine translation was included in China's scientific and technological planning and produced some modest early results. On the eve of the tenth National Day in 1959, China successfully carried out an experiment in which nine different types of complicated sentences were translated on a large general-purpose electronic computer. The dictionary contained 2030 entries, and the grammar rule system consisted of 29 circuit diagrams. [3] After a period of stagnation, China's machine translation entered a stage of rapid development after the 1980s, in the wave of the third scientific and technological revolution. With the rapid development of the economy and of science and technology, and with the pace of reform and opening up, China made a qualitative leap in machine translation research. In 1978, the Institute of Scientific and Technological Information of China, the Institute of Computing Technology and the Institute of Linguistics carried out an English-Chinese translation experiment using 20 titles from metallurgical literature as test examples and achieved satisfactory results. Subsequently, they developed the JYE-I machine translation system, which was based on 200 sentences from metallurgical documents. Its principles and methods were also widely used in later machine translation systems.
In addition, China's research achievements in machine translation during the 1980s and 1990s include the following. In October 1986, the Institute of Post and Telecommunication Sciences developed a machine translation system, C Retrieval and automatic typesetting system, with good performance and strong practicability. In 1988, ISTIC launched the ISTIC-I English-Chinese Title System for translating applied metallurgical literature, and the Information Research Institute of Railway developed an English-Chinese Title Recording machine translation system for railway documents. The Language Institute of the Academy of Social Sciences developed the "Tianyu" English-Chinese machine translation system, and the computer department of the National University of Defense Technology developed the Matr English-Chinese machine translation system. After much exploration and study, machine translation in China gradually moved towards application, popularization and commercialization. China Software Technology Corporation launched "Yixing I" in 1988, marking the official entry of Chinese machine translation systems into the market. After "Yixing", a series of machine translation systems, such as the Gaoli system in Beijing, the Tongyi system in Tianjin and the Langwei system in Shaanxi, also reached the public. In the 21st century, apps such as Kingsoft Powerword, Youdao Translation and Baidu Translation have largely met ordinary users' translation needs. In terms of working principle, machine translation has passed through roughly three stages: rule-based machine translation, statistical machine translation and neural machine translation based on deep learning. [4] These three stages have seen a leap in the quality of machine translation. Machine translation is used more and more in daily life, and for some texts its output is almost comparable to human translation.
In addition to text translation, functions such as voice translation and photo translation have also been launched, which greatly facilitates people's lives. It is undeniable that machine translation has become the future trend of translation.

1.3 The Status Quo of Machine Translation

In this big-data era of information explosion, the prospects for machine translation are bright. Many Internet companies, such as Microsoft Bing, Sogou, Tencent, Baidu and NetEase Youdao, have launched their own free online machine translation systems. [5] Users can obtain translations free of charge by visiting the corresponding websites. At present, the recurrent neural network translation system launched by Google supports real-time translation of more than 60 languages, and the domestic Baidu online machine translation system supports real-time translation of 28 languages. These online systems run on a variety of platforms, such as mobile phones, PCs, tablets and the web, and offer diverse functions, including on-screen word selection, text scanning translation, photo translation, offline translation and web page translation. Although translation quality still needs improvement, these systems already perform well in areas such as daily dialogue and news translation.

2. Advantages and Disadvantages of Machine Translation

Generally speaking, machine translation is characterized by high efficiency, low cost, accurate term translation and great development potential. Machine translation is fast and efficient, which human translation cannot match. In addition, with translation software of all kinds continually entering the market, machine translation is cheap and sometimes even free compared with human translation, which saves a great deal of money and time for users with modest quality requirements. What is more, machine translation draws on a huge corpus, which makes its translation of some terms, especially the latest scientific and technological terms, faster and more accurate. To translate these terms accurately, a human translator must keep learning, but learning takes time and tests the translator's learning ability and speed. In this respect, human translation is uncertain and lags behind. At the same time, as science, technology and society develop, the functions of machine translation will become more complete and its quality will improve. Today's machine translation tools are also portable: all that is needed is the software and an electronic dictionary on a mobile phone. There is no need to carry paper dictionaries and reference books, which saves time and space. Machine translation also covers many fields and suits translation practice in different settings, such as academia, education, commerce and trade, social networking, tourism and production technology, and it handles a wide range of professional terms with ease. By contrast, owing to the limits of translators' own knowledge, human translation is often confined to one or a few fields or industries.
For example, it is difficult for an interpreter specializing in medical English to translate legal English. At the same time, machine translation has its limitations. At first, machines could only translate word for word, functioning as little more than a dictionary. Later, the application of syntax made sentence translation possible through the direct translation method. When the source text and the target language are structurally very similar, the sentence can be translated directly. For example, the source text "他是个老师." becomes "He is a teacher." in the target language. As the structural complexity of the source text increases, however, the quality of machine translation drops sharply. At the syntactic level, therefore, machine translation is still largely limited to sentences with relatively simple structure. Moreover, translating the machine output back does not reliably recover the source text, which indicates that English-Chinese machine translation is still rather arbitrary and not yet rigorous or systematic enough. Nowadays, machine translation is highly dependent on parallel corpora, but the construction of parallel corpora is far from complete. At present, resources for mainstream languages such as Chinese and English are relatively rich, while data for many less widely used languages are unsatisfactory. Moreover, current corpora are concentrated in government documents, science and technology, current affairs and news, while data in other fields are seriously lacking, so the advantages of machine translation cannot be realized there. Corpus construction also lags behind new developments.
Some informative texts, such as academic papers and teaching materials, present the latest cutting-edge research and use a large number of new technical terms for which the corpus often lacks target-language equivalents, leaving machine translation powerless. Besides, machine translation is not culturally sensitive. Humans may never be able to program machines to understand and experience a particular culture. Different cultures have their own unique language systems, and machines lack the sophistication to understand or recognize slang, jargon, puns and idioms. Their translations may therefore fail to conform to cultural values and specific norms. This is one of the challenges that machines need to overcome. [6] Artificial intelligence may acquire human-like abstract thinking in the future, but it will be difficult for it to acquire imaginal thinking, which involves imagination and emotion. [7] Machine translation is therefore mostly used for news, science and technology, patents, specifications and other texts whose purpose is to describe facts and convey knowledge and information. Such texts rarely involve emotion or cultural background. When it comes to expressive texts, the limitations of machine translation are exposed. An expressive text is one that focuses on emotional expression and is full of imagination. Its main characteristics are subjectivity, emotion and imagination, as in novels, poetry, prose and other artistic writing. Such texts emphasize the emotional expression of the author or the characters and make heavy use of metaphor, symbolism and other devices. Machine translation can hardly match human translation for this kind of text: it can convey only the gist, lacks connotation and literary grace, and has neither the subjective feelings nor the rational analysis of a human being.
In fact, the difficulty lies less in simulating the human brain than in learning from the rich social and life experience of excellent translators, which a machine cannot do. In other words, machine translation lacks the individuality and creativity of human translation. It is this individuality and creativity that drive the development and evolution of language, whereas all machine translation can produce is mechanical "machine language".

3. The Irreplaceability of Human Translation

3.1 Translation is Constrained by Context

At present, machine translation can help people handle communication in daily life and work, covering basic needs such as clothing, food, housing and transportation, but it remains far from the "faithfulness, expressiveness and elegance" demanded of high-level translation. Language is itself an art, which values artistry over functionality, and art is difficult to quantify and standardize. Sometimes language is regular, rigorous, logical and clear, and sometimes it is random, free and illogical. A machine finds it hard to strike this balance. Sometimes machine translation cannot connect words with their contextual meaning. In many languages, the same word may have several completely unrelated meanings. In such cases context has a great influence on word meaning, and understanding a word depends largely on what is read from the context. Only human beings can combine words with context, determine their true meaning, and creatively adjust the language to obtain a complete and accurate translation. This is undoubtedly very difficult for machine translation. Human translators can free themselves from the constraints of the source language and produce a translation that follows the grammar, sentence patterns and usage of the target language. In the process, translators can draw on their own knowledge to analyze the differences between the source and target languages in ways of thinking, cultural characteristics, social background and customs, and so produce a more accurate translation. Human translators can also add, delete, domesticate, revise and polish according to the style of the text, compensate for cultural gaps, and try to preserve the ideas, artistic conception and charm of the original as well as the style of the source language.
In addition, translators can weigh polysemous or potentially ambiguous words against the context, making the translation clearer and more accurate and improving its quality.


4. Discussion on the Relationship Between Machine Translation and Human Translation

5. Suggestions on the Combined Development of Machine Translation and Human Translation

6.

7.

Conclusion

References

10 熊敏(Research on the English Chinese Translation Ability of Machine Translation for Various Types of Texts)

Machine_Trans_EN_10

Abstract

With the rapid development of information technology, machine translation technology has emerged and is gradually maturing. To explore the ability of machine translation, I compare two versions of translation, a manual translation and a machine translation (this paper uses Youdao Translation), for different types of texts (classified according to Peter Newmark's text types). The results differ considerably in quality and accuracy.

Key words

machine translation; manual translation; Newmark's type of texts

题目

Research on the English Chinese Translation Ability of Machine Translation for Various Types of Texts

摘要

随着信息技术的高速发展,机器翻译技术出现了,并且逐渐成熟。为了探究机器翻译的能力水平,本人根据纽马克的文本类型分类,选择了相应的译文类型,并且将其机器翻译的版本以及人工翻译的版本进行对比。就质量和准确度而言,译文的水平大相径庭。

关键词

机器翻译;人工翻译;纽马克文本类型

1. Introduction

1.1 Introduction to machine translation

Machine translation, also known as computer translation, is the technique of translating by means of a machine translator, such as Google Translate or Youdao Translation. It is a branch of computational linguistics that draws on computer science, statistics, information science and other fields, and it plays an important role in many areas. Machine translation can be traced back to the 1940s, when the British engineer Booth and the American engineer Weaver proposed using computers to translate and began to study translation machines. In the 1960s, however, the report of ALPAC (the Automatic Language Processing Advisory Committee) was followed by about a decade of stagnation in machine translation research. In the 1970s, with advances in computing, machine translation got back on track. Over recent decades, machine translation has developed through four main stages: rule-based machine translation, statistical machine translation, example-based machine translation and neural machine translation.

1.2 Process of machine translation

The process of manual translation differs from that of machine translation. Manual translation proceeds as follows: (1) understand the source language; (2) organize the content in the target language; (3) produce the translation. Machine translation, by contrast, first analyzes and encodes the source language, then looks up related codes in the corpus, works out the code that represents the target language, and generates the translation. The two share one feature: both must take the lexicon, grammatical rules and syntactic structure into account. This is one of the biggest challenges for machine translation.

2. Newmark's types of text

Peter Newmark divided texts into informative, expressive and vocative types according to their linguistic functions.

2.1 Informative text

The core of the informative text is truth. Its purpose is to convey facts, information, knowledge and the like. Its language style is objective and logical. Reports, papers, and scientific and technological textbooks all count as informative texts.

2.2 Expressive text

The core of the expressive text is emotion. Its purpose is to express preferences, feelings, views and so on. Its language style is subjective. Literary works (including fiction, poems and drama), autobiographies and authoritative statements belong to the expressive type.

2.3 Vocative text

The core of the vocative text is the readership. Its purpose is to call upon readers to act in the way the text intends, so it is reader-oriented. Texts such as advertisements, propaganda and notices are vocative texts.

2.4 Study Method

The manual translations of the three texts are taken from authoritative, widely acknowledged versions, and the machine translations come from Youdao Translator. In this thesis I compare and evaluate the two methods in terms of word choice, sentence structure, word order and redundancy.

3.

4.

5.

6.

7.

Conclusion

References

11 陈惠妮 (Study on Pre-editing of Machine Translation - A Case Study of Medical Abstracts)

Machine_Trans_EN_11

Abstract

At present, globalization is accelerating and market demand for language services is increasing rapidly. Machine translation, as an important translation method, can greatly improve translation efficiency thanks to its low cost and high speed. However, because of its own limitations and the differences between Chinese and English, machine translation is not accurate enough. To balance efficiency and quality, machine-translated texts require a great deal of manual revision. Medical papers are specialized and purposeful, so they require accurate, professional translation. The quality of machine translation, however, is insufficient to meet the high standards required for translating medical papers. Introducing pre-editing can therefore greatly improve the efficiency and quality of machine translation.

Key words

Pre-editing, Machine translation, Medical texts

题目

Study on Pre-editing of Machine Translation - A Case Study of Medical Abstracts

摘要

在全球化加速发展的今天,市场对语言服务的需求迅速增加。机器翻译作为一种重要的翻译途径,由于其成本低、速度快,可以大大提高翻译效率。然而,由于机器翻译的局限性以及中英文语言的差异,机器翻译的准确性不高。为了平衡翻译效率和翻译质量,机器翻译文本需要大量的手工修改。 医学论文具有专业性、特殊性和目的性,要求其译文准确、合格、专业。然而,机器翻译的质量较低,无法满足医学论文对翻译的高质量要求。因此,译前编辑的引入可以大大提高机器翻译的效率和质量。

关键词

译前编辑;机器翻译;医学文本

1. Introduction

1.1 Definition of Machine Translation

The concept of machine translation was first proposed in the 1930s. Since the 1940s, machine translation technology has evolved from rule-based machine translation (RBMT) to statistical machine translation (SMT) and then to neural machine translation (NMT). Machine translation refers to the automatic translation of text from one natural language (the source language) into another (the target language) by computer software or online translation services. At the most basic level, machine translation mechanically substitutes words in one language for words in another, but this rarely produces a good translation; whole phrases and their closest counterparts in the target language need to be recognized. Not all words in one language have equivalents in another, and many words have more than one meaning. Today's globalized world generates a huge demand for translation, which creates new opportunities for machine translation and has made it one of the current research focuses.

1.2 Definition of Pre-editing

Pre-editing means adjusting and modifying the source text to fit the characteristics of the machine translation software before it is fed into that software, so as to improve the quality of the machine translation (Wei Changhong, 2008:93-94). Its aims are to improve the recognition rate of machine translation, optimize the quality of the output and reduce the workload of post-editing. Because pre-editing involves changes in only one language, it is simpler than post-editing and can improve both quality and efficiency. Good pre-editing helps machine translation run more smoothly, improving the machine readability of the source and the quality of the output. Pre-editing is mostly applied in the following situations: first, when the original text is of poor quality and the machine has difficulty recognizing the meaning of its sentences, as with user-generated content of poor readability and translatability (Gerlach et al, 2003:45-53); second, when documents need to be published in multiple languages; third, when the original text contains a lot of jargon; and last, when a translation memory exists for the original text, since an edited original can better match the content of the translation memory.

1.3 Machine Translation Mode

According to the way translation knowledge is acquired, machine translation modes can be classified as follows. The first is rule-based machine translation, which relies on bilingual dictionaries and a library of language rules for each language. Translation quality depends on whether the source text conforms to the existing rules, and the impossibility of listing every rule is the main obstacle for this model. The second is statistical machine translation. This model relies on mathematical and statistical principles: it uses a corpus to find existing translations corresponding to the translation task, analyzes how often they occur and outputs the most frequent one. Its disadvantage is that it ignores the flexibility of language and the importance of context. The last is neural machine translation. It differs from the previous models in that it uses an end-to-end neural network to translate automatically between natural languages. At present, its translation quality is much higher than that of the earlier models.
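
The frequency-based selection described for the statistical mode, together with its blindness to context, can be illustrated with a very small sketch. The aligned word pairs below are invented, and real systems estimate probabilities over phrases rather than counting single words.

```python
from collections import Counter, defaultdict


def build_table(aligned_pairs):
    """Count how often each source word was translated as each target word."""
    table = defaultdict(Counter)
    for source, target in aligned_pairs:
        table[source][target] += 1
    return table


def most_frequent(table, source):
    """Return the most frequent translation, or None for an unseen word."""
    if source not in table:
        return None
    return table[source].most_common(1)[0][0]


# Hypothetical word pairs extracted from a small aligned corpus.
pairs = [("bank", "银行"), ("bank", "银行"), ("bank", "银行"), ("bank", "河岸")]
table = build_table(pairs)

print(most_frequent(table, "bank"))  # 银行
```

Because the choice depends only on counts, "bank" is rendered as 银行 even in a sentence about a river, which is exactly the neglect of context noted above.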

2. Language Characteristics and Error Division

Since Chinese and English are two different languages, it is necessary to identify their respective characteristics in order to analyze and understand them better. Certain errors also arise when translating between the two languages. The following sections clarify both points.

2.1 Language Characteristics of Medical Abstracts

Chinese and English belong to different language families, so they differ in structure and in the way their speakers think. When machine translation is used from Chinese to English, these mismatches lead to many errors, especially for ESP (English for Specific Purposes) texts such as medical papers. Here, these differences mainly refer to the linguistic characteristics of medical abstracts. Medical abstracts may be structured or unstructured. Despite the difference in form, both describe the purpose, methods and conclusions of the research. In the methods section of a Chinese medical abstract, several clauses can be joined with commas, although each may convey different information. English sentences can also carry a great deal of information, but to ensure clarity some modifiers need to be separated out and the sentence restructured. In Chinese-English machine translation, however, too much information is packed into a single sentence because the machine segments sentences at the full stop. In addition, subjectless sentences are used in the objective and methods sections of Chinese medical abstracts. A subjectless sentence is a sentence without a subject, and it is usually used in two contexts. The first is when the subject is "needless to say". It is common to omit the subject in Chinese, which can convey meaning through incomplete sentence structures, so Chinese speakers understand the sentence even when the subject is omitted. The second is "emphasis of action". Here subjectless sentences are used to describe actions, especially in abstracts of traditional Chinese medicine studies. In English medical abstracts, by contrast, such sentences should be avoided.
Such sentences come across as more subjective when describing the research process, whereas objectivity is an essential requirement of English medical abstracts. For this reason, sentences without a subject should be avoided in English medical abstracts. Another feature is voice. Explicitly passive constructions are rare in Chinese abstracts, while English favours the passive voice. When translating from Chinese to English, the passive voice should be used to make the content objective, which is also a basic requirement of medical papers. These are the linguistic features of Chinese medical abstracts. Chinese and English medical abstracts differ greatly in sentence structure and expression. These differences may reduce the accuracy of machine translation, so pre-editing of the source text is needed to ensure that it can be accurately recognized by the machine and fully translated into English.
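
One simple pre-editing step suggested by the discussion of comma-joined clauses is to split a long Chinese sentence into shorter ones before machine translation. The sketch below is deliberately naive and is not the method of this paper: it splits at every comma, whereas a human pre-editor would decide which commas mark independent clauses and would also restore omitted subjects. The sample sentence is invented.

```python
import re


def split_clauses(sentence):
    """Turn every comma-separated clause into a sentence of its own."""
    body = sentence.rstrip("。")
    clauses = [part.strip() for part in re.split("[，,]", body) if part.strip()]
    return [clause + "。" for clause in clauses]


# Hypothetical methods sentence from a Chinese medical abstract.
source = "选取患者60例，随机分为两组，观察治疗效果。"
for clause in split_clauses(source):
    print(clause)
# 选取患者60例。
# 随机分为两组。
# 观察治疗效果。
```

Each shorter sentence can then be checked for a missing subject before it is passed to the translation engine.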

2.2 Error Division of Machine Translation in Translating Abstracts of Medical Papers from Chinese to English

In this paper, errors in machine translation are listed out after analyzing. Some are due to the principles, such as statistical-based MT and NMT, employed by machine translation. Based on the original errors, they can be studied from two levels. One is from the macro-level, referring to mistakes caused by objective factors. These are not human factors, but the disadvantage of machine translation and the limitations of language. The limitations of machine translation are derived from the principles and models adopted and manifests itself as a reliance on reference sources. In this case, semantic ambiguity and textual incoherence may occur in the absence of reference sources. However, for ESP texts such as medical papers, the requirements for terms and sentence patterns are far beyond the existing corpora. For unfamiliar texts (that is, the corresponding texts cannot be found in the corpora). The quality of output translation will be relatively low, which is determined by the principles of machine translation. As mentioned above, machine translation has evolved over the past few decades from rules-based to corp-based statistics to NMT. However, there still exist some limitations in machine translation. Mistakes on micro-level are mainly caused by the variations and differences of linguistic structure between Chinese and English. Chinese is an implicit language, while English is an explicit one (Lian, 2010) To put it in another way, Chinese expression does not depend on language structures, while English does the opposite. This may result in mismatches between the original and machine translated sentences. This kind of error is mainly divided into two types: component fragment and component missing. For a medical paper, such errors are very serious. As a kind of ESP discourse, medical paper has the characteristics of fixed, objective and accurate language structure. 
In order to reproduce the characteristics of medical abstracts in translation, it is necessary to avoid errors at the level of words and sentences and to pay attention to logic and consistency. Lexical errors here refer to the inconsistent rendering of fixed expressions when terms are translated. Although these terms are machine-translated on the basis of a corpus, the corpus may not be adequate for ESP texts such as medical papers, so the fixed expressions are not used in term translation. This is also the most common mistake in machine translation (the word "term" here includes but is not limited to nouns). For ESP texts, and medical texts in particular, fixed expressions and accurate terms are of great importance.
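The kind of lexical check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the glossary entries, the list of non-standard variants and the function name `find_term_inconsistencies` are all hypothetical and are not taken from the paper or from any real MT system.

```python
import re

# Hypothetical glossary: each Chinese term maps to the one fixed English
# rendering expected in a medical abstract (illustrative entries only).
GLOSSARY = {
    "心肌梗死": "myocardial infarction",
    "高血压": "hypertension",
}

# Hypothetical non-standard renderings that an MT system might produce.
VARIANTS = {
    "myocardial infarction": ["heart muscle infarction", "cardiac infarction"],
    "hypertension": ["high blood pressure"],
}


def find_term_inconsistencies(source, translation):
    """For each glossary term present in the source, list the non-standard
    variants that appear in the translation instead of the fixed term."""
    issues = {}
    lowered = translation.lower()
    for src_term, fixed in GLOSSARY.items():
        if src_term not in source:
            continue  # term does not occur in this source sentence
        found = [
            variant
            for variant in VARIANTS.get(fixed, [])
            if re.search(r"\b" + re.escape(variant) + r"\b", lowered)
        ]
        if found:
            issues[fixed] = found
    return issues


source = "患者有高血压和心肌梗死病史。"
mt_output = "The patient had a history of high blood pressure and myocardial infarction."
print(find_term_inconsistencies(source, mt_output))
```

A post-editor could run such a check over every sentence pair and review only the flagged terms; a real workflow would need a far larger, domain-specific glossary.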

3.

4.

Conclusion

References

12 蔡珠凤 (The Mistranslation of C-J Machine Translation of Political Statements)

Machine_Trans_EN_12

Abstract

Language is the main way of communication between people. With the continuous development of globalization, the scale of cross-border exchanges is also expanding. However, due to cultural differences and diversity, the languages of different countries and regions are very different, which seriously hinders people's communication. The demand for efficient and convenient translation tools is increasing. At the same time, with the development of network technology and artificial intelligence, recognition technology based on deep learning is more and more widely used in English, Japanese and other fields.

Key words

machine translation; political statements; mistranslation of C-J machine translation

题目

The Mistranslation of C-J Machine Translation of Political Statements

摘要

语言是人与人之间交流的主要方式。随着全球化的不断发展,跨境交流的规模也在不断扩大。然而,由于文化的差异和多样性,不同国家和地区的语言差异很大,这严重阻碍了人们的交流。对高效便捷的翻译工具的需求正在增加。同时,随着网络技术和人工智能的发展,基于深度学习的识别技术在英语、日语等领域的应用越来越广泛。

关键词

机器翻译;政治发言;政治发言中译日的误译

1. Introduction

Introduction to machine translation

Machine translation, also known as automatic translation, is the process of using computers to convert one natural language (the source language) into another natural language (the target language). It is a branch of computational linguistics, one of the ultimate goals of artificial intelligence, and has important scientific research value. At the same time, machine translation has important practical value. With the rapid development of economic globalization and the Internet, machine translation technology plays an increasingly important role in promoting political, economic and cultural exchanges. The development of machine translation technology has closely accompanied the development of computer technology, information theory, linguistics and other disciplines. It progressed from early dictionary matching, to rule-based translation that combined dictionaries with the knowledge of linguistic experts, and then to corpus-based statistical machine translation. With the improvement of computing power and the explosive growth of multilingual information, machine translation technology gradually stepped out of the ivory tower and began to provide real-time and convenient translation services for general users.

C-J machine translation software

Today's online machine translation software includes Baidu Translate, Tencent Translate, Google Translate, Youdao Translate, Bing Translator and so on. Google was the first company to launch a machine translation system of this kind, and Baidu was the first company to introduce one in China. In addition, Tencent and Youdao have attracted much attention. Machine translation is the process of using computers to convert one natural language into another. It usually refers to sentence and full-text translation between natural languages. In order to keep improving translation quality, R&D personnel have added artificial intelligence technologies such as speech recognition, image processing and deep neural networks to traditional machine translation based on rules, statistics and examples. As the use of machine translation increases, cooperation between human translation and machine translation will also increase significantly in the future. What criteria should be used to evaluate the quality of machine translation? In the evolving field of machine translation, there is an urgent need to clarify which problems have been solved and which remain unsolved.

The history of machine translation

The research history of machine translation can be traced back to the 1930s and 1940s. In the early 1930s, the French engineer G. B. Artsrouni put forward the idea of using machines for translation. In 1933, the Soviet inventor P. P. Troyanskii designed a machine to translate one language into another and registered his invention on September 5 of the same year; however, because of the low technical level of the 1930s, his translation machine was never built. In 1946, the first modern electronic computer, ENIAC, was born. Shortly afterwards, in 1947, W. Weaver, an American scientist and pioneer of information theory, and A. D. Booth, a British engineer, put forward the idea of automatic language translation by computer while discussing the range of applications of the electronic computer. In 1949, W. Weaver published his memorandum "Translation", which formally put forward the idea of machine translation. Over some 60 years of ups and downs, machine translation has followed a long and tortuous path of development. The academic community generally divides it into the following four stages:

Pioneering period (1947-1964)

In 1954, in cooperation with IBM, Georgetown University completed the first Russian-English machine translation experiment on an IBM 701 computer, demonstrating the feasibility of machine translation to the public and the scientific community and thus opening the prelude to the study of machine translation. China was not late in starting this research. As early as 1956, the state included it in the national plan for the development of scientific work under the topic name "machine translation, the construction of natural language translation rules and the mathematical theory of natural language". In 1957, the Institute of Linguistics and the Institute of Computing Technology of the Chinese Academy of Sciences cooperated on a Russian-Chinese machine translation experiment, translating 9 different types of more complex sentences.
From the 1950s to the first half of the 1960s, machine translation research was on the rise. The United States and the former Soviet Union, the two superpowers, provided a great deal of financial support for machine translation projects for military, political and economic purposes, while European countries also paid considerable attention to machine translation research because of geopolitical and economic needs, and for a time machine translation became a craze. Although machine translation was only in its pioneering stage, it had entered an optimistic period of prosperity.

Frustrated period (1964-1975)

In 1964, in order to evaluate the progress of machine translation research, the US National Academy of Sciences established the Automatic Language Processing Advisory Committee (ALPAC) and began a two-year comprehensive investigation, analysis and test. In November 1966, the committee published a report entitled "Language and Machines" (the ALPAC report for short), which comprehensively denied the feasibility of machine translation and suggested stopping financial support for machine translation projects. The publication of this report dealt a blow to the booming field, and machine translation research came to a standstill. During the same period the ten-year Cultural Revolution broke out in China, and research there largely stagnated as well. Machine translation had entered a depression.

Recovery period (1975-1989)

From the 1970s onwards, with the development of science and technology and the increasingly frequent exchange of scientific and technological information among countries, the language barriers between countries became more serious. Traditional manual translation was far from meeting the need, and there was an urgent demand for computers to take part in translation.
At the same time, the development of computer science and linguistics, especially the substantial improvement of computer hardware and the application of artificial intelligence to natural language processing, promoted the recovery of machine translation research at the technical level. Machine translation projects began to develop again, and various practical and experimental systems were launched one after another, such as the Weidner system, the EUROTRA multilingual translation system and the TAUM-METEO system. In China, after the end of the "ten-year catastrophe", the country revived and machine translation research was put back on the agenda. The "784" project paid considerable attention to machine translation research. After the mid-1980s, the development of machine translation research in China accelerated further. Two English-Chinese machine translation systems, KY-1 and MT/EC863, were successfully developed, indicating that China had made great progress in machine translation technology.

New period (1990-present)

With the universal application of the Internet, the acceleration of world economic integration and increasingly frequent exchanges in the international community, traditional manual translation is far from meeting the rapidly growing demand. The demand for machine translation has increased as never before, and machine translation has been given a new opportunity to develop. International conferences on machine translation research have been held frequently, and China has made unprecedented achievements. A series of machine translation software products have been launched, such as "Yixing", "Yaxin", "Tongyi" and "Huajian". Driven by market demand, commercial machine translation systems have entered the practical stage, entered the market and reached users.
Since the turn of the century, with the emergence and popularization of the Internet, the amount of data has increased sharply, and statistical methods have been applied in full. Internet companies have set up machine translation research groups and developed machine translation systems based on Internet big data, such as "Baidu Translate" and "Google Translate", making machine translation really practical. In recent years, with the progress of deep learning, machine translation technology has developed further. This has brought a rapid improvement in translation quality, and translation of spoken language and in other fields has become more authentic and fluent.

The problem of machine translation at present

Error is inevitable

Many people have misunderstandings about machine translation. They think that machine translation deviates greatly from the original and cannot help people solve any problems. In fact, error is inevitable. The reason is that machine translation uses linguistic principles: the machine automatically recognizes the grammar, calls the stored lexicon and automatically performs the corresponding translation. However, errors are inevitable because of changes or irregularities in grammar, morphology and syntax, such as the sentence with a postposed adverbial "give me a reason to kill you first" from the film A Chinese Odyssey (Dahua Xiyou). After all, a machine is a machine. It has no special feeling for language. How can it feel the lasting charm of "the tenderness of lowering its head, like the shame of a water lotus"? The meaning of Chinese varies greatly with changes in morphology, grammar, syntax and context. Even many Chinese people are left completely baffled (in the words of the idiom, like the tall monk whose head no one can reach), let alone machines.

Bottleneck

In fact, whichever method is used, the biggest factor affecting the development of machine translation is the quality of translation. Judging from the achievements so far, the quality of machine translation is still far from the ultimate goal. The Chinese mathematician and linguist Zhou Haizhong once pointed out in his paper "Fifty Years of Machine Translation" that, to improve the quality of machine translation, the first thing to solve is the problem of language itself rather than programming; it is certainly impossible to improve the quality of machine translation by relying on a few programs alone. He also pointed out that machine translation cannot achieve "faithfulness, expressiveness and elegance" as long as human beings have not yet understood how the brain performs fuzzy recognition and logical judgment of language. This view may reveal the bottleneck restricting the quality of translation.
It is worth mentioning that the American inventor and futurist Ray Kurzweil predicted in an interview with the Huffington Post that the quality of machine translation will reach the level of human translation by 2029. This claim is still much disputed in academic circles.

2. Mistranslation of Chinese-Japanese machine translation

2.1 Vocabulary mistranslation in Chinese-Japanese machine translation

2.1.1 Mistranslation of proper nouns

2.1.2 Mistranslation of polysemy

2.1.3 Mistranslation of compound words

2.2 Syntactic mistranslation in Chinese-Japanese machine translation

2.2.1 Main dynamic mistranslation

2.2.2 Dynamic mistranslation

2.2.3 Mistranslation of tenses

2.2.4 Mistranslation of honorifics

3.

4.

Conclusion

References

13 陈湘琼 Chen Xiangqiong (Study on Post-editing from the Perspective of Functional Equivalence Theory)

Machine_Trans_EN_13

Abstract

With the development of technology, machine translation methods are changing. From rule-based methods to corpus-based methods and then to neural network translation, machine translation has become more precise with every step, which means that the complete replacement of human translation by machine translation is not impossible. But machine translation still faces many problems today: it fails to translate special terms, cannot set the right sentence order, and is unable to understand content and cultural background. All of these need to be checked and corrected by a human translator, so it can be predicted that the Human + Machine model will last for a long time. This article discusses the mistakes made in machine translation and describes what translators should do in post-editing, based on skopos theory and functional equivalence theory.

Key words

machine translation, post-editing, skopos theory, functional equivalence theory

题目

基于功能对等视角探讨译后编辑问题与对策

摘要

随着科技的不断发展,机器翻译方法也在不断变革,从基于规则的机器翻译,到基于统计的机器翻译,再到今天基于人工神经网络的机器翻译,每一次变化都让机器翻译变得更精确,更高质。这意味着在不远的将来,机器翻译完全代替人工翻译成为一种可能。但是直至今天,机器翻译仍然面临许多的问题如:无法准确翻译术语、无法正确排列句子语序、无法分辨语境等,这些问题依然需要人工检查和修改。机器翻译自有其优点,人工翻译也有无可替代之处,所以在很长一段时间内,翻译都应该是机器+人工的运作方式。本文将基于翻译目的论和功能对等理论,对机器翻译可能出现的错误之处进行探讨,并且旨在描述译者在进行译后编辑时需要注重的方面,为广大译员提供参考。

关键词

机器翻译,译后编辑,翻译目的论,功能对等

1. Introduction

For a long time, researchers regarded MT as relatively peripheral and of limited use. Recently, however, thanks to technological advances in the field of machine translation, the translation industry has been undergoing a great revolution in which the speed and volume of translation have risen dramatically. The idea that human translation may be completely replaced by machine translation may therefore come true in the future. This changing landscape of the translation industry raises questions for translators. On the one hand, they earnestly want to identify their own role in the translation field and must confront the serious possibility that they may lose their jobs in the future. On the other hand, in more professional contexts, machine translation still cannot overcome certain difficulties: it fails to translate special terms, cannot set the right sentence order, and is unable to understand content and cultural background. For this reason, human-machine interaction is certain to become a trend in the near future. Translators have therefore started to use machine translations as raw versions to be post-edited, which is the topic we want to discuss here. This paper presents research on post-editing in machine translation. From the perspective of functional equivalence and skopos theory, we discuss the errors machine translation may make in the process and the strategies translators should use when translating. Section 2 provides an overview of the two theories and their development in practical use. Section 3 presents debates on the relationship between MT and HT. Section 4 reviews the history and development of post-editing.

2. Functional Equivalence and Skopos Theory

Functional equivalence theory is the core of the translation theory of Eugene Nida, a famous American translator and researcher. It aims to set a general standard for evaluating the quality of translation. In his theory Nida points out that "translation is to convey the information from source language to target language with the most proper and natural language." (Guo Jianzhong, 2000:65) He holds that the translator should not only achieve equivalence of information in the lexical sense but also take into account the cultural background of the target language and achieve equivalence in semantics, style and literary form. Dynamic equivalence thus contains four aspects: 1. lexical equivalence; 2. syntactic equivalence; 3. textual equivalence; 4. stylistic equivalence. These four aspects shape and guide the argument of this article.

In 1978, Hans Vermeer put forward skopos theory in his Framework for a General Translation Theory. In this theory, he holds that translation is a human activity, which means that, like other human activities, it has a specific purpose. (Nord, 2001:12) There are also some rules that the translator should follow in the process of translation: 1. purpose principle; 2. intra-textual coherence; 3. fidelity rule. These rules show the theory's relevance to machine translation.

On the basis of these two theories, we can now explore some principles and standards that translators ought to follow in post-editing. Firstly, efficiency and accuracy are very important, because the translator's purpose is evidently to earn money in a comparatively short time. If translators fail to provide high-quality translation or are unable to finish the job before the deadline, the consequences will be serious. Secondly, translators need to achieve equivalence at the lexical, syntactic, textual and stylistic levels in post-editing, because machine translation often goes wrong when dealing with words and sentences that require special background knowledge. Thirdly, it is almost impossible for machine translation to achieve the communicative goal and accomplish cultural exchange, so the human brain is indispensable for bridging the gap. More details will be discussed later.

3. Machine Translation Versus Human Translation

The dream that natural language can be translated by machine came true in the late twentieth century. Though not perfect, machine translation already fulfils translation requirements for technical manuals, scientific documents, commercial prospectuses, administrative memoranda and medical reports. (W. John Hutchins, 1995:431)

Researchers divide traditional machine translation methods into three categories: rule-based, corpus-based and hybrid methods, each with its own merits and demerits. The first builds its translation knowledge base on dictionaries and grammar rules, but it is not very practical for languages that are not closely related and it relies heavily on human experience. The second builds its translation knowledge by making full use of corpora and is still the mainstream of today's machine translation. The last mixes rules and corpora and successfully raises the efficiency of translation, but it is hard to manage because of its complex system and poor extensibility. (Hou Qiang, 2019:30)
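The rule-based approach described above can be illustrated with a toy Python sketch. It is a minimal illustration, not a real system: the four-entry lexicon, the single hand-written rule (dropping the particle 的 between modifier and noun) and the function name `rule_based_translate` are all hypothetical assumptions made for this example.

```python
# Toy bilingual dictionary (hypothetical entries, for illustration only).
LEXICON = {"我": "I", "喜欢": "like", "红": "red", "苹果": "apples"}

# One hand-written grammar rule: the particle 的 has no English
# counterpart in this construction, so it is dropped.
DROPPED_PARTICLES = {"的"}


def rule_based_translate(tokens):
    """Translate pre-segmented Chinese tokens word by word.

    Words missing from the dictionary are passed through in angle
    brackets, which makes the coverage limits of the rules visible.
    """
    output = []
    for token in tokens:
        if token in DROPPED_PARTICLES:
            continue  # apply the grammar rule
        output.append(LEXICON.get(token, f"<{token}>"))
    return " ".join(output)


print(rule_based_translate(["我", "喜欢", "红", "的", "苹果"]))
print(rule_based_translate(["我", "喜欢", "榴莲"]))
```

The second call contains a word that is not in the dictionary, so it passes through untranslated. Every such gap has to be closed by a human adding an entry or a rule, which is why the method depends so heavily on human experience.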

According to Martin Woesle, machine translation has clear advantages and disadvantages. Its advantages are speed and availability, low cost, efficiency and readiness for cooperation. However, it cannot cope with certain special situations, such as a noisy background, poor connectivity, lack of electricity, corpus limitations and cultural sensitivity. (Martin Woesle, 2021:203)

4. Post-editing


5. Word Errors

According to …, post-editing machine translation can increase the productivity of translators in terms of speed, while retaining or in some cases even improving the quality of their translations. However, such benefits are guaranteed only under the right conditions.[2] Since the translator's purpose is efficiency and accuracy, translators have to evaluate which texts are suitable and which are worth post-editing.

6. Post-editing On Sentences

7. Post-editing On Style and Culture Background

Conclusion

It is important to mention that the translator's experience is not always taken into account, although novice translators obviously differ considerably from professional translators. In this paper, we discuss the problems in a very general setting, from the point of view of machine translation errors, for professional translators as well as student translators.

References

14 Bi bi Nadia (Machine Translation a Challenge for Human Translators)

Machine_Trans_EN_14

Abstract

Machine translation is a major challenge for human translators and interpreters, although it is quick and saves time. People try to obtain translations from a source language into their target language, and for this purpose they use digital apps such as Google Translate. Google Translate, however, does not give an accurate and exact interpretation: it translates word for word, which does not convey the true meaning of the text. Human translators, on the other hand, can give exact and accurate translations; they take care of grammar, diction and sentence structure, and they convey the purpose of the source text clearly in the target language.

Keywords

Descriptive translation, academic, interdiscipline, comparative literature, localization, Translatology, school of thought, translation, studies, linguistics, corresponding

Body of article

Translation is the process of reworking text from one language into another to maintain the original message and communication. But, like everything else, there are different methods of translation, and they vary in form and function.

What is translation?

The term has become widely used among knowledge transfer researchers and practitioners, especially in the fields of health and health care. In a landmark review, Jonathan Lomas began to argue that 'The tasks… may be defined as to establish and maintain links between researchers and their audience, via the appropriate translation of research findings' (Lomas 1997, p 4). In 2004, the World Health Organization's World Report on Knowledge for Better Health suggested that 'One of the key contributions of research to health systems is the translation of knowledge into actions' (WHO 2004, p 33 and p 100). By 2006, special issues of WHO's Bulletin as well as the journals Evaluation and the Health Professions and the Journal of Continuing Education in the Health Professions were dedicated to translation. But what does 'translation' mean? It may be a new word for an old problem, meaning nothing more than 'transfer'. The rapidly emerging field of 'translational medicine' seems to take translation to mean generally what transfer might have meant, that is the transmission of knowledge and evidence 'from bench to bedside'. As one commentator has observed, '"Translational medicine" as a fashionable term is being increasingly used to describe the wish of biomedical researchers to ultimately help patients' (Wehling 2008, abstract). Semantic uncertainty persists, not least because of the different interests of scientists, clinicians, patients and commercial firms (Littman et al 2007): 'Translational research means different things to different people, but it seems important to almost everyone' (Woolf 2008, p 211).
In related fields, such as public health (Armstrong et al 2006), 'translation' seems to signify dissatisfaction with 'transfer'. It wants to move away from thinking of knowledge transfer as a form of technology transfer or dissemination, rejecting if only by implication its mechanistic assumptions and its model of linear messaging from A to B. But still, what does it signify?

Why translation?

'Translation' indicates a closer attention to the problem of shared meaning and how it might be developed. It seems to represent some new epistemological lubricant, facilitating the dissemination of texts and the application and use of the knowledge and information they contain. Simply, translation might be the key to transfer. And yet, when we stop to think, we are more ambivalent. What is translated often seems somehow inferior, not real or original. Note how readily commentators reach for the idea that things might be 'lost in translation'. Knowing at a distance, made in and mediated by translation, makes for incomplete renditions, blurred images, partial truths. So what might 'translation' really mean? The purpose of this paper is to set out, for policy makers and practitioners, the theoretical and conceptual resources translation holds and seems to represent. In doing so, it explores understandings of translation in the fields of literature and linguistics and in the sociology of science and technology. It begins by setting out just why this idea of translation should make immediate, intuitive sense in relation to research, policy and practice. [1]

Translation in research, policy and practice

Research as translation

Research often entails translation from one language to another: where data is collected from more than one ethnic group, for example, or where the language of the researcher is other than that of the research subject. It may draw on a secondary literature or source documents written in different languages, and may be published and disseminated in languages other than the one in which it is first written up. [2] In a different way, to conduct an interview is to ask for an account of experience and its meanings, but it is also to construct and translate that experience in terms defined at least in part by the researcher. In representing what is said, transcripts then select data, usually excluding significant gesture and eye-contact, for example.
Often, certain characteristics of speech-acts (such as hesitations) will be edited out. In turn, the format of the transcript shapes the analytic use the researcher may make of it. The basis of research 'findings', then, is an artefact, a transcript or translation, not an original interaction (Ochs 1979, Barnes, Bloor and Henry 1996, Ross 2009). [3] In this way, the researcher recasts aspects of his or her problem or topic in new, scientific form: 'All researchers "translate" the experiences of others' (Temple 1997, p 609). Research is invariably conducted in a sort of 'metalanguage' (Hantrais and Ager 1985): the research process can be conceived as one of successive translations (from theoretical formulation to operationalization, transcription, interpretation and dissemination). Theorization is a process of reciprocal back and forth between theory and fact, in which conceptions of each are revised in order that one fit the other (Baldamus 1974). [4] It is a kind of translation: a rereading, re-use, re-application or re-representation of what we know in new terms (Turner 1980). Referencing, too, is an act of translation, a form of appropriation and incorporation of one text by another (Gilbert 1977).

Policy as translation

[1] Each of the fields covered by the paper is diverse and ill-defined, and there is no intention here to provide a comprehensive account of any of them. Sources have been chosen for their relevance: main references are cited in the text, and additional sources listed in footnotes.

[2] For a brief introduction to the technical issues involved in social research in more than one language, see Birbili (2000). For translation issues in survey research and question design in general, see Ervin and Bower (1952) and Deutscher (1968); on translating survey research instruments (in this instance health-related quality of life measures), Bowden and Fox-Rushby (2003). On the use of translators and interpreters, see Temple (1997) and Jentsch (1998).
[3] For an interesting discussion of this problem, see Bourdieu (1999), esp pp 621-626, 'The risks of writing'.

Difference between machine translation and human translators

Few people disagree on the differences between the two, but many argue over the quality of the translations. How accurate are machine translations? How reliable are human translations? Some say machine translation produces near-perfect translations while others are adamant that translations are incomprehensible and cause more problems than they solve. Results will, of course, vary depending on the source and target languages, the machine translation service used (e.g. Google Translate), and the complexity of the original text. Machine translation, love it or hate it, is here to stay. In fact, the machine translation market is growing at such a fast pace that it is predicted to reach $980 million by 2022.

Machine translation: the pros and cons

The advantages of machine translation generally come down to two factors: it’s faster and cheaper. The downside to this is the standard of translation can be anywhere from inaccurate, to incomprehensible, and potentially dangerous (more on that shortly).

The advantages of machine translation

Many free tools are readily available (Google Translate, Skype Translator, etc.)
Quick turnaround time
You can translate between multiple languages using one tool
Translation technology is constantly improving

The disadvantages of machine translation

Level of accuracy can be very low
Accuracy is also very inconsistent across different languages
Machines can't translate context
Mistakes are sometimes costly
Sometimes translation simply doesn’t work

The most important thing to consider with any kind of translation is the cost of potential mistakes. Translating instructions for medical equipment, aviation manuals, legal documents and many other kinds of content requires 100% accuracy. In such cases, mistakes can cost lives, huge amounts of money and irreparable damage to your company’s image. So choose carefully!

Human translation: the pros and cons

Human translation essentially switches the table in terms of pros and cons. A higher standard of accuracy comes at the price of longer turnaround times and higher costs. What you have to decide is whether that initial investment outweighs the potential cost of mistakes. Alternatively, whether mistakes simply aren’t an option, like the cases we looked at in the previous section.

The advantages of human translation

It’s a translator’s job to ensure the highest accuracy
Humans can interpret context and capture the same meaning, rather than simply translating words
Human translators can review their work and provide a quality process
Humans can interpret the creative use of language, e.g. puns, metaphors, slogans, etc.
Professional translators understand the idiomatic differences between their languages
Humans can spot pieces of content where literal translation isn’t possible and find the most suitable alternative

The disadvantages of human translation

Turnaround time is longer
Translators rarely work for free
Unless you use a translation agency, with access to thousands of translators, you’re limited to the languages any one translator can work with

Simply put, human translation is your best option when accuracy is even remotely important. Other considerations to make are the complexity of your source material and the two languages you’re translating between, both of which can render machines pretty useless.

When to use machine and human translation

The truth is, the debate over machine vs human translation is an unnecessary distraction. What we should really be talking about is when to use these two different types of translation services, because they both serve a very valid purpose.

Examples of when to use machine translation

When you have a large bulk of content to translate and the general meaning is enough
When your translation never reaches the final audience, e.g. you’re translating a resource as research for another piece of content
Translating documents for internal use within a company, provided 100% accuracy isn’t needed
To partially translate large chunks of content for a human translator to improve upon

Examples of when to use human translation

When accuracy is important
Most cases where your translated content is received by a consumer audience
When you have a duty of care to provide accurate translations (e.g. legal documents, product instructions, medical guidelines or health and safety content)
When translating marketing material or other texts for creative language uses

Conclusion

In light of the facts and figures mentioned above, we would say that machine translation is a challenge for human translators because it can reduce the translation workload, but it cannot give an accurate and exact translation into the target language. It can be less reliable than human translation.