Difference between revisions of "Machine Trans EN 8"

From China Studies Wiki
Line 17: Line 17:
 
Artificial intelligence is sweeping the world, and traditional language study and the language service industry face new challenges as a result. This paper reviews the development of language intelligence within artificial intelligence and the current state of language study and the language industry in the information age. On that basis it discusses why it is feasible for translators with a liberal arts background to engage in machine translation research and why they need to apply machine translation, offering a reference for the career development of prospective translators (students majoring in languages and translation) and of full-time and part-time professional translators.
 
===Key words===
 
New Liberal Arts; Language Intelligence; Machine Translation; Interdisciplinarity
+
Language Intelligence; Machine Translation; New Liberal Arts; Interdisciplinarity
  
 
===题目===
 
Line 26: Line 26:
  
 
===关键词===
 
新文科;语言智能;机器翻译;学科交叉
+
语言智能;机器翻译;新文科;学科交叉
  
 
===1. Introduction===
 
Line 57: Line 57:
 
As computing power improved further, and especially as GPU-based parallel training developed rapidly, methods based on deep neural networks attracted growing attention in natural language processing. They were first used to train sub-models of statistical machine translation (a neural language model or a neural translation model) and significantly improved its performance. With the introduction of the encoder-decoder framework and the attention mechanism, neural machine translation comprehensively surpassed statistical machine translation, and machine translation entered the neural network era.
  
===3. Interdisciplinarity in Irresistible Trend===
+
===3. Language Study in the Information Age===
  
====3.1 The Construction of New Liberal Arts====
+
====3.1 Fundamental Study====
  
====3.2 The Current Status of New Liberal Arts====
+
====3.2 Application Study====
 +
 
 +
====3.3 Interdisciplinarity Study====
  
 
===4. Language Service Industry with Machine Translation===
 
  
====4.1 Translation Mode of Man-machine Cooperation====
+
====4.1 Translation Mode====
 
 
====4.2 Translators with More Professional and Diversified Career Path====
 
 
 
4.2.1 The Improvement of Translation Ability
 
  
4.2.2 The Combination with Other Fields
+
====4.2 Translators====
  
 
===Conclusion===
 
  
 
===References===
 

Revision as of 15:29, 1 December 2021

Machine Translation - A challenge or a chance for human translators?


8 颜静 (On Machine Translation under Language Intelligence: An Option and Opportunity for Human Translators) Machine_Trans_EN_8

Abstract

Artificial intelligence is sweeping the world, and traditional language study and the language service industry face new challenges as a result. This paper reviews the development of language intelligence within artificial intelligence and the current state of language study and the language industry in the information age. On that basis it discusses why it is feasible for translators with a liberal arts background to engage in machine translation research and why they need to apply machine translation, offering a reference for the career development of prospective translators (students majoring in languages and translation) and of full-time and part-time professional translators.

Key words

Language Intelligence; Machine Translation; New Liberal Arts; Interdisciplinarity

题目

论语言智能之机器翻译——我们的选择和未来

摘要

当今人工智能的热潮席卷全球,而传统的语言研究和语言服务行业却面临着新的挑战。本文通过梳理分析人工智能中语言智能领域的发展历程和信息时代背景下语言研究和语言服务行业的发展现状,对文科译者从事机器翻译研究的可行性和应用机器翻译的必要性进行阐述,为信息时代包括语言和翻译专业学生的准译员和专、兼职的正式译员提供发展路径上的参考。

关键词

语言智能;机器翻译;新文科;学科交叉

1. Introduction

We live in an era in which information and knowledge are "exploding", and we need ways to process them quickly. Language is the carrier of information, and the computer is the tool best suited to handling language that carries complex information. Human beings have no dedicated organ for perceiving language; instead, the image and sound symbols of language are received through sight and hearing and are then processed and abstracted by the brain into linguistic information. Language intelligence therefore belongs to the research category of "cognitive intelligence". Against this background, computer science has taken up the study of language, most commonly under the names "natural language processing", "language information processing" and "computational linguistics". The three fields differ, but they share one goal: to enable computers to understand and express themselves in language, to solve language-related problems, and to simulate human language ability. Machine translation (MT) is where language intelligence and technology come together. Comprehensive MT research in China began in the mid-1980s, and since the 1990s in particular a number of MT systems have been released, some of them as commercial products. Universities across China have also carried out research on MT and computational linguistics, developed experimental translation systems, and achieved fruitful results. MT research involves not only translation models and language models but also alignment methods, part-of-speech tagging, syntactic analysis, translation evaluation and more. Researchers therefore need a grounding in translation and proficiency in English, Chinese or other languages. For this reason, people who combine computing skills with language knowledge will be in growing demand in both the language industry and the computer field.

2. Artificial Intelligence in Rapid Development

The term "artificial intelligence" was first used at the Dartmouth Conference in 1956. In the 65 years since, as research has deepened, artificial intelligence has stepped out of science fiction films and novels and moved steadily closer to daily life. Autonomous driving, machine translation, chess-playing and e-sports programs, AI-synthesized news anchors, AI-generated portraits and similar applications are now a reality and widely known. Artificial intelligence has also progressed from logical and computational intelligence to today's cognitive intelligence.

2.1 The Development of Language Intelligence

According to academician Tan Tieniu, "Artificial intelligence is a technical science that studies and develops theories, methods, technologies and application systems that can simulate, extend and expand human intelligence. Its purpose is to enable intelligent machines to listen, see, speak, think, learn and act, that is, to have capabilities such as speech recognition and machine translation, image and character recognition, speech synthesis and man-machine dialogue, man-machine games and theorem proving, machine learning and knowledge representation, autonomous driving and so on." These purposes show that language plays a vital role in AI. An advanced form of artificial intelligence, in imitating human intelligence, analyzes and processes human language using computer and information technology; we call this "language intelligence". Language intelligence is not only a core part of artificial intelligence but also an important basis and means of human-computer interaction and cognition. Its development advances AI as a whole and helps put AI technologies into practice, which is why it is known as the pearl on the crown of artificial intelligence.

The concept of “language intelligence” was proposed in 2013 at the Beijing Academic Forum on Language Intelligence. As noted above, however, the corresponding research direction in computer science has long been called natural language processing (NLP), and its history is almost as long as that of the computer and of artificial intelligence: research on artificial intelligence began soon after the computer appeared. NLP generally comprises two parts, natural language understanding and natural language generation. Early artificial intelligence research already involved machine translation and natural language understanding, and its development can be roughly divided into three stages.

The first stage ran from the 1960s to the 1980s. The common approach in this period was to build rule-based systems for lexical, syntactic and semantic analysis, question answering, chat and machine translation. The advantage of rules is that they draw on human knowledge rather than on data, so a system can be set up quickly. The drawbacks are insufficient coverage and unsolved problems of rule management and scalability.

The second stage began in the 1990s, when statistics-based machine learning (ML) became popular and many NLP tasks adopted statistical methods. The main idea is to build a machine learning system on manually defined features and to use labeled data to learn the system's parameters. At runtime, the system uses these learned parameters to decode the input and produce the output. Machine translation and search engines both succeeded with statistical methods.

The third stage began after 2008, when deep learning started to prove effective in speech and image processing and NLP researchers subsequently turned to it. At first, they used deep learning to compute features or construct new ones and then evaluated the effect within the existing statistical learning framework. For example, search engines added deep learning to compute the similarity between search terms and documents and so improve the relevance of results. Since 2014, researchers have tried end-to-end training directly through deep learning models, and progress has been made in machine translation, question answering, reading comprehension and other fields.

2.2 The Research on Machine Translation

Machine translation is an important research direction in natural language processing. As early as the 17th century, the French philosopher Descartes put forward the concept of a world language, in which words expressing the same meaning in different languages would be converted into unified symbols. In 1946, Warren Weaver put forward the idea of using machines to convert one language into another, and in 1949 he circulated the famous memorandum Translation, which formally marked the birth of the modern concept of machine translation.

To date, machine translation has gone through four stages, distinguished by translation method: rule-based, example-based (case-based), statistical and neural machine translation. In the early days, when computing power was limited and data scarce, rules designed by translators and linguists were entered into the computer, which converted source-language sentences into target-language sentences according to them. This is rule-based machine translation. It usually consists of three steps: source-language sentence analysis, transfer, and target-language sentence generation. The input source sentence undergoes lexical and syntactic analysis to produce a syntax tree; conversion rules then transform this tree into a syntax tree of the target language; finally, the target sentence is obtained by traversing the leaf nodes of the target-language syntax tree.
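
The three steps of rule-based translation can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the four-word English-French lexicon, the single sentence pattern, and the one reordering rule are invented here and do not come from any real system.

```python
# Toy rule-based (transfer) translation, English -> French, one sentence pattern.
# Lexicon, part-of-speech table and reordering rule are hypothetical examples.

LEXICON = {"the": "le", "black": "noir", "cat": "chat", "sleeps": "dort"}
POS = {"the": "DET", "black": "ADJ", "cat": "N", "sleeps": "V"}

def analyze(sentence):
    """Step 1: lexical and syntactic analysis into a source syntax tree."""
    tagged = [(POS[word], word) for word in sentence.lower().split()]
    if [tag for tag, _ in tagged] != ["DET", "ADJ", "N", "V"]:
        raise ValueError("sentence not covered by the toy grammar")
    return ("S", [("NP", tagged[:3]), ("VP", tagged[3:])])

def transfer(tree):
    """Step 2: apply conversion rules to obtain the target syntax tree."""
    label, children = tree
    if isinstance(children, str):            # leaf node: translate the word
        return (label, LEXICON[children])
    new_children = [transfer(child) for child in children]
    if label == "NP" and [c[0] for c in new_children] == ["DET", "ADJ", "N"]:
        det, adj, noun = new_children        # structural rule: DET ADJ N -> DET N ADJ
        new_children = [det, noun, adj]
    return (label, new_children)

def generate(tree):
    """Step 3: read the target sentence off the leaf nodes, left to right."""
    _, children = tree
    if isinstance(children, str):
        return [children]
    words = []
    for child in children:
        words.extend(generate(child))
    return words

def translate(sentence):
    return " ".join(generate(transfer(analyze(sentence))))
```

Calling `translate("the black cat sleeps")` returns `"le chat noir dort"`. Any sentence outside the single pattern is rejected, which mirrors the coverage problem of rule-based systems on a small scale.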

Rule-based machine translation requires specialists to design the rules. When there are too many rules, the dependencies among them become very complex, which makes it difficult to build a large-scale translation system. As technology developed, people began to collect bilingual and monolingual data and to extract translation templates and translation dictionaries from them. During translation, the computer matches the input sentence against the translation templates and generates the result from the successfully matched template fragments and the translation knowledge in the dictionary. This is example-based (case-based) machine translation.
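
Template matching of this kind can be sketched as follows. The two templates and the small dictionary are hypothetical stand-ins for what would be extracted from bilingual data; a real system would hold far more of both and would match fragments more flexibly.

```python
import re

# Hypothetical translation templates: a source pattern with one slot, and a target frame.
TEMPLATES = [
    (re.compile(r"^i like (?P<x>.+)$"), "j'aime {x}"),
    (re.compile(r"^where is (?P<x>.+)\?$"), "où est {x} ?"),
]

# Hypothetical translation dictionary for the fragments that fill the slot.
DICTIONARY = {"the station": "la gare", "the tea": "le thé"}

def translate_by_example(sentence):
    """Return a translation if a template and a dictionary entry both match, else None."""
    text = sentence.strip().lower()
    for pattern, target_frame in TEMPLATES:
        match = pattern.match(text)
        if match is None:
            continue
        fragment = match.group("x")
        if fragment in DICTIONARY:
            return target_frame.format(x=DICTIONARY[fragment])
    return None
```

The function returns `None` when no template matches, which is the point at which a practical system would have to fall back on another method.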

The rapid development of the Internet made it possible to obtain large-scale bilingual and monolingual corpora, and statistical methods based on such corpora became the mainstream of machine translation. Given a source-language sentence, statistical machine translation models the conditional probability of the target-language sentence, usually factored into a language model and a translation model. The translation model describes how well the meaning of the target sentence agrees with that of the source sentence, while the language model describes the fluency of the target sentence. The language model is trained on large-scale monolingual data and the translation model on large-scale bilingual data. A statistical system typically uses a decoding algorithm to generate translation candidates, scores and ranks them with the language model and translation model, and outputs the best candidate. Common decoding algorithms include beam search and CKY decoding.
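
The split into a translation model and a language model is commonly written in the noisy-channel form below. The symbols are assumptions introduced here, not notation from the text: $f$ is the source-language sentence, $e$ a candidate target-language sentence, and $\hat{e}$ the output.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f)
        \;=\; \arg\max_{e} \frac{P(f \mid e)\,P(e)}{P(f)}
        \;=\; \arg\max_{e} \underbrace{P(f \mid e)}_{\text{translation model}}\;
                            \underbrace{P(e)}_{\text{language model}}
```

The denominator $P(f)$ can be dropped because it does not depend on $e$. The search over $e$ is the job of the decoding algorithm.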

Statistical machine translation matches the input sentence against translation rules, usually extracted from bilingual data on the basis of word alignment, to obtain translation candidates for fragments of the input. If a fragment has several candidates, the language model and translation model rank them and only the highest-scoring ones are kept. Translation rules then splice the candidates for these fragments into candidates for longer fragments. Fragments can be spliced in two ways: in the original order or in reverse order. The translation model and language model carry different weights in scoring, and these weights are usually tuned on a development data set.
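
A minimal sketch of weighted scoring, pruning and splicing is given below. It assumes a weighted sum of log-probabilities as the scoring function, which is one common choice; the probabilities, weights and function names are invented for illustration, and in practice the weights would be tuned on a development set rather than fixed by hand.

```python
import math

def score(candidate, weights):
    """Weighted sum of the log-probabilities given by the two models."""
    return (weights["tm"] * math.log(candidate["tm_prob"])
            + weights["lm"] * math.log(candidate["lm_prob"]))

def best_candidates(candidates, weights, beam_size=2):
    """Rank candidates by score and keep only the beam_size best ones."""
    ranked = sorted(candidates, key=lambda c: score(c, weights), reverse=True)
    return ranked[:beam_size]

def splice(left, right, inverted=False):
    """Join two translated fragments in the original or in reverse order."""
    return f"{right} {left}" if inverted else f"{left} {right}"

# Hypothetical candidates for one source fragment, with made-up model probabilities.
candidates = [
    {"text": "the house small", "tm_prob": 0.5, "lm_prob": 0.01},
    {"text": "the small house", "tm_prob": 0.4, "lm_prob": 0.2},
    {"text": "small the house", "tm_prob": 0.4, "lm_prob": 0.001},
]
weights = {"tm": 1.0, "lm": 1.0}
```

With equal weights the fluent candidate "the small house" ranks first, because its language model probability outweighs its slightly lower translation model probability. Setting the language model weight to zero would rank "the house small" first instead, which shows why the weights matter.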

As computing power improved further, and especially as GPU-based parallel training developed rapidly, methods based on deep neural networks attracted growing attention in natural language processing. They were first used to train sub-models of statistical machine translation (a neural language model or a neural translation model) and significantly improved its performance. With the introduction of the encoder-decoder framework and the attention mechanism, neural machine translation comprehensively surpassed statistical machine translation, and machine translation entered the neural network era.
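
The core of the attention mechanism can be sketched in a few lines of plain Python. The sketch assumes dot-product scoring, chosen here for brevity; the text does not say which scoring function is meant, and other variants exist. The vectors are toy values, whereas in a real system they would be produced by trained encoder and decoder networks.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)                       # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Dot-product attention over the encoder states of the source words.

    Returns the attention weights and the context vector, i.e. the
    weighted average of the encoder states.
    """
    scores = [sum(q * k for q, k in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context
```

At each decoding step the decoder would call `attend` with its current state, so that the source words most relevant to the next target word receive the largest weights.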

3. Language Study in the Information Age

3.1 Fundamental Study

3.2 Application Study

3.3 Interdisciplinarity Study

4. Language Service Industry with Machine Translation

4.1 Translation Mode

4.2 Translators

Conclusion

References