Reported by: Adapted from Sohu network (http://roll.sohu.com/20120104/n331149983.shtml) Jan.5, 2012
Translated by: Li Nan and LU Zhixiu
Edited by: Patti Broderick
Translating popular Chinese online expressions such as “有木有”, “我勒个去” and “神马都是浮云” is challenging. Take “神马都是浮云” as an example: Baidu Translator renders it as “Everything is nothing”, which is closest to the original meaning, while Google’s version, “Horses are clouds of God”, is quite unintelligible. Compared with Baidu Translator, Google Translator shows less understanding of Chinese culture. Harbin Institute of Technology’s (HIT’s) Adjunct Professor and alumnus WANG Haifeng, Senior Scientist at Baidu, laid the foundation for Baidu’s translation technology.
WANG is a celebrated expert in the field of computer engineering, the newly appointed Dean of Peking University’s Computational Linguistic Engineering Department and, effective November 2010, the Vice President of the Association for Computational Linguistics (ACL). He is the first Chinese vice president of ACL in its 50-year history. “The honor means international peer recognition not only of my job, but also of the contribution made by Chinese scholars in the field, as well as Chinese companies like Baidu,” WANG said. WANG has been engaged in the computer industry for more than 20 years, having been admitted to HIT as a computer science major in 1989. He has devoted himself to computers ever since.
WANG’s parents were both university students in the sixties. His father graduated from Tsinghua University and his mother from Harbin Medical University. Influenced by an intellectual family, high-tech education and an environment that valued higher education, WANG was determined to be a scientist from an early age.
WANG became involved in the challenging field of machine translation while still an undergraduate. During his postgraduate studies, he developed a Chinese-English machine translation system in only one year, which won him first prize in the national 863 Program and the Science & Technology Progress Award.
In early 1999, WANG graduated from HIT with a doctoral degree and myriad offers of lucrative employment. He launched his career with the newly founded Microsoft Research China as an assistant researcher, before managing research at Toshiba and finally joining Baidu in January 2010 to head a machine translation research team. As an expert in machine translation, WANG was aware of the value of Baidu’s large bilingual corpora for the development of machine translation. Exploring, selecting and processing the bilingual materials was the research team’s most important mission.
WANG and his team gathered nearly 10 million bilingual sentences, but problems soon followed. The translation quality was not what they had expected. For example, the online translation for the simple sentence “how old are you” was “怎么老是你”. Much of the bilingual corpora rendered the well-known Chinese saying “好好学习,天天向上” as “good good study, day day up”. After a month of effort and with the help of new computer technologies, WANG’s team weeded out vast numbers of low-quality translations, reducing their corpora by more than half to only 4 million sentences and thereby greatly improving the quality of machine translation.
WANG’s team managed to launch Baidu Translator last year after only a year of intense development. Today, Baidu has developed a unique way of translating Chinese online expressions and gained a technological advantage on the Chinese internet. However, for WANG, Baidu’s achievements to date are just the beginning. In addition to machine translation, WANG is also in charge of other technologies that support Baidu’s product development: natural language processing, data collection, information retrieval, machine learning, information filtering, and speech technology. WANG is confident of breakthroughs in these fields and of greater product innovations.