Local Language Computing Policy in Korea. Jan. 22-24, 2007. Se Young Park, KyungPook National University
Contents: Ⅰ Background; Ⅱ IT Infrastructure; Ⅲ R&D Status; Ⅳ Relevant Ministries; Ⅴ Policy Initiatives; Ⅵ Challenges and Conclusion
My Experiences. Professor at KyungPook National University. Advisor to the Minister of Information and Communications in Korea (planning, management, evaluation). President of a venture company. National project leader for local language computing.
Background. Literacy rate in Korean: Korean is easy to read and write, so literacy is very high. Literacy rate in foreign languages: high in English; low in second foreign languages (Japanese, Chinese, French, German).
IT Infrastructure. High-speed Internet: 74.8%, 2nd in the world. Computer and Internet availability: available anywhere. Internet users: 35 million people. Mobile diffusion: most of the economically active population in Korea owns a cell phone.
About the Korean Language. Linguistic and ethnological: Korean belongs to the Altaic language family. Korean characters (Hangul): created by King Sejong in 1443; drew on old Chinese characters and the Uighur and Mongolian scripts; consist of 10 vowels and 14 consonants, which can be combined to form syllabic groupings.
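The way Hangul letters combine into syllabic groupings can be illustrated with the arithmetic Unicode uses for precomposed Hangul syllables. The sketch below is not from the talk. The function names `compose` and `decompose` are hypothetical. The index tables follow Unicode, which counts 19 initial consonants, 21 medial vowels and 28 final positions (including "no final") because it also counts doubled and compound letters built from the 14 basic consonants and 10 basic vowels.

```python
import unicodedata

# Constants from the Unicode Hangul syllable composition algorithm.
S_BASE = 0xAC00   # first precomposed syllable, "가"
L_BASE = 0x1100   # first conjoining initial consonant
V_BASE = 0x1161   # first conjoining medial vowel
T_BASE = 0x11A7   # one below the first conjoining final consonant
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28  # T_COUNT includes "no final"


def compose(lead: int, vowel: int, tail: int = 0) -> str:
    """Combine letter indices into one precomposed Hangul syllable."""
    if not (0 <= lead < L_COUNT and 0 <= vowel < V_COUNT and 0 <= tail < T_COUNT):
        raise ValueError("letter index out of range")
    return chr(S_BASE + (lead * V_COUNT + vowel) * T_COUNT + tail)


def decompose(syllable: str) -> tuple[int, int, int]:
    """Split a precomposed syllable back into (lead, vowel, tail) indices."""
    offset = ord(syllable) - S_BASE
    if not 0 <= offset < L_COUNT * V_COUNT * T_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(offset, V_COUNT * T_COUNT)
    vowel, tail = divmod(rest, T_COUNT)
    return lead, vowel, tail


if __name__ == "__main__":
    # Indices 18, 0, 4 are the initial h, the vowel a and the final n.
    syllable = compose(18, 0, 4)
    print(syllable, decompose(syllable))
    # Cross-check against the standard library's canonical composition.
    jamo = chr(L_BASE + 18) + chr(V_BASE + 0) + chr(T_BASE + 4)
    print(unicodedata.normalize("NFC", jamo) == syllable)
```

Because the mapping is pure arithmetic, all 11,172 modern syllables can be composed and decomposed without a lookup table; the standard library's `unicodedata.normalize` performs the same composition under NFC.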
R&D Status in Korea. Character code: international standards activities are supported by the government; some projects standardize nonstandard character codes such as old Chinese characters, symbols and special characters. Fonts: many kinds of fonts are available in Microsoft Windows environments; no more national projects.
R&D Status in Korea. Keyboard: no longer a government concern; private companies are interested in Korean character input methods for mobile phones. Language resources: many kinds have been developed (raw text corpus: 100 million; tagged corpus: 10 million; syntactically tagged corpus: 10 thousand); many language resources are available on the Internet.
R&D Status in Korea. Language processing tools. Morphological analyzer: commercially available. Syntactic analyzer: R&D is supported by the government.
R&D Status in Machine Translation. Developed since the early 1980s; many national projects have been attempted. Korean/Japanese machine translation. Japanese-Korean machine translator: commercialized in 1997; available as a software package or as an Internet service. Korean-Japanese machine translator: commercialized in 1999; not yet as good as expected.
R&D Status in Machine Translation. English-Korean machine translator: 3-4 commercialized products in Korea; not yet practical in the general domain; 85% accuracy for specific applications such as patent documents; being tried in other specific domains such as broadcast captions. Korean-English machine translator: more difficult than English-Korean machine translation; only one commercial product in Korea, but not practical; 80% accuracy for patent documents.
R&D Status in Machine Translation. Chinese-Korean machine translator: available on a portal site but not practical. Korean-Chinese machine translator: no commercial product; a national project for broadcast news.
R&D Status. Information retrieval: many information retrieval engines have been commercialized; a question answering system is being developed in the encyclopedia domain, with 85% user satisfaction. Voice recognition: applied to specific domains such as robot commands and telematics. Spelling checker and corrector: included in a Korean word processor.
Relevant Government Ministries. Ministry of Information & Communications (MIC): the biggest sponsor of the local language computing area; most national projects have been carried out at ETRI; the IT839 strategy was planned and executed by MIC; these projects are being developed in the context of IT839.
Relevant Government Ministries. Ministry of Culture and Tourism: has policies and national projects for the maintenance and development of the Korean language; its most important project is the 21st Century Sejong Project. Ministry of Science and Technology: mandated to provide central direction, planning, coordination and evaluation of all science and technology activities in Korea.
Other Public Organizations. ETRI (Electronics and Telecommunications Research Institute): the biggest IT institute in Korea; 1,800 regular researchers and 500 contract members; research center for language computing (machine translation, question answering, text mining, voice recognition). The National Institute of the Korean Language: builds language databases to support national language policy.
Other Public Organizations. ATS (Korean Agency for Technology and Standards): leads industrial standards; promotes the conformity of Korean standards with international standards; hosts meetings for international standardization; conducts research for standardization. Koterm (Korea Technology Research Center for Language and Knowledge Engineering): works on terminology standardization; proposed the language resource standard subcommittee to ISO/TC37 in 2001.
Policy Initiative. U-IT839 Strategy: 8 new services, 3 high-tech infrastructures, 9 new growth engines. The 21st Century Sejong Project: corpus construction, electronic dictionary construction, non-standard characters.
Some Social Issues. Digital divide: information gaps rather than generation gaps; Internet and computer illiteracy. Language misuse: abbreviated, squeezed words for messenger or e-mail. Privacy: prevention of and countermeasures against hacking and viruses; private information protection.
Challenges and Conclusion. World-best IT infrastructure: high-speed broadband network; early-adopting Internet users; high diffusion of mobile phones. To build the infrastructure of a knowledge-based economic society: national IT ontology infrastructure development; ontology and Semantic Web. Language computing is the most important technology.