CHINESE BASE PHRASES CHUNKING BASED ON LDCRF
Category: Society Conference
Paper No.: GS13-2
Group: [C] Proceedings of the 2009 (Heisei 21) IEEJ Electronics, Information and Systems Society Conference
Publication date: 2009/09/03
Title (English): CHINESE BASE PHRASES CHUNKING BASED ON LDCRF
Authors: 孫暁 (The University of Tokushima), 任福継 (The University of Tokushima)
Authors (English): Xiao Sun (The University of Tokushima), Fuji Ren (The University of Tokushima)
Keywords: Natural Language Processing|Base Phrases Chunking|Latent Variable Model|Latent Dynamic Conditional Random Fields
Abstract (Japanese): A multiword expression (MWE) is any phrase that is not entirely predictable on the basis of standard grammar rules and lexical entries. There are no immediate counterexamples to the claim that any expression that can be written either hyphenated or as a single lexeme, or alternatively with spaces (e.g. mailman/postman vs. mail man/post man), is an MWE. This property could be used to evaluate extraction techniques, possibly with external resources that determine whether extracted expressions can be written hyphenated or without spaces (e.g. by defining the "optimal extraction volume" as the point where the ratio of such expressions is maximised). Chinese MWEs are problematic in machine translation and other natural language processing tasks. Latent-Dynamic Conditional Random Fields (LDCRF), which perform better than traditional Conditional Random Fields and SVMs, are trained to detect Chinese MWEs. After detection, the GLR algorithm is applied to analyse the inner structure of each MWE, and the sense of its center word is imported from a word sense dictionary. The detected MWEs, together with the corresponding information, are applied in Chinese-Japanese machine translation.
The latent variable model performs better than traditional models, and importing base noun phrase information will increase the effectiveness of Chinese-Japanese machine translation.
PDF file size: 2,909 KB
