Codeword Detection in Microblogs Focusing on Differences in Similar Words Between Corpora

Price: ¥770 JPY (tax included)

Category: Journal (individual paper)

Group: [C] Electronics, Information and Systems Society

Publication date: 2022/02/01

Title (English): Codewords Detection in Microblogs Focusing on Differences in Word Use Between Two Corpora

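Authors: 羽田 拓朗 (Graduate School of Informatics and Engineering, The University of Electro-Communications / National Police Agency, Commissioner-General's Secretariat Planning Division and Info-Communications Bureau Information Management Division), 清 雄一 (Graduate School of Informatics and Engineering, The University of Electro-Communications), 田原 康之 (Graduate School of Informatics and Engineering, The University of Electro-Communications), 大須賀 昭彦 (Graduate School of Informatics and Engineering, The University of Electro-Communications)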

Authors (English): Takuro Hada (Graduate School of Informatics and Engineering, The University of Electro-Communications/National Police Agency), Yuichi Sei (Graduate School of Informatics and Engineering, The University of Electro-Communications), Yasuyuki Tahara (Graduate School of Informatics and Engineering, The University of Electro-Communications), Akihiko Ohsuga (Graduate School of Informatics and Engineering, The University of Electro-Communications)

Keywords: codewords, similar words, word embedding, microblog

Abstract (English): In recent years, drug trafficking via microblogs has been increasing and has become a social problem. While cyber patrols are conducted to crack down on such crimes, those who post crime-inducing messages avoid keywords that may be monitored and attract police attention, such as “enjo kosai,” “marijuana,” and “methamphetamine,” and instead use terms that camouflage their criminal intentions, so-called “codewords.” These codewords change once they become popular, so it is necessary to keep track of the latest codewords at all times. In this paper, we therefore propose a new method for detecting the latest codewords used in crime from differences in the words used in posts. Specifically, we assign words to two corpora depending on whether the post containing the word has criminal intent, and detect codewords from the differences between the similar words of the same word in the two corpora. To confirm the effectiveness of the proposed method, we conducted a codeword detection experiment. The experimental results showed that the proposed method detected codewords with an accuracy 0.56 percentage points higher than that of the baseline method. The experiment shows that the proposed method can reduce the burden of continuously monitoring codewords by rapidly and automatically detecting new codewords that change over time, and thus may provide clues to crimes.
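The core idea in the abstract, comparing the similar words of the same word across two corpora, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: per-post co-occurrence counts stand in for the word embeddings named in the keywords, Jaccard overlap of the top-k similar words is a hypothetical difference score, all function names are invented, and the toy English posts are made up rather than drawn from the authors' data.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(posts):
    """Represent each word by counts of the other words sharing a post with it."""
    vectors = defaultdict(Counter)
    for post in posts:
        tokens = post.split()
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j:
                    vectors[word][other] += 1
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similar_words(vectors, word, k=2):
    """Return up to k words whose vectors are closest to `word` within one corpus."""
    target = vectors.get(word)
    if target is None:
        return set()
    scored = [(cosine(target, vec), other)
              for other, vec in vectors.items() if other != word]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken alphabetically
    return {other for score, other in scored[:k] if score > 0}


def neighbor_overlap(word, flagged_vectors, ordinary_vectors, k=2):
    """Jaccard overlap of a word's similar-word sets in the two corpora (assumed score)."""
    flagged = similar_words(flagged_vectors, word, k)
    ordinary = similar_words(ordinary_vectors, word, k)
    union = flagged | ordinary
    return len(flagged & ordinary) / len(union) if union else 1.0


def rank_candidates(flagged_posts, ordinary_posts, k=2):
    """Rank words found in both corpora; low overlap suggests a codeword candidate."""
    flagged_vectors = cooccurrence_vectors(flagged_posts)
    ordinary_vectors = cooccurrence_vectors(ordinary_posts)
    shared = flagged_vectors.keys() & ordinary_vectors.keys()
    scores = [(neighbor_overlap(w, flagged_vectors, ordinary_vectors, k), w)
              for w in shared]
    return [(w, s) for s, w in sorted(scores)]


# Invented toy posts: "ice" keeps different company in the two corpora.
FLAGGED_POSTS = ["ice cheap dm", "weed cheap dm", "bike for sale", "car for sale"]
ORDINARY_POSTS = ["ice cold drink", "water cold drink", "bike for sale", "car for sale"]

ranked = rank_candidates(FLAGGED_POSTS, ORDINARY_POSTS)
for word, overlap in ranked:
    print(f"{word}: overlap={overlap:.2f}")
```

In this toy setup the word whose similar words differ most between the flagged and ordinary corpora rises to the top of the ranking, while words used the same way in both stay at the bottom. A real system would need trained embeddings, Japanese tokenization, and a tuned threshold, none of which this sketch attempts.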

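Journal: IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C) Vol.142 No.2 (2022), Special Issue: Recent Trends in System Intelligence Using Stochastic Optimization Methods and Machine Learning Techniques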

Pages in journal: 177-189

Manuscript type: Paper / Japanese

Link to electronic version: https://www.jstage.jst.go.jp/article/ieejeiss/142/2/142_177/_article/-char/ja/
