Could Authors of Academic Reports be Discerned Using Formatting Information Obtained by Parsing XML of .docx Documents?
Category: Journal (individual paper)
Group: [C] Electronics, Information and Systems Society
Publication date: 2023/01/01
Title (English): Could Authors of Academic Reports be Discerned Using Formatting Information Obtained by Parsing XML of .docx Documents?
Author: Asako Ohno (Faculty of Engineering, Osaka Sangyo University)
Keywords: author's writing feature, word formatting information, academic report, plagiarism detection, decision tree, random forest
Abstract (English): Electronic documents are easier to copy, paste, or duplicate than handwritten reports. Consequently, plagiarism in class assignment reports is increasing. Existing plagiarism detection methods primarily calculate similarity based on matching characters or words in a document. However, class assignment reports are written at the same time by multiple students on the same topic, and the teacher often specifies the format in detail, so the contents of the reports are often very similar. False-positive results can be avoided if teachers visually check whether matching parts of class assignment reports are coincidental or plagiarized. However, this is a time-consuming and labor-intensive task. Herein, we propose a method to discriminate authors using word-formatting information, obtained by parsing the Extensible Markup Language (XML) of Word .docx documents, as document creation features. We conducted an experiment using university class reports and used a decision tree to visualize the obtained classification rules for identifying writing by the same author. We also evaluated classification performance using random forests.
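Journal: IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C), Vol.143 No.1 (2023), Special Issue: Electronic Circuit Related Technologies
Pages: 91-100
Manuscript type: Paper / English
Link to electronic version: https://www.jstage.jst.go.jp/article/ieejeiss/143/1/143_91/_article/-char/ja/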
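
The abstract describes obtaining formatting information by parsing the XML inside .docx documents. As a rough illustration of what that step can look like, the sketch below opens a .docx file as a ZIP archive, reads `word/document.xml`, and counts a few run-level formatting properties. The function name `formatting_features` and the feature names it produces are assumptions made for this example; they are not the feature set or the implementation used in the paper.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# Main WordprocessingML namespace used in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def _is_on(element):
    # A toggle such as <w:b/> is on unless w:val is "0", "false" or "off".
    if element is None:
        return False
    return element.get(W + "val", "1") not in ("0", "false", "off")


def formatting_features(docx_file):
    """Count a few run-level formatting properties in a .docx file.

    `docx_file` may be a path or a binary file-like object.
    The feature names are hypothetical, chosen only for illustration.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    features = Counter()
    for run in root.iter(W + "r"):
        features["runs"] += 1
        props = run.find(W + "rPr")
        if props is None:
            continue  # run carries no direct formatting
        if _is_on(props.find(W + "b")):
            features["bold_runs"] += 1
        if _is_on(props.find(W + "i")):
            features["italic_runs"] += 1
        size = props.find(W + "sz")
        if size is not None:
            # w:sz is expressed in half-points (24 means 12 pt)
            features["size_" + size.get(W + "val", "unknown")] += 1
        fonts = props.find(W + "rFonts")
        if fonts is not None:
            features["font_" + fonts.get(W + "ascii", "unknown")] += 1
    return dict(features)
```

A dictionary like this, computed once per report, could then be turned into a numeric feature vector and passed to a decision tree or random forest classifier. This sketch only looks at direct run formatting and ignores styles, paragraph properties, and other parts of the archive.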
