A Study on Automatic Classification of Forum Discussion Sentences Aided by the Implicit Category of the Person Discussed

Date

2010

Abstract

Web forums are platforms where users freely share and exchange opinions, and they host discussions of every kind. A popular baseball game can easily draw more than a hundred replies in a single day, so users have difficulty finding the viewpoints that interest them in such a large volume of discussion. This thesis proposes a method that classifies discussion sentences according to the type of person being discussed and the degree of association of the words in the content. Each discussion sentence is automatically classified as either a pitcher-related sentence or a batter-related sentence. Feature selection is an important part of the classification process. We use a statistical measure, the Jensen-Shannon divergence, to obtain features that are representative of each class and build feature vectors from them. The system learns a classification model from these vectors, and the model then assigns each forum discussion sentence to the pitcher or batter class. If a sentence contains a person's name, that name can be used to query recent news articles, which serve as an expanded data source (news-expanded sentences). Experimental results show that the classes assigned by the proposed system are highly consistent with the actual classes, and that using news-expanded sentences yields better classification results.
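The abstract does not say exactly how the Jensen-Shannon divergence is applied during feature selection, so the following is only a minimal sketch of one plausible reading: each term is scored by its contribution to the divergence between the pitcher-class and batter-class word distributions, and the top-scoring terms become the dimensions of a binary feature vector. The function names, the toy English tokens, and the choice of binary vectors are all assumptions made for illustration and are not taken from the thesis.

```python
import math
from collections import Counter


def term_distribution(sentences):
    """Relative frequency of each token over a list of tokenized sentences."""
    counts = Counter(tok for sent in sentences for tok in sent)
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def js_contributions(p, q):
    """Per-term contribution to the base-2 Jensen-Shannon divergence of p and q."""
    scores = {}
    for tok in set(p) | set(q):
        pi, qi = p.get(tok, 0.0), q.get(tok, 0.0)
        mi = 0.5 * (pi + qi)  # mixture distribution M = (P + Q) / 2
        score = 0.0
        if pi > 0:
            score += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            score += 0.5 * qi * math.log2(qi / mi)
        scores[tok] = score
    return scores


def select_features(pitcher_sents, batter_sents, k):
    """Return the k terms that best separate the two classes (ties broken by name)."""
    scores = js_contributions(term_distribution(pitcher_sents),
                              term_distribution(batter_sents))
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def to_vector(sentence, features):
    """Binary feature vector: 1 if the selected term occurs in the sentence."""
    present = set(sentence)
    return [1 if f in present else 0 for f in features]


# Hypothetical toy data; real input would be segmented forum sentences.
pitcher_sents = [["fastball", "strikeout", "game"], ["fastball", "inning", "game"]]
batter_sents = [["homer", "swing", "game"], ["homer", "hit", "game"]]

features = select_features(pitcher_sents, batter_sents, k=2)
vector = to_vector(["fastball", "game"], features)
```

In this toy example the term "game" is equally frequent in both classes, so it contributes nothing to the divergence and is never selected. The resulting vectors could then be passed to any standard classifier; the abstract does not name the one used.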

Keywords

classification, baseball, forum, person, news-expanded sentence
