以位元序列為基礎探勘容錯重複樣式之研究
dc.contributor | 柯佳伶 | zh_TW |
dc.contributor | Jia-Ling Koh | en_US |
dc.contributor.author | 龔毓婷 | zh_TW |
dc.contributor.author | Yu-Ting, Kung | en_US |
dc.date.accessioned | 2019-08-29T07:45:35Z | |
dc.date.available | 2004-07-25 | |
dc.date.available | 2019-08-29T07:45:35Z | |
dc.date.issued | 2004 | |
dc.description.abstract | 本論文提出一個有效率的方法對資料序列探勘出前K個非顯然且滿足最小長度限制的容錯重複樣式。我們擴展出現位元序列的表示法,設計出容錯出現位元序列,在考慮有插入或刪除錯誤的容錯情況下,用來表示候選樣式在資料序列中的出現位置。本論文提出二個演算法,分別命名為TFTRP-Mine (Top-K non-trivial FT-RPs Mining)及RE-TFTRP-Mine (REfinement of TFTRP-Mine)。兩個演算法皆根據論文中歸納出的遞迴公式,可系統性地計算出一個樣式的容錯出現位元序列,因而很有效率地得到每個候選樣式的容錯出現次數。此外,RE-TFTRP-Mine演算法採用額外兩個技巧來砍除搜尋空間以加快探勘效率。由實驗結果得知,當K及min_len的值較小時,RE-TFTRP-Mine比TFTRP-Mine有較好的執行效率;而由實際的樂曲資料之實驗顯示,當探勘過程中有考慮容錯比對時,可以找出更多重要且隱藏的重複樣式。 | zh_TW |
dc.description.abstract | This thesis proposes an efficient method for mining the top-K non-trivial fault-tolerant repeating patterns (FT-RPs) of length no less than min_len from data sequences. By extending the idea of appearing bit sequences, the thesis defines fault-tolerant appearing bit sequences, which represent the locations where candidate patterns appear in a data sequence when insertion/deletion errors are allowed. Two algorithms are proposed: TFTRP-Mine (Top-K non-trivial FT-RPs Mining) and RE-TFTRP-Mine (REfinement of TFTRP-Mine). Both algorithms use recursive formulas to compute the fault-tolerant appearing bit sequences of a pattern systematically, so the fault-tolerant frequency of each candidate pattern can be counted quickly. In addition, RE-TFTRP-Mine adopts two further strategies that prune the search space to improve mining efficiency. The experimental results show that RE-TFTRP-Mine outperforms TFTRP-Mine when K and min_len are small. Moreover, when fault-tolerant mining is applied to music objects, more important and implicit repeating patterns can be found. | en_US |
dc.description.sponsorship | 資訊教育研究所 | zh_TW |
dc.identifier | G0069108029 | |
dc.identifier.uri | http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G0069108029%22.&%22.id.& | |
dc.identifier.uri | http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/92676 | |
dc.language | English | |
dc.subject | 重複樣式 | zh_TW |
dc.subject | 容錯探勘 | zh_TW |
dc.subject | 位元序列 | zh_TW |
dc.subject | 前K個樣式探勘 | zh_TW |
dc.subject | 資料序列 | zh_TW |
dc.subject | Repeating Patterns | en_US |
dc.subject | Fault-Tolerant Mining | en_US |
dc.subject | Bit Sequences | en_US |
dc.subject | Top-K patterns Mining | en_US |
dc.subject | Data Sequence | en_US |
dc.title | 以位元序列為基礎探勘容錯重複樣式之研究 | zh_TW |
dc.title | An Efficient Approach for Mining Fault-Tolerant Repeating Patterns based on Bit Sequences | en_US |
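
The abstract builds on "appearing bit sequences", which record where a pattern occurs in a data sequence. As a rough illustration only, the sketch below shows the exact-match case: each symbol gets a bitmask of its positions, and a pattern's bitmask is obtained by shifting and AND-ing those masks. It does not reproduce the thesis's recursive formulas for insertion/deletion errors, and the function names (`symbol_bits`, `appearing_bits`, `frequency`) are hypothetical, chosen for this example.

```python
def symbol_bits(seq):
    """Map each symbol to an int whose bit i is set when seq[i] is that symbol."""
    bits = {}
    for i, sym in enumerate(seq):
        bits[sym] = bits.get(sym, 0) | (1 << i)
    return bits


def appearing_bits(pattern, bits):
    """Return an int whose bit i is set when `pattern` starts at position i.

    Exact matching only (an assumption of this sketch): the symbol at
    offset k of the pattern must occur at position i + k, so its mask is
    shifted right by k before being AND-ed into the result.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    result = bits.get(pattern[0], 0)
    for offset, sym in enumerate(pattern[1:], start=1):
        result &= bits.get(sym, 0) >> offset
    return result


def frequency(pattern, bits):
    """Count occurrences of `pattern` as the number of set bits."""
    return bin(appearing_bits(pattern, bits)).count("1")


bits = symbol_bits("ABCABCAB")
print(frequency("ABC", bits))  # occurrences starting at positions 0 and 3
print(frequency("AB", bits))   # occurrences starting at positions 0, 3 and 6
```

Representing positions as bits lets the frequency of an extended pattern be derived from the masks already computed for shorter patterns, which is the general property that the bit-sequence approach described in the abstract relies on.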