Mining Quantitative Interval Association Rules with Density Constraints

Date

2003

Abstract

This thesis proposes a new method for mining quantitative association rules, the PQAR (Partition-based Quantitative Association Rule mining) algorithm. PQAR uses space partitioning to find the frequent interval itemsets that satisfy a minimum relative density requirement, and then generates quantitative association rules from those itemsets. When mining frequent interval itemsets, PQAR filters candidates by minimum relative density as well as minimum support, so that, for a given support threshold, it does not return intervals in which the data are sparsely distributed. Space partitioning also lets PQAR find the largest intervals that meet both thresholds, which reduces the number of qualifying intervals and yields rules that are concise and significant. Because PQAR can need fewer database scans, its mining time is considerably shorter than that of previous approaches. Experiments on data sets with various support and relative density settings show that PQAR achieves high accuracy and recall in most cases. Moreover, at the same accuracy, PQAR takes much less time than the QAR algorithm.
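
The two filtering conditions named in the abstract can be illustrated with a minimal Python sketch. This is not the PQAR algorithm: it does no space partitioning and handles a single numeric attribute only. The abstract does not define relative density, so the sketch assumes it is the density of records inside an interval divided by the density of records over the attribute's full observed range; the function names and sample data are likewise hypothetical.

```python
from typing import Sequence


def count_inside(values: Sequence[float], lo: float, hi: float) -> int:
    """Number of records whose value lies in the closed interval [lo, hi]."""
    return sum(1 for v in values if lo <= v <= hi)


def support(values: Sequence[float], lo: float, hi: float) -> float:
    """Fraction of all records that fall inside [lo, hi]."""
    return count_inside(values, lo, hi) / len(values)


def relative_density(values: Sequence[float], lo: float, hi: float) -> float:
    """ASSUMED definition (not taken from the thesis): records per unit
    length inside [lo, hi], divided by records per unit length over the
    attribute's full observed range."""
    full_width = max(values) - min(values)
    if hi <= lo or full_width <= 0:
        raise ValueError("interval and data range must have positive width")
    overall_density = len(values) / full_width
    interval_density = count_inside(values, lo, hi) / (hi - lo)
    return interval_density / overall_density


def passes_both(values: Sequence[float], lo: float, hi: float,
                min_sup: float, min_rel_density: float) -> bool:
    """True when the interval meets both the support and density thresholds."""
    return (support(values, lo, hi) >= min_sup
            and relative_density(values, lo, hi) >= min_rel_density)


# Hypothetical sample data: six clustered values and two outliers.
ages = [20, 21, 22, 23, 24, 25, 60, 80]

# The narrow interval is both frequent and dense.
narrow_ok = passes_both(ages, 20, 25, min_sup=0.5, min_rel_density=2.0)

# The wide interval has higher support but is no denser than the data
# as a whole, so the density threshold rejects it.
wide_ok = passes_both(ages, 20, 80, min_sup=0.5, min_rel_density=2.0)
```

Under these assumptions, the interval [20, 25] passes both thresholds while [20, 80] meets the support threshold and still fails the density check, which is the situation the relative density constraint is meant to filter out.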

Keywords

data mining, association rule
