實現於圖形處理器的高性能平行位置檢知近似字串比對演算法

dc.contributor林政宏zh_TW
dc.contributorLin, Cheng-Hungen_US
dc.contributor.author黃俊程zh_TW
dc.contributor.authorHuang, Chun-chengen_US
dc.date.accessioned2019-09-03T10:45:18Z
dc.date.available2022-12-31
dc.date.available2019-09-03T10:45:18Z
dc.date.issued2017
dc.description.abstract近似字串比對被廣泛的應用於各種不同的領域。例如:去氧核糖核酸序列搜尋比對、電腦文字輸入校正、文字資料探勘以及垃圾電子郵件過濾等。近似字串比對的設計是用來搜索文件中所有近似字串文字樣式字串及使用者定義的插入、刪除、取代的容許誤差值搜索後的匹配的位置。在眾多被提出的近似字串比對演算法中,位元平行演算法被認定是最適、最有效率的演算法。然而,傳統的位元平行演算法無法同時偵測樣式字串匹配字串的起始及結束位置。除此之外,因應現代多核心處理器的硬體發展,以及叢集運算、雲端運算以及大數據處理的需求。如何加速位元平行演算法成為當今重要的課題。本研究分別提出利用多串流平行和高維度平行這兩種位置偵測近似字串比對方法並利用圖形運算處理器(GPU)來加速演算法。實驗結果顯示,比較高維度平行演算法在圖形運算處理器(GPU)以及CPU的執行效能,圖形運算處理器(GPU)版本無論是在作業系統執行環境或者是在圖形顯示處理(GPU)核心執行的情況,效能皆得到顯著的提升。相較於其他相關研究,本研究所提出的高維度平行的方法達到了 11 至 105 倍的效能改善。最終,我們開發了一個讓使用者可以在線上執行位置偵測近似字串比對並找出所有樣式字串匹配結果起點及終點位置網路服務。zh_TW
dc.description.abstractApproximate string matching has been widely used in many applications, including deoxyribonucleic acid sequence searching, spell checking, text mining, and spam filters. The method is designed to find all locations of strings that approximately match a pattern in accordance with the number of insertion, deletion, and substitution operations. Among the proposed algorithms, the bit-parallel algorithms are considered to be the best and highly efficient algorithms. However, the traditional bit-parallel algorithms lacks the ability of identifying the start and end positions of a matched pattern. Furthermore, acceleration of the bit-parallel algorithms has become a crucial issue for processing big data nowadays. In this paper, we propose two kinds of parallel location-aware algorithms called data-segmented parallelism and high-degree parallelism as means to accelerate approximate string matching using graphic processing units. Experimental results show that the high-degree parallelism on GPUs achieves significant improvement in system and kernel throughputs compared to CPU counterparts. Compared to state-of-the-art approaches, the proposed high-degree parallelism achieves 11 to 105 times improvement. Finally, we develop a web service which allows users to perform approximate string matching on line and deliver all matched substrings with the start and end positions.en_US
dc.description.sponsorship電機工程學系zh_TW
dc.identifierG060075027H
dc.identifier.urihttp://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G060075027H%22.&%22.id.&
dc.identifier.urihttp://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/95606
dc.language中文
dc.subject近似字串比對zh_TW
dc.subject位元平行演算法zh_TW
dc.subject圖形運算處理器zh_TW
dc.subject編輯距離zh_TW
dc.subject非確定有限狀態自動機zh_TW
dc.subject平行演算法zh_TW
dc.subjectapproximate string matchingen_US
dc.subjectbit-parallel algorithmen_US
dc.subjectgraphic processing unitsen_US
dc.subjectLevenshtein distanceen_US
dc.subjectnondeterministic finite automatonen_US
dc.subjectparallel algorithmen_US
dc.title實現於圖形處理器的高性能平行位置檢知近似字串比對演算法zh_TW
dc.titleHigh-Performance Parallel Location-Aware Algorithms for Approximate String Matching on GPUsen_US

Files

Collections