實現於圖形處理器的高性能平行位置檢知近似字串比對演算法

黃俊程; Huang, Chun-cheng

實現於圖形處理器的高性能平行位置檢知近似字串比對演算法

dc.contributor	林政宏	zh_TW
dc.contributor	Lin, Cheng-Hung	en_US
dc.contributor.author	黃俊程	zh_TW
dc.contributor.author	Huang, Chun-cheng	en_US
dc.date.accessioned	2019-09-03T10:45:18Z
dc.date.available	2022-12-31
dc.date.available	2019-09-03T10:45:18Z
dc.date.issued	2017
dc.description.abstract	近似字串比對被廣泛的應用於各種不同的領域。例如：去氧核糖核酸序列搜尋比對、電腦文字輸入校正、文字資料探勘以及垃圾電子郵件過濾等。近似字串比對的設計是用來搜索文件中所有近似字串文字樣式字串及使用者定義的插入、刪除、取代的容許誤差值搜索後的匹配的位置。在眾多被提出的近似字串比對演算法中，位元平行演算法被認定是最適、最有效率的演算法。然而，傳統的位元平行演算法無法同時偵測樣式字串匹配字串的起始及結束位置。除此之外，因應現代多核心處理器的硬體發展，以及叢集運算、雲端運算以及大數據處理的需求。如何加速位元平行演算法成為當今重要的課題。本研究分別提出利用多串流平行和高維度平行這兩種位置偵測近似字串比對方法並利用圖形運算處理器(GPU)來加速演算法。實驗結果顯示，比較高維度平行演算法在圖形運算處理器(GPU)以及CPU的執行效能，圖形運算處理器(GPU)版本無論是在作業系統執行環境或者是在圖形顯示處理(GPU)核心執行的情況，效能皆得到顯著的提升。相較於其他相關研究，本研究所提出的高維度平行的方法達到了 11 至 105 倍的效能改善。最終，我們開發了一個讓使用者可以在線上執行位置偵測近似字串比對並找出所有樣式字串匹配結果起點及終點位置網路服務。	zh_TW
dc.description.abstract	Approximate string matching has been widely used in many applications, including deoxyribonucleic acid sequence searching, spell checking, text mining, and spam filters. The method is designed to find all locations of strings that approximately match a pattern in accordance with the number of insertion, deletion, and substitution operations. Among the proposed algorithms, the bit-parallel algorithms are considered to be the best and highly efficient algorithms. However, the traditional bit-parallel algorithms lacks the ability of identifying the start and end positions of a matched pattern. Furthermore, acceleration of the bit-parallel algorithms has become a crucial issue for processing big data nowadays. In this paper, we propose two kinds of parallel location-aware algorithms called data-segmented parallelism and high-degree parallelism as means to accelerate approximate string matching using graphic processing units. Experimental results show that the high-degree parallelism on GPUs achieves significant improvement in system and kernel throughputs compared to CPU counterparts. Compared to state-of-the-art approaches, the proposed high-degree parallelism achieves 11 to 105 times improvement. Finally, we develop a web service which allows users to perform approximate string matching on line and deliver all matched substrings with the start and end positions.	en_US
dc.description.sponsorship	電機工程學系	zh_TW
dc.identifier	G060075027H
dc.identifier.uri	http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G060075027H%22.&%22.id.&
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/95606
dc.language	中文
dc.subject	近似字串比對	zh_TW
dc.subject	位元平行演算法	zh_TW
dc.subject	圖形運算處理器	zh_TW
dc.subject	編輯距離	zh_TW
dc.subject	非確定有限狀態自動機	zh_TW
dc.subject	平行演算法	zh_TW
dc.subject	approximate string matching	en_US
dc.subject	bit-parallel algorithm	en_US
dc.subject	graphic processing units	en_US
dc.subject	Levenshtein distance	en_US
dc.subject	nondeterministic finite automaton	en_US
dc.subject	parallel algorithm	en_US
dc.title	實現於圖形處理器的高性能平行位置檢知近似字串比對演算法	zh_TW
dc.title	High-Performance Parallel Location-Aware Algorithms for Approximate String Matching on GPUs	en_US

Collections

學位論文

實現於圖形處理器的高性能平行位置檢知近似字串比對演算法

Files

Collections