Research on an Improved YOLOv7 Algorithm for Object Detection in Aerial Images (針對空拍影像物件偵測之改良型YOLOv7演算法研究)

鍾宜修; Chung, Yi-Hsiu

Files

202400045200-107524.pdf (29.04 MB)

Date

2024

Authors

鍾宜修

Chung, Yi-Hsiu

Abstract

Unmanned aerial vehicle (UAV) technology has advanced rapidly in recent years: flight ranges have grown, airframes have shrunk, and autonomous flight has become possible. As a result, UAVs are now used in a widening range of applications, such as traffic monitoring and the inspection of industrial sites and natural environments. With the rise of artificial intelligence, UAVs also increasingly rely on AI algorithms for image recognition. Because objects in UAV imagery tend to be small and onboard computing resources are limited, it is crucial to improve small-object detection while reducing the resources the model requires. This thesis takes YOLOv7 as the base model and modifies it to improve small-object detection while reducing the model's parameter count and computational cost; the improvements are validated on the VisDrone-DET2019 dataset. Five modifications are made. First, the Efficient Layer Aggregation Network (ELAN) is replaced with a Modified Efficient Layer Aggregation Network (M-ELAN). Second, a Modified Feature Layer Attention Module (M-FLAM) is added at the high-level feature layers. Third, the feature fusion structure is changed from the Path Aggregation Network (PANet) to Residual Feature Fusion (ResFF). Fourth, the downsampling modules in the model are replaced with Improved MaxPool (I-MP) modules. Fifth, Spatial Pyramid Pooling Cross Stage Partial Networks (SPPCSPC) is replaced with Group Spatial Pyramid Pooling (GSPP). Combined, these changes raise mean Average Precision (mAP) by 1% while reducing the parameter count by 24.5% and the computational cost, measured in GFLOPs (giga floating-point operations), by 13.7%.
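The parameter and GFLOPs figures are relative reductions against the YOLOv7 baseline. As a minimal sketch of that arithmetic, the Python below applies the two stated percentages to a baseline. The baseline values (36.9 M parameters, 104.7 GFLOPs) are assumptions: they are the figures commonly quoted for stock YOLOv7 at 640×640 input, not numbers taken from this thesis, whose own baseline may differ. The function name `apply_relative_change` is hypothetical.

```python
def apply_relative_change(baseline: float, percent_change: float) -> float:
    """Return the value after a relative change; -24.5 means 24.5% lower."""
    return baseline * (1.0 + percent_change / 100.0)


# Assumed baseline figures for stock YOLOv7 (NOT taken from the thesis).
BASELINE_PARAMS_M = 36.9   # parameters, in millions
BASELINE_GFLOPS = 104.7    # giga floating-point operations per forward pass

# Reductions reported in the abstract.
improved_params_m = apply_relative_change(BASELINE_PARAMS_M, -24.5)
improved_gflops = apply_relative_change(BASELINE_GFLOPS, -13.7)

print(f"parameters: {improved_params_m:.2f} M")
print(f"compute:    {improved_gflops:.2f} GFLOPs")
```

The mAP gain is left out of the sketch because the abstract does not state whether the 1% is in absolute percentage points or relative to the baseline score.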

Keywords

Deep learning, Object detection, YOLOv7, UAV, Small object

URI

https://etds.lib.ntnu.edu.tw/thesis/detail/1e6ef519ec75c48353b135a53d7e29e9/
http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/122943

Collections

Theses and Dissertations
