Target Object Grasping Strategy in Unknown Environments Based on Image-to-Action Translation
Date
2023
Abstract
This thesis aims to enable a robotic arm to grasp a static or moving target using only RGB images, without any 3D position information about the target. The proposed method has three advantages: it provides a general class of control strategies for various types of robotic arms in unknown environments; its image-to-action translation autonomously generates action commands for the corresponding degrees of freedom; and it does not require the target position. First, the YOLO (You Only Look Once) algorithm performs image segmentation, dividing each RGB image into meaningful objects or regions. A convolutional neural network (CNN) model is then trained with the proximal policy optimization (PPO) algorithm. The input of the CNN model is the RGB image in which only the robotic arm and the target are kept, and its output is the rotation amounts of the motors. To avoid mechanical damage from collisions between the robotic arm and objects, the deep reinforcement learning training is carried out in the Gazebo simulation environment. Finally, experimental results demonstrate the effectiveness of the proposed strategy.
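The abstract describes feeding the CNN an RGB image in which only the robotic arm and the target remain after segmentation. The following is a minimal sketch of what such a masking step might look like, not the thesis's actual implementation. It assumes the segmentation stage has already produced one boolean mask and one class name per detected instance. The function name keep_arm_and_target and the class names "robot_arm" and "target" are hypothetical and chosen only for illustration.

```python
import numpy as np


def keep_arm_and_target(image, masks, labels, keep=("robot_arm", "target")):
    """Black out every pixel that is not covered by a kept instance mask.

    image:  H x W x 3 RGB frame
    masks:  N x H x W boolean array, one mask per segmented instance
    labels: length-N sequence of class names (hypothetical names)
    keep:   class names whose pixels are preserved (assumed, not from the thesis)
    """
    image = np.asarray(image)
    masks = np.asarray(masks, dtype=bool)

    # Union of all masks whose class should be kept.
    combined = np.zeros(image.shape[:2], dtype=bool)
    for mask, label in zip(masks, labels):
        if label in keep:
            combined |= mask

    # Copy only the kept pixels onto a black background.
    filtered = np.zeros_like(image)
    filtered[combined] = image[combined]
    return filtered


# Toy 4x4 frame: the arm covers the top-left 2x2 block, the target covers
# the bottom-right pixel, and a "table" instance covers the whole frame.
frame = np.full((4, 4, 3), 255, dtype=np.uint8)
arm = np.zeros((4, 4), dtype=bool)
arm[:2, :2] = True
target = np.zeros((4, 4), dtype=bool)
target[3, 3] = True
table = np.ones((4, 4), dtype=bool)

filtered = keep_arm_and_target(
    frame, np.stack([arm, target, table]), ["robot_arm", "target", "table"]
)
```

In this toy case the table pixels are zeroed while the arm and target pixels keep their original values, so a policy network would see only the two objects relevant to grasping.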
Keywords
Deep reinforcement learning, target grasp strategy, proximal policy optimization (PPO), image-to-action translation