Improved Deep Learning for Visual Localization of Humanoid Robots under Highly Dynamic Noise
Date
2024
Abstract
Several visual SLAM methods based on cameras or other optical sensors have been proposed for robots to navigate and understand their environment. For example, ORB-SLAM is a complete SLAM system that includes visual odometry, tracking, and localization. ORB-SLAM relies solely on feature detection with a monocular camera, but when it is used on a humanoid robot, severe motion-blur problems arise. Deep learning has been shown to be effective for robust, real-time monocular image relocalization. Deep-learning visual localization uses a convolutional neural network to learn the 6-DoF pose, and it is more robust to difficult lighting and motion conditions. However, a drawback of deep-learning visual localization methods is that they require large datasets with accurate labels. This thesis therefore also proposes methods for labeling visual localization data and automatically building datasets for training visual localization. Our labels encode the pose on a 2D plane (x-axis, y-axis, orientation). The results show that the deep-learning method can indeed overcome the motion-blur problem: compared with our previous system, the visual localization method reduces the maximum error by 31.73% and the average error by 55.18%.
Several visual SLAM methods based on cameras or other optical sensors have been proposed for robots to navigate and understand their environment. For example, ORB-SLAM is a complete SLAM system, including visual odometry, tracking, and loop closure detection. ORB-SLAM depends solely on feature detection using a monocular camera, but when working with humanoid robots, serious motion-blur problems arise. Deep learning has been shown to be effective for robust and real-time monocular image relocalization. Deep learning for visual localization is based on a convolutional neural network that learns the 6-DoF pose, and it is more robust to difficult lighting and motion conditions. However, the problem with deep-learning methods for visual localization is that they require large datasets and accurate labeling of those datasets. This thesis also proposes methods for labeling visual localization data and augmenting datasets for training visual localization. Our labels encode the camera pose on a 2D plane (x-axis, y-axis, orientation). In terms of results, the deep-learning method can indeed solve the problem of motion blur. Compared to our previous system, the visual localization method reduces the maximum error by 31.73% and the average error by 55.18%.
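The abstract describes pose labels restricted to a 2D plane (x, y, orientation) and reports results as percentage reductions in maximum and average error. A minimal sketch of both ideas is below; the function names and the sample numbers are illustrative, not taken from the thesis:

```python
import math

def make_pose_label(x, y, theta):
    """Build a 2D-plane pose label (x, y, orientation).
    The orientation is wrapped to (-pi, pi] so the regression target
    stays continuous across the +/-180 degree boundary."""
    theta = math.atan2(math.sin(theta), math.cos(theta))
    return (x, y, theta)

def pose_error(pred, label):
    """Euclidean position error on the 2D plane, plus absolute heading
    error with angular wrap-around handled."""
    pos_err = math.hypot(pred[0] - label[0], pred[1] - label[1])
    d = pred[2] - label[2]
    ang_err = abs(math.atan2(math.sin(d), math.cos(d)))
    return pos_err, ang_err

def error_reduction(old_errors, new_errors):
    """Percent reduction in maximum and in average error -- the two
    metrics the abstract reports (31.73% and 55.18%)."""
    max_red = (max(old_errors) - max(new_errors)) / max(old_errors) * 100
    avg_old = sum(old_errors) / len(old_errors)
    avg_new = sum(new_errors) / len(new_errors)
    avg_red = (avg_old - avg_new) / avg_old * 100
    return max_red, avg_red
```

A CNN trained on such labels would regress the three values directly; the evaluation then compares per-frame `pose_error` values of the new and the previous system through `error_reduction`.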
Keywords
Humanoid Robot, Deep Learning, Robot Localization, Visual Odometry