Deep-Learning-Based Real-Time Multiple-Person Action Recognition System
Date
2020
Abstract
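With the rapid development of artificial intelligence, human action recognition has attracted considerable attention in recent years and has a wide range of applications. Examples include real-time activity detection for care recipients in long-term care centers and abnormal-behavior detection in factories and public spaces. In each case, the recognition results can drive smart-surveillance functions such as reminders, alerts, and logging, which help prevent accidents and ease staff shortages. How to use robust action recognition for real-time smart surveillance is therefore an important issue. This thesis proposes a deep-learning-based real-time multiple-person action recognition system for smart surveillance and applies it to a long-term care environment. The system combines YOLOv3 with the Deep SORT algorithm to detect and track multiple people in a video simultaneously. When a person faces the camera, the system can also identify the person by name using the FaceNet architecture. For people far from the camera, we develop a zoom-in method that automatically switches to the high-resolution frame according to the size of the person's bounding box to obtain better recognition results. To improve robustness, regions outside the person are blurred before the frames are fed to I3D, which reduces the influence of the background. Finally, non-maximum suppression is used to reduce the instability caused by multiple sliding windows. Experimental results show that the proposed method achieves real-time multiple-person action recognition.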
Smart surveillance offers major advantages in recognizing and interpreting human activity, and it can be applied to care settings such as homes, kindergartens, nursing homes, and day-care centers. For caregivers, it provides real-time activity detection that helps prevent accidents while they look after patients. For institutions and society, it helps address the shortage of human resources. With the rapid development of artificial intelligence and related applications, how to use action recognition to achieve real-time smart surveillance has become an important issue. In this thesis, we propose a deep-learning-based multiple-person action recognition system that serves as a smart surveillance system for a long-term care environment. By combining YOLOv3 with the Deep SORT algorithm, the system detects and tracks multiple people in a video. If a person faces the camera, the system can also recognize the person's name through FaceNet. Taking advantage of the camera's high-resolution images, we develop a 'zoom in' approach that, based on the size of each bounding box, yields more satisfactory action recognition results for people located farther away. To make recognition robust, the video frames containing the bounding boxes are sent to I3D for action recognition, with the background areas blurred to reduce noise. Finally, because multiple sliding windows produce multiple I3D results, a non-maximum suppression (NMS) approach is applied to those results to improve the accuracy of the recognition. Experimental results show that the proposed approach achieves real-time multiple-person action recognition.
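The abstract does not say how non-maximum suppression is applied to the sliding-window outputs, so the following is only a minimal sketch of one common variant, temporal NMS, and not the thesis's implementation. It assumes each sliding window yields one prediction for a tracked person (a frame range, a label, and a confidence score) and that overlapping windows compete regardless of label. The names `WindowPrediction`, `temporal_iou`, and `temporal_nms`, and the 0.3 overlap threshold, are hypothetical choices made for illustration.

```python
from typing import List, NamedTuple


class WindowPrediction(NamedTuple):
    """One action prediction for a single tracked person (hypothetical structure)."""
    start: int    # first frame index of the window (inclusive)
    end: int      # last frame index of the window (exclusive)
    label: str    # predicted action label
    score: float  # classifier confidence


def temporal_iou(a: WindowPrediction, b: WindowPrediction) -> float:
    """Intersection-over-union of two frame ranges."""
    inter = max(0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - inter
    return inter / union if union > 0 else 0.0


def temporal_nms(preds: List[WindowPrediction],
                 iou_threshold: float = 0.3) -> List[WindowPrediction]:
    """Greedy NMS along the time axis.

    Predictions are visited from highest to lowest score; a prediction is
    kept only if it does not overlap any already-kept prediction by more
    than iou_threshold (an assumed, illustrative value).
    """
    kept: List[WindowPrediction] = []
    for p in sorted(preds, key=lambda p: p.score, reverse=True):
        if all(temporal_iou(p, k) <= iou_threshold for k in kept):
            kept.append(p)
    # Return survivors in chronological order.
    return sorted(kept, key=lambda p: p.start)
```

With windows of different lengths covering the same frames, only the highest-scoring prediction in each overlapping group survives, which is one way the flicker between competing window outputs could be damped. Whether suppression should be per label or label-agnostic, and what threshold to use, would depend on the actual system.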
Keywords
action recognition, deep learning, face recognition, human tracking, smart surveillance, 3D convolution