Online process phase detection using multimodal deep learning

Document Type

Conference Proceeding

Publication Date

12-7-2016

Journal

2016 IEEE 7th Annual Ubiquitous Computing, Electronics and Mobile Communication Conference, UEMCON 2016

DOI

10.1109/UEMCON.2016.7777912

Keywords

Activity recognition; deep learning; Kinect; multimodal sensing; process phase recognition; trauma resuscitation

Abstract

We present a multimodal deep-learning architecture that automatically predicts phases of the trauma resuscitation process in real time. The system first pre-processes the audio and video streams captured by a Kinect's built-in microphone array and depth sensor. A multimodal deep learning network then extracts video and audio features, which are combined through a 'slow fusion' model. The final decision is made from the combined features by a modified softmax classification layer. The model was trained on 20 trauma resuscitation cases (>13 hours) and tested on 5 other cases. Our results showed over 80% online detection accuracy with a 0.7 F-score, outperforming previous systems.
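The architecture described in the abstract (separate audio and video encoders whose features are merged gradually through intermediate layers before a softmax classifier) can be sketched roughly as follows. This is an illustrative reconstruction only: PyTorch is assumed, and the class name, layer sizes, input shapes, and six-phase output are placeholders rather than the paper's actual configuration; the 'modified' aspect of the softmax layer is not modeled here.

```python
import torch
import torch.nn as nn

class SlowFusionPhaseDetector(nn.Module):
    """Two modality-specific encoders whose features are merged through
    intermediate fusion layers ('slow fusion') before a final softmax
    classification layer predicts the process phase."""

    def __init__(self, num_phases: int = 6):  # phase count is a placeholder
        super().__init__()
        # Video branch: depth frames -> convolutional features (sizes assumed).
        self.video_encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (batch, 32)
        )
        # Audio branch: e.g. spectrogram patches from the mic array (assumed).
        self.audio_encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (batch, 16)
        )
        # 'Slow fusion': combine modality features through extra layers
        # instead of feeding the raw concatenation straight to the classifier.
        self.fusion = nn.Sequential(
            nn.Linear(32 + 16, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
        )
        self.classifier = nn.Linear(32, num_phases)  # softmax applied in loss

    def forward(self, depth_frame: torch.Tensor, audio_spec: torch.Tensor):
        v = self.video_encoder(depth_frame)
        a = self.audio_encoder(audio_spec)
        fused = self.fusion(torch.cat([v, a], dim=1))
        return self.classifier(fused)  # logits; pair with nn.CrossEntropyLoss

# Usage example: a batch of 4 depth frames with matching audio spectrograms.
model = SlowFusionPhaseDetector(num_phases=6)
logits = model(torch.randn(4, 1, 64, 64), torch.randn(4, 1, 32, 32))
phases = logits.argmax(dim=1)  # predicted phase index per sample
```

For online detection as described, each incoming Kinect frame/audio window would be pre-processed and passed through a model of this shape to yield a phase prediction in real time.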
