Online process phase detection using multimodal deep learning
Document Type
Conference Proceeding
Publication Date
12-7-2016
Journal
2016 IEEE 7th Annual Ubiquitous Computing, Electronics and Mobile Communication Conference, UEMCON 2016
DOI
10.1109/UEMCON.2016.7777912
Keywords
Activity recognition; deep learning; Kinect; multimodal sensing; process phase recognition; trauma resuscitation
Abstract
© 2016 IEEE. We present a multimodal deep-learning structure that automatically predicts phases of the trauma resuscitation process in real time. The system first pre-processes the audio and video streams captured by a Kinect's built-in microphone array and depth sensor. The multimodal deep-learning structure then extracts video and audio features, which are later combined through a 'slow fusion' model. The final decision is made from the combined features through a modified softmax classification layer. The model was trained on 20 trauma resuscitation cases (>13 hours) and tested on 5 other cases. Our results showed over 80% online detection accuracy with a 0.7 F-score, outperforming previous systems.
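To make the architecture described in the abstract concrete, the sketch below shows one plausible PyTorch layout of a two-branch network whose video and audio features are extracted separately and then combined in a shared fusion layer before a softmax classifier. It is an illustration only: the layer sizes, input shapes, and the number of phases (num_phases=7) are assumptions, not values from the paper, and the paper's exact 'slow fusion' schedule and modified softmax layer are not reproduced here.

```python
# Illustrative sketch of a two-branch multimodal phase classifier.
# All dimensions and hyperparameters below are hypothetical.
import torch
import torch.nn as nn

class MultimodalPhaseNet(nn.Module):
    def __init__(self, num_phases=7):  # number of phases is an assumption
        super().__init__()
        # Video branch: 3D convolutions over short depth-frame clips
        # (1 input channel, since the Kinect depth stream is single-channel).
        self.video = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=(3, 5, 5), stride=(1, 2, 2)),
            nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=(3, 5, 5), stride=(1, 2, 2)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),   # -> (B, 32, 1, 1, 1)
        )
        # Audio branch: 2D convolutions over spectrogram patches computed
        # from the microphone-array signal.
        self.audio = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # -> (B, 32, 1, 1)
        )
        # Fusion: modality features are merged in a joint hidden layer
        # rather than concatenated at the raw-input level, approximating
        # the gradual-fusion idea; softmax is applied at inference/loss time.
        self.fusion = nn.Sequential(
            nn.Linear(32 + 32, 64),
            nn.ReLU(),
            nn.Linear(64, num_phases),
        )

    def forward(self, depth_clip, spectrogram):
        v = self.video(depth_clip).flatten(1)     # (B, 32)
        a = self.audio(spectrogram).flatten(1)    # (B, 32)
        return self.fusion(torch.cat([v, a], 1))  # (B, num_phases) logits

# Usage: a batch of 8 sixteen-frame depth clips with matching spectrograms
# (shapes are illustrative, not the paper's actual input resolution).
model = MultimodalPhaseNet()
logits = model(torch.randn(8, 1, 16, 120, 160), torch.randn(8, 1, 64, 64))
phase = logits.softmax(dim=1).argmax(dim=1)  # predicted phase per sample
```

For online (real-time) detection, such a model would be applied to a sliding window over the incoming streams, emitting one phase prediction per window.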
APA Citation
Li, X., Zhang, Y., Li, M., Chen, S., Austin, F., Marsic, I., & Burd, R. (2016). Online process phase detection using multimodal deep learning. 2016 IEEE 7th Annual Ubiquitous Computing, Electronics and Mobile Communication Conference, UEMCON 2016. https://doi.org/10.1109/UEMCON.2016.7777912