Paper: PS-2A.16
Session: Poster Session 2A
Location: Symphony/Overture
Session Time: Friday, September 7, 17:15 - 19:15
Presentation Time: Friday, September 7, 17:15 - 19:15
Presentation: Poster
Publication: 2018 Conference on Cognitive Computational Neuroscience, 5-8 September 2018, Philadelphia, Pennsylvania
Paper Title: A Large Scale Multi-Label Action Dataset for Video Understanding
DOI: https://doi.org/10.32470/CCN.2018.1137-0
Authors: Mathew Monfort, Kandan Ramakrishnan, MIT, United States; Dan Gutfreund, IBM Research and MIT-IBM Watson AI Lab, United States; Aude Oliva, MIT, United States
Abstract: The world is inherently multi-label. Even within the space of actions, multiple things and events often happen simultaneously, and a single label is commonly insufficient to capture the full meaning of an event. To develop methods that reach human-level understanding of dynamic events, we need to capture the complex nature of our environment. Here, we present a multi-label extension to the Moments in Time Dataset that includes annotations of multiple actions in each video. We perform a baseline analysis and compare the recognition results, class selectivity, and network robustness of a temporal relation network (TRN) trained on both single-label Moments in Time and the proposed multi-label extension.
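As a point of reference for the multi-label setup described above (this is an illustrative sketch, not the authors' code): multi-label recognition is commonly trained by replacing the single-label softmax/cross-entropy head with independent per-class sigmoids and a binary cross-entropy loss, so that several action labels can be active for the same video. The class count and logits below are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Per-class probability; classes are scored independently,
    # unlike softmax, where probabilities must sum to 1.
    return 1.0 / (1.0 + np.exp(-z))

def multilabel_bce(logits, targets):
    """Mean binary cross-entropy over all (video, class) pairs.

    logits  -- raw network outputs, shape (batch, num_classes)
    targets -- 0/1 multi-hot label matrix, same shape
    """
    p = sigmoid(logits)
    eps = 1e-12  # avoid log(0)
    return -np.mean(targets * np.log(p + eps)
                    + (1 - targets) * np.log(1 - p + eps))

# Toy example: 2 videos, 4 hypothetical action classes,
# each video carrying two simultaneous labels.
logits = np.array([[3.0, -2.0, 2.5, -3.0],
                   [-2.5, 2.0, -3.0, 3.0]])
targets = np.array([[1, 0, 1, 0],
                    [0, 1, 0, 1]], dtype=float)

loss = multilabel_bce(logits, targets)
# At inference, each class is thresholded independently,
# so a video can be assigned multiple actions.
preds = (sigmoid(logits) > 0.5).astype(int)
```

With a single-label softmax head, only one action per video could score highest; the per-class thresholding above is what allows the multi-hot predictions the dataset extension calls for.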