| Paper: | PS-2A.16 |
| Session: | Poster Session 2A |
| Location: | Symphony/Overture |
| Session Time: | Friday, September 7, 17:15 - 19:15 |
| Presentation Time: | Friday, September 7, 17:15 - 19:15 |
| Presentation: | Poster |
| Publication: | 2018 Conference on Cognitive Computational Neuroscience, 5-8 September 2018, Philadelphia, Pennsylvania |
| Paper Title: | A Large Scale Multi-Label Action Dataset for Video Understanding |
| DOI: | https://doi.org/10.32470/CCN.2018.1137-0 |
| Authors: | Mathew Monfort, Kandan Ramakrishnan, MIT, United States; Dan Gutfreund, IBM Research and MIT-IBM Watson AI Lab, United States; Aude Oliva, MIT, United States |
| Abstract: | The world is inherently multi-label. Even when restricted to the space of actions, multiple things and events often happen simultaneously, and a single label is commonly insufficient to capture the full meaning of an event. To develop methods that reach human-level understanding of dynamic events, we need to capture the complex nature of our environment. Here, we present a multi-label extension to the Moments in Time dataset that includes annotations of multiple actions in each video. We perform a baseline analysis and compare the recognition results, class selectivity, and network robustness of a temporal relation network (TRN) trained on both the single-label Moments in Time dataset and the proposed multi-label extension. |
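
As a purely illustrative aside (not code from the paper or dataset release), the sketch below shows the training-objective difference the abstract alludes to: a single-label action classifier is typically trained with softmax cross-entropy against one target class per video, whereas a multi-label classifier uses per-class sigmoids with binary cross-entropy against an indicator vector, so several actions can be marked active at once. The class count, feature dimension, and label indices are made-up placeholders.

```python
# Hypothetical sketch of single-label vs. multi-label training objectives.
import torch
import torch.nn as nn

NUM_CLASSES = 339   # placeholder class count, not the dataset's actual number
FEATURE_DIM = 512   # placeholder clip-feature dimension from some video backbone

classifier = nn.Linear(FEATURE_DIM, NUM_CLASSES)
features = torch.randn(8, FEATURE_DIM)  # a batch of 8 pooled video features

# Single-label: one action index per video, softmax cross-entropy.
single_targets = torch.randint(0, NUM_CLASSES, (8,))
single_loss = nn.CrossEntropyLoss()(classifier(features), single_targets)

# Multi-label: a 0/1 indicator vector per video; several actions can be active.
multi_targets = torch.zeros(8, NUM_CLASSES)
multi_targets[0, [3, 17, 42]] = 1.0  # e.g. three co-occurring actions in video 0
multi_loss = nn.BCEWithLogitsLoss()(classifier(features), multi_targets)

print(single_loss.item(), multi_loss.item())
```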