| Paper: | PS-2A.16 |
| Session: | Poster Session 2A |
| Location: | Symphony/Overture |
| Session Time: | Friday, September 7, 17:15 - 19:15 |
| Presentation Time: | Friday, September 7, 17:15 - 19:15 |
| Presentation: | Poster |
| Publication: | 2018 Conference on Cognitive Computational Neuroscience, 5-8 September 2018, Philadelphia, Pennsylvania |
| Paper Title: | A Large Scale Multi-Label Action Dataset for Video Understanding |
| DOI: | https://doi.org/10.32470/CCN.2018.1137-0 |
| Authors: | Mathew Monfort, Kandan Ramakrishnan, MIT, United States; Dan Gutfreund, IBM Research and MIT-IBM Watson AI Lab, United States; Aude Oliva, MIT, United States |
| Abstract: | The world is inherently multi-label. Even when restricted to the space of actions, multiple things and events often happen simultaneously, and a single label is commonly insufficient to capture the full meaning of an event. To develop methods that reach human-level understanding of dynamic events, we need to capture the complex nature of our environment. Here, we present a multi-label extension to the Moments in Time dataset that includes annotations of multiple actions in each video. We perform a baseline analysis and compare the recognition results, class selectivity, and network robustness of a temporal relation network (TRN) trained on both the single-label Moments in Time dataset and the proposed multi-label extension. |
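
As a purely illustrative aside (not code from the paper or dataset release), the sketch below shows the training-objective difference the abstract alludes to: a single-label action classifier is typically trained with softmax cross-entropy against one target class per video, whereas a multi-label classifier uses per-class sigmoids with binary cross-entropy against an indicator vector, so several actions can be marked active at once. The class count, feature dimension, and label indices are made-up placeholders.

```python
# Hypothetical sketch of single-label vs. multi-label training objectives.
import torch
import torch.nn as nn

NUM_CLASSES = 339   # placeholder class count, not the dataset's actual number
FEATURE_DIM = 512   # placeholder clip-feature dimension from some video backbone

classifier = nn.Linear(FEATURE_DIM, NUM_CLASSES)
features = torch.randn(8, FEATURE_DIM)  # a batch of 8 pooled video features

# Single-label: one action index per video, softmax cross-entropy.
single_targets = torch.randint(0, NUM_CLASSES, (8,))
single_loss = nn.CrossEntropyLoss()(classifier(features), single_targets)

# Multi-label: a 0/1 indicator vector per video; several actions can be active.
multi_targets = torch.zeros(8, NUM_CLASSES)
multi_targets[0, [3, 17, 42]] = 1.0  # e.g. three co-occurring actions in video 0
multi_loss = nn.BCEWithLogitsLoss()(classifier(features), multi_targets)

print(single_loss.item(), multi_loss.item())
```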