DejaVid
We propose a novel framework for Semantic Video Retrieval (SVR), where we aim to find videos within a corpus that are semantically similar to a given query video. Difficulties with this problem include identifying semantically relevant events in a video and matching events in videos despite events spanning different durations. One promising technique is Dynamic Time Warping (DTW), which is temporal deformation-invariant but typically only supports low-dimensional data. In this work, we propose a DTW-augmented neural network architecture that learns the semantic relevance of events and features in a video, enabling general-purpose SVR without hand-coded events or features.
Participants
Darryl Ho, Samuel Madden