Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction

Ozkan, Savas; Akar, Gözde

oai:https://open.metu.edu.tr:11511/47843

Publication date: 29 October 2017

Abstract

Frame-level visual features are generally aggregated over time with techniques such as LSTM, Fisher Vectors, and NetVLAD to produce a robust video-level representation. Here we introduce a learnable aggregation technique whose primary objective is to retain, in the representation, the short-time temporal structure between frame-level features and their spatial interdependencies. It can also be easily adapted to cases where training samples are very scarce. We evaluate the method on a real-fake expression prediction dataset to demonstrate its superiority. Our method obtains a score of 65% on the test dataset in the official MAP evaluation, which differs from the best reported result in the Chalearn Challenge (66.7%) by only one misclassified decision. Lastly, we believe that this method can be extended in the future to other problems such as action/event recognition.

OpenMETU (Middle East Technical University)

Last time updated on 02/12/2021

This paper was published in OpenMETU (Middle East Technical University).
