Abstract
We present deep neural network architectures for expression recognition in videos that are invariant to local scaling. We combine autoencoder and predictor architectures using an adaptive weighting scheme, which copes with a small labeled dataset while enriching our models from very large unlabeled sets. We further improve robustness to lighting conditions by introducing a new adaptive filter based on temporal local scale normalization. Our results are superior to those of known methods, including recently reported approaches based on neural nets.
Original language | English |
---|---|
Pages (from-to) | 25-35 |
Number of pages | 11 |
Journal | Pattern Recognition |
Volume | 76 |
State | Published - Apr 2018 |
Keywords
- Deep learning
- Expression recognition
- Machine learning
- Neural nets
- Video classification
All Science Journal Classification (ASJC) codes
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence