Discriminatively Trained Latent Ordinal Model for Video Classification
release_j6bwte7afvdwdbulmlzl5tvmg4
by
Karan Sikka, Gaurav Sharma
2016
Abstract
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (e.g., onset and offset phases for "smile", running and jumping for
"highjump"). The proposed model is inspired by recent work on Multiple
Instance Learning and latent SVM/HCRF; it extends such frameworks to
approximately model the ordinal aspect of videos. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video-based facial analysis datasets for prediction of
expression, clinical pain, and intent in dyadic conversations, and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method.
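
The abstract does not spell out the scoring function, so the following is only a minimal sketch of the general idea of scoring a video by ordered sub-events. It assumes linear sub-event templates and a hard temporal ordering solved by dynamic programming, whereas the abstract states that the ordinal aspect is modeled only approximately. The names `ordered_subevent_score`, `frame_feats`, and `templates` are hypothetical and do not come from the paper.

```python
import numpy as np


def ordered_subevent_score(frame_feats, templates):
    """Best total response when K templates sit on K distinct frames in temporal order.

    frame_feats: (T, D) array of per-frame features (assumed precomputed).
    templates:   (K, D) array, one linear template per sub-event (assumed learned).
    Returns (score, frames), where frames[k] is the frame chosen for sub-event k.
    """
    responses = frame_feats @ templates.T  # (T, K) template responses per frame
    T, K = responses.shape
    if K > T:
        raise ValueError("need at least as many frames as sub-events")

    # dp[k, t]: best score for sub-events 0..k with sub-event k placed at frame t
    dp = np.full((K, T), -np.inf)
    back = np.zeros((K, T), dtype=int)
    dp[0] = responses[:, 0]

    for k in range(1, K):
        best_prev, best_idx = -np.inf, -1
        for t in range(k, T):
            # running maximum over placements of sub-event k-1 strictly before t
            if dp[k - 1, t - 1] > best_prev:
                best_prev, best_idx = dp[k - 1, t - 1], t - 1
            dp[k, t] = responses[t, k] + best_prev
            back[k, t] = best_idx

    # backtrack from the best final placement to recover the latent frames
    t = int(np.argmax(dp[K - 1]))
    score = float(dp[K - 1, t])
    frames = [t]
    for k in range(K - 1, 0, -1):
        t = int(back[k, t])
        frames.append(t)
    return score, frames[::-1]


# Toy example: sub-event 0 fires on feature 0, sub-event 1 on feature 1.
toy_feats = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
toy_templates = np.array([[1.0, 0.0], [0.0, 1.0]])
print(ordered_subevent_score(toy_feats, toy_templates))  # (2.0, [1, 2])
```

In a weakly supervised setting, the chosen frames act as latent variables: only the video-level label is given, and the sub-event locations are inferred by the maximization.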
Archived Files and Locations
application/pdf, 3.1 MB (file_vydxr4tivzd3vjmjdrvx2hhjny)
arxiv.org (repository), web.archive.org (webarchive)
arXiv:1608.02318v1