Bibliographic Details
Main Authors: Li, Yuke, Chen, Guangyi, Abramowitz, Ben, Anzellotti, Stefano, Wei, Donglai
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2402.12706
_version_ 1866911904223461376
author Li, Yuke
Chen, Guangyi
Abramowitz, Ben
Anzellotti, Stefano
Wei, Donglai
contents Few-shot action recognition aims to quickly adapt a pre-trained model to novel data with a distribution shift using only a limited number of samples. Key challenges include how to identify and leverage the transferable knowledge learned by the pre-trained model. We therefore propose CDTD, or Causal Domain-Invariant Temporal Dynamics, for knowledge transfer. To identify the temporally invariant and variant representations, we employ causal representation learning methods for unsupervised pre-training, and then tune the classifier with supervision in the next stage. Specifically, we assume the domain information can be well estimated and the pre-trained image decoder and transition models can be well transferred. During adaptation, we fix the transferable temporal dynamics and update the image encoder and domain estimator. The efficacy of our approach is revealed by the superior accuracy of CDTD over leading alternatives across standard few-shot action recognition datasets.
format Preprint
id arxiv_https___arxiv_org_abs_2402_12706
institution arXiv
publishDate 2024
record_format arxiv
title Learning Causal Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2402.12706