Bibliographic Details
Main Authors: Rogers, Mitchell; Knowles, Kobe; Gendron, Gaël; Heidari, Shahrokh; Valdez, David Arturo Soriano; Azhar, Mihailo; O'Leary, Padriac; Eyre, Simon; Witbrock, Michael; Delmas, Patrice
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2406.13002
author Rogers, Mitchell
Knowles, Kobe
Gendron, Gaël
Heidari, Shahrokh
Valdez, David Arturo Soriano
Azhar, Mihailo
O'Leary, Padriac
Eyre, Simon
Witbrock, Michael
Delmas, Patrice
contents Deep learning approaches for animal re-identification have had a major impact on conservation, significantly reducing the time required for many downstream tasks, such as well-being monitoring. We propose a method called Recurrence over Video Frames (RoVF), which uses a recurrent head based on the Perceiver architecture to iteratively construct an embedding from a video clip. RoVF is trained using triplet loss based on the co-occurrence of individuals in the video frames, where the individual IDs are unavailable. We tested this method and various models based on the DINOv2 transformer architecture on a dataset of meerkats collected at the Wellington Zoo. Our method achieves a top-1 re-identification accuracy of $49\%$, which is higher than that of the best DINOv2 model ($42\%$). We found that the model can match observations of individuals where humans cannot, and our model (RoVF) performs better than the comparisons with minimal fine-tuning. In future work, we plan to improve these models by using pretext tasks, applying them to animal behaviour classification, and performing a hyperparameter search to optimise the models further.
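Note: the abstract mentions a triplet loss driven by co-occurrence rather than by individual IDs. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes (hypothetically) that each clip carries a `track` label from a tracker and a set of `frames`, that two clips from the same track show the same animal, and that a clip from a different track sharing frames with the anchor must show a different animal. The names `triplet_loss`, `mine_triplets`, `track`, and `frames` are illustrative only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def mine_triplets(clips):
    """Build (anchor, positive, negative) index triplets without individual IDs.

    Assumption (not from the record): each clip is a dict with a tracker-assigned
    'track' label and a set of 'frames'. A negative is any clip from a different
    track that shares at least one frame with the anchor, since two animals
    visible in the same frame cannot be the same individual.
    """
    triplets = []
    for i, a in enumerate(clips):
        for j, p in enumerate(clips):
            if i == j or a["track"] != p["track"]:
                continue
            for k, n in enumerate(clips):
                if n["track"] != a["track"] and a["frames"] & n["frames"]:
                    triplets.append((i, j, k))
    return triplets


# Toy data: clips 0 and 1 come from the same track; clip 2 overlaps clip 0 in time.
clips = [
    {"track": 0, "frames": {1, 2, 3}},
    {"track": 0, "frames": {10, 11}},
    {"track": 1, "frames": {2, 3, 4}},
]
triplets = mine_triplets(clips)
loss_easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])  # negative already far away
loss_hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])  # negative closer than positive
```

In this toy case only clip 0 yields a triplet, because clip 1 has no co-occurring clip from another track to serve as a negative. A real pipeline would compute the embeddings with a trained model instead of using hand-written vectors.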
format Preprint
id arxiv_https___arxiv_org_abs_2406_13002
institution arXiv
publishDate 2024
record_format arxiv
title Recurrence over Video Frames (RoVF) for the Re-identification of Meerkats
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2406.13002