Staff View: Aligning Actions and Walking to LLM-Generated Textual Descriptions :: Library Catalog

Bibliographic Details
Main Authors:	Chivereanu, Radu; Cosma, Adrian; Catruna, Andy; Rughinis, Razvan; Radoi, Emilian
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2404.12192

_version_	1866911845052317696
author	Chivereanu, Radu; Cosma, Adrian; Catruna, Andy; Rughinis, Razvan; Radoi, Emilian
author_facet	Chivereanu, Radu; Cosma, Adrian; Catruna, Andy; Rughinis, Razvan; Radoi, Emilian
contents	Large Language Models (LLMs) have demonstrated remarkable capabilities in various domains, including data augmentation and synthetic data generation. This work explores the use of LLMs to generate rich textual descriptions for motion sequences, encompassing both actions and walking patterns. We leverage the expressive power of LLMs to align motion representations with high-level linguistic cues, addressing two distinct tasks: action recognition and retrieval of walking sequences based on appearance attributes. For action recognition, we employ LLMs to generate textual descriptions of actions in the BABEL-60 dataset, facilitating the alignment of motion sequences with linguistic representations. In the domain of gait analysis, we investigate the impact of appearance attributes on walking patterns by generating textual descriptions of motion sequences from the DenseGait dataset using LLMs. These descriptions capture subtle variations in walking styles influenced by factors such as clothing choices and footwear. Our approach demonstrates the potential of LLMs in augmenting structured motion attributes and aligning multi-modal representations. The findings contribute to the advancement of comprehensive motion understanding and open up new avenues for leveraging LLMs in multi-modal alignment and data augmentation for motion analysis. We make the code publicly available at https://github.com/Radu1999/WalkAndText
format	Preprint
id	arxiv_https___arxiv_org_abs_2404_12192
institution	arXiv
publishDate	2024
record_format	arxiv
title	Aligning Actions and Walking to LLM-Generated Textual Descriptions
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2404.12192

Similar Items