:: Library Catalog

Bibliographic Details
Main Authors:	Li, Yifan; Dao, Anh; Bao, Wentao; Tan, Zhen; Chen, Tianlong; Liu, Huan; Kong, Yu
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2404.05052

Similar Items

Window Token Concatenation for Efficient Visual Large Language Models
by: Li, Yifan, et al.
Published: (2025)

IndustryEQA: Pushing the Frontiers of Embodied Question Answering in Industrial Scenarios
by: Li, Yifan, et al.
Published: (2025)

Visual Large Language Models for Generalized and Specialized Applications
by: Li, Yifan, et al.
Published: (2025)

Weakly Supervised Learning for Facial Affective Behavior Analysis: A Review
by: Praveen, R. Gnana, et al.
Published: (2021)

Robust Light-Weight Facial Affective Behavior Recognition with CLIP
by: Lin, Li, et al.
Published: (2024)

Scalable Audio-Visual Masked Autoencoders for Efficient Affective Video Facial Analysis
by: Wu, Xuecheng, et al.
Published: (2025)

IndustryNav: Exploring Spatial Reasoning of Embodied Agents in Dynamic Industrial Navigation
by: Li, Yifan, et al.
Published: (2025)

EMO-LLaMA: Enhancing Facial Emotion Understanding with Instruction Tuning
by: Xing, Bohao, et al.
Published: (2024)

Open Set Face Forgery Detection via Dual-Level Evidence Collection
by: Cai, Zhongyi, et al.
Published: (2025)

Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment
by: Chen, Yuxiao, et al.
Published: (2024)

Prompting Language-Informed Distribution for Compositional Zero-Shot Learning
by: Bao, Wentao, et al.
Published: (2023)

Affective Behaviour Analysis via Progressive Learning
by: Liu, Chen, et al.
Published: (2024)

Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness
by: Zhao, Jiaxing, et al.
Published: (2025)

Solution for 8th Competition on Affective & Behavior Analysis in-the-wild
by: Yu, Jun, et al.
Published: (2025)

Task-Aware Resolution Optimization for Visual Large Language Models
by: Luo, Weiqing, et al.
Published: (2025)

FairSkin: Fair Diffusion for Skin Disease Image Generation
by: Zhang, Ruichen, et al.
Published: (2024)

MissBench: Benchmarking Multimodal Affective Analysis under Imbalanced Missing Modalities
by: Pham, Tien Anh, et al.
Published: (2026)

Affective Behavior Analysis using Task-adaptive and AU-assisted Graph Network
by: Li, Xiaodong, et al.
Published: (2024)

Affective Behaviour Analysis via Integrating Multi-Modal Knowledge
by: Zhang, Wei, et al.
Published: (2024)

The 6th Affective Behavior Analysis in-the-wild (ABAW) Competition
by: Kollias, Dimitrios, et al.
Published: (2024)

Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
by: Bao, Wentao, et al.
Published: (2024)

CausalAffect: Causal Discovery for Facial Affective Understanding
by: Hu, Guanyu, et al.
Published: (2025)

CoIDO: Efficient Data Selection for Visual Instruction Tuning via Coupled Importance-Diversity Optimization
by: Yan, Yichen, et al.
Published: (2025)

A Generative Framework for Self-Supervised Facial Representation Learning
by: He, Ruian, et al.
Published: (2023)

To See is Not to Learn: Protecting Multimodal Data from Unauthorized Fine-Tuning of Large Vision-Language Model
by: Zhao, Chengshuai, et al.
Published: (2026)

Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation
by: Dao, Quan, et al.
Published: (2024)

3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding
by: Li, Zeju, et al.
Published: (2024)

VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer
by: Chen, Liyang, et al.
Published: (2023)

A$^{3}$lign-DFER: Pioneering Comprehensive Dynamic Affective Alignment for Dynamic Facial Expression Recognition with CLIP
by: Tao, Zeng, et al.
Published: (2024)

Unified Multi-modal Diagnostic Framework with Reconstruction Pre-training and Heterogeneity-combat Tuning
by: Zhang, Yupei, et al.
Published: (2024)

Technical Approach for the EMI Challenge in the 8th Affective Behavior Analysis in-the-Wild Competition
by: Yu, Jun, et al.
Published: (2025)

CFCPalsy: Facial Image Synthesis with Cross-Fusion Cycle Diffusion Model for Facial Paralysis Individuals
by: Gao, Weixiang, et al.
Published: (2024)

BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning
by: Liu, Ruyang, et al.
Published: (2023)

Streaming Video Instruction Tuning
by: Xia, Jiaer, et al.
Published: (2025)

PPBoost: Progressive Prompt Boosting for Text-Driven Medical Image Segmentation
by: Li, Xuchen, et al.
Published: (2025)

What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning
by: Du, Yifan, et al.
Published: (2023)

Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning
by: Chaubey, Ashutosh, et al.
Published: (2025)

AICA-Bench: Holistically Examining the Capabilities of VLMs in Affective Image Content Analysis
by: She, Dong, et al.
Published: (2026)

StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model
by: Yang, Yifan, et al.
Published: (2025)

Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning
by: Gou, Yunhao, et al.
Published: (2023)