:: Library Catalog

Bibliographic Details
Main Authors:	Jung, Chaeyoung; Jang, Youngjoon; Lee, Seungwoo; Chung, Joon Son
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2601.13143

Similar Items

AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding
by: Jung, Chaeyoung, et al.
Published: (2025)

Fork-Merge Decoding: Enhancing Multimodal Understanding in Audio-Visual Large Language Models
by: Jung, Chaeyoung, et al.
Published: (2025)

Keep What Audio Cannot Say: Context-Preserving Token Pruning for Omni-LLMs
by: Jung, Chaeyoung, et al.
Published: (2026)

EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
by: Kim, Jongsuk, et al.
Published: (2024)

FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching
by: Jung, Chaeyoung, et al.
Published: (2024)

Probing Cross-modal Information Hubs in Audio-Visual LLMs
by: Jung, Jihoo, et al.
Published: (2026)

VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis
by: Jung, Jaemin, et al.
Published: (2024)

Fast-Slow Efficient Training for Multimodal Large Language Models via Visual Token Pruning
by: Zhang, Dingkun, et al.
Published: (2026)

Test-Time Augmentation for Pose-invariant Face Recognition
by: Jung, Jaemin, et al.
Published: (2025)

Two Heads Are Better Than One: Audio-Visual Speech Error Correction with Dual Hypotheses
by: Kim, Sungnyun, et al.
Published: (2025)

From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers
by: Feng, Jiu, et al.
Published: (2024)

AsymVLM: Asymmetric Token Pruning for Efficient Vision-Language Model Inference
by: Feng, Yilin, et al.
Published: (2026)

Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
by: Son, Seungwoo, et al.
Published: (2024)

InfiniteAudio: Infinite-Length Audio Generation with Consistency
by: Jung, Chaeyoung, et al.
Published: (2025)

ResPrune: Text-Conditioned Subspace Reconstruction for Visual Token Pruning in Large Vision-Language Models
by: Li, Xu, et al.
Published: (2026)

LP-CFM: Perceptual Invariance-Aware Conditional Flow Matching for Speech Modeling
by: Kwak, Doyeop, et al.
Published: (2025)

The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers
by: Son, Seungwoo, et al.
Published: (2023)

Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models
by: Jo, Dongwon, et al.
Published: (2024)

Hierarchical Attention-based Graph Neural Network with Relevance-driven Pruning
by: Kum, Seungwoo
Published: (2026)

Diffusion-Link: Diffusion Probabilistic Model for Bridging the Audio-Text Modality Gap
by: Nam, KiHyun, et al.
Published: (2025)

COPAL: Continual Pruning in Large Language Generative Models
by: Malla, Srikanth, et al.
Published: (2024)

On the Nature of Attention Sink that Shapes Decoding Strategy in Omni-LLMs
by: Yoo, Suho, et al.
Published: (2026)

Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection
by: Jo, Dongwon, et al.
Published: (2026)

MoLT: Mixture of Layer-Wise Tokens for Efficient Audio-Visual Learning
by: Rho, Kyeongha, et al.
Published: (2025)

DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models
by: Alvar, Saeed Ranjbar, et al.
Published: (2025)

Deep Understanding of Sign Language for Sign to Subtitle Alignment
by: Jang, Youngjoon, et al.
Published: (2025)

Window-Diffusion: Accelerating Diffusion Language Model Inference with Windowed Token Pruning and Caching
by: Zuo, Fengrui, et al.
Published: (2026)

Segmentwise Pruning in Audio-Language Models
by: Gibier, Marcel, et al.
Published: (2025)

FASP: Fast and Accurate Structured Pruning of Large Language Models
by: Hu, Hanyu, et al.
Published: (2025)

On the Importance of a Multi-Scale Calibration for Quantization
by: Son, Seungwoo, et al.
Published: (2026)

Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System
by: Jang, Hongsun, et al.
Published: (2024)

Fast and Effective Weight Update for Pruned Large Language Models
by: Boža, Vladimír
Published: (2024)

PagedEviction: Structured Block-wise KV Cache Pruning for Efficient Large Language Model Inference
by: Chitty-Venkata, Krishna Teja, et al.
Published: (2025)

CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models
by: Wang, Qinsi, et al.
Published: (2025)

DRIFT: Drift-Resilient Invariant-Feature Transformer for DGA Detection
by: Lee, Chaeyoung, et al.
Published: (2026)

Localizing and Editing Knowledge in Large Audio-Language Models
by: Chung, Sung Kyun, et al.
Published: (2026)

AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models
by: Baek, Changwoo, et al.
Published: (2026)

Towards Efficient Automatic Self-Pruning of Large Language Models
by: Huang, Weizhong, et al.
Published: (2025)

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
by: Fu, Qichen, et al.
Published: (2024)

Fast Inference for Augmented Large Language Models
by: Shahout, Rana, et al.
Published: (2024)