:: Library Catalog

Bibliographic Details
Main Authors:	Xiong, Jing; Liu, Gongye; Huang, Lun; Wu, Chengyue; Wu, Taiqiang; Mu, Yao; Yao, Yuan; Shen, Hui; Wan, Zhongwei; Huang, Jinfa; Tao, Chaofan; Yan, Shen; Yao, Huaxiu; Kong, Lingpeng; Yang, Hongxia; Zhang, Mi; Sapiro, Guillermo; Luo, Jiebo; Luo, Ping; Wong, Ngai
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Computation and Language
Online Access:	https://arxiv.org/abs/2411.05902

Similar Items

Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection
by: Huang, Jinfa, et al.
Published: (2024)

SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning
by: Huang, Haoyu, et al.
Published: (2026)

MMFormalizer: Multimodal Autoformalization in the Wild
by: Xiong, Jing, et al.
Published: (2026)

ParallelComp: Parallel Long-Context Compressor for Length Extrapolation
by: Xiong, Jing, et al.
Published: (2025)

Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models
by: Wu, Taiqiang, et al.
Published: (2024)

MEIT: Multimodal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation
by: Wan, Zhongwei, et al.
Published: (2024)

UNComp: Can Matrix Entropy Uncover Sparsity? -- A Compressor Design from an Uncertainty-Aware Perspective
by: Xiong, Jing, et al.
Published: (2024)

SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
by: Xu, Wendong, et al.
Published: (2025)

The Art of Efficient Reasoning: Data, Reward, and Optimization
by: Wu, Taiqiang, et al.
Published: (2026)

DoPE: Denoising Rotary Position Embedding
by: Xiong, Jing, et al.
Published: (2025)

Revisiting Model Interpolation for Efficient Reasoning
by: Wu, Taiqiang, et al.
Published: (2025)

Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation
by: Zhang, Zhuoyang, et al.
Published: (2025)

VarAD: Lightweight High-Resolution Image Anomaly Detection via Visual Autoregressive Modeling
by: Cao, Yunkang, et al.
Published: (2024)

ATTS: Asynchronous Test-Time Scaling via Conformal Prediction
by: Xiong, Jing, et al.
Published: (2025)

Re-Activating Frozen Primitives for 3D Gaussian Splatting
by: Cheng, Yuxin, et al.
Published: (2025)

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
by: Tao, Chaofan, et al.
Published: (2024)

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach
by: Zhang, Shaofeng, et al.
Published: (2024)

Timber: Training-free Instruct Model Refining with Base via Effective Rank
by: Wu, Taiqiang, et al.
Published: (2025)

Enhancing Robustness of Implicit Neural Representations Against Weight Perturbations
by: Zhou, Wenyong, et al.
Published: (2025)

Distribution-Aware Hadamard Quantization for Hardware-Efficient Implicit Neural Representations
by: Zhou, Wenyong, et al.
Published: (2025)

MINR: Efficient Implicit Neural Representations for Multi-Image Encoding
by: Zhou, Wenyong, et al.
Published: (2025)

Instances Need More Care: Rewriting Prompts for Instances with LLMs in the Loop Yields Better Zero-Shot Performance
by: Srivastava, Saurabh, et al.
Published: (2023)

Perspective-aware 3D Gaussian Inpainting with Multi-view Consistency
by: Cheng, Yuxin, et al.
Published: (2025)

LINA: Linear Autoregressive Image Generative Models with Continuous Tokens
by: Wang, Jiahao, et al.
Published: (2026)

LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction
by: Liu, Weichu, et al.
Published: (2025)

UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation
by: Li, Zixuan, et al.
Published: (2024)

Shadow-FT: Tuning Instruct Model via Training on Paired Base Model
by: Wu, Taiqiang, et al.
Published: (2025)

Identity-Preserving Text-to-Video Generation by Frequency Decomposition
by: Yuan, Shenghai, et al.
Published: (2024)

Seeing the Poem: Image-Semantic Detection of AI-Generated Modern Chinese Poetry with MLLMs
by: Wang, Shanshan, et al.
Published: (2026)

Why Does the Effective Context Length of LLMs Fall Short?
by: An, Chenxin, et al.
Published: (2024)

Weight-Inherited Distillation for Task-Agnostic BERT Compression
by: Wu, Taiqiang, et al.
Published: (2023)

QuadINR: Hardware-Efficient Implicit Neural Representations Through Quadratic Activation
by: Zhou, Wenyong, et al.
Published: (2025)

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
by: Yuan, Shenghai, et al.
Published: (2025)

LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models
by: Yang, Runming, et al.
Published: (2024)

QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension
by: Luo, Yongdong, et al.
Published: (2025)

LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference
by: Wan, Zhongwei, et al.
Published: (2024)

ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
by: Yuan, Shenghai, et al.
Published: (2024)

LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models
by: Li, Haoran, et al.
Published: (2024)

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
by: Yuan, Shenghai, et al.
Published: (2024)

Can We Trust LLMs on Memristors? Diving into Reasoning Ability under Non-Ideality
by: Wu, Taiqiang, et al.
Published: (2026)