| Main Authors: | Zhou, Pengfei, Xia, Jie, Peng, Xiaopeng, Zhao, Wangbo, Ye, Zilong, Li, Zekai, Yang, Suorong, Pan, Jiadong, Chen, Yuanxiang, Wang, Ziqiao, Wang, Kai, Zheng, Qian, Jin, Hao, Chang, Xiaojun, Pan, Gang, Dong, Shurong, Zhang, Kaipeng, You, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.05397 |
Similar Items
REPA Works Until It Doesn't: Early-Stopped, Holistic Alignment Supercharges Diffusion Training
by: Wang, Ziqiao, et al.
Published: (2025)
Wearable Music2Emotion : Assessing Emotions Induced by AI-Generated Music through Portable EEG-fNIRS Fusion
by: Zhao, Sha, et al.
Published: (2025)
On-the-Fly Data Augmentation via Gradient-Guided and Sample-Aware Influence Estimation
by: Yang, Suorong, et al.
Published: (2025)
Exposing Hallucinations To Suppress Them: VLMs Representation Editing With Generative Anchors
by: Shi, Youxu, et al.
Published: (2025)
Prioritize Alignment in Dataset Distillation
by: Li, Zekai, et al.
Published: (2024)
MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams
by: Zhou, Pengfei, et al.
Published: (2025)
Human-Inspired Computing for Robust and Efficient Audio-Visual Speech Recognition
by: Liu, Qianhui, et al.
Published: (2024)
EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification
by: Yang, Suorong, et al.
Published: (2024)
MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification
by: Xu, Zhaopan, et al.
Published: (2025)
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models
by: Zhou, Pengfei, et al.
Published: (2025)
Ensemble Debiasing Across Class and Sample Levels for Fairer Prompting Accuracy
by: Lin, Ruixi, et al.
Published: (2025)
Recurrent Diffusion for Large-Scale Parameter Generation
by: Wang, Kai, et al.
Published: (2025)
RASA: Replace Anyone, Say Anything -- A Training-Free Framework for Audio-Driven and Universal Portrait Video Editing
by: Pan, Tianrui, et al.
Published: (2025)
Towards Scalable and Consistent 3D Editing
by: Xia, Ruihao, et al.
Published: (2025)
PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models
by: Xu, Zhaopan, et al.
Published: (2025)
FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation
by: Yang, Fan, et al.
Published: (2025)
When Dynamic Data Selection Meets Data Augmentation
by: Yang, Suorong, et al.
Published: (2025)
RAPID^3: Tri-Level Reinforced Acceleration Policies for Diffusion Transformer
by: Zhao, Wangbo, et al.
Published: (2025)
Structure-Level Disentangled Diffusion for Few-Shot Chinese Font Generation
by: Li, Jie, et al.
Published: (2026)
Role-SynthCLIP: A Role Play Driven Diverse Synthetic Data Approach
by: Huangfu, Yuanxiang, et al.
Published: (2025)
Faster Vision Mamba is Rebuilt in Minutes via Merged Token Re-training
by: Shi, Mingjia, et al.
Published: (2024)
Conditional LoRA Parameter Generation
by: Jin, Xiaolong, et al.
Published: (2024)
One-shot Optimized Steering Vector for Hallucination Mitigation for VLMs
by: Shi, Youxu, et al.
Published: (2026)
$Δ$-AttnMask: Attention-Guided Masked Hidden States for Efficient Data Selection and Augmentation
by: Hu, Jucheng, et al.
Published: (2025)
Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching
by: Guo, Ziyao, et al.
Published: (2023)
Frustrated Lewis pairs constructed on Cs‐Beta for aldol condensation of methyl acetate with formaldehyde
by: Kaipeng Cao, et al.
Published: (2025)
Dynamic Diffusion Transformer
by: Zhao, Wangbo, et al.
Published: (2024)
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
by: Zhao, Wangbo, et al.
Published: (2024)
A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs
by: Zhao, Wangbo, et al.
Published: (2024)
Unsupervised Learning for Class Distribution Mismatch
by: Du, Pan, et al.
Published: (2025)
Bidirectional Photoregulated Chromism in Pyridinium Derivatives via Secondary Excitation‐Driven Electron Transfer
by: Yun‐Rui Chen, et al.
Published: (2025)
A CLIP-Powered Framework for Robust and Generalizable Data Selection
by: Yang, Suorong, et al.
Published: (2024)
Dynamic Vision Mamba
by: Wu, Mengxuan, et al.
Published: (2025)
Enhance-A-Video: Better Generated Video for Free
by: Luo, Yang, et al.
Published: (2025)
World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models
by: Ma, Ziqiao, et al.
Published: (2023)
GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains
by: Wang, Chun, et al.
Published: (2025)
Knowdit: Agentic Smart Contract Vulnerability Detection with Auditing Knowledge Summarization
by: Kong, Ziqiao, et al.
Published: (2026)
Anion Exchange Polyelectrolytes with High Branching Ratio and Adjustable Free Volume for Anion Exchange Membrane Water Electrolysis
by: Xiaoyu Zhao, et al.
Published: (2025)
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
by: Liang, Zhiyuan, et al.
Published: (2025)
Dragging with Geometry: From Pixels to Geometry-Guided Image Editing
by: Pu, Xinyu, et al.
Published: (2025)