| Main Authors: | Li, Yixing, Xie, Ruobing, Yang, Zhen, Sun, Xingwu, Li, Shuaipeng, Han, Weidong, Kang, Zhanhui, Cheng, Yu, Xu, Chengzhong, Wang, Di, Jiang, Jie |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.24067 |
Similar Items
TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba
by: Chen, Xiuwei, et al.
Published: (2025)
Continuous Speech Tokenizer in Text To Speech
by: Li, Yixing, et al.
Published: (2024)
Scaling Laws for Floating Point Quantization Training
by: Sun, Xingwu, et al.
Published: (2025)
More Expressive Attention with Negative Weights
by: Lv, Ang, et al.
Published: (2024)
Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization
by: Fu, Yuhan, et al.
Published: (2024)
Exploring Forgetting in Large Language Model Pre-Training
by: Liao, Chonghua, et al.
Published: (2024)
HMoE: Heterogeneous Mixture of Experts for Language Modeling
by: Wang, An, et al.
Published: (2024)
Language Models "Grok" to Copy
by: Lv, Ang, et al.
Published: (2024)
The Elephant in the Room: Rethinking the Usage of Pre-trained Language Model in Sequential Recommendation
by: Qu, Zekai, et al.
Published: (2024)
Self-Distillation for Multi-Token Prediction
by: Zhao, Guoliang, et al.
Published: (2026)
The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason
by: Lv, Ang, et al.
Published: (2025)
Hybrid-Tower: Fine-grained Pseudo-query Interaction and Generation for Text-to-Video Retrieval
by: Lan, Bangxiang, et al.
Published: (2025)
The Security Threat of Compressed Projectors in Large Vision-Language Models
by: Zhang, Yudong, et al.
Published: (2025)
QAVA: Query-Agnostic Visual Attack to Large Vision-Language Models
by: Zhang, Yudong, et al.
Published: (2025)
Lossless KV Cache Compression to 2%
by: Yang, Zhen, et al.
Published: (2024)
Towards a Comprehensive Scaling Law of Mixture-of-Experts
by: Zhao, Guoliang, et al.
Published: (2025)
Enhancing Contrastive Learning Inspired by the Philosophy of "The Blind Men and the Elephant"
by: Zhang, Yudong, et al.
Published: (2024)
DHCP: Detecting Hallucinations by Cross-modal Attention Pattern in Large Vision-Language Models
by: Zhang, Yudong, et al.
Published: (2024)
Magnifier Prompt: Tackling Multimodal Hallucination via Extremely Simple Instructions
by: Fu, Yuhan, et al.
Published: (2024)
Multi-Grained Patch Training for Efficient LLM-based Recommendation
by: Liao, Jiayi, et al.
Published: (2025)
PhD: A ChatGPT-Prompted Visual hallucination Evaluation Dataset
by: Liu, Jiazhen, et al.
Published: (2024)
Autonomy-of-Experts Models
by: Lv, Ang, et al.
Published: (2025)
Fighting Fire with Fire (F3): A Training-free and Efficient Visual Adversarial Example Purification Method in LVLMs
by: Zhang, Yudong, et al.
Published: (2025)
Large Language Model Empowered Recommendation Meets All-domain Continual Pre-Training
by: Ma, Haokai, et al.
Published: (2025)
Negative Sampling in Recommendation: A Survey and Future Directions
by: Ma, Haokai, et al.
Published: (2024)
RosePO: Aligning LLM-based Recommenders with Human Values
by: Liao, Jiayi, et al.
Published: (2024)
MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation
by: Chen, Wenchao, et al.
Published: (2024)
Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning
by: Chen, Zhongzhi, et al.
Published: (2023)
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
by: Hatamizadeh, Ali, et al.
Published: (2024)
InfoMamba: An Attention-Free Hybrid Mamba-Transformer Model
by: Wang, Youjin, et al.
Published: (2026)
ReMamba: Equip Mamba with Effective Long-Sequence Modeling
by: Yuan, Danlong, et al.
Published: (2024)
Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting
by: Liang, Aobo, et al.
Published: (2024)
MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling
by: Li, Yingyue, et al.
Published: (2025)
PPC-MT: Parallel Point Cloud Completion with Mamba-Transformer Hybrid Architecture
by: Li, Jie, et al.
Published: (2026)
Jamba: A Hybrid Transformer-Mamba Language Model
by: Lieber, Opher, et al.
Published: (2024)
PIP: Detecting Adversarial Examples in Large Vision-Language Models via Attention Patterns of Irrelevant Probe Questions
by: Zhang, Yudong, et al.
Published: (2024)
FinMamba: Market-Aware Graph Enhanced Multi-Level Mamba for Stock Movement Prediction
by: Hu, Yifan, et al.
Published: (2025)
Hybrid Mamba for Few-Shot Segmentation
by: Xu, Qianxiong, et al.
Published: (2024)
Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling
by: Huang, Sili, et al.
Published: (2024)
Hi-Mamba: Hierarchical Mamba for Efficient Image Super-Resolution
by: Qiao, Junbo, et al.
Published: (2024)