| Main authors: | Hong, Wanyang; Zhang, Zhaoning; Chen, Yi; Zhang, Libo; Liu, Baihui; Qiao, Linbo; Tian, Zhiliang; Li, Dongsheng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2512.06869 |
Similar items
Alloc-MoE: Budget-Aware Expert Activation Allocation for Efficient Mixture-of-Experts Inference
by: Liu, Baihui, et al.
Published: (2026)
Dovetail: A CPU/GPU Heterogeneous Speculative Decoding for LLM inference
by: Zhang, Libo, et al.
Published: (2024)
Sparrow: Text-Anchored Window Attention with Visual-Semantic Glimpsing for Speculative Decoding in Video LLMs
by: Zhang, Libo, et al.
Published: (2026)
GRASS: Gradient-based Adaptive Layer-wise Importance Sampling for Memory-efficient Large Language Model Fine-tuning
by: Tian, Kaiyuan, et al.
Published: (2026)
TFDMNet: A Novel Network Structure Combines the Time Domain and Frequency Domain Features
by: Pan, Hengyue, et al.
Published: (2024)
CTTA-T: Continual Test-Time Adaptation for Text Understanding via Teacher-Student with a Domain-aware and Generalized Teacher
by: Liu, Tianlun, et al.
Published: (2025)
Let the Agent Search: Autonomous Exploration Beats Rigid Workflows in Temporal Question Answering
by: Lv, Xufei, et al.
Published: (2026)
POMP: Probability-driven Meta-graph Prompter for LLMs in Low-resource Unsupervised Neural Machine Translation
by: Pan, Shilong, et al.
Published: (2024)
Two-stage Generative Question Answering on Temporal Knowledge Graph Using Large Language Models
by: Gao, Yifu, et al.
Published: (2024)
A Survey on Memory-Efficient Transformer-Based Model Training in AI for Science
by: Tian, Kaiyuan, et al.
Published: (2025)
LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification
by: Song, Yiping, et al.
Published: (2024)
Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference
by: Qiu, Quantong, et al.
Published: (2026)
Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs with Semantic Space
by: Chen, Zhiliang, et al.
Published: (2025)
Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game
by: Xu, Qianqiao, et al.
Published: (2024)
Punctuation-aware Hybrid Trainable Sparse Attention for Large Language Models
by: Qiu, Junxiang, et al.
Published: (2026)
Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events
by: Kargupta, Priyanka, et al.
Published: (2024)
ARQUSUMM: Argument-aware Quantitative Summarization of Online Conversations
by: Tang, An Quang, et al.
Published: (2025)
From Biased Chatbots to Biased Agents: Examining Role Assignment Effects on LLM Agent Robustness
by: Cao, Linbo, et al.
Published: (2026)
Attention Consistency for LLMs Explanation
by: Lan, Tian, et al.
Published: (2025)
DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression
by: Zhao, Yi, et al.
Published: (2025)
Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering
by: Wen, Zhihua, et al.
Published: (2024)
Translationese-index: Using Likelihood Ratios for Graded and Generalizable Measurement of Translationese
by: Liu, Yikang, et al.
Published: (2025)
SYNAPSE: Empowering LLM Agents with Episodic-Semantic Memory via Spreading Activation
by: Jiang, Hanqi, et al.
Published: (2026)
Where Matters More Than What: Decoding-aligned KV Cache Compression via Position-aware Pseudo Queries
by: Tian, Zhenxu, et al.
Published: (2026)
Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective
by: Zhong, Meizhi, et al.
Published: (2024)
Enhancing LLMs for Impression Generation in Radiology Reports through a Multi-Agent System
by: Zeng, Fang, et al.
Published: (2024)
Scaling In-Context Online Learning Capability of LLMs via Cross-Episode Meta-RL
by: Lin, Xiaofeng, et al.
Published: (2026)
Attention Editing: A Versatile Framework for Cross-Architecture Attention Conversion
by: Cheng, Zhen, et al.
Published: (2026)
RAIDEN-R1: Improving Role-awareness of LLMs via GRPO with Verifiable Reward
by: Wang, Zongsheng, et al.
Published: (2025)
Zero-resource Hallucination Detection for Text Generation via Graph-based Contextual Knowledge Triples Modeling
by: Fang, Xinyue, et al.
Published: (2024)
Latent-space adversarial training with post-aware calibration for defending large language models against jailbreak attacks
by: Yi, Xin, et al.
Published: (2025)
FlowKV: Enhancing Multi-Turn Conversational Coherence in LLMs via Isolated Key-Value Cache Management
by: Liu, Xiang, et al.
Published: (2025)
RoleRMBench & RoleRM: Towards Reward Modeling for Profile-Based Role Play in Dialogue Systems
by: Ding, Hang, et al.
Published: (2025)
Personality-aware Student Simulation for Conversational Intelligent Tutoring Systems
by: Liu, Zhengyuan, et al.
Published: (2024)
SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation
by: Guo, Shasha, et al.
Published: (2024)
Role-Play Paradox in Large Language Models: Reasoning Performance Gains and Ethical Dilemmas
by: Zhao, Jinman, et al.
Published: (2024)
CRPO: Character-centric Group Relative Policy Optimization for Role-aware Reasoning in Role-playing Agents
by: Tang, Yihong, et al.
Published: (2026)
Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language Models
by: Li, Bin, et al.
Published: (2025)
The Imperative of Conversation Analysis in the Era of LLMs: A Survey of Tasks, Techniques, and Trends
by: Zhang, Xinghua, et al.
Published: (2024)
LLMs Can Also Do Well! Breaking Barriers in Semantic Role Labeling via Large Language Models
by: Li, Xinxin, et al.
Published: (2025)