:: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Mingkuan; Hu, Wentao; Wang, Jiayin; Lai, Xin; Huang, Tianchen; Min, Yuheng; Yan, Rui; Zhu, Xiaoyan
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; 68T50 (Primary); I.2.7
Online Access:	https://arxiv.org/abs/2511.09596

Similar Items

Fast Quiet-STaR: Thinking Without Thought Tokens
by: Huang, Wei, et al.
Published: (2025)

D-SMART: Enhancing LLM Dialogue Consistency via Dynamic Structured Memory And Reasoning Tree
by: Lei, Xiang, et al.
Published: (2025)

Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning
by: Seo, Yeongbin, et al.
Published: (2024)

WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference
by: Liu, Aiwei, et al.
Published: (2025)

When Does Content-Based Routing Work? Representation Requirements for Selective Attention in Hybrid Sequence Models
by: Basu, Abhinaba
Published: (2026)

From Brazilian Portuguese to European Portuguese
by: Sanches, João, et al.
Published: (2024)

Towards Effective and Efficient Continual Pre-training of Large Language Models
by: Chen, Jie, et al.
Published: (2024)

Prompt Engineering and the Effectiveness of Large Language Models in Enhancing Human Productivity
by: Anam, Rizal Khoirul
Published: (2025)

Transactional Attention: Semantic Sponsorship for KV-Cache Retention
by: Basu, Abhinaba
Published: (2026)

How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models
by: Borobia, Hector, et al.
Published: (2026)

CLMN: Concept based Language Models via Neural Symbolic Reasoning
by: Yang, Yibo
Published: (2025)

Fact Grounded Attention: Eliminating Hallucination in Large Language Models Through Attention Level Knowledge Integration
by: Gupta, Aayush
Published: (2025)

Softmax Linear Attention: Reclaiming Global Competition
by: Xu, Mingwei, et al.
Published: (2026)

Co-NAML-LSTUR: A Combined Model with Attentive Multi-View Learning and Long- and Short-term User Representations for News Recommendation
by: Nguyen, Minh Hoang, et al.
Published: (2025)

Mitigating LLM Hallucinations through Domain-Grounded Tiered Retrieval
by: Haque, Md. Asraful, et al.
Published: (2026)

Multipole Semantic Attention: A Fast Approximation of Softmax Attention for Pretraining
by: Mitchell, Rupert, et al.
Published: (2025)

Align-to-Distill: Trainable Attention Alignment for Knowledge Distillation in Neural Machine Translation
by: Jin, Heegon, et al.
Published: (2024)

Investigating Neuron Ablation in Attention Heads: The Case for Peak Activation Centering
by: Pochinkov, Nicholas, et al.
Published: (2024)

Enhancing OCR for Sino-Vietnamese Language Processing via Fine-tuned PaddleOCRv5
by: Nguyen, Minh Hoang, et al.
Published: (2025)

Towards Probabilistic Question Answering Over Tabular Data
by: Shen, Chen, et al.
Published: (2025)

RHealthTwin: Towards Responsible and Multimodal Digital Twins for Personalized Well-being
by: Ferdousi, Rahatara, et al.
Published: (2025)

Advancing Explainability in Neural Machine Translation: Analytical Metrics for Attention and Alignment Consistency
by: Mishra, Anurag
Published: (2024)

$δ$-STEAL: LLM Stealing Attack with Local Differential Privacy
by: Dang, Kieu, et al.
Published: (2025)

Multi-Model Synthetic Training for Mission-Critical Small Language Models
by: Platt, Nolan, et al.
Published: (2025)

PaperAudit-Bench: Benchmarking Error Detection in Research Papers for Critical Automated Peer Review
by: Tu, Songjun, et al.
Published: (2026)

PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning
by: Qiu, Xiaoqi, et al.
Published: (2024)

Bi-Attention HateXplain: Taking into account the sequential aspect of data during explainability in a multi-task context
by: Mondjo, Ghislain Dorian Tchuente
Published: (2026)

How much do LLMs learn from negative examples?
by: Hamdan, Shadi, et al.
Published: (2025)

Communicative Agents for Slideshow Storytelling Video Generation based on LLMs
by: Fan, Jingxing, et al.
Published: (2025)

An Epidemiological Knowledge Graph extracted from the World Health Organization's Disease Outbreak News
by: Consoli, Sergio, et al.
Published: (2025)

MedMemoryBench: Benchmarking Agent Memory in Personalized Healthcare
by: Wang, Yihao, et al.
Published: (2026)

From Noise to Diversity: Random Embedding Injection in LLM Reasoning
by: Kim, Heejun, et al.
Published: (2026)

NOTAI.AI: Explainable Detection of Machine-Generated Text via Curvature and Feature Attribution
by: Breneur, Oleksandr Marchenko, et al.
Published: (2026)

Rethinking the Multilingual Reasoning Gap with Layer Swap
by: Lasbordes, Maxence, et al.
Published: (2026)

Cache-to-Cache: Direct Semantic Communication Between Large Language Models
by: Fu, Tianyu, et al.
Published: (2025)

An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT
by: Ma, Chong, et al.
Published: (2023)

The Knesset Corpus: An Annotated Corpus of Hebrew Parliamentary Proceedings
by: Goldin, Gili, et al.
Published: (2024)

Math Natural Language Inference: this should be easy!
by: de Paiva, Valeria, et al.
Published: (2025)

New Skills or Sharper Primitives? A Probabilistic Perspective on the Emergence of Reasoning in RLVR
by: Wang, Zhilin, et al.
Published: (2026)

Pitfalls in Evaluating Interpretability Agents
by: Haklay, Tal, et al.
Published: (2026)