| Main Authors: | Gu, Zihan, Chen, Ruoyu, Zhang, Han, Zhang, Hua, Hu, Yue |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.13027 |
Similar Items
Beyond Progress Measures: Theoretical Insights into the Mechanism of Grokking
by: Gu, Zihan, et al.
Published: (2025)
ELODI: Ensemble Logit Difference Inhibition for Positive-Congruent Training
by: Zhao, Yue, et al.
Published: (2022)
Learning in Position-Aware Multinomial Logit Bandits: From Multiplicative to General Position Effects
by: Chen, Xi, et al.
Published: (2026)
Rank-Aware Spectral Bounds on Attention Logits for Stable Low-Precision Training
by: Emadi, Seyed Morteza
Published: (2026)
Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation
by: Li, Long-Fei, et al.
Published: (2024)
Consistency Regularization for Domain Generalization with Logit Attribution Matching
by: Gao, Han, et al.
Published: (2023)
Logits-Based Finetuning
by: Li, Jingyao, et al.
Published: (2025)
Positional LSH: Binary Block Matrix Approximation for Attention with Linear Biases
by: Wolfson, Daniel, et al.
Published: (2026)
Energy-Gated Attention and Wavelet Positional Encoding: Complementary Inductive Biases for Transformer Attention
by: Zeris, Athanasios
Published: (2026)
Deconstructing Attention: Investigating Design Principles for Effective Language Modeling
by: Xue, Huiyin, et al.
Published: (2025)
Enhancing Certified Robustness via Block Reflector Orthogonal Layers and Logit Annealing Loss
by: Lai, Bo-Han, et al.
Published: (2025)
Feature Learning Dynamics in Infinite-Depth Neural Networks
by: Yao, Zihan, et al.
Published: (2025)
Contextual Multinomial Logit Bandits with General Value Functions
by: Zhang, Mengxiao, et al.
Published: (2024)
From Static Analysis to Audience Dissemination: A Training-Free Multimodal Controversy Detection Multi-Agent Framework
by: Ding, Zihan, et al.
Published: (2026)
Scalable Spatiotemporal Inference with Biased Scan Attention Transformer Neural Processes
by: Jenson, Daniel, et al.
Published: (2025)
Training Unbiased Diffusion Models From Biased Dataset
by: Kim, Yeongmin, et al.
Published: (2024)
From Logits to Latents: Contrastive Representation Shaping for LLM Unlearning
by: Tang, Haoran, et al.
Published: (2026)
Spectral Logit Sculpting: Adaptive Low-Rank Logit Transformation for Controlled Text Generation
by: Li, Jin, et al.
Published: (2025)
DALD: Improving Logits-based Detector without Logits from Black-box LLMs
by: Zeng, Cong, et al.
Published: (2024)
Where Not to Learn: Prior-Aligned Training with Subset-based Attribution Constraints for Reliable Decision-Making
by: Chen, Ruoyu, et al.
Published: (2026)
Logits Replay + MoClip: Stabilized, Low-Cost Post-Training with Minimal Forgetting
by: Qiu, Suming, et al.
Published: (2025)
Unbiased Scene Graph Generation from Biased Training
by: Tang, Kaihua, et al.
Published: (2020)
Less is More: Fewer Interpretable Region via Submodular Subset Selection
by: Chen, Ruoyu, et al.
Published: (2024)
From Projection to Prediction: Beyond Logits for Scalable Language Models
by: Dong, Jianbing, et al.
Published: (2025)
Learning to Scale Logits for Temperature-Conditional GFlowNets
by: Kim, Minsu, et al.
Published: (2023)
The Implicit Bias of Logit Regularization
by: Beck, Alon, et al.
Published: (2026)
Meta-Learning Neural Procedural Biases
by: Raymond, Christian, et al.
Published: (2024)
Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction
by: Tang, Luyao, et al.
Published: (2025)
Exploiting LLMs for Automatic Hypothesis Assessment via a Logit-Based Calibrated Prior
by: Gong, Yue, et al.
Published: (2025)
Customizing the Inductive Biases of Softmax Attention using Structured Matrices
by: Kuang, Yilun, et al.
Published: (2025)
Replacing Paths with Connection-Biased Attention for Knowledge Graph Completion
by: Dutta, Sharmishtha, et al.
Published: (2024)
Biased Stochastic First-Order Methods for Conditional Stochastic Optimization and Applications in Meta Learning
by: Hu, Yifan, et al.
Published: (2020)
Tighter Regret Bounds for Contextual Action-Set Reinforcement Learning
by: Chen, Zijun, et al.
Published: (2026)
Transformers Are Born Biased: Structural Inductive Biases at Random Initialization and Their Practical Consequences
by: Li, Siquan, et al.
Published: (2026)
Logits Poisoning Attack in Federated Distillation
by: Tang, Yuhan, et al.
Published: (2024)
Biased Dueling Bandits with Stochastic Delayed Feedback
by: Yi, Bongsoo, et al.
Published: (2024)
Peak-Controlled Logits Poisoning Attack in Federated Distillation
by: Tang, Yuhan, et al.
Published: (2024)
Recency Biased Causal Attention for Time-series Forecasting
by: Hegazy, Kareem, et al.
Published: (2025)
Deconstructing Generative Diversity: An Information Bottleneck Analysis of Discrete Latent Generative Models
by: Wu, Yudi, et al.
Published: (2025)
Deconstructing Pre-training: Knowledge Attribution Analysis in MoE and Dense Models
by: Wang, Bo, et al.
Published: (2026)