:: Library Catalog

Bibliographic Details
Main Authors:	Haris, Themistoklis; Zhang, Zihan; Yoshida, Yuichi
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.08287

Similar Items

Compression Barriers for Autoregressive Transformers
by: Haris, Themistoklis, et al.
Published: (2025)

Symmetry Reveals Layerwise Dynamics: How Transformers Perform In-Context Classification
by: Lutz, Patrick, et al.
Published: (2026)

$k$NN Attention Demystified: A Theoretical Exploration for Scalable Transformers
by: Haris, Themistoklis
Published: (2024)

Is Monotonic Sampling Necessary in Diffusion Models?
by: Khan, Muhammad Haris
Published: (2026)

NoiseFormer -- Noise Diffused Symmetric Attention Transformer
by: Kumar, Phani, et al.
Published: (2026)

NoiseAR: AutoRegressing Initial Noise Prior for Diffusion Models
by: Li, Zeming, et al.
Published: (2025)

From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers
by: Suresh, Praneet, et al.
Published: (2025)

Stability and Generalization in Looped Transformers
by: Labovich, Asher
Published: (2026)

SafeBench-Seq: A Homology-Clustered, CPU-Only Baseline for Protein Hazard Screening with Physicochemical/Composition Features and Cluster-Aware Confidence Intervals
by: Khan, Muhammad Haris
Published: (2025)

The Kinetics of Reasoning: How Chain-of-Thought Shapes Learning in Transformers?
by: Pengmei, Zihan, et al.
Published: (2025)

Cut Less, Fold More: Model Compression through the Lens of Projection Geometry
by: Saukh, Olga, et al.
Published: (2026)

Diffusion World Model: Future Modeling Beyond Step-by-Step Rollout for Offline Reinforcement Learning
by: Ding, Zihan, et al.
Published: (2024)

Robust Federated Learning Over the Air: Combating Heavy-Tailed Noise with Median Anchored Clipping
by: Li, Jiaxing, et al.
Published: (2024)

Policy Filtration for RLHF to Mitigate Noise in Reward Models
by: Zhang, Chuheng, et al.
Published: (2024)

Precision Tracked Transformer via Kalman Filtering, Kriging and Process Noise
by: Long, Bo, et al.
Published: (2026)

Modular Delta Merging with Orthogonal Constraints: A Scalable Framework for Continual and Reversible Model Composition
by: Khan, Haris, et al.
Published: (2025)

Exact Attention Sensitivity and the Geometry of Transformer Stability
by: Emadi, Seyed Morteza
Published: (2026)

When Is Rank-1 Steering Cheap? Geometry, Granularity, and Budgeted Search
by: Robertson, John T., et al.
Published: (2026)

Forget the Data and Fine-Tuning! Just Fold the Network to Compress
by: Wang, Dong, et al.
Published: (2025)

Self-Discovered Intention-aware Transformer for Multi-modal Vehicle Trajectory Prediction
by: Liu, Diyi, et al.
Published: (2026)

Stability of Transformers under Layer Normalization
by: Kan, Kelvin, et al.
Published: (2025)

Unlocking Emergent Modularity in Large Language Models
by: Qiu, Zihan, et al.
Published: (2023)

A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning
by: Yoshihara, Hiroshi, et al.
Published: (2025)

Handling Label Noise via Instance-Level Difficulty Modeling and Dynamic Optimization
by: Zhang, Kuan, et al.
Published: (2025)

Traj-Transformer: Diffusion Models with Transformer for GPS Trajectory Generation
by: Zhang, Zhiyang, et al.
Published: (2025)

Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLM-Powered Assistance
by: Yuan, Bo, et al.
Published: (2025)

From Static Analysis to Audience Dissemination: A Training-Free Multimodal Controversy Detection Multi-Agent Framework
by: Ding, Zihan, et al.
Published: (2026)

PrismAgent: Illuminating Harm in Memes via a Zero-Shot Interpretable Multi-Agent Framework
by: Ding, Zihan, et al.
Published: (2026)

On Some Tunable Multi-fidelity Bayesian Optimization Frameworks
by: Manoj, Arjun, et al.
Published: (2025)

Say Less, Mean More: Leveraging Pragmatics in Retrieval-Augmented Generation
by: Riaz, Haris, et al.
Published: (2025)

Federated Self-Supervised Learning for Automatic Modulation Classification under Non-IID and Class-Imbalanced Data
by: Akram, Usman, et al.
Published: (2025)

MambaCSP: Hybrid-Attention State Space Models for Hardware-Efficient Channel State Prediction
by: Djuhera, Aladin, et al.
Published: (2026)

An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration
by: Naganuma, Hiroki, et al.
Published: (2023)

A Comprehensive Review on Noise Control of Diffusion Model
by: Guo, Zhehao, et al.
Published: (2025)

CCS: Controllable and Constrained Sampling with Diffusion Models via Initial Noise Perturbation
by: Song, Bowen, et al.
Published: (2025)

Adaptive Policy Selection and Fine-Tuning under Interaction Budgets for Offline-to-Online Reinforcement Learning
by: Bozkurt, Alper Kamil, et al.
Published: (2026)

Online Learning for Multi-Layer Hierarchical Inference under Partial and Policy-Dependent Feedback
by: Zhang, Haoran, et al.
Published: (2026)

Foundation-Preserving Adaptation via Generalized Rayleigh-Quotient Optimization
by: Kim, Dongjun, et al.
Published: (2026)

Single-stream Policy Optimization
by: Xu, Zhongwen, et al.
Published: (2025)

Dynamic Influence Tracker: Measuring Time-Varying Sample Influence During Training
by: Xu, Jie, et al.
Published: (2025)