:: Library Catalog

Bibliographic Details
Main Authors:	Ou, Zhixin, Liang, Peng, Han, Jianchen, Liu, Baihui, Qiao, Linbo
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2511.13198

Similar Items

A Survey on Memory-Efficient Transformer-Based Model Training in AI for Science
by: Tian, Kaiyuan, et al.
Published: (2025)

Alloc-MoE: Budget-Aware Expert Activation Allocation for Efficient Mixture-of-Experts Inference
by: Liu, Baihui, et al.
Published: (2026)

Budget-aware Auto Optimizer Configurator
by: Liu, Kang, et al.
Published: (2026)

Exact Dual Geometry of SOC-ICNN Value Functions
by: Liu, Kang, et al.
Published: (2026)

Dy-mer: An Explainable DNA Sequence Representation Scheme using Dictionary Learning
by: Peng, Zhiyuan, et al.
Published: (2024)

EEG-DCNet: A Fast and Accurate MI-EEG Dilated CNN Classification Method
by: Peng, Wei, et al.
Published: (2024)

DyMRL: Dynamic Multispace Representation Learning for Multimodal Event Forecasting in Knowledge Graph
by: Zhao, Feng, et al.
Published: (2026)

DyTTP: Trajectory Prediction with Normalization-Free Transformers
by: Zhu, JianLin, et al.
Published: (2025)

CoDy: Counterfactual Explainers for Dynamic Graphs
by: Qu, Zhan, et al.
Published: (2024)

Approximated Likelihood Ratio: A Forward-Only and Parallel Framework for Boosting Neural Network Training
by: Zhang, Zeliang, et al.
Published: (2024)

Clip Your Sequences Fairly: Enforcing Length Fairness for Sequence-Level RL
by: Mao, Hanyi, et al.
Published: (2025)

Highly Parallelized Reinforcement Learning Training with Relaxed Assignment Dependencies
by: He, Zhouyu, et al.
Published: (2025)

Adaptive Ensembles of Fine-Tuned Transformers for LLM-Generated Text Detection
by: Lai, Zhixin, et al.
Published: (2024)

Memory Sequence Length of Data Sampling Impacts the Adaptation of Meta-Reinforcement Learning Agents
by: Zhang, Menglong, et al.
Published: (2024)

Scheduling Parallel Optical Circuit Switches for AI Training
by: Liang, Kevin, et al.
Published: (2026)

Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count
by: Cho, Hanseul, et al.
Published: (2024)

Rethinking the Comparison Unit in Sequence-Level Reinforcement Learning: An Equal-Length Paired Training Framework from Loss Correction to Sample Construction
by: Ding, Fei, et al.
Published: (2026)

Action-Adaptive Continual Learning: Enabling Policy Generalization under Dynamic Action Spaces
by: Pan, Chaofan, et al.
Published: (2025)

GSPN-2: Efficient Parallel Sequence Modeling
by: Wang, Hongjun, et al.
Published: (2025)

On Vanishing Variance in Transformer Length Generalization
by: Li, Ruining, et al.
Published: (2025)

Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey
by: Liu, Zhihong, et al.
Published: (2024)

In Search of Lost DNA Sequence Pretraining
by: Tang, Zhijiang, et al.
Published: (2026)

EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs
by: Tang, Hanlin, et al.
Published: (2024)

Evaluating the Sensitivity of BiLSTM Forecasting Models to Sequence Length and Input Noise
by: Albelali, Salma, et al.
Published: (2025)

AGDC: Autoregressive Generation of Variable-Length Sequences with Joint Discrete and Continuous Spaces
by: Shin, Yeonsang, et al.
Published: (2026)

Transolver is a Linear Transformer: Revisiting Physics-Attention through the Lens of Linear Attention
by: Hu, Wenjie, et al.
Published: (2025)

How Particle-System Random Batch Methods Enhance Graph Transformer: Memory Efficiency and Parallel Computing Strategy
by: Liu, Hanwen, et al.
Published: (2025)

Adaptive Overclocking: Dynamic Control of Thinking Path Length via Real-Time Reasoning Signals
by: Jiang, Shuhao, et al.
Published: (2025)

DyCE: Dynamically Configurable Exiting for Deep Learning Compression and Real-time Scaling
by: Wang, Qingyuan, et al.
Published: (2024)

DySCo: Dynamic Semantic Compression for Effective Long-term Time Series Forecasting
by: Ao, Xiang, et al.
Published: (2026)

Provable Length Generalization in Sequence Prediction via Spectral Filtering
by: Marsden, Annie, et al.
Published: (2024)

GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification
by: Gan, Wangjie, et al.
Published: (2026)

DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks
by: Zhu, Kaijie, et al.
Published: (2023)

The Role of Sparsity for Length Generalization in Transformers
by: Golowich, Noah, et al.
Published: (2025)

Tequila: Trapping-free Ternary Quantization for Large Language Models
by: Huang, Hong, et al.
Published: (2025)

USP: A Unified Sequence Parallelism Approach for Long Context Generative AI
by: Fang, Jiarui, et al.
Published: (2024)

On the Limitations and Capabilities of Position Embeddings for Length Generalization
by: Chen, Yang, et al.
Published: (2025)

DAQ: Delta-Aware Quantization for Post-Training LLM Weight Compression
by: Yu, Xiaoming, et al.
Published: (2026)

Rethinking Random Transformers as Adaptive Sequence Smoothers for Sleep Staging
by: Liu, Guisong, et al.
Published: (2026)

DyEdgeGAT: Dynamic Edge via Graph Attention for Early Fault Detection in IIoT Systems
by: Zhao, Mengjie, et al.
Published: (2023)