:: Library Catalog


Bibliographic Details
Main Authors:	Kahraman, Efe; Tosato, Giulio
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2602.11040

Similar Items

There Are No Silly Questions: Evaluation of Offline LLM Capabilities from a Turkish Perspective
by: Yilmaz, Edibe, et al.
Published: (2026)

Modeling Overlapped Speech with Shuffles
by: Wiesner, Matthew, et al.
Published: (2026)

QMOS: Enhancing LLMs for Telecommunication with Question Masked loss and Option Shuffling
by: Guda, Blessed, et al.
Published: (2024)

Group and Shuffle: Efficient Structured Orthogonal Parametrization
by: Gorbunov, Mikhail, et al.
Published: (2024)

Zeroth-Order Sharpness-Aware Learning with Exponential Tilting
by: Gong, Xuchen, et al.
Published: (2025)

A Word is Worth 4-bit: Efficient Log Parsing with Binary Coded Decimal Recognition
by: Srivastava, Prerak, et al.
Published: (2025)

Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach
by: Hou, Ruikun, et al.
Published: (2025)

Learning Semantic Structure through First-Order-Logic Translation
by: Chaturvedi, Akshay, et al.
Published: (2024)

Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning
by: Zhang, Kaiyi, et al.
Published: (2024)

MMLU-Pro+: Evaluating Higher-Order Reasoning and Shortcut Learning in LLMs
by: Taghanaki, Saeid Asgari, et al.
Published: (2024)

On the Hidden Objective Biases of Group-based Reinforcement Learning
by: Fontana, Aleksandar, et al.
Published: (2026)

ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs
by: Butler, Landon, et al.
Published: (2025)

ZOQO: Zero-Order Quantized Optimization
by: Bar, Noga, et al.
Published: (2025)

BRIDO: Bringing Democratic Order to Abstractive Summarization
by: Lee, Junhyun, et al.
Published: (2025)

Paging Dr. GPT: Extracting Information from Clinical Notes to Enhance Patient Predictions
by: Anderson, David, et al.
Published: (2025)

Neighborhood-Order Learning Graph Attention Network for Fake News Detection
by: Lakzaei, Batool, et al.
Published: (2025)

SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning
by: Jia, Jinghan, et al.
Published: (2024)

On the Convergence of Zeroth-Order Federated Tuning for Large Language Models
by: Ling, Zhenqing, et al.
Published: (2024)

Attention Sinks: A 'Catch, Tag, Release' Mechanism for Embeddings
by: Zhang, Stephen, et al.
Published: (2025)

Reveal and Release: Iterative LLM Unlearning with Self-generated Data
by: Xie, Linxi, et al.
Published: (2025)

Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression
by: Fu, Deqing, et al.
Published: (2023)

ORCE: Order-Aware Alignment of Verbalized Confidence in Large Language Models
by: Li, Chen, et al.
Published: (2026)

On-Device Fine-Tuning via Backprop-Free Zeroth-Order Optimization
by: Katti, Prabodh, et al.
Published: (2025)

Looking Beyond The Top-1: Transformers Determine Top Tokens In Order
by: Lioubashevski, Daria, et al.
Published: (2024)

Order Independence With Finetuning
by: Brown, Katrina, et al.
Published: (2025)

Efficient Second-Order Neural Network Optimization via Adaptive Trust Region Methods
by: Vo, James
Published: (2024)

Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
by: Zhang, Yihua, et al.
Published: (2024)

When Domains Interact: Asymmetric and Order-Sensitive Cross-Domain Effects in Reinforcement Learning for Reasoning
by: Yang, Wang, et al.
Published: (2026)

Improving Diffusion Language Model Decoding through Joint Search in Generation Order and Token Space
by: Shen, Yangyi, et al.
Published: (2026)

Scalable Fine-tuning from Multiple Data Sources: A First-Order Approximation Approach
by: Li, Dongyue, et al.
Published: (2024)

Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models
by: Wang, Fei, et al.
Published: (2024)

GTPO: Stabilizing Group Relative Policy Optimization via Gradient and Entropy Control
by: Simoni, Marco, et al.
Published: (2025)

Forma mentis networks predict creativity ratings of short texts via interpretable artificial intelligence in human and GPT-simulated raters
by: Haim, Edith, et al.
Published: (2024)

Why Any-Order Autoregressive Models Need Two-Stream Attention: A Structural-Semantic Tradeoff
by: Pynadath, Patrick, et al.
Published: (2026)

Prior-Informed Zeroth-Order Optimization with Adaptive Direction Alignment for Memory-Efficient LLM Fine-Tuning
by: Jin, Feihu, et al.
Published: (2026)

Order-Independence Without Fine Tuning
by: McIlroy-Young, Reid, et al.
Published: (2024)

Hi-ZFO: Hierarchical Zeroth- and First-Order LLM Fine-Tuning via Importance-Guided Tensor Selection
by: Jin, Feihu, et al.
Published: (2026)

Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
by: Li, Zeman, et al.
Published: (2024)

FOCUS: First Order Concentrated Updating Scheme
by: Liu, Yizhou, et al.
Published: (2025)

SPEX: Scaling Feature Interaction Explanations for LLMs
by: Kang, Justin Singh, et al.
Published: (2025)