| Main Authors: | Mezentsev, Gleb, Oseledets, Ivan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.21189 |
Similar titles
SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers
by: Chekalina, Viktoriia, et al.
Published: (2024)
On the Spatial Structure of Mixture-of-Experts in Transformers
by: Bershatsky, Daniel, et al.
Published: (2025)
LoTR: Low Tensor Rank Weight Adaptation
by: Bershatsky, Daniel, et al.
Published: (2024)
Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation
by: Baumann, Joachim, et al.
Published: (2025)
Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation
by: Zhou, Mingyuan, et al.
Published: (2024)
Direct-Inverse Prompting: Analyzing LLMs' Discriminative Capacity in Self-Improving Generation
by: Ahn, Jihyun Janice, et al.
Published: (2024)
Your Transformer is Secretly Linear
by: Razzhigaev, Anton, et al.
Published: (2024)
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
by: Lai, Xin, et al.
Published: (2024)
Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs
by: Mezentsev, Gleb, et al.
Published: (2024)
RECE: Reduced Cross-Entropy Loss for Large-Catalogue Sequential Recommenders
by: Gusak, Danil, et al.
Published: (2024)
Hidden in the Haystack: Smaller Needles are More Difficult for LLMs to Find
by: Bianchi, Owen, et al.
Published: (2025)
FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models
by: Monsefi, Amin Karimi, et al.
Published: (2025)
Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents
by: Turk, Matt
Published: (2026)
The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering
by: Zhou, Yefan, et al.
Published: (2026)
Unraveling Text Generation in LLMs: A Stochastic Differential Equation Approach
by: Zhang, Yukun
Published: (2024)
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text
by: Hans, Abhimanyu, et al.
Published: (2024)
From Associations to Activations: Comparing Behavioral and Hidden-State Semantic Geometry in LLMs
by: Schiekiera, Louis, et al.
Published: (2026)
Text-ADBench: Text Anomaly Detection Benchmark based on LLMs Embedding
by: Xiao, Feng, et al.
Published: (2025)
Teaching LLMs for Step-Level Automatic Math Correction via Reinforcement Learning
by: Li, Junsong, et al.
Published: (2025)
UProp: Investigating the Uncertainty Propagation of LLMs in Multi-Step Agentic Decision-Making
by: Duan, Jinhao, et al.
Published: (2025)
Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages
by: Kunde, Vishnu Teja, et al.
Published: (2026)
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations
by: Wang, Peiyi, et al.
Published: (2023)
When Should LLMs Be Less Specific? Selective Abstraction for Reliable Long-Form Text Generation
by: Goren, Shani, et al.
Published: (2026)
A Case Study Exploring the Current Landscape of Synthetic Medical Record Generation with Commercial LLMs
by: Lin, Yihan, et al.
Published: (2025)
OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs
by: Zhang, Jintian, et al.
Published: (2024)
Comparing Specialised Small and General Large Language Models on Text Classification: 100 Labelled Samples to Achieve Break-Even Performance
by: Pecher, Branislav, et al.
Published: (2024)
Self-Evaluating LLMs for Multi-Step Tasks: Stepwise Confidence Estimation for Failure Detection
by: Mavi, Vaibhav, et al.
Published: (2025)
StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs through Knowledge-Reasoning Fusion
by: Wu, Yutong, et al.
Published: (2025)
SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?
by: Zhuang, Haomin, et al.
Published: (2024)
Can LLMs Convert Graphs to Text-Attributed Graphs?
by: Wang, Zehong, et al.
Published: (2024)
PREF: Reference-Free Evaluation of Personalised Text Generation in LLMs
by: Fu, Xiao, et al.
Published: (2025)
Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification
by: Pecher, Branislav, et al.
Published: (2026)
Exploring Design Choices for Building Language-Specific LLMs
by: Tejaswi, Atula, et al.
Published: (2024)
HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs
by: Chen, Junying, et al.
Published: (2023)
Wings: Learning Multimodal LLMs without Text-only Forgetting
by: Zhang, Yi-Kai, et al.
Published: (2024)
The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models
by: Razzhigaev, Anton, et al.
Published: (2023)
Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts
by: Sivtsov, Danil, et al.
Published: (2025)
Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers
by: Barron, Joshua, et al.
Published: (2025)
The Impact of Inference Acceleration on Bias of LLMs
by: Kirsten, Elisabeth, et al.
Published: (2024)
Multi-Turn Code Generation Through Single-Step Rewards
by: Jain, Arnav Kumar, et al.
Published: (2025)