:: Library Catalog

Bibliographic Details
Main Authors:	Mezentsev, Gleb; Oseledets, Ivan
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2505.21189

Similar Titles

SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers
by: Chekalina, Viktoriia, et al.
Published: (2024)

On the Spatial Structure of Mixture-of-Experts in Transformers
by: Bershatsky, Daniel, et al.
Published: (2025)

LoTR: Low Tensor Rank Weight Adaptation
by: Bershatsky, Daniel, et al.
Published: (2024)

Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation
by: Baumann, Joachim, et al.
Published: (2025)

Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation
by: Zhou, Mingyuan, et al.
Published: (2024)

Direct-Inverse Prompting: Analyzing LLMs' Discriminative Capacity in Self-Improving Generation
by: Ahn, Jihyun Janice, et al.
Published: (2024)

Your Transformer is Secretly Linear
by: Razzhigaev, Anton, et al.
Published: (2024)

Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
by: Lai, Xin, et al.
Published: (2024)

Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs
by: Mezentsev, Gleb, et al.
Published: (2024)

RECE: Reduced Cross-Entropy Loss for Large-Catalogue Sequential Recommenders
by: Gusak, Danil, et al.
Published: (2024)

Hidden in the Haystack: Smaller Needles are More Difficult for LLMs to Find
by: Bianchi, Owen, et al.
Published: (2025)

FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models
by: Monsefi, Amin Karimi, et al.
Published: (2025)

Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents
by: Turk, Matt
Published: (2026)

The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering
by: Zhou, Yefan, et al.
Published: (2026)

Unraveling Text Generation in LLMs: A Stochastic Differential Equation Approach
by: Zhang, Yukun
Published: (2024)

Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text
by: Hans, Abhimanyu, et al.
Published: (2024)

From Associations to Activations: Comparing Behavioral and Hidden-State Semantic Geometry in LLMs
by: Schiekiera, Louis, et al.
Published: (2026)

Text-ADBench: Text Anomaly Detection Benchmark based on LLMs Embedding
by: Xiao, Feng, et al.
Published: (2025)

Teaching LLMs for Step-Level Automatic Math Correction via Reinforcement Learning
by: Li, Junsong, et al.
Published: (2025)

UProp: Investigating the Uncertainty Propagation of LLMs in Multi-Step Agentic Decision-Making
by: Duan, Jinhao, et al.
Published: (2025)

Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages
by: Kunde, Vishnu Teja, et al.
Published: (2026)

Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations
by: Wang, Peiyi, et al.
Published: (2023)

When Should LLMs Be Less Specific? Selective Abstraction for Reliable Long-Form Text Generation
by: Goren, Shani, et al.
Published: (2026)

A Case Study Exploring the Current Landscape of Synthetic Medical Record Generation with Commercial LLMs
by: Lin, Yihan, et al.
Published: (2025)

OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs
by: Zhang, Jintian, et al.
Published: (2024)

Comparing Specialised Small and General Large Language Models on Text Classification: 100 Labelled Samples to Achieve Break-Even Performance
by: Pecher, Branislav, et al.
Published: (2024)

Self-Evaluating LLMs for Multi-Step Tasks: Stepwise Confidence Estimation for Failure Detection
by: Mavi, Vaibhav, et al.
Published: (2025)

StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs through Knowledge-Reasoning Fusion
by: Wu, Yutong, et al.
Published: (2025)

SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?
by: Zhuang, Haomin, et al.
Published: (2024)

Can LLMs Convert Graphs to Text-Attributed Graphs?
by: Wang, Zehong, et al.
Published: (2024)

PREF: Reference-Free Evaluation of Personalised Text Generation in LLMs
by: Fu, Xiao, et al.
Published: (2025)

Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification
by: Pecher, Branislav, et al.
Published: (2026)

Exploring Design Choices for Building Language-Specific LLMs
by: Tejaswi, Atula, et al.
Published: (2024)

HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs
by: Chen, Junying, et al.
Published: (2023)

Wings: Learning Multimodal LLMs without Text-only Forgetting
by: Zhang, Yi-Kai, et al.
Published: (2024)

The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models
by: Razzhigaev, Anton, et al.
Published: (2023)

Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts
by: Sivtsov, Danil, et al.
Published: (2025)

Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers
by: Barron, Joshua, et al.
Published: (2025)

The Impact of Inference Acceleration on Bias of LLMs
by: Kirsten, Elisabeth, et al.
Published: (2024)

Multi-Turn Code Generation Through Single-Step Rewards
by: Jain, Arnav Kumar, et al.
Published: (2025)