:: Library Catalog

Bibliographic Details
Main Authors:	Ruscio, Valeria; Khedouri, Eli-Shaoul; Thompson, Keiran
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2605.16600

Similar Items

The Phenomenology of Hallucinations
by: Ruscio, Valeria, et al.
Published: (2026)

What are you sinking? A geometric approach on attention sink
by: Ruscio, Valeria, et al.
Published: (2025)

TPTT: Transforming Pretrained Transformers into Titans
by: Furfaro, Fabien
Published: (2025)

Beyond Position: the emergence of wavelet-like properties in Transformers
by: Ruscio, Valeria, et al.
Published: (2024)

Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment
by: Tice, Cameron, et al.
Published: (2026)

Report Cards: Qualitative Evaluation of Language Models Using Natural Language Summaries
by: Yang, Blair, et al.
Published: (2024)

Pooling Attention: Evaluating Pretrained Transformer Embeddings for Deception Classification
by: Mamtani, Sumit, et al.
Published: (2025)

LEAP: Layer-wise Exit-Aware Pretraining for Efficient Transformer Inference
by: Kapadia, Shashank, et al.
Published: (2026)

Reward Models Inherit Value Biases from Pretraining
by: Christian, Brian, et al.
Published: (2026)

Subjective Depth and Timescale Transformers: Learning Where and When to Compute
by: Wieser, Frederico, et al.
Published: (2025)

Where Do Reasoning Models Refuse?
by: Yamaguchi, Kureha, et al.
Published: (2025)

Pretrained Hybrids with MAD Skills
by: Roberts, Nicholas, et al.
Published: (2024)

Knowledge Circuits in Pretrained Transformers
by: Yao, Yunzhi, et al.
Published: (2024)

Memorization Dynamics of Fill-in-the-Middle Pretraining
by: von Arx, Tobias, et al.
Published: (2026)

RLP: Reinforcement as a Pretraining Objective
by: Hatamizadeh, Ali, et al.
Published: (2025)

mini-vec2vec: Scaling Universal Geometry Alignment with Linear Transformations
by: Dar, Guy
Published: (2025)

Where does output diversity collapse in post-training?
by: Karouzos, Constantinos, et al.
Published: (2026)

Fantastic Bugs and Where to Find Them in AI Benchmarks
by: Truong, Sang, et al.
Published: (2025)

Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings
by: Gopalakrishnan, Anand, et al.
Published: (2025)

Output Embedding Centering for Stable LLM Pretraining
by: Stollenwerk, Felix, et al.
Published: (2026)

Pretraining Large Language Models with NVFP4
by: NVIDIA, et al.
Published: (2025)

Patent Language Model Pretraining with ModernBERT
by: Yousefiramandi, Amirhossein, et al.
Published: (2025)

Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining
by: Lin, Licong, et al.
Published: (2023)

Where Norms and References Collide: Evaluating LLMs on Normative Reasoning
by: Abrams, Mitchell, et al.
Published: (2026)

To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraining
by: Singh, Karan, et al.
Published: (2026)

Can GRPO Help LLMs Transcend Their Pretraining Origin?
by: Ni, Kangqi, et al.
Published: (2025)

In-context Pretraining: Language Modeling Beyond Document Boundaries
by: Shi, Weijia, et al.
Published: (2023)

Revisiting Multilingual Data Mixtures in Language Model Pretraining
by: Foroutan, Negar, et al.
Published: (2025)

Emergent Communication Pretraining for Few-Shot Machine Translation
by: Li, Yaoyiran, et al.
Published: (2020)

Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
by: Bayazit, Deniz, et al.
Published: (2023)

Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases
by: Levi, Elad, et al.
Published: (2024)

Beyond URLs: Metadata Diversity and Position for Efficient LLM Pretraining
by: Fan, Dongyang, et al.
Published: (2025)

Does Differential Privacy Impact Bias in Pretrained NLP Models?
by: Islam, Md. Khairul, et al.
Published: (2024)

Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
by: McLeish, Sean, et al.
Published: (2025)

Pretraining with hierarchical memories: separating long-tail and common knowledge
by: Pouransari, Hadi, et al.
Published: (2025)

Many-to-English Machine Translation Tools, Data, and Pretrained Models
by: Gowda, Thamme, et al.
Published: (2021)

Where Does Toxicity Live? Mechanistic Localization and Targeted Suppression in Language Models
by: Beniwal, Himanshu, et al.
Published: (2026)

Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process
by: Zhang, Zhenyu, et al.
Published: (2025)

Generating Pretraining Tokens from Organic Data for Data-Bound Scaling
by: Yu, Zichun, et al.
Published: (2026)

Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization
by: Kawakami, Wataru, et al.
Published: (2025)