:: Library Catalog

Bibliographic Details
Main authors:	Yao, Yihang; Cen, Zhepeng; Li, Miao; Han, William; Zhang, Yuyou; Liu, Emerson; Liu, Zuxin; Gan, Chuang; Zhao, Ding
Type:	Preprint
Published:	2025
Subjects:	Computation and Language
Online access:	https://arxiv.org/abs/2502.17800

Similar Items

Safety is Not Only About Refusal: Reasoning-Enhanced Fine-tuning for Interpretable LLM Safety
by: Zhang, Yuyou, et al.
Published: (2025)

Feasibility Consistent Representation Learning for Safe Reinforcement Learning
by: Cen, Zhepeng, et al.
Published: (2024)

Behavior Injection: Preparing Language Models for Reinforcement Learning
by: Cen, Zhepeng, et al.
Published: (2025)

Learning from Sparse Offline Datasets via Conservative Density Estimation
by: Cen, Zhepeng, et al.
Published: (2024)

Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2023)

Pushing Forward Pareto Frontiers of Proactive Agents with Behavioral Agentic Optimization
by: Yao, Yihang, et al.
Published: (2026)

CrashAgent: Crash Scenario Generation via Multi-modal Reasoning
by: Li, Miao, et al.
Published: (2025)

Signal, Image, or Symbolic: Exploring the Best Input Representation for Electrocardiogram-Language Models Through a Unified Framework
by: Han, William, et al.
Published: (2025)

OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2024)

Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels
by: Cen, Zhepeng, et al.
Published: (2025)

Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens
by: Cen, Zhepeng, et al.
Published: (2024)

Thinking-Based Non-Thinking: Solving the Reward Hacking Problem in Training Hybrid Reasoning Models via Reinforcement Learning
by: Gan, Siyuan, et al.
Published: (2026)

SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs
by: Zhang, Yuyou, et al.
Published: (2025)

Bipedalism for Quadrupedal Robots: Versatile Loco-Manipulation through Risk-Adaptive Reinforcement Learning
by: Zhang, Yuyou, et al.
Published: (2025)

QuietPaw: Learning Quadrupedal Locomotion with Versatile Noise Preference Alignment
by: Zhang, Yuyou, et al.
Published: (2025)

Exceptional Enhancement of Optical Anisotropy Achieved via the Strategy of Combining Rigid Groups with High Symmetry and π‐Conjugated Organic Groups in Hybrid Fluorides
by: Ru‐Ling Tang, et al.
Published: (2025)

ELF: A Family of Encoder-Free ECG-Language Models
by: Han, William, et al.
Published: (2026)

Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter
by: Hu, Qinghao, et al.
Published: (2025)

We May Have Come Too Far, Too Fast

Tailored Primitive Initialization is the Secret Key to Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2025)

Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers
by: Barron, Joshua, et al.
Published: (2025)

Wishful Thinking is Risky Thinking
by: Burgh, Jarrod, et al.
Published: (2023)

A Remeshing Method via Adaptive Multiple Original-Facet-Clipping and Centroidal Voronoi Tessellation
by: Fei, Yue, et al.
Published: (2025)

Steering LLM Thinking with Budget Guidance
by: Li, Junyan, et al.
Published: (2025)

Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning
by: Gan, Zeyu, et al.
Published: (2025)

Reverse Thinking Enhances Missing Information Detection in Large Language Models
by: Liu, Yuxin, et al.
Published: (2025)

Achieving binary weight and activation for LLMs using Post-Training Quantization
by: Song, Siqing, et al.
Published: (2025)

Too Consistent to Detect: A Study of Self-Consistent Errors in LLMs
by: Tan, Hexiang, et al.
Published: (2025)

Think Natively: Unlocking Multilingual Reasoning with Consistency-Enhanced Reinforcement Learning
by: Zhang, Xue, et al.
Published: (2025)

Learn to Think: Improving Multimodal Reasoning through Vision-Aware Self-Improvement Training
by: Zhong, Qihuang, et al.
Published: (2026)

Think How Your Teammates Think: Active Inference Can Benefit Decentralized Execution
by: Wu, Hao, et al.
Published: (2025)

Thinking as Compression: Your Reasoning Model is Secretly a Context Compressor
by: Ma, Guoxin, et al.
Published: (2026)

As We May Think, Information Systems Do Not.
by: Paisley, William J.
Published: (1968)

LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering
by: Qiu, Jielin, et al.
Published: (2025)

The Peril of Thinking for Others: The Russian Intelligentsia, Pro and Contra
by: Caryl Emerson
Published: (2025)

Breaking Symmetries Leads to Diverse Quadrupedal Gaits
by: Ding, Jiayu, et al.
Published: (2023)

SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent Collaboration
by: Cen, Jipeng, et al.
Published: (2024)

TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models
by: Liu, Zuxin, et al.
Published: (2023)

Censored Beliefs and Wishful Thinking
by: Burgh, Jarrod, et al.
Published: (2024)

RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text
by: Chen, Jiaben, et al.
Published: (2024)