:: Library Catalog

Bibliographic Details
Main Authors:	Jin, Can; Li, Jiakang; Wu, Rui; Zhang, Eddy; Metaxas, Dimitris N.
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2606.00424

Similar Items

Scalable Oversight for Superhuman AI via Recursive Self-Critiquing
by: Wen, Xueru, et al.
Published: (2025)

Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning
by: Jin, Can, et al.
Published: (2025)

Evidence Over Plans: Online Trajectory Verification for Skill Distillation
by: Zhou, Yang, et al.
Published: (2026)

Latent Reward Steering: An Adaptive Inference-Time Framework that Implicitly Promotes Cognitive Behaviors in Reasoning LLMs
by: Li, Jiakang, et al.
Published: (2026)

Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate
by: Jin, Can, et al.
Published: (2024)

LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation
by: Zhou, Yang, et al.
Published: (2025)

Steering LLMs via Scalable Interactive Oversight
by: Zhou, Enyu, et al.
Published: (2026)

Calibrating Conservatism for Scalable Oversight
by: Overman, William, et al.
Published: (2026)

Test-Time Spectrum-Aware Latent Steering for Zero-Shot Generalization in Vision-Language Models
by: Dafnis, Konstantinos M., et al.
Published: (2025)

A Benchmark for Scalable Oversight Protocols
by: Sudhir, Abhimanyu Pallavi, et al.
Published: (2025)

Scaling Laws For Scalable Oversight
by: Engels, Joshua, et al.
Published: (2025)

RAC: Rectified Flow Auto Coder
by: Fang, Sen, et al.
Published: (2026)

M^3-Bench: Multi-Modal, Multi-Hop, Multi-Threaded Tool-Using MLLM Agent Benchmark
by: Zhou, Yang, et al.
Published: (2025)

DTop-p MoE: Sparsity-Controlled Dynamic Top-p MoE for Foundation Model Pre-training
by: Jin, Can, et al.
Published: (2025)

DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
by: Phung, Hao, et al.
Published: (2024)

SignVerse-2M: A Two-Million-Clip Pose-Native Universe of 55+ Sign Languages
by: Fang, Sen, et al.
Published: (2026)

Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety
by: Jin, Can, et al.
Published: (2026)

APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking
by: Jin, Can, et al.
Published: (2024)

DARE: Difficulty-Adaptive Reinforcement Learning with Co-Evolved Difficulty Estimation
by: Zhou, Yang, et al.
Published: (2026)

Aligning Large Language Models with Healthcare Stakeholders: A Pathway to Trustworthy AI Integration
by: Ding, Kexin, et al.
Published: (2025)

Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation
by: Zhao, Shuai, et al.
Published: (2024)

Scalable Stewardship of an LLM-Assisted Clinical Benchmark with Physician Oversight
by: Ye, Junze, et al.
Published: (2025)

Your Reward Function for RL is Your Best PRM for Search: Unifying RL and Search-Based TTS
by: Jin, Can, et al.
Published: (2025)

Modeling Human Beliefs about AI Behavior for Scalable Oversight
by: Lang, Leon, et al.
Published: (2025)

Towards Scalable Oversight via Partitioned Human Supervision
by: Yin, Ren, et al.
Published: (2025)

The Critique of Critique
by: Sun, Shichao, et al.
Published: (2024)

SINE: SINgle Image Editing with Text-to-Image Diffusion Models
by: Zhang, Zhixing, et al.
Published: (2022)

Anatomy-VLM: A Fine-grained Vision-Language Model for Medical Interpretation
by: Gu, Difei, et al.
Published: (2025)

PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation
by: Mo, Wenyi, et al.
Published: (2025)

Learn where to Click from Yourself: On-Policy Self-Distillation for GUI Grounding
by: Zhang, Yan, et al.
Published: (2026)

Beyond Interpretability: When, Why, and How Sparse Autoencoders Enable Label-Free Visual Steering
by: Chatzoudis, Gerasimos, et al.
Published: (2025)

FindTheFlaws: Annotated Errors for Detecting Flawed Reasoning and Scalable Oversight Research
by: Recchia, Gabriel, et al.
Published: (2025)

Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors
by: Nie, Fan, et al.
Published: (2025)

Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting
by: Boateng, Emmanuel Aboah, et al.
Published: (2024)

Extreme Region Policy Distillation
by: Chen, Changyu, et al.
Published: (2026)

BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
by: Wang, Yibin, et al.
Published: (2024)

Hybrid Policy Distillation for LLMs
by: Zhu, Wenhong, et al.
Published: (2026)

Beyond Output Critique: Self-Correction via Task Distillation
by: Rahmani, Hossein A., et al.
Published: (2026)

Selective Weak-to-Strong Generalization
by: Lang, Hao, et al.
Published: (2025)

SafeSteer: Localized On-Policy Distillation for Efficient Safety Alignment
by: Li, Hao, et al.
Published: (2026)