| Main Authors: | Wu, Yue; Fan, Yewen; Liang, Paul Pu; Azaria, Amos; Li, Yuanzhi; Mitchell, Tom M. |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2302.04449 |
Similar Items
The Self-Execution Benchmark: Measuring LLMs' Attempts to Overcome Their Lack of Self-Execution
by: Ezra, Elon, et al.
Published: (2025)
TALL -- A Trainable Architecture for Enhancing LLM Performance in Low-Resource Languages
by: Ofer, Moshe, et al.
Published: (2025)
TextAtari: 100K Frames Game Playing with Language Agents
by: Li, Wenhao, et al.
Published: (2025)
SmartPlay: A Benchmark for LLMs as Intelligent Agents
by: Wu, Yue, et al.
Published: (2023)
Generating Pragmatic Examples to Train Neural Program Synthesizers
by: Vaduguru, Saujas, et al.
Published: (2023)
Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System
by: Schmucker, Robin, et al.
Published: (2024)
Amortizing Pragmatic Program Synthesis with Rankings
by: Pu, Yewen, et al.
Published: (2024)
AutoManual: Constructing Instruction Manuals by LLM Agents via Interactive Environmental Learning
by: Chen, Minghao, et al.
Published: (2024)
Self-Evolved Reward Learning for LLMs
by: Huang, Chenghua, et al.
Published: (2024)
AgentKit: Structured LLM Reasoning with Dynamic Graphs
by: Wu, Yue, et al.
Published: (2024)
Learning To Play Atari Games Using Dueling Q-Learning and Hebbian Plasticity
by: Salehin, Md Ashfaq
Published: (2024)
HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning
by: Delfosse, Quentin, et al.
Published: (2024)
Hypothesis Search: Inductive Reasoning with Language Models
by: Wang, Ruocheng, et al.
Published: (2023)
InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct
by: Wu, Yutong, et al.
Published: (2024)
Debate Helps Weak Judges Reward Stronger Models
by: Elasky, Ethan, et al.
Published: (2026)
Amortizing Pragmatic Program Synthesis with Rankings
by: Pu, Yewen, et al.
Published: (2023)
Physics of Language Models: Part 1, Learning Hierarchical Language Structures
by: Allen-Zhu, Zeyuan, et al.
Published: (2023)
LLM-KT: Aligning Large Language Models with Knowledge Tracing using a Plug-and-Play Instruction
by: Wang, Ziwei, et al.
Published: (2025)
Does Calibration Affect Human Actions?
by: Nizri, Meir, et al.
Published: (2025)
One Token Away from Collapse: The Fragility of Instruction-Tuned Helpfulness
by: Potraghloo, Erfan Baghaei, et al.
Published: (2026)
mrCAD: Multimodal Refinement of Computer-aided Designs
by: McCarthy, William P., et al.
Published: (2025)
Playing Atari Space Invaders with Sparse Cosine Optimized Policy Evolution
by: O'Connor, Jim, et al.
Published: (2025)
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing
by: Wu, Keming, et al.
Published: (2025)
Diffusion for World Modeling: Visual Details Matter in Atari
by: Alonso, Eloi, et al.
Published: (2024)
AdaRubric: Task-Adaptive Rubrics for Reliable LLM Agent Evaluation and Reward Learning
by: Ding, Liang
Published: (2026)
Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction
by: Jin, Yonggang, et al.
Published: (2024)
Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model
by: Wu, Tianyi, et al.
Published: (2026)
Exploring Reasoning Reward Model for Agents
by: Fan, Kaixuan, et al.
Published: (2026)
Self-Rewarding Rubric-Based Reinforcement Learning for Open-Ended Reasoning
by: Ye, Zhiling, et al.
Published: (2025)
Rubrics to Tokens: Bridging Response-level Rubrics and Token-level Rewards in Instruction Following Tasks
by: Xu, Tianze, et al.
Published: (2026)
Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback
by: Zheng, Qinqing, et al.
Published: (2024)
A Vision for Multisensory Intelligence: Sensing, Science, and Synergy
by: Liang, Paul Pu
Published: (2026)
HonestLLM: Toward an Honest and Helpful Large Language Model
by: Gao, Chujie, et al.
Published: (2024)
Instructions are all you need: Self-supervised Reinforcement Learning for Instruction Following
by: Ren, Qingyu, et al.
Published: (2025)
Beyond IID: Optimizing Instruction Learning from the Perspective of Instruction Interaction and Dependency
by: Zhao, Hanyu, et al.
Published: (2024)
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
by: Allen-Zhu, Zeyuan, et al.
Published: (2023)
Physics of Language Models: Part 3.2, Knowledge Manipulation
by: Allen-Zhu, Zeyuan, et al.
Published: (2023)
Foundations of Multisensory Artificial Intelligence
by: Liang, Paul Pu
Published: (2024)
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws
by: Allen-Zhu, Zeyuan, et al.
Published: (2024)
Self-Play Preference Optimization for Language Model Alignment
by: Wu, Yue, et al.
Published: (2024)