| Main Authors: | Lee, Yujeong; Shin, Sangwoo; Park, Wei-Jin; Woo, Honguk |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.17135 |
Similar Items
Semantic Skill Grounding for Embodied Instruction-Following in Cross-Domain Environments
by: Shin, Sangwoo, et al.
Published: (2024)
Efficient Policy Adaptation with Contrastive Prompt Ensemble for Embodied Agents
by: Choi, Wonje, et al.
Published: (2024)
SemTra: A Semantic Skill Translator for Cross-Domain Zero-Shot Policy Adaptation
by: Shin, Sangwoo, et al.
Published: (2024)
One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill
by: Shin, Sangwoo, et al.
Published: (2024)
Efficient Process Reward Modeling via Contrastive Mutual Information
by: Lee, Nakyung, et al.
Published: (2026)
Embodied CoT Distillation From LLM To Off-the-shelf Agents
by: Choi, Wonje, et al.
Published: (2024)
Expanding Search Space with Diverse Prompting Agents: An Efficient Sampling Approach for LLM Mathematical Reasoning
by: Lee, Gisang, et al.
Published: (2024)
T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search
by: Lee, Hyomin, et al.
Published: (2026)
Robust Policy Learning via Offline Skill Diffusion
by: Kim, Woo Kyung, et al.
Published: (2024)
Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR
by: Lee, Chanuk, et al.
Published: (2026)
SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs
by: Kim, Jaehyung, et al.
Published: (2024)
Exploratory Retrieval-Augmented Planning For Continual Embodied Instruction Following
by: Yoo, Minjong, et al.
Published: (2025)
Harnessing Consistency for Robust Test-Time LLM Ensemble
by: Zeng, Zhichen, et al.
Published: (2025)
Guide-LLM: An Embodied LLM Agent and Text-Based Topological Map for Robotic Guidance of People with Visual Impairments
by: Song, Sangmim, et al.
Published: (2024)
Offline Policy Learning via Skill-step Abstraction for Long-horizon Goal-Conditioned Tasks
by: Kim, Donghoon, et al.
Published: (2024)
World Model Implanting for Test-time Adaptation of Embodied Agents
by: Yoo, Minjong, et al.
Published: (2025)
Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments
by: Jang, Jinwoo, et al.
Published: (2026)
Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
by: Zhang, Kongcheng, et al.
Published: (2025)
Adaptive Selection of LoRA Components in Privacy-Preserving Federated Learning
by: Kim, Myoungjun, et al.
Published: (2026)
PREPING: Building Agent Memory without Tasks
by: Choi, Yumin, et al.
Published: (2026)
NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks In Open Domains
by: Choi, Wonje, et al.
Published: (2025)
A Theoretical Analysis of Why Masked Diffusion Models Mitigate the Reversal Curse
by: Jeon, Moongyu, et al.
Published: (2026)
Towards Reliable Code-as-Policies: A Neuro-Symbolic Framework for Embodied Task Planning
by: Ahn, Sanghyun, et al.
Published: (2025)
Embodied LLM Agents Learn to Cooperate in Organized Teams
by: Guo, Xudong, et al.
Published: (2024)
NeSyPr: Neurosymbolic Proceduralization For Efficient Embodied Reasoning
by: Choi, Wonje, et al.
Published: (2025)
Pay What LLM Wants: Can LLM Simulate Economics Experiment with 522 Real-human Persona?
by: Choi, Junhyuk, et al.
Published: (2025)
SPIO: Ensemble and Selective Strategies via LLM-Based Multi-Agent Planning in Automated Data Science
by: Seo, Wonduk, et al.
Published: (2025)
Reward-Guided Speculative Decoding for Efficient LLM Reasoning
by: Liao, Baohao, et al.
Published: (2025)
Reward Difference Optimization For Sample Reweighting In Offline RLHF
by: Wang, Shiqi, et al.
Published: (2024)
Beyond Self-Consistency: Ensemble Reasoning Boosts Consistency and Accuracy of LLMs in Cancer Staging
by: Chang, Chia-Hsuan, et al.
Published: (2024)
Grounded in Reality: Learning and Deploying Proactive LLM from Offline Logs
by: Wei, Fei, et al.
Published: (2025)
AdaRubric: Task-Adaptive Rubrics for Reliable LLM Agent Evaluation and Reward Learning
by: Ding, Liang
Published: (2026)
Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Guided Exploration for Zero-Shot Object Navigation
by: Dorbala, Vishnu Sashank, et al.
Published: (2023)
When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling
by: Yun, Heecheol, et al.
Published: (2025)
MENTOR: A Reinforcement Learning Framework for Enabling Tool Use in Small Models via Teacher-Optimized Rewards
by: Choi, ChangSu, et al.
Published: (2025)
MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling
by: Feng, Zhaopeng, et al.
Published: (2025)
Distribution-Aware Reward: Reinforcement Learning over Predictive Distributions for LLM Regression
by: Park, Jungsoo, et al.
Published: (2026)
OptiHive: Ensemble Selection for LLM-Based Optimization via Statistical Modeling
by: Bouscary, Maxime, et al.
Published: (2025)
No Verifiable Reward for Prosody: Toward Preference-Guided Prosody Learning in TTS
by: Shin, Seungyoun, et al.
Published: (2025)
Retrospex: Language Agent Meets Offline Reinforcement Learning Critic
by: Xiang, Yufei, et al.
Published: (2025)