:: Library Catalog

Bibliographic Details
Main Authors:	Lee, Yujeong; Shin, Sangwoo; Park, Wei-Jin; Woo, Honguk
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2411.17135

Similar Items

Semantic Skill Grounding for Embodied Instruction-Following in Cross-Domain Environments
by: Shin, Sangwoo, et al.
Published: (2024)

Efficient Policy Adaptation with Contrastive Prompt Ensemble for Embodied Agents
by: Choi, Wonje, et al.
Published: (2024)

SemTra: A Semantic Skill Translator for Cross-Domain Zero-Shot Policy Adaptation
by: Shin, Sangwoo, et al.
Published: (2024)

One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill
by: Shin, Sangwoo, et al.
Published: (2024)

Efficient Process Reward Modeling via Contrastive Mutual Information
by: Lee, Nakyung, et al.
Published: (2026)

Embodied CoT Distillation From LLM To Off-the-shelf Agents
by: Choi, Wonje, et al.
Published: (2024)

Expanding Search Space with Diverse Prompting Agents: An Efficient Sampling Approach for LLM Mathematical Reasoning
by: Lee, Gisang, et al.
Published: (2024)

T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search
by: Lee, Hyomin, et al.
Published: (2026)

Robust Policy Learning via Offline Skill Diffusion
by: Kim, Woo Kyung, et al.
Published: (2024)

Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR
by: Lee, Chanuk, et al.
Published: (2026)

SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs
by: Kim, Jaehyung, et al.
Published: (2024)

Exploratory Retrieval-Augmented Planning For Continual Embodied Instruction Following
by: Yoo, Minjong, et al.
Published: (2025)

Harnessing Consistency for Robust Test-Time LLM Ensemble
by: Zeng, Zhichen, et al.
Published: (2025)

Guide-LLM: An Embodied LLM Agent and Text-Based Topological Map for Robotic Guidance of People with Visual Impairments
by: Song, Sangmim, et al.
Published: (2024)

Offline Policy Learning via Skill-step Abstraction for Long-horizon Goal-Conditioned Tasks
by: Kim, Donghoon, et al.
Published: (2024)

World Model Implanting for Test-time Adaptation of Embodied Agents
by: Yoo, Minjong, et al.
Published: (2025)

Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments
by: Jang, Jinwoo, et al.
Published: (2026)

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
by: Zhang, Kongcheng, et al.
Published: (2025)

Adaptive Selection of LoRA Components in Privacy-Preserving Federated Learning
by: Kim, Myoungjun, et al.
Published: (2026)

PREPING: Building Agent Memory without Tasks
by: Choi, Yumin, et al.
Published: (2026)

NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks In Open Domains
by: Choi, Wonje, et al.
Published: (2025)

A Theoretical Analysis of Why Masked Diffusion Models Mitigate the Reversal Curse
by: Jeon, Moongyu, et al.
Published: (2026)

Towards Reliable Code-as-Policies: A Neuro-Symbolic Framework for Embodied Task Planning
by: Ahn, Sanghyun, et al.
Published: (2025)

Embodied LLM Agents Learn to Cooperate in Organized Teams
by: Guo, Xudong, et al.
Published: (2024)

NeSyPr: Neurosymbolic Proceduralization For Efficient Embodied Reasoning
by: Choi, Wonje, et al.
Published: (2025)

Pay What LLM Wants: Can LLM Simulate Economics Experiment with 522 Real-human Persona?
by: Choi, Junhyuk, et al.
Published: (2025)

SPIO: Ensemble and Selective Strategies via LLM-Based Multi-Agent Planning in Automated Data Science
by: Seo, Wonduk, et al.
Published: (2025)

Reward-Guided Speculative Decoding for Efficient LLM Reasoning
by: Liao, Baohao, et al.
Published: (2025)

Reward Difference Optimization For Sample Reweighting In Offline RLHF
by: Wang, Shiqi, et al.
Published: (2024)

Beyond Self-Consistency: Ensemble Reasoning Boosts Consistency and Accuracy of LLMs in Cancer Staging
by: Chang, Chia-Hsuan, et al.
Published: (2024)

Grounded in Reality: Learning and Deploying Proactive LLM from Offline Logs
by: Wei, Fei, et al.
Published: (2025)

AdaRubric: Task-Adaptive Rubrics for Reliable LLM Agent Evaluation and Reward Learning
by: Ding, Liang
Published: (2026)

Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Guided Exploration for Zero-Shot Object Navigation
by: Dorbala, Vishnu Sashank, et al.
Published: (2023)

When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling
by: Yun, Heecheol, et al.
Published: (2025)

MENTOR: A Reinforcement Learning Framework for Enabling Tool Use in Small Models via Teacher-Optimized Rewards
by: Choi, ChangSu, et al.
Published: (2025)

MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling
by: Feng, Zhaopeng, et al.
Published: (2025)

Distribution-Aware Reward: Reinforcement Learning over Predictive Distributions for LLM Regression
by: Park, Jungsoo, et al.
Published: (2026)

OptiHive: Ensemble Selection for LLM-Based Optimization via Statistical Modeling
by: Bouscary, Maxime, et al.
Published: (2025)

No Verifiable Reward for Prosody: Toward Preference-Guided Prosody Learning in TTS
by: Shin, Seungyoun, et al.
Published: (2025)

Retrospex: Language Agent Meets Offline Reinforcement Learning Critic
by: Xiang, Yufei, et al.
Published: (2025)