| Main Authors: | Konyushkova, Ksenia; Kaplanis, Christos; Cabi, Serkan; Denil, Misha |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.02740 |
Similar Items
$\pi2\text{vec}$: Policy Representations with Successor Features
by: Scarpellini, Gianluca, et al.
Published: (2023)
Rapid Object Annotation
by: Denil, Misha
Published: (2024)
Self-Improvement in Language Models: The Sharpening Mechanism
by: Huang, Audrey, et al.
Published: (2024)
Measuring and Controlling Instruction (In)Stability in Language Model Dialogs
by: Li, Kenneth, et al.
Published: (2024)
Latent Principle Discovery for Language Model Self-Improvement
by: Ramji, Keshav, et al.
Published: (2025)
GATS: Gather-Attend-Scatter
by: Zolna, Konrad, et al.
Published: (2024)
Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
by: Queeney, James, et al.
Published: (2022)
Theoretical Modeling of Large Language Model Self-Improvement Training Dynamics Through Solver-Verifier Gap
by: Sun, Yifan, et al.
Published: (2025)
Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement
by: Wang, Xiyao, et al.
Published: (2024)
Guarding the Meaning: Self-Supervised Training for Semantic Robustness in Guard Models
by: Pinneri, Cristina, et al.
Published: (2025)
FastVLM: Self-Speculative Decoding for Fast Vision-Language Model Inference
by: Bajpai, Divya Jyoti, et al.
Published: (2025)
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
by: Wang, Haozhe, et al.
Published: (2025)
Provable and Practical In-Context Policy Optimization for Self-Improvement
by: Yu, Tianrun, et al.
Published: (2026)
Goal Inference from Open-Ended Dialog
by: Ma, Rachel, et al.
Published: (2024)
Learning from Relevant Subgoals in Successful Dialogs using Iterative Training for Task-oriented Dialog Systems
by: Kaiser, Magdalena, et al.
Published: (2024)
Policy Improvement using Language Feedback Models
by: Zhong, Victor, et al.
Published: (2024)
Improving fine-grained understanding in image-text pre-training
by: Bica, Ioana, et al.
Published: (2024)
Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction
by: Burdisso, Sergio, et al.
Published: (2024)
Teaching by Failure: Counter-Example-Driven Curricula for Transformer Self-Improvement
by: Vejendla, Harshil
Published: (2025)
SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models
by: Choi, Hyeonbeom, et al.
Published: (2026)
Self-Trained Verification for Training- and Test-Time Self-Improvement
by: Wu, Chen Henry, et al.
Published: (2026)
Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning
by: Niu, Xuecheng, et al.
Published: (2024)
Self-Questioning Language Models
by: Chen, Lili, et al.
Published: (2025)
SIME: Enhancing Policy Self-Improvement with Modal-level Exploration
by: Jin, Yang, et al.
Published: (2025)
InterVLS: Interactive Model Understanding and Improvement with Vision-Language Surrogates
by: Huang, Jinbin, et al.
Published: (2023)
Mastering the Game of Go with Self-play Experience Replay
by: Liu, Jingbin, et al.
Published: (2026)
WIST: Web-Grounded Iterative Self-Play Tree for Domain-Targeted Reasoning Improvement
by: Li, Fangyuan, et al.
Published: (2026)
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
by: Subramaniam, Vighnesh, et al.
Published: (2025)
Self-Improvement as Coherence Optimization: A Theoretical Account
by: Qiu, Tianyi, et al.
Published: (2026)
SERPENT-VLM : Self-Refining Radiology Report Generation Using Vision Language Models
by: Kapadnis, Manav Nitin, et al.
Published: (2024)
Contextual Experience Replay for Self-Improvement of Language Agents
by: Liu, Yitao, et al.
Published: (2025)
Towards General Continuous Memory for Vision-Language Models
by: Wu, Wenyi, et al.
Published: (2025)
Latent Domain Prompt Learning for Vision-Language Models
by: Li, Zhixing, et al.
Published: (2025)
Revisiting the Learning Objectives of Vision-Language Reward Models
by: Roy, Simon, et al.
Published: (2025)
Vision-Language Model Selection and Reuse for Downstream Adaptation
by: Tan, Hao-Zhe, et al.
Published: (2025)
Towards a Zero-Data, Controllable, Adaptive Dialog System
by: Väth, Dirk, et al.
Published: (2024)
An Empirical Study on Context Length for Open-Domain Dialog Generation
by: Shen, Xinyi, et al.
Published: (2024)
Conversational Tree Search: A New Hybrid Dialog Task
by: Väth, Dirk, et al.
Published: (2023)
A survey on Concept-based Approaches For Model Improvement
by: Gupta, Avani, et al.
Published: (2024)
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
by: Rocamonde, Juan, et al.
Published: (2023)