| Main Authors: | Pecháč, Matej, Chovanec, Michal, Farkaš, Igor |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2302.11563 |
Similar Items
Robot at the Mirror: Learning to Imitate via Associating Self-supervised Models
by: Lucny, Andrej, et al.
Published: (2023)
The impact of intrinsic rewards on exploration in Reinforcement Learning
by: Kayal, Aya, et al.
Published: (2025)
Appearance-based gaze estimation enhanced with synthetic images using deep neural networks
by: Herashchenko, Dmytro, et al.
Published: (2023)
Safe Reinforcement Learning in a Simulated Robotic Arm
by: Kovač, Luka, et al.
Published: (2023)
Autonomous state-space segmentation for Deep-RL sparse reward scenarios
by: Maselli, Gianluca, et al.
Published: (2025)
Self-rewarding correction for mathematical reasoning
by: Xiong, Wei, et al.
Published: (2025)
Diverse Feature Learning by Self-distillation and Reset
by: Park, Sejik
Published: (2024)
Self-supervised Pre-training of Text Recognizers
by: Kišš, Martin, et al.
Published: (2024)
Retro: Reusing teacher projection head for efficient embedding distillation on Lightweight Models via Self-supervised Learning
by: Nguyen, Khanh-Binh, et al.
Published: (2024)
Learning Low-Level Causal Relations using a Simulated Robotic Arm
by: Cibula, Miroslav, et al.
Published: (2024)
Streaming Looking Ahead with Token-level Self-reward
by: Zhang, Hongming, et al.
Published: (2025)
LeanTree: Accelerating White-Box Proof Search with Factorized States in Lean 4
by: Kripner, Matěj, et al.
Published: (2025)
Pessimistic Off-Policy Optimization for Learning to Rank
by: Cief, Matej, et al.
Published: (2022)
BLEUBERI: BLEU is a surprisingly effective reward for instruction following
by: Chang, Yapei, et al.
Published: (2025)
Just Say What You Want: Only-prompting Self-rewarding Online Preference Optimization
by: Xu, Ruijie, et al.
Published: (2024)
SPARE: Self-distillation for PARameter-Efficient Removal
by: Mola, Natnael, et al.
Published: (2026)
FedMSE: Semi-supervised federated learning approach for IoT network intrusion detection
by: Nguyen, Van Tuan, et al.
Published: (2024)
What should be observed for optimal reward in POMDPs?
by: Konsta, Alyzia-Maria, et al.
Published: (2024)
Active teacher selection for reward learning
by: Freedman, Rachel, et al.
Published: (2023)
Noise-based reward-modulated learning
by: Fernández, Jesús García, et al.
Published: (2025)
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
by: Liu, Shih-Yang, et al.
Published: (2026)
Towards better dense rewards in Reinforcement Learning Applications
by: Zhang, Shuyuan
Published: (2025)
sDREAMER: Self-distilled Mixture-of-Modality-Experts Transformer for Automatic Sleep Staging
by: Chen, Jingyuan, et al.
Published: (2025)
Information-theoretic analysis of world models in optimal reward maximizers
by: Harwood, Alfred, et al.
Published: (2026)
Resistance of Trapezoidal Sheeting in Fire
by: Aleš Chovanec, et al.
Published: (2025)
Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement
by: Chen, Qianniu, et al.
Published: (2025)
Self-supervised Hierarchical Visual Reasoning with World Model
by: Xu, Yuanfei, et al.
Published: (2026)
Toward effective protection against diffusion based mimicry through score distillation
by: Xue, Haotian, et al.
Published: (2023)
Education distillation:getting student models to learn in shcools
by: Feng, Ling, et al.
Published: (2023)
On the expressivity of sparse maxout networks
by: Grillo, Moritz, et al.
Published: (2025)
EVAL: EigenVector-based Average-reward Learning
by: Adamczyk, Jacob, et al.
Published: (2025)
Episodic Reinforcement Learning with Expanded State-reward Space
by: Liang, Dayang, et al.
Published: (2024)
Numerical exploration of the range of shape functionals using neural networks
by: Martinet, Eloi, et al.
Published: (2026)
Smooth-Distill: A Self-distillation Framework for Multitask Learning with Wearable Sensor Data
by: Vu, Hoang-Dieu, et al.
Published: (2025)
SMS: Self-supervised Model Seeding for Verification of Machine Unlearning
by: Wang, Weiqi, et al.
Published: (2025)
Self-supervised learning on gene expression data
by: Dradjat, Kevin, et al.
Published: (2025)
FIT-SLAM -- Fisher Information and Traversability estimation-based Active SLAM for exploration in 3D environments
by: Saravanan, Suchetan, et al.
Published: (2024)
IDLM: Inverse-distilled Diffusion Language Models
by: Li, David, et al.
Published: (2026)
R-ParVI: Particle-based variational inference through lens of rewards
by: Huang, Yongchao
Published: (2025)
Deep Reinforcement Learning with anticipatory reward in LSTM for Collision Avoidance of Mobile Robots
by: Poulet, Olivier, et al.
Published: (2025)