| Main Authors: | Wang, Renhao, Geng, Haoran, Li, Tingle, Wang, Feishi, Anumanchipalli, Gopala, Darrell, Trevor, Li, Boyi, Abbeel, Pieter, Malik, Jitendra, Efros, Alexei A. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2507.02864 |
Similar Items
Self-Supervised Audio-Visual Soundscape Stylization
by: Li, Tingle, et al.
Published: (2024)
Audio Texture Manipulation by Exemplar-Based Analogy
by: Cheng, Kan Jen, et al.
Published: (2025)
Prioritized Generative Replay
by: Wang, Renhao, et al.
Published: (2024)
Interactive Task Planning with Language Models
by: Li, Boyi, et al.
Published: (2023)
Sounding that Object: Interactive Object-Aware Image to Audio Generation
by: Li, Tingle, et al.
Published: (2025)
ViTacFormer: Learning Cross-Modal Representation for Visuo-Tactile Dexterous Manipulation
by: Heng, Liang, et al.
Published: (2025)
Rodrigues Network for Learning Robot Actions
by: Zhang, Jialiang, et al.
Published: (2025)
Synthesizing Moving People with 3D Control
by: Li, Boyi, et al.
Published: (2024)
D-REX: Differentiable Real-to-Sim-to-Real Engine for Learning Dexterous Grasping
by: Lou, Haozhe, et al.
Published: (2026)
DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy
by: Wang, Yuran, et al.
Published: (2025)
Rethinking Patch Dependence for Masked Autoencoders
by: Fu, Letian, et al.
Published: (2024)
SkillBlender: Towards Versatile Humanoid Whole-Body Loco-Manipulation via Skill Blending
by: Kuang, Yuxuan, et al.
Published: (2025)
Deep Sensorimotor Control by Imitating Predictive Models of Human Motion
by: Singh, Himanshu Gaurav, et al.
Published: (2025)
DIPOLE: Fusing Vision and Geometry for Robust Visuomotor Generalization
by: Tang, Yikai, et al.
Published: (2025)
Multi-Objective Learning for Diffusion Models: A Statistical Theory under Semi-Supervised Learning
by: Cheng, Ziheng, et al.
Published: (2026)
StyleStream: Real-Time Zero-Shot Voice Style Conversion
by: Liu, Yisi, et al.
Published: (2026)
End-to-end RL Improves Dexterous Grasping Policies
by: Singh, Ritvik, et al.
Published: (2025)
Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities
by: Lin, Guan-Ting, et al.
Published: (2025)
Learning Humanoid Locomotion over Challenging Terrain
by: Radosavovic, Ilija, et al.
Published: (2024)
Re-evaluating the Need for Multimodal Signals in Unsupervised Grammar Induction
by: Li, Boyi, et al.
Published: (2022)
Towards Hierarchical Spoken Language Dysfluency Modeling
by: Lian, Jiachen, et al.
Published: (2024)
Large Video Planner Enables Generalizable Robot Control
by: Chen, Boyuan, et al.
Published: (2025)
It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models
by: Harrington, Anne, et al.
Published: (2025)
Closing the Visual Sim-to-Real Gap with Object-Composable NeRFs
by: Mishra, Nikhil, et al.
Published: (2024)
How to Peel with a Knife: Aligning Fine-Grained Manipulation with Human Preference
by: Lin, Toru, et al.
Published: (2026)
Twisting Lids Off with Two Hands
by: Lin, Toru, et al.
Published: (2024)
Visual Imitation Enables Contextual Humanoid Control
by: Allshire, Arthur, et al.
Published: (2025)
RT-VC: Real-Time Zero-Shot Voice Conversion with Speech Articulatory Coding
by: Liu, Yisi, et al.
Published: (2025)
xT: Nested Tokenization for Larger Context in Large Images
by: Gupta, Ritwik, et al.
Published: (2024)
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
by: Lian, Long, et al.
Published: (2023)
Learning Sim-to-Real Humanoid Locomotion in 15 Minutes
by: Seo, Younggyo, et al.
Published: (2025)
Schrodinger Audio-Visual Editor: Object-Level Audiovisual Removal
by: Xu, Weihan, et al.
Published: (2025)
From Generated Human Videos to Physically Plausible Robot Trajectories
by: Ni, James, et al.
Published: (2025)
A Unified Framework for Model Editing
by: Gupta, Akshat, et al.
Published: (2024)
Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3
by: Yoon, Junsang, et al.
Published: (2024)
Rebuilding ROME : Resolving Model Collapse during Sequential Model Editing
by: Gupta, Akshat, et al.
Published: (2024)
Geometric Interpretation of Layer Normalization and a Comparative Analysis with RMSNorm
by: Gupta, Akshat, et al.
Published: (2024)
Model Editing at Scale leads to Gradual and Catastrophic Forgetting
by: Gupta, Akshat, et al.
Published: (2024)
Self-Assessment Tests are Unreliable Measures of LLM Personality
by: Gupta, Akshat, et al.
Published: (2023)
Multimodal Segmentation for Vocal Tract Modeling
by: Jain, Rishi, et al.
Published: (2024)