| Main Authors: | Raghuraman, Nikhil, Harley, Adam W., Guibas, Leonidas |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2309.03468 |
Similar Items
Bongard-RWR+: Real-World Representations of Fine-Grained Concepts in Bongard Problems
by: Pawlonka, Szymon, et al.
Published: (2025)
Refining Pre-Trained Motion Models
by: Sun, Xinglong, et al.
Published: (2024)
Reasoning Limitations of Multimodal Large Language Models. A Case Study of Bongard Problems
by: Małkiński, Mikołaj, et al.
Published: (2024)
PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers
by: Li, Songlin, et al.
Published: (2024)
View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields
by: He, Haodi, et al.
Published: (2024)
Synthesizing 3D Abstractions by Inverting Procedural Buildings with Transformers
by: Dax, Maximilian, et al.
Published: (2025)
pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation
by: Chen, Hansheng, et al.
Published: (2025)
Diffusion Self-Distillation for Zero-Shot Customized Image Generation
by: Cai, Shengqu, et al.
Published: (2024)
Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs
by: Hong, Yining, et al.
Published: (2026)
ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop
by: Hong, Yining, et al.
Published: (2026)
Bongards at the Boundary of Perception and Reasoning: Programs or Language?
by: Langenfeld, Cassidy, et al.
Published: (2026)
Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images
by: Yu, Zhuoran, et al.
Published: (2023)
Beyond Perfect Scores: Proof-by-Contradiction for Trustworthy Machine Learning
by: Wadduwage, Dushan N., et al.
Published: (2026)
ODIN: A Single Model for 2D and 3D Segmentation
by: Jain, Ayush, et al.
Published: (2024)
FOCUS: Forcing In-Context Object Localization through Visual Support Constraints and Policy Optimization
by: Karim, Mohammed Asad, et al.
Published: (2026)
Diffusion for World Modeling: Visual Details Matter in Atari
by: Alonso, Eloi, et al.
Published: (2024)
RAVEN: Resilient Aerial Navigation via Open-Set Semantic Memory and Behavior Adaptation
by: Kim, Seungchan, et al.
Published: (2025)
RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration
by: Alama, Omar, et al.
Published: (2025)
Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting
by: Chłopowiec, Adrian B., et al.
Published: (2024)
Seeing Beyond Frames: Zero-Shot Pedestrian Intention Prediction with Raw Temporal Video and Multimodal Cues
by: Zambare, Pallavi, et al.
Published: (2025)
SpinQuant: LLM quantization with learned rotations
by: Liu, Zechun, et al.
Published: (2024)
Robust sensor fusion against on-vehicle sensor staleness
by: Fan, Meng, et al.
Published: (2025)
SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
by: Fedele, Elisabetta, et al.
Published: (2025)
HouseLayout3D: A Benchmark and Training-Free Baseline for 3D Layout Estimation in the Wild
by: Bieri, Valentin, et al.
Published: (2025)
Bayesian Modeling of Zero-Shot Classifications for Urban Flood Detection
by: Franchi, Matt, et al.
Published: (2025)
SetFlow: Generating Structured Sets of Representations for Multiple Instance Learning
by: Jovišić, Nikola, et al.
Published: (2026)
Convolutional Set Transformer
by: Chinello, Federico, et al.
Published: (2025)
OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding
by: Engelmann, Francis, et al.
Published: (2024)
Animal Pose Labeling Using General-Purpose Point Trackers
by: Pan, Zhuoyang, et al.
Published: (2025)
LookOut: Real-World Humanoid Egocentric Navigation
by: Pan, Boxiao, et al.
Published: (2025)
Self-supervised video pretraining yields robust and more human-aligned visual representations
by: Parthasarathy, Nikhil, et al.
Published: (2022)
Semi-Supervised Masked Autoencoders: Unlocking Vision Transformer Potential with Limited Data
by: Faysal, Atik, et al.
Published: (2026)
Zero-Shot Image Feature Consensus with Deep Functional Maps
by: Cheng, Xinle, et al.
Published: (2024)
Context Normalization Layer with Applications
by: Faye, Bilal, et al.
Published: (2023)
What Matters in Practical Learned Image Compression
by: Tatwawadi, Kedar, et al.
Published: (2026)
Holistic Uncertainty Estimation For Open-Set Recognition
by: Erlygin, Leonid, et al.
Published: (2024)
GHOST: Gaussian Hypothesis Open-Set Technique
by: Rabinowitz, Ryan, et al.
Published: (2025)
ARC Is a Vision Problem!
by: Hu, Keya, et al.
Published: (2025)
What Matters in Range View 3D Object Detection
by: Wilson, Benjamin, et al.
Published: (2024)
Efficient World Models with Context-Aware Tokenization
by: Micheli, Vincent, et al.
Published: (2024)