:: Library Catalog

Bibliographic Details
Main Authors:	Raghuraman, Nikhil; Harley, Adam W.; Guibas, Leonidas
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2309.03468

Similar Items

Bongard-RWR+: Real-World Representations of Fine-Grained Concepts in Bongard Problems
by: Pawlonka, Szymon, et al.
Published: (2025)

Refining Pre-Trained Motion Models
by: Sun, Xinglong, et al.
Published: (2024)

Reasoning Limitations of Multimodal Large Language Models. A Case Study of Bongard Problems
by: Małkiński, Mikołaj, et al.
Published: (2024)

PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers
by: Li, Songlin, et al.
Published: (2024)

View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields
by: He, Haodi, et al.
Published: (2024)

Synthesizing 3D Abstractions by Inverting Procedural Buildings with Transformers
by: Dax, Maximilian, et al.
Published: (2025)

pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation
by: Chen, Hansheng, et al.
Published: (2025)

Diffusion Self-Distillation for Zero-Shot Customized Image Generation
by: Cai, Shengqu, et al.
Published: (2024)

Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs
by: Hong, Yining, et al.
Published: (2026)

ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop
by: Hong, Yining, et al.
Published: (2026)

Bongards at the Boundary of Perception and Reasoning: Programs or Language?
by: Langenfeld, Cassidy, et al.
Published: (2026)

Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images
by: Yu, Zhuoran, et al.
Published: (2023)

Beyond Perfect Scores: Proof-by-Contradiction for Trustworthy Machine Learning
by: Wadduwage, Dushan N., et al.
Published: (2026)

ODIN: A Single Model for 2D and 3D Segmentation
by: Jain, Ayush, et al.
Published: (2024)

FOCUS: Forcing In-Context Object Localization through Visual Support Constraints and Policy Optimization
by: Karim, Mohammed Asad, et al.
Published: (2026)

Diffusion for World Modeling: Visual Details Matter in Atari
by: Alonso, Eloi, et al.
Published: (2024)

RAVEN: Resilient Aerial Navigation via Open-Set Semantic Memory and Behavior Adaptation
by: Kim, Seungchan, et al.
Published: (2025)

RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration
by: Alama, Omar, et al.
Published: (2025)

Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting
by: Chłopowiec, Adrian B., et al.
Published: (2024)

Seeing Beyond Frames: Zero-Shot Pedestrian Intention Prediction with Raw Temporal Video and Multimodal Cues
by: Zambare, Pallavi, et al.
Published: (2025)

SpinQuant: LLM quantization with learned rotations
by: Liu, Zechun, et al.
Published: (2024)

Robust sensor fusion against on-vehicle sensor staleness
by: Fan, Meng, et al.
Published: (2025)

SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
by: Fedele, Elisabetta, et al.
Published: (2025)

HouseLayout3D: A Benchmark and Training-Free Baseline for 3D Layout Estimation in the Wild
by: Bieri, Valentin, et al.
Published: (2025)

Bayesian Modeling of Zero-Shot Classifications for Urban Flood Detection
by: Franchi, Matt, et al.
Published: (2025)

SetFlow: Generating Structured Sets of Representations for Multiple Instance Learning
by: Jovišić, Nikola, et al.
Published: (2026)

Convolutional Set Transformer
by: Chinello, Federico, et al.
Published: (2025)

OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding
by: Engelmann, Francis, et al.
Published: (2024)

Animal Pose Labeling Using General-Purpose Point Trackers
by: Pan, Zhuoyang, et al.
Published: (2025)

LookOut: Real-World Humanoid Egocentric Navigation
by: Pan, Boxiao, et al.
Published: (2025)

Self-supervised video pretraining yields robust and more human-aligned visual representations
by: Parthasarathy, Nikhil, et al.
Published: (2022)

Semi-Supervised Masked Autoencoders: Unlocking Vision Transformer Potential with Limited Data
by: Faysal, Atik, et al.
Published: (2026)

Zero-Shot Image Feature Consensus with Deep Functional Maps
by: Cheng, Xinle, et al.
Published: (2024)

Context Normalization Layer with Applications
by: Faye, Bilal, et al.
Published: (2023)

What Matters in Practical Learned Image Compression
by: Tatwawadi, Kedar, et al.
Published: (2026)

Holistic Uncertainty Estimation For Open-Set Recognition
by: Erlygin, Leonid, et al.
Published: (2024)

GHOST: Gaussian Hypothesis Open-Set Technique
by: Rabinowitz, Ryan, et al.
Published: (2025)

ARC Is a Vision Problem!
by: Hu, Keya, et al.
Published: (2025)

What Matters in Range View 3D Object Detection
by: Wilson, Benjamin, et al.
Published: (2024)

Efficient World Models with Context-Aware Tokenization
by: Micheli, Vincent, et al.
Published: (2024)