:: Library Catalog

Bibliographic Details
Main Authors:	Brown, Jostan; Grimm, Cindy; Davidson, Joseph R.
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Robotics
Online Access:	https://arxiv.org/abs/2504.10764

Similar Items

Machine Vision-Based Assessment of Fall Color Changes and its Relationship with Leaf Nitrogen Concentration
by: Paudel, Achyut, et al.
Published: (2024)

VERNIER: an open-source software pushing marker pose estimation down to the micrometer and nanometer scales
by: Sandoz, Patrick, et al.
Published: (2025)

SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
by: Li, Rong, et al.
Published: (2024)

Depth Jitter: Seeing through the Depth
by: Rahman, Md Sazidur, et al.
Published: (2025)

AscDAMs: Advanced SLAM-based channel detection and mapping system
by: Wang, Tengfei, et al.
Published: (2024)

Seeing to Act, Prompting to Specify: A Bayesian Factorization of Vision Language Action Policy
by: Xu, Kechun, et al.
Published: (2025)

YOLO26-RipeLoc Lite: A lightweight architecture for tomato ripeness detection and picking point localization in greenhouse robotic harvesting
by: Singh, Rajmeet, et al.
Published: (2026)

From Seeing to Experiencing: Scaling Navigation Foundation Models with Reinforcement Learning
by: He, Honglin, et al.
Published: (2025)

See and Switch: Vision-Based Branching for Interactive Robot-Skill Programming
by: Vanc, Petr, et al.
Published: (2026)

RefAV: Towards Planning-Centric Scenario Mining
by: Davidson, Cainan, et al.
Published: (2025)

Learning to See and Act: Task-Aware Virtual View Exploration for Robotic Manipulation
by: Bai, Yongjie, et al.
Published: (2025)

VANP: Learning Where to See for Navigation with Self-Supervised Vision-Action Pre-Training
by: Nazeri, Mohammad, et al.
Published: (2024)

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
by: Luo, Gen, et al.
Published: (2025)

How Robot Dogs See the Unseeable: Improving Visual Interpretability via Peering for Exploratory Robots
by: Bimber, Oliver, et al.
Published: (2025)

See, Plan, Rewind: Progress-Aware Vision-Language-Action Models for Robust Robotic Manipulation
by: Dai, Tingjun, et al.
Published: (2026)

See What Matters: Differentiable Grid Sample Pruning for Generalizable Vision-Language-Action Model
by: Feng, Yixu, et al.
Published: (2026)

Robot See Robot Do: Imitating Articulated Object Manipulation with Monocular 4D Reconstruction
by: Kerr, Justin, et al.
Published: (2024)

SeePerSea: Multi-modal Perception Dataset of In-water Objects for Autonomous Surface Vehicles
by: Jeong, Mingi, et al.
Published: (2024)

Seeing Realism from Simulation: Efficient Video Transfer for Vision-Language-Action Data Augmentation
by: Hui, Chenyu, et al.
Published: (2026)

OVerSeeC: Open-Vocabulary Costmap Generation from Satellite Images and Natural Language
by: Rana, Rwik, et al.
Published: (2026)

Seeing Where to Deploy: Metric RGB-Based Traversability Analysis for Aerial-to-Ground Hidden Space Inspection
by: Lee, Seoyoung, et al.
Published: (2026)

Autonomous Catheterization with Open-source Simulator and Expert Trajectory
by: Jianu, Tudor, et al.
Published: (2024)

Towards agile multi-robot systems in the real world: Fast onboard tracking of active blinking markers for relative localization
by: Lakemann, Tim Felix, et al.
Published: (2025)

Addressing the challenges of loop detection in agricultural environments
by: Soncini, Nicolás, et al.
Published: (2024)

Floor extraction and door detection for visually impaired guidance
by: Berenguel-Baeta, Bruno, et al.
Published: (2024)

Believing is Seeing: Unobserved Object Detection using Generative Models
by: Bhattacharjee, Subhransu S., et al.
Published: (2024)

See Silhouettes in Motion with Neuromorphic Vision
by: Zhang, Pei, et al.
Published: (2026)

Self-localization on a 3D map by fusing global and local features from a monocular camera
by: Kikuchi, Satoshi, et al.
Published: (2025)

OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects
by: Krishnan, Akshay, et al.
Published: (2024)

REGRACE: A Robust and Efficient Graph-based Re-localization Algorithm using Consistency Evaluation
by: Oliveira, Débora N. P., et al.
Published: (2025)

Embodied Tree of Thoughts: Deliberate Manipulation Planning with Embodied World Model
by: Xu, Wenjiang, et al.
Published: (2025)

Multi-Modal Camera-Based Detection of Vulnerable Road Users
by: Brown, Penelope, et al.
Published: (2025)

Evaluation of facial landmark localization performance in a surgical setting
by: Frajtag, Ines, et al.
Published: (2025)

Seeing Farther and Smarter: Value-Guided Multi-Path Reflection for VLM Policy Optimization
by: Yang, Yanting, et al.
Published: (2026)

Seeing Through Uncertainty: A Free-Energy Approach for Real-Time Perceptual Adaptation in Robust Visual Navigation
by: Piriyajitakonkij, Maytus, et al.
Published: (2024)

BiasBench: A reproducible benchmark for tuning the biases of event cameras
by: Ziegler, Andreas, et al.
Published: (2025)

Ensemble-Based Event Camera Place Recognition Under Varying Illumination
by: Joseph, Therese, et al.
Published: (2025)

Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks
by: Joseph, Joji, et al.
Published: (2024)

STHN: Deep Homography Estimation for UAV Thermal Geo-localization with Satellite Imagery
by: Xiao, Jiuhong, et al.
Published: (2024)

Image-based Geo-localization for Robotics: Are Black-box Vision-Language Models there yet?
by: Waheed, Sania, et al.
Published: (2025)