Library Catalog

Bibliographic Details
Main Authors:	Shankar, Nathan; Ladosz, Pawel; Yin, Hujun
Format:	Preprint
Published:	2025
Subjects:	Robotics; Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2510.04883

Similar Items

FLARE-BO: Fused Luminance and Adaptive Retinex Enhancement via Bayesian Optimisation for Low-Light Robotic Vision
by: Shankar, Nathan, et al.
Published: (2026)

Do Open-Vocabulary Detectors Transfer to Aerial Imagery? A Comparative Evaluation
by: Tsourveloudis, Christos
Published: (2026)

Biomolecular Analysis of Soil Samples and Rock Imagery for Tracing Evidence of Life Using a Mobile Robot
by: Siddique, Shah Md Ahasan, et al.
Published: (2024)

Human-in-the-Loop Segmentation of Multi-species Coral Imagery
by: Raine, Scarlett, et al.
Published: (2024)

Application Research of a Deep Learning Model Integrating CycleGAN and YOLO in PCB Infrared Defect Detection
by: Yang, Chao, et al.
Published: (2026)

ActLoc: Learning to Localize on the Move via Active Viewpoint Selection
by: Li, Jiajie, et al.
Published: (2025)

Observer-Actor: Active Vision Imitation Learning with Sparse-View Gaussian Splatting
by: Wang, Yilong, et al.
Published: (2025)

AVA-VLA: Improving Vision-Language-Action models with Active Visual Attention
by: Xiao, Lei, et al.
Published: (2025)

Flux4D: Flow-based Unsupervised 4D Reconstruction
by: Wang, Jingkang, et al.
Published: (2025)

Comparative Analysis of Military Detection Using Drone Imagery Across Multiple Visual Spectrums
by: Shuvo, Sourov Roy, et al.
Published: (2026)

Reconstruction or Semantics? What Makes a Latent Space Useful for Robotic World Models
by: Nilaksh, et al.
Published: (2026)

Control-oriented Clustering of Visual Latent Representation
by: Qi, Han, et al.
Published: (2024)

Formulating Event-based Image Reconstruction as a Linear Inverse Problem with Deep Regularization using Optical Flow
by: Zhang, Zelin, et al.
Published: (2021)

Inference-Time Enhancement of Generative Robot Policies via Predictive World Modeling
by: Qi, Han, et al.
Published: (2025)

ReGentS: Real-World Safety-Critical Driving Scenario Generation Made Stable
by: Yin, Yuan, et al.
Published: (2024)

Enhanced Spatiotemporal Consistency for Image-to-LiDAR Data Pretraining
by: Xu, Xiang, et al.
Published: (2025)

H-RDT: Human Manipulation Enhanced Bimanual Robotic Manipulation
by: Bi, Hongzhe, et al.
Published: (2025)

LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning
by: Niu, Dantong, et al.
Published: (2024)

AirIO: Learning Inertial Odometry with Enhanced IMU Feature Observability
by: Qiu, Yuheng, et al.
Published: (2025)

Language-Enhanced Latent Representations for Out-of-Distribution Detection in Autonomous Driving
by: Mao, Zhenjiang, et al.
Published: (2024)

VisualMimic: Visual Humanoid Loco-Manipulation via Motion Tracking and Generation
by: Yin, Shaofeng, et al.
Published: (2025)

GenDP: 3D Semantic Fields for Category-Level Generalizable Diffusion Policy
by: Wang, Yixuan, et al.
Published: (2024)

FEEL (Force-Enhanced Egocentric Learning): A Dataset for Physical Action Understanding
by: Dessalene, Eadom, et al.
Published: (2026)

Reconstruction by Generation: 3D Multi-Object Scene Reconstruction from Sparse Observations
by: Zadaianchuk, Andrii, et al.
Published: (2026)

R3GS: Gaussian Splatting for Robust Reconstruction and Relocalization in Unconstrained Image Collections
by: Yan, Xu, et al.
Published: (2025)

Show, Don't Tell: Detecting Novel Objects by Watching Human Videos
by: Akl, James, et al.
Published: (2026)

iVideoGPT: Interactive VideoGPTs are Scalable World Models
by: Wu, Jialong, et al.
Published: (2024)

Enhancing Large Vision Model in Street Scene Semantic Understanding through Leveraging Posterior Optimization Trajectory
by: Kou, Wei-Bin, et al.
Published: (2025)

LLM-attacker: Enhancing Closed-loop Adversarial Scenario Generation for Autonomous Driving with Large Language Models
by: Mei, Yuewen, et al.
Published: (2025)

ReasonDrive: Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models
by: Chahe, Amirhosein, et al.
Published: (2025)

Robot Synesthesia: In-Hand Manipulation with Visuotactile Sensing
by: Yuan, Ying, et al.
Published: (2023)

CodeDiffuser: Attention-Enhanced Diffusion Policy via VLM-Generated Code for Instruction Ambiguity
by: Yin, Guang, et al.
Published: (2025)

TAVIS: A Benchmark for Egocentric Active Vision and Anticipatory Gaze in Imitation Learning
by: Spigler, Giacomo
Published: (2026)

SegXAL: Explainable Active Learning for Semantic Segmentation in Driving Scene Scenarios
by: Mandalika, Sriram, et al.
Published: (2024)

Enhancing Active Learning for Sentinel 2 Imagery through Contrastive Learning and Uncertainty Estimation
by: Pogorzelski, David, et al.
Published: (2024)

MapAnything: Universal Feed-Forward Metric 3D Reconstruction
by: Keetha, Nikhil, et al.
Published: (2025)

Make the Pertinent Salient: Task-Relevant Reconstruction for Visual Control with Distractions
by: Kim, Kyungmin, et al.
Published: (2024)

Differentiable Inverse Graphics for Zero-shot Scene Reconstruction and Robot Grasping
by: Arriaga, Octavio, et al.
Published: (2026)

Any4D: Unified Feed-Forward Metric 4D Reconstruction
by: Karhade, Jay, et al.
Published: (2025)

SplArt: Articulation Estimation and Part-Level Reconstruction with 3D Gaussian Splatting
by: Lin, Shengjie, et al.
Published: (2025)