:: Library Catalog

Bibliographic Details
Main Authors:	Passi, Ananya; Robinson, Brian S.; Bonner, Michael F.
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.19155

Similar Items

An extremely coarse feedback signal is sufficient for learning human-aligned visual representations
by: Mehta, Yash, et al.
Published: (2026)

Universal dimensions of visual representation
by: Chen, Zirui, et al.
Published: (2024)

Rapidly deploying on-device eye tracking by distilling visual foundation models
by: Jiang, Cheng, et al.
Published: (2026)

SAGE: Spatial-visual Adaptive Graph Exploration for Efficient Visual Place Recognition
by: Chen, Shunpeng, et al.
Published: (2025)

Evaluating the Suitability of Different Intraoral Scan Resolutions for Deep Learning-Based Tooth Segmentation
by: Weekley, Daron, et al.
Published: (2025)

DIMM: Decoupled Multi-hierarchy Kalman Filter for 3D Object Tracking
by: Zha, Jirong, et al.
Published: (2025)

HiERO: understanding the hierarchy of human behavior enhances reasoning on egocentric videos
by: Peirone, Simone Alberto, et al.
Published: (2025)

Towards long-term player tracking with graph hierarchies and domain-specific features
by: Koshkina, Maria, et al.
Published: (2025)

Mining Contextualized Visual Associations from Images for Creativity Understanding
by: Sahu, Ananya, et al.
Published: (2025)

Contrastive Learning-based Multi Modal Architecture for Emoticon Prediction by Employing Image-Text Pairs
by: Pandey, Ananya, et al.
Published: (2024)

Target-Dependent Multimodal Sentiment Analysis Via Employing Visual-to Emotional-Caption Translation Network using Visual-Caption Pairs
by: Pandey, Ananya, et al.
Published: (2024)

FedPartWhole: Federated domain generalization via consistent part-whole hierarchies
by: Radwan, Ahmed, et al.
Published: (2024)

Characterizing Universal Object Representations Across Vision Models
by: Mahner, Florian P., et al.
Published: (2026)

On the rankability of visual embeddings
by: Sonthalia, Ankit, et al.
Published: (2025)

ARMARecon: An ARMA Convolutional Filter based Graph Neural Network for Neurodegenerative Dementias Classification
by: Abburi, VSS Tejaswi, et al.
Published: (2026)

Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos
by: Thomas, Xavier, et al.
Published: (2025)

Characterizing the visual representation of objects from the child's view
by: Yang, Jane, et al.
Published: (2026)

HOLA: Enhancing Audio-visual Deepfake Detection via Hierarchical Contextual Aggregations and Efficient Pre-training
by: Wu, Xuecheng, et al.
Published: (2025)

Neuromorphic visual attention for Sign-language recognition on SpiNNaker
by: Liskova, Sarka, et al.
Published: (2026)

Analyzing Noise Models and Advanced Filtering Algorithms for Image Enhancement
by: Akbar, Sahil Ali, et al.
Published: (2024)

UltrON: Ultrasound Occupancy Networks
by: Wysocki, Magdalena, et al.
Published: (2025)

Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection
by: Aggarwal, Sajal, et al.
Published: (2024)

LIT: Large Language Model Driven Intention Tracking for Proactive Human-Robot Collaboration -- A Robot Sous-Chef Application
by: Huang, Zhe, et al.
Published: (2024)

TetraSphere: A Neural Descriptor for O(3)-Invariant Point Cloud Analysis
by: Melnyk, Pavlo, et al.
Published: (2022)

Wandering around: A bioinspired approach to visual attention through object motion sensitivity
by: D'Angelo, Giulia, et al.
Published: (2025)

DualResolution Residual Architecture with Artifact Suppression for Melanocytic Lesion Segmentation
by: Singh, Vikram, et al.
Published: (2025)

Spectral Progressive Diffusion for Efficient Image and Video Generation
by: Xiao, Howard, et al.
Published: (2026)

Large-scale visual SLAM for in-the-wild videos
by: Sun, Shuo, et al.
Published: (2025)

Explaining with trees: interpreting CNNs using hierarchies
by: Rodrigues, Caroline Mazini, et al.
Published: (2024)

Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation
by: Chao, Brian, et al.
Published: (2026)

RemEdit: Efficient Diffusion Editing with Riemannian Geometry
by: Adhikarla, Eashan, et al.
Published: (2026)

A transition towards virtual representations of visual scenes
by: Pereira, Américo, et al.
Published: (2024)

AI-driven visual monitoring of industrial assembly tasks
by: Nardon, Mattia, et al.
Published: (2025)

Perception Encoder: The best visual embeddings are not at the output of the network
by: Bolya, Daniel, et al.
Published: (2025)

Can visual language models resolve textual ambiguity with visual cues? Let visual puns tell you!
by: Chung, Jiwan, et al.
Published: (2024)

Automated mapping of virtual environments with visual predictive coding
by: Gornet, James, et al.
Published: (2023)

Block-Sparse Global Attention for Efficient Multi-View Geometry Transformers
by: Wang, Chung-Shien Brian, et al.
Published: (2025)

Incremental dimension reduction for efficient and accurate visual anomaly detection
by: Lee, Teng-Yok
Published: (2026)

SCHIGAND: A Synthetic Facial Generation Mode Pipeline
by: Kadali, Ananya, et al.
Published: (2026)

Affine transformation estimation improves visual self-supervised learning
by: Torpey, David, et al.
Published: (2024)