:: Library Catalog

Bibliographic Details
Main Authors:	Kondapaneni, Neehar; Marks, Markus; Knott, Manuel; Guimaraes, Rogerio; Perona, Pietro
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2310.00031

Similar Items

Less is More: Discovering Concise Network Explanations
by: Kondapaneni, Neehar, et al.
Published: (2024)

A Number Sense as an Emergent Property of the Manipulating Brain
by: Kondapaneni, Neehar, et al.
Published: (2020)

A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification
by: Marks, Markus, et al.
Published: (2024)

Representational Similarity via Interpretable Visual Concepts
by: Kondapaneni, Neehar, et al.
Published: (2025)

Diffusion-Based Action Recognition Generalizes to Untrained Domains
by: Guimaraes, Rogerio, et al.
Published: (2025)

Representational Difference Explanations
by: Kondapaneni, Neehar, et al.
Published: (2025)

Social Perception of Faces in a Vision-Language Model
by: Hausladen, Carina I., et al.
Published: (2024)

A Rapid Test for Accuracy and Bias of Face Recognition Technology
by: Knott, Manuel, et al.
Published: (2025)

SAVeD: Learning to Denoise Low-SNR Video for Improved Downstream Performance
by: Stathatos, Suzanne, et al.
Published: (2025)

Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision
by: Khalil, Daniel, et al.
Published: (2024)

Single View Seafloor Recovery from Imaging Sonar via Differentiable Rendering
by: Brodjian, Sevan, et al.
Published: (2026)

Linear Mechanisms for Spatiotemporal Reasoning in Vision Language Models
by: Kang, Raphi, et al.
Published: (2026)

Confidence Intervals for Error Rates in 1:1 Matching Tasks: Critical Statistical Analysis and Recommendations
by: Fogliato, Riccardo, et al.
Published: (2023)

A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation
by: Fogliato, Riccardo, et al.
Published: (2024)

On the Effect of Image Resolution on Semantic Segmentation
by: Singh, Ritambhara, et al.
Published: (2024)

Is CLIP ideal? No. Can we fix it? Yes!
by: Kang, Raphi, et al.
Published: (2025)

Kuramoto Orientation Diffusion Models
by: Song, Yue, et al.
Published: (2025)

Unsupervised Representation Learning from Sparse Transformation Analysis
by: Song, Yue, et al.
Published: (2024)

Information Theoretic Text-to-Image Alignment
by: Wang, Chao, et al.
Published: (2024)

Weakly Supervised Panoptic Segmentation for Defect-Based Grading of Fresh Produce
by: Knott, Manuel, et al.
Published: (2024)

Counting Fish with Temporal Representations of Sonar Video
by: Van Brunt, Kai, et al.
Published: (2025)

RefAV: Towards Planning-Centric Scenario Mining
by: Davidson, Cainan, et al.
Published: (2025)

Revisiting Few-Shot Object Detection with Vision-Language Models
by: Madan, Anish, et al.
Published: (2023)

Probing the Mid-level Vision Capabilities of Self-Supervised Learning
by: Chen, Xuweiyi, et al.
Published: (2024)

MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting
by: Huang, Jun, et al.
Published: (2025)

RFMI: Estimating Mutual Information on Rectified Flow for Text-to-Image Alignment
by: Wang, Chao, et al.
Published: (2025)

MonoFusion: Sparse-View 4D Reconstruction via Monocular Fusion
by: Wang, Zihan, et al.
Published: (2025)

I Can't Believe It's Not Scene Flow!
by: Khatri, Ishan, et al.
Published: (2024)

RF-DETR: Neural Architecture Search for Real-Time Detection Transformers
by: Robinson, Isaac, et al.
Published: (2025)

Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search
by: Oshima, Yuta, et al.
Published: (2025)

Cross-Modal Attention Alignment Network with Auxiliary Text Description for zero-shot sketch-based image retrieval
by: Su, Hanwen, et al.
Published: (2024)

Continuous Concepts Removal in Text-to-image Diffusion Models
by: Han, Tingxu, et al.
Published: (2024)

TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
by: Ni, Haomiao, et al.
Published: (2024)

Hierarchical Vision-Language Alignment for Text-to-Image Generation via Diffusion Models
by: Johnson, Emily, et al.
Published: (2025)

Bridging the Skeleton-Text Modality Gap: Diffusion-Powered Modality Alignment for Zero-shot Skeleton-based Action Recognition
by: Do, Jeonghyeok, et al.
Published: (2024)

Improving Long-Text Alignment for Text-to-Image Diffusion Models
by: Liu, Luping, et al.
Published: (2024)

Personalized Safety Alignment for Text-to-Image Diffusion Models
by: Lei, Yu, et al.
Published: (2025)

Instant Preference Alignment for Text-to-Image Diffusion Models
by: Li, Yang, et al.
Published: (2025)

RadarGNN: Transformation Invariant Graph Neural Network for Radar-based Perception
by: Fent, Felix, et al.
Published: (2023)

Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models
by: Zhang, Yasi, et al.
Published: (2024)