| Main Authors: | Van Brunt, Kai, Kay, Justin, Haucke, Timm, Perona, Pietro, Van Horn, Grant, Beery, Sara |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.05129 |
Similar Items
Align and Distill: Unifying and Improving Domain Adaptive Object Detection
by: Kay, Justin, et al.
Published: (2024)
Pairwise Matching of Intermediate Representations for Fine-grained Explainability
by: Shrack, Lauren, et al.
Published: (2025)
Consensus-Driven Active Model Selection
by: Kay, Justin, et al.
Published: (2025)
Single View Seafloor Recovery from Imaging Sonar via Differentiable Rendering
by: Brodjian, Sevan, et al.
Published: (2026)
Deep in the Jungle: Towards Automating Chimpanzee Population Estimation
by: Raynes, Tom, et al.
Published: (2026)
SAVeD: Learning to Denoise Low-SNR Video for Improved Downstream Performance
by: Stathatos, Suzanne, et al.
Published: (2025)
Merlin L48 Spectrogram Dataset
by: Sun, Aaron, et al.
Published: (2025)
Improved Zero-Shot Classification by Adapting VLMs with Text Descriptions
by: Saha, Oindrila, et al.
Published: (2024)
INQUIRE: A Natural World Text-to-Image Retrieval Benchmark
by: Vendrow, Edward, et al.
Published: (2024)
Representational Similarity via Interpretable Visual Concepts
by: Kondapaneni, Neehar, et al.
Published: (2025)
Representational Difference Explanations
by: Kondapaneni, Neehar, et al.
Published: (2025)
Generate, Transduct, Adapt: Iterative Transduction with VLMs
by: Saha, Oindrila, et al.
Published: (2025)
Human-in-the-Loop Visual Re-ID for Population Size Estimation
by: Perez, Gustavo, et al.
Published: (2023)
Moment Sampling in Video LLMs for Long-Form Video QA
by: Chasmai, Mustafa, et al.
Published: (2025)
Personalized Representation from Personalized Generation
by: Sundaram, Shobhita, et al.
Published: (2024)
A Number Sense as an Emergent Property of the Manipulating Brain
by: Kondapaneni, Neehar, et al.
Published: (2020)
Unsupervised Representation Learning from Sparse Transformation Analysis
by: Song, Yue, et al.
Published: (2024)
Diffusion-Based Action Recognition Generalizes to Untrained Domains
by: Guimaraes, Rogerio, et al.
Published: (2025)
Linear Mechanisms for Spatiotemporal Reasoning in Vision Language Models
by: Kang, Raphi, et al.
Published: (2026)
Confidence Intervals for Error Rates in 1:1 Matching Tasks: Critical Statistical Analysis and Recommendations
by: Fogliato, Riccardo, et al.
Published: (2023)
WildSAT: Learning Satellite Image Representations from Wildlife Observations
by: Daroya, Rangel, et al.
Published: (2024)
Less is More: Discovering Concise Network Explanations
by: Kondapaneni, Neehar, et al.
Published: (2024)
A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation
by: Fogliato, Riccardo, et al.
Published: (2024)
Masked Autoencoders with Limited Data: Does It Work? A Fine-Grained Bioacoustics Case Study
by: Liu, Wuao, et al.
Published: (2026)
Seeing Through the PRISM: Compound & Controllable Restoration of Scientific Images
by: Kurinchi-Vendhan, Rupa, et al.
Published: (2026)
Text-image Alignment for Diffusion-based Perception
by: Kondapaneni, Neehar, et al.
Published: (2023)
On the Effect of Image Resolution on Semantic Segmentation
by: Singh, Ritambhara, et al.
Published: (2024)
CleverBirds: A Multiple-Choice Benchmark for Fine-grained Human Knowledge Tracing
by: Bossemeyer, Leonie, et al.
Published: (2025)
Is CLIP ideal? No. Can we fix it? Yes!
by: Kang, Raphi, et al.
Published: (2025)
A Rapid Test for Accuracy and Bias of Face Recognition Technology
by: Knott, Manuel, et al.
Published: (2025)
Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision
by: Khalil, Daniel, et al.
Published: (2024)
Anchored Video Generation: Decoupling Scene Construction and Temporal Synthesis in Text-to-Video Diffusion Models
by: Hassan, Mariam, et al.
Published: (2025)
You May Speak Freely: Improving the Fine-Grained Visual Recognition Capabilities of Multimodal Large Language Models with Answer Extraction
by: Lawrence, Logan, et al.
Published: (2025)
SonarSplat: Novel View Synthesis of Imaging Sonar via Gaussian Splatting
by: Sethuraman, Advaith V., et al.
Published: (2025)
Visually Consistent Hierarchical Image Classification
by: Park, Seulki, et al.
Published: (2024)
Are They the Same Picture? Adapting Concept Bottleneck Models for Human-AI Collaboration in Image Retrieval
by: Balloli, Vaibhav, et al.
Published: (2024)
Spatial-Temporal Graph Mamba for Music-Guided Dance Video Synthesis
by: Tang, Hao, et al.
Published: (2025)
TrackMAE: Video Representation Learning via Track Mask and Predict
by: Vandeghen, Renaud, et al.
Published: (2026)
SONIC: Sonar Image Correspondence using Pose Supervised Learning for Imaging Sonars
by: Gode, Samiran, et al.
Published: (2023)
Counting to Four is still a Chore for VLMs
by: Anh, Duy Le Dinh, et al.
Published: (2026)