| Main Authors: | Hamblin, Chris; Saha, Srijani; Konkle, Talia; Alvarez, George |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.05598 |
Similar Items
Feature Accentuation: Revealing 'What' Features Respond to in Natural Images
by: Hamblin, Chris, et al.
Published: (2024)
FOVI: A biologically-inspired foveated interface for deep vision models
by: Blauch, Nicholas M., et al.
Published: (2026)
Visual Anagrams Reveal Hidden Differences in Holistic Shape Processing Across Vision Models
by: Doshi, Fenil R., et al.
Published: (2025)
Bi-Orthogonal Factor Decomposition for Vision Transformers
by: Doshi, Fenil R., et al.
Published: (2026)
Physics Based Differentiable Rendering for Inverse Problems and Beyond
by: Kakkar, Preetish, et al.
Published: (2024)
Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
by: Fel, Thomas, et al.
Published: (2025)
Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry
by: Fel, Thomas, et al.
Published: (2025)
VALUED -- Vision and Logical Understanding Evaluation Dataset
by: Saha, Soumadeep, et al.
Published: (2023)
Npix2Cpix: A GAN-Based Image-to-Image Translation Network With Retrieval-Classification Integration for Watermark Retrieval From Historical Document Images
by: Saha, Utsab, et al.
Published: (2024)
HAUR: Human Annotation Understanding and Recognition Through Text-Heavy Images
by: Yang, Yuchen, et al.
Published: (2024)
Representation Understanding via Activation Maximization
by: Zhu, Hongbo, et al.
Published: (2025)
Understanding Sensor Vulnerabilities in Industrial XR Tracking
by: Saha, Sourya, et al.
Published: (2026)
Using Multimodal Deep Neural Networks to Disentangle Language from Visual Aesthetics
by: Conwell, Colin, et al.
Published: (2024)
Oscillating Dispersion for Maximal Light-throughput Spectral Imaging
by: Zhang, Jiuyun, et al.
Published: (2026)
Prompt2LVideos: Exploring Prompts for Understanding Long-Form Multimodal Videos
by: Jahagirdar, Soumya Shamarao, et al.
Published: (2025)
PushPull-Net: Inhibition-driven ResNet robust to image corruptions
by: Bennabhaktula, Guru Swaroop, et al.
Published: (2024)
Enhancing Understanding Through Wildlife Re-Identification
by: Buitenhuis, J.
Published: (2024)
What If: Understanding Motion Through Sparse Interactions
by: Baumann, Stefan Andreas, et al.
Published: (2025)
Video Understanding: Through A Temporal Lens
by: Nguyen, Thong Thanh
Published: (2026)
Image Clustering using an Augmented Generative Adversarial Network and Information Maximization
by: Ntelemis, Foivos, et al.
Published: (2020)
Animal Re-Identification on Microcontrollers
by: Chen, Yubo, et al.
Published: (2025)
Information-Maximized Soft Variable Discretization for Self-Supervised Image Representation Learning
by: Niu, Chuang, et al.
Published: (2025)
Can Deep Learning Trigger Alerts from Mobile-Captured Images?
by: Sarkar, Pritisha, et al.
Published: (2025)
Colorful Diffuse Intrinsic Image Decomposition in the Wild
by: Careaga, Chris, et al.
Published: (2024)
Graph Cut-guided Maximal Coding Rate Reduction for Learning Image Embedding and Clustering
by: He, W., et al.
Published: (2024)
Maximizing T2-Only Prostate Cancer Localization from Expected Diffusion Weighted Imaging
by: Yi, Weixi, et al.
Published: (2026)
Understanding Cross-Model Perceptual Invariances Through Ensemble Metamers
by: Boehm, Lukas, et al.
Published: (2025)
VET-DINO: Learning Anatomical Understanding Through Multi-View Distillation in Veterinary Imaging
by: Dourson, Andre, et al.
Published: (2025)
AEM: Attention Entropy Maximization for Multiple Instance Learning based Whole Slide Image Classification
by: Zhang, Yunlong, et al.
Published: (2024)
Semi-supervised Image Dehazing via Expectation-Maximization and Bidirectional Brownian Bridge Diffusion Models
by: Liu, Bing, et al.
Published: (2025)
Unified 3D Scene Understanding Through Physical World Modeling
by: Lee, Wanhee, et al.
Published: (2026)
Scenario Understanding of Traffic Scenes Through Large Visual Language Models
by: Rivera, Esteban, et al.
Published: (2025)
Image Understanding Makes for A Good Tokenizer for Image Generation
by: Wang, Luting, et al.
Published: (2024)
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
by: Yang, Yue, et al.
Published: (2025)
Acoustic Field Video for Multimodal Scene Understanding
by: Kim, Daehwa, et al.
Published: (2026)
3D Scene Understanding Through Local Random Access Sequence Modeling
by: Lee, Wanhee, et al.
Published: (2025)
Benchmarking and Enhancing VLM for Compressed Image Understanding
by: Zhang, Zifu, et al.
Published: (2025)
How does the primate brain combine generative and discriminative computations in vision?
by: Peters, Benjamin, et al.
Published: (2024)
Risk Controlled Image Retrieval
by: Cai, Kaiwen, et al.
Published: (2023)
Scale and Rotation Estimation of Similarity-Transformed Images via Cross-Correlation Maximization Based on Auxiliary Function Method
by: Yamashita, Shinji, et al.
Published: (2025)