| Main Authors: | Sepehri, Mohammad Shahab, Fabian, Zalan, Soltanolkotabi, Maryam, Soltanolkotabi, Mahdi |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.15477 |
Similar Items
Serpent: Scalable and Efficient Image Restoration via Multi-scale Structured State Space Models
by: Sepehri, Mohammad Shahab, et al.
Published: (2024)
Hyperphantasia: A Benchmark for Evaluating the Mental Visualization Capabilities of Multimodal LLMs
by: Sepehri, Mohammad Shahab, et al.
Published: (2025)
ConceptMix++: Leveling the Playing Field in Text-to-Image Benchmarking via Iterative Prompt Optimization
by: Gan, Haosheng, et al.
Published: (2025)
MosaicMRI: A Diverse Dataset and Benchmark for Raw Musculoskeletal MRI
by: Arguello, Paula, et al.
Published: (2026)
Emergence and Evolution of Interpretable Concepts in Diffusion Models
by: Tinaz, Berk, et al.
Published: (2025)
DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency
by: Fabian, Zalan, et al.
Published: (2023)
ATHENA: Adaptive Test-Time Steering for Improving Count Fidelity in Diffusion Models
by: Sepehri, Mohammad Shahab, et al.
Published: (2026)
HARMONY: Hidden Activation Representations and Model Output-Aware Uncertainty Estimation for Vision-Language Models
by: Mushtaq, Erum, et al.
Published: (2025)
Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models
by: Fabian, Zalan, et al.
Published: (2023)
CryptoMamba: Leveraging State Space Models for Accurate Bitcoin Price Prediction
by: Sepehri, Mohammad Shahab, et al.
Published: (2025)
Gradient Descent Provably Solves Nonlinear Tomographic Reconstruction
by: Fridovich-Keil, Sara, et al.
Published: (2023)
Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning
by: Banayeeanzade, Amin, et al.
Published: (2024)
LANTERN: A Machine Learning Framework for Lipid Nanoparticle Transfection Efficiency Prediction
by: Mehradfar, Asal, et al.
Published: (2025)
Don't trust your eyes: on the (un)reliability of feature visualizations
by: Geirhos, Robert, et al.
Published: (2023)
Training Dynamics of Softmax Self-Attention: Fast Global Convergence via Preconditioning
by: Goel, Gautam, et al.
Published: (2026)
Asymmetric Prompt Weighting for Reinforcement Learning with Verifiable Rewards
by: Heckel, Reinhard, et al.
Published: (2026)
Bias-constrained multimodal intelligence for equitable and reliable clinical AI
by: Li, Cheng, et al.
Published: (2026)
Do not trust what you trust: Miscalibration in Semi-supervised Learning
by: Mishra, Shambhavi, et al.
Published: (2024)
If you're waiting for a sign... that might not be it! Mitigating Trust Boundary Confusion from Visual Injections on Vision-Language Agentic Systems
by: Chang, Jiamin, et al.
Published: (2026)
AI-assisted prostate cancer detection and localisation on biparametric MR by classifying radiologist-positives
by: Wu, Xiangcen, et al.
Published: (2024)
A multimodal vision foundation model for generalizable knee pathology
by: Yu, Kang, et al.
Published: (2026)
Are foundation models efficient for medical image segmentation?
by: Ferreira, Danielle, et al.
Published: (2023)
Fine-tuning can cripple your foundation model; preserving features may be the solution
by: Mukhoti, Jishnu, et al.
Published: (2023)
Leveraging AI multimodal geospatial foundation models for improved near-real-time flood mapping at a global scale
by: Tulbure, Mirela G., et al.
Published: (2025)
MIMO: A medical vision language model with visual referring multimodal input and pixel grounding multimodal output
by: Chen, Yanyuan, et al.
Published: (2025)
ENSAM: an efficient foundation model for interactive segmentation of 3D medical images
by: Stenhede, Elias, et al.
Published: (2025)
FoMo4Wheat: Toward reliable crop vision foundation models with globally curated data
by: Han, Bing, et al.
Published: (2025)
Stability properties of gradient flow dynamics for the symmetric low-rank matrix factorization problem
by: Mohammadi, Hesameddin, et al.
Published: (2024)
MiSuRe is all you need to explain your image segmentation
by: Hasany, Syed Nouman, et al.
Published: (2024)
GlitchBench: Can large multimodal models detect video game glitches?
by: Taesiri, Mohammad Reza, et al.
Published: (2023)
MediAug: Exploring Visual Augmentation in Medical Imaging
by: Qi, Xuyin, et al.
Published: (2025)
The Rich and the Simple: On the Implicit Bias of Adam and SGD
by: Vasudeva, Bhavya, et al.
Published: (2025)
Learning to Recall with Transformers Beyond Orthogonal Embeddings
by: Vural, Nuri Mert, et al.
Published: (2026)
PAST: A multimodal single-cell foundation model for histopathology and spatial transcriptomics in cancer
by: Yang, Changchun, et al.
Published: (2025)
MedDINOv3: How to adapt vision foundation models for medical image segmentation?
by: Li, Yuheng, et al.
Published: (2025)
Can you SPLICE it together? A Human Curated Benchmark for Probing Visual Reasoning in VLMs
by: Ballout, Mohamad, et al.
Published: (2025)
PB-IAD: Utilizing multimodal foundation models for semantic industrial anomaly detection in dynamic manufacturing environments
by: Hofmann, Bernd, et al.
Published: (2025)
Closing the gap in multimodal medical representation alignment
by: Grassucci, Eleonora, et al.
Published: (2026)
Visual concept ranking uncovers medical shortcuts used by large multimodal models
by: Janizek, Joseph D., et al.
Published: (2026)
When Eyes and Ears Disagree: Can MLLMs Discern Audio-Visual Confusion?
by: Ye, Qilang, et al.
Published: (2025)