| Main Authors: | Demidov, Dmitry; Majzoub, Roba Al; Kumar, Amandeep; Khan, Fahad |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2401.01164 |
Similar Items
How Good is my Histopathology Vision-Language Foundation Model? A Holistic Benchmark
by: Majzoub, Roba Al, et al.
Published: (2025)
Extract More from Less: Efficient Fine-Grained Visual Recognition in Low-Data Regimes
by: Demidov, Dmitry, et al.
Published: (2024)
Vocabulary-free Fine-grained Visual Recognition via Enriched Contextually Grounded Vision-Language Model
by: Demidov, Dmitry, et al.
Published: (2025)
Salient Mask-Guided Vision Transformer for Fine-Grained Classification
by: Demidov, Dmitry, et al.
Published: (2023)
Beyond Simple Edits: Composed Video Retrieval with Dense Modifications
by: Thawakar, Omkar, et al.
Published: (2025)
CoVR-R: Reason-Aware Composed Video Retrieval
by: Thawakar, Omkar, et al.
Published: (2026)
Cross-Modal Self-Training: Aligning Images and Pointclouds to Learn Classification without Labels
by: Dharmasiri, Amaya, et al.
Published: (2024)
Unifying Local and Global Multimodal Features for Place Recognition in Aliased and Low-Texture Environments
by: García-Hernández, Alberto, et al.
Published: (2024)
Interpretable Zero-Shot Learning with Locally-Aligned Vision-Language Model
by: Chen, Shiming, et al.
Published: (2025)
Grey Level Texture Features for Segmentation of Chromogenic Dye RNAscope From Breast Cancer Tissue
by: Davidson, Andrew, et al.
Published: (2024)
TransResNet: Integrating the Strengths of ViTs and CNNs for High Resolution Medical Image Segmentation via Feature Grafting
by: Sharif, Muhammad Hamza, et al.
Published: (2024)
Thinking Beyond Labels: Vocabulary-Free Fine-Grained Recognition using Reasoning-Augmented LMMs
by: Demidov, Dmitry, et al.
Published: (2025)
ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection
by: Noman, Mubashir, et al.
Published: (2024)
MorphDistill: Distilling Unified Morphological Knowledge from Pathology Foundation Models for Colorectal Cancer Survival Prediction
by: Khan, Hikmat, et al.
Published: (2026)
CasTex: Cascaded Text-to-Texture Synthesis via Explicit Texture Maps and Physically-Based Shading
by: Aliev, Mishan, et al.
Published: (2025)
Few-Shot Classification and Anatomical Localization of Tissues in SPECT Imaging
by: Khan, Mohammed Abdul Hafeez, et al.
Published: (2025)
ALMRR: Anomaly Localization Mamba on Industrial Textured Surface with Feature Reconstruction and Refinement
by: Qu, Shichen, et al.
Published: (2024)
Learning on the Manifold: Unlocking Standard Diffusion Transformers with Representation Encoders
by: Kumar, Amandeep, et al.
Published: (2026)
Open-Vocabulary Temporal Action Localization using Multimodal Guidance
by: Gupta, Akshita, et al.
Published: (2024)
Multi-modal Generation via Cross-Modal In-Context Learning
by: Kumar, Amandeep, et al.
Published: (2024)
GenQ: Quantization in Low Data Regimes with Generative Synthetic Data
by: Li, Yuhang, et al.
Published: (2023)
Structural and Statistical Texture Knowledge Distillation for Semantic Segmentation
by: Ji, Deyi, et al.
Published: (2023)
Structural and Statistical Texture Knowledge Distillation and Learning for Segmentation
by: Ji, Deyi, et al.
Published: (2025)
Vision Backbone Efficient Selection for Image Classification in Low-Data Regimes
by: Guerin, Joris, et al.
Published: (2024)
Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
by: Malik, Hashmat Shadab, et al.
Published: (2025)
Texture-guided Coding for Deep Features
by: Xiong, Lei, et al.
Published: (2024)
Contour Refinement using Discrete Diffusion in Low Data Regime
by: Guan, Fei Yu, et al.
Published: (2026)
Enhancing Classification of Streaming Data with Image Distillation
by: Khatib, Rwad, et al.
Published: (2025)
GTFMN: Guided Texture and Feature Modulation Network for Low-Light Image Enhancement and Super-Resolution
by: Huang, Yongsong, et al.
Published: (2026)
Towards Evaluating the Robustness of Visual State Space Models
by: Malik, Hashmat Shadab, et al.
Published: (2024)
VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
by: Maaz, Muhammad, et al.
Published: (2024)
Blind Localization and Clustering of Anomalies in Textures
by: Ardelean, Andrei-Timotei, et al.
Published: (2024)
Defending Deepfake via Texture Feature Perturbation
by: Zhang, Xiao, et al.
Published: (2025)
CEPO: RLVR Self-Distillation using Contrastive Evidence Policy Optimization
by: Heakl, Ahmed, et al.
Published: (2026)
Chaotic Contrastive Learning for Robust Texture Classification
by: Florindo, Joao B
Published: (2026)
Vision Mamba Distillation for Low-resolution Fine-grained Image Classification
by: Chen, Yao, et al.
Published: (2024)
Example-Based Feature Painting on Textures
by: Ardelean, Andrei-Timotei, et al.
Published: (2025)
Language Guided Domain Generalized Medical Image Segmentation
by: Kunhimon, Shahina, et al.
Published: (2024)
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models
by: Maaz, Muhammad, et al.
Published: (2023)
Video-R2: Reinforcing Consistent and Grounded Reasoning in Multimodal Language Models
by: Maaz, Muhammad, et al.
Published: (2025)