| Main Authors: | Gamal, Mai, Rashad, Mohamed, Ehab, Eman, Eldawlatly, Seif, Siam, Mennatullah |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.12519 |
Similar Items
PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models?
by: Siam, Mennatullah
Published: (2025)
PixFoundation 2.0: Do Video Multi-Modal LLMs Use Motion in Visual Grounding?
by: Siam, Mennatullah
Published: (2025)
Multiscale Video Transformers for Class Agnostic Segmentation in Autonomous Driving
by: Cheshmi, Leila, et al.
Published: (2025)
Towards a Better Understanding of the Computer Vision Research Community in Africa
by: Omotayo, Abdul-Hakeem, et al.
Published: (2023)
TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking
by: Goyal, Raghav, et al.
Published: (2023)
MED-VT++: Unifying Multimodal Learning with a Multiscale Encoder-Decoder Video Transformer
by: Karim, Rezaul, et al.
Published: (2023)
The State of Computer Vision Research in Africa
by: Omotayo, Abdul-Hakeem, et al.
Published: (2024)
Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach
by: Hossain, Mir Rayat Imtiaz, et al.
Published: (2024)
The Power of One: A Single Example is All it Takes for Segmentation in VLMs
by: Hossain, Mir Rayat Imtiaz, et al.
Published: (2025)
Generalized Few-Shot Semantic Segmentation in Remote Sensing: Challenge and Benchmark
by: Broni-Bediako, Clifford, et al.
Published: (2024)
Quantifying and Learning Static vs. Dynamic Information in Deep Spatiotemporal Networks
by: Kowal, Matthew, et al.
Published: (2022)
CephRes-MHNet: A Multi-Head Residual Network for Accurate and Robust Cephalometric Landmark Detection
by: Jaheen, Ahmed, et al.
Published: (2025)
Arabic-Nougat: Fine-Tuning Vision Transformers for Arabic OCR and Markdown Extraction
by: Rashad, Mohamed
Published: (2024)
Real-Time Neural Video Compression with Unified Intra and Inter Coding
by: Xiang, Hui, et al.
Published: (2025)
A Vision Centric Remote Sensing Benchmark
by: Adejumo, Abduljaleel, et al.
Published: (2025)
IIP-Transformer: Intra-Inter-Part Transformer for Skeleton-Based Action Recognition
by: Wang, Qingtian, et al.
Published: (2021)
Inter- and Intra-image Refinement for Few Shot Segmentation
by: Fu, Ourui, et al.
Published: (2025)
DHR: Dual Features-Driven Hierarchical Rebalancing in Inter- and Intra-Class Regions for Weakly-Supervised Semantic Segmentation
by: Jo, Sanghyun, et al.
Published: (2024)
$I^{2}$-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting
by: Liao, Zhimin, et al.
Published: (2025)
Uncertainty-Guided Attention and Entropy-Weighted Loss for Precise Plant Seedling Segmentation
by: Ehab, Mohamed, et al.
Published: (2026)
Intra and Inter Parser-Prompted Transformers for Effective Image Restoration
by: Wang, Cong, et al.
Published: (2025)
Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention
by: Zhou, Xingyu, et al.
Published: (2024)
Advancing Automated Deception Detection: A Multimodal Approach to Feature Extraction and Analysis
by: Bahaa, Mohamed, et al.
Published: (2024)
I2CKD : Intra- and Inter-Class Knowledge Distillation for Semantic Segmentation
by: Karine, Ayoub, et al.
Published: (2024)
Detail-Enhanced Intra- and Inter-modal Interaction for Audio-Visual Emotion Recognition
by: Shi, Tong, et al.
Published: (2024)
RayFormer: Modeling Inter- and Intra-Ray Similarity for NeRF-Based Video Snapshot Compressive Imaging
by: Dong, Yubo, et al.
Published: (2026)
DVN-SLAM: Dynamic Visual Neural SLAM Based on Local-Global Encoding
by: Wu, Wenhua, et al.
Published: (2024)
CRA-PCN: Point Cloud Completion with Intra- and Inter-level Cross-Resolution Transformers
by: Rong, Yi, et al.
Published: (2024)
Dose Prediction Driven Radiotherapy Paramters Regression via Intra- and Inter-Relation Modeling
by: Cui, Jiaqi, et al.
Published: (2024)
Swin Transformer for Robust Differentiation of Real and Synthetic Images: Intra- and Inter-Dataset Analysis
by: Mehta, Preetu, et al.
Published: (2024)
IIDM: Inter and Intra-domain Mixing for Semi-supervised Domain Adaptation in Semantic Segmentation
by: Fu, Weifu, et al.
Published: (2023)
Universal Incremental Learning: Mitigating Confusion from Inter- and Intra-task Distribution Randomness
by: Luo, Sheng, et al.
Published: (2025)
MoIIE: Mixture of Intra- and Inter-Modality Experts for Large Vision Language Models
by: Wang, Dianyi, et al.
Published: (2025)
Evaluating and Enhancing Segmentation Model Robustness with Metamorphic Testing
by: Mzoughi, Seif, et al.
Published: (2025)
M2I2HA: Multi-modal Object Detection Based on Intra- and Inter-Modal Hypergraph Attention
by: Yang, Xiaofan, et al.
Published: (2026)
iPac: Incorporating Intra-image Patch Context into Graph Neural Networks for Medical Image Classification
by: Zidan, Usama, et al.
Published: (2025)
Enhanced Partially Relevant Video Retrieval through Inter- and Intra-Sample Analysis with Coherence Prediction
by: Ren, Junlong, et al.
Published: (2025)
MEJO: MLLM-Engaged Surgical Triplet Recognition via Inter- and Intra-Task Joint Optimization
by: Zhang, Yiyi, et al.
Published: (2025)
DC-ControlNet: Decoupling Inter- and Intra-Element Conditions in Image Generation with Diffusion Models
by: Yang, Hongji, et al.
Published: (2025)
Beyond Sequential Distance: Inter-Modal Distance Invariant Position Encoding
by: Chen, Lin, et al.
Published: (2026)