| Main Authors: | Xie, Wen; Zhu, Yanjun; Overgoor, Gijs; Bart, Yakov; Garcia, Agata Lapedriza; Ostadabbas, Sarah |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2510.26569 |
Similar Items
Bridging Knowledge Gap Between Image Inpainting and Large-Area Visible Watermark Removal
by: Leng, Yicheng, et al.
Published: (2025)
A Roadmap for Multilingual, Multimodal Domain Independent Deception Detection
by: Boumber, Dainis, et al.
Published: (2024)
U-Net-Like Spiking Neural Networks for Single Image Dehazing
by: Li, Huibin, et al.
Published: (2025)
AVControl: Efficient Framework for Training Audio-Visual Controls
by: Ben-Yosef, Matan, et al.
Published: (2026)
Visual Style Prompt Learning Using Diffusion Models for Blind Face Restoration
by: Lu, Wanglong, et al.
Published: (2024)
FACEMUG: A Multimodal Generative and Fusion Framework for Local Facial Editing
by: Lu, Wanglong, et al.
Published: (2024)
Leum-VL Technical Report
by: He, Yuxuan, et al.
Published: (2026)
CLIP-Joint-Detect: End-to-End Joint Training of Object Detectors with Contrastive Vision-Language Supervision
by: Raoufi, Behnam, et al.
Published: (2025)
FLD+: Data-efficient Evaluation Metric for Generative Models
by: Jeevan, Pranav, et al.
Published: (2024)
WaveMixSR-V2: Enhancing Super-resolution with Higher Efficiency
by: Jeevan, Pranav, et al.
Published: (2024)
Normalizing Flow-Based Metric for Image Generation
by: Jeevan, Pranav, et al.
Published: (2024)
Geo2Sound: A Scalable Geo-Aligned Framework for Soundscape Generation from Satellite Imagery
by: Wu, Kunlin, et al.
Published: (2026)
A Hybrid Deterministic Framework for Named Entity Extraction in Broadcast News Video
by: Lucas, Andrea Filiberto, et al.
Published: (2026)
Learning Joint Denoising, Demosaicing, and Compression from the Raw Natural Image Noise Dataset
by: Brummer, Benoit, et al.
Published: (2025)
Light Future: Multimodal Action Frame Prediction via InstructPix2Pix
by: Zhong, Zesen, et al.
Published: (2025)
AIM 2024 Challenge on Video Saliency Prediction: Methods and Results
by: Moskalenko, Andrey, et al.
Published: (2024)
NTIRE 2026 Challenge on Video Saliency Prediction: Methods and Results
by: Moskalenko, Andrey, et al.
Published: (2026)
Phase-Aware Wavelet-Based-Scattering Encoder-Decoder for Dense Predictions
by: Marrakchi, Ghassen, et al.
Published: (2026)
Scene Detection Policies and Keyframe Extraction Strategies for Large-Scale Video Analysis
by: Korolkov, Vasilii
Published: (2025)
EDSNet: Efficient-DSNet for Video Summarization
by: Prasad, Ashish, et al.
Published: (2024)
A Real-Time Diminished Reality Approach to Privacy in MR Collaboration
by: Fane, Christian
Published: (2025)
PCRI: Measuring Context Robustness in Multimodal Models for Enterprise Applications
by: Patel, Hitesh Laxmichand, et al.
Published: (2025)
Graph-PiT: Enhancing Structural Coherence in Part-Based Image Synthesis via Graph Priors
by: Zhang, Junbin, et al.
Published: (2026)
MetaErr: Towards Predicting Error Patterns in Deep Neural Networks
by: Totakura, Varun, et al.
Published: (2026)
CCVA-FL: Cross-Client Variations Adaptive Federated Learning for Medical Imaging
by: Gupta, Sunny, et al.
Published: (2024)
Taming the Tail: Leveraging Asymmetric Loss and Pade Approximation to Overcome Medical Image Long-Tailed Class Imbalance
by: Kashyap, Pankhi, et al.
Published: (2024)
HY-Himmel Technical Report: Hierarchical Interleaved Multi-stream Motion Encoding for Long Video Understanding
by: Jin, Haopeng, et al.
Published: (2026)
Digital analysis of early color photographs taken using regular color screen processes
by: Hubička, Jan, et al.
Published: (2023)
Semantic2Graph: Graph-based Multi-modal Feature Fusion for Action Segmentation in Videos
by: Zhang, Junbin, et al.
Published: (2022)
RCI: A Score for Evaluating Global and Local Reasoning in Multimodal Benchmarks
by: Agarwal, Amit, et al.
Published: (2025)
Efficient and Privacy-Protecting Background Removal for 2D Video Streaming using iPhone 15 Pro Max LiDAR
by: Kinnevan, Jessica, et al.
Published: (2025)
ForensicFormer: Hierarchical Multi-Scale Reasoning for Cross-Domain Image Forgery Detection
by: Samson, Hema Hariharan
Published: (2026)
Evaluation Metric for Quality Control and Generative Models in Histopathology Images
by: Jeevan, Pranav, et al.
Published: (2024)
Development of ultra-high efficiency soft X-ray angle-resolved photoemission spectroscopy equipped with deep prior-based denoising method
by: Yamagami, Kohei, et al.
Published: (2025)
Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention
by: Korolkov, Vasilii, et al.
Published: (2025)
DSCSNet: A Dynamic Sparse Compression Sensing Network for Closely-Spaced Infrared Small Target Unmixing
by: Tang, Zhiyang, et al.
Published: (2026)
WaveMix: A Resource-efficient Neural Network for Image Analysis
by: Jeevan, Pranav, et al.
Published: (2022)
Which Backbone to Use: A Resource-efficient Domain Specific Comparison for Computer Vision
by: Jeevan, Pranav, et al.
Published: (2024)
Image and Video Compression using Generative Sparse Representation with Fidelity Controls
by: Jiang, Wei, et al.
Published: (2024)
Image-Based Leopard Seal Recognition: Approaches and Challenges in Current Automated Systems
by: Salazar, Jorge Yero, et al.
Published: (2024)