:: Library Catalog

Bibliographic Details
Main Authors:	Xie, Wen; Zhu, Yanjun; Overgoor, Gijs; Bart, Yakov; Garcia, Agata Lapedriza; Ostadabbas, Sarah
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Information Retrieval; Multimedia; 68T05; I.4.0; H.3.1; I.2.10; K.4.4
Online Access:	https://arxiv.org/abs/2510.26569

Similar Items

Bridging Knowledge Gap Between Image Inpainting and Large-Area Visible Watermark Removal
by: Leng, Yicheng, et al.
Published: (2025)

A Roadmap for Multilingual, Multimodal Domain Independent Deception Detection
by: Boumber, Dainis, et al.
Published: (2024)

U-Net-Like Spiking Neural Networks for Single Image Dehazing
by: Li, Huibin, et al.
Published: (2025)

AVControl: Efficient Framework for Training Audio-Visual Controls
by: Ben-Yosef, Matan, et al.
Published: (2026)

Visual Style Prompt Learning Using Diffusion Models for Blind Face Restoration
by: Lu, Wanglong, et al.
Published: (2024)

FACEMUG: A Multimodal Generative and Fusion Framework for Local Facial Editing
by: Lu, Wanglong, et al.
Published: (2024)

Leum-VL Technical Report
by: He, Yuxuan, et al.
Published: (2026)

CLIP-Joint-Detect: End-to-End Joint Training of Object Detectors with Contrastive Vision-Language Supervision
by: Raoufi, Behnam, et al.
Published: (2025)

FLD+: Data-efficient Evaluation Metric for Generative Models
by: Jeevan, Pranav, et al.
Published: (2024)

WaveMixSR-V2: Enhancing Super-resolution with Higher Efficiency
by: Jeevan, Pranav, et al.
Published: (2024)

Normalizing Flow-Based Metric for Image Generation
by: Jeevan, Pranav, et al.
Published: (2024)

Geo2Sound: A Scalable Geo-Aligned Framework for Soundscape Generation from Satellite Imagery
by: Wu, Kunlin, et al.
Published: (2026)

A Hybrid Deterministic Framework for Named Entity Extraction in Broadcast News Video
by: Lucas, Andrea Filiberto, et al.
Published: (2026)

Learning Joint Denoising, Demosaicing, and Compression from the Raw Natural Image Noise Dataset
by: Brummer, Benoit, et al.
Published: (2025)

Light Future: Multimodal Action Frame Prediction via InstructPix2Pix
by: Zhong, Zesen, et al.
Published: (2025)

AIM 2024 Challenge on Video Saliency Prediction: Methods and Results
by: Moskalenko, Andrey, et al.
Published: (2024)

NTIRE 2026 Challenge on Video Saliency Prediction: Methods and Results
by: Moskalenko, Andrey, et al.
Published: (2026)

Phase-Aware Wavelet-Based-Scattering Encoder-Decoder for Dense Predictions
by: Marrakchi, Ghassen, et al.
Published: (2026)

Scene Detection Policies and Keyframe Extraction Strategies for Large-Scale Video Analysis
by: Korolkov, Vasilii
Published: (2025)

EDSNet: Efficient-DSNet for Video Summarization
by: Prasad, Ashish, et al.
Published: (2024)

A Real-Time Diminished Reality Approach to Privacy in MR Collaboration
by: Fane, Christian
Published: (2025)

PCRI: Measuring Context Robustness in Multimodal Models for Enterprise Applications
by: Patel, Hitesh Laxmichand, et al.
Published: (2025)

Graph-PiT: Enhancing Structural Coherence in Part-Based Image Synthesis via Graph Priors
by: Zhang, Junbin, et al.
Published: (2026)

MetaErr: Towards Predicting Error Patterns in Deep Neural Networks
by: Totakura, Varun, et al.
Published: (2026)

CCVA-FL: Cross-Client Variations Adaptive Federated Learning for Medical Imaging
by: Gupta, Sunny, et al.
Published: (2024)

Taming the Tail: Leveraging Asymmetric Loss and Pade Approximation to Overcome Medical Image Long-Tailed Class Imbalance
by: Kashyap, Pankhi, et al.
Published: (2024)

HY-Himmel Technical Report: Hierarchical Interleaved Multi-stream Motion Encoding for Long Video Understanding
by: Jin, Haopeng, et al.
Published: (2026)

Digital analysis of early color photographs taken using regular color screen processes
by: Hubička, Jan, et al.
Published: (2023)

Semantic2Graph: Graph-based Multi-modal Feature Fusion for Action Segmentation in Videos
by: Zhang, Junbin, et al.
Published: (2022)

RCI: A Score for Evaluating Global and Local Reasoning in Multimodal Benchmarks
by: Agarwal, Amit, et al.
Published: (2025)

Efficient and Privacy-Protecting Background Removal for 2D Video Streaming using iPhone 15 Pro Max LiDAR
by: Kinnevan, Jessica, et al.
Published: (2025)

ForensicFormer: Hierarchical Multi-Scale Reasoning for Cross-Domain Image Forgery Detection
by: Samson, Hema Hariharan
Published: (2026)

Evaluation Metric for Quality Control and Generative Models in Histopathology Images
by: Jeevan, Pranav, et al.
Published: (2024)

Development of ultra-high efficiency soft X-ray angle-resolved photoemission spectroscopy equipped with deep prior-based denoising method
by: Yamagami, Kohei, et al.
Published: (2025)

Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention
by: Korolkov, Vasilii, et al.
Published: (2025)

DSCSNet: A Dynamic Sparse Compression Sensing Network for Closely-Spaced Infrared Small Target Unmixing
by: Tang, Zhiyang, et al.
Published: (2026)

WaveMix: A Resource-efficient Neural Network for Image Analysis
by: Jeevan, Pranav, et al.
Published: (2022)

Which Backbone to Use: A Resource-efficient Domain Specific Comparison for Computer Vision
by: Jeevan, Pranav, et al.
Published: (2024)

Image and Video Compression using Generative Sparse Representation with Fidelity Controls
by: Jiang, Wei, et al.
Published: (2024)

Image-Based Leopard Seal Recognition: Approaches and Challenges in Current Automated Systems
by: Salazar, Jorge Yero, et al.
Published: (2024)