:: Library Catalog

Bibliographic Details
Main Authors:	Triaridis, Kostas; Kaliosis, Panagiotis; Nguyen, E-Ro; Xu, Jingyi; Le, Hieu; Samaras, Dimitris
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2505.22850

Similar Items

Phrase-Instance Alignment for Generalized Referring Segmentation
by: Nguyen, E-Ro, et al.
Published: (2024)

Assessing Sample Quality via the Latent Space of Generative Models
by: Xu, Jingyi, et al.
Published: (2024)

Importance-Based Token Merging for Efficient Image and Video Generation
by: Wu, Haoyu, et al.
Published: (2024)

Mitigating Diffusion Model Hallucinations with Dynamic Guidance
by: Triaridis, Kostas, et al.
Published: (2025)

Talking Head Generation via AU-Guided Landmark Prediction
by: Chang, Shao-Yu, et al.
Published: (2025)

One Attention, One Scale: Phase-Aligned Rotary Positional Embeddings for Mixed-Resolution Diffusion Transformer
by: Wu, Haoyu, et al.
Published: (2025)

Embedding Physical Reasoning into Diffusion-Based Shadow Generation
by: Hu, Shilin, et al.
Published: (2025)

Cast and Attached Shadow Detection via Iterative Light and Geometry Reasoning
by: Hu, Shilin, et al.
Published: (2025)

CORA: Consistency-Guided Semi-Supervised Framework for Reasoning Segmentation
by: Howlader, Prantik, et al.
Published: (2025)

Weighting Pseudo-Labels via High-Activation Feature Index Similarity and Object Detection for Semi-Supervised Segmentation
by: Howlader, Prantik, et al.
Published: (2024)

ZoomLDM: Latent Diffusion Model for multi-scale image generation
by: Yellapragada, Srikar, et al.
Published: (2024)

Learning to Align: Addressing Character Frequency Distribution Shifts in Handwritten Text Recognition
by: Kaliosis, Panagiotis, et al.
Published: (2025)

Beyond Pixels: Semi-Supervised Semantic Segmentation with a Multi-scale Patch-based Multi-Label Classifier
by: Howlader, Prantik, et al.
Published: (2024)

Few-shot Personalized Scanpath Prediction
by: Xue, Ruoyu, et al.
Published: (2025)

MMFusion: Combining Image Forensic Filters for Visual Manipulation Detection and Localization
by: Triaridis, Kostas, et al.
Published: (2023)

Personalized Image Descriptions from Attention Sequences
by: Xue, Ruoyu, et al.
Published: (2025)

Shadow Removal Refinement via Material-Consistent Shadow Edges
by: Hu, Shilin, et al.
Published: (2024)

Pathology Image Compression with Pre-trained Autoencoders
by: Yellapragada, Srikar, et al.
Published: (2025)

JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation
by: Chakkera, Sai Tanmay Reddy, et al.
Published: (2024)

Counting Stacked Objects
by: Dumery, Corentin, et al.
Published: (2024)

Self-supervised co-salient object detection via feature correspondence at multiple scales
by: Chakraborty, Souradeep, et al.
Published: (2024)

Automated Counting of Stacked Objects in Industrial Inspection
by: Dumery, Corentin, et al.
Published: (2026)

Decoupling What to Count and Where to See for Referring Expression Counting
by: Zou, Yuda, et al.
Published: (2025)

MI-NeRF: Learning a Single Face NeRF from Multiple Identities
by: Chatziagapi, Aggelina, et al.
Published: (2024)

What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards
by: Le, Minh-Quan, et al.
Published: (2025)

Multi-view Gaze Target Estimation
by: Miao, Qiaomu, et al.
Published: (2025)

Learning 3D Reconstruction with Priors in Test Time
by: Zhou, Lei, et al.
Published: (2026)

TopoDiffusionNet: A Topology-aware Diffusion Model
by: Gupta, Saumya, et al.
Published: (2024)

Learning to Weight Parameters for Training Data Attribution
by: Li, Shuangqi, et al.
Published: (2025)

MIGS: Multi-Identity Gaussian Splatting via Tensor Decomposition
by: Chatziagapi, Aggelina, et al.
Published: (2024)

Exploring Contextual Attribute Density in Referring Expression Counting
by: Wang, Zhicheng, et al.
Published: (2025)

Fast constrained sampling in pre-trained diffusion models
by: Graikos, Alexandros, et al.
Published: (2024)

MonoLoss: A Training Objective for Interpretable Monosemantic Representations
by: Nasiri-Sarvi, Ali, et al.
Published: (2026)

GECKO: Gigapixel Vision-Concept Contrastive Pretraining in Histopathology
by: Kapse, Saarthak, et al.
Published: (2025)

CAKE: Real-time Action Detection via Motion Distillation and Background-aware Contrastive Learning
by: Hoang, Hieu, et al.
Published: (2026)

Learned representation-guided diffusion models for large-image generation
by: Graikos, Alexandros, et al.
Published: (2023)

All Seeds Are Not Equal: Enhancing Compositional Text-to-Image Generation with Reliable Random Seeds
by: Li, Shuangqi, et al.
Published: (2024)

Pairwise-Constrained Implicit Functions for 3D Human Heart Modelling
by: Le, Hieu, et al.
Published: (2023)

PathSegDiff: Pathology Segmentation using Diffusion model representations
by: Danisetty, Sachin Kumar, et al.
Published: (2025)

Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos
by: Rivero, Alfredo, et al.
Published: (2024)