:: Library Catalog

Bibliographic Details
Main Authors:	Vo, Thanh-Nhan; Nguyen, Trong-Thuan; Nguyen, Tam V.; Tran, Minh-Triet
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.21498

Similar Items

VENUS: Visual Editing with Noise Inversion Using Scene Graphs
by: Vo, Thanh-Nhan, et al.
Published: (2026)

SATURN: Autoregressive Image Generation Guided by Scene Graphs
by: Vo, Thanh-Nhan, et al.
Published: (2025)

THYME: Temporal Hierarchical-Cyclic Interactivity Modeling for Video Scene Graphs in Aerial Footage
by: Nguyen, Trong-Thuan, et al.
Published: (2025)

CPAM: Context-Preserving Adaptive Manipulation for Zero-Shot Real Image Editing
by: Vo, Dinh-Khoi, et al.
Published: (2025)

HIG: Hierarchical Interlacement Graph Approach to Scene Graph Generation in Video Understanding
by: Nguyen, Trong-Thuan, et al.
Published: (2023)

Automated Image Recognition Framework
by: Nguyen, Quang-Binh, et al.
Published: (2025)

HyperGLM: HyperGraph for Video Scene Graph Generation and Anticipation
by: Nguyen, Trong-Thuan, et al.
Published: (2024)

ShowFlow: From Robust Single Concept to Condition-Free Multi-Concept Generation
by: Hoang, Trong-Vu, et al.
Published: (2025)

Toward Content-based Indexing and Retrieval of Head and Neck CT with Abscess Segmentation
by: Dao, Thao Thi Phuong, et al.
Published: (2025)

Bridging the Training-Deployment Gap: Gated Encoding and Multi-Scale Refinement for Efficient Quantization-Aware Image Enhancement
by: To-Thanh, Dat, et al.
Published: (2026)

GenKOL: Modular Generative AI Framework For Scalable Virtual KOL Generation
by: To, Tan-Hiep, et al.
Published: (2025)

ACM Multimedia Grand Challenge on ENT Endoscopy Analysis
by: Nguyen, Trong-Thuan, et al.
Published: (2025)

Shape2Animal: Creative Animal Generation from Natural Silhouettes
by: Tran, Quoc-Duy, et al.
Published: (2025)

PANDORA: Pixel-wise Attention Dissolution and Latent Guidance for Zero-Shot Object Removal
by: Vo, Dinh-Khoi, et al.
Published: (2026)

CamoFA: A Learnable Fourier-based Augmentation for Camouflage Segmentation
by: Le, Minh-Quan, et al.
Published: (2023)

KiseKloset for Fashion Retrieval and Recommendation
by: Phan-Nguyen, Thanh-Tung, et al.
Published: (2025)

MasHeNe: A Benchmark for Head and Neck CT Mass Segmentation using Window-Enhanced Mamba with Frequency-Domain Integration
by: Dao, Thao Thi Phuong, et al.
Published: (2025)

TaleForge: Interactive Multimodal System for Personalized Story Creation
by: Nguyen, Minh-Loi, et al.
Published: (2025)

The Art of Camouflage: Few-Shot Learning for Animal Detection and Segmentation
by: Nguyen, Thanh-Danh, et al.
Published: (2023)

EVENT-Retriever: Event-Aware Multimodal Image Retrieval for Realistic Captions
by: Vo, Dinh-Khoi, et al.
Published: (2025)

OpenEvents V1: Large-Scale Benchmark Dataset for Multimodal Event Grounding
by: Nguyen, Hieu, et al.
Published: (2025)

MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation
by: Le, Minh-Quan, et al.
Published: (2023)

STER-VLM: Spatio-Temporal With Enhanced Reference Vision-Language Models
by: Nguyen-Nhu, Tinh-Anh, et al.
Published: (2025)

Interactive Interface For Semantic Segmentation Dataset Synthesis
by: Tran, Ngoc-Do, et al.
Published: (2025)

TF-SASM: Training-free Spatial-aware Sparse Memory for Multi-object Tracking
by: Nguyen-Quang, Thuc, et al.
Published: (2024)

Event-Enriched Image Analysis Grand Challenge at ACM Multimedia 2025
by: Tran, Thien-Phuc, et al.
Published: (2025)

GUNNEL: Guided Mixup Augmentation and Multi-Model Fusion for Aquatic Animal Segmentation
by: Le, Minh-Quan, et al.
Published: (2021)

ARtVista: Gateway To Empower Anyone Into Artist
by: Hoang, Trong-Vu, et al.
Published: (2024)

Revolutionizing Precise Low Back Pain Diagnosis via Contrastive Learning
by: Le, Thanh Binh, et al.
Published: (2025)

Efficient 3D Brain Tumor Segmentation with Axial-Coronal-Sagittal Embedding
by: Huynh, Tuan-Luc, et al.
Published: (2025)

VisionGuard: Synergistic Framework for Helmet Violation Detection
by: Nguyen, Lam-Huy, et al.
Published: (2025)

SHREC 2025: Retrieval of Optimal Objects for Multi-modal Enhanced Language and Spatial Assistance (ROOMELSA)
by: Nguyen, Trong-Thuan, et al.
Published: (2025)

Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding
by: Nguyen-Truong, Hai, et al.
Published: (2024)

EDGER: EDge-Guided with HEatmap Refinement for Generalizable Image Forgery Localization
by: Le-Phan, Minh-Khoa, et al.
Published: (2026)

iCONTRA: Toward Thematic Collection Design Via Interactive Concept Transfer
by: Vo, Dinh-Khoi, et al.
Published: (2024)

SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion
by: Nguyen, Trong-Tung, et al.
Published: (2024)

SAMURAI: Shape-Aware Multimodal Retrieval for 3D Object Identification
by: Vo, Dinh-Khoi, et al.
Published: (2025)

FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing
by: Nguyen, Trong-Tung, et al.
Published: (2024)

ReCap: Event-Aware Image Captioning with Article Retrieval and Semantic Gaussian Normalization
by: Nguyen, Thinh-Phuc, et al.
Published: (2025)

CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos
by: Nguyen, Trong-Thuan, et al.
Published: (2024)