| Main Authors: | Nguyen, Phuong-Anh, Pham, Tien Anh, Le, Duc-Trong, Nguyen, Cam-Van Thi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.19718 |
Similar Items
MissBench: Benchmarking Multimodal Affective Analysis under Imbalanced Missing Modalities
by: Pham, Tien Anh, et al.
Published: (2026)
FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing
by: Nguyen, Trong-Tung, et al.
Published: (2024)
CLEAR: Causal Learning Framework For Robust Histopathology Tumor Detection Under Out-Of-Distribution Shifts
by: Thi, Kieu-Anh Truong, et al.
Published: (2025)
Leveraging Self-Paced Curriculum Learning for Enhanced Modality Balance in Multimodal Conversational Emotion Recognition
by: Nguyen, Phuong-Anh, et al.
Published: (2026)
Efficient INT8 Single-Image Super-Resolution via Deployment-Aware Quantization and Teacher-Guided Training
by: Nguyen, Pham Phuong Nam, et al.
Published: (2026)
A Dual-Module Denoising Approach with Curriculum Learning for Enhancing Multimodal Aspect-Based Sentiment Analysis
by: Van Doan, Nguyen, et al.
Published: (2024)
MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering
by: Pham, Trong-Thang, et al.
Published: (2026)
Virtual Fusion with Contrastive Learning for Single Sensor-based Activity Recognition
by: Nguyen, Duc-Anh, et al.
Published: (2023)
SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion
by: Nguyen, Trong-Tung, et al.
Published: (2024)
Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance
by: Pham, Duc-Hai, et al.
Published: (2024)
InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting
by: Vu, Duc, et al.
Published: (2026)
Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition
by: Nguyen, Cam-Van Thi, et al.
Published: (2024)
Any3DIS: Class-Agnostic 3D Instance Segmentation by 2D Mask Tracking
by: Nguyen, Phuc, et al.
Published: (2024)
Interpreting Radiologist's Intention from Eye Movements in Chest X-ray Diagnosis
by: Pham, Trong-Thang, et al.
Published: (2025)
Towards a text-based quantitative and explainable histopathology image analysis
by: Nguyen, Anh Tien, et al.
Published: (2024)
Linguistically Informed Multimodal Fusion for Vietnamese Scene-Text Image Captioning: Dataset, Graph Framework, and Phonological Attention
by: Nguyen, Nhi Ngoc-Yen, et al.
Published: (2026)
UniSemAlign: Text-Prototype Alignment with a Foundation Encoder for Semi-Supervised Histopathology Segmentation
by: Thai, Le-Van, et al.
Published: (2026)
Towards Efficient and Robust Moment Retrieval System: A Unified Framework for Multi-Granularity Models and Temporal Reranking
by: Tran, Huu-Loc, et al.
Published: (2025)
Divide and Refine: Enhancing Multimodal Representation and Explainability for Emotion Recognition in Conversation
by: Mai, Anh-Tuan, et al.
Published: (2026)
CT-ScanGaze: A Dataset and Baselines for 3D Volumetric Scanpath Modeling
by: Pham, Trong-Thang, et al.
Published: (2025)
SwiftPie: Lightning-fast Subject-driven Image Personalization via One step Diffusion
by: Duong, Huy, et al.
Published: (2026)
WAVER: Writing-style Agnostic Text-Video Retrieval via Distilling Vision-Language Models Through Open-Vocabulary Knowledge
by: Le, Huy, et al.
Published: (2023)
Supercharged One-step Text-to-Image Diffusion Models with Negative Prompts
by: Nguyen, Viet, et al.
Published: (2024)
Hierarchical Neural Collapse Detection Transformer for Class Incremental Object Detection
by: Pham, Duc Thanh, et al.
Published: (2025)
SuMa: A Subspace Mapping Approach for Robust and Effective Concept Erasure in Text-to-Image Diffusion Models
by: Nguyen, Kien, et al.
Published: (2025)
Improved Training Technique for Shortcut Models
by: Nguyen, Anh, et al.
Published: (2025)
PRE: Vision-Language Prompt Learning with Reparameterization Encoder
by: Pham, Thi Minh Anh, et al.
Published: (2023)
Anti-I2V: Safeguarding your photos from malicious image-to-video generation
by: Vu, Duc, et al.
Published: (2026)
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
by: Dao, Trung, et al.
Published: (2024)
CutPaste&Find: Efficient Multimodal Hallucination Detector with Visual-aid Knowledge Base
by: Nguyen, Cong-Duy, et al.
Published: (2025)
A Hybrid Vision Transformer Approach for Mathematical Expression Recognition
by: Le, Anh Duy, et al.
Published: (2026)
ThyroidEffi 1.0: A Cost-Effective System for High-Performance Multi-Class Thyroid Carcinoma Classification
by: Pham-Ngoc, Hai, et al.
Published: (2025)
Toward a Vision-Language Foundation Model for Medical Data: Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation
by: Nguyen, Huu Tien, et al.
Published: (2025)
PDIWS: Thermal Imaging Dataset for Person Detection in Intrusion Warning Systems
by: Thuan, Nguyen Duc, et al.
Published: (2023)
KTVIC: A Vietnamese Image Captioning Dataset on the Life Domain
by: Pham, Anh-Cuong, et al.
Published: (2024)
LLandMark: A Multi-Agent Framework for Landmark-Aware Multimodal Interactive Video Retrieval
by: Phung, Minh-Chi, et al.
Published: (2026)
A Novel Combined Optical Flow Approach for Comprehensive Micro-Expression Recognition
by: Khuong, Vu Tram Anh, et al.
Published: (2025)
FMANet: A Novel Dual-Phase Optical Flow Approach with Fusion Motion Attention Network for Robust Micro-expression Recognition
by: Nguyen, Luu Tu, et al.
Published: (2025)
Adaptive Fusion Network with Temporal-Ranked and Motion-Intensity Dynamic Images for Micro-expression Recognition
by: Man, Thi Bich Phuong, et al.
Published: (2025)
Dual-View Optical Flow for 4D Micro-Expression Recognition - A Multi-Stream Fusion Attention Approach
by: Nguyen, Luu Tu, et al.
Published: (2026)