| Main Authors: | Song, Tianyu, Duong, Van-Doan, Le, Thi-Phuong, Ta, Ton Viet |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.10938 |
Similar Items
Ensemble Learning for Vietnamese Scene Text Spotting in Urban Environments
by: Nguyen, Hieu, et al.
Published: (2024)
ViCocktail: Automated Multi-Modal Data Collection for Vietnamese Audio-Visual Speech Recognition
by: Nguyen, Thai-Binh, et al.
Published: (2025)
A Dual-Module Denoising Approach with Curriculum Learning for Enhancing Multimodal Aspect-Based Sentiment Analysis
by: Van Doan, Nguyen, et al.
Published: (2024)
Multi-Level CLS Token Fusion for Contrastive Learning in Endoscopy Image Classification
by: Nguyen, Y Hop, et al.
Published: (2025)
BALM: A Model-Agnostic Framework for Balanced Multimodal Learning under Imbalanced Missing Rates
by: Nguyen, Phuong-Anh, et al.
Published: (2026)
Physics-informed Ground Reaction Dynamics from Human Motion Capture
by: Le, Cuong, et al.
Published: (2025)
Automated Knot Detection and Pairing for Wood Analysis in the Timber Industry
by: Lin, Guohao, et al.
Published: (2025)
Enhancing Vietnamese VQA through Curriculum Learning on Raw and Augmented Text Representations
by: Nguyen, Khoi Anh, et al.
Published: (2025)
KTVIC: A Vietnamese Image Captioning Dataset on the Life Domain
by: Pham, Anh-Cuong, et al.
Published: (2024)
MissBench: Benchmarking Multimodal Affective Analysis under Imbalanced Missing Modalities
by: Pham, Tien Anh, et al.
Published: (2026)
ViInfographicVQA: A Benchmark for Single and Multi-image Visual Question Answering on Vietnamese Infographics
by: Van-Dinh, Tue-Thu, et al.
Published: (2025)
Advancing Vietnamese Visual Question Answering with Transformer and Convolutional Integration
by: Nguyen, Ngoc Son, et al.
Published: (2024)
kabr-tools: Automated Framework for Multi-Species Behavioral Monitoring
by: Kline, Jenna, et al.
Published: (2025)
Comparative Study of UNet-based Architectures for Liver Tumor Segmentation in Multi-Phase Contrast-Enhanced Computed Tomography
by: Ly, Doan-Van-Anh, et al.
Published: (2025)
Efficient Endangered Deer Species Monitoring with UAV Aerial Imagery and Deep Learning
by: Roca, Agustín, et al.
Published: (2025)
Towards High-Fidelity and Controllable Bioacoustic Generation via Enhanced Diffusion Learning
by: Song, Tianyu, et al.
Published: (2025)
Examining Monitoring System: Detecting Abnormal Behavior In Online Examinations
by: Ngo, Dinh An, et al.
Published: (2024)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios
by: Phan, Van-Hoang-Anh, et al.
Published: (2025)
AC-MAMBASEG: An adaptive convolution and Mamba-based architecture for enhanced skin lesion segmentation
by: Nguyen, Viet-Thanh, et al.
Published: (2024)
V-Math: An Agentic Approach to the Vietnamese National High School Graduation Mathematics Exams
by: Nguyen, Duong Q., et al.
Published: (2025)
A Low-Cost Machine Learning Approach for Timber Diameter Estimation
by: Fard, Fatemeh Hasanzadeh, et al.
Published: (2025)
ACM Multimedia Grand Challenge on ENT Endoscopy Analysis
by: Nguyen, Trong-Thuan, et al.
Published: (2025)
A Novel Combined Optical Flow Approach for Comprehensive Micro-Expression Recognition
by: Khuong, Vu Tram Anh, et al.
Published: (2025)
FMANet: A Novel Dual-Phase Optical Flow Approach with Fusion Motion Attention Network for Robust Micro-expression Recognition
by: Nguyen, Luu Tu, et al.
Published: (2025)
Adaptive Fusion Network with Temporal-Ranked and Motion-Intensity Dynamic Images for Micro-expression Recognition
by: Man, Thi Bich Phuong, et al.
Published: (2025)
Dual-View Optical Flow for 4D Micro-Expression Recognition - A Multi-Stream Fusion Attention Approach
by: Nguyen, Luu Tu, et al.
Published: (2026)
DIANet: A Phase-Aware Dual-Stream Network for Micro-Expression Recognition via Dynamic Images
by: Khuong, Vu Tram Anh, et al.
Published: (2025)
Object Detection in Thermal Images Using Deep Learning for Unmanned Aerial Vehicles
by: Tu, Minh Dang, et al.
Published: (2024)
Automated Detection of Salvin's Albatrosses: Improving Deep Learning Tools for Aerial Wildlife Surveys
by: Rogers, Mitchell, et al.
Published: (2025)
GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring
by: Schall, Maximilian, et al.
Published: (2025)
A Survey on Vietnamese Document Analysis and Recognition: Challenges and Future Directions
by: Le, Anh, et al.
Published: (2025)
A Vision-Language Foundation Model for Leaf Disease Identification
by: Quoc, Khang Nguyen, et al.
Published: (2025)
VietMEAgent: Culturally-Aware Few-Shot Multimodal Explanation for Vietnamese Visual Question Answering
by: Nguyen, Hai-Dang, et al.
Published: (2025)
Automated Identification and Segmentation of Hi Sources in CRAFTS Using Deep Learning Method
by: Song, Zihao, et al.
Published: (2024)
Efficient INT8 Single-Image Super-Resolution via Deployment-Aware Quantization and Teacher-Guided Training
by: Nguyen, Pham Phuong Nam, et al.
Published: (2026)
PGDS: Pose-Guidance Deep Supervision for Mitigating Clothes-Changing in Person Re-Identification
by: Trinh, Quoc-Huy, et al.
Published: (2023)
ConstStyle: Robust Domain Generalization with Unified Style Transformation
by: Tran, Nam Duong, et al.
Published: (2025)
An Automated Real-Time Approach for Image Processing and Segmentation of Fluoroscopic Images and Videos Using a Single Deep Learning Network
by: Nguyen, Viet Dung, et al.
Published: (2024)
Bridging Classification and Segmentation in Osteosarcoma Assessment via Foundation and Discrete Diffusion Models
by: Nguyen, Manh Duong, et al.
Published: (2025)
RealBirdID: Benchmarking Bird Species Identification in the Era of MLLMs
by: Lawrence, Logan, et al.
Published: (2026)