:: Library Catalog

Bibliographic Details
Main Authors:	Pham, Tan-Hanh; Le, Hoang-Nam; Nguyen, Phu-Vinh; Ngo, Chris; Hy, Truong-Son
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2412.16771

Similar Items

SilVar-Med: A Speech-Driven Visual Language Model for Explainable Abnormality Detection in Medical Imaging
by: Pham, Tan-Hanh, et al.
Published: (2025)

A Novel Framework for Automated Explain Vision Model Using Vision-Language Models
by: Nguyen, Phu-Vinh, et al.
Published: (2025)

Repeton: Structured Bug Repair with ReAct-Guided Patch-and-Test Cycles
by: Vinh, Nguyen Phu, et al.
Published: (2025)

IQBench: How "Smart" Are Vision-Language Models? A Study with Human IQ Tests
by: Pham, Tan-Hanh, et al.
Published: (2025)

Multimodal Chain of Continuous Thought for Latent-Space Reasoning in Vision-Language Models
by: Pham, Tan-Hanh, et al.
Published: (2025)

MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder
by: Le-Duc, Khai, et al.
Published: (2024)

wav2graph: A Framework for Supervised Learning Knowledge Graph from Speech
by: Le-Duc, Khai, et al.
Published: (2024)

RARL: Improving Medical VLM Reasoning and Generalization with Reinforcement Learning and LoRA under Data and Hardware Constraints
by: Pham, Tan-Hanh, et al.
Published: (2025)

LiteGPT: Large Vision-Language Model for Joint Chest X-ray Localization and Classification Task
by: Le-Duc, Khai, et al.
Published: (2024)

ViInfographicVQA: A Benchmark for Single and Multi-image Visual Question Answering on Vietnamese Infographics
by: Van-Dinh, Tue-Thu, et al.
Published: (2025)

RedBench: A Universal Dataset for Comprehensive Red Teaming of Large Language Models
by: Dang, Quy-Anh, et al.
Published: (2026)

RainbowPlus: Enhancing Adversarial Prompt Generation via Evolutionary Quality-Diversity Search
by: Dang, Quy-Anh, et al.
Published: (2025)

LINKER: Learning Interactions Between Functional Groups and Residues With Chemical Knowledge-Enhanced Reasoning and Explainability
by: Pham, Phuc, et al.
Published: (2025)

Multimodal graph representation learning for website generation based on visual sketch
by: Vu, Tung D., et al.
Published: (2025)

Range-aware Positional Encoding via High-order Pretraining: Theory and Practice
by: Nguyen, Viet Anh, et al.
Published: (2024)

ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding
by: Pham, Quang P. M., et al.
Published: (2024)

Q-BIOLAT: Binary Latent Protein Fitness Landscapes for QUBO-Based Optimization
by: Hy, Truong-Son
Published: (2026)

Binary Latent Protein Fitness Landscapes for Quantum Annealing Optimization
by: Hy, Truong-Son
Published: (2026)

A nonlinear analogue of additive commutators
by: Dung, Truong Huu, et al.
Published: (2025)

Efficient Deep Learning for Medical Imaging: Bridging the Gap Between High-Performance AI and Clinical Deployment
by: Nguyen, Cuong Manh, et al.
Published: (2026)

TESGNN: Temporal Equivariant Scene Graph Neural Networks for Efficient and Robust Multi-View 3D Scene Understanding
by: Pham, Quang P. M., et al.
Published: (2024)

Exploring the Application of Visual Question Answering (VQA) for Classroom Activity Monitoring
by: Vu, Sinh Trong, et al.
Published: (2025)

Real-time Speech Summarization for Medical Conversations
by: Le-Duc, Khai, et al.
Published: (2024)

An Approach of Structure‐Enhanced Code‐Centric Graph Learning for Just‐in‐Time Software Vulnerability Detection
by: Phu Pham, et al.
Published: (2026)

AccurateRAG: A Framework for Building Accurate Retrieval-Augmented Question-Answering Applications
by: Nguyen, Linh The, et al.
Published: (2025)

Rethinking Top Probability from Multi-view for Distracted Driver Behaviour Localization
by: Nguyen, Quang Vinh, et al.
Published: (2024)

Advancing Vietnamese Visual Question Answering with Transformer and Convolutional Integration
by: Nguyen, Ngoc Son, et al.
Published: (2024)

GROOT: Effective Design of Biological Sequences with Limited Experimental Data
by: Tran, Thanh V. T., et al.
Published: (2024)

Towards Robust Fact-Checking: A Multi-Agent System with Advanced Evidence Retrieval
by: Trinh, Tam, et al.
Published: (2025)

Advances in Protein Representation Learning: Methods, Applications, and Future Directions
by: Nguyen, Viet Thanh Duy, et al.
Published: (2025)

DiFlow-TTS: Compact and Low-Latency Zero-Shot Text-to-Speech with Factorized Discrete Flow Matching
by: Nguyen, Ngoc-Son, et al.
Published: (2025)

Vietnamese Legal Information Retrieval in Question-Answering System
by: Ba, Thiem Nguyen, et al.
Published: (2024)

OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
by: Huynh-Nguyen, Hieu-Nghia, et al.
Published: (2025)

EquiHGNN: Scalable Rotationally Equivariant Hypergraph Neural Networks
by: Dang, Tien, et al.
Published: (2025)

OWLViz: An Open-World Benchmark for Visual Question Answering
by: Nguyen, Thuy, et al.
Published: (2025)

Multimodal Contrastive Representation Learning in Augmented Biomedical Knowledge Graphs
by: Dang, Tien, et al.
Published: (2025)

Diet of Critically Endangered Black‐Eyed Bent‐Toed Gecko, Cyrtodactylus nigriocularis, Nguyen, Orlov & Darevsky, 2006 From Vietnam
by: Hanh Thi Ngo, et al.
Published: (2025)

TinyGiantVLM: A Lightweight Vision-Language Architecture for Spatial Reasoning under Resource Constraints
by: Ly, Vinh-Thuan, et al.
Published: (2025)

Adaptive Compensation for Robotic Joint Failures Using Partially Observable Reinforcement Learning
by: Pham, Tan-Hanh, et al.
Published: (2024)

Advancing Vietnamese Information Retrieval with Learning Objective and Benchmark
by: Nguyen, Phu-Vinh, et al.
Published: (2025)