| Main Authors: | Nguyen, Hy, Thudumu, Srikanth, Du, Hung, Vasa, Rajesh, Mouzakis, Kon |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.16753 |
Similar Items
CSAOT: Cooperative Multi-Agent System for Active Object Tracking
by: Nguyen, Hy, et al.
Published: (2025)
Contextual Knowledge Sharing in Multi-Agent Reinforcement Learning with Decentralized Communication and Coordination
by: Du, Hung, et al.
Published: (2025)
Goal-Oriented Multi-Agent Reinforcement Learning for Decentralized Agent Teams
by: Du, Hung, et al.
Published: (2025)
A Survey on Context-Aware Multi-Agent Systems: Techniques, Challenges and Future Directions
by: Du, Hung, et al.
Published: (2024)
Local Control Networks (LCNs): Optimizing Flexibility in Neural Network Data Pattern Capture
by: Nguyen, Hy, et al.
Published: (2025)
Dual-Branch HNSW Approach with Skip Bridges and LID-Driven Optimization
by: Nguyen, Hy, et al.
Published: (2025)
The M-factor: A Novel Metric for Evaluating Neural Architecture Search in Resource-Constrained Environments
by: Thudumu, Srikanth, et al.
Published: (2025)
Pretraining Objective Matters in Extreme Low-Data FGVC: A Backbone-Controlled Study
by: Hackett, Alexander, et al.
Published: (2026)
Playing with Transformer at 30+ FPS via Next-Frame Diffusion
by: Cheng, Xinle, et al.
Published: (2025)
VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction
by: Ji, Longbin, et al.
Published: (2026)
Explanation-Driven Counterfactual Testing for Faithfulness in Vision-Language Model Explanations
by: Ding, Sihao, et al.
Published: (2025)
XAI-Enhanced Semantic Segmentation Models for Visual Quality Inspection
by: Clement, Tobias, et al.
Published: (2024)
Novel 3D Binary Indexed Tree for Volume Computation of 3D Reconstructed Models from Volumetric Data
by: Nguyen-Le, Quoc-Bao, et al.
Published: (2024)
RotCAtt-TransUNet++: Novel Deep Neural Network for Sophisticated Cardiac Segmentation
by: Nguyen-Le, Quoc-Bao, et al.
Published: (2024)
Fostering Video Reasoning via Next-Event Prediction
by: Wang, Haonan, et al.
Published: (2025)
Overcoming the Curvature Bottleneck in MeanFlow
by: Zhang, Xinxi, et al.
Published: (2025)
Enhancing the Fairness and Performance of Edge Cameras with Explainable AI
by: Nguyen, Truong Thanh Hung, et al.
Published: (2024)
SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models
by: Nguyen, Hung, et al.
Published: (2024)
Predicting the Next Action by Modeling the Abstract Goal
by: Roy, Debaditya, et al.
Published: (2022)
GPT-4V Takes the Wheel: Promises and Challenges for Pedestrian Behavior Prediction
by: Huang, Jia, et al.
Published: (2023)
DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization
by: Nguyen, Ngoc-Son, et al.
Published: (2026)
Next Best View Selections for Semantic and Dynamic 3D Gaussian Splatting
by: Li, Yiqian, et al.
Published: (2025)
A Novel Framework for Automated Explain Vision Model Using Vision-Language Models
by: Nguyen, Phu-Vinh, et al.
Published: (2025)
Efficient and Concise Explanations for Object Detection with Gaussian-Class Activation Mapping Explainer
by: Nguyen, Quoc Khanh, et al.
Published: (2024)
Semantic Causality-Aware Vision-Based 3D Occupancy Prediction
by: Chen, Dubing, et al.
Published: (2025)
DisBeaNet: A Deep Neural Network to augment Unmanned Surface Vessels for maritime situational awareness
by: Vemula, Srikanth, et al.
Published: (2024)
SDGOCC: Semantic and Depth-Guided Bird's-Eye View Transformation for 3D Multimodal Occupancy Prediction
by: Duan, Zaipeng, et al.
Published: (2025)
Improving Zero-Shot Object-Level Change Detection by Incorporating Visual Correspondence
by: Nguyen, Hung Huy, et al.
Published: (2025)
Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent Cues
by: Vu, Tuan-Anh, et al.
Published: (2025)
Next Block Prediction: Video Generation via Semi-Autoregressive Modeling
by: Ren, Shuhuai, et al.
Published: (2025)
GAIS: Frame-Level Gated Audio-Visual Integration with Semantic Variance-Scaled Perturbation for Text-Video Retrieval
by: Yang, Bowen, et al.
Published: (2025)
Representation Separation for Semantic Segmentation with Vision Transformers
by: Hong, Yuanduo, et al.
Published: (2022)
XEdgeAI: A Human-centered Industrial Inspection Framework with Data-centric Explainable Edge AI Approach
by: Nguyen, Truong Thanh Hung, et al.
Published: (2024)
GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction
by: Huang, Yuanhui, et al.
Published: (2024)
FrameMind: Frame-Interleaved Video Reasoning via Reinforcement Learning
by: Ge, Haonan, et al.
Published: (2025)
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
by: Zhou, Chunting, et al.
Published: (2024)
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
by: Tian, Keyu, et al.
Published: (2024)
LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks
by: Nguyen, Truong Thanh Hung, et al.
Published: (2024)
M-LLM Based Video Frame Selection for Efficient Video Understanding
by: Hu, Kai, et al.
Published: (2025)
ECMNet: Lightweight Semantic Segmentation with Efficient CNN-Mamba Network
by: Du, Feixiang, et al.
Published: (2025)