| Main Authors: | Van Duong, Vinh, Huu, Thuc Nguyen, Yim, Jonghoon, Jeon, Byeungwoo |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2310.08006 |
Similar Items
Proposing Smart System for Detecting and Monitoring Vehicle Using Multiobject Multicamera Tracking
by: Phat Nguyen Huu, et al.
Published: (2024)
Multimodal LLM-based Query Paraphrasing for Video Search
by: Wu, Jiaxin, et al.
Published: (2024)
Cross-Platform Neural Video Coding: A Case Study
by: Conceição, Ruhan, et al.
Published: (2024)
KAN-Based Fusion of Dual-Domain for Audio-Driven Facial Landmarks Generation
by: Vo-Thanh, Hoang-Son, et al.
Published: (2024)
CLIPRerank: An Extremely Simple Method for Improving Ad-hoc Video Search
by: Chen, Aozhu, et al.
Published: (2024)
Modeling the Impacts of Swipe Delay on User Quality of Experience in Short Video Streaming
by: Nguyen, Duc V., et al.
Published: (2026)
Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing
by: Macchiavello, Bruno, et al.
Published: (2013)
Fact-Checking at Scale: Multimodal AI for Authenticity and Context Verification in Online Media
by: Phan, Van-Hoang, et al.
Published: (2025)
MTAVG-Bench 2.0: Diagnosing Failure Modes of Cinematic Expressiveness in Multi-Talker Audio-Video Generation
by: Li, Haitian, et al.
Published: (2026)
MFQE 2.0: A New Approach for Multi-frame Quality Enhancement on Compressed Video
by: Xing, Qunliang, et al.
Published: (2019)
Integrated Semantic and Temporal Alignment for Interactive Video Retrieval
by: Luu, Thanh-Danh, et al.
Published: (2025)
FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Contrastive Regularization
by: Nguyen, Manh Duong, et al.
Published: (2024)
Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition
by: Nguyen, Cam-Van Thi, et al.
Published: (2024)
VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability
by: Cohendet, Romain, et al.
Published: (2018)
A Dual-Module Denoising Approach with Curriculum Learning for Enhancing Multimodal Aspect-Based Sentiment Analysis
by: Van Doan, Nguyen, et al.
Published: (2024)
diveXplore 6.0: ITEC's Interactive Video Exploration System at VBS 2022
by: Leibetseder, Andreas, et al.
Published: (2025)
Robust Relevance Feedback for Interactive Known-Item Video Search
by: Ma, Zhixin, et al.
Published: (2025)
Transform and Entropy Coding in AV2
by: Nalci, Alican, et al.
Published: (2026)
BC-GAN: A Generative Adversarial Network for Synthesizing a Batch of Collocated Clothing
by: Zhou, Dongliang, et al.
Published: (2025)
Symmetric Entropy-Constrained Video Coding for Machines
by: Sun, Yuxiao, et al.
Published: (2025)
Short-Form Video Viewing Behavior Analysis and Multi-Step Viewing Time Prediction
by: Yen, Vu Thi Hai, et al.
Published: (2026)
Bi-modal Prediction and Transformation Coding for Compressing Complex Human Dynamics
by: Hoang, Huong, et al.
Published: (2025)
When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding
by: Zhang, Pingping, et al.
Published: (2024)
Contrast then Memorize: Semantic Neighbor Retrieval-Enhanced Inductive Multimodal Knowledge Graph Completion
by: Zhao, Yu, et al.
Published: (2024)
Interpretable Embedding for Ad-hoc Video Search
by: Wu, Jiaxin, et al.
Published: (2024)
How2Compress: Scalable and Efficient Edge Video Analytics via Adaptive Granular Video Compression
by: Wu, Yuheng, et al.
Published: (2025)
FCBoost-Net: A Generative Network for Synthesizing Multiple Collocated Outfits via Fashion Compatibility Boosting
by: Zhou, Dongliang, et al.
Published: (2025)
Adaptive Resolution and Chroma Subsampling for Energy-Efficient Video Coding
by: Premkumar, Amritha, et al.
Published: (2026)
Convex-hull Estimation using XPSNR for Versatile Video Coding
by: Menon, Vignesh V, et al.
Published: (2024)
Reversible Video Steganography Using Quick Response Codes and Modified ElGamal Cryptosystem
by: Mstafa, Ramadhan J.
Published: (2025)
Neural B-Frame Coding: Tackling Domain Shift Issues with Lightweight Online Motion Resolution Adaptation
by: NguyenQuang, Sang, et al.
Published: (2025)
Cap2Sum: Learning to Summarize Videos by Generating Captions
by: Zhao, Cairong, et al.
Published: (2024)
Audio-Visual Cross-Modal Compression for Generative Face Video Coding
by: Xu, Youmin, et al.
Published: (2025)
MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations
by: Zhang, Hanlei, et al.
Published: (2024)
Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline
by: Yang, Dingyi, et al.
Published: (2024)
lifeXplore at the Lifelog Search Challenge 2020
by: Leibetseder, Andreas, et al.
Published: (2025)
DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization
by: Nguyen, Ngoc-Son, et al.
Published: (2026)
Sec2Sec Co-attention for Video-Based Apparent Affective Prediction
by: Sun, Mingwei, et al.
Published: (2024)
Towards Signboard-Oriented Visual Question Answering: ViSignVQA Dataset, Method and Benchmark
by: Nguyen, Hieu Minh, et al.
Published: (2025)
Subjective Quality Assessment of Dynamic 3D Meshes in Virtual Reality Environment
by: Nguyen, Duc V., et al.
Published: (2026)