Library Catalog

Bibliographic Details
Main Authors:	Van Duong, Vinh; Huu, Thuc Nguyen; Yim, Jonghoon; Jeon, Byeungwoo
Format:	Preprint
Published:	2023
Subjects:	Multimedia
Online Access:	https://arxiv.org/abs/2310.08006

Similar Items

Proposing Smart System for Detecting and Monitoring Vehicle Using Multiobject Multicamera Tracking
by: Phat Nguyen Huu, et al.
Published: (2024)

Multimodal LLM-based Query Paraphrasing for Video Search
by: Wu, Jiaxin, et al.
Published: (2024)

Cross-Platform Neural Video Coding: A Case Study
by: Conceição, Ruhan, et al.
Published: (2024)

KAN-Based Fusion of Dual-Domain for Audio-Driven Facial Landmarks Generation
by: Vo-Thanh, Hoang-Son, et al.
Published: (2024)

CLIPRerank: An Extremely Simple Method for Improving Ad-hoc Video Search
by: Chen, Aozhu, et al.
Published: (2024)

Modeling the Impacts of Swipe Delay on User Quality of Experience in Short Video Streaming
by: Nguyen, Duc V., et al.
Published: (2026)

Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing
by: Macchiavello, Bruno, et al.
Published: (2013)

Fact-Checking at Scale: Multimodal AI for Authenticity and Context Verification in Online Media
by: Phan, Van-Hoang, et al.
Published: (2025)

MTAVG-Bench 2.0: Diagnosing Failure Modes of Cinematic Expressiveness in Multi-Talker Audio-Video Generation
by: Li, Haitian, et al.
Published: (2026)

MFQE 2.0: A New Approach for Multi-frame Quality Enhancement on Compressed Video
by: Xing, Qunliang, et al.
Published: (2019)

Integrated Semantic and Temporal Alignment for Interactive Video Retrieval
by: Luu, Thanh-Danh, et al.
Published: (2025)

FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Contrastive Regularization
by: Nguyen, Manh Duong, et al.
Published: (2024)

Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition
by: Nguyen, Cam-Van Thi, et al.
Published: (2024)

VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability
by: Cohendet, Romain, et al.
Published: (2018)

A Dual-Module Denoising Approach with Curriculum Learning for Enhancing Multimodal Aspect-Based Sentiment Analysis
by: Van Doan, Nguyen, et al.
Published: (2024)

diveXplore 6.0: ITEC's Interactive Video Exploration System at VBS 2022
by: Leibetseder, Andreas, et al.
Published: (2025)

Robust Relevance Feedback for Interactive Known-Item Video Search
by: Ma, Zhixin, et al.
Published: (2025)

Transform and Entropy Coding in AV2
by: Nalci, Alican, et al.
Published: (2026)

BC-GAN: A Generative Adversarial Network for Synthesizing a Batch of Collocated Clothing
by: Zhou, Dongliang, et al.
Published: (2025)

Symmetric Entropy-Constrained Video Coding for Machines
by: Sun, Yuxiao, et al.
Published: (2025)

Short-Form Video Viewing Behavior Analysis and Multi-Step Viewing Time Prediction
by: Yen, Vu Thi Hai, et al.
Published: (2026)

Bi-modal Prediction and Transformation Coding for Compressing Complex Human Dynamics
by: Hoang, Huong, et al.
Published: (2025)

When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding
by: Zhang, Pingping, et al.
Published: (2024)

Contrast then Memorize: Semantic Neighbor Retrieval-Enhanced Inductive Multimodal Knowledge Graph Completion
by: Zhao, Yu, et al.
Published: (2024)

Interpretable Embedding for Ad-hoc Video Search
by: Wu, Jiaxin, et al.
Published: (2024)

How2Compress: Scalable and Efficient Edge Video Analytics via Adaptive Granular Video Compression
by: Wu, Yuheng, et al.
Published: (2025)

FCBoost-Net: A Generative Network for Synthesizing Multiple Collocated Outfits via Fashion Compatibility Boosting
by: Zhou, Dongliang, et al.
Published: (2025)

Adaptive Resolution and Chroma Subsampling for Energy-Efficient Video Coding
by: Premkumar, Amritha, et al.
Published: (2026)

Convex-hull Estimation using XPSNR for Versatile Video Coding
by: Menon, Vignesh V, et al.
Published: (2024)

Reversible Video Steganography Using Quick Response Codes and Modified ElGamal Cryptosystem
by: Mstafa, Ramadhan J.
Published: (2025)

Neural B-Frame Coding: Tackling Domain Shift Issues with Lightweight Online Motion Resolution Adaptation
by: NguyenQuang, Sang, et al.
Published: (2025)

Cap2Sum: Learning to Summarize Videos by Generating Captions
by: Zhao, Cairong, et al.
Published: (2024)

Audio-Visual Cross-Modal Compression for Generative Face Video Coding
by: Xu, Youmin, et al.
Published: (2025)

MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations
by: Zhang, Hanlei, et al.
Published: (2024)

Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline
by: Yang, Dingyi, et al.
Published: (2024)

lifeXplore at the Lifelog Search Challenge 2020
by: Leibetseder, Andreas, et al.
Published: (2025)

DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization
by: Nguyen, Ngoc-Son, et al.
Published: (2026)

Sec2Sec Co-attention for Video-Based Apparent Affective Prediction
by: Sun, Mingwei, et al.
Published: (2024)

Towards Signboard-Oriented Visual Question Answering: ViSignVQA Dataset, Method and Benchmark
by: Nguyen, Hieu Minh, et al.
Published: (2025)

Subjective Quality Assessment of Dynamic 3D Meshes in Virtual Reality Environment
by: Nguyen, Duc V., et al.
Published: (2026)