:: Library Catalog

Bibliographic Details
Main Author:	Guo, Hongzhi
Format:	Preprint
Published:	2024
Subjects:	Multimedia
Online Access:	https://arxiv.org/abs/2411.12825

Similar Items

Detecting Notational Errors in Digital Music Scores
by: Léo, Géré, et al.
Published: (2025)

An Efficient NVoD Scheme Using Implicit Error Correction and Subchannels for Wireless Networks
by: Asorey-Cacheda, Rafael, et al.
Published: (2025)

Robust Steganography with Boundary-Preserving Overflow Alleviation and Adaptive Error Correction
by: Cheng, Yu, et al.
Published: (2024)

Agentic Mixed-Source Multi-Modal Misinformation Detection with Adaptive Test-Time Scaling
by: Jiang, Wei, et al.
Published: (2026)

Speaker Embedding Informed Audiovisual Active Speaker Detection for Egocentric Recordings
by: Clarke, Jason, et al.
Published: (2025)

MF-AED-AEC: Speech Emotion Recognition by Leveraging Multimodal Fusion, ASR Error Detection, and ASR Error Correction
by: He, Jiajun, et al.
Published: (2024)

ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding
by: Zhang, Zhenxing, et al.
Published: (2024)

Contextual Wireless Video Semantic Communication in MIMO-OFDM Systems
by: Xie, Bingyan, et al.
Published: (2026)

Mining the Social Fabric: Unveiling Communities for Fake News Detection in Short Videos
by: Gong, Haisong, et al.
Published: (2025)

Differential Mental Disorder Detection with Psychology-Inspired Multimodal Stimuli
by: Zhou, Zhiyuan, et al.
Published: (2026)

RoboKA: KAN Informed Multimodal Learning for RoboCall Surveillance System
by: Choudhury, Nitin, et al.
Published: (2026)

Design of a 5G Multimedia Broadcast Application Function Supporting Adaptive Error Recovery
by: Lentisco, C. M., et al.
Published: (2024)

Provably Secure Robust Image Steganography via Cross-Modal Error Correction
by: Qi, Yuang, et al.
Published: (2024)

Listening and Seeing Again: Generative Error Correction for Audio-Visual Speech Recognition
by: Liu, Rui, et al.
Published: (2025)

DAT: Dual-Aware Adaptive Transmission for Efficient Multimodal LLM Inference in Edge-Cloud Systems
by: Guo, Qi, et al.
Published: (2026)

Cross-Platform Neural Video Coding: A Case Study
by: Conceição, Ruhan, et al.
Published: (2024)

Structured Image-based Coding for Efficient Gaussian Splatting Compression
by: Martin, Pedro, et al.
Published: (2026)

Feature Coding in the Era of Large Models: Dataset, Test Conditions, and Benchmark
by: Gao, Changsheng, et al.
Published: (2024)

Enhancing Film Grain Coding in VVC: Improving Encoding Quality and Efficiency
by: Menon, Vignesh V, et al.
Published: (2024)

High Capacity Reversible Data Hiding for Encrypted 3D Mesh Models Based on Topology
by: Tang, Yun, et al.
Published: (2022)

Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing
by: Macchiavello, Bruno, et al.
Published: (2013)

EyEar: Learning Audio Synchronized Human Gaze Trajectory Based on Physics-Informed Dynamics
by: Liu, Xiaochuan, et al.
Published: (2025)

Voxel-GS: Quantized Scaffold Gaussian Splatting Compression with Run-Length Coding
by: Fu, Chunyang, et al.
Published: (2025)

Inter-Frame Coding for Dynamic Meshes via Coarse-to-Fine Anchor Mesh Generation
by: Huang, He, et al.
Published: (2024)

High-level Codes and Fine-grained Weights for Online Multi-modal Hashing Retrieval
by: Zhan, Yu-Wei, et al.
Published: (2024)

Joint Optimization of Buffer Delay and HARQ for Video Communications
by: Cheng, Baoping, et al.
Published: (2024)

D$^2$-JSCC: Digital Deep Joint Source-channel Coding for Semantic Communications
by: Huang, Jianhao, et al.
Published: (2024)

Multimodal Semantic Communication for Generative Audio-Driven Video Conferencing
by: Tong, Haonan, et al.
Published: (2024)

MCPNS: A Macropixel Collocated Position and Its Neighbors Search for Plenoptic 2.0 Video Coding
by: Van Duong, Vinh, et al.
Published: (2023)

Audio-Visual Cross-Modal Compression for Generative Face Video Coding
by: Xu, Youmin, et al.
Published: (2025)

Latent Feature-Guided Conditional Diffusion for Generative Image Semantic Communication
by: Chen, Zehao, et al.
Published: (2025)

Enabling American Sign Language Communication Under Low Data Rates
by: Santhalingam, Panneer Selvam, et al.
Published: (2025)

RETRACTION: English Writing Correction Based on Intelligent Text Semantic Analysis
by: Advances in Multimedia
Published: (2025)

Language-oriented Semantic Communication for Image Transmission with Fine-Tuned Diffusion Model
by: Wei, Xinfeng, et al.
Published: (2024)

Efficient Geometry Compression and Communication for 3D Gaussian Splatting Point Clouds
by: Xie, Liang, et al.
Published: (2025)

Multiscale Feature Importance-based Bit Allocation for End-to-End Feature Coding for Machines
by: Liu, Junle, et al.
Published: (2025)

KEN: Knowledge Augmentation and Emotion Guidance Network for Multimodal Fake News Detection
by: Zhu, Peican, et al.
Published: (2025)

Exploring the Role of Audio in Multimodal Misinformation Detection
by: Liu, Moyang, et al.
Published: (2024)

ProMSC-MIS: Prompt-based Multimodal Semantic Communication for Multi-Spectral Image Segmentation
by: Zhang, Haoshuo, et al.
Published: (2025)

AI-Driven Virtual Teacher for Enhanced Educational Efficiency: Leveraging Large Pretrain Models for Autonomous Error Analysis and Correction
by: Xu, Tianlong, et al.
Published: (2024)