| Main Authors: | Jin, Liuyi, Haroon, Amran, Stoleru, Radu, Gunawardena, Pasan, Middleton, Michael, Kim, Jeeeun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.14119 |
Similar Items
A Smart-Glasses for Emergency Medical Services via Multimodal Multitask Learning
by: Jin, Liuyi, et al.
Published: (2025)
ERIC: Estimating Rainfall with Commodity Doorbell Camera for Precision Residential Irrigation
by: Liu, Tian, et al.
Published: (2024)
AxiomVision: Accuracy-Guaranteed Adaptive Visual Model Selection for Perspective-Aware Video Analytics
by: Dai, Xiangxiang, et al.
Published: (2024)
AI-Integrated Decision Support System for Real-Time Market Growth Forecasting and Multi-Source Content Diffusion Analytics
by: Yin, Ziqing, et al.
Published: (2025)
Semantic-Guided Unsupervised Video Summarization
by: Liu, Haizhou, et al.
Published: (2026)
GRACE: Loss-Resilient Real-Time Video through Neural Codecs
by: Cheng, Yihua, et al.
Published: (2023)
PromptMobile: Efficient Promptus for Low Bandwidth Mobile Video Streaming
by: Liu, Liming, et al.
Published: (2025)
EVER: Edge-Assisted Auto-Verification for Mobile MR-Aided Operation
by: Chen, Jiangong, et al.
Published: (2025)
RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training
by: Ding, Muhe, et al.
Published: (2024)
Towards Real-Time Neural Volumetric Rendering on Mobile Devices: A Measurement Study
by: Wang, Zhe, et al.
Published: (2024)
Towards Open-Vocabulary Video Semantic Segmentation
by: Li, Xinhao, et al.
Published: (2024)
Semantic Communication-Enabled Cloud-Edge-End-collaborative Metaverse Services Architecure
by: Li, Yuxuan, et al.
Published: (2025)
Human-in-the-Loop Bandwidth Estimation for Quality of Experience Optimization in Real-Time Video Communication
by: Khairy, Sami, et al.
Published: (2025)
MM-HSD: Multi-Modal Hate Speech Detection in Videos
by: Céspedes-Sarrias, Berta, et al.
Published: (2025)
Wireless Video Semantic Communication with Decoupled Diffusion Multi-frame Compensation
by: Xie, Bingyan, et al.
Published: (2025)
Multimodal Emotion Recognition by Fusing Video Semantic in MOOC Learning Scenarios
by: Zhang, Yuan, et al.
Published: (2024)
Attention of a Kiss: Exploring Attention Maps in Video Diffusion for XAIxArts
by: Cole, Adam, et al.
Published: (2025)
QMAVIS: Long Video-Audio Understanding using Fusion of Large Multimodal Models
by: Lin, Zixing, et al.
Published: (2026)
Exposing Cross-Modal Consistency for Fake News Detection in Short-Form Videos
by: Tian, Chong, et al.
Published: (2026)
LL-GABR: Energy Efficient Live Video Streaming Using Reinforcement Learning
by: Raman, Adithya, et al.
Published: (2024)
Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction
by: Wang, Dali, et al.
Published: (2026)
End-to-End Learning-based Video Streaming Enhancement Pipeline: A Generative AI Approach
by: Artioli, Emanuele, et al.
Published: (2025)
Plasticity-Aware Mixture of Experts for Learning Under QoE Shifts in Adaptive Video Streaming
by: He, Zhiqiang, et al.
Published: (2025)
AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning
by: Lin, Yueqian, et al.
Published: (2025)
Solving Copyright Infringement on Short Video Platforms: Novel Datasets and an Audio Restoration Deep Learning Pipeline
by: Oh, Minwoo, et al.
Published: (2025)
BDIQA: A New Dataset for Video Question Answering to Explore Cognitive Reasoning through Theory of Mind
by: Mao, Yuanyuan, et al.
Published: (2024)
Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-Supervised Pre-Training
by: You, Hong-Jie, et al.
Published: (2025)
V-Rex: Real-Time Streaming Video LLM Acceleration via Dynamic KV Cache Retrieval
by: Kim, Donghyuk, et al.
Published: (2025)
MMTB: Evaluating Terminal Agents on Multimedia-File Tasks
by: Heo, Chiyeong, et al.
Published: (2026)
MTAVG-Bench 2.0: Diagnosing Failure Modes of Cinematic Expressiveness in Multi-Talker Audio-Video Generation
by: Li, Haitian, et al.
Published: (2026)
Co-Director: Agentic Generative Video Storytelling
by: Song, Yale, et al.
Published: (2026)
Stage Light is Sequence$^2$: Multi-Light Control via Imitation Learning
by: Zhao, Zijian, et al.
Published: (2026)
Towards Automatic Soccer Commentary Generation with Knowledge-Enhanced Visual Reasoning
by: Jin, Zeyu, et al.
Published: (2026)
DeLoad: Demand-Driven Short-Video Preloading with Scalable Watch-Time Estimation
by: Liu, Tong, et al.
Published: (2025)
HiQuE: Hierarchical Question Embedding Network for Multimodal Depression Detection
by: Jung, Juho, et al.
Published: (2024)
A Multimedia Analytics Model for the Foundation Model Era
by: Worring, Marcel, et al.
Published: (2025)
MAVOS-DD: Multilingual Audio-Video Open-Set Deepfake Detection Benchmark
by: Croitoru, Florinel-Alin, et al.
Published: (2025)
Controllable Video-to-Music Generation with Multiple Time-Varying Conditions
by: Wu, Junxian, et al.
Published: (2025)
LazyVLM: Neuro-Symbolic Approach to Video Analytics
by: Jian, Xiangru, et al.
Published: (2025)
Lightning Fast Video Anomaly Detection via Adversarial Knowledge Distillation
by: Croitoru, Florinel-Alin, et al.
Published: (2022)