Bibliographic Details
Main Authors:	Liang, Feng; Kodaira, Akio; Xu, Chenfeng; Tomizuka, Masayoshi; Keutzer, Kurt; Marculescu, Diana
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Multimedia
Online Access:	https://arxiv.org/abs/2405.15757

Similar Items

Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment
by: Li, Yiheng, et al.
Published: (2024)

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis
by: Liang, Feng, et al.
Published: (2023)

Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility
by: Li, Yiheng, et al.
Published: (2025)

StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
by: Kodaira, Akio, et al.
Published: (2023)

A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision
by: Peng, Chensheng, et al.
Published: (2024)

Vision-Language Models Learn Super Images for Efficient Partially Relevant Video Retrieval
by: Nishimura, Taichi, et al.
Published: (2023)

StreamDiT: Real-Time Streaming Text-to-Video Generation
by: Kodaira, Akio, et al.
Published: (2025)

SSNVC: Single Stream Neural Video Compression with Implicit Temporal Information
by: Wang, Feng, et al.
Published: (2024)

Viewport Prediction for Volumetric Video Streaming by Exploring Video Saliency and Trajectory Information
by: Li, Jie, et al.
Published: (2023)

Adaptive 3D Gaussian Splatting Video Streaming
by: Gong, Han, et al.
Published: (2025)

StreamingEval: A Unified Evaluation Protocol towards Realistic Streaming Video Understanding
by: Tang, Guowei, et al.
Published: (2026)

StreamDiffusionV2: A Streaming System for Dynamic and Interactive Video Generation
by: Feng, Tianrui, et al.
Published: (2025)

SwinGS: Sliding Window Gaussian Splatting for Volumetric Video Streaming with Arbitrary Length
by: Liu, Bangya, et al.
Published: (2024)

High-Quality Live Video Streaming via Transcoding Time Prediction and Preset Selection
by: Shahre-Babak, Zahra Nabizadeh, et al.
Published: (2023)

Joint Flow And Feature Refinement Using Attention For Video Restoration
by: Merugu, Ranjith, et al.
Published: (2025)

In-Loop Filtering Using Learned Look-Up Tables for Video Coding
by: Li, Zhuoyuan, et al.
Published: (2025)

Accelerated Event-Based Feature Detection and Compression for Surveillance Video Systems
by: Freeman, Andrew C., et al.
Published: (2023)

GMFVAD: Using Grained Multi-modal Feature to Improve Video Anomaly Detection
by: Dai, Guangyu, et al.
Published: (2025)

MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers
by: Mahmud, Tanvir, et al.
Published: (2024)

Video Compression Meets Video Generation: Latent Inter-Frame Pruning with Attention Recovery
by: Menn, Dennis, et al.
Published: (2026)

Compression of 3D Gaussian Splatting with Optimized Feature Planes and Standard Video Codecs
by: Lee, Soonbin, et al.
Published: (2025)

On the Audio Hallucinations in Large Audio-Video Language Models
by: Nishimura, Taichi, et al.
Published: (2024)

VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification
by: Meng, Jiahao, et al.
Published: (2026)

Hallo-Live: Real-Time Streaming Joint Audio-Video Avatar Generation with Asynchronous Dual-Stream and Human-Centric Preference Distillation
by: Li, Chunyu, et al.
Published: (2026)

NeR-SC: Adapting Neural Video Representation to Screen Content
by: Shi, Ruohan, et al.
Published: (2026)

When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding
by: Zhang, Pingping, et al.
Published: (2024)

Zero-shot Video Moment Retrieval via Off-the-shelf Multimodal Large Language Models
by: Xu, Yifang, et al.
Published: (2025)

Consistency-aware Fake Videos Detection on Short Video Platforms
by: Wang, Junxi, et al.
Published: (2025)

A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming
by: Zhou, Pengyuan, et al.
Published: (2024)

Rate-aware Compression for NeRF-based Volumetric Video
by: Zhang, Zhiyu, et al.
Published: (2024)

Generative Frame Sampler for Long Video Understanding
by: Yao, Linli, et al.
Published: (2025)

WVSC: Wireless Video Semantic Communication with Multi-frame Compensation
by: Xie, Bingyan, et al.
Published: (2025)

KeyVideoLLM: Towards Large-scale Video Keyframe Selection
by: Liang, Hao, et al.
Published: (2024)

VideoForest: Person-Anchored Hierarchical Reasoning for Cross-Video Question Answering
by: Meng, Yiran, et al.
Published: (2025)

MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding
by: Fang, Xinyu, et al.
Published: (2024)

VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability
by: Cohendet, Romain, et al.
Published: (2018)

VCEval: Rethinking What is a Good Educational Video and How to Automatically Evaluate It
by: Zhu, Xiaoxuan, et al.
Published: (2024)

Scalable Event-Based Video Streaming for Machines with MoQ
by: Freeman, Andrew C.
Published: (2025)

Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions
by: Zhang, Rui, et al.
Published: (2024)

A Human-Annotated Video Dataset for Training and Evaluation of 360-Degree Video Summarization Methods
by: Kontostathis, Ioannis, et al.
Published: (2024)