| Main Authors: | Liang, Feng, Kodaira, Akio, Xu, Chenfeng, Tomizuka, Masayoshi, Keutzer, Kurt, Marculescu, Diana |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2405.15757 |
Similar Items
Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment
by: Li, Yiheng, et al.
Published: (2024)
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis
by: Liang, Feng, et al.
Published: (2023)
Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility
by: Li, Yiheng, et al.
Published: (2025)
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
by: Kodaira, Akio, et al.
Published: (2023)
A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision
by: Peng, Chensheng, et al.
Published: (2024)
Vision-Language Models Learn Super Images for Efficient Partially Relevant Video Retrieval
by: Nishimura, Taichi, et al.
Published: (2023)
StreamDiT: Real-Time Streaming Text-to-Video Generation
by: Kodaira, Akio, et al.
Published: (2025)
SSNVC: Single Stream Neural Video Compression with Implicit Temporal Information
by: Wang, Feng, et al.
Published: (2024)
Viewport Prediction for Volumetric Video Streaming by Exploring Video Saliency and Trajectory Information
by: Li, Jie, et al.
Published: (2023)
Adaptive 3D Gaussian Splatting Video Streaming
by: Gong, Han, et al.
Published: (2025)
StreamingEval: A Unified Evaluation Protocol towards Realistic Streaming Video Understanding
by: Tang, Guowei, et al.
Published: (2026)
StreamDiffusionV2: A Streaming System for Dynamic and Interactive Video Generation
by: Feng, Tianrui, et al.
Published: (2025)
SwinGS: Sliding Window Gaussian Splatting for Volumetric Video Streaming with Arbitrary Length
by: Liu, Bangya, et al.
Published: (2024)
High-Quality Live Video Streaming via Transcoding Time Prediction and Preset Selection
by: Shahre-Babak, Zahra Nabizadeh, et al.
Published: (2023)
Joint Flow And Feature Refinement Using Attention For Video Restoration
by: Merugu, Ranjith, et al.
Published: (2025)
In-Loop Filtering Using Learned Look-Up Tables for Video Coding
by: Li, Zhuoyuan, et al.
Published: (2025)
Accelerated Event-Based Feature Detection and Compression for Surveillance Video Systems
by: Freeman, Andrew C., et al.
Published: (2023)
GMFVAD: Using Grained Multi-modal Feature to Improve Video Anomaly Detection
by: Dai, Guangyu, et al.
Published: (2025)
MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers
by: Mahmud, Tanvir, et al.
Published: (2024)
Video Compression Meets Video Generation: Latent Inter-Frame Pruning with Attention Recovery
by: Menn, Dennis, et al.
Published: (2026)
Compression of 3D Gaussian Splatting with Optimized Feature Planes and Standard Video Codecs
by: Lee, Soonbin, et al.
Published: (2025)
On the Audio Hallucinations in Large Audio-Video Language Models
by: Nishimura, Taichi, et al.
Published: (2024)
VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification
by: Meng, Jiahao, et al.
Published: (2026)
Hallo-Live: Real-Time Streaming Joint Audio-Video Avatar Generation with Asynchronous Dual-Stream and Human-Centric Preference Distillation
by: Li, Chunyu, et al.
Published: (2026)
NeR-SC: Adapting Neural Video Representation to Screen Content
by: Shi, Ruohan, et al.
Published: (2026)
When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding
by: Zhang, Pingping, et al.
Published: (2024)
Zero-shot Video Moment Retrieval via Off-the-shelf Multimodal Large Language Models
by: Xu, Yifang, et al.
Published: (2025)
Consistency-aware Fake Videos Detection on Short Video Platforms
by: Wang, Junxi, et al.
Published: (2025)
A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming
by: Zhou, Pengyuan, et al.
Published: (2024)
Rate-aware Compression for NeRF-based Volumetric Video
by: Zhang, Zhiyu, et al.
Published: (2024)
Generative Frame Sampler for Long Video Understanding
by: Yao, Linli, et al.
Published: (2025)
WVSC: Wireless Video Semantic Communication with Multi-frame Compensation
by: Xie, Bingyan, et al.
Published: (2025)
KeyVideoLLM: Towards Large-scale Video Keyframe Selection
by: Liang, Hao, et al.
Published: (2024)
VideoForest: Person-Anchored Hierarchical Reasoning for Cross-Video Question Answering
by: Meng, Yiran, et al.
Published: (2025)
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding
by: Fang, Xinyu, et al.
Published: (2024)
VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability
by: Cohendet, Romain, et al.
Published: (2018)
VCEval: Rethinking What is a Good Educational Video and How to Automatically Evaluate It
by: Zhu, Xiaoxuan, et al.
Published: (2024)
Scalable Event-Based Video Streaming for Machines with MoQ
by: Freeman, Andrew C.
Published: (2025)
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions
by: Zhang, Rui, et al.
Published: (2024)
A Human-Annotated Video Dataset for Training and Evaluation of 360-Degree Video Summarization Methods
by: Kontostathis, Ioannis, et al.
Published: (2024)