| Main Authors: | Zhao, Fei, Guo, Mengxi, Zhao, Shijie, Li, Junlin, Zhang, Li, Xie, Xiaodong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.15331 |
Similar Items
A Tri-Dynamic Preprocessing Framework for UGC Video Compression
by: Zhao, Fei, et al.
Published: (2025)
Generative Preprocessing for Image Compression with Pre-trained Diffusion Models
by: Guo, Mengxi, et al.
Published: (2025)
Optimal Transcoding Resolution Prediction for Efficient Per-Title Bitrate Ladder Estimation
by: Yang, Jinhai, et al.
Published: (2024)
ROI-Guided Point Cloud Geometry Compression Towards Human and Machine Vision
by: Liang, Xie, et al.
Published: (2025)
ChatVTG: Video Temporal Grounding via Chat with Video Dialogue Large Language Models
by: Qu, Mengxue, et al.
Published: (2024)
Frequency-Assisted Adaptive Sharpening Scheme Considering Bitrate and Quality Tradeoff
by: Pang, Yingxue, et al.
Published: (2025)
Mitigating Image Captioning Hallucinations in Vision-Language Models
by: Zhao, Fei, et al.
Published: (2025)
D-FCGS: Feedforward Compression of Dynamic Gaussian Splatting for Free-Viewpoint Videos
by: Zhang, Wenkang, et al.
Published: (2025)
Adaptive High-Frequency Preprocessing for Video Coding
by: Pang, Yingxue, et al.
Published: (2025)
Rate-aware Compression for NeRF-based Volumetric Video
by: Zhang, Zhiyu, et al.
Published: (2024)
GSCodec Studio: A Modular Framework for Gaussian Splat Compression
by: Li, Sicheng, et al.
Published: (2025)
Efficient Token Compression for Vision Transformer with Spatial Information Preserved
by: Mao, Junzhu, et al.
Published: (2025)
SMC++: Masked Learning of Unsupervised Video Semantic Compression
by: Tian, Yuan, et al.
Published: (2024)
SSNVC: Single Stream Neural Video Compression with Implicit Temporal Information
by: Wang, Feng, et al.
Published: (2024)
Region-Adaptive Video Sharpening via Rate-Perception Optimization
by: Pang, Yingxue, et al.
Published: (2025)
MHAD: Multimodal Home Activity Dataset with Multi-Angle Videos and Synchronized Physiological Signals
by: Yu, Lei, et al.
Published: (2024)
Casual3DHDR: Deblurring High Dynamic Range 3D Gaussian Splatting from Casually Captured Videos
by: Gong, Shucheng, et al.
Published: (2025)
Multimodal Class-aware Semantic Enhancement Network for Audio-Visual Video Parsing
by: Zhao, Pengcheng, et al.
Published: (2024)
Context Guided Transformer Entropy Modeling for Video Compression
by: Tong, Junlong, et al.
Published: (2025)
Predicting Satisfied User and Machine Ratio for Compressed Images: A Unified Approach
by: Zhang, Qi, et al.
Published: (2024)
Emotion-Qwen: A Unified Framework for Emotion and Vision Understanding
by: Huang, Dawei, et al.
Published: (2025)
Reversing the Damage: A QP-Aware Transformer-Diffusion Approach for 8K Video Restoration under Codec Compression
by: Dehaghi, Ali Mollaahmadi, et al.
Published: (2024)
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding
by: Fang, Xinyu, et al.
Published: (2024)
FeatDistill: A Feature Distillation Enhanced Multi-Expert Ensemble Framework for Robust AI-generated Image Detection
by: Tu, Zhilin, et al.
Published: (2026)
Human Motion Video Generation: A Survey
by: Xue, Haiwei, et al.
Published: (2025)
Hierarchical Refinement of Universal Multimodal Attacks on Vision-Language Models
by: Zhang, Peng-Fei, et al.
Published: (2026)
VidCompress: Memory-Enhanced Temporal Compression for Video Understanding in Large Language Models
by: Lan, Xiaohan, et al.
Published: (2024)
Fine-grained Image Retrieval via Dual-Vision Adaptation
by: Jiang, Xin, et al.
Published: (2025)
Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach
by: Guo, Sha, et al.
Published: (2024)
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework
by: Wang, Jing, et al.
Published: (2025)
PG-Attack: A Precision-Guided Adversarial Attack Framework Against Vision Foundation Models for Autonomous Driving
by: Fu, Jiyuan, et al.
Published: (2024)
Ego3DT: Tracking Every 3D Object in Ego-centric Videos
by: Hao, Shengyu, et al.
Published: (2024)
Vocabulary Hijacking in LVLMs: Unveiling Critical Attention Heads by Excluding Inert Tokens to Mitigate Hallucination
by: Chen, Yangneng, et al.
Published: (2026)
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis
by: Liang, Feng, et al.
Published: (2023)
T-GVC: Trajectory-Guided Generative Video Coding at Ultra-Low Bitrates
by: Wang, Zhitao, et al.
Published: (2025)
Hybrid Local-Global Context Learning for Neural Video Compression
by: Zhai, Yongqi, et al.
Published: (2024)
FairyGen: Storied Cartoon Video from a Single Child-Drawn Character
by: Zheng, Jiayi, et al.
Published: (2025)
Releasing the Parameter Latency of Neural Representation for High-Efficiency Video Compression
by: Zhang, Gai, et al.
Published: (2024)
On the Robustness of Human-Object Interaction Detection against Distribution Shift
by: Xie, Chi, et al.
Published: (2025)
Bridging the Pose-Semantic Gap: A Cascade Framework for Text-Based Person Anomaly Search
by: Xie, Zequn, et al.
Published: (2026)