| Main Authors: | Zhang, Yutian; Pei, Zhongyi; Mao, Yi; Wang, Chen; Liu, Lin; Wang, Jianmin |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.01528 |
Similar Items
Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation
by: Wu, Bin, et al.
Published: (2026)
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling
by: Luo, Yawen, et al.
Published: (2026)
FSM-Net: An Efficient Frequency-Spatial Network for Real-World Deblurring
by: Ly, Vinh-Thuan
Published: (2026)
Autonomous AI-enabled Industrial Sorting Pipeline for Advanced Textile Recycling
by: Spyridis, Yannis, et al.
Published: (2024)
Decoupling Classifier for Boosting Few-shot Object Detection and Instance Segmentation
by: Gao, Bin-Bin, et al.
Published: (2025)
Vision to Geometry: 3D Spatial Memory for Sequential Embodied MLLM Reasoning and Exploration
by: Cai, Zhongyi, et al.
Published: (2025)
Stream-T1: Test-Time Scaling for Streaming Video Generation
by: Tu, Yijing, et al.
Published: (2026)
CurveStream: Boosting Streaming Video Understanding in MLLMs via Curvature-Aware Hierarchical Visual Memory Management
by: Wang, Chao, et al.
Published: (2026)
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
by: Huang, Yubo, et al.
Published: (2025)
From Physics to Foundation Models: A Review of AI-Driven Quantitative Remote Sensing Inversion
by: Yu, Zhenyu, et al.
Published: (2025)
MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models
by: Zhang, Yongshun, et al.
Published: (2025)
IndustryEQA: Pushing the Frontiers of Embodied Question Answering in Industrial Scenarios
by: Li, Yifan, et al.
Published: (2025)
A Multistage Extraction Pipeline for Long Scanned Financial Documents: An Empirical Study in Industrial KYC Workflows
by: Han, Yuxuan, et al.
Published: (2026)
AI-Driven Innovations in Volumetric Video Streaming: A Review
by: Entezami, Erfan, et al.
Published: (2024)
InfVSR: Toward Consistency-Driven Streaming Generative Video Super-Resolution
by: Zhang, Ziqing, et al.
Published: (2025)
Beyond Artifacts: Real-Centric Envelope Modeling for Reliable AI-Generated Image Detection
by: Liu, Ruiqi, et al.
Published: (2025)
DeformStream: Deformation-based Adaptive Volumetric Video Streaming
by: Li, Boyan, et al.
Published: (2024)
Visible-Infrared Person Re-Identification via Patch-Mixed Cross-Modality Learning
by: Qian, Zhihao, et al.
Published: (2023)
CasP: Improving Semi-Dense Feature Matching Pipeline Leveraging Cascaded Correspondence Priors for Guidance
by: Chen, Peiqi, et al.
Published: (2025)
Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation
by: Zhen, Dingcheng, et al.
Published: (2025)
Finding Visual Saliency in Continuous Spike Stream
by: Zhu, Lin, et al.
Published: (2024)
A Reconstruction System for Industrial Pipeline Inner Walls Using Panoramic Image Stitching with Endoscopic Imaging
by: Ma, Rui, et al.
Published: (2026)
Hunyuan3D Studio: End-to-End AI Pipeline for Game-Ready 3D Asset Generation
by: Lei, Biwen, et al.
Published: (2025)
StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling
by: Wei, Meng, et al.
Published: (2025)
MAC-VO: Metrics-aware Covariance for Learning-based Stereo Visual Odometry
by: Qiu, Yuheng, et al.
Published: (2024)
Click-to-Ask: An AI Live Streaming Assistant with Offline Copywriting and Online Interactive QA
by: Yu, Ruizhi, et al.
Published: (2026)
PPBoost: Progressive Prompt Boosting for Text-Driven Medical Image Segmentation
by: Li, Xuchen, et al.
Published: (2025)
Attention Guidance Mechanism for Handwritten Mathematical Expression Recognition
by: Liu, Yutian, et al.
Published: (2024)
AutoIAD: Manager-Driven Multi-Agent Collaboration for Automated Industrial Anomaly Detection
by: Ji, Dongwei, et al.
Published: (2025)
Motion Matters: Compact Gaussian Streaming for Free-Viewpoint Video Reconstruction
by: Chen, Jiacong, et al.
Published: (2025)
Stream Query Denoising for Vectorized HD Map Construction
by: Wang, Shuo, et al.
Published: (2024)
Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
by: Wang, Yiyu, et al.
Published: (2025)
MAC: A Benchmark for Multiple Attributes Compositional Zero-Shot Learning
by: Xu, Shuo, et al.
Published: (2024)
Penalizing Boundary Activation for Object Completeness in Diffusion Models
by: Xu, Haoyang, et al.
Published: (2025)
IS-Diff: Improving Diffusion-Based Inpainting with Better Initial Seed
by: Lyu, Yongzhe, et al.
Published: (2025)
AutoPrompt: Automated Red-Teaming of Text-to-Image Models via LLM-Driven Adversarial Prompts
by: Liu, Yufan, et al.
Published: (2025)
Saliency Driven Imagery Preprocessing for Efficient Compression -- Industrial Paper
by: Downes, Justin, et al.
Published: (2026)
ChartM$^3$: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension
by: Xu, Duo, et al.
Published: (2025)
Industrial Anomaly Detection and Localization Using Weakly-Supervised Residual Transformers
by: Li, Hanxi, et al.
Published: (2023)
Insight: A Multi-Modal Diagnostic Pipeline using LLMs for Ocular Surface Disease Diagnosis
by: Yeh, Chun-Hsiao, et al.
Published: (2024)