| Main Authors: | Ruan, Chi; Zhao, Jiying; Chen, Wenhu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.20622 |
Similar Items
RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection
by: Chen, Fangyi, et al.
Published: (2024)
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design
by: Schneider, Benjamin, et al.
Published: (2025)
Real-Time Oriented Object Detection Transformer in Remote Sensing Images
by: Ding, Zeyu, et al.
Published: (2026)
LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection
by: Chen, Qiang, et al.
Published: (2024)
RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer
by: Lv, Wenyu, et al.
Published: (2024)
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers
by: Ren, Weiming, et al.
Published: (2025)
RF-DETR: Neural Architecture Search for Real-Time Detection Transformers
by: Robinson, Isaac, et al.
Published: (2025)
Livatar-1: Real-Time Talking Heads Generation with Tailored Flow Matching
by: Liu, Haiyang, et al.
Published: (2025)
YOLO-IOD: Towards Real Time Incremental Object Detection
by: Zhang, Shizhou, et al.
Published: (2025)
UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model
by: Jankovic, Branislava, et al.
Published: (2025)
Le-DETR: Revisiting Real-Time Detection Transformer with Efficient Encoder Design
by: Huang, Jiannan, et al.
Published: (2026)
PixelWorld: How Far Are We from Perceiving Everything as Pixels?
by: Lyu, Zhiheng, et al.
Published: (2025)
Real-Time Deepfake Detection in the Real-World
by: Cavia, Bar, et al.
Published: (2024)
YOLOv10: Real-Time End-to-End Object Detection
by: Wang, Ao, et al.
Published: (2024)
Learning Motion Blur Robust Vision Transformers for Real-Time UAV Tracking
by: Wu, You, et al.
Published: (2024)
Real-Time 3D Object Detection with Inference-Aligned Learning
by: Zhao, Chenyu, et al.
Published: (2025)
A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement
by: Wen, Junjie, et al.
Published: (2024)
CF-DETR: Coarse-to-Fine Transformer for Real-Time Object Detection
by: Shin, Woojin, et al.
Published: (2025)
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
by: Pan, Xichen, et al.
Published: (2023)
ABC: Achieving Better Control of Multimodal Embeddings using VLMs
by: Schneider, Benjamin, et al.
Published: (2025)
Real-Time Detection of Electronic Components in Waste Printed Circuit Boards: A Transformer-Based Approach
by: Mohsin, Muhammad, et al.
Published: (2024)
Learning an Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking
by: Wu, You, et al.
Published: (2024)
KV-Tracker: Real-Time Pose Tracking with Transformers
by: Taher, Marwan, et al.
Published: (2025)
RealCam: Real-Time Novel-View Video Generation with Interactive Camera Control
by: Xu, Youcan, et al.
Published: (2026)
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation
by: Ren, Weiming, et al.
Published: (2024)
RT-DETRv4: Painlessly Furthering Real-Time Object Detection with Vision Foundation Models
by: Liao, Zijun, et al.
Published: (2025)
Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head
by: Zhao, Tiancheng, et al.
Published: (2024)
DiT as Real-Time Rerenderer: Streaming Video Stylization with Autoregressive Diffusion Transformer
by: Lyu, Hengye, et al.
Published: (2026)
Context Forcing: Consistent Autoregressive Video Generation with Long Context
by: Chen, Shuo, et al.
Published: (2026)
When Every Millisecond Counts: Real-Time Anomaly Detection via the Multimodal Asynchronous Hybrid Network
by: Xiao, Dong, et al.
Published: (2025)
RTMap: Real-Time Recursive Mapping with Change Detection and Localization
by: Du, Yuheng, et al.
Published: (2025)
Real-Time Indoor Object Detection based on hybrid CNN-Transformer Approach
by: Laidoudi, Salah Eddine, et al.
Published: (2024)
UniVideo: Unified Understanding, Generation, and Editing for Videos
by: Wei, Cong, et al.
Published: (2025)
Test-Time Intensity Consistency Adaptation for Shadow Detection
by: Zhu, Leyi, et al.
Published: (2024)
Style-Adaptive Detection Transformer for Single-Source Domain Generalized Object Detection
by: Han, Jianhong, et al.
Published: (2025)
Helios: Real Real-Time Long Video Generation Model
by: Yuan, Shenghai, et al.
Published: (2026)
CogDoc: Towards Unified thinking in Documents
by: Xu, Qixin, et al.
Published: (2025)
Starve to Perceive: Taming Lazy Perception in VLMs with Constrained Visual Bandwidth
by: Wu, Yuhuan, et al.
Published: (2026)
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation
by: Ren, Weiming, et al.
Published: (2024)
ROMA: Run-Time Object Detection To Maximize Real-Time Accuracy
by: Lee, JunKyu, et al.
Published: (2022)