| Main Authors: | Wang, Jinghao, He, Qiyuan, Gu, Chunbin, Heng, Pheng-Ann |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.13179 |
Similar Items
REAR: Rethinking Visual Autoregressive Models via Generator-Tokenizer Consistency Regularization
by: He, Qiyuan, et al.
Published: (2025)
AID: Attention Interpolation of Text-to-Image Diffusion
by: He, Qiyuan, et al.
Published: (2024)
Memory-Efficient Prompt Tuning for Incremental Histopathology Classification
by: Zhu, Yu, et al.
Published: (2024)
Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting
by: Li, Wei, et al.
Published: (2024)
VPG: Visual Prefix Guidance for Autoregressive Image and Video Generation
by: Liao, Xinyao, et al.
Published: (2026)
Unifying Physically-Informed Weather Priors in A Single Model for Image Restoration Across Multiple Adverse Weather Conditions
by: Xu, Jiaqi, et al.
Published: (2026)
Tri-modal Confluence with Temporal Dynamics for Scene Graph Generation in Operating Rooms
by: Guo, Diandian, et al.
Published: (2024)
Towards Synchronous Memorizability and Generalizability with Site-Modulated Diffusion Replay for Cross-Site Continual Segmentation
by: Xu, Dunyuan, et al.
Published: (2024)
Deep Omni-supervised Learning for Rib Fracture Detection from Chest Radiology Images
by: Chai, Zhizhong, et al.
Published: (2023)
Unveiling Deep Shadows: A Survey and Benchmark on Image and Video Shadow Detection, Removal, and Generation in the Deep Learning Era
by: Hu, Xiaowei, et al.
Published: (2024)
AR-RAG: Autoregressive Retrieval Augmentation for Image Generation
by: Qi, Jingyuan, et al.
Published: (2025)
UniHOPE: A Unified Approach for Hand-Only and Hand-Object Pose Estimation
by: Wang, Yinqiao, et al.
Published: (2025)
SiMA-Hand: Boosting 3D Hand-Mesh Reconstruction by Single-to-Multi-View Adaptation
by: Wang, Yinqiao, et al.
Published: (2024)
Unlocking Positive Transfer in Incrementally Learning Surgical Instruments: A Self-reflection Hierarchical Prompt Framework
by: Zhu, Yu, et al.
Published: (2026)
Tiny-Engram: Trigger-Indexed Concept Tables for Generative Vision
by: Cai, Runyuan, et al.
Published: (2026)
Cross Prompting Consistency with Segment Anything Model for Semi-supervised Medical Image Segmentation
by: Miao, Juzheng, et al.
Published: (2024)
G2Face: High-Fidelity Reversible Face Anonymization via Generative and Geometric Priors
by: Yang, Haoxin, et al.
Published: (2024)
Cross-modality Guidance-aided Multi-modal Learning with Dual Attention for MRI Brain Tumor Grading
by: Xu, Dunyuan, et al.
Published: (2024)
S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR
by: Pei, Jialun, et al.
Published: (2024)
Conceptrol: Concept Control of Zero-shot Personalized Image Generation
by: He, Qiyuan, et al.
Published: (2025)
SurgLQA: Scalable Long-Horizon Surgical Video Question Answering
by: Guo, Diandian, et al.
Published: (2026)
Surgical Workflow Recognition and Blocking Effectiveness Detection in Laparoscopic Liver Resections with Pringle Maneuver
by: Guo, Diandian, et al.
Published: (2024)
From Learning to Unlearning: Biomedical Security Protection in Multimodal Large Language Models
by: Xu, Dunyuan, et al.
Published: (2025)
MS2Mesh-XR: Multi-modal Sketch-to-Mesh Generation in XR Environments
by: Tong, Yuqi, et al.
Published: (2024)
Multi-scale Spatio-temporal Transformer-based Imbalanced Longitudinal Learning for Glaucoma Forecasting from Irregular Time Series Images
by: Yang, Xikai, et al.
Published: (2024)
Autoregressive Image Generation without Vector Quantization
by: Li, Tianhong, et al.
Published: (2024)
ImageFolder: Autoregressive Image Generation with Folded Tokens
by: Li, Xiang, et al.
Published: (2024)
Benchmarking Endoscopic Surgical Image Restoration and Beyond
by: Pei, Jialun, et al.
Published: (2025)
Towards Real-World Adverse Weather Image Restoration: Enhancing Clearness and Semantics with Vision-Language Models
by: Xu, Jiaqi, et al.
Published: (2024)
Topology-Constrained Learning for Efficient Laparoscopic Liver Landmark Detection
by: Cui, Ruize, et al.
Published: (2025)
ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both
by: Guo, Ziyu, et al.
Published: (2026)
Creating Virtual Environments with 3D Gaussian Splatting: A Comparative Study
by: Qiu, Shi, et al.
Published: (2025)
Advancing Extended Reality with 3D Gaussian Splatting: Innovations and Prospects
by: Qiu, Shi, et al.
Published: (2024)
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
by: Liu, Jiaming, et al.
Published: (2025)
Medical Large Vision Language Models with Multi-Image Visual Ability
by: Yang, Xikai, et al.
Published: (2025)
SceneDecorator: Towards Scene-Oriented Story Generation with Scene Planning and Scene Consistency
by: Song, Quanjian, et al.
Published: (2025)
Adapting 2D Multi-Modal Large Language Model for 3D CT Image Analysis
by: Yu, Yang, et al.
Published: (2026)
Revisiting Shadow Detection: A New Benchmark Dataset for Complex World
by: Hu, Xiaowei, et al.
Published: (2019)
A Narrative Review of Image Processing Techniques Related to Prostate Ultrasound
by: Wang, Haiqiao, et al.
Published: (2024)
Video Instance Shadow Detection Under the Sun and Sky
by: Xing, Zhenghao, et al.
Published: (2022)