| Main Authors: | Sun, Qichen, Guo, Zhengrui, Peng, Rui, Chen, Hao, Wang, Jinzhuo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.12711 |
Similar Items
FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification
by: Guo, Zhengrui, et al.
Published: (2024)
Context Matters: Query-aware Dynamic Long Sequence Modeling of Gigapixel Images
by: Guo, Zhengrui, et al.
Published: (2025)
Animate Any Character in Any World
by: Wang, Yitong, et al.
Published: (2025)
AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario
by: Li, Yuhan, et al.
Published: (2024)
AnyMo: Scaling Any-Modality Conditional Motion Generation with Masked Modeling
by: Li, Yiheng, et al.
Published: (2026)
AnySR: Realizing Image Super-Resolution as Any-Scale, Any-Resource
by: Zhan, Wengyi, et al.
Published: (2024)
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation
by: Wu, Shengqiong, et al.
Published: (2025)
Any-to-Any Vision-Language Model for Multimodal X-ray Imaging and Radiological Report Generation
by: Molino, Daniele, et al.
Published: (2025)
AnyTrans: Translate AnyText in the Image with Large Scale Models
by: Qian, Zhipeng, et al.
Published: (2024)
Depth Anything at Any Condition
by: Sun, Boyuan, et al.
Published: (2025)
Unaligning Everything: Or Aligning Any Text to Any Image in Multimodal Models
by: Salman, Shaeke, et al.
Published: (2024)
AnyTSR: Any-Scale Thermal Super-Resolution for UAV
by: Li, Mengyuan, et al.
Published: (2025)
AnyTop: Character Animation Diffusion with Any Topology
by: Gat, Inbar, et al.
Published: (2025)
Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera
by: Guo, Yuliang, et al.
Published: (2025)
Segment Any Anomaly without Training via Hybrid Prompt Regularization
by: Cao, Yunkang, et al.
Published: (2023)
Recognize Any Regions
by: Yang, Haosen, et al.
Published: (2023)
AnyPattern: Towards In-context Image Copy Detection
by: Wang, Wenhao, et al.
Published: (2024)
Any4D: Open-Prompt 4D Generation from Natural Language and Images
by: Li, Hao, et al.
Published: (2025)
X2SAM: Any Segmentation in Images and Videos
by: Wang, Hao, et al.
Published: (2026)
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
by: Gu, Yuchao, et al.
Published: (2026)
Compress Any Segment Anything Model (SAM)
by: Fan, Juntong, et al.
Published: (2025)
Neural Gaffer: Relighting Any Object via Diffusion
by: Jin, Haian, et al.
Published: (2024)
Any2Any 3D Diffusion Models with Knowledge Transfer: A Radiotherapy Planning Study
by: Wang, Yuhan, et al.
Published: (2026)
RARE: Refine Any Registration of Pairwise Point Clouds via Zero-Shot Learning
by: Zheng, Chengyu, et al.
Published: (2025)
X-SAM: From Segment Anything to Any Segmentation
by: Wang, Hao, et al.
Published: (2025)
Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection
by: Sun, Youheng, et al.
Published: (2024)
Generative Unlearning for Any Identity
by: Seo, Juwon, et al.
Published: (2024)
NExT-OMNI: Towards Any-to-Any Omnimodal Foundation Models with Discrete Flow Matching
by: Luo, Run, et al.
Published: (2025)
One Model to Translate Them All: Universal Any-to-Any Translation for Heterogeneous Collaborative Perception
by: Li, Yang, et al.
Published: (2026)
Label-Efficient Deep Learning in Medical Image Analysis: Challenges and Future Directions
by: Jin, Cheng, et al.
Published: (2023)
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
by: Wang, Haochen, et al.
Published: (2025)
AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks
by: Ku, Max, et al.
Published: (2024)
Any-to-Bokeh: Arbitrary-Subject Video Refocusing with Video Diffusion Model
by: Yang, Yang, et al.
Published: (2025)
Navigation with VLM framework: Towards Going to Any Language
by: Yin, Zecheng, et al.
Published: (2024)
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
by: Zhan, Jun, et al.
Published: (2024)
ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning
by: Zhao, Yuan, et al.
Published: (2026)
MAGREF: Masked Guidance for Any-Reference Video Generation with Subject Disentanglement
by: Deng, Yufan, et al.
Published: (2025)
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
by: Bachmann, Roman, et al.
Published: (2024)
ATI: Any Trajectory Instruction for Controllable Video Generation
by: Wang, Angtian, et al.
Published: (2025)
Discovering Pathology Rationale and Token Allocation for Efficient Multimodal Pathology Reasoning
by: Xu, Zhe, et al.
Published: (2025)