| Main Authors: | Chen, Jingzhi, Xu, Lijian |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.18505 |
Similar Items
ZeroSense: How Vision matters in Long Context Compression
by: Gao, Yonghan, et al.
Published: (2026)
Multimodal Model for Computational Pathology: Representation Learning and Image Compression
by: Wu, Peihang, et al.
Published: (2026)
Orchestrating the Symphony of Prompt Distribution Learning for Human-Object Interaction Detection
by: Jia, Mingda, et al.
Published: (2024)
XrayClaw: Cooperative-Competitive Multi-Agent Alignment for Trustworthy Chest X-ray Diagnosis
by: Young, Shawn, et al.
Published: (2026)
HGP-Mamba: Integrating Histology and Generated Protein Features for Mamba-based Multimodal Survival Risk Prediction
by: Dai, Jing, et al.
Published: (2026)
From Static to Dynamic: a Survey of Topology-Aware Perception in Autonomous Driving
by: Chen, Yixiao, et al.
Published: (2025)
CellSymphony: Deciphering the molecular and phenotypic orchestration of cells with single-cell pathomics
by: Acosta, Paul H., et al.
Published: (2025)
Efficient Chest X-ray Representation Learning via Semantic-Partitioned Contrastive Learning
by: Feng, Wangyu, et al.
Published: (2026)
The Model Knows Which Tokens Matter: Automatic Token Selection via Noise Gating
by: He, Landi, et al.
Published: (2026)
Fewer Tokens, Greater Scaling: Self-Adaptive Visual Bases for Efficient and Expansive Representation Learning
by: Young, Shawn, et al.
Published: (2025)
TC-SSA: Token Compression via Semantic Slot Aggregation for Gigapixel Pathology Reasoning
by: Chen, Zhuo, et al.
Published: (2026)
From Static to Dynamic: Exploring Self-supervised Image-to-Video Representation Transfer Learning
by: Liu, Yang, et al.
Published: (2026)
PIR: Photometric Inverse Rendering with Shading Cues Modeling and Surface Reflectance Regularization
by: Bao, Jingzhi, et al.
Published: (2024)
FaceInsight: A Multimodal Large Language Model for Face Perception
by: Li, Jingzhi, et al.
Published: (2025)
From Static to Interactive: Adapting Visual in-Context Learners for User-Driven Tasks
by: Schmidt, Carlos, et al.
Published: (2026)
Enhancing Visual Grounding and Generalization: A Multi-Task Cycle Training Approach for Vision-Language Models
by: Yang, Xiaoyu, et al.
Published: (2023)
Symphony: A Cognitively-Inspired Multi-Agent System for Long-Video Understanding
by: Yan, Haiyang, et al.
Published: (2026)
Static for Dynamic: Towards a Deeper Understanding of Dynamic Facial Expressions Using Static Expression Data
by: Chen, Yin, et al.
Published: (2024)
Mixed Prototype Consistency Learning for Semi-supervised Medical Image Segmentation
by: Li, Lijian
Published: (2024)
Complementarity-driven Representation Learning for Multi-modal Knowledge Graph Completion
by: Li, Lijian
Published: (2025)
Towards Generalized Few-Shot Open-Set Object Detection
by: Su, Binyi, et al.
Published: (2022)
StyleShot: A Snapshot on Any Style
by: Gao, Junyao, et al.
Published: (2024)
Chain of Modality: From Static Fusion to Dynamic Orchestration in Omni-MLLMs
by: Luo, Ziyang, et al.
Published: (2026)
From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos
by: Chen, Yin, et al.
Published: (2023)
Harmonizing Light and Darkness: A Symphony of Prior-guided Data Synthesis and Adaptive Focus for Nighttime Flare Removal
by: Qu, Lishen, et al.
Published: (2024)
Static Scene Reconstruction from Dynamic Egocentric Videos
by: Cui, Qifei, et al.
Published: (2026)
Snapshot: Towards Application-centered Models for Pedestrian Trajectory Prediction in Urban Traffic Environments
by: Uhlemann, Nico, et al.
Published: (2024)
A Deep Unfolding Framework for Diffractive Snapshot Spectral Imaging
by: Zhuge, Zhengyue, et al.
Published: (2025)
Beyond Surrogate Gradients: Fully Differentiable Token Pruning for Vision-Language Models
by: He, Landi, et al.
Published: (2026)
From Prediction to Explanation: Multimodal, Explainable, and Interactive Deepfake Detection Framework for Non-Expert Users
by: Tariq, Shahroz, et al.
Published: (2025)
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors
by: Zhao, Liangbing, et al.
Published: (2026)
Disentangling Static and Dynamic Information for Reducing Static Bias in Action Recognition
by: Kobayashi, Masato, et al.
Published: (2025)
SnapCap: Efficient Snapshot Compressive Video Captioning
by: Sun, Jianqiao, et al.
Published: (2024)
Rethinking Point Clouds as Sequences: A Causal Next-Token Predictive Learning Framework
by: Yao, Yumeng, et al.
Published: (2026)
Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection
by: Chen, Ruoyu, et al.
Published: (2025)
One Snapshot is All You Need: A Generalized Method for mmWave Signal Generation
by: Huang, Teng, et al.
Published: (2025)
Pose-Aware Diffusion for 3D Generation
by: Zhou, Zihan, et al.
Published: (2026)
Deep Optics for Video Snapshot Compressive Imaging
by: Wang, Ping, et al.
Published: (2024)
Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
by: Han, Shuangpeng, et al.
Published: (2024)
Score-based Generative Priors Guided Model-driven Network for MRI Reconstruction
by: Qiao, Xiaoyu, et al.
Published: (2024)