:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Jingzhi; Xu, Lijian
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.18505

Similar Items

ZeroSense: How Vision matters in Long Context Compression
by: Gao, Yonghan, et al.
Published: (2026)

Multimodal Model for Computational Pathology: Representation Learning and Image Compression
by: Wu, Peihang, et al.
Published: (2026)

Orchestrating the Symphony of Prompt Distribution Learning for Human-Object Interaction Detection
by: Jia, Mingda, et al.
Published: (2024)

XrayClaw: Cooperative-Competitive Multi-Agent Alignment for Trustworthy Chest X-ray Diagnosis
by: Young, Shawn, et al.
Published: (2026)

HGP-Mamba: Integrating Histology and Generated Protein Features for Mamba-based Multimodal Survival Risk Prediction
by: Dai, Jing, et al.
Published: (2026)

From Static to Dynamic: a Survey of Topology-Aware Perception in Autonomous Driving
by: Chen, Yixiao, et al.
Published: (2025)

CellSymphony: Deciphering the molecular and phenotypic orchestration of cells with single-cell pathomics
by: Acosta, Paul H., et al.
Published: (2025)

Efficient Chest X-ray Representation Learning via Semantic-Partitioned Contrastive Learning
by: Feng, Wangyu, et al.
Published: (2026)

The Model Knows Which Tokens Matter: Automatic Token Selection via Noise Gating
by: He, Landi, et al.
Published: (2026)

Fewer Tokens, Greater Scaling: Self-Adaptive Visual Bases for Efficient and Expansive Representation Learning
by: Young, Shawn, et al.
Published: (2025)

TC-SSA: Token Compression via Semantic Slot Aggregation for Gigapixel Pathology Reasoning
by: Chen, Zhuo, et al.
Published: (2026)

From Static to Dynamic: Exploring Self-supervised Image-to-Video Representation Transfer Learning
by: Liu, Yang, et al.
Published: (2026)

PIR: Photometric Inverse Rendering with Shading Cues Modeling and Surface Reflectance Regularization
by: Bao, Jingzhi, et al.
Published: (2024)

FaceInsight: A Multimodal Large Language Model for Face Perception
by: Li, Jingzhi, et al.
Published: (2025)

From Static to Interactive: Adapting Visual in-Context Learners for User-Driven Tasks
by: Schmidt, Carlos, et al.
Published: (2026)

Enhancing Visual Grounding and Generalization: A Multi-Task Cycle Training Approach for Vision-Language Models
by: Yang, Xiaoyu, et al.
Published: (2023)

Symphony: A Cognitively-Inspired Multi-Agent System for Long-Video Understanding
by: Yan, Haiyang, et al.
Published: (2026)

Static for Dynamic: Towards a Deeper Understanding of Dynamic Facial Expressions Using Static Expression Data
by: Chen, Yin, et al.
Published: (2024)

Mixed Prototype Consistency Learning for Semi-supervised Medical Image Segmentation
by: Li, Lijian
Published: (2024)

Complementarity-driven Representation Learning for Multi-modal Knowledge Graph Completion
by: Li, Lijian
Published: (2025)

Towards Generalized Few-Shot Open-Set Object Detection
by: Su, Binyi, et al.
Published: (2022)

StyleShot: A Snapshot on Any Style
by: Gao, Junyao, et al.
Published: (2024)

Chain of Modality: From Static Fusion to Dynamic Orchestration in Omni-MLLMs
by: Luo, Ziyang, et al.
Published: (2026)

From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos
by: Chen, Yin, et al.
Published: (2023)

Harmonizing Light and Darkness: A Symphony of Prior-guided Data Synthesis and Adaptive Focus for Nighttime Flare Removal
by: Qu, Lishen, et al.
Published: (2024)

Static Scene Reconstruction from Dynamic Egocentric Videos
by: Cui, Qifei, et al.
Published: (2026)

Snapshot: Towards Application-centered Models for Pedestrian Trajectory Prediction in Urban Traffic Environments
by: Uhlemann, Nico, et al.
Published: (2024)

A Deep Unfolding Framework for Diffractive Snapshot Spectral Imaging
by: Zhuge, Zhengyue, et al.
Published: (2025)

Beyond Surrogate Gradients: Fully Differentiable Token Pruning for Vision-Language Models
by: He, Landi, et al.
Published: (2026)

From Prediction to Explanation: Multimodal, Explainable, and Interactive Deepfake Detection Framework for Non-Expert Users
by: Tariq, Shahroz, et al.
Published: (2025)

From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors
by: Zhao, Liangbing, et al.
Published: (2026)

Disentangling Static and Dynamic Information for Reducing Static Bias in Action Recognition
by: Kobayashi, Masato, et al.
Published: (2025)

SnapCap: Efficient Snapshot Compressive Video Captioning
by: Sun, Jianqiao, et al.
Published: (2024)

Rethinking Point Clouds as Sequences: A Causal Next-Token Predictive Learning Framework
by: Yao, Yumeng, et al.
Published: (2026)

Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection
by: Chen, Ruoyu, et al.
Published: (2025)

One Snapshot is All You Need: A Generalized Method for mmWave Signal Generation
by: Huang, Teng, et al.
Published: (2025)

Pose-Aware Diffusion for 3D Generation
by: Zhou, Zihan, et al.
Published: (2026)

Deep Optics for Video Snapshot Compressive Imaging
by: Wang, Ping, et al.
Published: (2024)

Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
by: Han, Shuangpeng, et al.
Published: (2024)

Score-based Generative Priors Guided Model-driven Network for MRI Reconstruction
by: Qiao, Xiaoyu, et al.
Published: (2024)