Library Catalog

Bibliographic Details
Main Authors:	Liu, Xiaoyi; Tang, Hao
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2504.18040

Similar Items

FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution
by: Liu, Xiaoyi, et al.
Published: (2025)

DiffFNO: Diffusion Fourier Neural Operator
by: Liu, Xiaoyi, et al.
Published: (2024)

UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
by: Tang, Hao, et al.
Published: (2025)

PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation
by: Xu, Jinfeng, et al.
Published: (2024)

Deep Learning in Medical Image Classification from MRI-based Brain Tumor Images
by: Liu, Xiaoyi, et al.
Published: (2024)

GVKF: Gaussian Voxel Kernel Functions for Highly Efficient Surface Reconstruction in Open Scenes
by: Song, Gaochao, et al.
Published: (2024)

A PCA Based Model for Surface Reconstruction from Incomplete Point Clouds
by: Liu, Hao
Published: (2025)

physfusion: A Transformer-based Dual-Stream Radar and Vision Fusion Framework for Open Water Surface Object Detection
by: Wan, Yuting, et al.
Published: (2026)

BaryIR: Learning Multi-Source Unified Representation in Continuous Barycenter Space for Generalizable All-in-One Image Restoration
by: Tang, Xiaole, et al.
Published: (2025)

UniLiP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing
by: Tang, Hao, et al.
Published: (2025)

DynImg: Key Frames with Visual Prompts are Good Representation for Multi-Modal Video Understanding
by: Bao, Xiaoyi, et al.
Published: (2025)

OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery
by: Inkawhich, Matthew, et al.
Published: (2024)

GMC: A General Framework of Multi-stage Context Learning and Utilization for Visual Detection Tasks
by: Wang, Xuan, et al.
Published: (2024)

Differential Coding for Training-Free ANN-to-SNN Conversion
by: Huang, Zihan, et al.
Published: (2025)

Degradation-Aware Residual-Conditioned Optimal Transport for Unified Image Restoration
by: Tang, Xiaole, et al.
Published: (2024)

Learning Continuous Wasserstein Barycenter Space for Generalized All-in-One Image Restoration
by: Tang, Xiaole, et al.
Published: (2026)

One Pixel is All I Need
by: Siqin, Deng, et al.
Published: (2024)

VeraRetouch: A Lightweight Fully Differentiable Framework for Multi-Task Reasoning Photo Retouching
by: Guo, Yihong, et al.
Published: (2026)

XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation
by: Li, Xiang, et al.
Published: (2024)

PhysRig: Differentiable Physics-Based Skinning and Rigging Framework for Realistic Articulated Object Modeling
by: Zhang, Hao, et al.
Published: (2025)

Towards Training-free Open-world Segmentation via Image Prompt Foundation Models
by: Tang, Lv, et al.
Published: (2023)

Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration
by: Zeng, Fanhu, et al.
Published: (2025)

UniVid: The Open-Source Unified Video Model
by: Luo, Jiabin, et al.
Published: (2025)

CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation
by: Zhang, Dengke, et al.
Published: (2024)

Open-Det: An Efficient Learning Framework for Open-Ended Detection
by: Cao, Guiping, et al.
Published: (2025)

OpenConstruction: A Systematic Synthesis of Open Visual Datasets for Data-Centric Artificial Intelligence in Construction Monitoring
by: Xiong, Ruoxin, et al.
Published: (2025)

Impromptu VLA: Open Weights and Open Data for Driving Vision-Language-Action Models
by: Chi, Haohan, et al.
Published: (2025)

ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation
by: Xuan, Xiwei, et al.
Published: (2025)

An Efficient Streaming Video Understanding Framework with Agentic Control
by: Liu, Jinming, et al.
Published: (2026)

Open-Vocabulary Segmentation with Semantic-Assisted Calibration
by: Liu, Yong, et al.
Published: (2023)

DiffCD: A Symmetric Differentiable Chamfer Distance for Neural Implicit Surface Fitting
by: Härenstam-Nielsen, Linus, et al.
Published: (2024)

calibfusion: Transformer-Based Differentiable Calibration for Radar-Camera Fusion Detection in Water-Surface Environments
by: Wan, Yuting, et al.
Published: (2026)

StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
by: Tao, Ming, et al.
Published: (2024)

Follow Your Motion: A Generic Temporal Consistency Portrait Editing Framework with Trajectory Guidance
by: Yang, Haijie, et al.
Published: (2025)

Aria: An Open Multimodal Native Mixture-of-Experts Model
by: Li, Dongxu, et al.
Published: (2024)

Open-World Test-Time Adaptation with Hierarchical Feature Aggregation and Attention Affine
by: Liu, Ziqiong, et al.
Published: (2025)

UADet: A Remarkably Simple Yet Effective Uncertainty-Aware Open-Set Object Detection Framework
by: Cheng, Silin, et al.
Published: (2024)

Recall and Refine: A Simple but Effective Source-free Open-set Domain Adaptation Framework
by: Nejjar, Ismail, et al.
Published: (2024)

Thermal Detection of People with Mobility Restrictions for Barrier Reduction at Traffic Lights Controlled Intersections
by: Ni, Xiao, et al.
Published: (2025)

Query-Based Knowledge Sharing for Open-Vocabulary Multi-Label Classification
by: Zhu, Xuelin, et al.
Published: (2024)