| Main Authors: | Guo, Yihang; Yu, Tianyuan; Bai, Liang; Guo, Yanming; Ruan, Yirun; Li, William; Zheng, Weishi |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.23915 |
Similar Items
COLA: Context-aware Language-driven Test-time Adaptation
by: Zhang, Aiming, et al.
Published: (2025)
Multimodal Multilabel Classification by CLIP
by: Guo, Yanming
Published: (2024)
MMVIAD: Multi-view Multi-task Video Understanding for Industrial Anomaly Detection
by: Zhao, Xiran, et al.
Published: (2026)
Synergistic Multiscale Detail Refinement via Intrinsic Supervision for Underwater Image Enhancement
by: Zhang, Dehuan, et al.
Published: (2023)
Anisotropic Diffusion Probabilistic Model for Imbalanced Image Classification
by: Kong, Jingyu, et al.
Published: (2024)
MoE3D: Mixture of Experts meets Multi-Modal 3D Understanding
by: Li, Yu, et al.
Published: (2025)
Diagnosing Shortcut-Induced Rigidity in Continual Learning: The Einstellung Rigidity Index (ERI)
by: Gu, Kai, et al.
Published: (2025)
FoCLIP: A Feature-Space Misalignment Framework for CLIP-Based Image Manipulation and Detection
by: Chen, Yulin, et al.
Published: (2025)
Deep Learning for Visual Speech Analysis: A Survey
by: Sheng, Changchong, et al.
Published: (2022)
MMAIF: Multi-task and Multi-degradation All-in-One for Image Fusion with Language Guidance
by: Cao, Zihan, et al.
Published: (2025)
FRUC: Feedforward Dynamic Scene Reconstruction from Uncalibrated Collaborative Driving Views
by: Tao, Yihang, et al.
Published: (2026)
Bayesian Evidential Learning for Few-Shot Classification
by: Linghu, Xiongkun, et al.
Published: (2022)
FocalCount: Towards Class-Count Imbalance in Class-Agnostic Counting
by: Zhu, Huilin, et al.
Published: (2025)
MAGE: A Multi-task Architecture for Gaze Estimation with an Efficient Calibration Module
by: Huang, Haoming, et al.
Published: (2025)
V2XCrafter: Learning to Generate Driving Scene Across Agents
by: Tao, Yihang, et al.
Published: (2026)
Beyond Mamba: Enhancing State-space Models with Deformable Dilated Convolutions for Multi-scale Traffic Object Detection
by: Li, Jun, et al.
Published: (2026)
Body Segmentation Using Multi-task Learning
by: Jug, Julijan, et al.
Published: (2022)
Revisiting Photometric Ambiguity for Accurate Gaussian-Splatting Surface Reconstruction
by: Li, Jiahe, et al.
Published: (2026)
AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning
by: Yang, Enneng, et al.
Published: (2022)
AMSP-UOD: When Vortex Convolution and Stochastic Perturbation Meet Underwater Object Detection
by: Zhou, Jingchun, et al.
Published: (2023)
mTREE: Multi-Level Text-Guided Representation End-to-End Learning for Whole Slide Image Analysis
by: Liu, Quan, et al.
Published: (2024)
PAUL: Uncertainty-Guided Partition and Augmentation for Robust Cross-View Geo-Localization under Noisy Correspondence
by: Li, Zheng, et al.
Published: (2025)
Co-Training Vision Language Models for Remote Sensing Multi-task Learning
by: Li, Qingyun, et al.
Published: (2025)
On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy
by: Huang, Letian, et al.
Published: (2024)
MAC-Lookup: Multi-Axis Conditional Lookup Model for Underwater Image Enhancement
by: Yi, Fanghai, et al.
Published: (2025)
Revisiting Face Forgery Detection: From Facial Representation to Forgery Detection
by: Guo, Zonghui, et al.
Published: (2024)
LCGC: Learning from Consistency Gradient Conflicting for Class-Imbalanced Semi-Supervised Debiasing
by: Xing, Weiwei, et al.
Published: (2025)
OmniCamera: A Unified Framework for Multi-task Video Generation with Arbitrary Camera Control
by: Wang, Yukun, et al.
Published: (2026)
Multi-Task Learning for Robot Perception with Imbalanced Data
by: Erkent, Ozgur
Published: (2026)
Diffusion-based Visual Anagram as Multi-task Learning
by: Xu, Zhiyuan, et al.
Published: (2024)
Multi-task Learning For Joint Action and Gesture Recognition
by: Spathis, Konstantinos, et al.
Published: (2025)
Video-based Generalized Category Discovery via Memory-Guided Consistency-Aware Contrastive Learning
by: Jing, Zhang, et al.
Published: (2025)
MultiMatch: Multi-task Learning for Semi-supervised Domain Generalization
by: Qi, Lei, et al.
Published: (2022)
HSFusion: A high-level vision task-driven infrared and visible image fusion network via semantic and geometric domain transformation
by: Jiang, Chengjie, et al.
Published: (2024)
Toward Safe, Trustworthy and Realistic Augmented Reality User Experience
by: Xiu, Yanming
Published: (2025)
Self-Supervised Multi-Scale Network for Blind Image Deblurring via Alternating Optimization
by: Guo, Lening, et al.
Published: (2024)
CLIMD: A Curriculum Learning Framework for Imbalanced Multimodal Diagnosis
by: Han, Kai, et al.
Published: (2025)
On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection
by: Song, Xiufeng, et al.
Published: (2024)
Towards Metric-Aware Multi-Person Mesh Recovery by Jointly Optimizing Human Crowd in Camera Space
by: Wang, Kaiwen, et al.
Published: (2025)
Transparent Fragments Contour Estimation via Visual-Tactile Fusion for Autonomous Reassembly
by: Lin, Qihao, et al.
Published: (2026)