| Main Authors: | Lei, Chenyang, Chen, Liyi, Cen, Jun, Chen, Xiao, Lei, Zhen, Heide, Felix, Chen, Qifeng, Zhang, Zhaoxiang |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.18669 |
Similar Items
SimMAT: Exploring Transferability from Vision Foundation Models to Any Image Modality
by: Lei, Chenyang, et al.
Published: (2024)
Robust Depth Enhancement via Polarization Prompt Fusion Tuning
by: Ikemura, Kei, et al.
Published: (2024)
General Geometry-aware Weakly Supervised 3D Object Detection
by: Zhang, Guowen, et al.
Published: (2024)
SimMLM: A Simple Framework for Multi-modal Learning with Missing Modality
by: Li, Sijie, et al.
Published: (2025)
FIRM: Flexible Interactive Reflection reMoval
by: Chen, Xiao, et al.
Published: (2024)
Adaptive Domain Learning for Cross-domain Image Denoising
by: Qian, Zian, et al.
Published: (2024)
Automatic Controllable Colorization via Imagination
by: Cong, Xiaoyan, et al.
Published: (2024)
BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection
by: Zhang, Guowen, et al.
Published: (2025)
FreeTuner: Any Subject in Any Style with Training-free Diffusion
by: Xu, Youcan, et al.
Published: (2024)
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention
by: Chen, Zuyao, et al.
Published: (2023)
GPT4SGG: Synthesizing Scene Graphs from Holistic and Region-specific Narratives
by: Chen, Zuyao, et al.
Published: (2023)
Polarization Wavefront Lidar: Learning Large Scene Reconstruction from Polarized Wavefronts
by: Scheuble, Dominik, et al.
Published: (2024)
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
by: Chen, Jiankang, et al.
Published: (2025)
Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution
by: Chen, Du, et al.
Published: (2025)
Simple bots breed social punishment in humans
by: Shen, Chen, et al.
Published: (2022)
A Few-Shot Metric Learning Method with Dual-Channel Attention for Cross-Modal Same-Neuron Identification
by: Li, Wenwei, et al.
Published: (2025)
Large Motion Video Autoencoding with Cross-modal Video VAE
by: Xing, Yazhou, et al.
Published: (2024)
Cross-Modal Obfuscation for Jailbreak Attacks on Large Vision-Language Models
by: Jiang, Lei, et al.
Published: (2025)
Instruction-based Image Editing with Planning, Reasoning, and Generation
by: Ji, Liya, et al.
Published: (2026)
Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion
by: Zhang, Yichi, et al.
Published: (2024)
Cross-Organ and Cross-Scanner Adenocarcinoma Segmentation using Rein to Fine-tune Vision Foundation Models
by: Cai, Pengzhou, et al.
Published: (2024)
Using Left and Right Brains Together: Towards Vision and Language Planning
by: Cen, Jun, et al.
Published: (2024)
Data Selection for Fine-tuning Vision Language Models via Cross Modal Alignment Trajectories
by: Naharas, Nilay, et al.
Published: (2025)
Deep Class-guided Hashing for Multi-label Cross-modal Retrieval
by: Chen, Hao, et al.
Published: (2024)
Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets
by: Hsiung, Lei, et al.
Published: (2025)
AnyECG-Lab: An Exploration Study of Fine-tuning an ECG Foundation Model to Estimate Laboratory Values from Single-Lead ECG Signals
by: Xiao, Yujie, et al.
Published: (2025)
Multi-modal Reasoning with LLMs for Visual Semantic Arithmetic
by: Xu, Chuou, et al.
Published: (2026)
MDReID: Modality-Decoupled Learning for Any-to-Any Multi-Modal Object Re-Identification
by: Feng, Yingying, et al.
Published: (2025)
RA-CMF: Region-Adaptive Conditional MeanFlow for CT Image Reconstruction
by: Apurba, Md Shifatul Ahsan, et al.
Published: (2026)
DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers
by: Chen, Yuntao, et al.
Published: (2024)
Search to Fine-tune Pre-trained Graph Neural Networks for Graph-level Tasks
by: Wang, Zhili, et al.
Published: (2023)
CMF-IoU: Multi-Stage Cross-Modal Fusion 3D Object Detection with IoU Joint Prediction
by: Ning, Zhiwei, et al.
Published: (2025)
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
by: Yang, Kai, et al.
Published: (2023)
Fine-tuning an ECG Foundation Model to Predict Coronary CT Angiography Outcomes
by: Xiao, Yujie, et al.
Published: (2025)
SkySense V2: A Unified Foundation Model for Multi-modal Remote Sensing
by: Zhang, Yingying, et al.
Published: (2025)
As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss
by: Mao, Xin, et al.
Published: (2024)
Cross-modality Attention Adapter: A Glioma Segmentation Fine-tuning Method for SAM Using Multimodal Brain MR Images
by: Shi, Xiaoyu, et al.
Published: (2023)
MCRPL: A Pretrain, Prompt & Fine-tune Paradigm for Non-overlapping Many-to-one Cross-domain Recommendation
by: Liu, Hao, et al.
Published: (2024)
VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Vision Backbones
by: Shen, Lefei, et al.
Published: (2025)
DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer
by: Ma, Zhiyuan, et al.
Published: (2024)