| Main Authors: | Liao, Ning; Zhang, Shaofeng; Xia, Renqiu; Cao, Min; Qiao, Yu; Yan, Junchi |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2310.06594 |
Similar Items
M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios
by: Liao, Ning, et al.
Published: (2023)
PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
by: Zhang, Xiangdong, et al.
Published: (2024)
Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views
by: Zhang, Xiangdong, et al.
Published: (2025)
Learning Adaptive and Temporally Causal Video Tokenization in a 1D Latent Space
by: Li, Yan, et al.
Published: (2025)
Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation
by: Li, Yan, et al.
Published: (2024)
StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding
by: Xia, Renqiu, et al.
Published: (2023)
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning
by: Xia, Renqiu, et al.
Published: (2024)
Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
by: Ye, Hancheng, et al.
Published: (2024)
DriveVGGT: Calibration-Constrained Visual Geometry Transformers for Multi-Camera Autonomous Driving
by: Jia, Xiaosong, et al.
Published: (2025)
EvoTok: A Unified Image Tokenizer via Residual Latent Evolution for Visual Understanding and Generation
by: Li, Yan, et al.
Published: (2026)
ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation
by: Zhang, Bo, et al.
Published: (2023)
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models
by: Zhang, Xiangdong, et al.
Published: (2025)
GeoBench: Rethinking Multimodal Geometric Problem-Solving via Hierarchical Evaluation
by: Feng, Yuan, et al.
Published: (2025)
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
by: Xia, Renqiu, et al.
Published: (2024)
Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds
by: Zhang, Shaofeng, et al.
Published: (2025)
TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning
by: Xie, Jingjing, et al.
Published: (2024)
SPOT: Scalable 3D Pre-training via Occupancy Prediction for Learning Transferable 3D Representations
by: Yan, Xiangchao, et al.
Published: (2023)
Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank
by: Zhang, Shaofeng, et al.
Published: (2025)
Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression
by: Ye, Hancheng, et al.
Published: (2024)
AVION: Aerial Vision-Language Instruction from Offline Teacher to Prompt-Tuned Network
by: Hu, Yu, et al.
Published: (2026)
MedHallTune: An Instruction-Tuning Benchmark for Mitigating Medical Hallucination in Vision-Language Models
by: Yan, Qiao, et al.
Published: (2025)
Continual LLaVA: Continual Instruction Tuning in Large Vision-Language Models
by: Cao, Meng, et al.
Published: (2024)
MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity
by: Liu, Yangzhou, et al.
Published: (2024)
Can Vision Language Models Assess Graphic Design Aesthetics? A Benchmark, Evaluation, and Dataset Perspective
by: An, Arctanx, et al.
Published: (2026)
FineRMoE: Dimension Expansion for Finer-Grained Expert with Its Upcycling Approach
by: Liao, Ning, et al.
Published: (2026)
Resolving Representation Ambiguity in Feedforward Novel View Synthesis Transformer via Semantic-Spatial Decoupling
by: Wu, Yihang, et al.
Published: (2026)
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
by: Zhang, Jinrui, et al.
Published: (2024)
Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach
by: Zhang, Beichen, et al.
Published: (2024)
Instruction-Free Tuning of Large Vision Language Models for Medical Instruction Following
by: Kang, Myeongkyun, et al.
Published: (2026)
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
by: Yang, Shuai, et al.
Published: (2025)
Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach
by: Zhang, Shaofeng, et al.
Published: (2024)
VCM: Vision Concept Modeling Based on Implicit Contrastive Learning with Vision-Language Instruction Fine-Tuning
by: Luo, Run, et al.
Published: (2025)
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
by: Zhao, Xiangyu, et al.
Published: (2025)
Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning
by: Gou, Yunhao, et al.
Published: (2023)
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models
by: Xia, Renqiu, et al.
Published: (2024)
Streaming Video Instruction Tuning
by: Xia, Jiaer, et al.
Published: (2025)
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching
by: Wang, Bin, et al.
Published: (2024)
PointAlign: Feature-Level Alignment Regularization for 3D Vision-Language Models
by: Su, Yuanhao, et al.
Published: (2026)
B-AVIBench: Towards Evaluating the Robustness of Large Vision-Language Model on Black-box Adversarial Visual-Instructions
by: Zhang, Hao, et al.
Published: (2024)
AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors
by: Chang, You-Ming, et al.
Published: (2023)