:: Library Catalog

Bibliographic Details
Main Authors:	Zhou, Yuyin; Li, Xianhang; Liu, Fengze; Wei, Qingyue; Chen, Xuxi; Yu, Lequan; Xie, Cihang; Lungren, Matthew P.; Xing, Lei
Format:	Preprint
Published:	2022
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2202.04291

Similar Items

3D-TransUNet for Brain Metastases Segmentation in the BraTS2023 Challenge
by: Yang, Siwei, et al.
Published: (2024)

OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning
by: Liu, Yanqing, et al.
Published: (2025)

Scaling White-Box Transformers for Vision
by: Yang, Jinrui, et al.
Published: (2024)

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine
by: Xie, Yunfei, et al.
Published: (2024)

Revisiting Adversarial Training at Scale
by: Wang, Zeyu, et al.
Published: (2024)

CLIPS: An Enhanced CLIP Framework for Learning with Synthetic Captions
by: Liu, Yanqing, et al.
Published: (2024)

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning
by: Li, Xianhang, et al.
Published: (2025)

Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training
by: Gao, Yipeng, et al.
Published: (2023)

Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
by: Wang, Feng, et al.
Published: (2025)

Combating Semantic Contamination in Learning with Label Noise
by: Fan, Wenxiao, et al.
Published: (2024)

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation
by: Zhang, Letian, et al.
Published: (2026)

What If We Recaption Billions of Web Images with LLaMA-3?
by: Li, Xianhang, et al.
Published: (2024)

Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark
by: Yang, Siwei, et al.
Published: (2025)

Omni-MMSI: Toward Identity-attributed Social Interaction Understanding
by: Li, Xinpeng, et al.
Published: (2026)

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
by: Wang, Yuhan, et al.
Published: (2025)

Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding
by: Cheng, Zhiheng, et al.
Published: (2024)

Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains
by: Wu, Juncheng, et al.
Published: (2025)

Mamba-R: Vision Mamba ALSO Needs Registers
by: Wang, Feng, et al.
Published: (2024)

Adventurer: Optimizing Vision Mamba Architecture Designs for Efficiency
by: Wang, Feng, et al.
Published: (2024)

A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties
by: Xiao, Junfei, et al.
Published: (2023)

ARFlow: Autoregressive Flow with Hybrid Linear Attention
by: Hui, Mude, et al.
Published: (2025)

Multi-sensor Learning Enables Information Transfer across Different Sensory Data and Augments Multi-modality Imaging
by: Zhu, Lingting, et al.
Published: (2024)

HybridMIM: A Hybrid Masked Image Modeling Framework for 3D Medical Image Segmentation
by: Xing, Zhaohu, et al.
Published: (2023)

Adaptive NNs asymptotic tracking control for high‐order nonlinear systems under prescribed performance and asymmetric output constraints
by: Kun Jiang, et al.
Published: (2024)

SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
by: Chen, Hardy, et al.
Published: (2025)

A Unified and Controllable Framework for Layered Image Generation with Visual Effects
by: Yang, Jinrui, et al.
Published: (2026)

ClinSeekAgent: Automating Multimodal Evidence Seeking for Agentic Clinical Reasoning
by: Wu, Juncheng, et al.
Published: (2026)

HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
by: Hui, Mude, et al.
Published: (2024)

Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw
by: Wang, Zijun, et al.
Published: (2026)

Navigation as Attackers Wish? Towards Building Robust Embodied Agents under Federated Learning
by: Zhang, Yunchao, et al.
Published: (2022)

Target-Oriented Pretraining Data Selection via Neuron-Activated Graph
by: Wang, Zijun, et al.
Published: (2026)

A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?
by: Xie, Yunfei, et al.
Published: (2024)

Combating Label Noise With A General Surrogate Model For Sample Selection
by: Liang, Chao, et al.
Published: (2023)

Medical Vision Generalist: Unifying Medical Imaging Tasks in Context
by: Ren, Sucheng, et al.
Published: (2024)

On the Adversarial Robustness of Camera-based 3D Object Detection
by: Xie, Shaoyuan, et al.
Published: (2023)

A Robust Transformer–Based Error Compensation Method for Gyroscope of IMUs
by: Xin Ye, et al.
Published: (2025)

PLReMix: Combating Noisy Labels with Pseudo-Label Relaxed Contrastive Representation Learning
by: Liu, Xiaoyu, et al.
Published: (2024)

Context-driven Missing-Modality Learning for Robust Medical Diagnosis with Image-Tabular Data
by: Liu, Tianling, et al.
Published: (2026)

FedRGL: Robust Federated Graph Learning for Label Noise
by: Li, De, et al.
Published: (2024)

Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLM-Powered Assistance
by: Yuan, Bo, et al.
Published: (2025)