:: Library Catalog

Bibliographic details
Main authors:	Xu, Yifeng; He, Zhenliang; Kan, Meina; Shan, Shiguang; Chen, Xilin
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning
Online access:	https://arxiv.org/abs/2505.19084

Similar items

JoPano: Unified Panorama Generation via Joint Modeling
By: Feng, Wancheng, et al.
Published: (2025)

CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation
By: Xu, Yifeng, et al.
Published: (2024)

Dual Attention Guided Defense Against Malicious Edits
By: Zhang, Jie, et al.
Published: (2025)

Towards Transferable Defense Against Malicious Image Edits
By: Zhang, Jie, et al.
Published: (2025)

Semantic Mismatch and Perceptual Degradation: A New Perspective on Image Editing Immunity
By: Dong, Shuai, et al.
Published: (2025)

OSI: One-step Inversion Excels in Extracting Diffusion Watermarks
By: Chen, Yuwei, et al.
Published: (2026)

Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating
By: Cao, Xiangkui, et al.
Published: (2026)

Trigger without Trace: Towards Stealthy Backdoor Attack on Text-to-Image Diffusion Models
By: Zhang, Jie, et al.
Published: (2025)

UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing
By: Li, Yiheng, et al.
Published: (2024)

MM-MoralBench: A MultiModal Moral Evaluation Benchmark for Large Vision-Language Models
By: Yan, Bei, et al.
Published: (2024)

Measuring the Measurers: Quality Evaluation of Hallucination Benchmarks for Large Vision-Language Models
By: Yan, Bei, et al.
Published: (2024)

Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation
By: Liang, Jiachen, et al.
Published: (2024)

Guiding Diffusion-based Reconstruction with Contrastive Signals for Balanced Visual Representation
By: Han, Boyu, et al.
Published: (2026)

INFACT: A Diagnostic Benchmark for Induced Faithfulness and Factuality Hallucinations in Video-LLMs
By: Yang, Junqi, et al.
Published: (2026)

Understanding Visual Concepts Across Models
By: Trabucco, Brandon, et al.
Published: (2024)

HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention
By: Tang, Xiaolong, et al.
Published: (2024)

A Geometric Unification of Concept Learning with Concept Cones
By: Rocchi-Henry, Alexandre, et al.
Published: (2025)

Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading
By: Luo, Songtao, et al.
Published: (2023)

Understanding Implosion in Text-to-Image Generative Models
By: Ding, Wenxin, et al.
Published: (2024)

VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model
By: Wang, Sibo, et al.
Published: (2024)

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
By: Cho, Jang Hyun, et al.
Published: (2025)

EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds
By: Chen, Lu, et al.
Published: (2025)

PackDiT: Joint Human Motion and Text Generation via Mutual Prompting
By: Jiang, Zhongyu, et al.
Published: (2025)

Visual Generation Without Guidance
By: Chen, Huayu, et al.
Published: (2025)

Composition Vision-Language Understanding via Segment and Depth Anything Model
By: Huo, Mingxiao, et al.
Published: (2024)

MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs
By: Liu, Ziyu, et al.
Published: (2024)

VeCLIP: Improving CLIP Training via Visual-enriched Captions
By: Lai, Zhengfeng, et al.
Published: (2023)

VideoNSA: Native Sparse Attention Scales Video Understanding
By: Song, Enxin, et al.
Published: (2025)

Generative Visual Code Mobile World Models
By: Koh, Woosung, et al.
Published: (2026)

Next Visual Granularity Generation
By: Wang, Yikai, et al.
Published: (2025)

RGB-Th-Bench: A Dense benchmark for Visual-Thermal Understanding of Vision Language Models
By: Moshtaghi, Mehdi, et al.
Published: (2025)

ModelGrow: Continual Text-to-Video Pre-training with Model Expansion and Language Understanding Enhancement
By: Rao, Zhefan, et al.
Published: (2024)

VTBench: Evaluating Visual Tokenizers for Autoregressive Image Generation
By: Lin, Huawei, et al.
Published: (2025)

Reversible Unfolding Network for Concealed Visual Perception with Generative Refinement
By: He, Chunming, et al.
Published: (2025)

OmniPrism: Learning Disentangled Visual Concept for Image Generation
By: Li, Yangyang, et al.
Published: (2024)

Semi-supervised Concept Bottleneck Models
By: Hu, Lijie, et al.
Published: (2024)

VLSU: Mapping the Limits of Joint Multimodal Understanding for AI Safety
By: Palaskar, Shruti, et al.
Published: (2025)

VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation
By: Zou, Bocheng, et al.
Published: (2024)

SafeFix: Targeted Model Repair via Controlled Image Generation
By: Xu, Ouyang, et al.
Published: (2025)

SpectralAR: Spectral Autoregressive Visual Generation
By: Huang, Yuanhui, et al.
Published: (2025)