Library Catalog

Bibliographic Details
Main Authors:	Li, Qiuhao, Yuan, Shenghai
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2402.05747

Similar Items

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
by: Yuan, Shenghai, et al.
Published: (2025)

GERA: Geometric Embedding for Efficient Point Registration Analysis
by: Li, Geng, et al.
Published: (2024)

WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
by: Li, Zongjian, et al.
Published: (2024)

MMAUD: A Comprehensive Multi-Modal Anti-UAV Dataset for Modern Miniature Drone Threats
by: Yuan, Shenghai, et al.
Published: (2024)

VITAL: Interactive Few-Shot Imitation Learning via Visual Human-in-the-Loop Corrections
by: Kasaei, Hamidreza, et al.
Published: (2024)

UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
by: Lin, Bin, et al.
Published: (2025)

LoopVLA: Learning Sufficiency in Recurrent Refinement for Vision-Language-Action Models
by: Shen, Boyang, et al.
Published: (2026)

Towards Efficient and Intelligent Laser Weeding: Method and Dataset for Weed Stem Detection
by: Liu, Dingning, et al.
Published: (2025)

Self-Correcting VLA: Online Action Refinement via Sparse World Imagination
by: Liu, Chenyv, et al.
Published: (2026)

WAS: Dataset and Methods for Artistic Text Segmentation
by: Xie, Xudong, et al.
Published: (2024)

Closed-Loop Action Chunks with Dynamic Corrections for Training-Free Diffusion Policy
by: Wu, Pengyuan, et al.
Published: (2026)

Comp-Attn: Present-and-Align Attention for Compositional Video Generation
by: Zhang, Hongyu, et al.
Published: (2025)

InfBaGel: Human-Object-Scene Interaction Generation with Dynamic Perception and Iterative Refinement
by: Zou, Yude, et al.
Published: (2026)

PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation
by: Xue, Qiyao, et al.
Published: (2024)

Self-Correcting Text-to-Video Generation with Misalignment Detection and Localized Refinement
by: Lee, Daeun, et al.
Published: (2024)

Risk-Aware Human-in-the-Loop Framework with Adaptive Intrusion Response for Autonomous Vehicles
by: Wasif, Dawood, et al.
Published: (2026)

HuLP: Human-in-the-Loop for Prognosis
by: Ridzuan, Muhammad, et al.
Published: (2024)

A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning
by: Jiang, Siyang, et al.
Published: (2025)

MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement
by: He, Xu, et al.
Published: (2024)

Stitch and Tell: A Structured Multimodal Data Augmentation Method for Spatial Understanding
by: Yin, Hang, et al.
Published: (2025)

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel
by: Wang, Zun, et al.
Published: (2024)

Refined Temporal Pyramidal Compression-and-Amplification Transformer for 3D Human Pose Estimation
by: Liu, Hanbing, et al.
Published: (2023)

Human Activity Recognition using RGB-Event based Sensors: A Multi-modal Heat Conduction Model and A Benchmark Dataset
by: Wang, Shiao, et al.
Published: (2025)

Self-Correcting Self-Consuming Loops for Generative Model Training
by: Gillman, Nate, et al.
Published: (2024)

PromptLoop: Plug-and-Play Prompt Refinement via Latent Feedback for Diffusion Model Alignment
by: Lee, Suhyeon, et al.
Published: (2025)

CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
by: Deng, Yufan, et al.
Published: (2025)

Just Add Geometry: Gradient-Free Open-Vocabulary 3D Detection Without Human-in-the-Loop
by: Goel, Atharv, et al.
Published: (2025)

Sketch Input Method Editor: A Comprehensive Dataset and Methodology for Systematic Input Recognition
by: Zhu, Guangming, et al.
Published: (2023)

A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights
by: Lei, Wentao, et al.
Published: (2024)

PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion
by: Choi, Jaehyun, et al.
Published: (2025)

3D Weakly Supervised Semantic Segmentation via Class-Aware and Geometry-Guided Pseudo-Label Refinement
by: Xu, Xiaoxu, et al.
Published: (2025)

MAGREF: Masked Guidance for Any-Reference Video Generation with Subject Disentanglement
by: Deng, Yufan, et al.
Published: (2025)

Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning
by: Hu, Zixuan, et al.
Published: (2023)

RS-GPT4V: A Unified Multimodal Instruction-Following Dataset for Remote Sensing Image Understanding
by: Xu, Linrui, et al.
Published: (2024)

OCR-Quality: A Human-Annotated Dataset for OCR Quality Assessment
by: Zhang, Yulong
Published: (2025)

A Comprehensive Dataset for Human vs. AI Generated Image Detection
by: Roy, Rajarshi, et al.
Published: (2026)

Longitudinal Vestibular Schwannoma Dataset with Consensus-based Human-in-the-loop Annotations
by: Wijethilake, Navodini, et al.
Published: (2025)

QNCD: Quantization Noise Correction for Diffusion Models
by: Chu, Huanpeng, et al.
Published: (2024)

Bridging Synthetic and Real-World Domains: A Human-in-the-Loop Weakly-Supervised Framework for Industrial Toxic Emission Segmentation
by: Tao, Yida, et al.
Published: (2025)

AdCorDA: Classifier Refinement via Adversarial Correction and Domain Adaptation
by: Shen, Lulan, et al.
Published: (2024)