:: Library Catalog

Bibliographic Details
Main Authors:	Li, Linyuan; Qiu, Jianing; Saha, Anujit; Li, Lin; Li, Poyuan; He, Mengxian; Guo, Ziyu; Yuan, Wu
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.07619

Similar Items

Biomedical SAM 2: Segment Anything in Biomedical Images and Videos
by: Yan, Zhiling, et al.
Published: (2024)

Bora: Biomedical Generalist Video Generation Model
by: Sun, Weixiang, et al.
Published: (2024)

MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer
by: Zhu, Minghao, et al.
Published: (2024)

Enhance the Image: Super Resolution using Artificial Intelligence in MRI
by: Li, Ziyu, et al.
Published: (2024)

AROID: Improving Adversarial Robustness Through Online Instance-Wise Data Augmentation
by: Li, Lin, et al.
Published: (2023)

VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking
by: Hu, Runyi, et al.
Published: (2025)

DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation
by: Liu, Jiawei, et al.
Published: (2025)

GBR: Generative Bundle Refinement for High-fidelity Gaussian Splatting with Enhanced Mesh Reconstruction
by: Zhang, Jianing, et al.
Published: (2024)

RGC-VQA: An Exploration Database for Robotic-Generated Video Quality Assessment
by: Jin, Jianing, et al.
Published: (2025)

Recognizing Pneumonia in Real-World Chest X-rays with a Classifier Trained with Images Synthetically Generated by Nano Banana
by: Peng, Jiachuan, et al.
Published: (2025)

Is Artificial Intelligence Generated Image Detection a Solved Problem?
by: Li, Ziqiang, et al.
Published: (2025)

PathAsst: A Generative Foundation AI Assistant Towards Artificial General Intelligence of Pathology
by: Sun, Yuxuan, et al.
Published: (2023)

Adapt2Reward: Adapting Video-Language Models to Generalizable Robotic Rewards via Failure Prompts
by: Yang, Yanting, et al.
Published: (2024)

Dual Tuning for Reasoning Efficacy-Driven Data Curation in Multimodal LLM Training
by: Zheng, Ruobing, et al.
Published: (2026)

Explainable Artificial Intelligence in Biomedical Image Analysis: A Comprehensive Survey
by: Dagnaw, Getamesay Haile, et al.
Published: (2025)

GenVideoLens: Where LVLMs Fall Short in AI-Generated Video Detection?
by: Zou, Yueying, et al.
Published: (2026)

LoViC: Efficient Long Video Generation with Context Compression
by: Jiang, Jiaxiu, et al.
Published: (2025)

GENIUS: Generative Fluid Intelligence Evaluation Suite
by: An, Ruichuan, et al.
Published: (2026)

One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
by: Li, Lin, et al.
Published: (2024)

PhyRPR: Training-Free Physics-Constrained Video Generation
by: Zhao, Yibo, et al.
Published: (2026)

Brick-Diffusion: Generating Long Videos with Brick-to-Wall Denoising
by: Yuan, Yunlong, et al.
Published: (2025)

Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking
by: Su, Zihan, et al.
Published: (2025)

CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
by: Wang, Zhao, et al.
Published: (2024)

Geo-Align: Video Generation Alignment via Metric Geometry Reward
by: Li, Zizun, et al.
Published: (2026)

Efficient Text-driven Motion Generation via Latent Consistency Training
by: Hu, Mengxian, et al.
Published: (2024)

Helios: Real Real-Time Long Video Generation Model
by: Yuan, Shenghai, et al.
Published: (2026)

FlashI2V: Fourier-Guided Latent Shifting Prevents Conditional Image Leakage in Image-to-Video Generation
by: Ge, Yunyang, et al.
Published: (2025)

Task-Focused Memorization for Multimodal Agents
by: Zou, Tao, et al.
Published: (2026)

DQ3D: Depth-guided Query for Transformer-Based 3D Object Detection in Traffic Scenarios
by: Wang, Ziyu, et al.
Published: (2025)

Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
by: Yuan, Zhengqing, et al.
Published: (2024)

IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
by: Lei, Jiayi, et al.
Published: (2025)

ShareVerse: Multi-Agent Consistent Video Generation for Shared World Modeling
by: Zhu, Jiayi, et al.
Published: (2026)

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
by: Lin, Weifeng, et al.
Published: (2025)

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
by: Chen, Lin, et al.
Published: (2024)

Lane Departure Accident Prevention in Foggy Conditions: A Prior-Guided Dynamic Feature Fusion Transformer Framework for Real-Time Lane Detection
by: Zhang, Ronghui, et al.
Published: (2025)

RULER-Bench: Probing Rule-based Reasoning Abilities of Next-level Video Generation Models for Vision Foundation Intelligence
by: He, Xuming, et al.
Published: (2025)

OptiWorld: Optimal Control for Video World Generation under Physical Constraints
by: Yuan, Yu, et al.
Published: (2026)

Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset
by: Chen, Zhuowei, et al.
Published: (2025)

Hierarchical Banzhaf Interaction for General Video-Language Representation Learning
by: Jin, Peng, et al.
Published: (2024)

OSP-Next: Efficient High-Quality Video Generation with Sparse Sequence Parallelism, HiF8 Quantization, and Reinforcement Learning
by: Ge, Yunyang, et al.
Published: (2026)