:: Library Catalog

Bibliographic Details
Main Authors:	Gao, Jialu; Gupta, Mithun Das; Li, Qun; Kshatriya, Raveena; Wilson, Andrew D.; Chang, Keng-hao; Kumaravel, Balasaravanan Thoravi
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.13745

Similar Items

Grounding Task Assistance with Multimodal Cues from a Single Demonstration
by: Sarch, Gabriel, et al.
Published: (2025)

Out of Sight, Not Out of Context? Egocentric Spatial Reasoning in VLMs Across Disjoint Frames
by: Ravi, Sahithya, et al.
Published: (2025)

Doc To The Future: Infomorphs for Interactive, Multimodal Document Transformation and Generation
by: Kumaravel, Balasaravanan Thoravi
Published: (2025)

BlendScape: Enabling End-User Customization of Video-Conferencing Environments through Generative AI
by: Rajaram, Shwetha, et al.
Published: (2024)

SpaceBlender: Creating Context-Rich Collaborative Spaces Through Generative 3D Scene Blending
by: Numan, Nels, et al.
Published: (2024)

Determinantal Point Process as an alternative to NMS
by: Some, Samik, et al.
Published: (2020)

Creative4U: MLLMs-based Advertising Creative Image Selector with Comparative Reasoning
by: Lin, Yukang, et al.
Published: (2025)

MineNPC-Task: Task Suite for Memory-Aware Minecraft Agents
by: Doss, Tamil Sudaravan Mohan, et al.
Published: (2026)

Leveraging Large Models to Evaluate Novel Content: A Case Study on Advertisement Creativity
by: Hou, Zhaoyi Joey, et al.
Published: (2025)

Chirpy3D: Part-Aware Multi-View Diffusion for Creative Fine-Grained Object Generation
by: Ng, Kam Woh, et al.
Published: (2025)

Unified Unsupervised Salient Object Detection via Knowledge Transfer
by: Yuan, Yao, et al.
Published: (2024)

Uncertainty Guided Refinement for Fine-Grained Salient Object Detection
by: Yuan, Yao, et al.
Published: (2025)

Self-Creative Text-to-Object Generation using Semantic-Aware Spatial Weighting
by: Yu, Yue, et al.
Published: (2026)

Dual-Perspective Knowledge Enrichment for Semi-Supervised 3D Object Detection
by: Han, Yucheng, et al.
Published: (2024)

PartCraft: Crafting Creative Objects by Parts
by: Ng, Kam Woh, et al.
Published: (2024)

SceneDiff: A Benchmark and Method for Multiview Object Change Detection
by: Wu, Yuqun, et al.
Published: (2025)

VC-LLM: Automated Advertisement Video Creation from Raw Footage using Multi-modal LLMs
by: Qian, Dongjun, et al.
Published: (2025)

TP2O: Creative Text Pair-to-Object Generation using Balance Swap-Sampling
by: Li, Jun, et al.
Published: (2023)

HydraMamba: Multi-Head State Space Model for Global Point Cloud Learning
by: Qu, Kanglin, et al.
Published: (2025)

A$^3$: Towards Advertising Aesthetic Assessment
by: Ji, Kaiyuan, et al.
Published: (2026)

WSCLoc: Weakly-Supervised Sparse-View Camera Relocalization
by: Wang, Jialu, et al.
Published: (2024)

CRAFT: Designing Creative and Functional 3D Objects
by: Guo, Michelle, et al.
Published: (2024)

Strictly-ID-Preserved and Controllable Accessory Advertising Image Generation
by: Xue, Youze, et al.
Published: (2024)

Towards Reliable Advertising Image Generation Using Human Feedback
by: Du, Zhenbang, et al.
Published: (2024)

Uncertainty-Encoded Multi-Modal Fusion for Robust Object Detection in Autonomous Driving
by: Lou, Yang, et al.
Published: (2023)

CREA: A Collaborative Multi-Agent Framework for Creative Image Editing and Generation
by: Venkatesh, Kavana, et al.
Published: (2025)

KeyGS: A Keyframe-Centric Gaussian Splatting Method for Monocular Image Sequences
by: Chang, Keng-Wei, et al.
Published: (2024)

Automated Prompt Generation for Creative and Counterfactual Text-to-image Synthesis
by: Jelaca, Aleksa, et al.
Published: (2025)

Vibe Spaces for Creatively Connecting and Expressing Visual Concepts
by: Yang, Huzheng, et al.
Published: (2025)

Unified Domain Generalization and Adaptation for Multi-View 3D Object Detection
by: Chang, Gyusam, et al.
Published: (2024)

Creative Image Generation with Diffusion Models
by: Song, Kunpeng, et al.
Published: (2026)

Teleportraits: Training-Free People Insertion into Any Scene
by: Gao, Jialu, et al.
Published: (2025)

Efficient Domain-Adaptive Multi-Task Dense Prediction with Vision Foundation Models
by: Kang, Beomseok, et al.
Published: (2025)

Automatic Teaching Platform on Vision Language Retrieval Augmented Generation
by: Gokhman, Ruslan, et al.
Published: (2025)

G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis
by: Ye, Yufei, et al.
Published: (2024)

Distribution-Conditional Generation: From Class Distribution to Creative Generation
by: Feng, Fu, et al.
Published: (2025)

Siamese-DETR for Generic Multi-Object Tracking
by: Liu, Qiankun, et al.
Published: (2023)

MambaLoc: Efficient Camera Localisation via State Space Model
by: Wang, Jialu, et al.
Published: (2024)

AG-TAL: Anatomically-Guided Topology-Aware Loss for Multiclass Segmentation of the Circle of Willis Using Large-Scale Multi-Center Datasets
by: Liu, Jialu, et al.
Published: (2026)

DOS: Directional Object Separation in Text Embeddings for Multi-Object Image Generation
by: Byun, Dongnam, et al.
Published: (2025)