| Main Authors: | Gao, Jialu, Gupta, Mithun Das, Li, Qun, Kshatriya, Raveena, Wilson, Andrew D., Chang, Keng-hao, Kumaravel, Balasaravanan Thoravi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.13745 |
Similar Items
Grounding Task Assistance with Multimodal Cues from a Single Demonstration
by: Sarch, Gabriel, et al.
Published: (2025)
Out of Sight, Not Out of Context? Egocentric Spatial Reasoning in VLMs Across Disjoint Frames
by: Ravi, Sahithya, et al.
Published: (2025)
Doc To The Future: Infomorphs for Interactive, Multimodal Document Transformation and Generation
by: Kumaravel, Balasaravanan Thoravi
Published: (2025)
BlendScape: Enabling End-User Customization of Video-Conferencing Environments through Generative AI
by: Rajaram, Shwetha, et al.
Published: (2024)
SpaceBlender: Creating Context-Rich Collaborative Spaces Through Generative 3D Scene Blending
by: Numan, Nels, et al.
Published: (2024)
Determinantal Point Process as an alternative to NMS
by: Some, Samik, et al.
Published: (2020)
Creative4U: MLLMs-based Advertising Creative Image Selector with Comparative Reasoning
by: Lin, Yukang, et al.
Published: (2025)
MineNPC-Task: Task Suite for Memory-Aware Minecraft Agents
by: Doss, Tamil Sudaravan Mohan, et al.
Published: (2026)
Leveraging Large Models to Evaluate Novel Content: A Case Study on Advertisement Creativity
by: Hou, Zhaoyi Joey, et al.
Published: (2025)
Chirpy3D: Part-Aware Multi-View Diffusion for Creative Fine-Grained Object Generation
by: Ng, Kam Woh, et al.
Published: (2025)
Unified Unsupervised Salient Object Detection via Knowledge Transfer
by: Yuan, Yao, et al.
Published: (2024)
Uncertainty Guided Refinement for Fine-Grained Salient Object Detection
by: Yuan, Yao, et al.
Published: (2025)
Self-Creative Text-to-Object Generation using Semantic-Aware Spatial Weighting
by: Yu, Yue, et al.
Published: (2026)
Dual-Perspective Knowledge Enrichment for Semi-Supervised 3D Object Detection
by: Han, Yucheng, et al.
Published: (2024)
PartCraft: Crafting Creative Objects by Parts
by: Ng, Kam Woh, et al.
Published: (2024)
SceneDiff: A Benchmark and Method for Multiview Object Change Detection
by: Wu, Yuqun, et al.
Published: (2025)
VC-LLM: Automated Advertisement Video Creation from Raw Footage using Multi-modal LLMs
by: Qian, Dongjun, et al.
Published: (2025)
TP2O: Creative Text Pair-to-Object Generation using Balance Swap-Sampling
by: Li, Jun, et al.
Published: (2023)
HydraMamba: Multi-Head State Space Model for Global Point Cloud Learning
by: Qu, Kanglin, et al.
Published: (2025)
A$^3$: Towards Advertising Aesthetic Assessment
by: Ji, Kaiyuan, et al.
Published: (2026)
WSCLoc: Weakly-Supervised Sparse-View Camera Relocalization
by: Wang, Jialu, et al.
Published: (2024)
CRAFT: Designing Creative and Functional 3D Objects
by: Guo, Michelle, et al.
Published: (2024)
Strictly-ID-Preserved and Controllable Accessory Advertising Image Generation
by: Xue, Youze, et al.
Published: (2024)
Towards Reliable Advertising Image Generation Using Human Feedback
by: Du, Zhenbang, et al.
Published: (2024)
Uncertainty-Encoded Multi-Modal Fusion for Robust Object Detection in Autonomous Driving
by: Lou, Yang, et al.
Published: (2023)
CREA: A Collaborative Multi-Agent Framework for Creative Image Editing and Generation
by: Venkatesh, Kavana, et al.
Published: (2025)
KeyGS: A Keyframe-Centric Gaussian Splatting Method for Monocular Image Sequences
by: Chang, Keng-Wei, et al.
Published: (2024)
Automated Prompt Generation for Creative and Counterfactual Text-to-image Synthesis
by: Jelaca, Aleksa, et al.
Published: (2025)
Vibe Spaces for Creatively Connecting and Expressing Visual Concepts
by: Yang, Huzheng, et al.
Published: (2025)
Unified Domain Generalization and Adaptation for Multi-View 3D Object Detection
by: Chang, Gyusam, et al.
Published: (2024)
Creative Image Generation with Diffusion Models
by: Song, Kunpeng, et al.
Published: (2026)
Teleportraits: Training-Free People Insertion into Any Scene
by: Gao, Jialu, et al.
Published: (2025)
Efficient Domain-Adaptive Multi-Task Dense Prediction with Vision Foundation Models
by: Kang, Beomseok, et al.
Published: (2025)
Automatic Teaching Platform on Vision Language Retrieval Augmented Generation
by: Gokhman, Ruslan, et al.
Published: (2025)
G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis
by: Ye, Yufei, et al.
Published: (2024)
Distribution-Conditional Generation: From Class Distribution to Creative Generation
by: Feng, Fu, et al.
Published: (2025)
Siamese-DETR for Generic Multi-Object Tracking
by: Liu, Qiankun, et al.
Published: (2023)
MambaLoc: Efficient Camera Localisation via State Space Model
by: Wang, Jialu, et al.
Published: (2024)
AG-TAL: Anatomically-Guided Topology-Aware Loss for Multiclass Segmentation of the Circle of Willis Using Large-Scale Multi-Center Datasets
by: Liu, Jialu, et al.
Published: (2026)
DOS: Directional Object Separation in Text Embeddings for Multi-Object Image Generation
by: Byun, Dongnam, et al.
Published: (2025)