| Main Authors: | Agrawal, Vaibhav, Parihar, Rishubh, Bhat, Pradhaan, Sarvadevabhatla, Ravi Kiran, Babu, R. Venkatesh |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.23359 |
Similar Items
Compass Control: Multi Object Orientation Control for Text-to-Image Generation
by: Parihar, Rishubh, et al.
Published: (2025)
MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection
by: Parihar, Rishubh, et al.
Published: (2025)
PreciseControl: Enhancing Text-To-Image Diffusion Models with Fine-Grained Attribute Control
by: Parihar, Rishubh, et al.
Published: (2024)
Text2Place: Affordance-aware Text Guided Human Placement
by: Parihar, Rishubh, et al.
Published: (2024)
Balancing Act: Distribution-Guided Debiasing in Diffusion Models
by: Parihar, Rishubh, et al.
Published: (2024)
Kontinuous Kontext: Continuous Strength Control for Instruction-based Image Editing
by: Parihar, Rishubh, et al.
Published: (2025)
RoadTones: Tone Controllable Text Generation from Road Event Videos
by: Parikh, Chirag, et al.
Published: (2026)
Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections
by: Dhiman, Ankit, et al.
Published: (2024)
OLAF: A Plug-and-Play Framework for Enhanced Multi-object Multi-part Scene Parsing
by: Gupta, Pranav, et al.
Published: (2024)
TexTAR : Textual Attribute Recognition in Multi-domain and Multi-lingual Document Images
by: Kumar, Rohan, et al.
Published: (2025)
Unveiling Text in Challenging Stone Inscriptions: A Character-Context-Aware Patching Strategy for Binarization
by: Jena, Pratyush, et al.
Published: (2026)
MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion
by: Kalakonda, Sai Shashank, et al.
Published: (2024)
STRinGS: Selective Text Refinement in Gaussian Splatting
by: Raundhal, Abhinav, et al.
Published: (2025)
Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic Scenarios
by: Parikh, Chirag, et al.
Published: (2024)
DashCop: Automated E-ticket Generation for Two-Wheeler Traffic Violations Using Dashcam Videos
by: Rawat, Deepti, et al.
Published: (2025)
IndicDLP: A Foundational Dataset for Multi-Lingual and Multi-Domain Document Layout Parsing
by: Nath, Oikantik, et al.
Published: (2025)
Harnessing Diffusion-Generated Synthetic Images for Fair Image Classification
by: Basu, Abhipsa, et al.
Published: (2025)
RoadSocial: A Diverse VideoQA Dataset and Benchmark for Road Event Understanding from Social Video Narratives
by: Parikh, Chirag, et al.
Published: (2025)
IDD-X: A Multi-View Dataset for Ego-relative Important Object Localization and Explanation in Dense and Unstructured Traffic
by: Parikh, Chirag, et al.
Published: (2024)
Controllable Egocentric Video Generation via Occlusion-Aware Sparse 3D Hand Joints
by: Zhang, Chenyangguang, et al.
Published: (2026)
OAHuman: Occlusion-Aware 3D Human Reconstruction from Monocular Images
by: Yang, Yuanwang, et al.
Published: (2026)
CrackUDA: Incremental Unsupervised Domain Adaptation for Improved Crack Segmentation in Civil Structures
by: Srivastava, Kushagra, et al.
Published: (2024)
Seeing Through the Tool: A Controlled Benchmark for Occlusion Robustness in Foundation Segmentation Models
by: Ho, Nhan, et al.
Published: (2026)
Occlusion-Aware 3D Motion Interpretation for Abnormal Behavior Detection
by: Li, Su, et al.
Published: (2024)
Occlusion-aware Text-Image-Point Cloud Pretraining for Open-World 3D Object Recognition
by: Nguyen, Khanh, et al.
Published: (2025)
UniC-Lift: Unified 3D Instance Segmentation via Contrastive Learning
by: Dhiman, Ankit, et al.
Published: (2025)
Text-Image Conditioned 3D Generation
by: Cen, Jiazhong, et al.
Published: (2026)
GeoDiv: Framework For Measuring Geographical Diversity In Text-To-Image Models
by: Basu, Abhipsa, et al.
Published: (2026)
Wave-Former: Through-Occlusion 3D Reconstruction via Wireless Shape Completion
by: Dodds, Laura, et al.
Published: (2025)
DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior
by: Huang, Tianyu, et al.
Published: (2023)
DehazeGS: Seeing Through Fog with 3D Gaussian Splatting
by: Yu, Jinze, et al.
Published: (2025)
Occlusion-Aware 3D Hand-Object Pose Estimation with Masked AutoEncoders
by: Yang, Hui, et al.
Published: (2025)
Lookalike3D: Seeing Double in 3D
by: Yeshwanth, Chandan, et al.
Published: (2026)
Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce Light
by: Klinghoffer, Tzofi, et al.
Published: (2025)
Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator
by: Go, Hyojun, et al.
Published: (2025)
CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation
by: Wang, Qinghe, et al.
Published: (2025)
PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion
by: Liu, Ying-Tian, et al.
Published: (2023)
Sparse3DTrack: Monocular 3D Object Tracking Using Sparse Supervision
by: Gosala, Nikhil, et al.
Published: (2026)
SplatFont3D: Structure-Aware Text-to-3D Artistic Font Generation with Part-Level Style Control
by: Gan, Ji, et al.
Published: (2025)
DreamCS: Geometry-Aware Text-to-3D Generation with Unpaired 3D Reward Supervision
by: Zou, Xiandong, et al.
Published: (2025)