:: Library Catalog

Bibliographic Details
Main Authors:	Khan, Mohammad Sadil; Usama, Muhammad; Potamias, Rolandos Alexandros; Stricker, Didier; Afzal, Muhammad Zeshan; Deng, Jiankang; Elezi, Ismail
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.05607

Similar Items

NURBGen: High-Fidelity Text-to-CAD Generation through LLM-Driven NURBS Modeling
by: Usama, Muhammad, et al.
Published: (2025)

Text2CAD: Generating Sequential CAD Models from Beginner-to-Expert Level Text Prompts
by: Khan, Mohammad Sadil, et al.
Published: (2024)

BRep Boundary and Junction Detection for CAD Reverse Engineering
by: Ali, Sk Aziz, et al.
Published: (2024)

MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation
by: Sinha, Sankalp, et al.
Published: (2024)

Do You See What I Am Pointing At? Gesture-Based Egocentric Video Question Answering
by: Choi, Yura, et al.
Published: (2026)

SituationalLLM: Proactive language models with scene awareness for dynamic, contextual task guidance
by: Khan, Muhammad Saif Ullah, et al.
Published: (2024)

Estimating Human Poses Across Datasets: A Unified Skeleton and Multi-Teacher Distillation Approach
by: Khan, Muhammad Saif Ullah, et al.
Published: (2024)

A Hybrid Approach for Document Layout Analysis in Document images
by: Shehzadi, Tahira, et al.
Published: (2024)

HaWoR: World-Space Hand Motion Reconstruction from Egocentric Videos
by: Zhang, Jinglei, et al.
Published: (2025)

WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild
by: Potamias, Rolandos Alexandros, et al.
Published: (2024)

Shape2.5D: A Dataset of Texture-less Surfaces for Depth and Normals Estimation
by: Khan, Muhammad Saif Ullah, et al.
Published: (2024)

Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer
by: Shehzadi, Tahira, et al.
Published: (2024)

Classroom-Inspired Multi-Mentor Distillation with Adaptive Learning Strategies
by: Sarode, Shalini, et al.
Published: (2024)

ReConText3D: Replay-based Continual Text-to-3D Generation
by: Khan, Muhammad Ahmed Ullah, et al.
Published: (2026)

SemAttNet: Towards Attention-based Semantic Aware Guided Depth Completion
by: Nazir, Danish, et al.
Published: (2022)

Towards End-to-End Semi-Supervised Table Detection with Semantic Aligned Matching Transformer
by: Shehzadi, Tahira, et al.
Published: (2024)

End-to-End Semi-Supervised approach with Modulated Object Queries for Table Detection in Documents
by: Ehsan, Iqraa, et al.
Published: (2024)

SAGS: Structure-Aware 3D Gaussian Splatting
by: Ververas, Evangelos, et al.
Published: (2024)

Signs as Tokens: A Retrieval-Enhanced Multilingual Sign Language Generator
by: Zuo, Ronglai, et al.
Published: (2024)

ImHead: A Large-scale Implicit Morphable Model for Localized Head Modeling
by: Potamias, Rolandos Alexandros, et al.
Published: (2025)

CEDex: Cross-Embodiment Dexterous Grasp Generation at Scale from Human-like Contact Representations
by: Wu, Zhiyuan, et al.
Published: (2025)

Enhanced Bank Check Security: Introducing a Novel Dataset and Transformer-Based Approach for Detection and Verification
by: Khan, Muhammad Saif Ullah, et al.
Published: (2024)

VVitCutLER: Towards Unsupervised Object Detection and Segmentation in Videos
by: Lu, Zhijing, et al.
Published: (2026)

Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection
by: Shehzadi, Tahira, et al.
Published: (2024)

CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification
by: Sinha, Sankalp, et al.
Published: (2024)

Dex2HOI: Dexterous Bimanual Two-Object Interaction Generation
by: Pratikaki, Chrysa, et al.
Published: (2026)

CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention
by: Khan, Mohammad Sadil, et al.
Published: (2024)

Beyond Boxes: Mask-Guided Spatio-Temporal Feature Aggregation for Video Object Detection
by: Hashmi, Khurram Azeem, et al.
Published: (2024)

HO-Flow: Generalizable Hand-Object Interaction Generation with Latent Flow Matching
by: Chen, Zerui, et al.
Published: (2026)

Neural Sign Actors: A diffusion model for 3D sign language production from text
by: Baltatzis, Vasileios, et al.
Published: (2023)

Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models
by: Ruiz-Ponce, Pablo, et al.
Published: (2025)

UniMorphGrasp: Diffusion Model with Morphology-Awareness for Cross-Embodiment Dexterous Grasp Generation
by: Wu, Zhiyuan, et al.
Published: (2026)

MaDiS: Taming Masked Diffusion Language Models for Sign Language Generation
by: Zuo, Ronglai, et al.
Published: (2026)

VTimeCoT: Thinking by Drawing for Video Temporal Grounding and Reasoning
by: Zhang, Jinglei, et al.
Published: (2025)

Human Pose Descriptions and Subject-Focused Attention for Improved Zero-Shot Transfer in Human-Centric Classification Tasks
by: Khan, Muhammad Saif Ullah, et al.
Published: (2024)

ZeroGS: Training 3D Gaussian Splatting from Unposed Images
by: Chen, Yu, et al.
Published: (2024)

G3DR: Generative 3D Reconstruction in ImageNet
by: Reddy, Pradyumna, et al.
Published: (2024)

Deep Active Learning: A Reality Check
by: Gashi, Edrina, et al.
Published: (2024)

$V_kD:$ Improving Knowledge Distillation using Orthogonal Projections
by: Miles, Roy, et al.
Published: (2024)

Design2Cloth: 3D Cloth Generation from 2D Masks
by: Zheng, Jiali, et al.
Published: (2024)