:: Library Catalog

Bibliographic Details
Main Authors:	Mishra, Ishan; Li, Jiajie; Mishra, Deepak; Xiong, Jinjun
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.11316

Similar Items

Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data
by: Li, Jiajie, et al.
Published: (2025)

Chain-of-Adaptation: Surgical Vision-Language Adaptation with Reinforcement Learning
by: Li, Jiajie, et al.
Published: (2026)

CoBooM: Codebook Guided Bootstrapping for Medical Image Representation Learning
by: Singh, Azad, et al.
Published: (2024)

F2former: When Fractional Fourier Meets Deep Wiener Deconvolution and Selective Frequency Transformer for Image Deblurring
by: Paul, Subhajit, et al.
Published: (2024)

AD-Relight: Training-Free Banner Relighting via Illumination Translation with Diffusion Priors
by: Mishra, Rameshwar, et al.
Published: (2026)

DiSSECT: Structuring Transfer-Ready Medical Image Representations through Discrete Self-Supervision
by: Singh, Azad, et al.
Published: (2025)

Current Symmetry Group Equivariant Convolution Frameworks for Representation Learning
by: Basheer, Ramzan, et al.
Published: (2024)

Enhanced Survival Prediction in Head and Neck Cancer Using Convolutional Block Attention and Multimodal Data Fusion
by: Farooq, Aiman, et al.
Published: (2024)

Translating Imaging to Genomics: Leveraging Transformers for Predictive Modeling
by: Farooq, Aiman, et al.
Published: (2024)

SynchroRaMa : Lip-Synchronized and Emotion-Aware Talking Face Generation via Multi-Modal Emotion Embedding
by: Yee, Phyo Thet, et al.
Published: (2025)

OTCXR: Rethinking Self-supervised Alignment using Optimal Transport for Chest X-ray Analysis
by: Gorade, Vandan, et al.
Published: (2024)

VIDMP3: Video Editing by Representing Motion with Pose and Position Priors
by: Mishra, Sandeep, et al.
Published: (2025)

Leveraging Auxiliary Classification for Rib Fracture Segmentation
by: G., Harini, et al.
Published: (2024)

Survival Prediction in Lung Cancer through Multi-Modal Representation Learning
by: Farooq, Aiman, et al.
Published: (2024)

LLaVA-Surg: Towards Multimodal Surgical Assistant via Structured Surgical Video Learning
by: Li, Jiajie, et al.
Published: (2024)

RobSurv: Vector Quantization-Based Multi-Modal Learning for Robust Cancer Survival Prediction
by: Farooq, Aiman, et al.
Published: (2025)

Loss Knows Best: Detecting Annotation Errors in Videos via Loss Trajectories
by: Alwis, Praditha, et al.
Published: (2026)

Fine-Grained Rib Fracture Diagnosis with Hyperbolic Embeddings: A Detailed Annotation Framework and Multi-Label Classification Model
by: Pate, Shripad, et al.
Published: (2025)

LoR-LUT: Learning Compact 3D Lookup Tables via Low-Rank Residuals
by: Zhao, Ziqi, et al.
Published: (2026)

U-WNO: U-Net-enhanced Wavelet Neural Operator for fetal head segmentation
by: Seth, Pranava, et al.
Published: (2024)

MLVICX: Multi-Level Variance-Covariance Exploration for Chest X-ray Self-Supervised Representation Learning
by: Singh, Azad, et al.
Published: (2024)

Fiducial Tag Localization on a 3D LiDAR Prior Map
by: Liu, Yibo, et al.
Published: (2022)

Efficient Hybrid CNN-GNN Architecture for Monocular Depth Estimation
by: Narayan, Ishan
Published: (2026)

Clean-GS: Semantic Mask-Guided Pruning for 3D Gaussian Splatting
by: Mishra, Subhankar
Published: (2026)

OphEdit: Training-Free Text-Guided Editing of Ophthalmic Surgical Videos
by: Jangir, Ritul, et al.
Published: (2026)

Novel Human Machine Interface via Robust Hand Gesture Recognition System using Channel Pruned YOLOv5s Model
by: Sen, Abir, et al.
Published: (2024)

Audio-driven Talking Face Generation with Stabilized Synchronization Loss
by: Yaman, Dogucan, et al.
Published: (2023)

Leveraging Data to Say No: Memory Augmented Plug-and-Play Selective Prediction
by: Sarkar, Aditya, et al.
Published: (2026)

PSSI-MaxST: An Efficient Pixel-Segment Similarity Index Using Intensity and Smoothness Features for Maximum Spanning Tree Based Segmentation
by: Shejole, Kaustubh Shivshankar, et al.
Published: (2026)

Image Synthesis with Graph Conditioning: CLIP-Guided Diffusion Models for Scene Graphs
by: Mishra, Rameshwar, et al.
Published: (2024)

Densely Decoded Networks with Adaptive Deep Supervision for Medical Image Segmentation
by: Mishra, Suraj, et al.
Published: (2024)

Selective, Controlled and Domain-Agnostic Unlearning in Pretrained CLIP: A Training- and Data-Free Approach
by: Mishra, Ashish, et al.
Published: (2025)

Fast and Generalizable NeRF Architecture Selection for Satellite Scene Reconstruction
by: Chakraborty, Devjyoti, et al.
Published: (2026)

R2 Loss: Range Restriction Loss for Model Compression and Quantization
by: Kundu, Arnav, et al.
Published: (2023)

From Understanding to Engagement: Personalized pharmacy Video Clips via Vision Language Models (VLMs)
by: Mishra, Suyash, et al.
Published: (2026)

Seg-HGNN: Unsupervised and Light-Weight Image Segmentation with Hyperbolic Graph Neural Networks
by: Mondal, Debjyoti, et al.
Published: (2024)

Transfer Learning-Based CNN Models for Plant Species Identification Using Leaf Venation Patterns
by: Bharadwaj, Bandita, et al.
Published: (2025)

RLM: A Vision-Language Model Approach for Radar Scene Understanding
by: Mishra, Pushkal, et al.
Published: (2025)

Reconstruction of Contour Lines During the Digitization of Contour Maps to Build a Digital Elevation Model
by: Subedi, Aroj, et al.
Published: (2024)

TRIM: A Self-Supervised Video Summarization Framework Maximizing Temporal Relative Information and Representativeness
by: Mishra, Pritam, et al.
Published: (2025)