Library Catalog

Bibliographic Details
Main Authors:	Pathak, Utkarsh; Gunda, Chandra Sai Krishna; Prakash, Anusha; Agarwal, Keshav; Murthy, Hema A.
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Computer Vision and Pattern Recognition; I.5.4
Online Access:	https://arxiv.org/abs/2506.03884

Similar Items

Motion-Guided Semantic Alignment with Negative Prompts for Zero-Shot Video Action Recognition
by: Wang, Yiming, et al.
Published: (2026)

FS-DAG: Few Shot Domain Adapting Graph Networks for Visually Rich Document Understanding
by: Agarwal, Amit, et al.
Published: (2025)

ForensicFormer: Hierarchical Multi-Scale Reasoning for Cross-Domain Image Forgery Detection
by: Samson, Hema Hariharan
Published: (2026)

Towards Developing State-of-the-Art TTS Synthesisers for 13 Indian Languages with Signal Processing aided Alignments
by: Prakash, Anusha, et al.
Published: (2022)

Streaming Anchor Loss: Augmenting Supervision with Temporal Significance
by: Sarawgi, Utkarsh Oggy, et al.
Published: (2023)

Pipeline and Dataset Generation for Automated Fact-checking in Almost Any Language
by: Drchal, Jan, et al.
Published: (2023)

Automating Clinical Information Retrieval from Finnish Electronic Health Records Using Large Language Models
by: Saukkoriipi, Mikko, et al.
Published: (2026)

LLM-Guided Exemplar Selection for Few-Shot Wearable-Sensor Human Activity Recognition
by: Ronando, Elsen, et al.
Published: (2025)

Lightweight MRI-Based Automated Segmentation of Pancreatic Cancer with Auto3DSeg
by: Jha, Keshav, et al.
Published: (2025)

Leveraging GNSS and Onboard Visual Data from Consumer Vehicles for Robust Road Network Estimation
by: Opra, Balázs, et al.
Published: (2024)

Silent Impact: Tracking Tennis Shots from the Passive Arm
by: Park, Junyong, et al.
Published: (2025)

Cognitive Linguistic Identity Fusion Score (CLIFS): A Scalable Cognition-Informed Approach to Quantifying Identity Fusion from Text
by: Wright, Devin R., et al.
Published: (2025)

Delayed Fusion: Integrating Large Language Models into First-Pass Decoding in End-to-end Speech Recognition
by: Hori, Takaaki, et al.
Published: (2025)

Thaka at KSAA-2026 Task 2: Regularized Fine-Tuning for Arabic Speech Diacritization
by: Alamr, Meshal, et al.
Published: (2026)

Language Predicts Identity Fusion Across Cultures and Reveals Divergent Pathways to Violence
by: Wright, Devin R., et al.
Published: (2026)

Interpretable Modeling of Driver Attention Shifts with a Vision-Language Model
by: Hamid, Kaiser, et al.
Published: (2025)

CG-TTRL: Context-Guided Test-Time Reinforcement Learning for On-Device Large Language Models
by: Hosseini, Peyman, et al.
Published: (2025)

Proximity QA: Unleashing the Power of Multi-Modal Large Language Models for Spatial Proximity Analysis
by: Li, Jianing, et al.
Published: (2024)

Apictorial Jigsaw Puzzle Reconstruction Based on Curve Matching via a Corotational Beam Spline
by: Orynyak, Igor, et al.
Published: (2025)

A Computer Vision Pipeline for Iterative Bullet Hole Tracking in Rifle Zeroing
by: Belcher, Robert M., et al.
Published: (2026)

The Influence of Iconicity in Transfer Learning for Sign Language Recognition
by: Artiaga, Keren, et al.
Published: (2026)

Libra: Leveraging Temporal Images for Biomedical Radiology Analysis
by: Zhang, Xi, et al.
Published: (2024)

EZ-Sort: Efficient Pairwise Comparison via Zero-Shot CLIP-Based Pre-Ordering and Human-in-the-Loop Sorting
by: Park, Yujin, et al.
Published: (2025)

Detection of Personal Data in Structured Datasets Using a Large Language Model
by: Ntwali, Albert Agisha, et al.
Published: (2025)

Prompt Sensitivity in Vision-Language Grounding: How Small Changes in Wording Affect Object Detection
by: Deka, Dawar Jyoti, et al.
Published: (2026)

Everyday Speech in the Indian Subcontinent
by: P, Utkarsh
Published: (2024)

EMOVOME: A Dataset for Emotion Recognition in Spontaneous Real-Life Speech
by: Gómez-Zaragozá, Lucía, et al.
Published: (2024)

OrganicHAR: Towards Activity Discovery in Organic Settings for Privacy Preserving Sensors Using Efficient Video Analysis
by: Patidar, Prasoon, et al.
Published: (2026)

RG-TTA: Regime-Guided Meta-Control for Test-Time Adaptation in Streaming Time Series
by: Kumar, Indar, et al.
Published: (2026)

Audio-based Kinship Verification Using Age Domain Conversion
by: Sun, Qiyang, et al.
Published: (2024)

Quantized Vision-Language Models for Damage Assessment: A Comparative Study of LLaVA-1.5-7B Quantization Levels
by: Yasuno, Takato
Published: (2026)

Exploring an Inter-Pausal Unit (IPU) based Approach for Indic End-to-End TTS Systems
by: Prakash, Anusha, et al.
Published: (2024)

PRISMA: Preference-Reinforced Self-Training Approach for Interpretable Emotionally Intelligent Negotiation Dialogues
by: Kajare, Prajwal Vijay, et al.
Published: (2026)

propella-1: Multi-Property Document Annotation for LLM Data Curation at Scale
by: Idahl, Maximilian, et al.
Published: (2026)

LongSumEval: Question-Answering Based Evaluation and Feedback-Driven Refinement for Long Document Summarization
by: Nguyen, Huyen, et al.
Published: (2026)

Scaling Laws for State Dynamics in Large Language Models
by: Li, Jacob X, et al.
Published: (2025)

Fine-Tuning Vision-Language Models for Understanding Current Damage and Scoring Priority with Quality Guard Agent
by: Yasuno, Takato
Published: (2026)

Talking Tennis: Language Feedback from 3D Biomechanical Action Recognition
by: Dashore, Arushi, et al.
Published: (2025)

UTAL-GNN: Unsupervised Temporal Action Localization using Graph Neural Networks
by: Badatya, Bikash Kumar, et al.
Published: (2025)

AgenticPruner: MAC-Constrained Neural Network Compression via LLM-Driven Strategy Search
by: Esmat, Shahrzad, et al.
Published: (2026)