:: Library Catalog

Bibliographic Details
Main Authors:	Anand, Avinash; Jaiswal, Raj; Gupta, Mohit; Bangar, Siddhesh S; Bhuyan, Pijush; Lal, Naman; Singh, Rajeev; Jha, Ritika; Shah, Rajiv Ratn; Satoh, Shin'ichi
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2404.09530

Similar Items

TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content
by: Anand, Avinash, et al.
Published: (2024)

Certified Zeroth-order Black-Box Defense with Robust UNet Denoiser
by: Verma, Astha, et al.
Published: (2023)

Advancements in Scientific Controllable Text Generation Methods
by: Goel, Arnav, et al.
Published: (2023)

KG-CTG: Citation Generation through Knowledge Graph-guided Large Language Models
by: Anand, Avinash, et al.
Published: (2024)

Context-Enhanced Language Models for Generating Multi-Paper Citations
by: Anand, Avinash, et al.
Published: (2024)

Keystroke Dynamics Against Academic Dishonesty in the Age of LLMs
by: Kundu, Debnath, et al.
Published: (2024)

Su-RoBERTa: A Semi-supervised Approach to Predicting Suicide Risk through Social Media using Base Language Models
by: Tank, Chayan, et al.
Published: (2024)

Improving Physics Reasoning in Large Language Models Using Mixture of Refinement Agents
by: Jaiswal, Raj, et al.
Published: (2024)

Multilingual Mathematical Reasoning: Advancing Open-Source LLMs in Hindi and English
by: Anand, Avinash, et al.
Published: (2024)

MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering
by: Kapuriya, Janak, et al.
Published: (2024)

Mathify: Evaluating Large Language Models on Mathematical Problem Solving Tasks
by: Anand, Avinash, et al.
Published: (2024)

Med-CoDE: Medical Critique based Disagreement Evaluation Framework
by: Gupta, Mohit, et al.
Published: (2025)

Improving Multimodal LLMs Ability In Geometry Problem Solving, Reasoning, And Multistep Scoring
by: Anand, Avinash, et al.
Published: (2024)

Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities
by: Tank, Chayan, et al.
Published: (2024)

MM-PhyQA: Multimodal Physics Question-Answering With Multi-Image CoT Prompting
by: Anand, Avinash, et al.
Published: (2024)

On Optimal Steering to Achieve Exact Fairness
by: Sharma, Mohit, et al.
Published: (2025)

Enhancing LLMs for Physics Problem-Solving using Reinforcement Learning with Human-AI Feedback
by: Anand, Avinash, et al.
Published: (2024)

Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation
by: Yadav, Hemant, et al.
Published: (2024)

LittiChoQA: Literary Texts in Indic Languages Chosen for Question Answering
by: Khandelwal, Aarya, et al.
Published: (2026)

RConE: Rough Cone Embedding for Multi-Hop Logical Query Answering on Multi-Modal Knowledge Graphs
by: Kharbanda, Mayank, et al.
Published: (2024)

Analysing the Masked predictive coding training criterion for pre-training a Speech Representation Model
by: Yadav, Hemant, et al.
Published: (2023)

Long-context Non-factoid Question Answering in Indic Languages
by: Mishra, Ritwik, et al.
Published: (2025)

MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations
by: Yadav, Hemant, et al.
Published: (2024)

JOOCI: a Framework for Learning Comprehensive Speech Representations
by: Yadav, Hemant, et al.
Published: (2024)

Steps are all you need: Rethinking STEM Education with Prompt Engineering
by: Addala, Krishnasai, et al.
Published: (2024)

Better and Worse with Scale: How Contextual Entrainment Diverges with Model Size
by: Kukreja, Dikshant, et al.
Published: (2026)

ReSeDis: A Dataset for Referring-based Object Search across Large-Scale Image Collections
by: Huang, Ziling, et al.
Published: (2025)

Knowledge Graphs are all you need: Leveraging KGs in Physics Question Answering
by: Addala, Krishnasai, et al.
Published: (2024)

Structured Definitions and Segmentations for Legal Reasoning in LLMs: A Study on Indian Legal Data
by: Khatri, Mann, et al.
Published: (2025)

Multilingual Coreference Resolution in Low-resource South Asian Languages
by: Mishra, Ritwik, et al.
Published: (2024)

Spiritual-LLM : Gita Inspired Mental Health Therapy In the Era of LLMs
by: Kapuriya, Janak, et al.
Published: (2025)

Multilingual Non-Factoid Question Answering with Answer Paragraph Selection
by: Mishra, Ritwik, et al.
Published: (2024)

DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing
by: Sahipjohn, Neha, et al.
Published: (2024)

VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech
by: Gudmalwar, Ashishkumar, et al.
Published: (2024)

Reconstruction Guided Few-shot Network For Remote Sensing Image Classification
by: Jaiswal, Mohit, et al.
Published: (2026)

EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion
by: Gudmalwar, Ashishkumar, et al.
Published: (2024)

Enhancing Scientific Visual Question Answering via Vision-Caption aware Supervised Fine-Tuning
by: Kapuriya, Janak, et al.
Published: (2025)

Probabilistic Online Event Downsampling
by: Girbau-Xalabarder, Andreu, et al.
Published: (2025)

The Effects of Short Video-Sharing Services on Video Copy Detection
by: Yanagi, Rintaro, et al.
Published: (2024)

Fairness Without Labels: Pseudo-Balancing for Bias Mitigation in Face Gender Classification
by: Dong, Haohua, et al.
Published: (2025)