
Bibliographic Details
Main Authors:	Biswas, Anjanava; Talukdar, Wrick
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2406.06569

Similar Items

Improving Large Language Model (LLM) fidelity through context-aware grounding: A systematic approach to reliability and veracity
by: Talukdar, Wrick, et al.
Published: (2024)

Synergizing Unsupervised and Supervised Learning: A Hybrid Approach for Accurate Natural Language Task Modeling
by: Talukdar, Wrick, et al.
Published: (2024)

Intelligent Clinical Documentation: Harnessing Generative AI for Patient-Centric Clinical Note Generation
by: Biswas, Anjanava, et al.
Published: (2024)

Robustness of Structured Data Extraction from In-plane Rotated Documents using Multi-Modal Large Language Models (LLM)
by: Biswas, Anjanava, et al.
Published: (2024)

Guardrails for trust, safety, and ethical development and deployment of Large Language Models (LLM)
by: Biswas, Anjanava, et al.
Published: (2026)

FinEmbedDiff: A Cost-Effective Approach of Classifying Financial Documents with Vector Sampling using Multi-modal Embedding Models
by: Biswas, Anjanava, et al.
Published: (2024)

BARE: Leveraging Base Language Models for Few-Shot Synthetic Data Generation
by: Zhu, Alan, et al.
Published: (2025)

DualAlign: Generating Clinically Grounded Synthetic Data
by: Li, Rumeng, et al.
Published: (2025)

Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification
by: Pecher, Branislav, et al.
Published: (2026)

Enhancing Domain-Specific Retrieval-Augmented Generation: Synthetic Data Generation and Evaluation using Reasoning Models
by: Jadon, Aryan, et al.
Published: (2025)

Gas Station of the Future: A Perspective on AI/ML and IoT in Retail Downstream
by: Talukdar, Wrick
Published: (2025)

Fill In The Gaps: Model Calibration and Generalization with Synthetic Data
by: Ba, Yang, et al.
Published: (2024)

Learning from Synthetic Data Improves Multi-hop Reasoning
by: Kabra, Anmol, et al.
Published: (2026)

Speaking the Same Language: Leveraging LLMs in Standardizing Clinical Data for AI
by: Sett, Arindam, et al.
Published: (2024)

Contrastive Decoding for Synthetic Data Generation in Low-Resource Language Modeling
by: Ulm, Jannek, et al.
Published: (2025)

DiffSampling: Enhancing Diversity and Accuracy in Neural Text Generation
by: Franceschelli, Giorgio, et al.
Published: (2025)

Reasoning-Driven Synthetic Data Generation and Evaluation
by: Davidson, Tim R., et al.
Published: (2026)

MisSynth: Improving MISSCI Logical Fallacies Classification with Synthetic Data
by: Poliakov, Mykhailo, et al.
Published: (2025)

DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models
by: Zhou, Ying, et al.
Published: (2024)

Maximize Your Data's Potential: Enhancing LLM Accuracy with Two-Phase Pretraining
by: Feng, Steven, et al.
Published: (2024)

Synthetic Patient-Physician Dialogue Generation from Clinical Notes Using LLM
by: Das, Trisha, et al.
Published: (2024)

Out-of-Distribution Detection using Synthetic Data Generation
by: Abbas, Momin, et al.
Published: (2025)

Dynamic Context Evolution for Scalable Synthetic Data Generation
by: Lingo, Ryan, et al.
Published: (2026)

CasualSynth: Generating Structurally Sound Synthetic Data
by: Cheng, Zehua, et al.
Published: (2026)

West-of-N: Synthetic Preferences for Self-Improving Reward Models
by: Pace, Alizée, et al.
Published: (2024)

CALICO: Conversational Agent Localization via Synthetic Data Generation
by: Rosenbaum, Andy, et al.
Published: (2024)

A Multi-Faceted Evaluation Framework for Assessing Synthetic Data Generated by Large Language Models
by: Yuan, Yefeng, et al.
Published: (2024)

Socially Aware Synthetic Data Generation for Suicidal Ideation Detection Using Large Language Models
by: Ghanadian, Hamideh, et al.
Published: (2024)

AI-assisted Protocol Information Extraction For Improved Accuracy and Efficiency in Clinical Trial Workflows
by: Babaeipour, Ramtin, et al.
Published: (2026)

Topic-VQ-VAE: Leveraging Latent Codebooks for Flexible Topic-Guided Document Generation
by: Yoo, YoungJoon, et al.
Published: (2023)

From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data
by: Xiong, Zheyang, et al.
Published: (2024)

Improving Direct Persian-English Speech-to-Speech Translation with Discrete Units and Synthetic Parallel Data
by: Rashidi, Sina, et al.
Published: (2025)

Enhancing Marker Scoring Accuracy through Ordinal Confidence Modelling in Educational Assessments
by: Chakravarty, Abhirup, et al.
Published: (2025)

Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use
by: Goldie, Anna, et al.
Published: (2025)

Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework
by: Wang, Dong, et al.
Published: (2025)

NutriGen: Personalized Meal Plan Generator Leveraging Large Language Models to Enhance Dietary and Nutritional Adherence
by: Khamesian, Saman, et al.
Published: (2025)

CodecLM: Aligning Language Models with Tailored Synthetic Data
by: Wang, Zifeng, et al.
Published: (2024)

Does Training on Synthetic Data Make Models Less Robust?
by: Zhang, Lingze, et al.
Published: (2025)

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data
by: Huang, Yue, et al.
Published: (2025)

Synthetic vs. Gold: The Role of LLM Generated Labels and Data in Cyberbullying Detection
by: Kazemi, Arefeh, et al.
Published: (2025)