| Main Authors: | Maheshwari, Gaurav, Bellet, Aurélien, Denis, Pascal, Keller, Mikaela |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2405.14521 |
Similar Items
Efficacy of Synthetic Data as a Benchmark
by: Maheshwari, Gaurav, et al.
Published: (2024)
Curate-Train-Refine: A Closed-Loop Agentic Framework for Zero Shot Classification
by: Maheshwari, Gaurav, et al.
Published: (2026)
To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models
by: Liétard, Bastien, et al.
Published: (2024)
Optimal Transport under Group Fairness Constraints
by: Bleistein, Linus, et al.
Published: (2026)
Loss Gap Parity for Fairness in Heterogeneous Federated Learning
by: Erraji, Brahim, et al.
Published: (2026)
Enhancing Clinical Documentation with Synthetic Data: Leveraging Generative Models for Improved Accuracy
by: Biswas, Anjanava, et al.
Published: (2024)
BARE: Leveraging Base Language Models for Few-Shot Synthetic Data Generation
by: Zhu, Alan, et al.
Published: (2025)
Financial Sentiment Analysis: Leveraging Actual and Synthetic Data for Supervised Fine-tuning
by: Atsiwo, Abraham
Published: (2024)
Privacy Amplification Through Synthetic Data: Insights from Linear Regression
by: Pierquin, Clément, et al.
Published: (2025)
CasualSynth: Generating Structurally Sound Synthetic Data
by: Cheng, Zehua, et al.
Published: (2026)
Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification
by: Pecher, Branislav, et al.
Published: (2026)
Privacy Amplification Persists under Unlimited Synthetic Data Release
by: Pierquin, Clément, et al.
Published: (2026)
MALTO at SemEval-2024 Task 6: Leveraging Synthetic Data for LLM Hallucination Detection
by: Borra, Federico, et al.
Published: (2024)
Hierarchical Latent Structures in Data Generation Process Unify Mechanistic Phenomena across Scale
by: Rohweder, Jonas, et al.
Published: (2026)
Q-NL Verifier: Leveraging Synthetic Data for Robust Knowledge Graph Question Answering
by: Schwabe, Tim, et al.
Published: (2025)
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
by: Patel, Ajay, et al.
Published: (2024)
BadFair: Backdoored Fairness Attacks with Group-conditioned Triggers
by: Xue, Jiaqi, et al.
Published: (2024)
DICTDIS: Dictionary Constrained Disambiguation for Improved NMT
by: Maheshwari, Ayush, et al.
Published: (2022)
Balancing Cost and Effectiveness of Synthetic Data Generation Strategies for LLMs
by: Chan, Yung-Chieh, et al.
Published: (2024)
Towards Active Synthetic Data Generation for Finetuning Language Models
by: Kessler, Samuel, et al.
Published: (2025)
Group Fairness Meets the Black Box: Enabling Fair Algorithms on Closed LLMs via Post-Processing
by: Xian, Ruicheng, et al.
Published: (2025)
Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation
by: Kartik, Kartik, et al.
Published: (2024)
Optimal Transport with Heterogeneously Missing Data
by: Bleistein, Linus, et al.
Published: (2025)
Reasoning-Driven Synthetic Data Generation and Evaluation
by: Davidson, Tim R., et al.
Published: (2026)
The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions
by: Wagner, Stefan Sylvius, et al.
Published: (2024)
Activation Steering for Synthetic Data Generation: The Role of Diversity in Downstream Safety Detection
by: Deshpande, Vijeta, et al.
Published: (2026)
Synthetic Context Generation for Question Generation
by: Liu, Naiming, et al.
Published: (2024)
Transfer of Structural Knowledge from Synthetic Languages
by: Budnikov, Mikhail, et al.
Published: (2025)
Fill In The Gaps: Model Calibration and Generalization with Synthetic Data
by: Ba, Yang, et al.
Published: (2024)
Out-of-Distribution Detection using Synthetic Data Generation
by: Abbas, Momin, et al.
Published: (2025)
Dynamic Context Evolution for Scalable Synthetic Data Generation
by: Lingo, Ryan, et al.
Published: (2026)
DRTriton: Large-Scale Synthetic Data Driven Reinforcement Learning for Triton Kernel Generation
by: Guo, Siqi, et al.
Published: (2026)
Sometimes I am a Tree: Data Drives Unstable Hierarchical Generalization
by: Qin, Tian, et al.
Published: (2024)
Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs
by: Zantedeschi, Valentina, et al.
Published: (2019)
Private Rate-Constrained Optimization with Applications to Fair Learning
by: Yaghini, Mohammad, et al.
Published: (2025)
DualAlign: Generating Clinically Grounded Synthetic Data
by: Li, Rumeng, et al.
Published: (2025)
A Review on Generative AI Models for Synthetic Medical Text, Time Series, and Longitudinal Data
by: Loni, Mohammad, et al.
Published: (2024)
RingSQL: Generating Synthetic Data with Schema-Independent Templates for Text-to-SQL Reasoning Models
by: Sterbentz, Marko, et al.
Published: (2026)
Synthetic Data Generation in Low-Resource Settings via Fine-Tuning of Large Language Models
by: Kaddour, Jean, et al.
Published: (2023)
CALICO: Conversational Agent Localization via Synthetic Data Generation
by: Rosenbaum, Andy, et al.
Published: (2024)