| Main Authors: | Hui, Zheng, Guo, Zhaoxiao, Zhao, Hang, Duan, Juanyong, Huang, Congrui |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2409.14740 |
Similar Items
ToxiLab: How Well Do Open-Source LLMs Generate Synthetic Toxicity Data?
by: Hui, Zheng, et al.
Published: (2024)
ToxiGAN: Toxic Data Augmentation via LLM-Guided Directional Adversarial Generation
by: Li, Peiran, et al.
Published: (2026)
Beyond Direct Generation: A Decomposed Approach to Well-Crafted Screenwriting with LLMs
by: Lei, Hang, et al.
Published: (2025)
HarmMetric Eval: Benchmarking Metrics and Judges for LLM Harmfulness Assessment
by: Yang, Langqi, et al.
Published: (2025)
Lost-in-the-Middle in Long-Text Generation: Synthetic Dataset, Evaluation Framework, and Mitigation
by: Zhang, Junhao, et al.
Published: (2025)
VaccineRAG: Boosting Multimodal Large Language Models' Immunity to Harmful RAG Samples
by: Sun, Qixin, et al.
Published: (2025)
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation
by: Huang, Tiansheng, et al.
Published: (2024)
What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices
by: Chen, Zhi, et al.
Published: (2024)
LieCraft: A Multi-Agent Framework for Evaluating Deceptive Capabilities in Language Models
by: Olson, Matthew Lyle, et al.
Published: (2026)
OmniTrace: A Unified Framework for Generation-Time Attribution in Omni-Modal LLMs
by: Yan, Qianqi, et al.
Published: (2026)
MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection
by: Liu, Ziyan, et al.
Published: (2025)
Harm or Humor: A Multimodal, Multilingual Benchmark for Overt and Covert Harmful Humor
by: Sharshar, Ahmed, et al.
Published: (2026)
ToxiFrench: Benchmarking and Enhancing Language Models via CoT Fine-Tuning for French Toxicity Detection
by: Delaval, Axel, et al.
Published: (2025)
Self-HarmLLM: Can Large Language Model Harm Itself?
by: Kim, Heehwan, et al.
Published: (2025)
StealthGraph: Exposing Domain-Specific Risks in LLMs through Knowledge-Graph-Guided Harmful Prompt Generation
by: Zheng, Huawei, et al.
Published: (2026)
Generating Synthetic Datasets for Few-shot Prompt Tuning
by: Guo, Xu, et al.
Published: (2024)
Optimsyn: Influence-Guided Rubrics Optimization for Synthetic Data Generation
by: Fan, Zhiting, et al.
Published: (2026)
HarmTransform: Transforming Explicit Harmful Queries into Stealthy via Multi-Agent Debate
by: Zhu, Shenzhe
Published: (2025)
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
by: Andriushchenko, Maksym, et al.
Published: (2024)
PluriHarms: Benchmarking the Full Spectrum of Human Judgments on AI Harm
by: Li, Jing-Jing, et al.
Published: (2026)
Why Synthetic Isn't Real Yet: A Diagnostic Framework for Contact Center Dialogue Generation
by: Devanathan, Rishikesh, et al.
Published: (2025)
Measuring Diversity in Synthetic Datasets
by: Zhu, Yuchang, et al.
Published: (2025)
Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content
by: Bianchi, Federico, et al.
Published: (2024)
Controlled Generation for Private Synthetic Text
by: Zhao, Zihao, et al.
Published: (2025)
Cross-Lingual Consistency: A Novel Inference Framework for Advancing Reasoning in Large Language Models
by: Yu, Zhiwei, et al.
Published: (2025)
'For Argument's Sake, Show Me How to Harm Myself!': Jailbreaking LLMs in Suicide and Self-Harm Contexts
by: Schoene, Annika M, et al.
Published: (2025)
TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data
by: Huang, Xiang, et al.
Published: (2024)
Crafting Narrative Closures: Zero-Shot Learning with SSM Mamba for Short Story Ending Generation
by: Sharma, Divyam, et al.
Published: (2024)
GenCRF: Generative Clustering and Reformulation Framework for Enhanced Intent-Driven Information Retrieval
by: Seo, Wonduk, et al.
Published: (2024)
Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework
by: Wang, Dong, et al.
Published: (2025)
Synthetic4Health: Generating Annotated Synthetic Clinical Letters
by: Ren, Libo, et al.
Published: (2024)
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
by: Wang, Weixun, et al.
Published: (2025)
A LLM-Powered Automatic Grading Framework with Human-Level Guidelines Optimization
by: Chu, Yucheng, et al.
Published: (2024)
ChatLang-8: An LLM-Based Synthetic Data Generation Framework for Grammatical Error Correction
by: Park, Jeiyoon, et al.
Published: (2024)
Towards Comprehensive Detection of Chinese Harmful Memes
by: Lu, Junyu, et al.
Published: (2024)
SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests
by: Pandey, Punya Syon, et al.
Published: (2025)
ChallengeMe: An Adversarial Learning-enabled Text Summarization Framework
by: Deng, Xiaoyu, et al.
Published: (2025)
TripCraft: A Benchmark for Spatio-Temporally Fine Grained Travel Planning
by: Chaudhuri, Soumyabrata, et al.
Published: (2025)
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
by: Wang, Yibo, et al.
Published: (2025)
Generation of Synthetic Clinical Text: A Systematic Review
by: Alshaikhdeeb, Basel, et al.
Published: (2025)