| Main Author: | Lin, Hender |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.06648 |
Similar Items
Zero-shot LLM-guided Counterfactual Generation: A Case Study on NLP Model Evaluation
by: Bhattacharjee, Amrita, et al.
Published: (2024)
Generative Adversarial Reasoner: Enhancing LLM Reasoning with Adversarial Reinforcement Learning
by: Liu, Qihao, et al.
Published: (2025)
Context-Enhanced Contrastive Search for Improved LLM Text Generation
by: Sen, Jaydip, et al.
Published: (2025)
NLP Verification: Towards a General Methodology for Certifying Robustness
by: Casadio, Marco, et al.
Published: (2024)
Muon is Scalable for LLM Training
by: Liu, Jingyuan, et al.
Published: (2025)
LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training
by: Wang, Yiming, et al.
Published: (2025)
Multi-Preference Optimization: Generalizing DPO via Set-Level Contrasts
by: Gupta, Taneesh, et al.
Published: (2024)
Bridging Robustness and Generalization Against Word Substitution Attacks in NLP via the Growth Bound Matrix Approach
by: Bouri, Mohammed, et al.
Published: (2025)
CAPO: Towards Enhancing LLM Reasoning through Generative Credit Assignment
by: Xie, Guofu, et al.
Published: (2025)
EvalxNLP: A Framework for Benchmarking Post-Hoc Explainability Methods on NLP Models
by: Dhaini, Mahdi, et al.
Published: (2025)
RobustSentEmbed: Robust Sentence Embeddings Using Adversarial Self-Supervised Contrastive Learning
by: Asl, Javad Rafiei, et al.
Published: (2024)
$C^2$: Scalable Auto-Feedback for LLM-based Chart Generation
by: Koh, Woosung, et al.
Published: (2024)
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability
by: Lin, Zicheng, et al.
Published: (2024)
Evaluating Large Language Models Using Contrast Sets: An Experimental Approach
by: Sanwal, Manish
Published: (2024)
Automated Literature Review Using NLP Techniques and LLM-Based Retrieval-Augmented Generation
by: Ali, Nurshat Fateh, et al.
Published: (2024)
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts
by: Yin, Yueqin, et al.
Published: (2024)
Evaluating Adversarial Robustness of Concept Representations in Sparse Autoencoders
by: Li, Aaron J., et al.
Published: (2025)
Tuning without Peeking: Provable Generalization Bounds and Robust LLM Post-Training
by: Labiad, Ismail, et al.
Published: (2025)
Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs
by: Sheshadri, Abhay, et al.
Published: (2024)
Robustly Improving LLM Fairness in Realistic Settings via Interpretability
by: Karvonen, Adam, et al.
Published: (2025)
Enhancing Annotated Bibliography Generation with LLM Ensembles
by: Bermejo, Sergio
Published: (2024)
Post-training an LLM for RAG? Train on Self-Generated Demonstrations
by: Finlayson, Matthew, et al.
Published: (2025)
ToxiGAN: Toxic Data Augmentation via LLM-Guided Directional Adversarial Generation
by: Li, Peiran, et al.
Published: (2026)
GEAR: A General Evaluation Framework for Abductive Reasoning
by: He, Kaiyu, et al.
Published: (2025)
CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR
by: Cui, Sijia, et al.
Published: (2026)
Same Question, Different Words: A Latent Adversarial Framework for Prompt Robustness
by: Fu, Tingchen, et al.
Published: (2025)
Training and Evaluating Language Models with Template-based Data Generation
by: Zhang, Yifan
Published: (2024)
Set-LLM: A Permutation-Invariant LLM
by: Egressy, Beni, et al.
Published: (2025)
Federated Learning with Layer Skipping: Efficient Training of Large Language Models for Healthcare NLP
by: Zhang, Lihong, et al.
Published: (2025)
Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers
by: Karvonen, Adam, et al.
Published: (2025)
A Persona-Based Evaluation Framework for Pluralistic Alignment in Generative AI
by: Karagoz, Atahan
Published: (2026)
Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game
by: Cheng, Pengyu, et al.
Published: (2023)
IPAD: Inverse Prompt for AI Detection - A Robust and Interpretable LLM-Generated Text Detector
by: Chen, Zheng, et al.
Published: (2025)
Enhancing LLM Evaluations: The Garbling Trick
by: Bradley, William F.
Published: (2024)
Interpretable AI for Time-Series: Multi-Model Heatmap Fusion with Global Attention and NLP-Generated Explanations
by: Francis, Jiztom Kavalakkatt, et al.
Published: (2025)
Adversarial Lens: Exploiting Attention Layers to Generate Adversarial Examples for Evaluation
by: Dhole, Kaustubh
Published: (2025)
SyGra: A Unified Graph-Based Framework for Scalable Generation, Quality Tagging, and Management of Synthetic Data
by: Pradhan, Bidyapati, et al.
Published: (2025)
Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains
by: Xu, Austin, et al.
Published: (2025)
CodeRefine: A Pipeline for Enhancing LLM-Generated Code Implementations of Research Papers
by: Trofimova, Ekaterina, et al.
Published: (2024)
From Text to Graph: Leveraging Graph Neural Networks for Enhanced Explainability in NLP
by: Yáñez-Romero, Fabio, et al.
Published: (2025)