| Main Authors: | Li, Zongjie, Wang, Chaozheng, Ma, Pingchuan, Wu, Daoyuan, Wang, Shuai, Gao, Cuiyun, Liu, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2310.01432 |
Similar Items
Testing and Understanding Erroneous Planning in LLM Agents through Synthesized User Inputs
by: Ji, Zhenlan, et al.
Published: (2024)
STShield: Single-Token Sentinel for Real-Time Jailbreak Detection in Large Language Models
by: Wang, Xunguang, et al.
Published: (2025)
WARBENCH: A Comprehensive Benchmark for Evaluating LLMs in Military Decision-Making
by: Li, Zongjie, et al.
Published: (2026)
Taxonomy, Evaluation and Exploitation of IPI-Centric LLM Agent Defense Frameworks
by: Ji, Zimo, et al.
Published: (2025)
Measuring and Augmenting Large Language Models for Solving Capture-the-Flag Challenges
by: Ji, Zimo, et al.
Published: (2025)
IP Leakage Attacks Targeting LLM-Based Multi-Agent Systems
by: Wang, Liwen, et al.
Published: (2025)
The Prompt Alchemist: Automated LLM-Tailored Prompt Optimization for Test Case Generation
by: Gao, Shuzheng, et al.
Published: (2025)
Digging Into the Internal: Causality-Based Analysis of LLM Function Calling
by: Ji, Zhenlan, et al.
Published: (2025)
An Empirical Study on Large Language Models in Accuracy and Robustness under Chinese Industrial Scenarios
by: Li, Zongjie, et al.
Published: (2024)
SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner
by: Wang, Xunguang, et al.
Published: (2024)
Differentiation-Based Extraction of Proprietary Data from Fine-Tuned LLMs
by: Li, Zongjie, et al.
Published: (2025)
API-guided Dataset Synthesis to Finetune Large Code Models
by: Li, Zongjie, et al.
Published: (2024)
SoK: Evaluating Jailbreak Guardrails for Large Language Models
by: Wang, Xunguang, et al.
Published: (2025)
GuidedBench: Measuring and Mitigating the Evaluation Discrepancies of In-the-wild LLM Jailbreak Methods
by: Huang, Ruixuan, et al.
Published: (2025)
CodeVisionary: An Agent-based Framework for Evaluating Large Language Models in Code Generation
by: Wang, Xinchen, et al.
Published: (2025)
Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging
by: Wang, Zhixiang, et al.
Published: (2025)
Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models
by: Wang, Xunguang, et al.
Published: (2026)
Extrapolation Merging: Keep Improving With Extrapolation and Merging
by: Lin, Yiguan, et al.
Published: (2025)
Empirical Study of Code Large Language Models for Binary Security Patch Detection
by: Li, Qingyuan, et al.
Published: (2025)
Reasoning as a Resource: Optimizing Fast and Slow Thinking in Code Generation Models
by: Li, Zongjie, et al.
Published: (2025)
Probing Association Biases in LLM Moderation Over-Sensitivity
by: Wang, Yuxin, et al.
Published: (2025)
HALF: Harm-Aware LLM Fairness Evaluation Aligned with Deployment
by: Mekky, Ali, et al.
Published: (2025)
Gender and Positional Biases in LLM-Based Hiring Decisions: Evidence from Comparative CV/Résumé Evaluations
by: Rozado, David
Published: (2025)
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge
by: Ye, Jiayi, et al.
Published: (2024)
SPVR: syntax-to-prompt vulnerability repair based on large language models
by: Wang, Ruoke, et al.
Published: (2024)
Grounded in Reality: Learning and Deploying Proactive LLM from Offline Logs
by: Wei, Fei, et al.
Published: (2025)
Taming Various Privilege Escalation in LLM-Based Agent Systems: A Mandatory Access Control Framework
by: Ji, Zimo, et al.
Published: (2026)
Knowledge-Infused Legal Wisdom: Navigating LLM Consultation through the Lens of Diagnostics and Positive-Unlabeled Reinforcement Learning
by: Wu, Yang, et al.
Published: (2024)
From Biased Chatbots to Biased Agents: Examining Role Assignment Effects on LLM Agent Robustness
by: Cao, Linbo, et al.
Published: (2026)
Reason-Align-Respond: Aligning LLM Reasoning with Knowledge Graphs for KGQA
by: Shen, Xiangqing, et al.
Published: (2025)
BoRP: Bootstrapped Regression Probing for Scalable and Human-Aligned LLM Evaluation
by: Sun, Peng, et al.
Published: (2026)
Training-free LLM Merging for Multi-task Learning
by: Fu, Zichuan, et al.
Published: (2025)
LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner: A Vision Paper
by: Wu, Daoyuan, et al.
Published: (2024)
LPFQA: A Long-Tail Professional Forum-based Benchmark for LLM Evaluation
by: Zhu, Liya, et al.
Published: (2025)
MRScore: Evaluating Radiology Report Generation with LLM-based Reward System
by: Liu, Yunyi, et al.
Published: (2024)
LLMs Judge Themselves: A Game-Theoretic Framework for Human-Aligned Evaluation
by: Yang, Gao, et al.
Published: (2025)
Citation-Enhanced Generation for LLM-based Chatbots
by: Li, Weitao, et al.
Published: (2024)
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
by: Zhao, Chengshuai, et al.
Published: (2025)
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
by: Liu, Deyuan, et al.
Published: (2024)
Search-Based LLMs for Code Optimization
by: Gao, Shuzheng, et al.
Published: (2024)