| Main Authors: | Hua, Andong, Tang, Kenan, Gu, Chenhe, Gu, Jindong, Wong, Eric, Qin, Yao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.01790 |
Similar Items
Improving Adversarial Transferability in MLLMs via Dynamic Vision-Language Alignment Attack
by: Gu, Chenhe, et al.
Published: (2025)
Banana100: Breaking NR-IQA Metrics by 100 Iterative Image Replications with Nano Banana Pro
by: Tang, Kenan, et al.
Published: (2026)
PromptBench: A Unified Library for Evaluation of Large Language Models
by: Zhu, Kaijie, et al.
Published: (2023)
Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates
by: Lyu, Kaifeng, et al.
Published: (2024)
Interesting Scientific Idea Generation using Knowledge Graphs and LLMs: Evaluations with 100 Research Group Leaders
by: Gu, Xuemei, et al.
Published: (2024)
Special Characters Attack: Toward Scalable Training Data Extraction From Large Language Models
by: Bai, Yang, et al.
Published: (2024)
Architectural Flaw Detection in Civil Engineering Using GPT-4
by: Kumar, Saket, et al.
Published: (2024)
Selective Prompting Tuning for Personalized Conversations with LLMs
by: Huang, Qiushi, et al.
Published: (2024)
AgentBench: Evaluating LLMs as Agents
by: Liu, Xiao, et al.
Published: (2023)
LLMs Should Express Uncertainty Explicitly
by: Guo, Junyu, et al.
Published: (2026)
On Meta-Prompting
by: de Wynter, Adrian, et al.
Published: (2023)
Batch Calibration: Rethinking Calibration for In-Context Learning and Prompt Engineering
by: Zhou, Han, et al.
Published: (2023)
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge
by: Tang, Yao, et al.
Published: (2026)
Dynamic Evaluation of Large Language Models by Meta Probing Agents
by: Zhu, Kaijie, et al.
Published: (2024)
Higher-order Linear Attention
by: Zhang, Yifan, et al.
Published: (2025)
Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox
by: Liu, Yijun, et al.
Published: (2024)
Revisiting Prompt Sensitivity in Large Language Models for Text Classification: The Role of Prompt Underspecification
by: Pecher, Branislav, et al.
Published: (2026)
Does Machine Unlearning Truly Remove Knowledge?
by: Chen, Haokun, et al.
Published: (2025)
KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
by: Yu, Zhuohao, et al.
Published: (2024)
Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents
by: Tang, Zhengyang, et al.
Published: (2026)
Evaluating Prompt Engineering Techniques for Accuracy and Confidence Elicitation in Medical LLMs
by: Naderi, Nariman, et al.
Published: (2025)
Rethinking the Potential of Multimodality in Collaborative Problem Solving Diagnosis with Large Language Models
by: Wong, K., et al.
Published: (2025)
How Susceptible are LLMs to Influence in Prompts?
by: Anagnostidis, Sotiris, et al.
Published: (2024)
Beyond Confidence: Rethinking Self-Assessments for Performance Prediction in LLMs
by: Bhattacharyya, Sree, et al.
Published: (2026)
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts
by: Yin, Yueqin, et al.
Published: (2024)
Tactic: Adaptive Sparse Attention with Clustering and Distribution Fitting for Long-Context LLMs
by: Zhu, Kan, et al.
Published: (2025)
Learning and Enforcing Context-Sensitive Control for LLMs
by: Albinhassan, Mohammad, et al.
Published: (2026)
POSIX: A Prompt Sensitivity Index For Large Language Models
by: Chatterjee, Anwoy, et al.
Published: (2024)
RewardAnything: Generalizable Principle-Following Reward Models
by: Yu, Zhuohao, et al.
Published: (2025)
Tensor Product Attention Is All You Need
by: Zhang, Yifan, et al.
Published: (2025)
Prompt Repetition Improves Non-Reasoning LLMs
by: Leviathan, Yaniv, et al.
Published: (2025)
Tabular Transfer Learning via Prompting LLMs
by: Nam, Jaehyun, et al.
Published: (2024)
DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks
by: Zhu, Kaijie, et al.
Published: (2023)
UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Language Models
by: Gu, Xiaojie, et al.
Published: (2025)
StyleBench: Evaluating thinking styles in Large Language Models
by: Guo, Junyu, et al.
Published: (2025)
Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory
by: Liu, Yexiang, et al.
Published: (2025)
Rethinking Time Series Forecasting with LLMs via Nearest Neighbor Contrastive Learning
by: Bogahawatte, Jayanie, et al.
Published: (2024)
Reasoning Through Execution: Unifying Process and Outcome Rewards for Code Generation
by: Yu, Zhuohao, et al.
Published: (2024)
An Evaluation on Large Language Model Outputs: Discourse and Memorization
by: de Wynter, Adrian, et al.
Published: (2023)
Understanding and Mitigating Bias Inheritance in LLM-based Data Augmentation on Downstream Tasks
by: Li, Miaomiao, et al.
Published: (2025)