| Main Author: | Mouzouni, Charafeddine |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.21368 |
Similar Items
Mapping the Exploitation Surface: A 10,000-Trial Taxonomy of What Makes LLM Agents Exploit Vulnerabilities
by: Mouzouni, Charafeddine
Published: (2026)
Three Phases of Expert Routing: How Load Balance Evolves During Mixture-of-Experts Training
by: Mouzouni, Charafeddine
Published: (2026)
Context Kubernetes: Declarative Orchestration of Enterprise Knowledge for Agentic AI Systems
by: Mouzouni, Charafeddine
Published: (2026)
Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems
by: Geng, Jiayi, et al.
Published: (2025)
Ten Words Only Still Help: Improving Black-Box AI-Generated Text Detection via Proxy-Guided Efficient Re-Sampling
by: Shi, Yuhui, et al.
Published: (2024)
Multi-Agent Reasoning with Consistency Verification Improves Uncertainty Calibration in Medical MCQA
by: Martinez, John Ray B.
Published: (2026)
FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs
by: Sawczyn, Albert, et al.
Published: (2025)
Soft Self-Consistency Improves Language Model Agents
by: Wang, Han, et al.
Published: (2024)
Large Language Model Confidence Estimation via Black-Box Access
by: Pedapati, Tejaswini, et al.
Published: (2024)
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents
by: Miao, Jiacheng, et al.
Published: (2025)
Efficient Test-Time Scaling via Self-Calibration
by: Huang, Chengsong, et al.
Published: (2025)
ConsistRM: Improving Generative Reward Models via Consistency-Aware Self-Training
by: Liang, Yu, et al.
Published: (2026)
Training Deliberative Monitors for Black-Box Scheming Detection
by: Sinha, Aditya, et al.
Published: (2026)
Diagnosing LLM Judge Reliability: Conformal Prediction Sets and Transitivity Violations
by: Gupta, Manan, et al.
Published: (2026)
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers
by: Bouchard, Dylan, et al.
Published: (2025)
Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs
by: Li, Changhao, et al.
Published: (2024)
In-Context Explainers: Harnessing LLMs for Explaining Black Box Models
by: Kroeger, Nicholas, et al.
Published: (2023)
HYDRA: Model Factorization Framework for Black-Box LLM Personalization
by: Zhuang, Yuchen, et al.
Published: (2024)
Self-Consistency Preference Optimization
by: Prasad, Archiki, et al.
Published: (2024)
BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
by: Sun, Haotian, et al.
Published: (2024)
Mafin: Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning
by: Zhang, Mingtian, et al.
Published: (2024)
Topic Modelling Black Box Optimization
by: Akramov, Roman, et al.
Published: (2025)
Calibration vs Decision Making: Revisiting the Reliability Paradox in Unlearned Language Models
by: Shukla, Divyaksh, et al.
Published: (2026)
The Reliability Paradox: Exploring How Shortcut Learning Undermines Language Model Calibration
by: Bihani, Geetanjali, et al.
Published: (2024)
Think Consistently, Reason Efficiently: Energy-Based Calibration for Implicit Chain-of-Thought
by: Chen, Zhikang, et al.
Published: (2025)
How to Train Your Advisor: Steering Black-Box LLMs with Advisor Models
by: Asawa, Parth, et al.
Published: (2025)
Bias Similarity Measurement: A Black-Box Audit of Fairness Across LLMs
by: Jeong, Hyejun, et al.
Published: (2024)
Maestro: Joint Graph & Config Optimization for Reliable AI Agents
by: Wang, Wenxiao, et al.
Published: (2025)
Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration
by: Xu, Ran, et al.
Published: (2025)
Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction
by: Ye, Yuanchang, et al.
Published: (2025)
Self-Training Meets Consistency: Improving LLMs' Reasoning with Consistency-Driven Rationale Evaluation
by: Lee, Jaehyeok, et al.
Published: (2024)
Can AI-Generated Text be Reliably Detected?
by: Sadasivan, Vinu Sankar, et al.
Published: (2023)
Efficient Non-Parametric Uncertainty Quantification for Black-Box Large Language Models and Decision Planning
by: Tsai, Yao-Hung Hubert, et al.
Published: (2024)
Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning
by: Zhou, Cai, et al.
Published: (2026)
Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
by: Mehrotra, Anay, et al.
Published: (2023)
Peering Inside the Black Box: Uncovering LLM Errors in Optimization Modelling through Component-Level Evaluation
by: Refai, Dania, et al.
Published: (2025)
CYCLE-INSTRUCT: Fully Seed-Free Instruction Tuning via Dual Self-Training and Cycle Consistency
by: Shen, Zhanming, et al.
Published: (2025)
A Semantic-Sampling Framework for Evaluating Calibration in Open-Ended Question Answering
by: Wang, Zhanliang, et al.
Published: (2026)
Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning
by: Chen, Yanda, et al.
Published: (2024)
Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting
by: Hu, Michael Y., et al.
Published: (2025)