| Main Authors: | Ivanov, Maksim; Rana, Abhijay |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.26321 |
Similar Items
Can Coding Agents Be General Agents?
by: Ivanov, Maksim, et al.
Published: (2026)
Benchmark Test-Time Scaling of General LLM Agents
by: Li, Xiaochuan, et al.
Published: (2026)
Beneficial Reasoning Behaviors in Agentic Search and Effective Post-training to Obtain Them
by: Jin, Jiahe, et al.
Published: (2025)
From Single Agent to Multi-Agent: Improving Traffic Signal Control
by: Tislenko, Maksim, et al.
Published: (2024)
Moral Anchor System: A Predictive Framework for AI Value Alignment and Drift Prevention
by: Ravindran, Santhosh Kumar
Published: (2025)
Artifacts as Memory Beyond the Agent Boundary
by: Martin, John D., et al.
Published: (2026)
SynthAI: A Multi Agent Generative AI Framework for Automated Modular HLS Design Generation
by: Sheikholeslam, Seyed Arash, et al.
Published: (2024)
Privacy Artifact ConnecTor (PACT): Embedding Enterprise Artifacts for Compliance AI Agents
by: Fang, Chenhao, et al.
Published: (2025)
SafeMind: Benchmarking and Mitigating Safety Risks in Embodied LLM Agents
by: Chen, Ruolin, et al.
Published: (2025)
Federated Learning with Sample-level Client Drift Mitigation
by: Xu, Haoran, et al.
Published: (2025)
AnchorDrive: LLM Scenario Rollout with Anchor-Guided Diffusion Regeneration for Safety-Critical Scenario Generation
by: Jiang, Zhulin, et al.
Published: (2026)
Awakening Codex | AI Foundations Drift vs. Anchor: Cross-Instance Diagnostic Behavioral Coherence Testing Across Container States Feb 2026
by: Solen, Alyssa, et al.
Published: (2026)
DIAMOND: Directed Inference for Artifact Mitigation in Flow Matching Models
by: Polowczyk, Alicja, et al.
Published: (2026)
Topology Reorganized Graph Contrastive Learning with Mitigating Semantic Drift
by: Zhang, Jiaqiang, et al.
Published: (2024)
When Agents Persuade: Rhetoric Generation and Mitigation in LLMs
by: Jose, Julia, et al.
Published: (2026)
RoleCDE:Benchmarking and Mitigating Role-Alignment Trade-offs in Role-Playing Agents
by: Lai, Huayi, et al.
Published: (2026)
ManiBench: A Benchmark for Testing Visual-Logic Drift and Syntactic Hallucinations in Manim Code Generation
by: Oli, Nabin
Published: (2026)
Model-First Reasoning LLM Agents: Reducing Hallucinations through Explicit Problem Modeling
by: Rana, Annu, et al.
Published: (2025)
FLAME: Adaptive and Reactive Concept Drift Mitigation for Federated Learning Deployments
by: Mavromatis, Ioannis, et al.
Published: (2024)
PARALLAX: Separating Genuine Hallucination Detection from Benchmark Construction Artifacts
by: Hussain, Khizar, et al.
Published: (2026)
VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance
by: Taesiri, Mohammad Reza, et al.
Published: (2025)
Agent Tools Orchestration Leaks More: Dataset, Benchmark, and Mitigation
by: Qiao, Yuxuan, et al.
Published: (2025)
$A^3$-Bench: Benchmarking Memory-Driven Scientific Reasoning via Anchor and Attractor Activation
by: Zhang, Jian, et al.
Published: (2026)
Drift-Based Dataset Stability Benchmark
by: Soukup, Dominik, et al.
Published: (2025)
AFSS: Artifact-Focused Self-Synthesis for Mitigating Bias in Audio Deepfake Detection
by: Nguyen-Le, Hai-Son, et al.
Published: (2026)
AI Benchmarks and Datasets for LLM Evaluation
by: Ivanov, Todor, et al.
Published: (2024)
Mitigating Resolution-Drift in Federated Learning: Case of Keypoint Detection
by: Lim, Taeheon, et al.
Published: (2025)
Analyzing and Mitigating Negation Artifacts using Data Augmentation for Improving ELECTRA-Small Model Accuracy
by: Noghabaei, Mojtaba
Published: (2025)
Hallucination by Code Generation LLMs: Taxonomy, Benchmarks, Mitigation, and Challenges
by: Lee, Yunseo, et al.
Published: (2025)
Adaptive Meta-Learning for Robust Deepfake Detection: A Multi-Agent Framework to Data Drift and Model Generalization
by: P, Dinesh Srivasthav, et al.
Published: (2024)
Agent Drift: Quantifying Behavioral Degradation in Multi-Agent LLM Systems Over Extended Interactions
by: Rath, Abhishek
Published: (2026)
LinkAnchor: An Autonomous LLM-Based Agent for Issue-to-Commit Link Recovery
by: Akhavan, Arshia, et al.
Published: (2025)
Still Fresh? Evaluating Temporal Drift in Retrieval Benchmarks
by: Kuissi, Nathan, et al.
Published: (2026)
BrokenVideos: A Benchmark Dataset for Fine-Grained Artifact Localization in AI-Generated Videos
by: Lin, Jiahao, et al.
Published: (2025)
Mitigating Relative Over-Generalization in Multi-Agent Reinforcement Learning
by: Zhu, Ting, et al.
Published: (2024)
AraTable: Benchmarking LLMs' Reasoning and Understanding of Arabic Tabular Data
by: Alshaikh, Rana, et al.
Published: (2025)
A General Anchor-Based Framework for Scalable Fair Clustering
by: Wei, Shengfei, et al.
Published: (2025)
Technical Report: Evaluating Goal Drift in Language Model Agents
by: Arike, Rauno, et al.
Published: (2025)
Social Bias in LLM-Generated Code: Benchmark and Mitigation
by: Rabbi, Fazle, et al.
Published: (2026)
GTA: A Benchmark for General Tool Agents
by: Wang, Jize, et al.
Published: (2024)