:: Library Catalog

Bibliographic Details
Main Authors:	Andres, Miguel E.; Fedorov, Vadim; Sadek, Rida; Spagnolo-Arrizabalaga, Enric; Trudel, Nadescha
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2511.04133

Similar Items

AI-Assisted Visual Test and Bug Management Platform for Manual Testers
by: Mansi Rajgopal Kulkarni
Published: (2026)

DriveTester: A Unified Platform for Simulation-Based Autonomous Driving Testing
by: Cheng, Mingfei, et al.
Published: (2024)

HPCAgentTester: A Multi-Agent LLM Approach for Enhanced HPC Unit Test Generation
by: Karanjai, Rabimba, et al.
Published: (2025)

AI-Driven Agents with Prompts Designed for High Agreeableness Increase the Likelihood of Being Mistaken for a Human in the Turing Test
by: León-Domínguez, U., et al.
Published: (2024)

Position: Stop Evaluating AI with Human Tests, Develop Principled, AI-specific Tests instead
by: Sühr, Tom, et al.
Published: (2025)

AI-driven Java Performance Testing: Balancing Result Quality with Testing Time
by: Traini, Luca, et al.
Published: (2024)

TRACER: Trace-Based Adaptive Cost-Efficient Routing for LLM Classification
by: Rida, Adam
Published: (2026)

Responsible AI for Test Equity and Quality: The Duolingo English Test as a Case Study
by: Burstein, Jill, et al.
Published: (2024)

AI-Assisted Unit Test Writing and Test-Driven Code Refactoring: A Case Study
by: Smolic, Ema, et al.
Published: (2026)

TestAgent: An Adaptive and Intelligent Expert for Human Assessment
by: Yu, Junhao, et al.
Published: (2025)

WiFiPenTester: Advancing Wireless Ethical Hacking with Governed GenAI
by: Al-Sinani, Haitham S., et al.
Published: (2026)

Breaking Barriers in Software Testing: The Power of AI-Driven Automation
by: Naqvi, Saba, et al.
Published: (2025)

AiraXiv: An AI-Driven Open-Access Platform for Human and AI Scientists
by: Pan, Junshu, et al.
Published: (2026)

SkillTester: Benchmarking Utility and Security of Agent Skills
by: Wang, Leye, et al.
Published: (2026)

LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
by: Wang, Zhijie, et al.
Published: (2024)

"Turing Tests" For An AI Scientist
by: Yin, Xiaoxin
Published: (2024)

Human-AI Collaborative Game Testing with Vision Language Models
by: Zhang, Boran, et al.
Published: (2025)

What Understanding Means in AI-Laden Astronomy
by: Ting, Yuan-Sen, et al.
Published: (2026)

Automated Red Teaming with GOAT: the Generative Offensive Agent Tester
by: Pavlova, Maya, et al.
Published: (2024)

BreachSeek: A Multi-Agent Automated Penetration Tester
by: Alshehri, Ibrahim, et al.
Published: (2024)

An Internet of Intelligent Things Framework for Decentralized Heterogeneous Platforms
by: Allayev, Vadim, et al.
Published: (2025)

Unit Testing in ASP Revisited: Language and Test-Driven Development Environment
by: Amendola, Giovanni, et al.
Published: (2024)

Voice Mapping of Text-to-Speech Systems: A Metric-Based Approach for Voice Quality Assessment
by: Cai, Huanchen, et al.
Published: (2026)

LLMs May Not Be Human-Level Players, But They Can Be Testers: Measuring Game Difficulty with LLM Agents
by: Xiao, Chang, et al.
Published: (2024)

LLMs for Automated Unit Test Generation and Assessment in Java: The AgoneTest Framework
by: Lops, Andrea, et al.
Published: (2025)

Automated Self-Testing as a Quality Gate: Evidence-Driven Release Management for LLM Applications
by: Maiorano, Alexandre Cristovão
Published: (2026)

Tests as Prompt: A Test-Driven-Development Benchmark for LLM Code Generation
by: Cui, Yi
Published: (2025)

Test-Driven Development for Code Generation
by: Mathews, Noble Saji, et al.
Published: (2024)

From Test-Taking to Test-Making: Examining LLM Authoring of Commonsense Assessment Items
by: Roemmele, Melissa, et al.
Published: (2024)

PyGen: A Collaborative Human-AI Approach to Python Package Creation
by: Barua, Saikat, et al.
Published: (2024)

The Future of Software Testing: AI-Powered Test Case Generation and Validation
by: Baqar, Mohammad, et al.
Published: (2024)

AI-Driven Tools in Modern Software Quality Assurance: An Assessment of Benefits, Challenges, and Future Directions
by: Pysmennyi, Ihor, et al.
Published: (2025)

Expert Evaluation and the Limits of Human Feedback in Mental Health AI Safety Testing
by: Jafari, Kiana, et al.
Published: (2026)

Agentic AI for Human Resources: LLM-Driven Candidate Assessment
by: Yuksel, Kamer Ali, et al.
Published: (2026)

Comparing Human Expertise and Large Language Models Embeddings in Content Validity Assessment of Personality Tests
by: Milano, Nicola, et al.
Published: (2025)

MetAdv: A Unified and Interactive Adversarial Testing Platform for Autonomous Driving
by: Liu, Aishan, et al.
Published: (2025)

Disrupting Test Development with AI Assistants
by: Joshi, Vijay, et al.
Published: (2024)

A Turing Test: Are AI Chatbots Behaviorally Similar to Humans?
by: Mei, Qiaozhu, et al.
Published: (2023)

Digital Health and Indoor Air Quality: An IoT-Driven Human-Centred Visualisation Platform for Behavioural Change and Technology Acceptance
by: Kureshi, Rameez Raja, et al.
Published: (2024)

AI-Enabled Adaptive Fault Injection for Self-Regulating Software Testing in AWS Cloud Platforms
by: Manuja Bandal
Published: (2022)