Library Catalog

Bibliographic Details
Main Authors:	Xi, Xiaoyin; Capak, Neeku; Stockwell, Kate; Yu, Zhe
Format:	Preprint
Published:	2026
Subjects:	Software Engineering; Machine Learning
Online Access:	https://arxiv.org/abs/2601.06761

Similar Items

Efficient Story Point Estimation With Comparative Learning
by: Khan, Monoshiz Mahbub, et al.
Published: (2025)

Primal Generation, Dual Judgment: Self-Training from Test-Time Scaling
by: Jiao, Yizhu, et al.
Published: (2026)

Comparative Analysis of AWS Model Deployment Services
by: Bagai, Rahul
Published: (2024)

Data Augmentation for Code Translation with Comparable Corpora and Multiple References
by: Xie, Yiqing, et al.
Published: (2023)

From Particles to Perils: SVGD-Based Hazardous Scenario Generation for Autonomous Driving Systems Testing
by: Liang, Linfeng, et al.
Published: (2026)

Comparative Analysis of Quantum and Classical Support Vector Classifiers for Software Bug Prediction: An Exploratory Study
by: Nadim, Md, et al.
Published: (2025)

RM -RF: Reward Model for Run-Free Unit Test Evaluation
by: Bruches, Elena, et al.
Published: (2026)

Influence-Guided Concolic Testing of Transformer Robustness
by: Hong, Chih-Duo, et al.
Published: (2025)

Enhancing LLM-Based Test Generation by Eliminating Covered Code
by: Xu, WeiZhe, et al.
Published: (2026)

Detecting Proxy Gaming in RL and LLM Alignment via Evaluator Stress Tests
by: Shihab, Ibne Farabi, et al.
Published: (2025)

PrismaDV: Automated Task-Aware Data Unit Test Generation
by: Chen, Hao, et al.
Published: (2026)

Adaptive Reinforcement Learning for Dynamic Configuration Allocation in Pre-Production Testing
by: Zhu, Yu
Published: (2025)

REFLEX: Reference-Free Evaluation of Log Summarization via Large Language Model Judgment
by: Mudgal, Priyanka
Published: (2025)

Comparative Evaluation of Embedding Representations for Financial News Sentiment Analysis
by: Roy, Joyjit, et al.
Published: (2025)

Concolic Testing on Individual Fairness of Neural Network Models
by: Huang, Ming-I, et al.
Published: (2025)

FairReweighing: Density Estimation-Based Reweighing Framework for Improving Separation in Fair Regression
by: Xi, Xiaoyin, et al.
Published: (2025)

Generative AI to Generate Test Data Generators
by: Baudry, Benoit, et al.
Published: (2024)

Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging
by: Feng, Siyuan, et al.
Published: (2024)

Investigating Test Overfitting on SWE-bench
by: Ahmed, Toufique, et al.
Published: (2025)

Latent Regularization in Generative Test Input Generation
by: Merabishvili, Giorgi, et al.
Published: (2026)

Targeted Deep Learning System Boundary Testing
by: Weißl, Oliver, et al.
Published: (2024)

Automated Trustworthiness Testing for Machine Learning Classifiers
by: Cho, Steven, et al.
Published: (2024)

Targeted Test Selection Approach in Continuous Integration
by: Plyusnin, Pavel, et al.
Published: (2025)

LGMT: Logic-Grounded Metamorphic Testing for Evaluating the Reasoning Reliability of LLMs
by: Zhou, Zenghui, et al.
Published: (2026)

Mock Deep Testing: Toward Separate Development of Data and Models for Deep Learning
by: Manke, Ruchira, et al.
Published: (2025)

HyperNet-Adaptation for Diffusion-Based Test Case Generation
by: Weißl, Oliver, et al.
Published: (2026)

CA2: Code-Aware Agent for Automated Game Testing
by: Adaikkappan, Valliappan Chidambaram, et al.
Published: (2026)

Automating REST API Postman Test Cases Using LLM
by: Sri, S Deepika, et al.
Published: (2024)

Otter: Generating Tests from Issues to Validate SWE Patches
by: Ahmed, Toufique, et al.
Published: (2025)

Automatic Detection of LLM-Generated Code: A Comparative Case Study of Contemporary Models Across Function and Class Granularities
by: Rahman, Musfiqur, et al.
Published: (2024)

Can OpenSource beat ChatGPT? -- A Comparative Study of Large Language Models for Text-to-Code Generation
by: Mayer, Luis, et al.
Published: (2024)

Story Point Estimation Using Large Language Models
by: Shetty, Pranam Prakash, et al.
Published: (2026)

Data vs. Model Machine Learning Fairness Testing: An Empirical Study
by: Shome, Arumoy, et al.
Published: (2024)

ExplainFuzz: Explainable and Constraint-Conditioned Test Generation with Probabilistic Circuits
by: Baiget, Annaëlle, et al.
Published: (2026)

Using Large Language Models to Generate JUnit Tests: An Empirical Study
by: Siddiq, Mohammed Latif, et al.
Published: (2023)

PRIMG : Efficient LLM-driven Test Generation Using Mutant Prioritization
by: Bouafif, Mohamed Salah, et al.
Published: (2025)

Identifying Flaky Tests in Quantum Code: A Machine Learning Approach
by: Kaur, Khushdeep, et al.
Published: (2025)

LLM-Powered Test Case Generation for Detecting Bugs in Plausible Programs
by: Liu, Kaibo, et al.
Published: (2024)

Constrained Adversarial Learning for Automated Software Testing: a literature review
by: Vitorino, João, et al.
Published: (2023)

GPU Temperature Simulation-Based Testing for In-Vehicle Deep Learning Frameworks
by: Zou, Yinglong, et al.
Published: (2025)