:: Library Catalog

Bibliographic Details
Main Authors:	Dorner, Florian E., Chen, Yatong, Cruz, André F., Yang, Fanny
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2507.12399

Similar Items

How Benchmark Prediction from Fewer Data Misses the Mark
by: Zhang, Guanhua, et al.
Published: (2025)

Don't Label Twice: Quantity Beats Quality when Comparing Binary Classifiers on a Budget
by: Dorner, Florian E., et al.
Published: (2024)

Conditional Prediction ROC Bands for Graph Classification
by: Wu, Yujia, et al.
Published: (2024)

Multiclass ROC
by: Wang, Liang, et al.
Published: (2024)

Limits to scalable evaluation at the frontier: LLM as Judge won't beat twice the data
by: Dorner, Florian E., et al.
Published: (2024)

Knee or ROC
by: Wendt, Veronica, et al.
Published: (2024)

Interactive proofs for verifying (quantum) learning and testing
by: Caro, Matthias C., et al.
Published: (2024)

Strategic Hypothesis Testing
by: Hossain, Safwan, et al.
Published: (2025)

SubROC: AUC-Based Discovery of Exceptional Subgroup Performance for Binary Classifiers
by: Siegl, Tom, et al.
Published: (2025)

A Multiclass ROC Curve
by: Giudici, Paolo, et al.
Published: (2025)

Training on the Test Task Confounds Evaluation and Emergence
by: Dominguez-Olmedo, Ricardo, et al.
Published: (2024)

Performative Prediction with Bandit Feedback: Learning through Reparameterization
by: Chen, Yatong, et al.
Published: (2023)

Federated Computation of ROC and PR Curves
by: Xu, Xuefeng, et al.
Published: (2025)

Incentivizing Honesty among Competitors in Collaborative Learning and Optimization
by: Dorner, Florian E., et al.
Published: (2023)

To Give or Not to Give? The Impacts of Strategically Withheld Recourse
by: Chen, Yatong, et al.
Published: (2025)

Leaderboard Incentives: Model Rankings under Strategic Post-Training
by: Chen, Yatong, et al.
Published: (2026)

s1: Simple test-time scaling
by: Muennighoff, Niklas, et al.
Published: (2025)

Scaling Up ROC-Optimizing Support Vector Machines
by: Bae, Gimun, et al.
Published: (2025)

FROC: Building Fair ROC from a Trained Classifier
by: Vummintala, Avyukta Manjunatha, et al.
Published: (2024)

Learning Pareto manifolds in high dimensions: How can regularization help?
by: Wegel, Tobias, et al.
Published: (2025)

Product distribution learning with imperfect advice
by: Bhattacharyya, Arnab, et al.
Published: (2025)

FACROC: a fairness measure for FAir Clustering through ROC curves
by: Quy, Tai Le, et al.
Published: (2025)

Artificial intelligence for methane detection: from continuous monitoring to verified mitigation
by: Mateo-Garcia, Gonzalo, et al.
Published: (2025)

Whose Preferences? Differences in Fairness Preferences and Their Impact on the Fairness of AI Utilizing Human Feedback
by: Lerner, Emilia Agis, et al.
Published: (2024)

Predicting Blood Type: Assessing Model Performance with ROC Analysis
by: Altayar, Malik A., et al.
Published: (2025)

Area under the ROC Curve has the Most Consistent Evaluation for Binary Classification
by: Li, Jing
Published: (2024)

Delay, Plateau, or Collapse: Evaluating the Impact of Systematic Verification Error on RLVR
by: Egashira, Kazuki, et al.
Published: (2026)

Thought calibration: Efficient and confident test-time scaling
by: Wu, Menghua, et al.
Published: (2025)

Towards a unified and verified understanding of group-operation networks
by: Wu, Wilson, et al.
Published: (2024)

Interval-Based AUC (iAUC): Extending ROC Analysis to Uncertainty-Aware Classification
by: Li, Yuqi, et al.
Published: (2026)

Tournament Leave-pair-out Cross-validation for Receiver Operating Characteristic (ROC) Analysis
by: Perez, Ileana Montoya, et al.
Published: (2018)

Position: Stop Evaluating AI with Human Tests, Develop Principled, AI-specific Tests instead
by: Sühr, Tom, et al.
Published: (2025)

Efficient line search for optimizing Area Under the ROC Curve in gradient descent
by: Fowler, Jadon, et al.
Published: (2024)

Learning multivariate Gaussians with imperfect advice
by: Bhattacharyya, Arnab, et al.
Published: (2024)

Online bipartite matching with imperfect advice
by: Choo, Davin, et al.
Published: (2024)

What should post-training optimize? A test-time scaling law perspective
by: Li, Muheng, et al.
Published: (2026)

Collapsing ROC approach for risk prediction research on both common and rare variants
by: Wei, Changshuai, et al.
Published: (2025)

Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression
by: Zhang, Ziqi, et al.
Published: (2024)

Connections between reinforcement learning with feedback, test-time scaling, and diffusion guidance: An anthology
by: Jiao, Yuchen, et al.
Published: (2025)

How does over-squashing affect the power of GNNs?
by: Di Giovanni, Francesco, et al.
Published: (2023)