| Main authors: | Xu, Zehao, Paek, Tony, O'Sullivan, Kevin, Dobi, Attila |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2602.16111 |
Similar items
Decision Quality Evaluation Framework at Pinterest
by: Tian, Yuqi, et al.
Published: (2026)
Efficient Prediction of Pass@k Scaling in Large Language Models
by: Kazdan, Joshua, et al.
Published: (2025)
Conformal Safety Monitoring for Flight Testing: A Case Study in Data-Driven Safety Learning
by: Feldman, Aaron O., et al.
Published: (2025)
Sample-Efficient and Surrogate-Based Design Optimization of Underwater Vehicle Hulls
by: Vardhan, Harsh, et al.
Published: (2023)
SEED-SET: Scalable Evolving Experimental Design for System-level Ethical Testing
by: Parashar, Anjali, et al.
Published: (2026)
MC-GTA: Metric-Constrained Model-Based Clustering using Goodness-of-fit Tests with Autocorrelations
by: Wang, Zhangyu, et al.
Published: (2024)
StatLLM: A Dataset for Evaluating the Performance of Large Language Models in Statistical Analysis
by: Song, Xinyi, et al.
Published: (2025)
Performance Evaluation of Large Language Models in Statistical Programming
by: Song, Xinyi, et al.
Published: (2025)
Subnational Geocoding of Global Disasters Using Large Language Models
by: Ronco, Michele, et al.
Published: (2025)
Evaluating the Use of Large Language Models as Synthetic Social Agents in Social Science Research
by: Madden, Emma Rose
Published: (2025)
From Passive Metric to Active Signal: The Evolving Role of Uncertainty Quantification in Large Language Models
by: Zhang, Jiaxin, et al.
Published: (2026)
AI for Handball: predicting and explaining the 2024 Olympic Games tournament with Deep Learning and Large Language Models
by: Felice, Florian
Published: (2024)
Analyzing the factors that are involved in length of inpatient stay at the hospital for diabetes patients
by: Lam, Jorden, et al.
Published: (2024)
Quantifying Uncertainty in AI Visibility: A Statistical Framework for Generative Search Measurement
by: Sielinski, Ronald
Published: (2026)
Classification Modeling with RNN-Based, Random Forest, and XGBoost for Imbalanced Data: A Case of Early Crash Detection in ASEAN-5 Stock Markets
by: Siswara, Deri, et al.
Published: (2024)
DeepScore: A Comprehensive Approach to Measuring Quality in AI-Generated Clinical Documentation
by: Oleson, Jon
Published: (2024)
E-valuator: Reliable Agent Verifiers with Sequential Hypothesis Testing
by: Sadhuka, Shuvom, et al.
Published: (2025)
A Distribution-Free Framework for Rewrite-Based Human-text Detection via Knockoff Filtering
by: Liu, Yi
Published: (2026)
When prompt perturbations break your A/B test: A valid statistical test for generative surveying
by: Helm, Hayden, et al.
Published: (2026)
Scale-Translation Equivariant Network for Oceanic Internal Solitary Wave Localization
by: Wan, Zhang, et al.
Published: (2024)
FactsR: A Safer Method for Producing High Quality Healthcare Documentation
by: Hansen, Victor Petrén Bach, et al.
Published: (2025)
Latency-Response Theory Model: Evaluating Large Language Models via Response Accuracy and Chain-of-Thought Length
by: Xu, Zhiyu, et al.
Published: (2025)
A Statistical Theory of Regularization-Based Continual Learning
by: Zhao, Xuyang, et al.
Published: (2024)
Acquiring Better Load Estimates by Combining Anomaly and Change Point Detection in Power Grid Time-series Measurements
by: Bouman, Roel, et al.
Published: (2024)
Personalization of Large Foundation Models for Health Interventions
by: Konigorski, Stefan, et al.
Published: (2026)
Eligibility-Aware Evidence Synthesis: An Agentic Framework for Clinical Trial Meta-Analysis
by: Zhao, Yao, et al.
Published: (2026)
Confidence Adjusted Surprise Measure for Active Resourceful Trials (CA-SMART): A Data-driven Active Learning Framework for Accelerating Material Discovery under Resource Constraints
by: Raihan, Ahmed Shoyeb, et al.
Published: (2025)
Cinder: A fast and fair matchmaking system
by: Pal, Saurav
Published: (2025)
The Evolution of Probabilistic Price Forecasting Techniques: A Review of the Day-Ahead, Intra-Day, and Balancing Markets
by: O'Connor, Ciaran, et al.
Published: (2025)
Learning Explainable Treatment Policies with Clinician-Informed Representations: A Practical Approach
by: Ferstad, Johannes O., et al.
Published: (2024)
Can-SAVE: Deploying Low-Cost and Population-Scale Cancer Screening via Survival Analysis Variables and EHR
by: Philonenko, Petr, et al.
Published: (2023)
TransitGPT: A Generative AI-based framework for interacting with GTFS data using Large Language Models
by: Devunuri, Saipraneeth, et al.
Published: (2024)
CERES: A Probabilistic Early Warning System for Acute Food Insecurity
by: Pedersen, Tom Danny S.
Published: (2026)
A network analysis of decision strategies of human experts in steel manufacturing
by: Merten, Daniel Christopher, et al.
Published: (2021)
Predictive Scale-Bridging Simulations through Active Learning
by: Karra, Satish, et al.
Published: (2022)
HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics
by: Luettgau, Lennart, et al.
Published: (2025)
Process-Aware Analysis of Treatment Paths in Heart Failure Patients: A Case Study
by: Beyel, Harry H., et al.
Published: (2024)
TCKAN:A Novel Integrated Network Model for Predicting Mortality Risk in Sepsis Patients
by: Dong, Fanglin
Published: (2024)
VACT: A Video Automatic Causal Testing System and a Benchmark
by: Yang, Haotong, et al.
Published: (2025)
A survey of using EHR as real-world evidence for discovering and validating new drug indications
by: Talukdar, Nabasmita, et al.
Published: (2025)