:: Library Catalog

Bibliographic Details
Main Authors:	Huang, Tzu-Heng; Cao, Catherine; Schoenberg, Spencer; Vishwakarma, Harit; Roberts, Nicholas; Sala, Frederic
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2502.12366

Similar Items

Time To Impeach LLM-as-a-Judge: Programs are the Future of Evaluation
by: Huang, Tzu-Heng, et al.
Published: (2025)

Promises and Pitfalls of Threshold-based Auto-labeling
by: Vishwakarma, Harit, et al.
Published: (2022)

OTTER: Effortless Label Distribution Adaptation of Zero-shot Models
by: Shin, Changho, et al.
Published: (2024)

The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators
by: Huang, Tzu-Heng, et al.
Published: (2024)

Learning from Less: Measuring the Effectiveness of RLVR in Low Data and Compute Regimes
by: Bauer, Justin, et al.
Published: (2026)

Stronger Than You Think: Benchmarking Weak Supervision on Realistic Tasks
by: Zhang, Tianyi, et al.
Published: (2025)

Pearls from Pebbles: Improved Confidence Functions for Auto-labeling
by: Vishwakarma, Harit, et al.
Published: (2024)

Adaptive Scoring and Thresholding with Human Feedback for Robust Out-of-Distribution Detection
by: Yamada, Daisuke, et al.
Published: (2025)

Taming False Positives in Out-of-Distribution Detection with Human Feedback
by: Vishwakarma, Harit, et al.
Published: (2024)

MoRe Fine-Tuning with 10x Fewer Parameters
by: Tan, Wenxuan, et al.
Published: (2024)

Automating Benchmark Design
by: Dsouza, Amanda, et al.
Published: (2025)

Evaluating Sample Utility for Efficient Data Selection by Mimicking Model Weights
by: Huang, Tzu-Heng, et al.
Published: (2025)

Weak-to-Strong Generalization Through the Data-Centric Lens
by: Shin, Changho, et al.
Published: (2024)

RubiCap: Rubric-Guided Reinforcement Learning for Dense Image Captioning
by: Huang, Tzu-Heng, et al.
Published: (2026)

Test-Time Scaling Makes Overtraining Compute-Optimal
by: Roberts, Nicholas, et al.
Published: (2026)

Multimodal Data Curation via Object Detection and Filter Ensembles
by: Huang, Tzu-Heng, et al.
Published: (2024)

CARE: Confounder-Aware Aggregation for Reliable LLM Evaluation
by: Zhao, Jitian, et al.
Published: (2026)

WS-GRPO: Weakly-Supervised Group-Relative Policy Optimization for Rollout-Efficient Reasoning
by: Mundada, Gagan, et al.
Published: (2026)

Tabby: A Language Model Architecture for Tabular and Structured Data Synthesis
by: Cromp, Sonia, et al.
Published: (2025)

Prune 'n Predict: Optimizing LLM Decision-making with Conformal Prediction
by: Vishwakarma, Harit, et al.
Published: (2024)

R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training
by: Ge, Albert, et al.
Published: (2025)

Causal Spherical Hypergraph Networks for Modelling Social Uncertainty
by: Harit, Anoushka, et al.
Published: (2025)

RicciFlowRec: A Geometric Root Cause Recommender Using Ricci Curvature on Financial Graphs
by: Sun, Zhongtian, et al.
Published: (2025)

From News to Returns: A Granger-Causal Hypergraph Transformer on the Sphere
by: Harit, Anoushka, et al.
Published: (2025)

Actionable Interpretability via Causal Hypergraphs: Unravelling Batch Size Effects in Deep Learning
by: Sun, Zhongtian, et al.
Published: (2025)

Weakly Supervised Label Learning Flows
by: Lu, You, et al.
Published: (2023)

Pareto Optimal Code Generation
by: Orlanski, Gabriel, et al.
Published: (2025)

ManifoldMind: Dynamic Hyperbolic Reasoning for Trustworthy Recommendations
by: Harit, Anoushka, et al.
Published: (2025)

A General Framework for Learning from Weak Supervision
by: Chen, Hao, et al.
Published: (2024)

Scriptorium
Published: (2019)

COSMOS: Predictable and Cost-Effective Adaptation of LLMs
by: Wang, Jiayu, et al.
Published: (2025)

INDOTABVQA: A Benchmark for Cross-Lingual Table Understanding in Bahasa Indonesia Documents
by: Gautam, Somraj, et al.
Published: (2026)

Breaking Down Financial News Impact: A Novel AI Approach with Geometric Hypergraphs
by: Harit, Anoushka, et al.
Published: (2024)

Personalize Your LLM: Fake it then Align it
by: Zhang, Yijing, et al.
Published: (2025)

Expressivity-Efficiency Tradeoffs for Hybrid Sequence Models
by: Cooper, John, et al.
Published: (2026)

Quantifying Structure in CLIP Embeddings: A Statistical Framework for Concept Interpretation
by: Zhao, Jitian, et al.
Published: (2025)

A Generic Self-Supervised Framework of Learning Invariant Discriminative Features
by: Ntelemis, Foivos, et al.
Published: (2022)

Table Detection with Active Learning
by: Gautam, Somraj, et al.
Published: (2025)

Pretrained Hybrids with MAD Skills
by: Roberts, Nicholas, et al.
Published: (2024)

A Unified Empirical Risk Minimization Framework for Flexible N-Tuples Weak Supervision
by: Huang, Shuying, et al.
Published: (2025)