:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Peiqi; Lam, Barbara D.; Liu, Yingcheng; Asgari-Targhi, Ameneh; Panda, Rameswar; Wells, William M.; Kapur, Tina; Golland, Polina
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2410.04315

Similar Items

Diversity Measurement and Subset Selection for Instruction Tuning Datasets
by: Wang, Peiqi, et al.
Published: (2024)

Connecting Jensen-Shannon and Kullback-Leibler Divergences: A New Bound for Representation Learning
by: Dorent, Reuben, et al.
Published: (2025)

Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
by: Brandon, William, et al.
Published: (2024)

Gated Linear Attention Transformers with Hardware-Efficient Training
by: Yang, Songlin, et al.
Published: (2023)

Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization
by: Nrusimha, Aniruddha, et al.
Published: (2024)

FlashFormer: Whole-Model Kernels for Efficient Low-Batch Inference
by: Nrusimha, Aniruddha, et al.
Published: (2025)

Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
by: Tan, Shawn, et al.
Published: (2024)

API Pack: A Massive Multi-Programming Language Dataset for API Call Generation
by: Guo, Zhen, et al.
Published: (2024)

PaTH Attention: Position Encoding via Accumulating Householder Transformations
by: Yang, Songlin, et al.
Published: (2025)

Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models
by: Pan, Bowen, et al.
Published: (2024)

Causality and Scientific Inquiry: Lessons from Space Physics and Medical Sciences
by: Asgari-Targhi, Marzieh, et al.
Published: (2026)

Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
by: Kang, Junmo, et al.
Published: (2024)

The Confidence Trap: Gender Bias and Predictive Certainty in LLMs
by: Sabir, Ahmed, et al.
Published: (2026)

TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
by: Xu, Zhangchen, et al.
Published: (2025)

The Illusion of Certainty: Uncertainty Quantification for LLMs Fails under Ambiguity
by: Tomov, Tim, et al.
Published: (2025)

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler
by: Shen, Yikang, et al.
Published: (2024)

Unified Cross-Modal Medical Image Synthesis with Hierarchical Mixture of Product-of-Experts
by: Dorent, Reuben, et al.
Published: (2024)

Explain-Query-Test: Self-Evaluating LLMs Via Explanation and Comprehension Discrepancy
by: Taghanaki, Saeid Asgari, et al.
Published: (2025)

Less is More: Rethinking Few-Shot Learning and Recurrent Neural Nets
by: Pereg, Deborah, et al.
Published: (2022)

Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models
by: Jain, Neel, et al.
Published: (2024)

PRISM: Demystifying Retention and Interaction in Mid-Training
by: Runwal, Bharat, et al.
Published: (2026)

Fetuses Made Simple: Modeling and Tracking of Fetal Shape and Pose
by: Liu, Yingcheng, et al.
Published: (2025)

Process Supervision of Confidence Margin for Calibrated LLM Reasoning
by: Wang, Liaoyaqi, et al.
Published: (2026)

Efficient Post-Training Pruning of Large Language Models with Statistical Correction
by: Yu, Peiqi, et al.
Published: (2026)

MMLU-Pro+: Evaluating Higher-Order Reasoning and Shortcut Learning in LLMs
by: Taghanaki, Saeid Asgari, et al.
Published: (2024)

Scalable Best-of-N Selection for Large Language Models via Self-Certainty
by: Kang, Zhewei, et al.
Published: (2025)

Beyond Simple Averaging: Improving NLP Ensemble Performance with Topological-Data-Analysis-Based Weighting
by: Proskura, Polina, et al.
Published: (2024)

Efficient Reasoning for Large Reasoning Language Models via Certainty-Guided Reflection Suppression
by: Huang, Jiameng, et al.
Published: (2025)

The Illusion of Certainty: Decoupling Capability and Calibration in On-Policy Distillation
by: Zhang, Jiaxin, et al.
Published: (2026)

Scattered Mixture-of-Experts Implementation
by: Tan, Shawn, et al.
Published: (2024)

Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors
by: Daheim, Nico, et al.
Published: (2024)

Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings
by: Hallee, Logan, et al.
Published: (2024)

MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors
by: Macina, Jakub, et al.
Published: (2025)

Trained on Tokens, Calibrated on Concepts: The Emergence of Semantic Calibration in LLMs
by: Nakkiran, Preetum, et al.
Published: (2025)

Aligning Fetal Anatomy with Kinematic Tree Log-Euclidean PolyRigid Transforms
by: Liu, Yingcheng, et al.
Published: (2026)

Calibrated Speculative Decoding: Frequency-Guided Candidate Selection for Efficient Inference
by: Zhou, Xuwen, et al.
Published: (2026)

Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
by: Jiang, Zhengping, et al.
Published: (2025)

On Calibration of LLM-based Guard Models for Reliable Content Moderation
by: Liu, Hongfu, et al.
Published: (2024)

Facts in Stats: Impacts of Pretraining Diversity on Language Model Generalization
by: Behnia, Tina, et al.
Published: (2025)

LLM Knowledge is Brittle: Truthfulness Representations Rely on Superficial Resemblance
by: Haller, Patrick, et al.
Published: (2025)