| Main Author: | Parupudi, V. S. Raghu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.08596 |
Similar Items
Magnitude Matters: a Superior Class of Similarity Metrics for Holistic Semantic Understanding
by: Parupudi, V. S. Raghu
Published: (2025)
Systematic Diagnosis of Brittle Reasoning in Large Language Models
by: Parupudi, V. S. Raghu
Published: (2025)
Before and After Temperature: A Distributional View of Creative LLM Generation
by: Parupudi, V. S. Raghu, et al.
Published: (2026)
Rethinking Perplexity: Revealing the Impact of Input Length on Perplexity Evaluation in LLMs
by: Cheng, Letian, et al.
Published: (2026)
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
by: Ankner, Zachary, et al.
Published: (2024)
Slaves to the Law of Large Numbers: An Asymptotic Equipartition Property for Perplexity in Generative Language Models
by: Bell, Tyler, et al.
Published: (2024)
Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization
by: Mavromatis, Costas, et al.
Published: (2024)
The Perplexity Paradox: Why Code Compresses Better Than Math in LLM Prompts
by: Johnson, Warren
Published: (2026)
Reasoning Models Better Express Their Confidence
by: Yoon, Dongkeun, et al.
Published: (2025)
Lowest Span Confidence: A Zero-Shot Metric for Efficient and Black-Box Hallucination Detection in LLMs
by: Qiao, Yitong, et al.
Published: (2026)
Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents
by: Wang, Haoyu, et al.
Published: (2025)
Is my model perplexed for the right reason? Contrasting LLMs' Benchmark Behavior with Token-Level Perplexity
by: Prins, Zoë, et al.
Published: (2026)
Perplexity-Aware Data Scaling Law: Perplexity Landscapes Predict Performance for Continual Pre-training
by: Liu, Lei, et al.
Published: (2025)
Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs?
by: Zhang, Junyan, et al.
Published: (2025)
Rectify Evaluation Preference: Improving LLMs' Critique on Math Reasoning via Perplexity-aware Reinforcement Learning
by: Tian, Changyuan, et al.
Published: (2025)
Do LLMs Find Human Answers To Fact-Driven Questions Perplexing? A Case Study on Reddit
by: Seegmiller, Parker, et al.
Published: (2024)
On Verbalized Confidence Scores for LLMs
by: Yang, Daniel, et al.
Published: (2024)
Writing in Symbiosis: Mapping Human Creative Agency in the AI Era
by: Doshi, Vivan, et al.
Published: (2025)
Demystifying Prompts in Language Models via Perplexity Estimation
by: Gonen, Hila, et al.
Published: (2022)
CREATE: Testing LLMs for Associative Creativity
by: Wadhwa, Manya, et al.
Published: (2026)
Mapping Overlaps in Benchmarks through Perplexity in the Wild
by: Wu, Siyang, et al.
Published: (2025)
Improving Pretraining Data Using Perplexity Correlations
by: Thrush, Tristan, et al.
Published: (2024)
Rethinking GSPO: The Perplexity-Entropy Equivalence
by: Liu, Chi
Published: (2025)
Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression
by: Xu, Zhichao, et al.
Published: (2024)
Multicalibration for Confidence Scoring in LLMs
by: Detommaso, Gianluca, et al.
Published: (2024)
MetricalARGS: A Taxonomy for Studying Metrical Poetry with LLMs
by: Kranti, Chalamalasetti, et al.
Published: (2025)
Moral Mazes in the Era of LLMs
by: Nguyen, Dang, et al.
Published: (2026)
On the Robustness of Verbal Confidence of LLMs in Adversarial Attacks
by: Obadinma, Stephen, et al.
Published: (2025)
Confidence Estimation for LLMs in Multi-turn Interactions
by: Zhang, Caiqi, et al.
Published: (2026)
What is Wrong with Perplexity for Long-context Language Modeling?
by: Fang, Lizhe, et al.
Published: (2024)
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
by: Xiong, Miao, et al.
Published: (2023)
destroR: Attacking Transfer Models with Obfuscous Examples to Discard Perplexity
by: Ahmed, Saadat Rafid, et al.
Published: (2025)
How Well Can Knowledge Edit Methods Edit Perplexing Knowledge?
by: Ge, Huaizhi, et al.
Published: (2024)
A Thorough Examination of Decoding Methods in the Era of LLMs
by: Shi, Chufan, et al.
Published: (2024)
Confidence Improves Self-Consistency in LLMs
by: Taubenfeld, Amir, et al.
Published: (2025)
A Perplexity and Menger Curvature-Based Approach for Similarity Evaluation of Large Language Models
by: Zhang, Yuantao, et al.
Published: (2025)
Evaluating the Creativity of LLMs in Persian Literary Text Generation
by: Tourajmehr, Armin, et al.
Published: (2025)
IDEAFix: Evaluation Framework for Creative Defixation Prompting in LLMs
by: Carichon, F., et al.
Published: (2026)
Supervised Optimism Correction: Be Confident When LLMs Are Sure
by: Zhang, Junjie, et al.
Published: (2025)
Confidence Estimation in Automatic Short Answer Grading with LLMs
by: Cong, Longwei, et al.
Published: (2026)