| Main Authors: | Tan, Zhiquan, Wei, Lai, Wang, Jindong, Xie, Xing, Huang, Weiran |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.06140 |
Similar Items
Diff-eRank: A Novel Rank-Based Metric for Evaluating Large Language Models
by: Wei, Lai, et al.
Published: (2024)
The Information of Large Language Model Geometry
by: Tan, Zhiquan, et al.
Published: (2024)
KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
by: Yu, Zhuohao, et al.
Published: (2024)
Dynamic Evaluation of Large Language Models by Meta Probing Agents
by: Zhu, Kaijie, et al.
Published: (2024)
PromptBench: A Unified Library for Evaluation of Large Language Models
by: Zhu, Kaijie, et al.
Published: (2023)
DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks
by: Zhu, Kaijie, et al.
Published: (2023)
Inference-Cost-Aware Dynamic Tree Construction for Efficient Inference in Large Language Models
by: Hong, Yinrong, et al.
Published: (2025)
CultureLLM: Incorporating Cultural Differences into Large Language Models
by: Li, Cheng, et al.
Published: (2024)
PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts
by: Zhu, Kaijie, et al.
Published: (2023)
SparseEval: Efficient Evaluation of Large Language Models by Sparse Optimization
by: Zhang, Taolin, et al.
Published: (2026)
Knowledge Editing on Black-box Large Language Models
by: Song, Xiaoshuai, et al.
Published: (2024)
ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models
by: Oh, Jio, et al.
Published: (2024)
Information-Theoretic Perspectives on Optimizers
by: Tan, Zhiquan, et al.
Published: (2025)
Understanding Grokking Through A Robustness Viewpoint
by: Tan, Zhiquan, et al.
Published: (2023)
Reinforcement Learning-based Knowledge Distillation with LLM-as-a-Judge
by: Shen, Yiyang, et al.
Published: (2026)
Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models
by: Zhao, Siyan, et al.
Published: (2026)
Self-Supervised Learning for Neural Topic Models with Variance-Invariance-Covariance Regularization
by: Xu, Weiran, et al.
Published: (2025)
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models
by: Chua, Lynn, et al.
Published: (2024)
AECBench: A Hierarchical Benchmark for Knowledge Evaluation of Large Language Models in the AEC Field
by: Liang, Chen, et al.
Published: (2025)
Evaluation and Improvement of Fault Detection for Large Language Models
by: Hu, Qiang, et al.
Published: (2024)
Delta Knowledge Distillation for Large Language Models
by: Cao, Yihan, et al.
Published: (2025)
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
by: Wang, Song, et al.
Published: (2024)
Fact or Guesswork? Evaluating Large Language Models' Medical Knowledge with Structured One-Hop Judgments
by: Li, Jiaxi, et al.
Published: (2025)
Matrix Information Theory for Self-Supervised Learning
by: Zhang, Yifan, et al.
Published: (2023)
UniSD: Towards a Unified Self-Distillation Framework for Large Language Models
by: Jin, Yiqiao, et al.
Published: (2026)
DrugAssist: A Large Language Model for Molecule Optimization
by: Ye, Geyan, et al.
Published: (2023)
Detoxifying Large Language Models via Knowledge Editing
by: Wang, Mengru, et al.
Published: (2024)
TraceDet: Hallucination Detection from the Decoding Trace of Diffusion Large Language Models
by: Chang, Shenxu, et al.
Published: (2025)
Don't Just Say "I don't know"! Self-aligning Large Language Models for Responding to Unknown Questions with Explanations
by: Deng, Yang, et al.
Published: (2024)
Concept Unlearning in Large Language Models via Self-Constructed Knowledge Triplets
by: Yamashita, Tomoya, et al.
Published: (2025)
Can Prompts Rewind Time for LLMs? Evaluating the Effectiveness of Prompted Knowledge Cutoffs
by: Gao, Xin, et al.
Published: (2025)
Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs
by: Liu, Xiaoze, et al.
Published: (2024)
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
by: Jia, Xiaojun, et al.
Published: (2024)
Large Language Models Can Self-Improve At Web Agent Tasks
by: Patel, Ajay, et al.
Published: (2024)
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
by: Ferrando, Javier, et al.
Published: (2024)
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information
by: Shen, Guobin, et al.
Published: (2026)
MM-LIMA: Less Is More for Alignment in Multi-Modal Datasets
by: Wei, Lai, et al.
Published: (2023)
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
by: Hu, Xing, et al.
Published: (2024)
RePST: Language Model Empowered Spatio-Temporal Forecasting via Semantic-Oriented Reprogramming
by: Wang, Hao, et al.
Published: (2024)
Provable Contrastive Continual Learning
by: Wen, Yichen, et al.
Published: (2024)