| Main Authors: | He, Chaoqun, Luo, Renjie, Hu, Shengding, Zhao, Yuanqian, Zhou, Jie, Wu, Hanghao, Zhang, Jiajie, Han, Xu, Liu, Zhiyuan, Sun, Maosong |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.07584 |
Similar Items
UltraEval-Audio: A Unified Framework for Comprehensive Evaluation of Audio Foundation Models
by: Shi, Qundong, et al.
Published: (2026)
States Hidden in Hidden States: LLMs Emerge Discrete State Representations Implicitly
by: Chen, Junhao, et al.
Published: (2024)
Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition
by: Huang, Yufei, et al.
Published: (2024)
EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents
by: Cheng, Zhili, et al.
Published: (2025)
Predicting Emergent Abilities with Infinite Resolution Evaluation
by: Hu, Shengding, et al.
Published: (2023)
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
by: He, Chaoqun, et al.
Published: (2024)
Stuffed Mamba: Oversized States Lead to the Inability to Forget
by: Chen, Yingfa, et al.
Published: (2024)
LEGENT: Open Platform for Embodied Agents
by: Cheng, Zhili, et al.
Published: (2024)
DecorateLM: Data Engineering through Corpus Rating, Tagging, and Editing with Language Models
by: Zhao, Ranchi, et al.
Published: (2024)
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
by: Luo, Kairong, et al.
Published: (2025)
Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models
by: Zhang, Xinrong, et al.
Published: (2024)
ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer
by: Hu, Jinyi, et al.
Published: (2024)
Densing Law of LLMs
by: Xiao, Chaojun, et al.
Published: (2024)
$\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens
by: Zhang, Xinrong, et al.
Published: (2024)
Omni-DuplexEval: Evaluating Real-time Duplex Omni-modal Interaction
by: He, Chaoqun, et al.
Published: (2026)
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
by: Song, Chenyang, et al.
Published: (2024)
FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation
by: He, Zheqi, et al.
Published: (2025)
MMCircuitEval: A Comprehensive Multimodal Circuit-Focused Benchmark for Evaluating LLMs
by: Zhao, Chenchen, et al.
Published: (2025)
Matrix Fejér-Riesz type theorem for a union of an interval and a point
by: Sun, Shengding, et al.
Published: (2025)
Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages
by: Hu, Jinyi, et al.
Published: (2023)
STExplore: An Integrated Online Platform for Comprehensive Analysis and Visualization of Spatial Transcriptomics Data
by: Yongtian Wang, et al.
Published: (2025)
On the strength of Burer's lifted convex relaxation to quadratic programming with ball constraints
by: Kılınç-Karzan, Fatma, et al.
Published: (2024)
LiCoEval: Evaluating LLMs on License Compliance in Code Generation
by: Xu, Weiwei, et al.
Published: (2024)
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos
by: Song, Tingyu, et al.
Published: (2025)
Representation Learning for Natural Language Processing
by: Liu, Zhiyuan, et al.
Published: (2020)
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
by: Hu, Shengding, et al.
Published: (2024)
Fusion-Eval: Integrating Assistant Evaluators with LLMs
by: Shu, Lei, et al.
Published: (2023)
AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs
by: Li, Shangzhan, et al.
Published: (2025)
GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation
by: Feng, Tao, et al.
Published: (2025)
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
by: Xiong, Miao, et al.
Published: (2023)
MM-Eval: A Hierarchical Benchmark for Modern Mongolian Evaluation in LLMs
by: Zhang, Mengyuan, et al.
Published: (2024)
Value Compass Benchmarks: A Platform for Fundamental and Validated Evaluation of LLMs Values
by: Yao, Jing, et al.
Published: (2025)
APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs
by: Huang, Yuxiang, et al.
Published: (2025)
H-Neurons: On the Existence, Impact, and Origin of Hallucination-Associated Neurons in LLMs
by: Gao, Cheng, et al.
Published: (2025)
SAGE-Eval: Evaluating LLMs for Systematic Generalizations of Safety Facts
by: Yueh-Han, Chen, et al.
Published: (2025)
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
by: Xiao, Chaojun, et al.
Published: (2024)
A Microgravity Simulation Experimental Platform For Small Space Robots In Orbit
by: Luo, Hang, et al.
Published: (2025)
CA-LoRA: Adapting Existing LoRA for Compressed LLMs to Enable Efficient Multi-Tasking on Personal Devices
by: Zhao, Weilin, et al.
Published: (2023)
HKCanto-Eval: A Benchmark for Evaluating Cantonese Language Understanding and Cultural Comprehension in LLMs
by: Cheng, Tsz Chung, et al.
Published: (2025)
MiniCPM4: Ultra-Efficient LLMs on End Devices
by: MiniCPM Team, et al.
Published: (2025)