| Main Author: | Zeng, Hui |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2304.12986 |
Similar Items
RomanLens: The Role Of Latent Romanization In Multilinguality In LLMs
by: Saji, Alan, et al.
Published: (2025)
Text-Based Approaches to Item Difficulty Modeling in Large-Scale Assessments: A Systematic Review
by: Peters, Sydney, et al.
Published: (2025)
MemeLens: Multilingual Multitask VLMs for Memes
by: Shahroor, Ali Ezzat, et al.
Published: (2026)
Entropy-Based Measurement of Value Drift and Alignment Work in Large Language Models
by: Fadli, Samih
Published: (2025)
Measuring Reasoning Utility in LLMs via Conditional Entropy Reduction
by: Guo, Xu
Published: (2025)
Evaluating the Efficacy of Hybrid Deep Learning Models in Distinguishing AI-Generated Text
by: Oketunji, Abiodun Finbarrs
Published: (2023)
UniHetero: Could Generation Enhance Understanding for Vision-Language-Model at Large Data Scale?
by: Chen, Fengjiao, et al.
Published: (2025)
Cetvel: A Unified Benchmark for Evaluating Language Understanding, Generation and Cultural Capacity of LLMs for Turkish
by: Er, Yakup Abrek, et al.
Published: (2025)
A Use-Case Specific Dataset for Measuring Dimensions of Responsible Performance in LLM-generated Text
by: Sagae, Alicia, et al.
Published: (2025)
Large Language Models Can Better Understand Knowledge Graphs Than We Thought
by: Dai, Xinbang, et al.
Published: (2024)
Low-Resource Court Judgment Summarization for Common Law Systems
by: Liu, Shuaiqi, et al.
Published: (2024)
Large Language Model (LLM) Bias Index -- LLMBI
by: Oketunji, Abiodun Finbarrs, et al.
Published: (2023)
Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory
by: Cacioli, Jon-Paul
Published: (2026)
Robustness of Large Language Models to Perturbations in Text
by: Singh, Ayush, et al.
Published: (2024)
Neural Machine Translation for Malayalam Paraphrase Generation
by: Varghese, Christeena, et al.
Published: (2024)
Is Our Chatbot Telling Lies? Assessing Correctness of an LLM-based Dutch Support Chatbot
by: Lassche, Herman, et al.
Published: (2024)
Predictive Simultaneous Interpretation: Harnessing Large Language Models for Democratizing Real-Time Multilingual Communication
by: Iida, Kurando, et al.
Published: (2024)
Exploring RWKV for Sentence Embeddings: Layer-wise Analysis and Baseline Comparison for Semantic Similarity
by: Pan, Xinghan
Published: (2025)
UA-Legal-Bench: A Benchmark for Evaluating Large Language Models on Ukrainian Legal Reasoning
by: Ovcharov, Volodymyr
Published: (2026)
From Guessing to Asking: An Approach to Resolving the Persona Knowledge Gap in LLMs during Multi-Turn Conversations
by: Baskar, Sarvesh, et al.
Published: (2025)
Pun Unintended: LLMs and the Illusion of Humor Understanding
by: Zangari, Alessandro, et al.
Published: (2025)
SomaliBench Eval: Measuring English-to-Somali Refusal Gaps in Open-Weight Language Models
by: Dahir, Khalid Yusuf
Published: (2026)
ACCORD: Closing the Commonsense Measurability Gap
by: Roewer-Després, François, et al.
Published: (2024)
IntentGrasp: A Comprehensive Benchmark for Intent Understanding
by: Yin, Yuwei, et al.
Published: (2026)
Mechanistic Understanding of Language Models in Syntactic Code Completion
by: Miller, Samuel, et al.
Published: (2025)
Dynamic Demonstration Retrieval and Cognitive Understanding for Emotional Support Conversation
by: Xu, Zhe, et al.
Published: (2024)
Understanding the Effects of RLHF on the Quality and Detectability of LLM-Generated Texts
by: Xu, Beining, et al.
Published: (2025)
Do LLMs Truly Understand When a Precedent Is Overruled?
by: Zhang, Li, et al.
Published: (2025)
MISR: Measuring Instrumental Self-Reasoning in Frontier Models
by: Fronsdal, Kai, et al.
Published: (2024)
Action-Item-Driven Summarization of Long Meeting Transcripts
by: Golia, Logan, et al.
Published: (2023)
Domain-specific ChatBots for Science using Embeddings
by: Yager, Kevin G.
Published: (2023)
$\rm SP^3$: Enhancing Structured Pruning via PCA Projection
by: Hu, Yuxuan, et al.
Published: (2023)
PSST: A Benchmark for Evaluation-driven Text Public-Speaking Style Transfer
by: Sun, Huashan, et al.
Published: (2023)
On Preserving the Knowledge of Long Clinical Texts
by: Hasan, Mohammad Junayed, et al.
Published: (2023)
Controllable Text Summarization: Unraveling Challenges, Approaches, and Prospects -- A Survey
by: Urlana, Ashok, et al.
Published: (2023)
TWIZ-v2: The Wizard of Multimodal Conversational-Stimulus
by: Ferreira, Rafael, et al.
Published: (2023)
Temporal Knowledge Question Answering via Abstract Reasoning Induction
by: Chen, Ziyang, et al.
Published: (2023)
Teaching Probabilistic Logical Reasoning to Transformers
by: Nafar, Aliakbar, et al.
Published: (2023)
A Multi-Task, Multi-Modal Approach for Predicting Categorical and Dimensional Emotions
by: Ispas, Alex-Răzvan, et al.
Published: (2023)
Unleashing the potential of prompt engineering for large language models
by: Chen, Banghao, et al.
Published: (2023)