| Main Author: | Zhang, Damin |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.10899 |
Similar Items
SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists
by: Zhao, Raoyuan, et al.
Published: (2024)
Structured yet Bounded Temporal Understanding in Large Language Models
by: Zhang, Damin, et al.
Published: (2025)
Hire Me or Not? Examining Language Model's Behavior with Occupation Attributes
by: Zhang, Damin, et al.
Published: (2024)
Evaluating Large Language Model Capability in Vietnamese Fact-Checking Data Generation
by: To, Long Truong, et al.
Published: (2024)
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
by: Zhang, Kaichen, et al.
Published: (2024)
From Performance to Purpose: A Sociotechnical Taxonomy for Evaluating Large Language Model Utility
by: Levinson, Gavin, et al.
Published: (2026)
FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models
by: Lin, Hongzhan, et al.
Published: (2025)
A Taxonomy for Data Contamination in Large Language Models
by: Palavalli, Medha, et al.
Published: (2024)
Evaluating Large Language Models on Time Series Feature Understanding: A Comprehensive Taxonomy and Benchmark
by: Fons, Elizabeth, et al.
Published: (2024)
LEAF: Learning and Evaluation Augmented by Fact-Checking to Improve Factualness in Large Language Models
by: Tran, Hieu, et al.
Published: (2024)
Evaluating List Construction and Temporal Understanding capabilities of Large Language Models
by: Dumitru, Alexandru, et al.
Published: (2025)
Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models
by: Li, Miaoran, et al.
Published: (2023)
Enriching Taxonomies Using Large Language Models
by: Ghamlouch, Zeinab, et al.
Published: (2025)
Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models
by: Zhang, Che, et al.
Published: (2024)
Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers
by: Qin, Libo, et al.
Published: (2024)
Generative Large Language Models in Automated Fact-Checking: A Survey
by: Vykopal, Ivan, et al.
Published: (2024)
Automated Fact-Checking of Climate Change Claims with Large Language Models
by: Leippold, Markus, et al.
Published: (2024)
Large Language Models for Multilingual Previously Fact-Checked Claim Detection
by: Vykopal, Ivan, et al.
Published: (2025)
Are Large Language Models a Good Replacement of Taxonomies?
by: Sun, Yushi, et al.
Published: (2024)
A Taxonomy of Stereotype Content in Large Language Models
by: Nicolas, Gandalf, et al.
Published: (2024)
Glitch Tokens in Large Language Models: Categorization Taxonomy and Effective Detection
by: Li, Yuxi, et al.
Published: (2024)
Hallucination to Truth: A Review of Fact-Checking and Factuality Evaluation in Large Language Models
by: Rahman, Subhey Sadi, et al.
Published: (2025)
FoodTaxo: Generating Food Taxonomies with Large Language Models
by: Wullschleger, Pascal, et al.
Published: (2025)
Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models
by: Wehner, Jan, et al.
Published: (2025)
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
by: Cui, Tianyu, et al.
Published: (2024)
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking
by: Dong, Ming, et al.
Published: (2024)
RealFactBench: A Benchmark for Evaluating Large Language Models in Real-World Fact-Checking
by: Yang, Shuo, et al.
Published: (2025)
Refining Wikidata Taxonomy using Large Language Models
by: Peng, Yiwen, et al.
Published: (2024)
Evaluation Revisited: A Taxonomy of Evaluation Concerns in Natural Language Processing
by: Dhar, Ruchira, et al.
Published: (2026)
IPL: Leveraging Multimodal Large Language Models for Intelligent Product Listing
by: Chen, Kang, et al.
Published: (2024)
Cross-modality Information Check for Detecting Jailbreaking in Multimodal Large Language Models
by: Xu, Yue, et al.
Published: (2024)
Towards Comprehensive Stage-wise Benchmarking of Large Language Models in Fact-Checking
by: Lin, Hongzhan, et al.
Published: (2026)
Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples
by: Zeng, Qingkai, et al.
Published: (2024)
Audio Jailbreaks in Large Audio-Language Models: Taxonomy, Attack-Defense Analysis, and Cost-Aware Evaluation
by: Feng, Bo-Han, et al.
Published: (2026)
VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models
by: Dunlap, Lisa, et al.
Published: (2024)
Multimodal Large Language Models to Support Real-World Fact-Checking
by: Geng, Jiahui, et al.
Published: (2024)
Fact-Checking with Large Language Models via Probabilistic Certainty and Consistency
by: Wang, Haoran, et al.
Published: (2026)
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
by: Huang, Lei, et al.
Published: (2023)
Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models
by: Zarharan, Majid, et al.
Published: (2024)
ClaimCheck: Real-Time Fact-Checking with Small Language Models
by: Putta, Akshith Reddy, et al.
Published: (2025)