| Main Authors: | Demir, M. Mikail, Canbaz, M. Abdullah |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.17691 |
Similar Items
LegalGuardian: A Privacy-Preserving Framework for Secure Integration of Large Language Models in Legal Practice
by: Demir, M. Mikail, et al.
Published: (2025)
Heuristics and Biases in AI Decision-Making: Implications for Responsible AGI
by: Saeedi, Payam, et al.
Published: (2024)
LLM-Assisted Crisis Management: Building Advanced LLM Platforms for Effective Emergency Response and Public Collaboration
by: Otal, Hakan T., et al.
Published: (2024)
LLM Honeypot: Leveraging Large Language Models as Advanced Interactive Honeypot Systems
by: Otal, Hakan T., et al.
Published: (2024)
Precedent-Informed Reasoning: Mitigating Overthinking in Large Reasoning Models via Test-Time Precedent Learning
by: Wang, Qianyue, et al.
Published: (2026)
Flick: Few Labels Text Classification using K-Aware Intermediate Learning in Multi-Task Low-Resource Languages
by: Almutairi, Ali, et al.
Published: (2025)
In-Context Learning for Extreme Multi-Label Classification
by: D'Oosterlinck, Karel, et al.
Published: (2024)
Combining Supervised Learning and Reinforcement Learning for Multi-Label Classification Tasks with Partial Labels
by: Jia, Zixia, et al.
Published: (2024)
Is Your LLM Really Mastering the Concept? A Multi-Agent Benchmark
by: Xu, Shuhang, et al.
Published: (2025)
Multi-Label Clinical Text Eligibility Classification and Summarization System
by: Yerramsetty, Surya Tejaswi, et al.
Published: (2025)
The Right Model for the Job: An Evaluation of Legal Multi-Label Classification Baselines
by: Forster, Martina, et al.
Published: (2024)
Are Your LLMs Capable of Stable Reasoning?
by: Liu, Junnan, et al.
Published: (2024)
LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation
by: Zhang, Ming, et al.
Published: (2025)
Modeling Bias Evolution in Fashion Recommender Systems: A System Dynamics Approach
by: Goodarzi, Mahsa, et al.
Published: (2025)
Belief in Authority: Impact of Authority in Multi-Agent Evaluation Framework
by: Choi, Junhyuk, et al.
Published: (2026)
Can AI Validate Science? Benchmarking LLMs for Accurate Scientific Claim $\rightarrow$ Evidence Reasoning
by: Javaji, Shashidhar Reddy, et al.
Published: (2025)
Hierarchical Multi-Label Classification of Online Vaccine Concerns
by: Zhu, Chloe Qinyu, et al.
Published: (2024)
Pastiche Novel Generation Creating: Fan Fiction You Love in Your Favorite Author's Style
by: Han, Xueran, et al.
Published: (2025)
PsychiatryBench: A Multi-Task Benchmark for LLMs in Psychiatry
by: Fouda, Aya E., et al.
Published: (2025)
Assessing the Performance of Human-Capable LLMs -- Are LLMs Coming for Your Job?
by: Mavi, John, et al.
Published: (2024)
Your AI, Not Your View: The Bias of LLMs in Investment Analysis
by: Lee, Hoyoung, et al.
Published: (2025)
Do LLMs Truly Understand When a Precedent Is Overruled?
by: Zhang, Li, et al.
Published: (2025)
Benchmarking LLMs for Pairwise Causal Discovery in Biomedical and Multi-Domain Contexts
by: Anuyah, Sydney, et al.
Published: (2026)
Instances and Labels: Hierarchy-aware Joint Supervised Contrastive Learning for Hierarchical Multi-Label Text Classification
by: Yu, Simon, et al.
Published: (2023)
Do LLMs Recognize Your Latent Preferences? A Benchmark for Latent Information Discovery in Personalized Interaction
by: Tsaknakis, Ioannis, et al.
Published: (2025)
MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs
by: Sirdeshmukh, Ved, et al.
Published: (2025)
DKEC: Domain Knowledge Enhanced Multi-Label Classification for Diagnosis Prediction
by: Ge, Xueren, et al.
Published: (2023)
Protecting Your LLMs with Information Bottleneck
by: Liu, Zichuan, et al.
Published: (2024)
MultiNRC: A Challenging and Native Multilingual Reasoning Evaluation Benchmark for LLMs
by: Fabbri, Alexander R., et al.
Published: (2025)
Syntriever: How to Train Your Retriever with Synthetic Data from LLMs
by: Kim, Minsang, et al.
Published: (2025)
This Is Your Doge, If It Please You: Exploring Deception and Robustness in Mixture of LLMs
by: Wolf, Lorenz, et al.
Published: (2025)
One Size Does Not Fit All: Exploring Variable Thresholds for Distance-Based Multi-Label Text Classification
by: Van Nooten, Jens, et al.
Published: (2025)
Label Distribution Learning-Enhanced Dual-KNN for Text Classification
by: Yuan, Bo, et al.
Published: (2025)
Shattering the Shortcut: A Topology-Regularized Benchmark for Multi-hop Medical Reasoning in LLMs
by: Zi, Xing, et al.
Published: (2026)
GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving
by: Zhang, Jiaxin, et al.
Published: (2024)
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
by: Wang, Minzheng, et al.
Published: (2024)
TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios
by: Wei, Shaohang, et al.
Published: (2025)
Improving Task Diversity in Label Efficient Supervised Finetuning of LLMs
by: Arabelly, Abhinav, et al.
Published: (2025)
Do LLMs Agree on the Creativity Evaluation of Alternative Uses?
by: Rabeyah, Abdullah Al, et al.
Published: (2024)
XCR-Bench: A Multi-Task Benchmark for Evaluating Cultural Reasoning in LLMs
by: Kabir, Mohsinul, et al.
Published: (2026)