| Main Authors: | Mustafa, Akram; Naseem, Usman; Azghadi, Mostafa Rahimi |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.03001 |
Similar Items
Can Reasoning LLMs Enhance Clinical Document Classification?
by: Mustafa, Akram, et al.
Published: (2025)
Fairness Evaluation and Inference Level Mitigation in LLMs
by: Nadeem, Afrozah, et al.
Published: (2025)
Bias Beyond Borders: Political Ideology Evaluation and Steering in Multilingual LLMs
by: Nadeem, Afrozah, et al.
Published: (2026)
Steering Towards Fairness: Mitigating Political Bias in LLMs
by: Nadeem, Afrozah, et al.
Published: (2025)
Framing Political Bias in Multilingual LLMs Across Pakistani Languages
by: Nadeem, Afrozah, et al.
Published: (2025)
MRG-R1: Reinforcement Learning for Clinically Aligned Medical Report Generation
by: Wang, Pengyu, et al.
Published: (2025)
Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning
by: Yuan, Jiahao, et al.
Published: (2025)
Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up
by: Yuan, Jiahao, et al.
Published: (2024)
Flick: Few Labels Text Classification using K-Aware Intermediate Learning in Multi-Task Low-Resource Languages
by: Almutairi, Ali, et al.
Published: (2025)
Analyzing Political Bias in LLMs via Target-Oriented Sentiment Classification
by: Elbouanani, Akram, et al.
Published: (2025)
From Guidelines to Guarantees: A Graph-Based Evaluation Harness for Domain-Specific Evaluation of LLMs
by: Lundin, Jessica M., et al.
Published: (2025)
Connecting the Dots: Evaluating Abstract Reasoning Capabilities of LLMs Using the New York Times Connections Word Game
by: Samadarshi, Prisha, et al.
Published: (2024)
Analyzing mixed construction and demolition waste in material recovery facilities: evolution, challenges, and applications of computer vision and deep learning
by: Langley, Adrian, et al.
Published: (2024)
A Scalable Entity-Based Framework for Auditing Bias in LLMs
by: Elbouanani, Akram, et al.
Published: (2026)
VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare
by: Shetty, Anudeex, et al.
Published: (2025)
LLMs on a Budget? Say HOLA
by: Siddiqui, Zohaib Hasan, et al.
Published: (2025)
Evaluating Multimodal Large Language Models on Educational Textbook Question Answering
by: Alawwad, Hessa A., et al.
Published: (2025)
Do Personality Traits Interfere? Geometric Limitations of Steering in Large Language Models
by: Bhandari, Pranav, et al.
Published: (2026)
Enhancing textual textbook question answering with large language models and retrieval augmented generation
by: Alawwad, Hessa Abdulrahman, et al.
Published: (2024)
CEA-LIST at CheckThat! 2025: Evaluating LLMs as Detectors of Bias and Opinion in Text
by: Elbouanani, Akram, et al.
Published: (2025)
Evaluating Personality Traits in Large Language Models: Insights from Psychological Questionnaires
by: Bhandari, Pranav, et al.
Published: (2025)
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning
by: Wang, Haozhe, et al.
Published: (2025)
Pluralistic Alignment for Healthcare: A Role-Driven Framework
by: Zhong, Jiayou, et al.
Published: (2025)
ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework
by: Yuan, Jiahao, et al.
Published: (2024)
Medical Question Summarization with Entity-driven Contrastive Learning
by: Lu, Wenpeng, et al.
Published: (2023)
HiBench: Benchmarking LLMs Capability on Hierarchical Structure Reasoning
by: Jiang, Zhuohang, et al.
Published: (2025)
Modeling and Optimizing User Preferences in AI Copilots: A Comprehensive Survey and Taxonomy
by: Afzoon, Saleh, et al.
Published: (2025)
Fine-Tuning LLMs for Reliable Medical Question-Answering Services
by: Anaissi, Ali, et al.
Published: (2024)
Reasoning or Not? A Comprehensive Evaluation of Reasoning LLMs for Dialogue Summarization
by: Jin, Keyan, et al.
Published: (2025)
Jailbreak Detection in Clinical Training LLMs Using Feature-Based Predictive Models
by: Nguyen, Tri, et al.
Published: (2025)
MSynFD: Multi-hop Syntax aware Fake News Detection
by: Xiao, Liang, et al.
Published: (2024)
VISPA: Pluralistic Alignment via Automatic Value Selection and Activation
by: Zheng, Shenyan, et al.
Published: (2026)
Generating Hierarchical JSON Representations of Scientific Sentences Using LLMs
by: Nimmagadda, Satya Sri Rajiteswari, et al.
Published: (2026)
AFFormer: Adaptive Feature Fusion Transformer for V2X Cooperative Perception under Channel Impairments
by: Zhou, Xi, et al.
Published: (2026)
MM-Eval: A Hierarchical Benchmark for Modern Mongolian Evaluation in LLMs
by: Zhang, Mengyuan, et al.
Published: (2024)
Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
by: Karia, Rushang, et al.
Published: (2024)
Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents
by: Prasad, Nishchal, et al.
Published: (2024)
MEDEQUALQA: Evaluating Biases in LLMs with Counterfactual Reasoning
by: Ghosh, Rajarshi, et al.
Published: (2025)
Truth, Trust, and Trouble: Medical AI on the Edge
by: Azeez, Mohammad Anas, et al.
Published: (2025)
When Reasoning Hurts: Source-Aware Evaluation of Frontier LLMs for Clinical SOAP Note Generation
by: Faisal, Faizan
Published: (2026)