:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Zirong, Sun, Xutong, Li, Yuanhe, Ma, Meiyi
Format:	Preprint
Published:	2023
Subjects:	Computation and Language; Artificial Intelligence; Computers and Society; Machine Learning
Online Access:	https://arxiv.org/abs/2312.14185

Similar Items

Evaluating Prompt Engineering Techniques for Accuracy and Confidence Elicitation in Medical LLMs
by: Naderi, Nariman, et al.
Published: (2025)

CycleResearcher: Improving Automated Research via Automated Review
by: Weng, Yixuan, et al.
Published: (2024)

Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas: A Survey
by: Deng, Chengyuan, et al.
Published: (2024)

Auto-scaling Continuous Memory for GUI Agent
by: Wu, Wenyi, et al.
Published: (2025)

Integrating LSTM and BERT for Long-Sequence Data Analysis in Intelligent Tutoring Systems
by: Li, Zhaoxing, et al.
Published: (2024)

From Perceptions to Decisions: Wildfire Evacuation Decision Prediction with Behavioral Theory-informed LLMs
by: Chen, Ruxiao, et al.
Published: (2025)

Prompt-Counterfactual Explanations for Generative AI System Behavior
by: Goethals, Sofie, et al.
Published: (2026)

Breaking Down Bias: On The Limits of Generalizable Pruning Strategies
by: Ma, Sibo, et al.
Published: (2025)

The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems
by: Ren, Richard, et al.
Published: (2025)

Improving Consistency in Retrieval-Augmented Systems with Group Similarity Rewards
by: Hamman, Faisal, et al.
Published: (2025)

LLMCO2: Advancing Accurate Carbon Footprint Prediction for LLM Inferences
by: Fu, Zhenxiao, et al.
Published: (2024)

The Compliance Gap: Why AI Systems Promise to Follow Process Instructions but Don't
by: Shin, Kwan Soo
Published: (2026)

LLM-Assisted Content Conditional Debiasing for Fair Text Embedding
by: Deng, Wenlong, et al.
Published: (2024)

The Story is Not the Science: Execution-Grounded Evaluation of Mechanistic Interpretability Research
by: Bai, Xiaoyan, et al.
Published: (2026)

E2Vec: Feature Embedding with Temporal Information for Analyzing Student Actions in E-Book Systems
by: Miyazaki, Yuma, et al.
Published: (2024)

Learning with Preserving for Continual Multitask Learning
by: Wang, Hanchen David, et al.
Published: (2025)

Leveraging Social Determinants of Health in Alzheimer's Research Using LLM-Augmented Literature Mining and Knowledge Graphs
by: Shang, Tianqi, et al.
Published: (2024)

DNAZEN: Enhanced Gene Sequence Representations via Mixed Granularities of Coding Units
by: Mao, Lei, et al.
Published: (2025)

Automated Knowledge Component Generation for Interpretable Knowledge Tracing in Coding Problems
by: Duan, Zhangqi, et al.
Published: (2025)

ConCISE: Confidence-guided Compression in Step-by-step Efficient Reasoning
by: Qiao, Ziqing, et al.
Published: (2025)

RedTopic: Toward Topic-Diverse Red Teaming of Large Language Models
by: Ding, Jiale, et al.
Published: (2025)

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
by: Zhang, Yue, et al.
Published: (2023)

Mitigating Bias for Question Answering Models by Tracking Bias Influence
by: Ma, Mingyu Derek, et al.
Published: (2023)

Fairness of ChatGPT
by: Li, Yunqi, et al.
Published: (2023)

Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce
by: Shao, Yijia, et al.
Published: (2025)

DualAlign: Generating Clinically Grounded Synthetic Data
by: Li, Rumeng, et al.
Published: (2025)

Wikipedia in the Era of LLMs: Evolution and Risks
by: Huang, Siming, et al.
Published: (2025)

AXOLOTL: Fairness through Assisted Self-Debiasing of Large Language Model Outputs
by: Ebrahimi, Sana, et al.
Published: (2024)

BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses
by: Xu, Xin, et al.
Published: (2025)

LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models
by: Faiz, Ahmad, et al.
Published: (2023)

Literature Meets Data: A Synergistic Approach to Hypothesis Generation
by: Liu, Haokun, et al.
Published: (2024)

EigenBench: A Comparative Behavioral Measure of Value Alignment
by: Chang, Jonathn, et al.
Published: (2025)

Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models
by: Feng, Duanyu, et al.
Published: (2023)

A Behavioural and Representational Evaluation of Goal-Directedness in Language Model Agents
by: Arghal, Raghu, et al.
Published: (2026)

No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages
by: Mohamed, Youssef, et al.
Published: (2024)

A Multi-LLM Debiasing Framework
by: Owens, Deonna M., et al.
Published: (2024)

AutoFlow: Automated Workflow Generation for Large Language Model Agents
by: Li, Zelong, et al.
Published: (2024)

Assessing Large Language Models on Climate Information
by: Bulian, Jannis, et al.
Published: (2023)

Surveying Attitudinal Alignment Between Large Language Models Vs. Humans Towards 17 Sustainable Development Goals
by: Wu, Qingyang, et al.
Published: (2024)

Reducing Large Language Model Safety Risks in Women's Health using Semantic Entropy
by: Penny-Dimri, Jahan C., et al.
Published: (2025)