| Main Authors: | Wang, Xiting, Jiang, Liming, Hernandez-Orallo, Jose, Stillwell, David, Sun, Luning, Luo, Fang, Xie, Xing |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2310.16379 |
Similar Items
Multi-agent AI systems outperform human teams in creativity
by: Hu, Tiancheng, et al.
Published: (2026)
Augmenting Rating-Scale Measures with Text-Derived Items Using the Information-Determined Scoring (IDS) Framework
by: Watson, Joe, et al.
Published: (2025)
General Scales Unlock AI Evaluation with Explanatory and Predictive Power
by: Zhou, Lexin, et al.
Published: (2025)
Large Language Models show both individual and collective creativity comparable to humans
by: Sun, Luning, et al.
Published: (2024)
Measuring What AI Systems Might Do: Towards A Measurement Science in AI
by: Voudouris, Konstantinos, et al.
Published: (2026)
Distinguishing Task-Specific and General-Purpose AI in Regulation
by: Wang, Jennifer, et al.
Published: (2025)
The Dilemma of Uncertainty Estimation for General Purpose AI in the EU AI Act
by: Valdenegro-Toro, Matias, et al.
Published: (2024)
Effective Mitigations for Systemic Risks from General-Purpose AI
by: Uuk, Risto, et al.
Published: (2024)
An AI System Evaluation Framework for Advancing AI Safety: Terminology, Taxonomy, Lifecycle Mapping
by: Xia, Boming, et al.
Published: (2024)
Exploring the Psychometric Validity of AI-Generated Student Responses: A Study on Virtual Personas' Learning Motivation
by: Wang, Huanxiao
Published: (2025)
AI Slop or AI-enhancement? Student perceptions of AI-generated media for an English for Academic Purposes course
by: Woo, David James, et al.
Published: (2026)
Understanding Student Acceptance, Trust, and Attitudes Toward AI-Generated Images for Educational Purposes
by: Pyae, Aung
Published: (2024)
Designing AI-Agents with Personalities: A Psychometric Approach
by: Huang, Muhua, et al.
Published: (2024)
Measuring Data Science Automation: A Survey of Evaluation Tools for AI Assistants and Agents
by: Testini, Irene, et al.
Published: (2025)
Unlocking Learning Potentials: The Transformative Effect of Generative AI in Education Across Grade Levels
by: Xie, Meijuan, et al.
Published: (2025)
Psychometric Personality Shaping Modulates Capabilities and Safety in Language Models
by: Fitz, Stephen, et al.
Published: (2025)
The Incomplete Bridge: How AI Research (Mis)Engages with Psychology
by: Jiang, Han, et al.
Published: (2025)
AI Risk-Management Standards Profile for General-Purpose AI (GPAI) and Foundation Models
by: Barrett, Anthony M., et al.
Published: (2025)
Generative AI Purpose-built for Social and Mental Health: A Real-World Pilot
by: Hull, Thomas D., et al.
Published: (2025)
The EAP-AIAS: Adapting the AI Assessment Scale for English for Academic Purposes
by: Roe, Jasper, et al.
Published: (2024)
AI Evaluation Should Require Standardized Item-Level Data Releases
by: Jiang, Han, et al.
Published: (2026)
Measuring Human Contribution in AI-Assisted Content Generation
by: Xie, Yueqi, et al.
Published: (2024)
Leveraging LLM-Respondents for Item Evaluation: a Psychometric Analysis
by: Liu, Yunting, et al.
Published: (2024)
Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems
by: Gipiškis, Rokas, et al.
Published: (2024)
When AI Takes the Couch: Psychometric Jailbreaks Reveal Internal Conflict in Frontier Models
by: Khadangi, Afshin, et al.
Published: (2025)
The Case for ESM3 as a General-Purpose AI Model with Systemic Risk Under the EU AI Act
by: Qureshi, Taro, et al.
Published: (2026)
Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing
by: Jiang, Han, et al.
Published: (2024)
The Status Quo and Future of AI-TPACK for Mathematics Teacher Education Students: A Case Study in Chinese Universities
by: Xie, Meijuan, et al.
Published: (2025)
Open Problems in Machine Unlearning for AI Safety
by: Barez, Fazl, et al.
Published: (2025)
Privacy and Copyright Protection in Generative AI: A Lifecycle Perspective
by: Zhang, Dawen, et al.
Published: (2023)
AI in Work-Based Learning: Understanding the Purposes and Effects of Intelligent Tools Among Student Interns
by: Miranda, John Paul P., et al.
Published: (2026)
Responsible AI Question Bank: A Comprehensive Tool for AI Risk Assessment
by: Lee, Sung Une, et al.
Published: (2024)
Evaluating the Social Impact of Generative AI Systems in Systems and Society
by: Solaiman, Irene, et al.
Published: (2023)
A Study on the Framework for Evaluating the Ethics and Trustworthiness of Generative AI
by: Jeong, Cheonsu, et al.
Published: (2025)
A Qualitative Study of User Perception of M365 AI Copilot
by: Bano, Muneera, et al.
Published: (2025)
Elementary School Students' and Teachers' Perceptions Towards Creative Mathematical Writing with Generative AI
by: Song, Yukyeong, et al.
Published: (2024)
Evaluating a Methodology for Increasing AI Transparency: A Case Study
by: Piorkowski, David, et al.
Published: (2022)
Evaluating AI Evaluation: Perils and Prospects
by: Burden, John
Published: (2024)
Human Experts' Evaluation of Generative AI for Contextualizing STEAM Education in the Global South
by: Nyaaba, Matthew, et al.
Published: (2025)
Achieving Responsible AI through ESG: Insights and Recommendations from Industry Engagement
by: Perera, Harsha, et al.
Published: (2024)