:: Library Catalog

Bibliographic Details
Main Authors:	Charnock, Jacob; Tlaie, Alejandro; O'Brien, Kyle; Casper, Stephen; Homewood, Aidan
Format:	Preprint
Published:	2026
Subjects:	Computers and Society
Online Access:	https://arxiv.org/abs/2601.11916

Similar Items

Securing External Deeper-than-black-box GPAI Evaluations
by: Tlaie, Alejandro, et al.
Published: (2025)

Using AI Alignment Theory to understand the potential pitfalls of regulatory frameworks
by: Tlaie, Alejandro
Published: (2024)

What Should Frontier AI Developers Disclose About Internal Deployments?
by: Charnock, Jacob, et al.
Published: (2026)

Assessing confidence in frontier AI safety cases
by: Barrett, Stephen, et al.
Published: (2025)

A Blueprint for an EU Ecosystem of Secure, Deep and External AI Audits
by: Tlaie, Alejandro
Published: (2025)

Audit Cards: Contextualizing AI Evaluations
by: Staufer, Leon, et al.
Published: (2025)

A Methodology for Quantitative AI Risk Modeling
by: Murray, Malcolm, et al.
Published: (2025)

The AI Risk Spectrum: From Dangerous Capabilities to Existential Threats
by: Grey, Markov, et al.
Published: (2025)

Coordinated Disclosure of Dual-Use Capabilities: An Early Warning System for Advanced AI
by: O'Brien, Joe, et al.
Published: (2024)

The Role of Risk Modeling in Advanced AI Risk Management
by: Touzet, Chloé, et al.
Published: (2025)

Asymmetry by Design: Boosting Cyber Defenders with Differential Access to AI
by: Ee, Shaun, et al.
Published: (2025)

Frontier Lag: A Bibliometric Audit of Capability Misrepresentation in Academic AI Evaluation
by: Gringras, David, et al.
Published: (2026)

Exploring and steering the moral compass of Large Language Models
by: Tlaie, Alejandro
Published: (2024)

Pitfalls of Evidence-Based AI Policy
by: Casper, Stephen, et al.
Published: (2025)

Frontier AI Auditing: Toward Rigorous Third-Party Assessment of Safety and Security Practices at Leading AI Companies
by: Brundage, Miles, et al.
Published: (2026)

Risk Reporting for Developers' Internal AI Model Use
by: Delaney, Oscar, et al.
Published: (2026)

Practical Principles for AI Cost and Compute Accounting
by: Casper, Stephen, et al.
Published: (2025)

Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs
by: Khan, Ariba, et al.
Published: (2025)

Third-party compliance reviews for frontier AI safety frameworks
by: Homewood, Aidan, et al.
Published: (2025)

Evaluating AI Providers' Frontier Safety Frameworks
by: Stelling, Lily, et al.
Published: (2025)

Evaluating Frontier Models for Dangerous Capabilities
by: Phuong, Mary, et al.
Published: (2024)

Technical Requirements for Halting Dangerous AI Activities
by: Barnett, Peter, et al.
Published: (2025)

Manipulation and the AI Act: Large Language Model Chatbots and the Danger of Mirrors
by: Krook, Joshua
Published: (2025)

Limits of Safe AI Deployment: Differentiating Oversight and Control
by: Manheim, David, et al.
Published: (2025)

Catastrophic Liability: Managing Systemic Risks in Frontier AI Development
by: Kierans, Aidan, et al.
Published: (2025)

Frontier AI Ethics: Anticipating and Evaluating the Societal Impacts of Language Model Agents
by: Lazar, Seth
Published: (2024)

'Teens Need to Be Educated on the Danger': Digital Access, Online Risks, and Safety Practices Among Nigerian Adolescents
by: Oguine, Munachimso B., et al.
Published: (2025)

The Safety Gap Toolkit: Evaluating Hidden Dangers of Open-Source Models
by: Dombrowski, Ann-Kathrin, et al.
Published: (2025)

Toward Quantitative Modeling of Cybersecurity Risks Due to AI Misuse
by: Barrett, Steve, et al.
Published: (2025)

The Evaluation Differential: When Frontier AI Models Recognise They Are Being Tested
by: Vishwarupe, Varad, et al.
Published: (2026)

Sabotage Evaluations for Frontier Models
by: Benton, Joe, et al.
Published: (2024)

Expert Survey: AI Reliability & Security Research Priorities
by: O'Brien, Joe, et al.
Published: (2025)

The California Report on Frontier AI Policy
by: Bommasani, Rishi, et al.
Published: (2025)

Open Problems in Frontier AI Risk Management
by: Ziosi, Marta, et al.
Published: (2026)

Assurance of Frontier AI Built for National Security
by: Pistillo, Matteo, et al.
Published: (2025)

Black-Box Access is Insufficient for Rigorous AI Audits
by: Casper, Stephen, et al.
Published: (2024)

Adapting cybersecurity frameworks to manage frontier AI risks: A defense-in-depth approach
by: Ee, Shaun, et al.
Published: (2024)

Trends in Frontier AI Model Count: A Forecast to 2028
by: Kumar, Iyngkarran, et al.
Published: (2025)

Towards Safe Multilingual Frontier AI
by: Kanepajs, Artūrs, et al.
Published: (2024)

Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models I: The Task-Query Architecture
by: Ackerman, Gary, et al.
Published: (2025)