| Main Authors: | Longpre, Shayne, Kapoor, Sayash, Klyman, Kevin, Ramaswami, Ashwin, Bommasani, Rishi, Blili-Hamelin, Borhane, Huang, Yangsibo, Skowron, Aviya, Yong, Zheng-Xin, Kotha, Suhas, Zeng, Yi, Shi, Weiyan, Yang, Xianjun, Southen, Reid, Robey, Alexander, Chao, Patrick, Yang, Diyi, Jia, Ruoxi, Kang, Daniel, Pentland, Sandy, Narayanan, Arvind, Liang, Percy, Henderson, Peter |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.04893 |
Similar Items
The 2024 Foundation Model Transparency Index
by: Bommasani, Rishi, et al.
Published: (2024)
Foundation Model Transparency Reports
by: Bommasani, Rishi, et al.
Published: (2024)
The 2025 Foundation Model Transparency Index
by: Wan, Alexander, et al.
Published: (2025)
Is ETHICS about ethics? Evaluating the ETHICS benchmark
by: Hancox-Li, Leif, et al.
Published: (2024)
Unsocial Intelligence: an Investigation of the Assumptions of AGI Discourse
by: Blili-Hamelin, Borhane, et al.
Published: (2024)
On the Societal Impact of Open Foundation Models
by: Kapoor, Sayash, et al.
Published: (2024)
Replaying pre-training data improves fine-tuning
by: Kotha, Suhas, et al.
Published: (2026)
Evolving AI Risk Management: A Maturity Model based on the NIST AI Risk Management Framework
by: Dotan, Ravit, et al.
Published: (2024)
Consistency in Language Models: Current Landscape, Challenges, and Future Directions
by: Novikova, Jekaterina, et al.
Published: (2025)
The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources
by: Longpre, Shayne, et al.
Published: (2024)
A Framework for Assurance Audits of Algorithmic Systems
by: Lam, Khoa, et al.
Published: (2024)
Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?
by: Longpre, Shayne, et al.
Published: (2024)
Pre-training under infinite compute
by: Kim, Konwoo, et al.
Published: (2025)
Future and AI-Ready Data Strategies: Response to DOC RFI on AI and Open Government Data Assets
by: Oderinwale, Hamidah, et al.
Published: (2024)
In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI
by: Longpre, Shayne, et al.
Published: (2025)
Beyond Release: Access Considerations for Generative AI Systems
by: Solaiman, Irene, et al.
Published: (2025)
Trustworthy Social Bias Measurement
by: Bommasani, Rishi, et al.
Published: (2022)
Do AI Companies Make Good on Voluntary Commitments to the White House?
by: Wang, Jennifer, et al.
Published: (2025)
Language model developers should report train-test overlap
by: Zhang, Andy K, et al.
Published: (2024)
Data-efficient pre-training by scaling synthetic megadocs
by: Kim, Konwoo, et al.
Published: (2026)
The Limits of Inference Scaling Through Resampling
by: Stroebl, Benedikt, et al.
Published: (2024)
Build Agent Advocates, Not Platform Agents
by: Kapoor, Sayash, et al.
Published: (2025)
Promises and pitfalls of artificial intelligence for legal applications
by: Kapoor, Sayash, et al.
Published: (2024)
Testing the Limits of Jailbreaking Defenses with the Purple Problem
by: Kim, Taeyoun, et al.
Published: (2024)
A Systematic Review of NeurIPS Dataset Management Practices
by: Wu, Yiwei, et al.
Published: (2024)
A Different Approach to AI Safety: Proceedings from the Columbia Convening on Openness in Artificial Intelligence and AI Safety
by: François, Camille, et al.
Published: (2025)
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
by: Zeng, Yi, et al.
Published: (2024)
The Leaderboard Illusion
by: Singh, Shivalika, et al.
Published: (2025)
Understanding Catastrophic Forgetting in Language Models via Implicit Inference
by: Kotha, Suhas, et al.
Published: (2023)
AI-Powered Autonomous Weapons Risk Geopolitical Instability and Threaten AI Research
by: Simmons-Edler, Riley, et al.
Published: (2024)
ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality
by: Longpre, Shayne, et al.
Published: (2025)
Temporal fingerprints: Identity matching across fully encrypted domain
by: Somin, Shahar, et al.
Published: (2024)
AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies
by: Zeng, Yi, et al.
Published: (2024)
Measuring risks inherent to our digital economies using Amazon purchase histories from US consumers
by: Berke, Alex, et al.
Published: (2025)
AI Agents That Matter
by: Kapoor, Sayash, et al.
Published: (2024)
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
by: Siegel, Zachary S., et al.
Published: (2024)
Provably Bounding Neural Network Preimages
by: Kotha, Suhas, et al.
Published: (2023)
SpecEval: Evaluating Model Adherence to Behavior Specifications
by: Ahmed, Ahmed, et al.
Published: (2025)
Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future
by: Li, Minzhi, et al.
Published: (2024)
Libraries Serving the CSIR Complex.
by: Rajagopalan, T. S., et al.
Published: (1970)