
Bibliographic Details
Main Authors:	Longpre, Shayne; Kapoor, Sayash; Klyman, Kevin; Ramaswami, Ashwin; Bommasani, Rishi; Blili-Hamelin, Borhane; Huang, Yangsibo; Skowron, Aviya; Yong, Zheng-Xin; Kotha, Suhas; Zeng, Yi; Shi, Weiyan; Yang, Xianjun; Southen, Reid; Robey, Alexander; Chao, Patrick; Yang, Diyi; Jia, Ruoxi; Kang, Daniel; Pentland, Sandy; Narayanan, Arvind; Liang, Percy; Henderson, Peter
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2403.04893

Similar Items

The 2024 Foundation Model Transparency Index
by: Bommasani, Rishi, et al.
Published: (2024)

Foundation Model Transparency Reports
by: Bommasani, Rishi, et al.
Published: (2024)

The 2025 Foundation Model Transparency Index
by: Wan, Alexander, et al.
Published: (2025)

Is ETHICS about ethics? Evaluating the ETHICS benchmark
by: Hancox-Li, Leif, et al.
Published: (2024)

Unsocial Intelligence: an Investigation of the Assumptions of AGI Discourse
by: Blili-Hamelin, Borhane, et al.
Published: (2024)

On the Societal Impact of Open Foundation Models
by: Kapoor, Sayash, et al.
Published: (2024)

Replaying pre-training data improves fine-tuning
by: Kotha, Suhas, et al.
Published: (2026)

Evolving AI Risk Management: A Maturity Model based on the NIST AI Risk Management Framework
by: Dotan, Ravit, et al.
Published: (2024)

Consistency in Language Models: Current Landscape, Challenges, and Future Directions
by: Novikova, Jekaterina, et al.
Published: (2025)

The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources
by: Longpre, Shayne, et al.
Published: (2024)

A Framework for Assurance Audits of Algorithmic Systems
by: Lam, Khoa, et al.
Published: (2024)

Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?
by: Longpre, Shayne, et al.
Published: (2024)

Pre-training under infinite compute
by: Kim, Konwoo, et al.
Published: (2025)

Future and AI-Ready Data Strategies: Response to DOC RFI on AI and Open Government Data Assets
by: Oderinwale, Hamidah, et al.
Published: (2024)

In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI
by: Longpre, Shayne, et al.
Published: (2025)

Beyond Release: Access Considerations for Generative AI Systems
by: Solaiman, Irene, et al.
Published: (2025)

Trustworthy Social Bias Measurement
by: Bommasani, Rishi, et al.
Published: (2022)

Do AI Companies Make Good on Voluntary Commitments to the White House?
by: Wang, Jennifer, et al.
Published: (2025)

Language model developers should report train-test overlap
by: Zhang, Andy K, et al.
Published: (2024)

Data-efficient pre-training by scaling synthetic megadocs
by: Kim, Konwoo, et al.
Published: (2026)

The Limits of Inference Scaling Through Resampling
by: Stroebl, Benedikt, et al.
Published: (2024)

Build Agent Advocates, Not Platform Agents
by: Kapoor, Sayash, et al.
Published: (2025)

Promises and pitfalls of artificial intelligence for legal applications
by: Kapoor, Sayash, et al.
Published: (2024)

Testing the Limits of Jailbreaking Defenses with the Purple Problem
by: Kim, Taeyoun, et al.
Published: (2024)

A Systematic Review of NeurIPS Dataset Management Practices
by: Wu, Yiwei, et al.
Published: (2024)

A Different Approach to AI Safety: Proceedings from the Columbia Convening on Openness in Artificial Intelligence and AI Safety
by: François, Camille, et al.
Published: (2025)

How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
by: Zeng, Yi, et al.
Published: (2024)

The Leaderboard Illusion
by: Singh, Shivalika, et al.
Published: (2025)

Understanding Catastrophic Forgetting in Language Models via Implicit Inference
by: Kotha, Suhas, et al.
Published: (2023)

AI-Powered Autonomous Weapons Risk Geopolitical Instability and Threaten AI Research
by: Simmons-Edler, Riley, et al.
Published: (2024)

ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality
by: Longpre, Shayne, et al.
Published: (2025)

Temporal fingerprints: Identity matching across fully encrypted domain
by: Somin, Shahar, et al.
Published: (2024)

AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies
by: Zeng, Yi, et al.
Published: (2024)

Measuring risks inherent to our digital economies using Amazon purchase histories from US consumers
by: Berke, Alex, et al.
Published: (2025)

AI Agents That Matter
by: Kapoor, Sayash, et al.
Published: (2024)

CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
by: Siegel, Zachary S., et al.
Published: (2024)

Provably Bounding Neural Network Preimages
by: Kotha, Suhas, et al.
Published: (2023)

SpecEval: Evaluating Model Adherence to Behavior Specifications
by: Ahmed, Ahmed, et al.
Published: (2025)

Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future
by: Li, Minzhi, et al.
Published: (2024)

Libraries Serving the CSIR Complex.
by: Rajagopalan, T. S., et al.
Published: (1970)