Similar Items
Tamper-Resistant Safeguards for Open-Weight LLMs
by: Tamirisa, Rishub, et al.
Published: (2024)
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
by: Mazeika, Mantas, et al.
Published: (2025)
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
by: Ren, Richard, et al.
Published: (2024)
Foundation models may exhibit staged progression in novel CBRN threat disclosure
by: Esvelt, Kevin M
Published: (2025)
The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems
by: Ren, Richard, et al.
Published: (2025)
FedSelect: Customized Selection of Parameters for Fine-Tuning during Personalized Federated Learning
by: Tamirisa, Rishub, et al.
Published: (2023)
Reducing Political Manipulation with Consistency Training
by: Phan, Long, et al.
Published: (2026)
TextQuests: How Good are LLMs at Text-Based Video Games?
by: Phan, Long, et al.
Published: (2025)
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
by: Mazeika, Mantas, et al.
Published: (2024)
FedSelect: Personalized Federated Learning with Customized Selection of Parameters for Fine-Tuning
by: Tamirisa, Rishub, et al.
Published: (2024)
Aggressive Compression Enables LLM Weight Theft
by: Brown, Davis, et al.
Published: (2026)
Thin Bridges for Drug Text Alignment: Lightweight Contrastive Learning for Target Specific Drug Retrieval
by: Tupakula, Mallikarjuna
Published: (2025)
COVID-19 Vaccines in the Pediatric Population: A Focus on Cardiac Patients
by: Ghena Lababidi, et al.
Published: (2024)
Politikos mokslo romantizmas
by: Alvydas Jokubaitis
Published: (2019)
Mokslinės politinės filosofijos virtimas spekuliatyvia istorijos filosofija
by: Linas Jokubaitis
Published: (2020)
Immanuelis Kantas ir politikos tikrumo klausimas
by: Alvydas Jokubaitis
Published: (2021)
Politinė Stasio Šalkauskio kultūros filosofijos prasmė
by: Alvydas Jokubaitis
Published: (2020)
Moralumo iššūkis Carlo Schmitto politiškumo sampratai
by: Alvydas Jokubaitis
Published: (2020)
Immanuelio Kanto iššūkis politikos mokslui
by: Alvydas Jokubaitis
Published: (2021)
Kas yra politika?
by: Alvydas Jokubaitis
Published: (2022)
EnigmaEval: A Benchmark of Long Multimodal Reasoning Challenges
by: Wang, Clinton J., et al.
Published: (2025)
Reasons to Doubt the Impact of AI Risk Evaluations
by: Mukobi, Gabriel
Published: (2024)
Perception-Driven Bias Detection in Machine Learning via Crowdsourced Visual Judgment
by: Tupakula, Chirudeep, et al.
Published: (2025)
Bare Feet in the Ballroom: The First Demonstration in Australia of Dalcroze Eurhythmics, 1919
by: Oam, Joan Pope
Published: (2020)
Jet-Density of Finite-Gap Solutions for Classes of BKM Systems
by: Quaschner, Manuel, et al.
Published: (2026)
Conocimiento y método en Descartes, Pascal y Leibniz
by: Josep M. Basart Muñoz
Published: (2004)
Superintelligence Strategy: Expert Version
by: Hendrycks, Dan, et al.
Published: (2025)
The effect of confession evidence on conviction, and considering alternative scenarios as remedy in a sample of police officers
by: Neville Niccolson, et al.
Published: (2024)
Complexity Classification of Product State Problems for Local Hamiltonians
by: Kallaugher, John, et al.
Published: (2024)
Private equity acquisitions and product market decisions: Evidence from trademarks
by: Moazzam Khoja
Published: (2025)
Variationality of conformal geodesics in dimension 3
by: Kruglikov, Boris, et al.
Published: (2024)
Leveraging Quantum Computing for Accelerated Classical Algorithms in Power Systems Optimization
by: Barrass, Rosemary, et al.
Published: (2025)
On globally invariant Euler--Lagrange equations for curves
by: Kruglikov, Boris, et al.
Published: (2026)
Cursos de Administração: a dimensão pública como sujeito excluído
by: Agatha Justen
Published: (2015)
A construção social do caricaturista na Primeira República: a imprensa ilustrada nos domínios da arte, da política e da intelectualidade
by: Janine Justen
Published: (2020)
LLMs Outperform Experts on Challenging Biology Benchmarks
by: Justen, Lennart
Published: (2025)
Decisão diante de conflitos bioéticos e formação em odontologia
by: Michelli Justen
Published: (2021)
Reprimarização, política pública do trabalho e superexploração no Brasil: revisitando Ruy Mauro Marini
by: Agatha Justen
Published: (2023)
Introduction to AI Safety, Ethics, and Society
by: Hendrycks, Dan
Published: (2024)