| Main Authors: | Gardner-Challis, Nelson, Bostock, Jonathan, Kozhevnikov, Georgiy, Sinclaire, Morgan, Velja, Joan, Abate, Alessandro, Griffin, Charlie |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.20628 |
Similar Items
Studying Cross-cluster Modularity in Neural Networks
by: Golechha, Satvik, et al.
Published: (2025)
Games for AI Control: Models of Safety Evaluations of AI Deployment Protocols
by: Griffin, Charlie, et al.
Published: (2024)
Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?
by: Mallen, Alex, et al.
Published: (2024)
Who can we trust? LLM-as-a-jury for Comparative Assessment
by: Qian, Mengjie, et al.
Published: (2026)
When AI reviews science: Can we trust the referee?
by: Wang, Jialiang, et al.
Published: (2026)
An alignment safety case sketch based on debate
by: Buhl, Marie Davidsen, et al.
Published: (2025)
A sketch of an AI control safety case
by: Korbak, Tomek, et al.
Published: (2025)
Entanglement witnessing with untrusted detectors
by: Viola, Giuseppe, et al.
Published: (2023)
How can we trust opaque systems? Criteria for robust explanations in XAI
by: Boge, Florian J., et al.
Published: (2025)
Peripheral central venous pressure in the external jugular vein: can we trust it?
by: Fulvio Nisi, et al.
Published: (2025)
PREHISTORIA DEL PERÍODO FORMATIVO EN LA CUENCA ALTA DEL RÍO SALADO (REGIÓN DEL LOA SUPERIOR)
by: Carole Sinclaire
Published: (2004)
'Explaining RL Decisions with Trajectories': A Reproducibility Study
by: Sadek, Karim Abdel, et al.
Published: (2024)
The protection of "state overseas interests" in China's foreign policy strategy: conceptual, legal and expert dimensions of the discussion
by: Sizov, Georgiy A.
Published: (2024)
Federated learning with differential privacy and an untrusted aggregator
by: Liu, Kunlong, et al.
Published: (2023)
Joint-measurability and quantum communication with untrusted devices
by: Masini, Michele, et al.
Published: (2024)
Self-testing with untrusted random number generators
by: Morán, Moisés Bermejo, et al.
Published: (2026)
Convergence to collusion in algorithmic pricing
by: Frick, Kevin Michael
Published: (2026)
Detecting collusion in procurement auctions
by: Efimov, Konstantin D.
Published: (2024)
Caracterización de lavas vítreas de fuentes y sitios arqueológicos del Formativo Temprano en la Subárea Circumpuneña: Resultados preliminares y proyecciones para la prehistoria atacameña
by: Carole Sinclaire A
Published: (2004)
SUBMICROSECOND ATMOSPHERIC ELECTRIC DISCHARGE FROM THE NON-UNIFORM ELECTRODE (TIP) TOWARDS THE PLANE ELECTRODE
by: Vasily Y. Kozhevnikov
Published: (2019)
Physical nature of 'anomalous' electrons in high-current vacuum diodes
by: Vasily Y. Kozhevnikov
Published: (2021)
Measuring the precise photometric period of the probable intermediate polar 1RXS J014549.6+514314 based on extensive photometry
by: Kozhevnikov, V. P.
Published: (2025)
Kinetic simulation of vacuum plasma expansion beyond the "plasma approximation"
by: Vasily Y. Kozhevnikov
Published: (2022)
Discovery of eclipses in the cataclysmic variable LAMOST J035913.61+405035.0
by: Kozhevnikov, V. P.
Published: (2024)
Detection of Eclipses in the Cataclysmic Variable LAMOST J035913.61 + 405035.0
by: V. P. Kozhevnikov
Published: (2025)
Measuring the Precise Photometric Period of the Probable Intermediate Polar 1RXS J014549.6+514314 Based on Extensive Photometry
by: V. P. Kozhevnikov
Published: (2025)
Neural Proofs for Sound Verification and Control of Complex Systems
by: Abate, Alessandro
Published: (2025)
Dynamic Vocabulary Pruning in Early-Exit LLMs
by: Vincenti, Jort, et al.
Published: (2024)
Algorithmic collusion under competitive design
by: Conjeaud, Ivan
Published: (2023)
A variable dimension sketching strategy for nonlinear least-squares
by: Bellavia, Stefania, et al.
Published: (2025)
Can we trust the evaluation on ChatGPT?
by: Aiyappa, Rachith, et al.
Published: (2023)
When should we trust the annotation? Selective prediction for molecular structure retrieval from mass spectra
by: Jürgens, Mira, et al.
Published: (2026)
Fully passive quantum random number generation with untrusted light
by: Qiu, KaiWei, et al.
Published: (2025)
On endogenous cartel size under tacit collusion
by: Marc Escrihuela-Villar
Published: (2008)
Vertical tacit collusion in AI-mediated markets
by: Affonso, Felipe M.
Published: (2026)
Electronic properties of MoSe$_2$ nanowrinkles
by: Velja, Stefan, et al.
Published: (2024)
Practical challenges of control monitoring in frontier AI deployments
by: Lindner, David, et al.
Published: (2025)
Aligned, Orthogonal or In-conflict: When can we safely optimize Chain-of-Thought?
by: Kaufmann, Max, et al.
Published: (2026)
Should we trust cross‐sectional multiplier estimates?
by: Fabio Canova
Published: (2024)
Certifying bipartite pure quantum states efficiently using untrusted devices
by: Lin, Lijinzhi, et al.
Published: (2023)