:: Library Catalog

Bibliographic Details
Main Authors:	Bartoszcze, Lukasz; Munshi, Sarthak; Sukidi, Bryan; Yen, Jennifer; Yang, Zejia; Williams-King, David; Le, Linh; Asuzu, Kosi; Maple, Carsten
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2502.17601

Similar Items

Representation Noising: A Defence Mechanism Against Harmful Finetuning
by: Rosati, Domenic, et al.
Published: (2024)

Immunization against harmful fine-tuning attacks
by: Rosati, Domenic, et al.
Published: (2024)

Can Safety Fine-Tuning Be More Principled? Lessons Learned from Cybersecurity
by: Williams-King, David, et al.
Published: (2025)

Individualised Counterfactual Examples Using Conformal Prediction Intervals
by: Adams, James M., et al.
Published: (2025)

Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models
by: Wehner, Jan, et al.
Published: (2025)

Single-Configuration Attack Success Rate Is Not Enough: Jailbreak Evaluations Should Report Distributional Attack Success
by: Maple, Carsten, et al.
Published: (2026)

Faithful or Fabricated? A Causal Framework for Rationalization Bias in LLM Judges
by: Tapwal, Riya, et al.
Published: (2026)

PRISM: Generation-Time Detection and Mitigation of Secret Leakage in Multi-Agent LLM Pipelines
by: Tapwal, Riya, et al.
Published: (2026)

Justified Evidence Collection for Argument-based AI Fairness Assurance
by: Sabuncuoglu, Alpay, et al.
Published: (2025)

Towards Robust Federated Analytics via Differentially Private Measurements of Statistical Heterogeneity
by: Scott, Mary, et al.
Published: (2024)

Private Federated Multiclass Post-hoc Calibration
by: Maddock, Samuel, et al.
Published: (2025)

FLAIM: AIM-based Synthetic Data Generation in the Federated Setting
by: Maddock, Samuel, et al.
Published: (2023)

DriveSafe: A Hierarchical Risk Taxonomy for Safety-Critical LLM-Based Driving Assistants
by: Kumar, Abhishek, et al.
Published: (2026)

Latent Personality Alignment: Improving Harmlessness Without Mentioning Harms
by: Le, Linh, et al.
Published: (2026)

Audio Computer-Assisted Self Interview Compared to Traditional Interview in an HIV-Related Behavioral Survey in Vietnam
by: Linh Cu Le
Published: (2012)

Threat, Risk and Mitigation Taxonomy for Digital Identity Systems
by: Sheik, Al Tariq, et al.
Published: (2024)

Towards Smart Healthcare: Challenges and Opportunities in IoT and ML
by: Saifuzzaman, Munshi, et al.
Published: (2023)

LearnedCache: An eBPF-Integrated Perceptron-Based Eviction Policy for the Linux Page Cache
by: Qi, Zejia
Published: (2026)

Differentially Private Health Tokens for Estimating COVID-19 Risk
by: Butler, David, et al.
Published: (2020)

Data-Agnostic Face Image Synthesis Detection Using Bayesian CNNs
by: Leyva, Roberto, et al.
Published: (2024)

Operationalising Artificial Intelligence Bills of Materials (AIBOMs) for Verifiable AI Provenance and Lifecycle Assurance
by: Radanliev, Petar, et al.
Published: (2026)

Distributed, communication-efficient, and differentially private estimation of KL divergence
by: Scott, Mary, et al.
Published: (2024)

SBOMs into Agentic AIBOMs: Schema Extensions, Agentic Orchestration, and Reproducibility Evaluation
by: Radanliev, Petar, et al.
Published: (2026)

Field-Localized Forgery Detection for Digital Identity Documents
by: Kumar, Abhishek, et al.
Published: (2026)

Detecting Face Synthesis Using a Concealed Fusion Model
by: Leyva, Roberto, et al.
Published: (2024)

Manifold of Failure: Behavioral Attraction Basins in Language Models
by: Munshi, Sarthak, et al.
Published: (2026)

ACSE-Eval: Can LLMs threat model real-world cloud infrastructure?
by: Munshi, Sarthak, et al.
Published: (2025)

acad_recuperation_joueurs_exclus_fr-ca
by: Maple, Kevon
Published: (2026)

acad_self_excluded_player_recovery_en-ca
by: Maple, Kevon
Published: (2026)

acad_dispute_resolution_handbook_bilingual_fr-ca
by: Maple, Kevon
Published: (2026)

acad_rg_resource_compendium_bilingual_fr-ca
by: Maple, Kevon
Published: (2026)

acad_withdrawal_caps_high_rollers_en-ca
by: Maple, Kevon
Published: (2026)

acad_trustpilot_casinos_quebecois_fr-ca
by: Maple, Kevon
Published: (2026)

Refugee Reception in Southern Africa
by: Maple, Nicholas
Published: (2024)

SRA: Span Representation Alignment for Large Language Model Distillation
by: Dao, Quoc Phong, et al.
Published: (2026)

A Game-Theoretic Approach for PMU Deployment Against False Data Injection Attacks
by: Maleki, Sajjad, et al.
Published: (2024)

A privacy preserving querying mechanism with high utility for electric vehicles
by: Atmaca, Ugur Ilker, et al.
Published: (2022)

Large Language Models and the Rationalist Empiricist Debate
by: King, David
Published: (2024)

Spherical Steering: Geometry-Aware Activation Rotation for Language Models
by: You, Zejia, et al.
Published: (2026)

Securing Cryptographic Software via Typed Assembly Language (Extended Version)
by: Song, Shixin, et al.
Published: (2025)