| Main Authors: | Nguyen, Tuan; Tran-Thanh, Long |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2510.09330 |
Similar Items
Black-Box Behavioral Distillation Breaks Safety Alignment in Medical LLMs
by: Jahan, Sohely, et al.
Published: (2025)
Curvature-Aware Safety Restoration In LLMs Fine-Tuning
by: Bach, Thong, et al.
Published: (2025)
CASUAL: Conditional Support Alignment for Domain Adaptation with Label Shift
by: Nguyen, Anh T, et al.
Published: (2023)
Kernel Learning for Sample Constrained Black-Box Optimization
by: Rajagopalan, Rajalaxmi, et al.
Published: (2025)
Continual Safety Alignment via Gradient-Based Sample Selection
by: Bach, Thong, et al.
Published: (2026)
Feature Optimization for Time Series Forecasting via Novel Randomized Uphill Climbing
by: Van Thanh, Nguyen
Published: (2025)
BSO: Safety Alignment Is Density Ratio Matching
by: Nguyen, Tien-Phat, et al.
Published: (2026)
Sample-Constrained Black Box Optimization for Audio Personalization
by: Rajagopalan, Rajalaxmi, et al.
Published: (2025)
Statistical Inference for Autoencoder-based Anomaly Detection after Representation Learning-based Domain Adaptation
by: Kiet, Tran Tuan, et al.
Published: (2025)
Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment
by: Ghosal, Soumya Suvra, et al.
Published: (2024)
POT: Inducing Overthinking in LLMs via Black-Box Iterative Optimization
by: Li, Xinyu, et al.
Published: (2025)
Black Box Causal Inference: Effect Estimation via Meta Prediction
by: Bynum, Lucius E. J., et al.
Published: (2025)
Black-Box Optimization From Small Offline Datasets via Meta Learning with Synthetic Tasks
by: Fadhel, Azza, et al.
Published: (2026)
PRESTO: Preimage-Informed Instruction Optimization for Prompting Black-Box LLMs
by: Chu, Jaewon, et al.
Published: (2025)
Mitigating the Safety Alignment Tax with Null-Space Constrained Policy Optimization
by: Niu, Yifan, et al.
Published: (2025)
HUANet: Hard-Constrained Unrolled ADMM for Constrained Convex Optimization
by: Tran, Trinh, et al.
Published: (2026)
No-Regret Learning of Nash Equilibrium for Black-Box Games via Gaussian Processes
by: Han, Minbiao, et al.
Published: (2024)
MANATEE: Inference-Time Lightweight Diffusion Based Safety Defense for LLMs
by: Kan, Chun Yan Ryan, et al.
Published: (2026)
Boundary Point Jailbreaking of Black-Box LLMs
by: Davies, Xander, et al.
Published: (2026)
Transferable Black-Box One-Shot Forging of Watermarks via Image Preference Models
by: Souček, Tomáš, et al.
Published: (2025)
Unveiling the Black Box: A Multi-Layer Framework for Explaining Reinforcement Learning-Based Cyber Agents
by: Goel, Diksha, et al.
Published: (2025)
Surrogate-based Optimization via Clustering for Box-Constrained Problems
by: Ahmad, Maaz, et al.
Published: (2026)
Posterior Inference in Latent Space for Scalable Constrained Black-box Optimization
by: Om, Kiyoung, et al.
Published: (2025)
Enhancing Time Series Forecasting via a Parallel Hybridization of ARIMA and Polynomial Classifiers
by: Nguyen, Thanh Son, et al.
Published: (2025)
Stepwise Alignment for Constrained Language Model Policy Optimization
by: Wachi, Akifumi, et al.
Published: (2024)
Evaluating Black-Box Vulnerabilities with Wasserstein-Constrained Data Perturbations
by: Monteiro, Adriana Laurindo, et al.
Published: (2026)
Efficient Mixture Learning in Black-Box Variational Inference
by: Hotti, Alexandra, et al.
Published: (2024)
Robust SDE Parameter Estimation Under Missing Time Information Setting
by: Van Tran, Long, et al.
Published: (2026)
Reward Shaping for Inference-Time Alignment: A Stackelberg Game Perspective
by: Wang, Haichuan, et al.
Published: (2026)
Revisiting LARS for Large Batch Training Generalization of Neural Networks
by: Do, Khoi, et al.
Published: (2023)
Learning the Expected Core of Strictly Convex Stochastic Cooperative Games
by: Tran, Nam Phuong, et al.
Published: (2024)
Hair-Trigger Alignment: Black-Box Evaluation Cannot Guarantee Post-Update Alignment
by: Bakman, Yavuz, et al.
Published: (2026)
Agnostic Sharpness-Aware Minimization
by: Nguyen, Van-Anh, et al.
Published: (2024)
On the Convergence of Black-Box Variational Inference
by: Kim, Kyurae, et al.
Published: (2023)
Auditing Training Data in Generative Music Models via Black-Box Membership Inference
by: Liu, Yi Chen, et al.
Published: (2026)
Guardrails in Logit Space: Safety Token Regularization for LLM Alignment
by: Bach, Thong, et al.
Published: (2026)
Black Box Variational Inference with a Deterministic Objective: Faster, More Accurate, and Even More Black Box
by: Giordano, Ryan, et al.
Published: (2023)
Adaptive Test-Time Compute Allocation for Reasoning LLMs via Constrained Policy Optimization
by: Zhai, Zhiyuan, et al.
Published: (2026)
Empirical Comparison of Lightweight Forecasting Models for Seasonal and Non-Seasonal Time Series
by: Nguyen, Thanh Son, et al.
Published: (2025)
GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation
by: Galichin, Andrey V., et al.
Published: (2024)