:: Library Catalog

Bibliographic Details
Main Authors:	Farzam, Amirhossein; Behabahani, Majid; Malek, Mani; Nevmyvaka, Yuriy; Sapiro, Guillermo
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.19396

Similar Items

Conceal, Reconstruct, Jailbreak: Exploiting the Reconstruction-Concealment Tradeoff in MLLMs
by: Reza, Md Farhamdur, et al.
Published: (2026)

Hiding in Plain Sight: A Steganographic Approach to Stealthy LLM Jailbreaks
by: Geng, Jianing, et al.
Published: (2025)

SPOT: Sparsification with Attention Dynamics via Token Relevance in Vision Transformers
by: Schlesinger, Oded, et al.
Published: (2025)

Antelope: Potent and Concealed Jailbreak Attack Strategy
by: Zhao, Xin, et al.
Published: (2024)

Exploring Jailbreak Attacks on LLMs through Intent Concealment and Diversion
by: Cui, Tiehan, et al.
Published: (2025)

Hiding in Plain Sight: Detectability-Aware Antidistillation of Reasoning Models
by: Hartman, Max, et al.
Published: (2026)

Data-Aware Random Feature Kernel for Transformers
by: Farzam, Amirhossein, et al.
Published: (2026)

Spectra 1.1: Scaling Laws and Efficient Inference for Ternary Language Models
by: Vaidhya, Tejas, et al.
Published: (2025)

Small Vocabularies, Big Gains: Pretraining and Tokenization in Time Series Models
by: Roger, Alexis, et al.
Published: (2025)

Honeyfile Camouflage: Hiding Fake Files in Plain Sight
by: Timmer, Roelien C., et al.
Published: (2024)

Hide in Plain Sight: Clean-Label Backdoor for Auditing Membership Inference
by: Chen, Depeng, et al.
Published: (2024)

Hide Your Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Carrier Articles
by: Wang, Zhilong, et al.
Published: (2024)

AlphaLab: Autonomous Multi-Agent Research Across Optimization Domains with Frontier LLMs
by: Hogan, Brendan R., et al.
Published: (2026)

Wrist Photoplethysmography Predicts Dietary Information
by: Verrier, Kyle, et al.
Published: (2025)

Deep Generative Sampling in the Dual Divergence Space: A Data-efficient & Interpretative Approach for Generative AI
by: Garg, Sahil, et al.
Published: (2024)

Random Initialization Can't Catch Up: The Advantage of Language Model Transfer for Time Series Forecasting
by: Riachi, Roland, et al.
Published: (2025)

Federated Fairness without Access to Sensitive Groups
by: Papadaki, Afroditi, et al.
Published: (2024)

ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack
by: Park, Yein, et al.
Published: (2025)

Hide to Guide: Learning via Semantic Masking
by: Liu, Ruitao, et al.
Published: (2026)

Cross-Lingual Jailbreak Detection via Semantic Codebooks
by: Alanova, Shirin, et al.
Published: (2026)

Jailbreaking on Text-to-Video Models via Scene Splitting Strategy
by: Lee, Wonjun, et al.
Published: (2025)

TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster
by: Ning, Kanghui, et al.
Published: (2025)

Activation-Guided Local Editing for Jailbreaking Attacks
by: Wang, Jiecong, et al.
Published: (2025)

AHA: Aligning Large Audio-Language Models for Reasoning Hallucinations via Counterfactual Hard Negatives
by: Chen, Yanxi, et al.
Published: (2025)

Efficient LLM-Jailbreaking via Multimodal-LLM Jailbreak
by: Ji, Haoxuan, et al.
Published: (2024)

Graph Partitioning With Limited Moves
by: Behbahani, Majid, et al.
Published: (2024)

Fine-Tuning Causal LLMs for Text Classification: Embedding-Based vs. Instruction-Based Approaches
by: Yousefiramandi, Amirhossein, et al.
Published: (2025)

Jailbreak-R1: Exploring the Jailbreak Capabilities of LLMs via Reinforcement Learning
by: Guo, Weiyang, et al.
Published: (2025)

Subgoal Discovery Using a Free Energy Paradigm and State Aggregations
by: Mesbah, Amirhossein, et al.
Published: (2024)

Jailbreaking to Jailbreak
by: Kritz, Jeremy, et al.
Published: (2025)

Automatic Jailbreaking of the Text-to-Image Generative AI Systems
by: Kim, Minseon, et al.
Published: (2024)

Plan-X: Instruct Video Generation via Semantic Planning
by: Huang, Lun, et al.
Published: (2025)

Concealed Adversarial attacks on neural networks for sequential data
by: Sokerin, Petr, et al.
Published: (2025)

Interpreting and Controlling Model Behavior via Constitutions for Atomic Concept Edits
by: Kalibhat, Neha, et al.
Published: (2026)

An Empirical Evaluation of Neural and Neuro-symbolic Approaches to Real-time Multimodal Complex Event Detection
by: Han, Liying, et al.
Published: (2024)

Hiding-in-Plain-Sight (HiPS) Attack on CLIP for Targetted Object Removal from Images
by: Daw, Arka, et al.
Published: (2024)

Segment Concealed Objects with Incomplete Supervision
by: He, Chunming, et al.
Published: (2025)

Interpretable Discriminative Text Representations via Agreement and Label Disentanglement
by: Wang, Tong, et al.
Published: (2026)

Seamless Deception: Larger Language Models Are Better Knowledge Concealers
by: Ashok, Dhananjay, et al.
Published: (2026)

Sycophancy Hides Linearly in the Attention Heads
by: Genadi, Rifo, et al.
Published: (2026)