| Main Authors: | Zhao, Hao, Andriushchenko, Maksym, Croce, Francesco, Flammarion, Nicolas |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.04833 |
Similar Items
Is In-Context Learning Sufficient for Instruction Following in LLMs?
by: Zhao, Hao, et al.
Published: (2024)
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
by: Andriushchenko, Maksym, et al.
Published: (2024)
Does Refusal Training in LLMs Generalize to the Past Tense?
by: Andriushchenko, Maksym, et al.
Published: (2024)
HalluHard: A Hard Multi-Turn Hallucination Benchmark
by: Fan, Dongyang, et al.
Published: (2026)
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs
by: Rando, Javier, et al.
Published: (2024)
SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection
by: Tang, Kexian, et al.
Published: (2026)
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
by: Kuntz, Thomas, et al.
Published: (2025)
On the Out-of-Distribution Generalization of Reasoning in Multimodal LLMs for Simple Visual Planning Tasks
by: Neuhaus, Yannic, et al.
Published: (2026)
Why Do We Need Weight Decay in Modern Deep Learning?
by: D'Angelo, Francesco, et al.
Published: (2023)
Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning
by: Kopiczko, Dawid J., et al.
Published: (2026)
FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens
by: Schlarmann, Christian, et al.
Published: (2025)
On the Adversarial Robustness of Discrete Image Tokenizers
by: Bhagwatkar, Rishika, et al.
Published: (2026)
Capability-Based Scaling Trends for LLM-Based Red-Teaming
by: Panfilov, Alexander, et al.
Published: (2025)
Improved Baselines with Visual Instruction Tuning
by: Liu, Haotian, et al.
Published: (2023)
Selective Induction Heads: How Transformers Select Causal Structures In Context
by: D'Angelo, Francesco, et al.
Published: (2025)
(How) Learning Rates Regulate Catastrophic Overtraining
by: Rofin, Mark, et al.
Published: (2026)
A NotSo Simple Way to Beat Simple Bench
by: Sane, Soham, et al.
Published: (2024)
Toward Secure Tuning: Mitigating Security Risks from Instruction Fine-Tuning
by: Du, Yanrui, et al.
Published: (2024)
Helpful to a Fault: Measuring Illicit Assistance in Multi-Turn, Multilingual LLM Agents
by: Talokar, Nivya, et al.
Published: (2026)
Improving Alignment and Robustness with Circuit Breakers
by: Zou, Andy, et al.
Published: (2024)
Does Instruction Tuning Make LLMs More Consistent?
by: Fierro, Constanza, et al.
Published: (2024)
SIBO: A Simple Booster for Parameter-Efficient Fine-Tuning
by: Wen, Zhihao, et al.
Published: (2024)
Filtering Beats Fine Tuning: A Bayesian Kalman View of In Context Learning in LLMs
by: Kiruluta, Andrew
Published: (2026)
A Simple Yet Strong Baseline for Long-Term Conversational Memory of LLM Agents
by: Zhou, Sizhe, et al.
Published: (2025)
Less is More: High-value Data Selection for Visual Instruction Tuning
by: Liu, Zikang, et al.
Published: (2024)
The Atomic Instruction Gap: Instruction-Tuned LLMs Struggle with Simple, Self-Contained Directives
by: Lim, Henry, et al.
Published: (2025)
Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management
by: Ma, Weitao, et al.
Published: (2026)
Instruction Matters: A Simple yet Effective Task Selection for Optimized Instruction Tuning of Specific Tasks
by: Lee, Changho, et al.
Published: (2024)
Blind Baselines Beat Membership Inference Attacks for Foundation Models
by: Das, Debeshee, et al.
Published: (2024)
Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance
by: Alajrami, Ahmed, et al.
Published: (2025)
GIFT: Guided Fine-Tuning and Transfer for Enhancing Instruction-Tuned Language Models
by: Ruan, Zhiwen, et al.
Published: (2026)
Chain-of-Frames: Advancing Video Understanding in Multimodal LLMs via Frame-Aware Reasoning
by: Ghazanfari, Sara, et al.
Published: (2025)
Low-Resource Fine-Tuning for Multi-Task Structured Information Extraction with a Billion-Parameter Instruction-Tuned Model
by: Chih, Yu Cheng, et al.
Published: (2025)
Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
by: Song, Feifan, et al.
Published: (2024)
Decomposing and Measuring Evaluation Awareness
by: Li, Changling, et al.
Published: (2026)
Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs
by: Jindal, Ishan, et al.
Published: (2024)
Stealth Fine-Tuning: Efficiently Breaking Alignment in RVLMs Using Self-Generated CoT
by: Yu, Le, et al.
Published: (2025)
Adapting AlignScore Metric for Factual Consistency Evaluation of Text in Russian: A Student Abstract
by: Zimin, Mikhail, et al.
Published: (2025)
Towards Unified Benchmark and Models for Multi-Modal Perceptual Metrics
by: Ghazanfari, Sara, et al.
Published: (2024)
DELIFT: Data Efficient Language model Instruction Fine Tuning
by: Agarwal, Ishika, et al.
Published: (2024)