Library Catalog

Bibliographic Details
Main Authors:	Jeung, Wonje; Yoon, Sangyeon; Hong, Hyesoo; Kim, Soeun; Han, Seungju; Yu, Youngjae; No, Albert
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2505.15209

Similar Items

R-TOFU: Unlearning in Large Reasoning Models
by: Yoon, Sangyeon, et al.
Published: (2025)

SEPS: A Separability Measure for Robust Unlearning in LLMs
by: Jeung, Wonje, et al.
Published: (2025)

Rethinking Benign Relearning: Syntax as the Hidden Driver of Unlearning Failures
by: Yoon, Sangyeon, et al.
Published: (2026)

A2D: Any-Order, Any-Step Safety Alignment for Diffusion Language Models
by: Jeung, Wonje, et al.
Published: (2025)

BenchPreS: A Benchmark for Context-Aware Personalized Preference Selectivity of Persistent-Memory LLMs
by: Yoon, Sangyeon, et al.
Published: (2026)

SAFEPATH: Preventing Harmful Reasoning in Chain-of-Thought via Early Alignment
by: Jeung, Wonje, et al.
Published: (2025)

VLMs Trace Without Tracking: Diagnosing Failures in Visual Path Following
by: Hong, Hyesoo, et al.
Published: (2026)

Adversarial Sample-Based Approach for Tighter Privacy Auditing in Final Model-Only Scenarios
by: Yoon, Sangyeon, et al.
Published: (2024)

Representation Bending for Large Language Model Safety
by: Yousefpour, Ashkan, et al.
Published: (2025)

Few-Shot Truly Benign DPO Attack for Jailbreaking LLMs
by: Yoon, Sangyeon, et al.
Published: (2026)

Knowledge Beyond Language: Bridging the Gap in Multilingual Machine Unlearning Evaluation
by: Hwang, Kyomin, et al.
Published: (2026)

Large Language Models Still Exhibit Bias in Long Text
by: Jeung, Wonje, et al.
Published: (2024)

Where Rollouts Begin: Low-Load, High-Leverage First-Token Diversification for RLVR
by: Kim, Soeun, et al.
Published: (2026)

An Information Theoretic Evaluation Metric For Strong Unlearning
by: Jeon, Dongjae, et al.
Published: (2024)

SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models
by: Hyun, Lee, et al.
Published: (2023)

Selective Vision is the Challenge for Visual Reasoning: A Benchmark for Visual Argument Understanding
by: Chung, Jiwan, et al.
Published: (2024)

Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics
by: Lee, Seungbeen, et al.
Published: (2024)

Is GPT-4 Alone Sufficient for Automated Essay Scoring?: A Comparative Judgment Approach Based on Rater Cognition
by: Kim, Seungju, et al.
Published: (2024)

Do MLLMs Capture How Interfaces Guide User Behavior? A Benchmark for Multimodal UI/UX Design Understanding
by: Jeon, Jaehyun, et al.
Published: (2025)

Do Language Models Associate Sound with Meaning? A Multimodal Study of Sound Symbolism
by: Jeong, Jinhong, et al.
Published: (2025)

Multi-Level Knowledge Distillation and Dynamic Self-Supervised Learning for Continual Learning
by: Kim, Taeheon, et al.
Published: (2025)

Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs
by: Kim, Bumjun, et al.
Published: (2025)

DSG-KD: Knowledge Distillation from Domain-Specific to General Language Models
by: Cho, Sangyeon, et al.
Published: (2024)

Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers
by: Seo, Wooseok, et al.
Published: (2025)

Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition
by: Cho, Yoonjun, et al.
Published: (2025)

Intrinsic Test of Unlearning Using Parametric Knowledge Traces
by: Hong, Yihuai, et al.
Published: (2024)

Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness
by: Wei, Rongzhe, et al.
Published: (2025)

ConCSE: Unified Contrastive Learning and Augmentation for Code-Switched Embeddings
by: Jeon, Jangyeong, et al.
Published: (2024)

IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning
by: Lee, Soeun, et al.
Published: (2024)

ViPCap: Retrieval Text-Based Visual Prompts for Lightweight Image Captioning
by: Kim, Taewhan, et al.
Published: (2024)

Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset
by: Kim, Minjin, et al.
Published: (2024)

Safety-Aligned Weights Are Not Enough: Refusal-Teacher-Guided Finetuning Enhances Safety and Downstream Performance under Harmful Finetuning Attacks
by: Ham, Seokil, et al.
Published: (2025)

RESTOR: Knowledge Recovery in Machine Unlearning
by: Rezaei, Keivan, et al.
Published: (2024)

Right at My Level: A Unified Multilingual Framework for Proficiency-Aware Text Simplification
by: Jeong, Jinhong, et al.
Published: (2026)

The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
by: Wu, Zhaofeng, et al.
Published: (2024)

Mind the Motions: Benchmarking Theory-of-Mind in Everyday Body Language
by: Lee, Seungbeen, et al.
Published: (2025)

Are Any-to-Any Models More Consistent Across Modality Transfers Than Specialists?
by: Chung, Jiwan, et al.
Published: (2025)

Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models
by: Lee, Hwiyeong, et al.
Published: (2025)

BioBridge: Unified Bio-Embedding with Bridging Modality in Code-Switched EMR
by: Jeon, Jangyeong, et al.
Published: (2024)

Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization
by: Guo, Phillip, et al.
Published: (2024)