Library Catalog

Bibliographic Details
Main Authors:	Suri, Manan; Anand, Nishit; Bhaskar, Amisha
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2503.06040

Similar Items

Learning Illumination Control in Diffusion Models
by: Anand, Nishit, et al.
Published: (2026)

Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
by: Hans, Abhimanyu, et al.
Published: (2024)

The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation
by: Xiong, Alexander, et al.
Published: (2025)

Generalization or Memorization: Dynamic Decoding for Mode Steering
by: Zhang, Xuanming
Published: (2025)

FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering
by: Li, Yichen, et al.
Published: (2025)

Continual Memorization of Factoids in Language Models
by: Chen, Howard, et al.
Published: (2024)

Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs
by: Bossy, Thierry, et al.
Published: (2025)

Steering Towards Fairness: Mitigating Political Bias in LLMs
by: Nadeem, Afrozah, et al.
Published: (2025)

Randomized Masked Finetuning: An Efficient Way to Mitigate Memorization of PIIs in LLMs
by: Joshi, Kunj, et al.
Published: (2025)

Understanding How CodeLLMs (Mis)Predict Types with Activation Steering
by: Lucchetti, Francesca, et al.
Published: (2024)

Latent Phase-Shift Rollback: Inference-Time Error Correction via Residual Stream Monitoring and KV-Cache Steering
by: Gupta, Manan, et al.
Published: (2026)

TSPE: Task-Specific Prompt Ensemble for Improved Zero-Shot Audio Classification
by: Anand, Nishit, et al.
Published: (2024)

Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors
by: Wang, Weixuan, et al.
Published: (2024)

CodeScout: Contextual Problem Statement Enhancement for Software Agents
by: Suri, Manan, et al.
Published: (2026)

Mitigating Memorization In Language Models
by: Sakarvadia, Mansi, et al.
Published: (2024)

Guiding Giants: Lightweight Controllers for Weighted Activation Steering in LLMs
by: Hegazy, Amr, et al.
Published: (2025)

Steering MoE LLMs via Expert (De)Activation
by: Fayyaz, Mohsen, et al.
Published: (2025)

Skeleton-based Coherence Modeling in Narratives
by: Asnani, Nishit, et al.
Published: (2026)

Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs
by: Yan, Lecheng, et al.
Published: (2026)

Extracting Unlearned Information from LLMs with Activation Steering
by: Seyitoğlu, Atakan, et al.
Published: (2024)

ChartLens: Fine-grained Visual Attribution in Charts
by: Suri, Manan, et al.
Published: (2025)

ContextFocus: Activation Steering for Contextual Faithfulness in Large Language Models
by: Anand, Nikhil, et al.
Published: (2026)

VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation
by: Suri, Manan, et al.
Published: (2024)

Why Stop at Words? Unveiling the Bigger Picture through Line-Level OCR
by: Vempati, Shashank, et al.
Published: (2025)

The Unreasonable Ineffectiveness of Nucleus Sampling on Mitigating Text Memorization
by: Borec, Luka, et al.
Published: (2024)

Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs
by: Siddique, Zara, et al.
Published: (2025)

From Amateur to Master: Infusing Knowledge into LLMs via Automated Curriculum Learning
by: Neema, Nishit, et al.
Published: (2025)

Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
by: Kassem, Aly M., et al.
Published: (2024)

SafeConstellations: Mitigating Over-Refusals in LLMs Through Task-Aware Representation Steering
by: Maskey, Utsav, et al.
Published: (2025)

Memorization or Reasoning? Exploring the Idiom Understanding of LLMs
by: Kim, Jisu, et al.
Published: (2025)

Fine-Grained Activation Steering: Steering Less, Achieving More
by: Feng, Zijian, et al.
Published: (2026)

Mitigating Content Effects on Reasoning in Language Models through Fine-Grained Activation Steering
by: Valentino, Marco, et al.
Published: (2025)

Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs
by: Bhandari, Pranav, et al.
Published: (2025)

FHIRPath-QA: Executable Question Answering over FHIR Electronic Health Records
by: Frew, Michael, et al.
Published: (2026)

Memorization and Knowledge Injection in Gated LLMs
by: Pan, Xu, et al.
Published: (2025)

Structured Uncertainty guided Clarification for LLM Agents
by: Suri, Manan, et al.
Published: (2025)

Steering Awareness: Detecting Activation Steering from Within
by: Rivera, Joshua Fonseca, et al.
Published: (2025)

Steer2Edit: From Activation Steering to Component-Level Editing
by: Sun, Chung-En, et al.
Published: (2026)

Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents
by: Suri, Manan, et al.
Published: (2025)

Unveiling Over-Memorization in Finetuning LLMs for Reasoning Tasks
by: Ruan, Zhiwen, et al.
Published: (2025)