Library Catalog

Bibliographic Details
Main Authors:	Bail, Mathis Le; Dentan, Jérémie; Buscaldi, Davide; Vanier, Sonia
Type:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2506.23951

Similar Items

Guess or Recall? Training CNNs to Classify and Localize Memorization in LLMs
by: Dentan, Jérémie, et al.
Published: (2025)

MUCH: A Multilingual Claim Hallucination Benchmark
by: Dentan, Jérémie, et al.
Published: (2025)

Predicting memorization within Large Language Models fine-tuned for classification
by: Dentan, Jérémie, et al.
Published: (2024)

Activation Surgery: Jailbreaking White-box LLMs without Touching the Prompt
by: Jenny, Maël, et al.
Published: (2026)

PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models
by: Dhouib, Mohamed, et al.
Published: (2025)

Leveraging Contrastive Learning for a Similarity-Guided Tampered Document Data Generation Pipeline
by: Dhouib, Mohamed, et al.
Published: (2026)

Triplètoile: Extraction of Knowledge from Microblogging Text
by: Zavarella, Vanni, et al.
Published: (2024)

A Robust Autoencoder Ensemble-Based Approach for Anomaly Detection in Text
by: Pantin, Jeremie, et al.
Published: (2024)

Word Sense Induction with Hierarchical Clustering and Mutual Information Maximization
by: Abdine, Hadi, et al.
Published: (2022)

Disentangling concept semantics via multilingual averaging in Sparse Autoencoders
by: O'Reilly, Cliff, et al.
Published: (2025)

Tug-of-war between idioms' figurative and literal interpretations in LLMs
by: Oh, Soyoung, et al.
Published: (2025)

Sparse Autoencoders for Interpretable Emotion Control in Text-to-Speech
by: Du, Hongfei, et al.
Published: (2026)

Sparse Autoencoder Features for Classifications and Transferability
by: Gallifant, Jack, et al.
Published: (2025)

Unveiling Language-Specific Features in Large Language Models via Sparse Autoencoders
by: Deng, Boyi, et al.
Published: (2025)

Self-Regularization with Sparse Autoencoders for Controllable LLM-based Classification
by: Wu, Xuansheng, et al.
Published: (2025)

Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders
by: Goyal, Agam, et al.
Published: (2025)

SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs
by: Härle, Ruben, et al.
Published: (2024)

Combining Autoregressive and Autoencoder Language Models for Text Classification
by: Gonçalves, João
Published: (2024)

Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders
by: Wu, Xuansheng, et al.
Published: (2025)

Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders
by: Kuznetsov, Kristian, et al.
Published: (2025)

Uncovering Cross-Linguistic Disparities in LLMs using Sparse Autoencoders
by: Xuan, Richmond Sin Jing, et al.
Published: (2025)

Does higher interpretability imply better utility? A Pairwise Analysis on Sparse Autoencoders
by: Wang, Xu, et al.
Published: (2025)

Cognitive Bias in Decision-Making with LLMs
by: Echterhoff, Jessica, et al.
Published: (2024)

TextBandit: Evaluating Probabilistic Reasoning in LLMs Through Language-Only Decision Tasks
by: Lim, Jimin, et al.
Published: (2025)

Layer-Wise Perturbations via Sparse Autoencoders for Adversarial Text Generation
by: Shu, Huizhen, et al.
Published: (2025)

Sequential Decision-Making for Inline Text Autocomplete
by: Chitnis, Rohan, et al.
Published: (2024)

SASFT: Sparse Autoencoder-guided Supervised Finetuning to Mitigate Unexpected Code-Switching in LLMs
by: Deng, Boyi, et al.
Published: (2025)

Less is More: Sparse Watermarking in LLMs with Enhanced Text Quality
by: Hoang, Duy C., et al.
Published: (2024)

Evaluating the Bias in LLMs for Surveying Opinion and Decision Making in Healthcare
by: Khaokaew, Yonchanok, et al.
Published: (2025)

From Text to Emotion: Unveiling the Emotion Annotation Capabilities of LLMs
by: Niu, Minxue, et al.
Published: (2024)

Text Clustering as Classification with LLMs
by: Huang, Chen, et al.
Published: (2024)

Reliable Decision Support with LLMs: A Framework for Evaluating Consistency in Binary Text Classification Applications
by: Megahed, Fadel M., et al.
Published: (2025)

SAFE: A Sparse Autoencoder-Based Framework for Robust Query Enrichment and Hallucination Mitigation in LLMs
by: Abdaljalil, Samir, et al.
Published: (2025)

Sparse Autoencoders for Hypothesis Generation
by: Movva, Rajiv, et al.
Published: (2025)

Sparse Autoencoder Insights on Voice Embeddings
by: Pluth, Daniel, et al.
Published: (2025)

Steering LLMs? Actually, Sparse Autoencoders can outperform simple baselines
by: Jørgensen, Mikkel Godsk, et al.
Published: (2026)

A Design-based Solution for Causal Inference with Text: Can a Language Model Be Too Large?
by: Tierney, Graham, et al.
Published: (2025)

Group-SAE: Efficient Training of Sparse Autoencoders for Large Language Models via Layer Groups
by: Ghilardi, Davide, et al.
Published: (2024)

Constrain Alignment with Sparse Autoencoders
by: Yin, Qingyu, et al.
Published: (2024)

Toponym Disambiguation in Information Retrieval
by: Davide Buscaldi
Published: (2011)