| Main Authors: | Achara, Akshit, Gaintseva, Tatiana, Mahaut, Mateo, Chakraborty, Pritish, Johansson, Viktor Stenby, Barsbey, Melih, Rodolà, Emanuele, Crisostomi, Donato |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.06205 |
Similar Items
Model Merging Improves Zero-Shot Generalization in Bioacoustic Foundation Models
by: Marincione, Davide, et al.
Published: (2025)
Language Models are Injective and Hence Invertible
by: Nikolaou, Giorgos, et al.
Published: (2025)
Metric Based Few-Shot Graph Classification
by: Crisostomi, Donato, et al.
Published: (2022)
Mergenetic: a Simple Evolutionary Model Merging Library
by: Minut, Adrian Robert, et al.
Published: (2025)
MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs
by: Mencattini, Tommaso, et al.
Published: (2025)
Model Merging: Foundations and Algorithms
by: Crisostomi, Donato
Published: (2026)
Two-Scale Latent Dynamics for Recurrent-Depth Transformers
by: Pappone, Francesco, et al.
Published: (2025)
ATM: Improving Model Merging by Alternating Tuning and Merging
by: Zhou, Luca, et al.
Published: (2024)
Implicit Inversion turns CLIP into a Decoder
by: D'Orazio, Antonio, et al.
Published: (2025)
On Task Vectors and Gradients
by: Zhou, Luca, et al.
Published: (2025)
$C^2M^3$: Cycle-Consistent Multi-Model Merging
by: Crisostomi, Donato, et al.
Published: (2024)
MASS: MoErging through Adaptive Subspace Selection
by: Crisostomi, Donato, et al.
Published: (2025)
Large Learning Rates Simultaneously Achieve Robustness to Spurious Correlations and Compressibility
by: Barsbey, Melih, et al.
Published: (2025)
Adversarial Attacks Leverage Interference Between Features in Superposition
by: Stevinson, Edward, et al.
Published: (2025)
LoopGen: Training-Free Loopable Music Generation
by: Marincione, Davide, et al.
Published: (2025)
On the Interaction of Compressibility and Adversarial Robustness
by: Barsbey, Melih, et al.
Published: (2025)
Communicating Sound Through Natural Language
by: Rossi, Emanuele, et al.
Published: (2026)
Grokking at the Edge of Numerical Stability
by: Prieto, Lucas, et al.
Published: (2025)
Watching the AI Watchdogs: A Fairness and Robustness Analysis of AI Safety Moderation Classifiers
by: Achara, Akshit, et al.
Published: (2025)
Domain Elastic Transform: Bayesian Function Registration for High-Dimensional Scientific Data
by: Hirose, Osamu, et al.
Published: (2026)
Referential communication in heterogeneous communities of pre-trained visual deep networks
by: Mahaut, Matéo, et al.
Published: (2023)
From Data Statistics to Feature Geometry: How Correlations Shape Superposition
by: Prieto, Lucas, et al.
Published: (2026)
Multi-objective Evolutionary Merging Enables Efficient Reasoning Models
by: Iacobelli, Mario, et al.
Published: (2026)
MidSteer: Optimal Affine Framework for Steering Generative Models
by: Gaintseva, Tatiana, et al.
Published: (2026)
Revealing the Underlying Patterns: Investigating Dataset Similarity, Performance, and Generalization
by: Achara, Akshit, et al.
Published: (2023)
CoreDeep: Improving Crack Detection Algorithms Using Width Stochasticity
by: Pandey, Ram Krishna, et al.
Published: (2022)
Mapping representations in Reinforcement Learning via Semantic Alignment for Zero-Shot Stitching
by: Ricciardi, Antonio Pio, et al.
Published: (2025)
TOAST: Transformer Optimization using Adaptive and Simple Transformations
by: Cannistraci, Irene, et al.
Published: (2024)
R3L: Relative Representations for Reinforcement Learning
by: Ricciardi, Antonio Pio, et al.
Published: (2024)
FairPO: Robust Preference Optimization for Fair Multi-Label Learning
by: Mondal, Soumen Kumar, et al.
Published: (2025)
Membership and Dataset Inference Attacks on Large Audio Generative Models
by: Proboszcz, Jakub, et al.
Published: (2025)
Task Singular Vectors: Reducing Task Interference in Model Merging
by: Gargiulo, Antonio Andrea, et al.
Published: (2024)
Accelerating Transformer Inference for Translation via Parallel Decoding
by: Santilli, Andrea, et al.
Published: (2023)
Repetitions are not all alike: distinct mechanisms sustain repetition in language models
by: Mahaut, Matéo, et al.
Published: (2025)
Update Your Transformer to the Latest Release: Re-Basin of Task Vectors
by: Rinaldi, Filippo, et al.
Published: (2025)
Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD
by: Wan, Yijun, et al.
Published: (2023)
Zero-Shot Quantization via Weight-Space Arithmetic
by: Solombrino, Daniele, et al.
Published: (2026)
Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions
by: Miranda, Michele, et al.
Published: (2024)
Understanding Sources of Demographic Predictability in Brain MRI via Disentangling Anatomy and Contrast
by: Avci, Mehmet Yigit, et al.
Published: (2026)
Fast and Featureless Node Representation Learning with Partial Pairwise Supervision
by: Chakraborty, Sujan, et al.
Published: (2026)