Bibliographic Details
Main Authors: Shor, Tamir; Fetaya, Ethan; Baskin, Chaim; Bronstein, Alex
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2502.20314
author Shor, Tamir
Fetaya, Ethan
Baskin, Chaim
Bronstein, Alex
contents Implicit Neural Representations (INRs) have recently attracted increasing interest across research fields, mainly because they can represent large, complex data in a compact, continuous manner. Past work further showed that numerous popular downstream tasks can be performed directly in the INR parameter space. Doing so can substantially reduce the computational resources required to process the represented data in their native domain. A major difficulty in using modern machine-learning approaches is their high susceptibility to adversarial attacks, which have been shown to greatly limit the reliability and applicability of such methods in a wide range of settings. In this work, we perform an in-depth security analysis of the behavior of weight-space classifiers under adversarial attacks. Our study reveals that parameter-space models trained for classification exhibit increased robustness to standard white-box adversarial attacks compared to standard classifiers that operate in the original signal space. This robustness is achieved without the need for any robust training. We attribute this robust behavior to the phenomenon of gradient obfuscation promoted during the INR optimization process, and pinpoint the limitations of this robustness under alternative adversarial approaches. To support our claims, we develop a novel suite of adversarial attacks targeting parameter-space classifiers, and furthermore analyze practical considerations of such attacks.
format Preprint
id arxiv_https___arxiv_org_abs_2502_20314
institution arXiv
publishDate 2025
record_format arxiv
title Adversarial Attacks in Weight-Space Classifiers
topic Machine Learning
url https://arxiv.org/abs/2502.20314