Bibliographic Details
Main Authors: Abolfazli, Mojtaba; Amirani, Mohammad Zaeri; Høst-Madsen, Anders; Zhang, June; Bratincsak, Andras
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2404.17023
Summary:
  • Given a default distribution $P$ and a set of test data $x^M=\{x_1,x_2,\ldots,x_M\}$, this paper asks whether it is likely that $x^M$ was generated by $P$. For discrete distributions, the definitive answer is in principle given by Kolmogorov-Martin-Löf randomness. In this paper we seek to generalize this to continuous distributions. We consider a set of statistics $T_1(x^M),T_2(x^M),\ldots$. With each statistic we associate its maximum entropy distribution and, with this, a universal source coder. The maximum entropy distributions are subsequently combined to give a total codelength, which is compared with $-\log P(x^M)$. We show that this approach satisfies a number of theoretical properties. For real-world data, $P$ is usually unknown. We transform the data into a standard distribution in the latent space using a bidirectional generative network and use maximum entropy coding there. We compare the resulting method to other methods that also use generative neural networks to detect anomalies. In most cases, our results show better performance.
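
The codelength comparison described in the summary can be illustrated with a toy sketch. This is not the authors' algorithm. It assumes a default $P = N(0,1)$, a single statistic $T(x^M) = \frac{1}{M}\sum_i x_i^2$ (whose maximum entropy distribution is $N(0,t)$), and an MDL-style parameter cost of $\frac{1}{2}\log M$ nats. All function names are hypothetical.

```python
import math

def default_codelength(xs):
    """-log P(x^M) in nats, assuming the default P is N(0, 1)."""
    return sum(0.5 * math.log(2 * math.pi) + 0.5 * x * x for x in xs)

def maxent_codelength(xs):
    """Two-part codelength for the statistic T(x^M) = mean(x_i^2).

    The maximum entropy density subject to E[x^2] = t is N(0, t).
    We plug in the empirical t and add (1/2) log M nats to pay for
    describing it (an assumed penalty, not taken from the paper).
    """
    m = len(xs)
    t = sum(x * x for x in xs) / m
    if t <= 0.0:
        raise ValueError("statistic must be positive")
    # Sum of -log N(x_i; 0, t) simplifies to (M/2) * (log(2*pi*t) + 1).
    data_cost = 0.5 * m * (math.log(2 * math.pi * t) + 1.0)
    return data_cost + 0.5 * math.log(m)

def atypicality_score(xs):
    """Positive when the max-entropy code is shorter than the default code."""
    return default_codelength(xs) - maxent_codelength(xs)

# Data whose second moment matches P exactly versus data that is too spread out.
typical = [1.0, -1.0] * 100
spread = [3.0, -3.0] * 100
print(atypicality_score(typical) < 0)  # default code wins
print(atypicality_score(spread) > 0)   # max-entropy code wins
```

In this sketch a positive score means the alternative code describes the data more compactly than $-\log P(x^M)$, which is the sense in which the data would look unlikely under $P$. Combining several statistics, as the summary describes, would need a rule for merging their codelengths that is not specified here.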