Guardado en:
Detalles Bibliográficos
Autores principales: Tran, Nguyen Khoa, Mu, Lin, Papenberg, Martin, Klau, Gunnar W.
Formato: Preprint
Publicado: 2026
Materias:
Acceso en línea:https://arxiv.org/abs/2604.24285
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
_version_ 1866913065026453504
author Tran, Nguyen Khoa
Mu, Lin
Papenberg, Martin
Klau, Gunnar W.
author_facet Tran, Nguyen Khoa
Mu, Lin
Papenberg, Martin
Klau, Gunnar W.
contents The $k$-Maximum Dispersion Problem with Cardinality Constraints ($k$-MDCC) asks for a partition of a given item set with pairwise dissimilarities into $k$ cardinality-constrained groups such that the minimum pairwise intra-group dissimilarity, which is also known as the dispersion, is maximized. The problem arises in the context of anticlustering, where the goal is to create maximally heterogeneous groups of items with applications in psychological research, bioinformatics, and data science. It is known that $k$-MDCC is NP-hard for $k \geq 3$ but it has been an open question whether it can be solved in polynomial time for $k = 2$. We give a positive answer to this question by showing that $2$-MDCC can be solved by a quadratic number of cardinality-constrained 2-coloring problem instances ($2$-COLCC). We solve these instances by transforming them into a restricted class of subset sum instances. Although subset sum is NP-complete in general, for this restricted class the input values are bounded, ensuring that the pseudopolynomial dynamic programming algorithm runs in polynomial time. As a consequence, we obtain a polynomial-time algorithm for $2$-MDCC. We demonstrate that a publicly available open-source implementation of our new algorithm outperforms the previous integer linear programming solution by several orders of magnitude so that even large datasets ($n = 10{,}000$) can be processed in less than a second.
format Preprint
id arxiv_https___arxiv_org_abs_2604_24285
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle Coloring for dispersion: A polynomial-time algorithm for cardinality-constrained 2-anticlustering
Tran, Nguyen Khoa
Mu, Lin
Papenberg, Martin
Klau, Gunnar W.
Data Structures and Algorithms
The $k$-Maximum Dispersion Problem with Cardinality Constraints ($k$-MDCC) asks for a partition of a given item set with pairwise dissimilarities into $k$ cardinality-constrained groups such that the minimum pairwise intra-group dissimilarity, which is also known as the dispersion, is maximized. The problem arises in the context of anticlustering, where the goal is to create maximally heterogeneous groups of items with applications in psychological research, bioinformatics, and data science. It is known that $k$-MDCC is NP-hard for $k \geq 3$ but it has been an open question whether it can be solved in polynomial time for $k = 2$. We give a positive answer to this question by showing that $2$-MDCC can be solved by a quadratic number of cardinality-constrained 2-coloring problem instances ($2$-COLCC). We solve these instances by transforming them into a restricted class of subset sum instances. Although subset sum is NP-complete in general, for this restricted class the input values are bounded, ensuring that the pseudopolynomial dynamic programming algorithm runs in polynomial time. As a consequence, we obtain a polynomial-time algorithm for $2$-MDCC. We demonstrate that a publicly available open-source implementation of our new algorithm outperforms the previous integer linear programming solution by several orders of magnitude so that even large datasets ($n = 10{,}000$) can be processed in less than a second.
title Coloring for dispersion: A polynomial-time algorithm for cardinality-constrained 2-anticlustering
topic Data Structures and Algorithms
url https://arxiv.org/abs/2604.24285