About heuristic algorithm for Correlation Clustering problem solving

Aleksandr Soldatenko, Daria Semenova, Ellada Ibragimova
20m
The Correlation Clustering (CC) problem is traditionally defined as a problem of partitioning a signed graph without specifying the number of clusters in advance. In this paper, CC problem is considered for undirected and unweighted signed graphs without multiple edges and loops, where error functional is linear combination of intercluster and intracluster errors. In this formulation, the CC problem is NP-complete. Exact algorithms for this problem are time-consuming. Approximate algorithms for solving CC problem often lead to unsatisfactory results, and heuristic algorithms are often non-deterministic in the number of steps leading to a solution. We propose a new heuristic algorithm SGClustAlpha for CC problem solving. The main idea of this algorithm is in minimizing of intracluster error and optimization of error functional according to the greedy strategy. It was proved that this algorithm takes polynomial time. Numerical experiments were carried out on randomly generated signed graphs.