K-means seeding via MUS algorithm

Leonardo Egidi; Roberta Pappadà; Francesco Pauli; Nicola Torelli

Open Conference Systems, 50th Scientific meeting of the Italian Statistical Society

Last modified: 2018-05-21

Abstract

The K-means algorithm is one of the most popular procedures in data clustering. Despite its widespread use, one major criticism is the sensitivity of the final solution to the initial seeding. A modified version of K-means is proposed, based on a suitable choice of the initial centers. Similarly to clustering ensemble methods, our approach takes advantage of the information contained in a co-association matrix. This matrix is given as input to the MUS algorithm, which is used to define a pivot-based initialization step. Preliminary results comparing the proposed method with the classical approach are discussed.
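The idea of seeding K-means from a co-association matrix can be illustrated with a minimal sketch. The abstract does not describe the internals of the MUS algorithm, so the pivot step below is a simplified greedy stand-in and not MUS itself: it picks units whose mutual co-association is low. The co-association matrix is assumed here to be the fraction of randomly seeded K-means runs in which two units share a cluster. All function names (`kmeans`, `co_association`, `greedy_pivots`), the toy data, and the parameter values are assumptions made for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centers, n_iter=50):
    """Plain Lloyd iterations from the given initial centers."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
                  for p in points]
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(col) / len(members)
                                         for col in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def co_association(points, k, n_runs=20, seed=0):
    """Fraction of randomly seeded K-means runs in which units i and j co-cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        labels, _ = kmeans(points, rng.sample(points, k))
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / n_runs for c in row] for row in counts]

def greedy_pivots(C, k):
    """Simplified stand-in for MUS (an assumption, not the published algorithm):
    greedily pick k units with low mutual co-association."""
    n = len(C)
    # Start from the unit with the largest total co-association.
    pivots = [max(range(n), key=lambda i: sum(C[i]))]
    while len(pivots) < k:
        candidates = [i for i in range(n) if i not in pivots]
        # Prefer the unit least co-associated with the pivots chosen so far.
        pivots.append(min(candidates,
                          key=lambda i: (max(C[i][p] for p in pivots), -sum(C[i]))))
    return pivots

# Toy data: three small groups of four points each (illustrative only).
points = [(cx + dx, cy + dy)
          for cx, cy in [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
          for dx, dy in [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]]

C = co_association(points, k=3)
pivots = greedy_pivots(C, k=3)
# Pivot-based initialization: the selected units serve as the initial centers.
labels, centers = kmeans(points, [points[i] for i in pivots])
```

In this sketch the ensemble of randomly seeded runs is summarized once in `C`, and the final K-means run starts from the selected pivots rather than from a random draw. How pivots are actually chosen in the proposed method is determined by the MUS algorithm, which may differ substantially from this greedy rule.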