Open Conference Systems, STATISTICS AND DATA SCIENCE: NEW CHALLENGES, NEW GENERATIONS

Model-based Clustering with Sparse Covariance Matrices
Michael Fop, Brendan Murphy, Luca Scrucca

Last modified: 2017-04-28

Abstract


We introduce mixtures of Gaussian covariance graph models for model-based clustering with sparse covariance matrices. The framework yields a parsimonious clustering of the data in which each cluster is characterised by a sparse covariance matrix and the associated dependence structure is represented by a graph. The graphical models impose a set of pairwise independence restrictions on the covariance matrices, which induces sparsity while keeping the model for the joint distribution of the variables flexible. The model is estimated by a penalised likelihood approach, with the maximisation carried out by a genetic algorithm embedded in a structural EM algorithm. The method extends naturally to Bayesian regularisation for high-dimensional data.
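
The objective described in the abstract can be illustrated with a minimal sketch. It evaluates a Gaussian mixture log-likelihood in which each cluster's covariance matrix must have zeros wherever its graph has no edge, then subtracts a penalty. The function names, the per-edge penalty form and the toy data are assumptions made for illustration only; the abstract does not specify the penalty, and the genetic algorithm and structural EM search over graphs are not implemented here.

```python
import numpy as np


def log_gaussian(X, mean, cov):
    """Row-wise log density of N(mean, cov); cov must be positive definite."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)  # raises LinAlgError if cov is not positive definite
    z = np.linalg.solve(L, (X - mean).T)  # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z**2, axis=0))


def penalised_loglik(X, weights, means, covs, graphs, penalty_per_edge):
    """Mixture log-likelihood minus an (assumed) penalty per graph edge.

    graphs[k] is a symmetric 0/1 adjacency matrix for cluster k. A missing
    edge between variables i and j requires covs[k][i, j] == 0.
    Returns (penalised log-likelihood, total number of edges).
    """
    n = X.shape[0]
    log_terms = np.empty((n, len(weights)))
    n_edges = 0
    for k, (w, m, S, G) in enumerate(zip(weights, means, covs, graphs)):
        A = np.asarray(G, dtype=bool)
        off_diag = ~np.eye(A.shape[0], dtype=bool)
        # Covariance graph constraint: no edge means zero covariance.
        if np.any((S != 0) & ~A & off_diag):
            raise ValueError(f"covariance of cluster {k} violates its graph")
        log_terms[:, k] = np.log(w) + log_gaussian(X, m, S)
        n_edges += int(np.triu(A, k=1).sum())
    # Log-sum-exp over clusters for numerical stability.
    mx = log_terms.max(axis=1, keepdims=True)
    loglik = float(np.sum(mx[:, 0] + np.log(np.exp(log_terms - mx).sum(axis=1))))
    return loglik - penalty_per_edge * n_edges, n_edges


# Toy example: three variables, two clusters (all values hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
weights = [0.6, 0.4]
means = [np.zeros(3), np.ones(3)]
graphs = [
    np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]]),  # single edge 0-1
    np.zeros((3, 3), dtype=int),                  # empty graph
]
covs = [
    np.array([[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]),
    np.eye(3),
]
value, n_edges = penalised_loglik(X, weights, means, covs, graphs, penalty_per_edge=2.0)
print(n_edges, value)
```

In a full procedure, a score of this kind would be compared across candidate graphs, with the covariance matrices re-estimated under each graph's zero constraints; here the covariances are simply fixed by hand.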