Dirichlet processes, posterior similarity and graph clustering
Last modified: 2018-05-18
Abstract
This paper proposes a clustering method based on the sequential estimation of the random partition induced by the Dirichlet process. Our approach relies on the Sequential Importance Resampling (SIR) algorithm and on estimating the posterior probability that each pair of individuals is generated by the same mixture component. These estimates do not require the mixture components to be identified, and are therefore unaffected by label switching. The pairwise probabilities form a similarity matrix, which in turn defines a weighted undirected graph. A random walk can be defined on this graph, with dynamics closely linked to the posterior similarity. A community detection algorithm, the map equation, can then be applied to obtain a clustering that minimises an information-theoretic criterion.
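As an illustration of the pairwise-probability step, the following minimal sketch estimates a posterior similarity matrix from a set of weighted partitions and row-normalises it into random-walk transition probabilities. It is not the authors' implementation: the particles and weights below are hypothetical stand-ins for SIR output, the function names are invented for this example, and dropping self-loops from the walk is an assumption. The SIR sampler and the map equation step are not shown.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def posterior_similarity(partitions: Sequence[Sequence[int]],
                         weights: Sequence[float]) -> Matrix:
    """Weighted share of particles in which items i and j share a cluster.

    Only equality of labels within a particle is used, so relabelling the
    clusters of any particle leaves the result unchanged.
    """
    n = len(partitions[0])
    total = sum(weights)  # normalise in case the weights do not sum to 1
    sim = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    sim[i][j] += w / total
    return sim


def transition_matrix(sim: Matrix) -> Matrix:
    """Row-normalise the similarities into random-walk probabilities.

    Assumption: self-loops (the diagonal) are excluded from the graph.
    """
    n = len(sim)
    walk = []
    for i in range(n):
        row = [sim[i][j] if j != i else 0.0 for j in range(n)]
        degree = sum(row)
        walk.append([x / degree if degree > 0 else 0.0 for x in row])
    return walk


# Hypothetical particles: three partitions of four individuals.
particles = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
particle_weights = [0.5, 0.3, 0.2]

similarity = posterior_similarity(particles, particle_weights)
walk = transition_matrix(similarity)

# Same partitions with cluster labels permuted inside each particle.
relabelled = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 0]]
similarity_relabelled = posterior_similarity(relabelled, particle_weights)
```

The first two particles describe the same partition under swapped labels, yet both contribute identically to the matrix, which is why no component identification is needed. The resulting transition matrix is the kind of input a community detection method based on the map equation would operate on.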