Open Conference Systems, STATISTICS AND DATA SCIENCE: NEW CHALLENGES, NEW GENERATIONS

Detecting group differences in multivariate categorical data
Massimiliano Russo

Last modified: 2017-04-28

Abstract


In several studies, a group indicator is collected together with a multivariate vector of categorical variables, with the main goal of assessing evidence of differences in the collected vector across these groups. Similar goals arise routinely, but very few general methods for testing group differences in multivariate categorical data are discussed in the literature. We address this goal by proposing a Bayesian model that factorizes the joint probability mass function of the group variable and the multivariate categorical data as the product of the marginal probabilities for the groups and the conditional probability mass function of the multivariate categorical data given group membership. To provide a flexible and computationally tractable model for the probability mass function of the multivariate categorical vector, we rely on a mixture of tensor factorizations, which facilitates dimensionality reduction while providing simple and accurate test procedures to assess global and local group differences.
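
The factorization described above can be written out as a short sketch. The notation is an assumption introduced here for illustration and does not appear in the abstract: y denotes the group indicator with values in {1, ..., k}, x = (x_1, ..., x_p) the categorical vector, H the number of mixture components, \nu_{gh} the mixing weights, and \psi_{hj} the component-specific probability mass function of the j-th variable. The second display shows one common form of a mixture of product kernels, with group-specific weights and kernels shared across groups; the abstract does not specify which parameters depend on the group, so this is only one possible reading.

```latex
% Joint pmf as marginal for the group times conditional for the data
\operatorname{pr}(y = g,\; x = c)
  = \operatorname{pr}(y = g)\,
    \operatorname{pr}(x = c \mid y = g),
  \qquad g = 1, \dots, k .

% One possible mixture of tensor factorizations (assumed form):
% group-specific weights \nu_{gh}, kernels \psi_{hj} shared across groups
\operatorname{pr}\{x = (c_1, \dots, c_p) \mid y = g\}
  = \sum_{h=1}^{H} \nu_{gh} \prod_{j=1}^{p} \psi_{hj}(c_j),
  \qquad \sum_{h=1}^{H} \nu_{gh} = 1 .
```

Under this assumed form, the distribution of x is the same in every group when the weight vectors (\nu_{g1}, ..., \nu_{gH}) coincide across g, which suggests how a global test could be phrased in terms of the weights.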