Open Conference Systems, STATISTICS AND DATA SCIENCE: NEW CHALLENGES, NEW GENERATIONS

Non-parametric micro Statistical Matching techniques: some developments
Riccardo D'Alberto, Meri Raggi

Last modified: 2017-05-22

Abstract


Sometimes integrating different data sources is the only suitable solution to a shortage of microdata. Among the available data integration methodologies, Statistical Matching (SM) imputation makes it possible to integrate different datasets when the same records cannot be uniquely identified through an observed identifying variable, and/or when integration must go beyond a modelled rescaling procedure from an observed sample. In particular, non-parametric micro SM imputation (“hot deck”) techniques allow researchers both to work exclusively with observed (real) data and to avoid model misspecification bias. This work investigates their “incompleteness” with respect to both the theoretical formalization and the strategy for validating imputation goodness. We propose new combinations of non-default distance functions and “hot deck” techniques, analyse how these combinations perform in different donor-recipient dataset scenarios, and develop a robust, recursive strategy for validating imputation goodness.
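
As an illustration only, the following minimal sketch shows what a distance-based "hot deck" imputation can look like: each recipient record receives the observed value of its nearest donor on the shared matching variables. It is not the authors' procedure. The function names (`manhattan`, `chebyshev`, `nn_hot_deck`), the toy data, and the choice of these two distance functions are all assumptions made for the example.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def manhattan(a: Vector, b: Vector) -> float:
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: Vector, b: Vector) -> float:
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def nn_hot_deck(
    recipient_x: Sequence[Vector],
    donor_x: Sequence[Vector],
    donor_z: Sequence[float],
    distance: Callable[[Vector, Vector], float] = manhattan,
) -> List[float]:
    """Impute Z for each recipient from its nearest donor (hypothetical helper).

    recipient_x / donor_x hold the matching variables observed in both
    datasets; donor_z holds the variable observed only in the donor dataset.
    Ties are broken in favour of the first donor encountered.
    """
    imputed = []
    for r in recipient_x:
        nearest = min(range(len(donor_x)), key=lambda j: distance(r, donor_x[j]))
        imputed.append(donor_z[nearest])  # always an observed (real) value
    return imputed


# Toy data, invented for this example.
donor_x = [(1.0, 2.0), (4.0, 0.0), (9.0, 9.0)]
donor_z = [10.0, 20.0, 30.0]
recipient_x = [(0.0, 2.5), (8.0, 7.0), (2.4, 0.6)]

imputed_manhattan = nn_hot_deck(recipient_x, donor_x, donor_z, manhattan)
imputed_chebyshev = nn_hot_deck(recipient_x, donor_x, donor_z, chebyshev)
print(imputed_manhattan)
print(imputed_chebyshev)
```

In this toy case the third recipient is matched to a different donor under the two distance functions, which is the kind of sensitivity that motivates comparing distance and technique combinations across donor-recipient scenarios.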