Open Conference Systems, ITACOSM 2019 - Survey and Data Science

On bootstrap for complex sampling designs via sampling scheme modification
Tomasz Żądło

Building: Learning Center Morgagni
Room: Aula 209
Date: 2019-06-05 04:20 PM – 06:00 PM
Last modified: 2019-05-23


According to Ranalli and Mecatti (2012), the majority of bootstrap methods for complex sampling designs are developed within one of two approaches: the ad-hoc approach and the plug-in approach. In the first approach, the original sample is re-sampled, usually using iid re-sampling and data re-scaling. However, they also classify Antal and Tillé (2011), where non-iid re-sampling is used and no re-scaling is applied, as belonging to this approach. The plug-in approach, on the other hand, is based on the idea of a bootstrap pseudopopulation and on mimicking the original sampling design. In this approach, the bootstrap pseudopopulation is first constructed by replicating each sampled element a number of times equal to the inverse of its first-order inclusion probability, rounded to an integer using one of several methods (e.g. Holmberg (1998) and Barbiero and Mecatti (2010)). A generalization of this idea is presented in Barbiero, Manzi and Mecatti (2015), where replication is based on rounded calibration weights. Second, bootstrap samples are drawn from the bootstrap pseudopopulation using a sampling design that mimics the original one.
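The two steps of the plug-in approach can be illustrated with a minimal Python sketch. It is not taken from any of the cited papers: the function names, the randomized rounding rule (round up with probability equal to the fractional part of the weight), and the choice of simple random sampling without replacement as the original design are all assumptions made only to keep the example short.

```python
import numpy as np

def build_pseudopopulation(y, pi, rng):
    """Replicate each sampled value about 1/pi_i times (hypothetical helper).

    Assumption: non-integer weights are rounded at random, which is only
    one of several possible rounding rules.
    """
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(pi, dtype=float)
    base = np.floor(w).astype(int)
    # Round up with probability equal to the fractional part of the weight.
    extra = rng.random(len(w)) < (w - base)
    return np.repeat(y, base + extra)

def plug_in_bootstrap_variance(y, pi, n_boot=1000, seed=0):
    """Bootstrap variance of the expansion estimator of a total.

    Assumption: the original design is simple random sampling without
    replacement, so the mimicking design is the same design applied to
    the pseudopopulation.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    pseudo = build_pseudopopulation(y, pi, rng)
    n_star = len(pseudo)
    totals = np.empty(n_boot)
    for b in range(n_boot):
        # Step 2: draw n elements without replacement from the pseudopopulation.
        boot = rng.choice(pseudo, size=n, replace=False)
        totals[b] = n_star / n * boot.sum()
    return totals.var(ddof=1)
```

The pseudopopulation is built explicitly here, which is exactly the potentially time-consuming step when the inverse inclusion probabilities are large.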

Ranalli and Mecatti (2012) solved the problem of the usually time-consuming construction of the bootstrap pseudopopulation (where each sampled element is replicated a number of times equal to the rounded inverse of its first-order inclusion probability) by presenting designs for re-sampling from the original sample that mimic the process of sampling from the bootstrap pseudopopulation. In our approach we also mimic the process of sampling from the bootstrap pseudopopulation, but (i) the pseudopopulation is not physically constructed, (ii) non-integer numbers of replications are allowed, and (iii) the numbers of replications do not have to equal the inverses of first-order inclusion probabilities. Our approach is based on a modification of the original sampling scheme. The considerations are supported by simulation studies based on real data.
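To see why the pseudopopulation need not be built physically, consider the simplest special case, sketched below in Python. This is not the sampling scheme modification proposed in the talk, only an illustration under a stated assumption: if elements are drawn with replacement from a pseudopopulation in which element i appears r_i times, each draw selects element i with probability r_i divided by the sum of all r_i, so the same draws can be made directly from the original sample, and r_i may then be any positive number. The function name and arguments are hypothetical.

```python
import numpy as np

def resample_without_pseudopopulation(y, reps, size, rng):
    """Draw with replacement as if from a pseudopopulation that is never built.

    Assumption: with-replacement draws only. `reps` holds the (possibly
    non-integer) numbers of replications of each sampled element.
    """
    y = np.asarray(y, dtype=float)
    reps = np.asarray(reps, dtype=float)
    # Selection probability of each original element in a single draw.
    p = reps / reps.sum()
    idx = rng.choice(len(y), size=size, replace=True, p=p)
    return y[idx]
```

For example, `resample_without_pseudopopulation([10.0, 20.0, 30.0], [2.5, 1.2, 4.3], size=3, rng=np.random.default_rng(0))` draws three values using non-integer replication numbers that no physical pseudopopulation could represent.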