The Marginal Impact of Auxiliary Totals in Calibration

Alessio Guandalini; Claudio Ceccarelli

Open Conference Systems, ITACOSM 2019 - Survey and Data Science

Building: Learning Center Morgagni
Room: Aula 210
Date: 2019-06-05 04:20 PM – 06:00 PM
Last modified: 2019-05-23

Abstract

The calibration estimator (Deville and Särndal, 1992; Särndal, 2007) is widely used for deriving survey estimates. The main reasons for its spread are the need to increase the accuracy of estimates, the need to achieve consistent estimates, and the possibility of providing users with a single set of sampling weights.
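
As a minimal sketch of what calibration does, the linear (chi-square distance) case adjusts the design weights so that the weighted sample totals of the auxiliary variables match the known totals. The function name `linear_calibration`, the restriction to two auxiliary variables and the toy data are assumptions made for illustration only; they are not taken from the authors' work.

```python
def linear_calibration(d, x, totals):
    """Linear calibration for exactly two auxiliary variables (illustrative).

    d: design weights; x: list of (x1, x2) pairs; totals: known (t1, t2).
    Returns w_k = d_k * (1 + x_k . lam), chosen so that sum_k w_k x_k = totals.
    """
    # Weighted cross-product matrix sum_k d_k x_k x_k' (2 x 2, symmetric).
    a = sum(dk * xk[0] * xk[0] for dk, xk in zip(d, x))
    b = sum(dk * xk[0] * xk[1] for dk, xk in zip(d, x))
    c = sum(dk * xk[1] * xk[1] for dk, xk in zip(d, x))
    # Gap between the known totals and their design-weighted estimates.
    g0 = totals[0] - sum(dk * xk[0] for dk, xk in zip(d, x))
    g1 = totals[1] - sum(dk * xk[1] for dk, xk in zip(d, x))
    # Solve the 2 x 2 system for the Lagrange multipliers (Cramer's rule).
    det = a * c - b * b
    lam0 = (c * g0 - b * g1) / det
    lam1 = (a * g1 - b * g0) / det
    return [dk * (1 + lam0 * xk[0] + lam1 * xk[1]) for dk, xk in zip(d, x)]


# Toy sample: x1 is a constant (population size), x2 is a 0/1 indicator.
design_weights = [10.0, 10.0, 10.0, 10.0]
aux = [(1, 0), (1, 1), (1, 1), (1, 0)]
known_totals = (42.0, 25.0)  # hypothetical population totals

w = linear_calibration(design_weights, aux, known_totals)
calibrated_totals = (
    sum(wk * xk[0] for wk, xk in zip(w, aux)),
    sum(wk * xk[1] for wk, xk in zip(w, aux)),
)
print(w, calibrated_totals)
```

Every estimate computed with the single weight vector `w` is then consistent with the known totals, which is the consistency property discussed above.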

For ONSs, consistency is very important, especially as a means of promoting the credibility of published statistics. In fact, it is not unreasonable to say that, for them, increasing consistency is an almost more pressing motivation than increasing accuracy.

That motivation, combined with better availability of administrative data and registers, has resulted in a growing number of auxiliary totals being considered in calibration. This has, of course, enabled a higher level of consistency but, on the other hand, has increased convergence problems in the constrained optimization problem. Moreover, adding further auxiliary totals when many are already considered may have a merely cosmetic effect. Furthermore, in extreme cases, the overuse of auxiliary totals can even backfire on the accuracy of the estimates, because of excessive variability in the calibrated weights, especially for multipurpose surveys.
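
One common rough indicator of the cost of weight variability is Kish's approximate design effect due to unequal weighting, n Σw² / (Σw)², which equals 1 for constant weights and grows as the weights spread out. The sketch below is only an illustration of that indicator; the function name `kish_deff` and the example weights are assumptions, and the abstract does not state which measure of weight variability the authors use.

```python
def kish_deff(weights):
    """Kish's approximate design effect from unequal weighting.

    Equal to 1 + CV^2 of the weights; 1.0 means no loss from weighting.
    """
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


equal_weights = [10.0, 10.0, 10.0, 10.0]   # hypothetical design weights
spread_weights = [8.5, 12.5, 12.5, 8.5]    # hypothetical calibrated weights

print(kish_deff(equal_weights))
print(kish_deff(spread_weights))
```

Comparing this figure before and after adding an auxiliary total gives a quick, if crude, signal of whether the extra constraint is inflating the variability of the weights.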

It is well known that a wise choice of auxiliary totals can yield better estimates. A simple but useful solution is to switch to "GREG thinking" and find the best-fitting super-population model implicitly assumed between the variable of interest and the auxiliary variables. This solution is a good tool, but it provides only an indirect evaluation of the impact of the auxiliary totals on accuracy, without any hint of its magnitude.

In the present work, the Shapley decomposition method (Shapley, 1953) has been adapted to this context. The Shapley decomposition is based on the well-known concept of the Shapley value in cooperative game theory. It was introduced by Shapley to redistribute the profit generated by a coalition of players in proportion to the contribution that each player has made to the coalition. The share of the coalition's profit attributable to a given player, net of the other players, is that player's Shapley value.
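
For reference, the standard textbook definition of the Shapley value can be written as follows. The notation is an assumption of this sketch and not necessarily the one used by the authors: $N$ is the set of $n$ players, $v$ is the function giving the profit of each coalition, and $\phi_i(v)$ is the value assigned to player $i$.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(n-|S|-1)!}{n!}\,
  \bigl[\, v(S \cup \{i\}) - v(S) \,\bigr],
\qquad
\sum_{i \in N} \phi_i(v) \;=\; v(N) - v(\emptyset).
```

The first expression averages the marginal contribution of player $i$ over all orders in which the coalition can be formed; the second states that the individual values add up exactly to the total profit.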

The Shapley decomposition method can be applied in the calibration context, where changes in estimates or reductions in sampling variance are the profits and the auxiliary totals are the players. Then, with a few simple adjustments, we can determine the marginal impact of each auxiliary variable on the estimates, in terms of both efficiency and accuracy.
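
The mapping of players to auxiliary totals can be sketched as below. The auxiliary total names and the variance-reduction figures are entirely hypothetical and chosen only to make the arithmetic visible; in practice each entry would come from running the calibration with that subset of totals. The "few simple adjustments" mentioned above are not described in the abstract, so this shows only the plain Shapley value, not the authors' adapted method.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Shapley value of each player for the set function `value`."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Probability weight of a coalition of size k preceding player p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical auxiliary totals (the "players").
totals = ["sex_by_age", "region", "register_employment"]

# Hypothetical variance reduction (in percent, relative to no calibration)
# obtained when calibrating on each subset of totals (the "profit").
reduction = {
    frozenset(): 0.0,
    frozenset({"sex_by_age"}): 10.0,
    frozenset({"region"}): 4.0,
    frozenset({"register_employment"}): 20.0,
    frozenset({"sex_by_age", "region"}): 12.0,
    frozenset({"sex_by_age", "register_employment"}): 26.0,
    frozenset({"region", "register_employment"}): 22.0,
    frozenset(totals): 27.0,
}

phi = shapley_values(totals, lambda s: reduction[frozenset(s)])
for name, share in phi.items():
    print(f"{name}: {share:.2f}")
```

The shares add up to the reduction achieved with all totals together, so each one can be read as the portion of the overall gain attributable to that auxiliary total. The number of subsets doubles with every additional total, so exact enumeration of this kind is practical only for a moderate number of totals.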

This method has been applied to the quarterly estimates of the Italian Labour Force Survey from the first quarter of 2013 to the fourth quarter of 2016. In particular, the Shapley value of the auxiliary totals has been computed when estimating the main parameters of the survey, that is, the employment rate, the unemployment rate, the inactivity rate and the proportion of NEET.

The work demonstrates that the Shapley decomposition method could be a very useful tool for helping researchers evaluate and choose the auxiliary totals to be considered in calibration.