Open Conference Systems, ITACOSM 2019 - Survey and Data Science

The future of Statistics: challenges for understanding new phenomena in a rapidly changing world
Giorgio Alleva

Building: Learning Center Morgagni
Room: Aula Magna 327
Date: 2019-06-05 06:00 PM – 07:00 PM
Last modified: 2019-05-06

Abstract


This paper discusses the challenges for statistics and the role data plays in understanding a society’s trends and transformations, providing a country with a clear and more complete picture of society.

Future challenges and developments in official statistics and in statistical science can be presented along four main axes: data, the capabilities to manage data, methods, and data governance.

A first challenge is to play a central role in the new data ecosystem, taking the lead in guiding the correct use of data and going beyond the traditional inferential paradigm based on data collected through a probabilistic sample. We cannot do this alone. Full cooperation with substantive disciplines and users is decisive. In this paper we illustrate the approach based on integrated systems of statistical registers and the policy of smart statistics that the European Statistical System adopted in this transition phase, as well as the most significant experiences underway at Istat.

Secondly, closely connected with the challenge of understanding new data is knowing how to build new skills. The university system is experimenting with new paths in this direction. Is it possible for a single researcher to maintain sufficient expertise in both statistics and computer science to model complex problems alone? The paper argues that, alongside the figure of the data scientist, the decisive element is knowing how to foster dialogue and interaction between different communities of experts with different skills.

A third challenge is the development of methods. What are the research challenges facing the core of statistics? Many of the traditional research lines at the beginning of the new century are still valid: data analysis and reduction, sampling, approaches to inference, etc. But other issues are emerging: for instance, how to deal with the replicability and stability of findings, the high degree of heterogeneity and non-independence of data, missing and biased observations, and high-dimensional inference. The paper emphasizes the strategic role of data integration in the national statistical institutes (NSIs) for producing estimates based on multiple sources.

The fourth challenge is the governance of data: their production, processing and communication. The rapid increase in the availability and utility of data brings not only great opportunities and advantages, but also threats. The most important cross-cutting theme related to big data is privacy, which covers all aspects of the data life-cycle. The NSIs have studied disclosure limitation for many decades, and innovation is expected in combining statistics with encryption to ensure privacy, as well as in anonymization schemes and methods to analyse anonymized data. More recently, privacy issues have expanded to include concerns about two crucial statistical tools/products: the informative value of an integrated system of statistical registers, and algorithmic decision fairness, particularly with deep learning algorithms, which can be quite opaque even to their developers.

Society can benefit greatly from new informative infrastructures and automated decisions, but greater scrutiny and transparency are crucial. For the future of our disciplines we must contribute significantly to an informed public discourse; otherwise abuse and misuse of data can generate strong public mistrust.


Full Text: SLIDES