Open Conference Systems, 50th Scientific meeting of the Italian Statistical Society

Using web scraping techniques to derive co-authorship data: insights from a case study
Domenico De Stefano, Vittorio Fuccella, Maria Prosperina Vitale, Susanna Zaccarin

Last modified: 2018-06-04


This contribution discusses the first results of applying web scraping procedures to derive co-authorship data among scholars. A semi-automatic tool is adopted to retrieve publication metadata from a single platform introduced to manage and support research products in Italian academic and research institutions. The co-authorship relationships among Italian academic statisticians will be used as a basis for retrieving updated collaboration patterns in this scientific community.
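
As an illustration of the last step described above, the following minimal Python sketch shows one way publication metadata could be turned into weighted co-authorship ties. It is not the authors' tool: the record layout (a dictionary with "title" and "authors" keys), the function name coauthorship_edges, and the sample records are all hypothetical, and the scraping step itself is assumed to have already produced the records.

```python
from collections import Counter
from itertools import combinations


def coauthorship_edges(publications):
    """Count joint publications for every unordered pair of authors.

    `publications` is assumed to be an iterable of dicts with an
    "authors" list (a hypothetical layout, not the platform's format).
    """
    edges = Counter()
    for record in publications:
        # Deduplicate and sort so each pair has one canonical orientation.
        authors = sorted(set(record["authors"]))
        for pair in combinations(authors, 2):
            edges[pair] += 1
    return edges


# Hypothetical records standing in for scraped publication metadata.
publications = [
    {"title": "Paper A", "authors": ["Author 1", "Author 2", "Author 3"]},
    {"title": "Paper B", "authors": ["Author 2", "Author 1"]},
    {"title": "Paper C", "authors": ["Author 4"]},
]

edges = coauthorship_edges(publications)
for (first, second), weight in sorted(edges.items()):
    print(first, second, weight)
```

Each key of the resulting counter is an author pair and each value is the number of publications the two share, so single-author records contribute no ties. In practice, author names would need to be disambiguated before counting, which this sketch does not attempt.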


