Estimating the number of unseen species under heavy tails
Marco Battiston, Federico Camerlenghi, Emanuele Dolera, Stefano Favaro

Species sampling problems arise in several scientific disciplines. Given an initial sample of size n, a central problem is to estimate the number of new species that will be observed in an additional sample of size ln, with l > 0. The case l < 1 was successfully tackled by Good (1953) and by Good & Toulmin (1956), but the more interesting case l > 1 has been addressed only recently, by Orlitsky et al. (2017). We show that their solution is unsatisfactory when the species’ proportions have regularly varying heavy tails. Under this assumption, we propose an alternative estimator for the number of new species and assess its performance empirically.
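For orientation, the classical Good & Toulmin (1956) estimator mentioned above can be written as U = -sum_{i>=1} (-l)^i Phi_i, where Phi_i is the number of species observed exactly i times in the initial sample. The following is a minimal sketch of that classical estimator only. It is not the estimator proposed in this paper, nor the smoothed version of Orlitsky et al. (2017). The function name `good_toulmin` and the toy sample are illustrative assumptions.

```python
from collections import Counter


def good_toulmin(sample, l):
    """Classical Good-Toulmin estimate of the number of new species
    expected in an additional sample of size l * len(sample).

    Illustrative sketch only: the alternating series is known to be
    well behaved for l <= 1 and can become unstable for l > 1.
    """
    # Frequency of each observed species in the initial sample.
    species_counts = Counter(sample)
    # Phi_i: number of species seen exactly i times.
    prevalences = Counter(species_counts.values())
    # U = -sum_{i >= 1} (-l)^i * Phi_i
    return -sum((-l) ** i * phi for i, phi in prevalences.items())


# Hypothetical toy sample: Phi_1 = 2, Phi_2 = 1, Phi_3 = 1.
sample = ["a", "a", "b", "c", "c", "c", "d"]
estimate = good_toulmin(sample, 0.5)
```

Because the terms scale like l^i, rare high-frequency species can dominate the sum when l > 1, which is the regime where a different approach is needed.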

