Open Conference Systems, 50th Scientific meeting of the Italian Statistical Society

On the estimation of high-dimensional regression models with binary covariates
Valentina Mameli, Debora Slanzi, Irene Poli

Last modified: 2018-05-18

Abstract


In this paper we address the problem of estimating the parameters of high-dimensional regression models whose covariates are binary. We propose a new procedure that combines a specific clustering of the binary covariates with group penalized regression to estimate the model parameters. A simulation study shows that the methodology performs well.
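
The two-stage idea can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm or the penalty, so everything below is an assumption made for illustration: columns are grouped by a greedy leader rule on normalized Hamming distance, and the coefficients are estimated with a group lasso penalty fitted by proximal gradient descent. The function names `cluster_binary_columns` and `group_lasso`, the threshold, and the simulated data are hypothetical and are not taken from the paper.

```python
import numpy as np

def cluster_binary_columns(X, threshold=0.2):
    """Greedy leader clustering of columns (an illustrative choice, not the paper's).

    A column joins the first cluster whose leader column differs from it
    in at most `threshold` of the rows; otherwise it starts a new cluster.
    """
    p = X.shape[1]
    leaders = []
    labels = np.empty(p, dtype=int)
    for j in range(p):
        for g, lead in enumerate(leaders):
            if np.mean(X[:, j] != X[:, lead]) <= threshold:
                labels[j] = g
                break
        else:
            leaders.append(j)
            labels[j] = len(leaders) - 1
    return labels

def group_lasso(X, y, labels, lam, n_iter=500):
    """Proximal gradient for (1/2n)||y - Xb||^2 + lam * sum_g sqrt(p_g) * ||b_g||_2."""
    n, p = X.shape
    beta = np.zeros(p)
    # Step size 1/L, where L = ||X||_2^2 / n bounds the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2)
    groups = [np.flatnonzero(labels == g) for g in np.unique(labels)]
    for _ in range(n_iter):
        z = beta - step * (X.T @ (X @ beta - y) / n)
        for idx in groups:
            norm = np.linalg.norm(z[idx])
            thresh = step * lam * np.sqrt(len(idx))
            # Group soft-thresholding: shrink the whole block or set it to zero.
            z[idx] = 0.0 if norm <= thresh else (1.0 - thresh / norm) * z[idx]
        beta = z
    return beta

# Simulated data (hypothetical): 5 blocks of 4 near-duplicate binary covariates.
rng = np.random.default_rng(0)
n, n_groups, per_group = 200, 5, 4
base = rng.integers(0, 2, size=(n, n_groups))
X = np.repeat(base, per_group, axis=1)
flip = rng.random(X.shape) < 0.05          # flip 5% of the entries
X = np.where(flip, 1 - X, X)
beta_true = np.zeros(n_groups * per_group)
beta_true[:per_group] = 2.0                # only the first block is active
y = X @ beta_true + rng.normal(scale=0.5, size=n)

labels = cluster_binary_columns(X)
Xc, yc = X - X.mean(axis=0), y - y.mean()  # centering stands in for an intercept
beta_hat = group_lasso(Xc, yc, labels, lam=0.3)
```

In this sketch the penalty acts on whole clusters, so covariates that carry nearly the same binary pattern enter or leave the model together. Other choices for either stage would fit the same template.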


