Open Conference Systems, CLADAG2023

A NEW ALTERNATIVE METHOD FOR SIMULTANEOUS FEATURE SELECTION AND DETERMINATION OF INFLUENTIAL DATA POINTS IN COX MODEL WITH HIGH DIMENSIONAL DATASET
Nuriye Sancar, Deniz Inan

Last modified: 2023-07-06

Abstract


In high-dimensional data analysis, problems arise when the number of features exceeds the number of data points. As in any modeling process with such data, Cox modeling requires accurately identifying a subset of the features to reduce the dimensionality. The explanatory variables in high-dimensional data are usually severely inter-correlated, which creates a multicollinearity problem. Numerous penalized techniques for the Cox model with high-dimensional data have been developed to handle multicollinearity and decrease variability. The adaptive elastic net is one such penalized method for feature selection; it both exhibits the grouping effect and has the oracle property. Another important issue in the Cox modeling process is the presence of influential data points in the data set. In high-dimensional data, the influence of individual data points is more pronounced because the number of features is higher than the number of data points. If influential data points are present, selecting features without accounting for them will produce erroneous findings. Feature selection and the determination of influential data points therefore cannot be treated as separate problems. Accordingly, this study introduces a metaheuristic optimization-based adaptive elastic-net approach for the Cox model, with a suitable objective function, for simultaneously determining influential data points and selecting features. The introduced method has been evaluated in extensive simulation studies, in which it is compared with different methods using various evaluation criteria under different scenarios.
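
For readers unfamiliar with the penalized criterion the abstract refers to, the following is a minimal sketch of a standard adaptive elastic-net penalized Cox objective: the negative log partial likelihood (Breslow form) plus a weighted L1 term and an L2 term. It is not the objective function or the metaheuristic search used in the study, neither of which is specified in the abstract. The function names, the arguments `lam1`, `lam2` and `weights`, and the absence of any sample-size scaling are assumptions made for illustration only.

```python
import numpy as np

def neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox log partial likelihood (Breslow form for tied times)."""
    eta = X @ beta  # linear predictor for each subject
    # at_risk[i, j] is True when subject j is still at risk at subject i's time
    at_risk = time[None, :] >= time[:, None]
    masked = np.where(at_risk, eta[None, :], -np.inf)
    # log of the summed exp(eta) over each risk set, computed stably
    log_risk = np.logaddexp.reduce(masked, axis=1)
    # only observed events (event == 1) contribute to the partial likelihood
    return -np.sum((eta - log_risk)[event == 1])

def adaptive_enet_cox_objective(beta, X, time, event, lam1, lam2, weights):
    """Illustrative penalized objective (hypothetical helper, not the authors')."""
    l1_term = lam1 * np.sum(weights * np.abs(beta))  # adaptive (weighted) lasso part
    l2_term = lam2 * np.sum(beta ** 2)               # ridge part
    return neg_log_partial_likelihood(beta, X, time, event) + l1_term + l2_term

# Tiny synthetic example (made-up numbers, for illustration only)
X = np.array([[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8], [0.0, 0.0]])
time = np.array([1.0, 2.0, 3.0, 4.0])
event = np.array([1, 1, 1, 1])
beta = np.array([0.4, -0.2])
# Adaptive weights are commonly taken from an initial estimate; values assumed here
weights = np.array([1.0, 2.0])
value = adaptive_enet_cox_objective(beta, X, time, event, 0.1, 0.05, weights)
```

In this form, the weighted L1 term is what drives individual coefficients to exactly zero (feature selection), while the L2 term stabilizes the estimates when features are strongly correlated.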