Keywords:
High dimension
Small sample size
Meta-classification
Ensemble classifier
Microarray data
Abstract:
Classification with a small sample size (a limited number of samples) and high dimensionality is a challenging problem in both machine learning and medicine because there is a wide variety of possible modeling approaches, and it is not always clear which one is optimal for a given prediction task. Modeling choices include feature selection (dimensionality reduction), classification algorithms, and ensemble selection, and these can be combined in many ways. Previous work has shown that evolutionary computation is useful for building an ensemble from pairs of feature selection and classification algorithms. However, evolutionary computation has several parameters that must be set, and the optimization is computationally expensive. In this paper, we attempt to improve on this approach by adopting meta-classification with the farthest-first clustering algorithm. The effectiveness and accuracy of our method are validated by experiments on four publicly available real microarray datasets (colon, breast, prostate, and lymphoma cancers). The results confirm that the proposed method outperforms individual classifiers and other alternatives (a standard genetic algorithm and methods from the literature).
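
For readers unfamiliar with farthest-first clustering, the following is a minimal sketch of the generic farthest-first traversal: start from one point, then repeatedly pick the point farthest from all centers chosen so far, and finally assign every point to its nearest center. The abstract does not say what is clustered or how the clusters feed the meta-classifier, so the function names, the `first` parameter, and the toy 2-D points are all illustrative assumptions, not the authors' implementation.

```python
import math


def farthest_first_centers(points, k, first=0):
    """Return indices of k centers chosen by farthest-first traversal.

    `first` (hypothetical parameter) is the index of the starting center;
    implementations often pick it at random.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [first]
    # min_dist[i] is the distance from point i to its nearest chosen center.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from all current centers.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        centers.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return centers


def assign_clusters(points, centers):
    """Label each point with the position (in `centers`) of its nearest center."""
    return [
        min(range(len(centers)), key=lambda c: math.dist(p, points[centers[c]]))
        for p in points
    ]


# Toy data (illustrative only): two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
centers = farthest_first_centers(points, k=3)
labels = assign_clusters(points, centers)
print(centers)  # [0, 3, 4]
print(labels)   # [0, 0, 1, 1, 2]
```

Unlike an evolutionary search, this procedure has essentially one parameter (the number of clusters) and runs in a single greedy pass, which is consistent with the motivation stated in the abstract.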