Evaluation of ensemble methods in imbalanced regression tasks

Moniz, Nuno Miguel; Branco, Paula Oliveira; Torgo, Luís

Abstract

Ensemble methods are well known for providing an advantage over single models in a wide range of data mining and machine learning tasks. Their benefits are commonly associated with the ability to reduce bias and/or variance in learning tasks. Ensembles have been studied for both classification and regression tasks with uniform domain preferences. However, in imbalanced domains, these methods have been thoroughly studied only for classification. In this paper we present an empirical study of the predictive ability of the ensemble methods bagging and boosting in regression tasks, using 20 data sets with imbalanced distributions and assuming non-uniform domain preferences. Results show that ensemble methods can improve predictive ability on under-represented values, and that this improvement influences the predictive ability of models concerning the average behaviour of the data. Results also show that smaller data sets are prone to larger improvements in predictive accuracy, and that no conclusion could be drawn when considering the percentage of rare cases alone.

Publication
Proceedings of Machine Learning Research, First International Workshop on Learning with Imbalanced Domains: Theory and Applications. (2017)
Paula Branco
Assistant Professor

I’m an Assistant Professor at EECS, University of Ottawa. My research interests include Artificial Intelligence, Machine Learning, Imbalanced Domains, Outlier Detection, Anomaly Detection, Fraud Detection and Cybersecurity.