City Research Online

Confidence Ensembles: Tabular Data Classifiers on Steroids

Zoppi, T. & Popov, P. ORCID: 0000-0002-3434-5272 (2025). Confidence Ensembles: Tabular Data Classifiers on Steroids. Information Fusion, article number 103126. doi: 10.1016/j.inffus.2025.103126

Abstract

The astounding amount of research conducted in recent decades has produced plenty of Machine Learning (ML) algorithms and models for solving a wide variety of tasks on tabular data. However, classifiers are not always fast, accurate, and robust to unknown inputs, which calls for further research in the domain. This paper proposes two classifiers based on confidence ensembles: Confidence Bagging (ConfBag) and Confidence Boosting (ConfBoost). Confidence ensembles build upon a base estimator and create base learners relying on the concept of “confidence” in predictions. They apply to any classification problem, binary or multi-class, supervised or unsupervised, without requiring any data beyond what the base estimator already needs. Our experimental evaluation using a range of tabular datasets shows that confidence ensembles, and especially ConfBoost, i) build more accurate classifiers than base estimators alone, even with a small number of base learners, ii) are relatively easy to tune, as they rely on a limited number of hyper-parameters, and iii) are significantly more robust than other tabular data classifiers when dealing with unknown, unexpected input data. Among other findings, confidence ensembles showed potential to go beyond the performance of de-facto standard classifiers for tabular data such as Random Forest and eXtreme Gradient Boosting. ConfBag and ConfBoost are publicly available as a PyPI package, are compliant with widely used Python frameworks such as scikit-learn and pyod, and require little to no tuning to be applied to tabular datasets for classification tasks.
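To make the idea of combining base learners through prediction confidence more concrete, the following is a minimal sketch in scikit-learn. It is not the ConfBag or ConfBoost algorithm from the paper, whose details the abstract does not give, and it does not use the authors' PyPI package. It assumes, purely for illustration, that "confidence" is the highest class probability returned by `predict_proba`, and that the ensemble answers each sample with the prediction of its most confident learner. The function names `fit_bootstrap_learners` and `predict_most_confident` are hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_bootstrap_learners(base, X, y, n_learners=5, sample_frac=0.7, seed=0):
    """Fit copies of `base` on bootstrap samples (hypothetical helper).

    Assumes every bootstrap sample contains all classes, so that all
    learners return probability arrays with the same columns.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    size = int(sample_frac * n_samples)
    learners = []
    for _ in range(n_learners):
        idx = rng.choice(n_samples, size=size, replace=True)
        learners.append(clone(base).fit(X[idx], y[idx]))
    return learners


def predict_most_confident(learners, X):
    """For each sample, return the prediction of the most confident learner.

    Confidence is ASSUMED here to be the maximum class probability.
    """
    # Shape: (n_learners, n_samples, n_classes)
    probas = np.stack([m.predict_proba(X) for m in learners])
    confidence = probas.max(axis=2)          # (n_learners, n_samples)
    best = confidence.argmax(axis=0)         # index of the winning learner per sample
    chosen = probas[best, np.arange(X.shape[0])]  # (n_samples, n_classes)
    return learners[0].classes_[chosen.argmax(axis=1)]


# Synthetic tabular data, used only to exercise the sketch.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base = DecisionTreeClassifier(max_depth=3, random_state=0)
learners = fit_bootstrap_learners(base, X_train, y_train)
preds = predict_most_confident(learners, X_test)
print("accuracy:", (preds == y_test).mean())
```

Any scikit-learn classifier exposing `predict_proba` could stand in for the decision tree. Picking the single most confident learner is only one possible way to use confidence; averaging probabilities weighted by confidence would be another.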

Publication Type: Article
Additional Information: This article is available under the Creative Commons CC-BY-NC-ND license and permits non-commercial use of the work as published, without adaptation or alteration provided the work is fully attributed.
Publisher Keywords: Confidence Ensembles, Classification Confidence, Ensemble Learning, Robust Classification, Tabular Data, Machine Learning
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments: School of Science & Technology
School of Science & Technology > Computer Science
School of Science & Technology > Computer Science > Software Reliability
Text - Accepted Version