Confidence Ensembles: Tabular Data Classifiers on Steroids
Zoppi, T. & Popov, P. ORCID: 0000-0002-3434-5272 (2025).
Confidence Ensembles: Tabular Data Classifiers on Steroids.
Information Fusion,
article number 103126.
doi: 10.1016/j.inffus.2025.103126
Abstract
The astounding amount of research conducted in the last decades provided plenty of Machine Learning (ML) algorithms and models for solving a wide variety of tasks for tabular data. However, classifiers are not always fast, accurate, and robust to unknown inputs, calling for further research in the domain. This paper proposes two classifiers based on confidence ensembles: Confidence Bagging (ConfBag) and Confidence Boosting (ConfBoost). Confidence ensembles build upon a base estimator and create base learners relying on the concept of “confidence” in predictions. They apply to any classification problem: binary and multi-class, supervised or unsupervised, without requiring additional data with respect to those already required by the base estimator. Our experimental evaluation using a range of tabular datasets shows that confidence ensembles, and especially ConfBoost, i) build more accurate classifiers than base estimators alone, even using a limited amount of base learners, ii) are relatively easy to tune as they rely on a limited number of hyper-parameters, and iii) are significantly more robust when dealing with unknown, unexpected input data compared to other tabular data classifiers. Amongst others, confidence ensembles showed potential in going beyond the performance of de-facto standard classifiers for tabular data such as Random Forest and eXtreme Gradient Boosting. ConfBag and ConfBoost are publicly available as PyPI package, compliant with widely used Python frameworks such as scikit-learn and pyod, and require little to no tuning to be exercised on tabular datasets for classification tasks.
Publication Type: | Article |
---|---|
Additional Information: | This article is available under the Creative Commons CC-BY-NC-ND license and permits non-commercial use of the work as published, without adaptation or alteration provided the work is fully attributed. |
Publisher Keywords: | Confidence Ensembles, Classification Confidence, Ensemble Learning, Robust Classification, Tabular Data, Machine Learning |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Departments: | School of Science & Technology School of Science & Technology > Computer Science School of Science & Technology > Computer Science > Software Reliability |
SWORD Depositor: |
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Download (1MB) | Preview
Export
Downloads
Downloads per month over past year