City Research Online

Interpreting Deep Learning Models with Marginal Attribution by Conditioning on Quantiles

Merz, M., Richman, R., Tsanakas, A. ORCID: 0000-0003-4552-5532 & Wüthrich, M. (2022). Interpreting Deep Learning Models with Marginal Attribution by Conditioning on Quantiles. Data Mining and Knowledge Discovery, 36(4), pp. 1335-1370. doi: 10.1007/s10618-022-00841-4

Abstract

A vast and growing literature on explaining deep learning models has emerged. This paper contributes to that literature by introducing a global gradient-based model-agnostic method, which we call Marginal Attribution by Conditioning on Quantiles (MACQ). Our approach is based on analyzing the marginal attribution of predictions (outputs) to individual features (inputs). Specifically, we consider variable importance by fixing (global) output levels, and explaining how features marginally contribute to these fixed global output levels. MACQ can be seen as a marginal attribution counterpart to approaches such as accumulated local effects (ALE), which study the sensitivities of outputs by perturbing inputs. Furthermore, MACQ allows us to separate marginal attribution of individual features from interaction effects and to visualize the 3-way relationship between marginal attribution, output level, and feature value.
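The idea of attributing outputs to features by conditioning on output quantiles can be illustrated with a minimal sketch. This is not the authors' implementation: the toy model, its analytic gradient, and the number of quantile bins are all assumptions made for illustration. The sketch computes per-sample input gradients, assigns samples to bins by the quantile of their predicted output, and averages gradients per feature within each bin.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(X):
    # toy differentiable model: f(x) = x0^2 + 0.5 * x1 (illustrative only)
    return X[:, 0] ** 2 + 0.5 * X[:, 1]

def model_grad(X):
    # analytic gradient of the toy model w.r.t. each input feature
    g = np.empty_like(X)
    g[:, 0] = 2.0 * X[:, 0]
    g[:, 1] = 0.5
    return g

def macq_profile(X, n_bins=4):
    """Mean per-feature gradient, conditioned on output-quantile bin."""
    preds = model(X)
    grads = model_grad(X)
    # quantile edges of the predicted outputs define the bins
    edges = np.quantile(preds, np.linspace(0.0, 1.0, n_bins + 1))
    # assign each sample to a bin; clip so the maximum lands in the top bin
    bins = np.clip(np.searchsorted(edges, preds, side="right") - 1, 0, n_bins - 1)
    # average gradient per feature within each output-quantile bin
    return np.array([grads[bins == b].mean(axis=0) for b in range(n_bins)])

X = rng.normal(size=(1000, 2))
profile = macq_profile(X)  # shape: (n_bins, n_features)
```

Plotting `profile` against the bin index would give a rough version of the 3-way view the abstract describes: marginal attribution versus output level, per feature.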

Publication Type: Article
Additional Information: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher Keywords: explainable AI (XAI), model-agnostic tools, deep learning, attribution, accumulated local effects (ALE), partial dependence plot (PDP), locally interpretable model-agnostic explanation (LIME), variable importance, post-hoc analysis, interaction
Subjects: H Social Sciences > HB Economic Theory
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
T Technology > T Technology (General)
Departments: Bayes Business School > Actuarial Science & Insurance
Text - Published Version: Available under License Creative Commons Attribution 4.0 International. Download (3MB).
Text - Accepted Version: Not freely accessible due to copyright restrictions; a copy may be requested from the repository.
