Towards interpretable-by-design deep learning algorithms

Computing and Communications

Associated organisational units

Electronic data

2311.11396v1
12.7 MB, PDF document

Research output: Working paper › Preprint

Published

Publication date	19/11/2023
Publisher	arXiv
Original language	English

Abstract

The proposed framework, named IDEAL (Interpretable-by-design DEep learning ALgorithms), recasts the standard supervised classification problem into a function of similarity to a set of prototypes derived from the training data, while taking advantage of the existing latent spaces of large neural networks forming so-called Foundation Models (FM). This addresses the issue of explainability (stage B) while retaining the benefits of the tremendous achievements offered by DL models (e.g., vision transformers, ViT) pre-trained on huge data sets such as IG-3.6B + ImageNet-1K or LVD-142M (stage A). We show that one can turn such DL models into conceptually simpler, explainable-through-prototypes ones. The key findings can be summarized as follows: (1) the proposed models are interpretable through prototypes, mitigating the issue of confounded interpretations; (2) the proposed IDEAL framework circumvents the issue of catastrophic forgetting, allowing efficient class-incremental learning; and (3) the proposed IDEAL approach demonstrates that ViT architectures narrow the gap between finetuned and non-finetuned models, allowing for transfer learning in a fraction of the time without finetuning of the feature space on a target dataset with iterative supervised methods.
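
The general idea of classifying by similarity to prototypes in a fixed latent space can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that feature vectors have already been extracted by a frozen pre-trained model, that each class is represented by a single prototype equal to the class mean (the abstract does not state how prototypes are selected), and that similarity is measured by squared Euclidean distance. The function names fit_prototypes, predict, and add_class are hypothetical.

```python
import numpy as np


def fit_prototypes(features, labels):
    """Build one prototype per class as the mean feature vector (an assumption)."""
    classes = np.unique(labels)
    prototypes = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return prototypes, classes


def predict(features, prototypes, prototype_labels):
    """Assign each sample the label of its nearest prototype."""
    # Pairwise squared Euclidean distances, shape (n_samples, n_prototypes).
    diffs = features[:, None, :] - prototypes[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    nearest = dists.argmin(axis=1)
    # Returning the prototype index lets a caller show which prototype
    # was responsible for the decision.
    return prototype_labels[nearest], nearest


def add_class(prototypes, prototype_labels, new_features, new_label):
    """Add a class by appending its prototype; existing prototypes are untouched."""
    new_proto = new_features.mean(axis=0, keepdims=True)
    return np.vstack([prototypes, new_proto]), np.append(prototype_labels, new_label)


# Toy 2-D "embeddings" standing in for features from a frozen model.
train_x = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
train_y = np.array([0, 0, 1, 1])

protos, proto_labels = fit_prototypes(train_x, train_y)
pred, which = predict(np.array([[1.0, 1.0], [9.0, 9.0]]), protos, proto_labels)

# Class-incremental step: a third class arrives later.
protos2, proto_labels2 = add_class(
    protos, proto_labels, np.array([[-10.0, 0.0], [-10.0, 2.0]]), 2
)
pred2, _ = predict(np.array([[-9.0, 1.0], [1.0, 1.0]]), protos2, proto_labels2)
```

In this sketch, adding a class only appends a row to the prototype matrix, so nothing learned about earlier classes is overwritten, and each prediction can be traced back to the specific prototype that produced it.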
