
Unraveling the black box models with explainability - Explainable AI


Nirasha Nelki

Jan 30, 2025

Machine learning is a branch of artificial intelligence that enables algorithms to find hidden patterns in datasets. Having learned those patterns, the algorithms can make predictions on new, unseen data. A simple loan prediction model, for example, predicts whether a loan request will be approved or rejected based on the input data. But it never tells us why a particular request was approved or rejected. That question remains a mystery, because ML models act like black boxes.

The concept of Explainable AI (XAI) opens up this black box, allowing users to understand what is happening inside it. Explainability tells us why our loan request was rejected, and once we know the reasons, we can prepare the next request in a way that improves its chances of being approved. Explainable AI addresses the transparency issue of black-box models, and it also addresses the issue of trustworthiness. When an ML model makes a prediction, there is a question of whether it is correct, because we do not know how the model arrived at that prediction. If we integrate explainability, we can see the reasons behind each prediction.

There are two ways that we can integrate explainability.

1. Make the model transparent.

(a) Design the model architecture to be easy for users to understand, so that they can follow the model's decision-making path manually. If the model is simple enough, this path can be traced by hand.

(b) Examples include linear regression, logistic regression, rule-based models, and decision trees (a minimal decision-tree sketch follows after this list).

2. Integrate explainability separately (post-hoc explanations).

(a) If the model architecture is complicated, it is hard to make it transparent. Instead, we can attach a separate explainability method to complex models such as deep neural networks.

(b) Post-hoc Techniques

  • (i) Feature importance: identifies how much each feature affects the model's output.
  • (ii) Rule-based learning: extracts insights from the model's behaviour and translates them into human-readable rules.
  • (iii) LIME: fits a simple surrogate model on perturbed copies of a single instance to explain how that instance was predicted (a hand-rolled version is sketched after this list).
  • (iv) SHAP: an explanation technique based on game theory. It assigns each feature a "SHAP value" that shows how much that feature contributed to the model's output.
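
To make the first approach concrete, here is a minimal sketch of a transparent model: a shallow decision tree whose printed rules can be traced by hand. The feature names, the synthetic loan data, and the approval rule used to create labels are all illustrative assumptions made for the demo, not a real dataset.

```python
# A minimal sketch of a transparent model: a shallow decision tree whose
# rules can be printed and traced by hand. The feature names, synthetic
# data, and approval rule below are illustrative assumptions only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
feature_names = ["income", "credit_score", "loan_amount"]

# Synthetic loan applications and a toy approval rule, used only for the demo.
X = rng.normal([50_000, 650, 20_000], [15_000, 80, 8_000], size=(500, 3))
y = ((X[:, 1] > 620) & (X[:, 0] > 1.5 * X[:, 2])).astype(int)

# Keeping the tree shallow is what keeps it readable.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The printed rules *are* the explanation: anyone can trace an application
# from the root to a leaf and see why it was approved or rejected.
print(export_text(tree, feature_names=feature_names))
```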
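
To make the post-hoc approach concrete as well, here is a hand-rolled sketch of the LIME idea (not the actual lime library, whose API differs): perturb one instance, query the black-box model, and fit a small weighted linear surrogate whose coefficients explain that single prediction. Again, the dataset, feature names, and approval rule are purely illustrative assumptions.

```python
# A hand-rolled sketch of the LIME idea (not the actual `lime` library):
# perturb one instance, query the black-box model, and fit a weighted
# linear surrogate whose coefficients explain that single prediction.
# The data, feature names, and approval rule are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
feature_names = ["income", "credit_score", "loan_amount"]
X = rng.normal([50_000, 650, 20_000], [15_000, 80, 8_000], size=(500, 3))
y = ((X[:, 1] > 620) & (X[:, 0] > 1.5 * X[:, 2])).astype(int)

# The "black box": an ensemble model whose internals are hard to read directly.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# The single loan application we want explained.
instance = X[0]

# 1. Perturb the instance by adding noise around it.
perturbed = instance + rng.normal(0, X.std(axis=0) * 0.5, size=(1000, 3))

# 2. Ask the black box for its predicted approval probability on each copy.
probs = black_box.predict_proba(perturbed)[:, 1]

# 3. Weight the perturbed samples by how close they are to the original.
distances = np.linalg.norm((perturbed - instance) / X.std(axis=0), axis=1)
weights = np.exp(-(distances ** 2))

# 4. Fit a simple, interpretable surrogate on this local neighbourhood.
surrogate = Ridge(alpha=1.0).fit(perturbed, probs, sample_weight=weights)

# The surrogate's coefficients are the local explanation: how strongly each
# feature pushed this particular application toward approval or rejection.
for name, coef in zip(feature_names, surrogate.coef_):
    print(f"{name}: {coef:+.6f}")
```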

Types of Explanations

There are two ways to generate explanations:

1. Local explanations: explain how a single, given instance was predicted.

2. Global explanations: describe the overall behaviour of the model, for example which features most strongly affect its predictions and what biases the model has.

By combining global and local explanations, we can get a better understanding of how an ML model arrives at a particular decision.
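
As a rough sketch of the difference: the surrogate example above is a local explanation (it explains one application), while permutation importance, available as permutation_importance in scikit-learn's sklearn.inspection module, is a simple global explanation that measures how much the model's overall accuracy drops when each feature is shuffled. The data below is the same illustrative synthetic loan data as before.

```python
# A minimal sketch of a *global* explanation using permutation importance
# (scikit-learn's sklearn.inspection.permutation_importance). The data,
# feature names, and approval rule are the same illustrative assumptions
# as in the earlier sketches.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
feature_names = ["income", "credit_score", "loan_amount"]
X = rng.normal([50_000, 650, 20_000], [15_000, 80, 8_000], size=(500, 3))
y = ((X[:, 1] > 620) & (X[:, 0] > 1.5 * X[:, 2])).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and record the drop in test accuracy:
# a large drop means the model relies heavily on that feature overall.
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)
for name, mean_drop in zip(feature_names, result.importances_mean):
    print(f"{name}: {mean_drop:.3f}")
```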

Different XAI techniques can be applied to textual, image, and audio data, and a variety of XAI techniques have also been developed for different architectures (transformers, convolutional neural networks, and other deep neural networks).
