Explaining the predictions of neural networks (NNs) is an active research area. Because NNs behave as black boxes, we often know very little about how they arrive at their decisions. Understanding them is nevertheless of great importance, especially in real-world applications such as self-driving cars, where assuring the reliability and robustness of the model is crucial.
In our previous post, we described a method called Class Activation Mapping (CAM), which highlights the pixels of an input image that influence its assignment to a certain class. In today’s post, we present another method for finding out which pixels are relevant to the prediction, namely layer-wise relevance propagation (LRP) [1].
Layer-wise relevance propagation
The main idea behind the LRP algorithm lies in tracing back the contributions of input nodes to the final prediction.
First, the relevance score of the selected node in the last layer is set to its output. Next, the relevance is propagated back towards the input layer using a redistribution rule. The basic redistribution rule (the so-called LRP-Z) is:
$$R_j = \sum_k \frac{z_{jk}}{\sum_{j'} z_{j'k}} R_k$$
where $j$ and $k$ denote neurons in two consecutive layers, $R_j$ and $R_k$ are their relevance scores, $z_{jk} = a_j w_{jk}$ is the activation of neuron $j$ multiplied by the weight between neuron $j$ and neuron $k$, and the sum in the denominator runs over all neurons $j'$ of the lower layer.
The process of propagating back the relevance values is presented schematically below:
As a result of the LRP algorithm, the prediction is decomposed into pixel-wise relevance scores that indicate how much each pixel contributes to the final decision.
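To make the backward pass more concrete, here is a minimal NumPy sketch of the basic rule applied to a tiny network. Everything in it is an illustrative assumption rather than code from any LRP library: the function name `lrp_z_layer`, the bias-free two-layer ReLU network, the hand-picked weights, and the choice to assign a zero share when a denominator is exactly zero.

```python
import numpy as np

def lrp_z_layer(a, W, R_upper):
    """Redistribute relevance from an upper layer to the layer below (basic rule).

    a       : activations of the lower layer, shape (n_lower,)
    W       : weights between the layers, shape (n_lower, n_upper)
    R_upper : relevance scores of the upper layer, shape (n_upper,)
    """
    z = a[:, None] * W                 # z_jk = a_j * w_jk
    denom = z.sum(axis=0)              # sum over j of z_jk, one value per neuron k
    # Assumption of this sketch: a zero denominator yields a zero share.
    share = np.divide(z, denom, out=np.zeros_like(z), where=denom != 0)
    return share @ R_upper             # R_j = sum_k share_jk * R_k

# Hypothetical toy network: 3 inputs -> 2 hidden ReLU units -> 2 outputs, no biases.
x = np.array([1.0, 2.0, 0.5])
W1 = np.array([[1.0, -1.0],
               [0.5,  1.0],
               [-2.0, 0.0]])
W2 = np.array([[2.0, -1.0],
               [1.0,  0.5]])

# Forward pass.
h = np.maximum(0.0, x @ W1)            # hidden activations
out = h @ W2                           # output scores

# Start from the predicted class: its relevance is set to its output.
k = int(np.argmax(out))
R_out = np.zeros_like(out)
R_out[k] = out[k]

# Backward pass: propagate relevance layer by layer down to the inputs.
R_h = lrp_z_layer(h, W2, R_out)
R_x = lrp_z_layer(x, W1, R_h)

print("output score:     ", out[k])
print("hidden relevances:", R_h, "sum =", R_h.sum())
print("input relevances: ", R_x, "sum =", R_x.sum())
```

In this toy setting the input relevances come out as roughly 1, 4 and -2, and they add up to the output score of 3, so the prediction is fully decomposed over the inputs. In an image model the same procedure would run through every layer until it reaches the pixels.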
In addition to the basic rule, many other, more robust redistribution rules have been proposed, for example LRP-ε, which adds a small positive term ε to the denominator, or LRP-γ, which favors the effect of positive contributions over negative ones (see ref. [2] for more information). While multiple versions of the redistribution rule exist, they all share the conservation principle, which says that the relevance assigned to the output is conserved from layer to layer or, in other words, that the sum of the neurons’ relevance scores is the same in all layers.
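The effect of these variants can be illustrated on a single layer. The sketch below is a hedged illustration and not the implementation of any particular library: the function `lrp_layer`, its `eps` and `gamma` parameters and the toy numbers are all assumptions made for this example. It adds ε directly to the denominator, as described above (some implementations instead add ε with the sign of the denominator), and models the γ variant by increasing positive weights by a factor of (1 + γ).

```python
import numpy as np

def lrp_layer(a, W, R_upper, eps=0.0, gamma=0.0):
    """One backward LRP step through a bias-free layer (illustrative sketch).

    eps   > 0 : adds a small term to the denominator (epsilon variant)
    gamma > 0 : boosts positive weights, favoring positive contributions
    With eps = 0 and gamma = 0 this reduces to the basic rule.
    """
    W_mod = W + gamma * np.clip(W, 0.0, None)   # w_jk + gamma * max(w_jk, 0)
    z = a[:, None] * W_mod                      # modified contributions z_jk
    denom = z.sum(axis=0) + eps                 # one denominator per upper neuron k
    # Assumption of this sketch: a zero denominator yields a zero share.
    share = np.divide(z, denom, out=np.zeros_like(z), where=denom != 0)
    return share @ R_upper

# Hypothetical toy layer: 3 lower neurons, 2 upper neurons.
a = np.array([1.0, 2.0, 0.5])
W = np.array([[1.0, -1.0],
              [0.5,  1.0],
              [-2.0, 0.0]])
R_upper = np.array([2.0, 1.0])

R_basic = lrp_layer(a, W, R_upper)
R_eps = lrp_layer(a, W, R_upper, eps=0.01)
R_gamma = lrp_layer(a, W, R_upper, gamma=1.0)

for name, R in [("basic", R_basic), ("epsilon", R_eps), ("gamma", R_gamma)]:
    print(f"{name:8s} relevances = {R}, sum = {R.sum():.4f}")
```

In this example the basic and γ variants redistribute exactly the relevance they receive (the sums equal 3), and the γ variant shrinks the negative score of the third neuron. With ε > 0 a small part of the relevance is absorbed by the extra term in the denominator, so the sum ends up slightly below 3; in this sketch conservation therefore holds only approximately for the ε variant.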
Below you can see the relevance maps produced for an image recognized as a Siberian husky using five different redistribution rules available in [3], as well as the heatmap generated with the CAM technique that we discussed in more detail in our previous blog post.
As you can see in the pictures above, the heatmaps differ from each other depending on what principle was used to calculate relevance scores and what method was applied (LRP or CAM). However, in all pictures, the dog’s head is clearly highlighted, which indicates that this part of the image is relevant for the classification result.
The LRP method has been successfully applied to, among other tasks, explaining neural network decisions in facial expression recognition [4] and finding words relevant for document categorization [5].
[1] Bach, Sebastian, et al. “On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation.” PLoS ONE 10.7 (2015).
[2] Montavon, Grégoire, et al. “Layer-wise relevance propagation: an overview.” Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. Springer, Cham, 2019. 193-209.
[3] Alber, Maximilian, et al. “iNNvestigate neural networks.” Journal of Machine Learning Research 20.93 (2019): 1-8.
[4] Arbabzadah, Farhad, et al. “Identifying individual facial expressions by deconstructing a neural network.” German Conference on Pattern Recognition. Springer, Cham, 2016.
[5] Arras, Leila, et al. “‘What is relevant in a text document?’: An interpretable machine learning approach.” PLoS ONE 12.8 (2017).
Project co-financed from European Union funds under the European Regional Development Funds as part of the Smart Growth Operational Programme.
Project implemented as part of the National Centre for Research and Development: Fast Track.