Machine Learning (ML) models are making their way into real-world applications.
We all hear news about ML systems for credit scoring, health care, and crime prediction. We can easily foresee a social scoring system powered by ML. Thanks to the fast pace of ML research and the great results obtained in controlled experiments, more and more people now seem open to the possibility of statistical models ruling important parts of our lives.
Yet, most such systems are seen as black boxes: obscure number-crunching machines producing a simple Yes/No answer, followed at most by an unsatisfactory "percentage of confidence". This is often an obstacle to the full adoption of such systems for critical decisions.
In health care or lending, a specialist does not examine an enormous database to uncover complex relationships. A specialist applies prior education and domain knowledge to make the best decision for a given problem. Very likely the assessment is grounded in data analysis, perhaps even with the help of an automated tool. But, ultimately, the decision is backed by plausible and explainable reasoning. That's why rejected loans and medical treatments come with a stated rationale. Moreover, such explanations often help us tell a good specialist from a bad one.
Most of us tend to resist decisions that seem arbitrary; it is crucial to understand how important decisions have been made. This excellent article describes some real-world objectives of interpretable ML systems:
Trust: confidence in the system predictions
Causality: help to infer properties of the natural world
Generalization: deal with a non-stationary environment
Informativeness: include useful information about the decision process
Fair and Ethical Decision-Making: prevent discriminatory outcomes
It's clear that society cares about interpretable machine learning.
I'd like to go further: those who build ML systems should care about interpretability too. ML practitioners and engineers should pursue interpretability as a means to build better models.
The intent of this article isn't to present the details of interpretability tools and techniques and how to apply them. Rather, I'd like to offer a vision of why such tools are important for the practice of Machine Learning and go over a few of my favorite tools.
ML systems (e.g. for classification) are designed and optimized to identify patterns in huge volumes of data. It is unbelievably easy to build a system capable of finding very complex correlations between input variables and the target category. Structured data? Throw XGBoost at it. Unstructured data? Some deep network to the rescue!
A typical ML workflow consists of exploring data, preprocessing features, training a model, then validating the model and deciding whether it's ready to be used in production. If not, we go back, often to engineer better features for our classifier. Most of the time, model validation is based on a measure of predictive power: for instance, the area under the ROC curve (AUC) is often quite reliable.
Fig. 1 - How model interpretation fits in the common ML workflow
However, during model building, many design decisions can slightly change the model. Not only the choice of the classifier but countless decisions in each preprocessing step. It turns out that, given a non-trivial problem, there are countless models with high predictive power, each one telling a whole different story about the data. Some stories may simply be wrong, even though they seem to work for a specific dataset.
This has been splendidly called the Rashomon Effect. Which of those models should we deploy in production to make critical decisions? Should we always take the model with the highest absolute AUC? How should we differentiate between good and bad design decisions?
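To make the point concrete, here is a minimal sketch using only the Python standard library. Everything in it is a made-up assumption for illustration: the seven-row validation set, the two one-feature "models" (model_income and model_spending), and the applicant. The two models reach exactly the same AUC because the features are redundant in this sample, yet they tell opposite stories about a new applicant.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    correct = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return correct / (len(positives) * len(negatives))


# Hypothetical validation set: (income, spending, repaid).
# Spending is exactly half of income here, so the two features are redundant.
rows = [
    (20, 10.0, 0), (25, 12.5, 0), (30, 15.0, 1), (35, 17.5, 0),
    (40, 20.0, 1), (50, 25.0, 1), (60, 30.0, 1),
]
labels = [repaid for _, _, repaid in rows]


def model_income(income, spending):
    # Story A: the higher the income, the more likely the loan is repaid.
    return income


def model_spending(income, spending):
    # Story B: the higher the spending, the more likely the loan is repaid.
    return spending


auc_income = auc(labels, [model_income(i, s) for i, s, _ in rows])
auc_spending = auc(labels, [model_spending(i, s) for i, s, _ in rows])
print(auc_income, auc_spending)  # identical on this dataset

# A new applicant who breaks the correlation: low income, high spending.
applicant = (20, 30.0)
print(model_income(*applicant), model_spending(*applicant))
```

On this sample the AUC cannot separate the two models. Only by looking at how each model uses its features can we notice that the second one rewards spending, which domain knowledge would probably reject.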
Interpretable machine learning approaches and tools help us make this decision and, more broadly, do better model validation. That means more than simply looking at the AUC; it means answering questions such as: how does the model output vary with the value of each feature? Do the relationships match human intuition and/or domain knowledge? Which features weigh the most for a specific observation?
We can roughly divide interpretability into global and local analysis.
Global analysis methods will give you a general sense of the relation between a feature and the model output. For example: how does the house size influence the chance of being sold in the next three months?
Local analysis methods will help you understand a particular decision. Suppose the model predicts a high probability of default (the loan not being paid back) for a given loan application. Usually, you want to know which features led the model to classify the application as high risk.
The partial dependence plot displays the probability for a certain class given different values of the feature. It is a global method: it takes into account all instances and makes a statement about the global relationship of a feature with the predicted outcome. [Credits: Interpretable Machine Learning]
A partial dependence plot gives you an idea of how the model responds to a particular feature. It can show whether the relationship between the target and a feature is linear, monotonic or more complex. For example, the plot can show a monotonically growing influence of Square Meters on the house price (that's good). Or you can spot a weird situation where spending more money is better for your credit score - trust me, it happens.
The partial dependence plot (PDP) is a global method because it does not focus on specific instances but on an overall average. The equivalent of a PDP for single observations is called an Individual Conditional Expectation (ICE) plot. ICE plots draw one line per instance, representing how the instance's prediction changes when the feature changes.
Fig. 2 - Partial Dependence Plot (the bold line) with Individual Conditional Expectations. Credits: https://github.com/SauceCat/PDPbox
Being a global average, a PDP can fail to capture heterogeneous relationships that come from the interaction between features. It's often good to equip your partial dependence plot with ICE lines to gain a lot more insight.
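The relation between the two plots can be sketched in a few lines of plain Python. The predict function below is a hypothetical stand-in for a trained model, chosen so that the effect of the first feature flips sign depending on the second one; the data and the grid are also made up. Each ICE curve is obtained by overwriting one feature with every grid value for a single instance, and the partial dependence is simply the point-wise average of those curves.

```python
from statistics import mean


def predict(row):
    # Hypothetical model with an interaction: feature 0 pushes the output
    # up for group 1 and down for group 0.
    x, group = row
    return x if group == 1 else -x


def ice_curves(predict, data, feature, grid):
    """One curve per instance: predictions as `feature` sweeps over `grid`."""
    curves = []
    for row in data:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature] = value  # all other features stay as observed
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(curves):
    """Average the ICE curves point by point."""
    return [mean(points) for points in zip(*curves)]


data = [(0.2, 1), (0.7, 1), (0.4, 0), (0.9, 0)]
grid = [1.0, 2.0, 3.0]

curves = ice_curves(predict, data, feature=0, grid=grid)
pdp = partial_dependence(curves)

for row, curve in zip(data, curves):
    print(row, curve)
print("PDP:", pdp)
```

In this toy case the averaged line is flat, which on its own would suggest that the feature has no effect, while the individual lines go in opposite directions for the two groups. That is the kind of heterogeneity the ICE lines expose.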
One of the latest and most promising approaches to local analysis is SHapley Additive exPlanations (SHAP). It aims to answer the question "Why did the model make that specific decision for an instance?" SHAP assigns each feature an importance value for a particular prediction.
Before production, you can deploy your model in a test environment and submit data from, say, a holdout test set. Computing SHAP values for the observations in that test set can give an interesting approximation of how the features will influence the model outputs in production. In this case, I strongly recommend extracting the test set "out of time", that is, using the most recent observations as the holdout data.
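To give a feel for what such an importance value is, here is a minimal sketch that computes exact Shapley values by brute force over all feature coalitions. It is not how the SHAP library works internally (the library relies on much faster approximations and model-specific algorithms); the predict function, the instance and the baseline are hypothetical, and "removing" a feature is modelled here by replacing it with its value in a single baseline row, which is a simplifying assumption.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of each feature for one prediction.

    Features outside a coalition take their value from `baseline`.
    The cost grows exponentially with the number of features.
    """
    n = len(instance)

    def value(coalition):
        row = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(row)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi += weight * (value(coalition | {i}) - value(coalition))
        phis.append(phi)
    return phis


def predict(row):
    # Hypothetical risk score: one additive term plus one interaction.
    return 0.1 + 0.3 * row[0] + 0.2 * row[1] * row[2]


instance = [1, 1, 1]
baseline = [0, 0, 0]
phis = shapley_values(predict, instance, baseline)

print(phis)
print(sum(phis), predict(instance) - predict(baseline))
```

The contributions add up to the difference between the prediction for the instance and the prediction for the baseline, which is the additive property that gives the method its name. The two interacting features share their joint effect equally.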
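An out-of-time split can be sketched as follows. The record layout (a dict with a "date" key), the helper name out_of_time_split and the 20% holdout fraction are assumptions made for this example, not a prescribed recipe.

```python
from datetime import date


def out_of_time_split(records, holdout_fraction=0.2):
    """Sort by date and keep the most recent records as the holdout set."""
    ordered = sorted(records, key=lambda record: record["date"])
    n_holdout = max(1, round(len(ordered) * holdout_fraction))
    cut = len(ordered) - n_holdout
    return ordered[:cut], ordered[cut:]


# Hypothetical loan applications, deliberately not in chronological order.
records = [
    {"date": date(2018, 3, 1), "amount": 5000},
    {"date": date(2018, 1, 15), "amount": 1200},
    {"date": date(2018, 5, 20), "amount": 800},
    {"date": date(2018, 2, 10), "amount": 3000},
    {"date": date(2018, 4, 5), "amount": 10000},
]

train, holdout = out_of_time_split(records)
print(len(train), len(holdout))
print(holdout[0]["date"])
```

Unlike a random split, every holdout observation is more recent than every training observation, so the evaluation looks a bit more like what the model will face after deployment.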
Interpretable decisions from ML models are already an important requirement for applying them in the real world.
In many critical ML applications, the solution has been to consider only inherently interpretable algorithms - such as linear models. Those algorithms, incapable of capturing fine-grained patterns specific to the training dataset, are going to capture only general trends. These trends are easy to interpret and to check against domain knowledge and intuition.
Interpretability tools offer us an alternative: use a powerful algorithm, let it capture any pattern and then use your human expertise to remove undesirable ones. Among all the many possible models, choose the one that tells the right story about the data.
When you have interpretable results out of your trained model, you can exploit this interpretability. The outputs of the tools described above can constitute a brief report that a business person understands. After all, you need to explain to your boss why your model works so well. Interpretable models are probably going to lead your boss and all stakeholders to better business decisions.
Sometimes people say that only ML practitioners in highly regulated applications should bother with interpretability. I think the opposite: every ML practitioner should use interpretability as an additional tool to build better models.