Predicting behavior in a business context has been the essence of our consulting firm since its launch in 2006. Almost 15 years later, we have worked on over 500 projects and run a large number of experiments. The result is a simple approach that almost always delivers, combining good predictive power with excellent interpretability.
We’re extremely excited to announce that we are releasing this approach to the community, with the intention of adding a simple yet valuable business tool to its increasingly sophisticated toolbox. We have turned our methodology into a package that is as simple to deploy as a black box model, but offers many advantages, which we’ll explain here.
Let’s back up – what is a model?
A model is a simplified description of a system or process, to assist calculations and predictions. As such, no model ever captures the complete complexity of reality. Just like our mental models, algorithms represent simplified versions of the truth. This is best captured in this well-known catchphrase:
All models are wrong but some are useful. – George Box
So, what makes a model useful in a business context? In our experience, a useful model is a model that performs well AND provides valuable insights into the problem. For example, marketers not only want to know who will purchase, they also want to understand the profile of their ideal target group. A bank needs to be able to understand why a specific client was refused credit, and whether it was ethical to do so. And understanding the drivers and importance of stressors is just as useful as predicting who is at risk of burnout. In short, in many business applications, an additional percentage point in predictive power is not worth much if it turns the model into a black box. A lot of business value is left on the table when data scientists fail to understand the problem they are working on, and all value is lost if they fail to gain the trust of the business decision-maker.
The multiplicity of good models
In many real-life cases, fundamentally different models may produce very similar results. This phenomenon is often called ‘the multiplicity of good models’. It is actually a blessing in business problems, since it allows the data scientist to choose the algorithm that performs well technically AND offers useful insights. What was confirmed throughout many internal experiments (and is an open secret among data scientists) is that when white box models are constructed carefully, they often perform quite similarly to black box models.
Simplicity is the ultimate sophistication – Leonardo Da Vinci
The whitest of boxes
In our experiments, we have always been careful to add complexity only when it creates value. To give a concrete example, one can bin every single predictor using a decision tree approach, looking for the optimal bins of every predictor. But when this did not add value over equi-frequency binning, we decided to stick with the latter. By making pragmatic and systematic choices, we have constructed a methodology that is as elegant and pragmatic as it is powerful. An additional result of our methodology is that not only the final model is interpretable, but also the individual predictors. And you can simply decompose the algorithm to understand exactly why a single client is expected to be interested in a certain product. In our case, the interpretability is enforced in the algorithm itself, and is not the result of a post-processing step. As an additional benefit, the models produced are so compact and simple that they can be implemented as an extremely short piece of code in really any IT system or (God forbid) in MS Excel.
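To make equi-frequency binning concrete, here is a minimal sketch in plain Python. It is not the code of our package: the function names, the toy data, and the choice to summarise each bin by its share of positive targets are assumptions made purely for illustration.

```python
from collections import defaultdict


def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index so that bins hold (nearly) equal counts.

    Binning is rank-based, so tied values can end up in different bins;
    a real implementation would need a rule for ties.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def incidence_per_bin(bins, target):
    """Return the share of positive targets observed in each bin."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for b, y in zip(bins, target):
        totals[b] += 1
        positives[b] += y
    return {b: positives[b] / totals[b] for b in sorted(totals)}


# Toy data, invented for illustration: customer age and a purchase flag.
age = [22, 25, 31, 35, 38, 41, 47, 52, 58, 63, 67, 71]
bought = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]

age_bins = equal_frequency_bins(age, n_bins=3)
incidence = incidence_per_bin(age_bins, bought)
for b, rate in incidence.items():
    print(f"bin {b}: purchase rate {rate:.2f}")
```

Each bin holds the same number of customers, and the purchase rate per bin can be read directly, which is the kind of predictor-level interpretability described above.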
In short, our approach produces hyper-interpretable, fully transparent and scalable solutions that almost always work.
We are careful not to oversimplify. There are cases where our approach hits its limits. So we keep experimenting and still perform regular comparisons with state-of-the-art algorithms to make sure we don’t miss the mark in a specific case. We also closely follow the growing efforts to make the newest algorithms more interpretable, with tools such as SHAP for example. But we only choose more complex solutions when they clearly outperform a simpler and more elegant solution.
Why we decided to share our approach
When we were recruiting candidates in our early days, it was very unusual for them to have built models in a previous job. Today, every candidate in our recruitment process arrives with the capacity to construct complex models in only a few lines of code. But when asked the question “what did your model teach you?” many fall silent, and some even fail to understand the question. By releasing our code, we turn our methodology into a package that is as simple to deploy as a black box model, but offers all the advantages explained before. With this, our aim is to convince aspiring Data Scientists to add white box models to their toolkit, so they are able to decide exactly when to opt for black box solutions.
Yes! Where do I find more information?
- Whoop here it is! https://pypi.org/project/pythonpredictions-cobra/
- Here is the link to the documentation: https://pythonpredictions.github.io/cobra.io/index.html
- And here is the link to our GitHub: https://github.com/PythonPredictions/Cobra
Looking forward to your feedback and contributions!
A big thank you to all the Python Predictions employees who contributed to this methodology and to the code itself. Special thanks go to our greatest contributors Nele Verbiest, Guillaume Marion, Jan Beníšek and Matthias Roels. We drew inspiration from countless discussions with our clients and within our team, and from many experiments together with Prof. Kristof Coussement of IESEG Lille.
Thanks to the whole team for their enthusiasm in using the code in their projects and for their continuous feedback. And thanks to our Tobania CEO Lode Peeters for applauding our idea to share this asset with the world!
P.S. Want to grasp the basics of our methodology in a different way? Check out the DataCamp course Introduction to Predictive Analytics in Python by our former colleague Nele Verbiest, and follow in the footsteps of over 10,000 Data Scientists so far.