Jupyter Notebook for Interactive Business Discussions - Part 1: Use of Sliders and Filters

Anshuman Lall
Published in Predmatic
Jul 2, 2018


As data scientists, we often help businesses by finding meaningful insights in their data. This could include predicting a valuable business indicator so that decision-makers can act on it. Well, that’s the theory, and sometimes it is even true. What is not true is that decision-makers accept our data-driven recommendation in the form we present it. They ask a lot of other questions, probably to develop a complete understanding of our recommendations or perhaps to build more trust in our analysis. However, when we present our results in a static PowerPoint presentation (presumably prepared for one set of conditions), the slides may not have the answers that the business is looking for.

The same problem can be looked at from a different angle. Let’s say we build a machine learning model that shows good accuracy on the test data (historical values of the predictors). The business would like to understand what will happen with new, future, or different values of the predictors. There could be an unlimited number of combinations of these predictor values, so a static set of results may not cover all possible scenarios. Alternatively, if we run or modify Jupyter notebook code during the discussion to get to the answer, we risk losing the audience’s attention.

One possible way is to pre-compute a lot of scenarios and put a reporting front-end (Tableau, for example) on top of them, which works very effectively. But there would be some amount of pre-computing effort, and not everything can be pre-computed.
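To get a feel for how quickly pre-computation grows, here is a rough sketch that enumerates every scenario for three integer predictors. The ranges and the linear `predict` function are illustrative assumptions (they mirror the slider example later in this post), not a real fitted model.

```python
from itertools import product

# Hypothetical predictor ranges, chosen only for illustration
x1_values = range(1, 31)   # 30 settings
x2_values = range(1, 21)   # 20 settings
x3_values = range(0, 4)    # 4 settings

def predict(x1, x2, x3):
    # Stand-in for a fitted model; the coefficients are made up
    return 1 + 1 * x1 + 2 * x2 - 1 * x3

# Pre-compute one result per combination of predictor values
scenarios = {
    combo: predict(*combo)
    for combo in product(x1_values, x2_values, x3_values)
}
print(len(scenarios))  # 30 * 20 * 4 = 2400
```

Three small integer predictors already produce 2,400 rows. Adding a fourth predictor or using finer steps multiplies the table again, which is why pre-computing everything is rarely practical.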

Using a Jupyter notebook could be a more direct and visually interesting way to present data science or statistical modeling results in a business discussion. An example is shown below.

Before I begin, I would like to acknowledge the sources of my findings:

Building What-If Analysis on Top of a Regression Model

For the purpose of demonstrating the concept, let’s consider a simple equation which came out of a regression model:

Y = a0 + a1*X1 + a2*X2 + a3*X3

A business scenario could be the prediction of future repair (or maintenance) costs (Y), which depend on a number of factors. Let’s say there are 3 factors, one of which could be the planned length of operation (X1). The parameter a0 can be considered a fixed cost (perhaps routine maintenance). More operation hours (higher X1) would lead to more repair needs or costs (Y), which would show up as a positive a1. This relationship could change (presumably, improve) as more incremental data comes in.

At any given point in time, and with the most recent historical data available, the business would like to vary the future values of X1, X2, and X3 based on their plans and domain knowledge and see their effect on the repair cost “forecast” (Y).
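As a quick worked example of the arithmetic behind such a what-if question, the sketch below evaluates the cost for two plans, assuming the regression equation is linear in the three factors, as the code in this post suggests. The coefficient values are the illustrative ones assigned manually later in the post; the function name `repair_cost` and the two plans are hypothetical.

```python
# Illustrative coefficients (assigned by hand, not fitted)
a0, a1, a2, a3 = 1, 1, 2, -1

def repair_cost(x1, x2, x3):
    """Linear cost: Y = a0 + a1*X1 + a2*X2 + a3*X3."""
    return a0 + a1 * x1 + a2 * x2 + a3 * x3

baseline = repair_cost(x1=10, x2=5, x3=1)   # 1 + 10 + 10 - 1 = 20
extended = repair_cost(x1=15, x2=5, x3=1)   # 1 + 15 + 10 - 1 = 25
print(baseline, extended, extended - baseline)
```

With a1 = 1, five more units of X1 add 5 to the projected cost while the other factors stay fixed. That is the kind of question the sliders let the business answer on the spot.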

To build this what-if scenario, we can make use of the widgets extension. The first step is to enable the extension using the following command in a terminal:

jupyter nbextension enable --py --sys-prefix widgetsnbextension

Then in the Jupyter notebook, the code below can be used for what-if analysis. For brevity, I am skipping the topic of regression modeling and manually assigning values to the coefficients.

from ipywidgets import widgets, interactive
from IPython.display import display

# Regression coefficients, assigned manually for illustration
a0 = 1; a1 = 1; a2 = 2; a3 = -1

def f(x1, x2, x3):
    # Evaluate the regression equation for the selected predictor values
    value = a0 + a1*x1 + a2*x2 + a3*x3
    print('The projected repair costs are $%sM' % value)

# One slider per predictor; each starts at 1
w = interactive(f,
                x1=widgets.IntSlider(min=1, max=30, step=1, value=1),
                x2=widgets.IntSlider(min=1, max=20, step=1, value=1),
                x3=widgets.IntSlider(min=0, max=3, step=1, value=1))
display(w)

The output is fully interactive: moving any slider re-runs the function and updates the projected cost.

Building an Interface with a Filter

Let’s consider that there are two parts which need repairs. A separate model is built for each part. We would like to select one part at a time and conduct the what-if analysis.

from ipywidgets import widgets, interactive
from IPython.display import display

# A list (not a set) keeps the options in a fixed order
parts = ['PartA', 'PartB']

def cost(part, x1, x2, x3):
    # Pick the coefficients of the model built for the selected part
    if part == 'PartA':
        a0 = 1; a1 = 1; a2 = 2; a3 = -1
    elif part == 'PartB':
        a0 = 2; a1 = 2; a2 = 4; a3 = -1
    value = a0 + a1*x1 + a2*x2 + a3*x3
    print('The projected repair cost for %s is $%sM' % (part, value))

x1_widget = widgets.IntSlider(min=1, max=30, step=1, value=1)
x2_widget = widgets.IntSlider(min=1, max=20, step=1, value=1)
x3_widget = widgets.IntSlider(min=0, max=3, step=1, value=1)
part_widget = widgets.Select(options=parts)
w1 = interactive(cost, part=part_widget,
                 x1=x1_widget,
                 x2=x2_widget,
                 x3=x3_widget)
display(w1)

The output is fully interactive: selecting a part or moving a slider re-runs the function and updates the projected cost.
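As the number of parts grows, the chain of `if` statements can get long. One alternative, sketched below, keeps each part's coefficients in a dictionary keyed by part name. The names `COEFFS` and `part_cost` are hypothetical; the coefficient values are the ones used in the example above. The function returns the number instead of printing it, so it can be checked outside a notebook, and a thin printing wrapper would be what gets passed to `interactive`.

```python
# Coefficients (a0, a1, a2, a3) per part; values copied from the example above
COEFFS = {
    'PartA': (1, 1, 2, -1),
    'PartB': (2, 2, 4, -1),
}

def part_cost(part, x1, x2, x3):
    """Return the projected repair cost for the selected part."""
    a0, a1, a2, a3 = COEFFS[part]
    return a0 + a1 * x1 + a2 * x2 + a3 * x3

# Compare both parts for the same (hypothetical) plan
for part in sorted(COEFFS):
    print('The projected repair cost for %s is $%sM'
          % (part, part_cost(part, 10, 5, 1)))
```

With this layout, adding a third part means adding one dictionary entry, and `list(COEFFS)` could supply the options for the `Select` widget.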

Conclusions

As enterprises adopt advanced data science use cases, the role of data scientists in a business setting is becoming clearer and is no longer confined to the technology division. However, there is a gap between data science output and the business decision-making process. Widely used tools such as Jupyter notebook have capabilities that can reduce this gap if used appropriately. In this post, we covered some of these aspects.

In the next part of this series, we will explore more advanced aspects of building a business-user-friendly system for running and analyzing a data science solution in a Jupyter notebook.
