Customer Segmentation & Forecasting Part II

Eva Andres
5 min read · Feb 14, 2021

Hi guys!

Let’s continue with the second part of this use case, using Prophet to forecast future UK sales.

Remember that in the previous post we loaded the dataset this way:

# Loading the dataset with pandas
import pandas as pd

# read_excel already returns a DataFrame, so no extra conversion is needed
dataset = pd.read_excel(io='/content/sample_data/Online Retail.xlsx')
dataset.info()
This is the output

Prepare the data

We’ll transform the feature ‘InvoiceDate’, which holds both date and time information, into just a date.

dataset['InvoiceDate'] = pd.to_datetime(dataset['InvoiceDate']).dt.date
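To see what that conversion does, here is a small sketch on a made-up two-row frame. The timestamps are invented for the example and are not rows from the dataset:

```python
import datetime
import pandas as pd

# Hypothetical sample with full timestamps (not real rows from the dataset)
sample = pd.DataFrame(
    {"InvoiceDate": ["2010-12-01 08:26:00", "2010-12-01 17:45:00"]}
)

# Parse to datetime, then keep only the calendar date
sample["InvoiceDate"] = pd.to_datetime(sample["InvoiceDate"]).dt.date

print(sample["InvoiceDate"].tolist())
```

Both rows end up with the same date, which is what lets us add up the revenue per day later on.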

And now, we’re going to add a “Revenue” column to our dataset:

# calculate the revenue of each order line (unit price x quantity)
dataset['Revenue'] = dataset['UnitPrice'] * dataset['Quantity']
dataset.head(10)
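A quick sketch of the same calculation on invented order lines. I am assuming here that a returned item shows up with a negative Quantity, which would explain why only positive revenue is kept in the next step:

```python
import pandas as pd

# Hypothetical order lines; a negative Quantity stands for a returned item
lines = pd.DataFrame({
    "UnitPrice": [2.5, 4.0, 1.5],
    "Quantity": [6, -2, 10],
})

# Revenue per order line
lines["Revenue"] = lines["UnitPrice"] * lines["Quantity"]

# Keep only the lines that actually brought money in
positive = lines[lines["Revenue"] > 0]

print(lines)
print(len(positive))
```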

Now, let’s check the data for the United Kingdom, the country with most of the orders.

df = dataset[dataset["Country"] == 'United Kingdom']
df = df[df["Revenue"] > 0]
df.head()

Let’s prepare the dataset for sales forecasting, keeping only the features InvoiceDate and Revenue:

tx_data = df.groupby('InvoiceDate').Revenue.sum().reset_index()
tx_data.columns = ['InvoiceDate','Revenue']
tx_data.head(20)

To use the Prophet library, the columns must be named “ds” (the date) and “y” (the value to forecast). The code below renames them and shows the first rows:

tx_data.columns = ['ds','y']
tx_data.head()
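Here is the whole preparation step on a tiny invented example, so you can check the result by hand. The names toy and daily are just for this illustration:

```python
import pandas as pd

# Made-up order lines: two on the first day, one on the second
toy = pd.DataFrame({
    "InvoiceDate": pd.to_datetime(["2010-12-01", "2010-12-01", "2010-12-02"]).date,
    "Revenue": [15.0, 20.0, 7.5],
})

# One row per day with the total revenue of that day
daily = toy.groupby("InvoiceDate")["Revenue"].sum().reset_index()

# Prophet expects the date in 'ds' and the target in 'y'
daily.columns = ["ds", "y"]

print(daily)
```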

And now the last rows:

tx_data.tail()

So, the first date is 1 December 2010 and the last date is 9 December 2011.

Build the model

We’ll add a monthly seasonality. You could also add the country’s holidays.

from fbprophet import Prophet

model = Prophet()
# Custom monthly seasonality with a 31-day period
model.add_seasonality(
    name='monthly', period=31, fourier_order=3, prior_scale=0.1)
model.fit(tx_data)
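If you wonder what fourier_order means: Prophet models each seasonality with a Fourier series, so an order of 3 gives three sine/cosine pairs. The sketch below shows the idea with numpy. It is a simplified illustration, not Prophet’s own code, and fourier_features is a hypothetical helper name:

```python
import numpy as np

def fourier_features(t_days, period, order):
    """Return an (n, 2 * order) matrix of sin/cos terms for the given period."""
    t = np.asarray(t_days, dtype=float)
    cols = []
    for k in range(1, order + 1):
        angle = 2.0 * np.pi * k * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

# Two full 31-day cycles, three sin/cos pairs
features = fourier_features(np.arange(62), period=31, order=3)
print(features.shape)  # (62, 6)
```

A higher order lets the seasonal curve change faster, at the risk of fitting noise. The features repeat exactly every 31 days, which is what makes the pattern “monthly” here.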

Prepare a dataframe for the results

Let’s prepare the dataframe that will hold the results: the dates already in tx_data plus the next 12 months:

future = model.make_future_dataframe(periods=12, freq='M')
future.head()
future.tail(7)
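To preview which dates a call like this adds, here is a rough pandas-only sketch. It mirrors my understanding of how make_future_dataframe extends the history (an assumption, not Prophet’s code) and starts from the last date in the data, 9 December 2011:

```python
import pandas as pd

last_date = pd.Timestamp("2011-12-09")  # last date in the history
periods = 12

# Month-end dates starting from the last observed date
candidates = pd.date_range(start=last_date, periods=periods + 1,
                           freq=pd.offsets.MonthEnd())

# Keep only dates after the history, and at most `periods` of them
future_dates = candidates[candidates > last_date][:periods]

print(future_dates[0].date(), future_dates[-1].date())
```

Under that assumption the added rows are month-end dates, not daily ones, running from the end of December 2011 to the end of November 2012.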

Create a new dataframe “forecast” for the predictions

The predict method will assign each row in future a predicted value which it names yhat. If you pass in historical dates, it will provide an in-sample fit.

forecast = model.predict(future)
forecast.tail()

The forecast object here is a new dataframe that includes a column yhat with the forecast, as well as columns for components and uncertainty intervals.

To check the columns of the forecast dataframe use this:

forecast.columns

To see the last 10 rows:

forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail(10)
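Because the forecast also covers the historical dates, you can compare it with what really happened. A minimal sketch with made-up numbers (the values below are invented, not results from this model):

```python
import pandas as pd

# Invented actuals and forecast rows, purely for illustration
actual = pd.DataFrame({
    "ds": pd.to_datetime(["2011-12-07", "2011-12-08", "2011-12-09"]),
    "y": [100.0, 120.0, 90.0],
})
toy_forecast = pd.DataFrame({
    "ds": pd.to_datetime(["2011-12-07", "2011-12-08", "2011-12-09", "2011-12-31"]),
    "yhat": [110.0, 115.0, 100.0, 105.0],
    "yhat_lower": [95.0, 100.0, 95.0, 80.0],
    "yhat_upper": [125.0, 130.0, 115.0, 130.0],
})

# The inner join keeps only the dates that have an observed value
merged = toy_forecast.merge(actual, on="ds", how="inner")

# Mean absolute error of the in-sample fit
mae = (merged["y"] - merged["yhat"]).abs().mean()

# Share of observed values that fall inside the uncertainty interval
inside = merged["y"].between(merged["yhat_lower"], merged["yhat_upper"])
coverage = inside.mean()

print(round(mae, 2), round(coverage, 2))
```

With the real data you would merge forecast with tx_data in the same way, after making sure both “ds” columns have the same type.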

And for plotting the dataset:

model.plot(forecast);

In this graph you can see the past values and also the predicted values for the next 12 months. The light blue band is the uncertainty interval. The further we move away in time, the wider this band gets, because the uncertainty is greater.

If we want to zoom in on a specific interval, for example from 1 October 2011 to 1 February 2012, we could do the following:

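import matplotlib.pyplot as plt
model.plot(forecast)
# Pass timestamps rather than strings so the limits match the date axis
plt.xlim(pd.to_datetime('2011-10-01'), pd.to_datetime('2012-02-01'))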

Let’s plot the yhat value (predictions only):

# plot the predictions. yhat
forecast.plot(x='ds',y='yhat')

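Now, we’re going to see the forecast components. By default you’ll see the trend and the seasonalities the model has fitted, such as the weekly seasonality and the custom monthly one added above. Prophet only enables yearly seasonality automatically when it has about two years of history. If you include holidays, you’ll see those here, too.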

model.plot_components(forecast);

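The trend is upward and, according to the weekly component, most of the sales are made on Saturday. The same components can also be shown as an interactive Plotly chart: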

from fbprophet.plot import plot_components_plotly
plot_components_plotly(model, forecast)

Now, let’s check the UK forecast taking into account the holidays and the annual and weekly seasonality.

For holidays, you can add the following code after creating the Prophet model (and before fitting it):

prophet.add_country_holidays(country_name='UK')

Or you can pass the holidays when creating the Prophet model:

from fbprophet.make_holidays import make_holidays_df

year_list = [2010, 2011, 2012]
holidays = make_holidays_df(year_list=year_list, country='UK')
holidays
prophet = Prophet(growth='linear',
                  yearly_seasonality=True,
                  weekly_seasonality=True,
                  daily_seasonality=False,
                  holidays=holidays,
                  seasonality_mode='additive',
                  seasonality_prior_scale=10,
                  holidays_prior_scale=10,
                  changepoint_prior_scale=.05,
                  mcmc_samples=0
                  ).add_seasonality(name='yearly',
                                    period=365.25,
                                    fourier_order=3,
                                    prior_scale=10,
                                    mode='additive'
                                    )
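The holidays argument takes a dataframe with a “holiday” column and a “ds” column, plus optional “lower_window” and “upper_window” columns. So you could also build it by hand, for example to add your own sales events. A minimal sketch, where custom_holidays is a hypothetical name and the dates are plain calendar dates that ignore substitute bank holidays:

```python
import pandas as pd

# Hand-made holidays table in the shape Prophet expects
custom_holidays = pd.DataFrame({
    "holiday": ["Christmas Day", "Boxing Day"] * 2,
    "ds": pd.to_datetime(
        ["2010-12-25", "2010-12-26", "2011-12-25", "2011-12-26"]
    ),
    # Windows of 0 mean the effect applies to the holiday itself only
    "lower_window": 0,
    "upper_window": 0,
})

print(custom_holidays)
```

A dataframe like this could be passed as holidays=custom_holidays instead of the generated one.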

Now, we’ll fit the model with tx_data:

prophet.fit(tx_data)

And we’ll prepare the future dataset for the forecast, keeping a daily frequency:

future = prophet.make_future_dataframe(periods=365, freq='D')
forecast = prophet.predict(future)

Let’s plot the forecast and its components:

from fbprophet.plot import add_changepoints_to_plot

fig = prophet.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), prophet, forecast)
plt.show()

The trend is upward.

fig2 = prophet.plot_components(forecast)
plt.show()
plot_components_plotly(prophet, forecast)

As we can see, sales are higher in December than in other months, and they decrease in January due to the post-holiday slump (the “January slope”).
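If you prefer numbers to charts, you can add up the daily predictions per month and compare them. A small sketch with invented values (200 per day in December, 100 per day in January), just to show the mechanics:

```python
import numpy as np
import pandas as pd

# Invented daily predictions for two months
days = pd.date_range("2011-12-01", "2012-01-31", freq="D")
toy = pd.DataFrame({
    "ds": days,
    "yhat": np.where(days.month == 12, 200.0, 100.0),
})

# Total predicted revenue per calendar month
monthly = toy.groupby(toy["ds"].dt.to_period("M"))["yhat"].sum()

print(monthly)
```

On the real forecast dataframe the same groupby would give one total per month, which makes the December peak and the January drop easy to quantify.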

In summary, the model forecasts that UK sales will keep growing over the next 12 months.

And that’s all


Eva Andres

Senior Manager, FullStack Architect, AI Specialized