Predicting new subscriptions with Machine Learning & Magicsheets

Motivation

Regardless of whether you

  • manage a Slack channel,
  • manage your start-up's newsletter,
  • etc.

The Mailchimp subscriber forecasting problem

We will try to answer the question:

The dataset

First, let’s export the subscriber data for the past 3 months
from our Mailchimp account. (You can read about exporting your contacts here.)

Cleaning the data

Mailchimp provides us with tonnes of useful information, such as the IP address of the subscriber, the location of the IP address, the subscriber's time zone, etc.

    CONFIRM_TIME  count
0     2021-07-26     46
1     2021-07-27     59
2     2021-07-28     50
3     2021-07-29     57
4     2021-07-30     62
..           ...    ...
87    2021-10-21     55
88    2021-10-22     56
89    2021-10-23     44
90    2021-10-24     56
91    2021-10-25     65
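A table like the one above can be produced by keeping only the confirmation timestamp and counting subscribers per day. Here is a minimal sketch with pandas. The inline CSV is a made-up stand-in for a real Mailchimp export, and the `Email Address` column and the variable names are assumptions for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for a Mailchimp export; a real export has many more columns.
raw_csv = io.StringIO(
    "Email Address,CONFIRM_TIME\n"
    "a@example.com,2021-07-26 09:15:00\n"
    "b@example.com,2021-07-26 17:40:00\n"
    "c@example.com,2021-07-27 08:05:00\n"
)

subscribers = pd.read_csv(raw_csv, parse_dates=["CONFIRM_TIME"])

# Drop the time of day, then count how many subscribers confirmed on each date.
daily_counts = (
    subscribers.groupby(subscribers["CONFIRM_TIME"].dt.date)
    .size()
    .reset_index(name="count")
)
print(daily_counts)
```

With a real export, you would replace `raw_csv` with the path to the downloaded file.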

Visualizing the dataset

Let’s plot the new subscriber counts as a function of time. We will use the famous matplotlib library for it.
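A plot along these lines could look as follows. This is only a sketch: it uses the first five rows of the daily counts as sample data, and the figure size, labels and output file name are assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# First five rows of the daily counts, used here as sample data.
daily_counts = pd.DataFrame(
    {
        "CONFIRM_TIME": pd.to_datetime(
            ["2021-07-26", "2021-07-27", "2021-07-28", "2021-07-29", "2021-07-30"]
        ),
        "count": [46, 59, 50, 57, 62],
    }
)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(daily_counts["CONFIRM_TIME"], daily_counts["count"], marker="o")
ax.set_xlabel("Confirmation date")
ax.set_ylabel("New subscribers")
ax.set_title("New subscribers per day")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.savefig("new_subscribers.png")  # hypothetical output file name
```

In a notebook you would drop the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving to a file.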

Splitting the dataset

Regardless of what model we train in the end, we need to be able to assess how good it is.

  • you then ask the model: what do you predict for the remaining 30% of the dataset?
  • this prediction you can now compare with the real data that you kept hidden away from the model at training.
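The split described above can be sketched with plain pandas slicing. The ten sample values and the name `train_fraction` are assumptions for illustration. The important detail is that a time series is split in order, without shuffling, so the test set is always the most recent stretch of data.

```python
import pandas as pd

# Illustrative daily subscriber counts, not the full dataset.
y = pd.Series([46, 59, 50, 57, 62, 55, 56, 44, 56, 65], name="count")

train_fraction = 0.7  # share of the data the model is allowed to see
split_point = int(round(len(y) * train_fraction))

# Keep chronological order: train on the past, test on the most recent days.
y_train = y.iloc[:split_point]
y_test = y.iloc[split_point:]

print(len(y_train), len(y_test))
```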

The predictive model: first attempt

We now build the time series model. This can also be loaded directly from the sktime library.

Training the model

Testing the model

We will now generate predictions for the period that the model was not trained on.
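To see what a model of this kind does under the hood, here is a minimal hand-rolled sketch of simple exponential smoothing in NumPy. It is not the sktime implementation. The function names, the sample values and the smoothing factor `alpha=0.2` are assumptions for illustration.

```python
import numpy as np


def simple_exponential_smoothing(y, alpha):
    """Return the smoothed level after one pass over the training data."""
    level = y[0]
    for value in y[1:]:
        # New level: weighted average of the latest observation and the old level.
        level = alpha * value + (1 - alpha) * level
    return level


def forecast(y_train, horizon, alpha=0.2):
    """Forecast `horizon` steps ahead from the final smoothed level."""
    level = simple_exponential_smoothing(np.asarray(y_train, dtype=float), alpha)
    # Without trend or seasonal terms, every future step gets the same value.
    return np.full(horizon, level)


y_train = [46, 59, 50, 57, 62, 55, 56]  # illustrative training data
y_pred = forecast(y_train, horizon=3)
print(y_pred)
```

Note that a model without trend or seasonal terms forecasts the same value for every future day, however far ahead we ask it to predict.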

Assessing how good the model is

We now need some way of telling “how far” the values the model predicts for the prediction period are from the real number of new subscribers on those dates.

mean of the actual data:  54.47826086956522
prediction value: 69 54.304351
70 54.304351
71 54.304351
72 54.304351
73 54.304351
74 54.304351
75 54.304351
76 54.304351
77 54.304351
78 54.304351
79 54.304351
80 54.304351
81 54.304351
82 54.304351
83 54.304351
84 54.304351
85 54.304351
86 54.304351
87 54.304351
88 54.304351
89 54.304351
90 54.304351
91 54.304351
dtype: float64
51.323250355315075
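One common way of measuring "how far" is the mean squared error (MSE): the average of the squared differences between the real and the predicted values. Here is a minimal NumPy sketch. The function name is our own, and the inputs are just the last five days of the dataset paired with the flat prediction from above, so it will not reproduce the error for the full test period.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


y_true = [55, 56, 44, 56, 65]  # actual counts for the last five days
y_pred = [54.304351] * 5       # the flat prediction for those days
print(mean_squared_error(y_true, y_pred))
```

Because the differences are squared, a few days with large misses weigh much more heavily than many days with small ones.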

Improving the predictive model

Before you throw your model (Exponential Smoothing model, in our case) out of the window, one way to make it work better and produce more reliable predictions is to adjust the model’s hyperparameters.

Teaching your model periodicity

Certain things happen periodically. We might for example know that new subscribers are more likely to sign up for our newsletter on Mondays, Tuesdays and Wednesdays, and much less likely to join e.g. on weekends.
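To make the idea concrete, here is a hand-rolled sketch of exponential smoothing with an additive seasonal term and no trend, using a period of 7 days. It is not the sktime implementation (there, the seasonal period is passed as the `sp` parameter). The function name, the smoothing factors `alpha` and `gamma`, and the weekly pattern are all assumptions for illustration.

```python
import numpy as np


def seasonal_smoothing_forecast(y, period, horizon, alpha=0.2, gamma=0.1):
    """Additive seasonal exponential smoothing without a trend term."""
    y = np.asarray(y, dtype=float)
    if len(y) < period:
        raise ValueError("need at least one full period of data")

    # Initialise from the first full period: overall level plus one offset per slot.
    level = y[:period].mean()
    seasonal = y[:period] - level

    for t in range(period, len(y)):
        slot = t % period
        previous_level = level
        # Update the level from the de-seasonalised observation.
        level = alpha * (y[t] - seasonal[slot]) + (1 - alpha) * previous_level
        # Update this slot's seasonal offset.
        seasonal[slot] = gamma * (y[t] - previous_level) + (1 - gamma) * seasonal[slot]

    # Each future day gets the final level plus the offset of its slot in the period.
    future = np.arange(len(y), len(y) + horizon)
    return level + seasonal[future % period]


# Invented weekly pattern (Monday to Sunday), repeated for four weeks.
weekly_pattern = [60, 58, 57, 52, 50, 40, 42]
y_train = weekly_pattern * 4
y_pred = seasonal_smoothing_forecast(y_train, period=7, horizon=7)
print(y_pred)
```

Unlike the flat forecast of the non-seasonal model, the predictions now differ by day of the week. The numbers in this sketch are invented, so it will not reproduce the error values reported for the real dataset.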

the new model's error is =  47.13314112736905
the new model's error is a 8.164154060659747 % improvement over the old model
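The improvement figure is simply the relative reduction in error. A quick check with the two error values reported above (the variable names are our own):

```python
old_mse = 51.323250355315075  # error of the first model
new_mse = 47.13314112736905   # error of the model with periodicity

# Relative reduction of the error, expressed as a percentage of the old error.
improvement_pct = (old_mse - new_mse) / old_mse * 100
print(f"{improvement_pct:.3f} % improvement")  # prints: 8.164 % improvement
```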

Predicting new Mailchimp subscriber numbers

We can now finally do what we originally set out to do: predict the new Mailchimp subscriber numbers.

day  prediction
69    51.888901
70    53.800013
71    57.400023
72    53.500012
73    55.100009
74    56.100020
75    52.100006

print(np.floor(predictions))

day  prediction
69         51.0
70         53.0
71         57.0
72         53.0
73         55.0
74         56.0
75         52.0
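Since we cannot have a fraction of a subscriber, the predictions are turned into whole numbers. The sketch below rebuilds the prediction values from above as a pandas Series (the construction itself is an assumption) and compares `np.floor`, which always rounds down, with ordinary rounding.

```python
import numpy as np
import pandas as pd

# Prediction values copied from the output above.
predictions = pd.Series(
    [51.888901, 53.800013, 57.400023, 53.500012, 55.100009, 56.100020, 52.100006],
    index=range(69, 76),
    name="prediction",
)

floored = np.floor(predictions).astype(int)   # always rounds down: 51.89 becomes 51
rounded = predictions.round().astype(int)     # rounds to nearest: 51.89 becomes 52

print(pd.DataFrame({"floor": floored, "round": rounded}))
```

Flooring gives a conservative estimate. If you would rather be as close as possible on average, rounding to the nearest integer is the more natural choice.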

Automating the subscriber predictions with Magicsheets

We have gone through a lot here, so let’s sum it all up. In order to get Mailchimp new subscriber volume predictions, we had to go through the following steps:

  1. Load the dataset into Python with Pandas.
  2. Identify and select the relevant data columns.
  3. Build the time series model (in our case, this is an Exponential Smoothing model).
  4. Train the model on the training set.
  5. Test the model on the testing set.
  6. Calculate the MSE, plot the predictions, and adjust hyperparameters.
  7. Repeat 4→6 until you are happy with your model’s MSE.
  8. Make the predictions
  9. The next day (or week, or month): rinse, wash, repeat!
  1. Re-train the model and generate new predictions any time you want in a dedicated Slack channel by typing a simple command ‘/run-magicpipe’

I want it! How do I get it?

We are currently building a low-code version of the product that you can deploy in ~5 min if you know basic Python. If you would like it, join the waitlist.

Magicsheets

Adding magic to your spreadsheets, one click at a time.