Today, most services are digitalized, and data is more and more available. Companies have been able to store and process vast amounts of data while realizing that being customer-centric was becoming the main requirement to stand out from the competition. Customer Churn Prediction is important for subscription-based businesses. They must focus on customer retention and churn management to be, or remain, leaders. They also need to understand which customers are canceling their subscriptions and why.
According to commonly cited estimates, acquiring a new customer can cost up to 700% more than retaining an existing one, and increasing customer retention rates by a mere 5% can increase profits by 25% to 95%.
In this article, we will perform churn analysis and prediction in Graphite without writing a single line of code.
Customer churn happens in Software-as-a-Service businesses much as it does in other subscription-based industries, such as telecommunications. But very often, companies lack knowledge about the factors leading to customer churn. They must implement customer churn prediction models to respond to churn in time.
The main characteristic of machine learning is the ability to build systems that find patterns in data and learn from them - without explicitly programming the rules. In customer churn prediction, the model observes behavioral characteristics and other features associated with decreasing customer satisfaction with the company's services or products.
First, in the training phase, machine learning algorithms will reveal some shared behavior patterns of those customers who have already left the company.
Then, once trained, algorithms can check the behavior of future customers against such patterns - and point out potential churners.
Armed with that knowledge, companies can be proactive with these customers to engage with them, understand their pain points, and prevent churn before it happens.
So, how do we start working with churn rate prediction? Which data is needed?
For this tutorial, we use a Telecom Customer Churn dataset from Kaggle, which is quite popular for churn modeling.
Each row represents a customer, and each column contains the customer's attributes.
The dataset contains information about:
Let's import and parse a CSV file that we previously downloaded from Kaggle.
We can browse through our dataset rows, filter, or search on the View Data tab.
We have 21 columns and 7032 rows.
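Graphite does the import for you, but for readers who want to see what parsing a CSV and counting rows and columns looks like in code, here is a minimal sketch using Python's standard library. The inline sample, its column names, and its values are illustrative assumptions, not the real Kaggle file.

```python
import csv
import io

# Tiny inline stand-in for the Kaggle file so the sketch runs on its own.
# Column names and values are illustrative assumptions.
SAMPLE_CSV = """customerID,tenure,Contract,MonthlyCharges,Churn
0001-A,1,Month-to-month,29.85,No
0002-B,34,One year,56.95,No
0003-C,2,Month-to-month,53.85,Yes
"""


def load_rows(handle):
    """Parse CSV text into a list of dicts, one dict per customer."""
    return list(csv.DictReader(handle))


rows = load_rows(io.StringIO(SAMPLE_CSV))
n_rows = len(rows)
n_columns = len(rows[0]) if rows else 0
print(f"{n_columns} columns, {n_rows} rows")
```

With a downloaded file, the same function would take an open file handle, for example `open(path, newline="")`, instead of the inline string.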
Every uploaded dataset in Graphite has a practical Summary tab. It lets you check, at a glance, the distributions of numeric columns, the number of null values, and various statistical measures.
We can quickly check that our target column, "Churn", which indicates whether a customer left or not, is not very imbalanced. That means we have enough "yes" and "no" signals to train the model.
It is interesting to see the distribution of some of the columns, like "monthly charges". A large group of our customers has monthly charges of up to $28. Another customer group is centered around $80 per month.
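The balance check that the Summary tab performs visually can be sketched in a few lines. This is only an illustration: the label list below is hypothetical and does not reflect the real class shares in the dataset.

```python
from collections import Counter


def class_shares(labels):
    """Return each class's share of the rows as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Hypothetical target values, not the real dataset.
labels = ["No", "No", "No", "Yes", "No", "Yes", "No", "No"]
shares = class_shares(labels)
print(shares)
```

A target where one class makes up only a few percent of the rows would usually call for extra care, such as resampling or class weights.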
Predicting Customer Churn is a great use case of binary machine learning classification.
The reason is that our target variable, "Churn", can have only two states: Yes (the customer left) or No (the customer stayed).
Now that our dataset is uploaded, everything is set to create a no-code machine learning model in Graphite. We choose the Binary Classification model.
In Graphite, to build a binary classification model, you need
In just a few mouse clicks, we will define a model Scenario in Graphite.
We select our Target column from our dataset:
We select all other columns as features.
We will leave all other options on default and run this scenario.
Graphite will take care of several preprocessing steps to achieve the best results, so you don't have to think about them. If you are curious about technical stuff, all these preprocessing steps will occur automatically:
Graphite will take a sample of 80% (5625 rows) of our data and train several machine learning models.
Then, it will test those models on the remaining 20% (1407 rows) and calculate relevant model scores. Based on scores, it will select the best performing model for the dataset.
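To see where the 5625 and 1407 row counts come from, here is a small sketch of an 80/20 split. Shuffling before splitting, the fixed seed, and rounding the training size down are assumptions made for illustration; the article does not describe how Graphite draws its sample.

```python
import random


def train_test_split_indices(n_rows, train_fraction=0.8, seed=42):
    """Shuffle row indices and split them into train and test parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Assumption: the training size is rounded down (7032 * 0.8 = 5625.6).
    n_train = int(n_rows * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(7032)
print(len(train_idx), len(test_idx))
```

Keeping the test rows out of training is what makes the scores on those rows an honest estimate of how the model handles unseen customers.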
The best model fit, results, and predictions are available on the Results tab after about 20 seconds of training.
In our case, the best model based on the F1 score is Logistic Regression. Other models' training metrics are listed below.
The confusion matrix makes it easy to see whether the model is confusing two classes (YES and NO in our case). For each class, it summarizes the number of correct and incorrect predictions. The model predicted the "Churn" column for a test dataset of 1407 rows and compared the predicted outcomes to the historical outcomes.
Correct predictions: 1129 in total out of 1407 test rows. This gives a model accuracy of 80.24%.
True Positives (TP) = 204: a row was Yes and the model predicted a Yes class for it.
True Negatives (TN) = 925: a row was No and the model predicted a No class for it.
Incorrect predictions: 278 in total out of 1407 test rows, or 19.76%.
False Positives (FP) = 103: a row was No and the model predicted a Yes class for it.
False Negatives (FN) = 175: a row was Yes and the model predicted a No class for it.
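The four cells above can be counted directly from pairs of actual and predicted labels. The sketch below uses a short hypothetical list of labels, not the 1407 test rows, and treats "Yes" as the positive class.

```python
def confusion_counts(actual, predicted, positive="Yes"):
    """Count TP, TN, FP and FN for a binary target."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted Yes, was Yes
        elif p == positive:
            fp += 1  # predicted Yes, was No
        elif a == positive:
            fn += 1  # predicted No, was Yes
        else:
            tn += 1  # predicted No, was No
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels for illustration only.
actual = ["Yes", "No", "No", "Yes", "No", "Yes"]
predicted = ["Yes", "No", "Yes", "No", "No", "Yes"]
counts = confusion_counts(actual, predicted)
print(counts)
```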
Other Model Scores
Please note that Positive and Negative describe the predicted class, while True and False indicate whether the prediction matched the actual value.
Accuracy, (TP + TN) / TOTAL.
Of all the rows (positive and negative), we predicted 80.24% correctly.
Accuracy should be as high as possible.
Precision, TP / (TP + FP).
Of all the rows we predicted as positive, 66.45% are actually positive.
Precision should be as high as possible.
Recall, TP / (TP + FN).
Of all the rows that are actually positive, we predicted 53.83% correctly.
Recall should be as high as possible.
F1 score, 2 * (Precision * Recall)/(Precision + Recall).
The F1 score is 59.48%. It is the harmonic mean of Precision and Recall, so it balances both in a single number.
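As a check, the four scores can be recomputed from the confusion matrix counts reported above (TP = 204, TN = 925, FP = 103, FN = 175). The function name is our own, and the sketch assumes none of the denominators is zero.

```python
def classification_scores(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion counts.

    Assumes every denominator is non-zero.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


scores = classification_scores(tp=204, tn=925, fp=103, fn=175)
for name, value in scores.items():
    print(f"{name}: {value:.2%}")
```

The printed values match the percentages in the text: 80.24%, 66.45%, 53.83% and 59.48%.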
Feature importance refers to how much this Model relies upon each column (feature) to make accurate predictions. The more a model relies on a column (feature) to make predictions, the more important it is for the Model overall. Graphite uses a permutation feature importance for this calculation.
The most important feature is column
For example, "gender" and whether the customer is a "Senior citizen" do not have any influence on the churn prediction.
In Graphite, it is very easy to check any feature with respect to our target column ("Churn").
Notice that most churn occurs at a tenure of 0-5 months, and then again at a tenure of 50-55 months. This is already valuable information for the customer success team.
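The idea behind permutation feature importance can be sketched as follows: shuffle one column, re-score the model, and record how much accuracy drops. This is a simplified illustration, not Graphite's implementation. The rule-based `toy_model`, the sample rows, and the use of accuracy as the score are all assumptions.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows for which the model's prediction equals the label."""
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(rows)


def permutation_importance(model, rows, labels, feature, n_repeats=20, seed=0):
    """Average drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [row[feature] for row in rows]
        rng.shuffle(shuffled)
        permuted = [{**row, feature: v} for row, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return sum(drops) / n_repeats


def toy_model(row):
    """Hypothetical stand-in for a trained model: short tenure means churn."""
    return "Yes" if row["tenure"] < 12 else "No"


# Hypothetical customers; labels follow the toy rule exactly.
rows = [
    {"tenure": t, "gender": g}
    for t, g in [(1, "F"), (3, "M"), (5, "F"), (8, "M"), (10, "F"),
                 (24, "M"), (36, "F"), (48, "M"), (60, "F"), (72, "M")]
]
labels = [toy_model(row) for row in rows]

importance = {
    name: permutation_importance(toy_model, rows, labels, name)
    for name in ("tenure", "gender")
}
print(importance)
```

Because the toy model never looks at "gender", shuffling that column cannot change any prediction, so its importance is exactly zero, while shuffling "tenure" breaks the predictions and produces a positive drop.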
The next insight is that most churn occurs in "Month-to-Month" contracts:
Regarding Internet Service - customers are more likely to churn if they use "Fiber Optic".
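Comparisons like these boil down to computing the churn rate within each category of a feature. A minimal sketch, using hypothetical customers and assumed column names ("Contract", "Churn"), could look like this:

```python
from collections import defaultdict


def churn_rate_by(rows, feature, target="Churn", positive="Yes"):
    """Return the share of churned customers for each value of a feature."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        key = row[feature]
        totals[key] += 1
        if row[target] == positive:
            churned[key] += 1
    return {key: churned[key] / totals[key] for key in totals}


# Hypothetical customers, not the real dataset.
customers = [
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "No"},
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Two year", "Churn": "No"},
    {"Contract": "Two year", "Churn": "No"},
]
rates = churn_rate_by(customers, "Contract")
print(rates)
```

The same function works for a numeric feature such as tenure once the values are grouped into buckets, for example 0-5 months, 5-10 months, and so on.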
It is important to note that Graphite automatically deploys the trained model.
This means it is easy to make predictions on new, unseen customer data. We can get answers to questions like "Who will churn next?" and "What is the probability of that outcome?"
Suppose that, after you have trained the churn model with Graphite, your team gives you information about new customers.
You can easily check whether those customers will churn - and with what probability.
This is a powerful tool for increasing your retention.
Let's check the churn prediction for one of the new customers:
The trained model, based on historical data, predicts that this customer will not churn, with 72% probability. He is not a target for the customer success team. He is a better candidate for an upsell or for participating in a case study than a customer who is currently a churn risk.
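One way to read a result like this is sketched below. It assumes that the reported 72% is the probability of the predicted class, so the underlying churn probability would be 0.28, and that the class is decided by a 0.5 threshold. Neither assumption is confirmed for Graphite, and the function name is our own.

```python
def describe_prediction(churn_probability, threshold=0.5):
    """Turn a churn probability into a label plus that label's probability.

    Assumption: a probability at or above the threshold means "Yes".
    """
    if churn_probability >= threshold:
        return "Yes", churn_probability
    return "No", 1.0 - churn_probability


# 0.28 is derived from the 72% "will not churn" figure under the
# assumptions stated above.
label, confidence = describe_prediction(0.28)
print(f"Churn: {label} ({confidence:.0%})")
```

Lowering the threshold would flag more customers as churn risks, which tends to raise Recall at the expense of Precision.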
For another new customer, the model is predicting that she WILL churn:
The main drivers, if you recall, are tenure, Contract, and Internet Service - this customer has a Month-to-Month contract and Fiber Optic internet service, which signals that she is likely to churn.
Churn is a natural health indicator for subscription-based companies. Identifying customers who aren't happy with the solutions provided allows businesses to learn about operational problems, weak points in the product or pricing plan, and customer preferences, so they can proactively reduce the reasons for churn.
Also, it's essential to define data sources to have a complete picture of customer interaction history. The higher the quality of the dataset, the more precise the forecasts will be.
I hope this helped you understand how easy it is to train models in no-code machine learning software like Graphite. With just a few mouse clicks, we trained an ML model and made predictions.
I hope you enjoyed it!