A Lead represents a potential customer interested in buying your products or services. In this era of advanced analytics and machine learning, every organization wants a better, more efficient way to identify promising leads in a long list of people or companies that have shown some interest in the product or service it offers. This is where the Leads Scoring Model comes into the picture.
Image by the author: Leads Scoring model training and predictions process
Companies are in business to profit and can only profit if they provide the products or services their customers demand. Meeting these demands means that they can generate revenue to keep the business going and expand it. That is where generating sales leads comes in because if companies are not bringing in new customers, then they will not be able to grow and will begin to stagnate.
A Leads Scoring Model is simply a methodology in which we train machine learning models to learn from historical data. In our example, the model will learn to classify leads into two states: "will convert" and "will not convert." It will also learn what influences leads to convert.
If your team has many leads but not enough resources to pursue them all, you must prioritize your sales team's time and give them the best possible leads. That means the leads with the highest probability of converting.
For example, the model may find that leads who had more than two phone calls are very likely to convert, but only if they have also spent more than 10 minutes on your website.
I am sure you will agree that this knowledge is extremely powerful in the hands of any marketing and sales team.
Dataset for leads scoring model
When you acquire the lead, it commonly includes information like:
name
demographic
tags / comments
contact details of the lead
Source of origin
Time spent on the website
the number of clicks
number of emails sent
number of phone calls/demos
...
For this post, we are using a popular Lead Conversion dataset from Kaggle. It contains 9240 past leads with 37 columns.
The dataset consists of various attributes such as Lead Source, Total Time Spent on Website, Total Visits, or Last Activity. These may or may not help decide whether a lead will be converted.
The most important variable is the column 'Converted.' It tells whether a past lead was converted to a customer or not.
Our goal: the company wants to identify the most promising leads, also known as 'Hot Leads.'
If they successfully identify this set of leads, the lead conversion rate should go up, because the sales team will focus on communicating with promising leads rather than calling everyone.
Import dataset
In a few mouse clicks, we imported and parsed a CSV file that we previously downloaded from Kaggle.
We can browse through our dataset rows, filter, or search on the View Data tab. We have 37 columns and 9240 rows.
Every uploaded dataset in Graphite has a practical Summary tab. It lets you check, at a glance, the distributions of numeric columns, the number of null values, and various statistical measures.
Image by the author: leads scoring training dataset summary in Graphite
As part of quick exploratory data analysis (EDA), it is always good to check correlations (ready for you on the Correlation tab) in the dataset to understand and "feel" the data better.
Image by the author: leads scoring training dataset correlations in Graphite
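To make the idea of a correlation check concrete, here is a minimal sketch in plain Python. It computes the Pearson correlation between one numeric column and the target. The column values below are invented toy data, not rows from the Kaggle dataset, and this is only an illustration of the concept, not how Graphite computes its Correlation tab.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data (assumption): seconds spent on the website and the 0/1 target.
time_on_site = [10, 200, 35, 900, 1500, 60]
converted = [0, 0, 0, 1, 1, 0]

r = pearson(time_on_site, converted)
print(f"correlation with 'Converted': {r:.2f}")
```

A value close to 1 or -1 suggests a strong linear relationship with the target, while a value near 0 suggests little linear relationship.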
Binary Classification in Machine Learning
Predicting Lead Conversion is a great use case for binary classification in machine learning. Binary, because the target variable we are training the model to predict can have only two states: '0 - not converted' and '1 - converted'.
Run the no-code machine learning model in Graphite
Now we have our dataset uploaded, and we are ready to create a no-code machine learning model in Graphite. We chose the Binary Classification model.
In Graphite, to build a binary classification model, you need:
a binary target column (what are we predicting, with only two distinct states?)
a set of features (other columns that have an impact on the target column)
In just a few mouse clicks, we will define a model Scenario.
Our Target column from our dataset:
Image by the author: target column selection in Graphite
We will select all other columns as features.
Image by the author: feature columns selection in Graphite
Note how Graphite immediately excluded columns that are not appropriate for modeling. Examples:
Prospect ID: it contains 9240 unique values. The column will not be used because it is a categorical value that contains more than 90% unique values.
Magazine: the column will not be used because it is a constant, so it cannot influence the target variable.
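The two exclusion rules above can be sketched in a few lines of plain Python. This is an illustration under our own assumptions, not Graphite's actual implementation: the function name screen_columns, the max_unique_ratio parameter, and the toy rows are all hypothetical.

```python
def screen_columns(rows, max_unique_ratio=0.9):
    """Return {column: reason} for columns that should be excluded.

    rows: list of dicts, one per lead. A column counts as categorical
    here if its first value is a string (a simplification).
    """
    excluded = {}
    n_rows = len(rows)
    for col in rows[0]:
        values = [row[col] for row in rows]
        n_unique = len(set(values))
        if n_unique == 1:
            # A constant column carries no information about the target.
            excluded[col] = "constant"
        elif isinstance(values[0], str) and n_unique / n_rows > max_unique_ratio:
            # Almost every row has its own value, as with an ID column.
            excluded[col] = "mostly unique categorical"
    return excluded

# Toy rows (assumption): an ID-like column, a constant, and a useful category.
leads = [
    {"Prospect ID": f"id-{i}", "Magazine": "No", "Lead Source": src}
    for i, src in enumerate(["Google", "Direct Traffic", "Google", "Olark Chat"] * 5)
]

print(screen_columns(leads))
# {'Prospect ID': 'mostly unique categorical', 'Magazine': 'constant'}
```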
Binary classification Model Results
We will leave all other options on default and run this scenario.
Graphite will take care of several preprocessing steps to achieve the best results, so you don't have to think about them. All these preprocessing steps will occur automatically:
null values handling
missing values
One Hot Encoding
fix imbalance
normalization
constants
cardinality
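To give a feel for what two of these steps do, here is a minimal plain-Python sketch of One Hot Encoding and normalization. Graphite does not require you to write any of this, and we are not claiming this is how it implements the steps internally. The function names and sample values are our own assumptions, and min-max scaling is just one common form of normalization.

```python
def one_hot(values):
    """Turn a categorical column into 0/1 indicator vectors, one slot per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded

def min_max(values):
    """Rescale a numeric column linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Toy columns (assumption), not rows from the real dataset.
categories, encoded = one_hot(["Google", "Direct Traffic", "Google"])
print(categories)  # ['Direct Traffic', 'Google']
print(encoded)     # [[0, 1], [1, 0], [0, 1]]

print(min_max([0, 300, 1200]))  # [0.0, 0.25, 1.0]
```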
Graphite will take a sample of 80% of our data and train several machine learning models. Then, it will test those models on the remaining 20% and calculate relevant model scores. The final best model fit, results, and predictions will be available on the Results tab.
After about 30 seconds, we have our results.
Graphite runs several Machine learning algorithms that work best with binary classification problems by using
80% of the data (7392 rows) for training and
20% (1848 rows) for a test dataset.
The total training time was 36.46 seconds.
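The 80/20 split described above is easy to reproduce. The sketch below shuffles the rows and holds out 20% for testing. The helper name train_test_split_rows and the fixed seed are our own assumptions, and we do not know whether Graphite shuffles or stratifies in exactly this way.

```python
import random

def train_test_split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in for the 9240 leads: we only need the row count here.
rows = list(range(9240))
train, test = train_test_split_rows(rows)
print(len(train), len(test))  # 7392 1848
```

Keeping the test rows out of training is what makes the scores on the test dataset an honest estimate of how the model behaves on unseen leads.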
The best model based on the F1 value is Light Gradient Boosting Machine. Other models' training metrics are listed below.
Image by the author: leads scoring training results in Graphite
The Model Fit tab shows how well the model performs. For the 1848 rows in the test dataset, we compare the Model's predictions of the column 'Converted' to the known, historical outcomes in that column. Model fit is better when the historical and predicted bars are closer.
Image by the author: leads scoring model fit in Graphite
Confusion Matrix - How Did the Model Perform?
Confusion Matrix reveals classification errors. It makes it easy to see whether the Model is confusing two classes. For each class, it summarizes the number of correct and incorrect predictions. The Model predicted column 'Converted' for a test dataset of 1848 rows and compared the predicted outcomes to the historical outcomes.
Image by the author: leads scoring model accuracy in Graphite
Correct Predictions
1762 in total out of 1848 test rows. This gives a Model Accuracy of 95.35%.
True Positives (TP) = 634: a row was 1 and the model predicted a 1 class for it.
True Negatives (TN) = 1128: a row was 0 and the model predicted a 0 class for it.
Errors
86 in total out of 1848 test rows, 4.65%
False Positives (FP) = 35: a row was 0 and the model predicted a 1 class for it.
False Negatives (FN) = 51: a row was 1 and the model predicted a 0 class for it.
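The four counts above come from comparing each prediction with the known outcome, row by row. A minimal sketch of that tally follows. The function name and the six toy rows are our own and are not taken from the test dataset.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP and FN for 0/1 labels."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            # Predicted positive: correct only if the row really was 1.
            counts["TP" if a == 1 else "FP"] += 1
        else:
            # Predicted negative: correct only if the row really was 0.
            counts["TN" if a == 0 else "FN"] += 1
    return counts

# Toy labels (assumption).
actual =    [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]

print(confusion_counts(actual, predicted))
# {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```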
Other Model Scores
Please note that Positive and Negative describe the predicted class, while True and False indicate whether that prediction matched the actual outcome.
Accuracy, (TP + TN) / TOTAL.
Of all the rows (positive and negative), we predicted 95.35% correctly. Accuracy should be as high as possible.
Precision, TP / (TP + FP).
Of all the rows we predicted as positive, 94.77% are actually positive. Precision should be as high as possible.
Recall, TP / (TP + FN).
Of all the rows that are actually positive, we predicted 92.55% correctly. Recall should be as high as possible.
F1 score, 2 * (Precision * Recall)/(Precision + Recall).
The F1 score is 93.65%. It combines Recall and Precision into a single number (their harmonic mean). You cannot have a high F1 score without a strong model underneath.
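As a quick check, the scores above can be recomputed from the four confusion matrix counts reported earlier (TP = 634, TN = 1128, FP = 35, FN = 51). The sketch uses the standard definitions, with precision taken as TP / (TP + FP). The variable names are ours.

```python
# Counts from the confusion matrix on the 1848 test rows.
tp, tn, fp, fn = 634, 1128, 35, 51
total = tp + tn + fp + fn

accuracy = (tp + tn) / total
precision = tp / (tp + fp)   # share of predicted positives that were right
recall = tp / (tp + fn)      # share of actual positives that were found
f1 = 2 * precision * recall / (precision + recall)

for name, value in [("Accuracy", accuracy), ("Precision", precision),
                    ("Recall", recall), ("F1", f1)]:
    print(f"{name}: {value:.2%}")
# Accuracy: 95.35%
# Precision: 94.77%
# Recall: 92.55%
# F1: 93.65%
```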
Feature importance
Feature importance refers to how much this Model relies upon each column (feature) to make accurate predictions. The more a model relies on a column (feature) to make predictions, the more important it is for the Model. Graphite uses a permutation feature importance for this calculation.
Image by the author: leads scoring feature importance in Graphite
The most important feature is column "Tags", then "Last Notable Activity", "Total Time Spent on Website", and so on.
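For readers curious about how permutation feature importance works in principle, here is a small self-contained sketch. It shuffles one column at a time and measures how much accuracy drops. The toy model, the random data, and the function names are our own assumptions for illustration. Graphite's actual implementation is not shown here and may differ in its scoring metric and number of repeats.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows where the model's prediction equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy "model" (assumption): uses feature 0 only and ignores feature 1.
def toy_model(row):
    return 1 if row[0] > 0.5 else 0

data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
print(importances)
```

Shuffling the feature the model relies on destroys its accuracy, so that feature gets a large importance. Shuffling the ignored feature changes nothing, so its importance is exactly zero.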
In Graphite, it is very easy to check a column like "Tags" against our target column ("Converted"). Most leads that converted have a tag value of "Will revert after reading the email".
Predictions for the New Leads
Graphite automatically deployed the trained Model. That means it is straightforward to predict, for new, unseen leads, whether they will convert and the probability of that outcome.
Imagine that your marketing team informs you about a new lead after you have trained the Leads Scoring Model with Graphite.
You can check whether that lead will convert and with what probability. A powerful tool to keep you focused only on high-quality leads!
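Once each new lead has a predicted probability, prioritizing the sales team's time is mostly a matter of sorting. The sketch below assumes the probabilities have already been produced by the model. The lead names, the probability values, the rank_hot_leads helper, and the 0.5 threshold are all hypothetical.

```python
def rank_hot_leads(scored_leads, threshold=0.5):
    """Keep leads at or above the threshold, highest probability first."""
    hot = [lead for lead in scored_leads if lead["probability"] >= threshold]
    return sorted(hot, key=lambda lead: lead["probability"], reverse=True)

# Hypothetical model output for three new leads.
scored = [
    {"lead": "A", "probability": 0.91},
    {"lead": "B", "probability": 0.12},
    {"lead": "C", "probability": 0.67},
]

print([lead["lead"] for lead in rank_hot_leads(scored)])  # ['A', 'C']
```

The right threshold depends on how many leads your team can realistically follow up on, so treat 0.5 as a starting point rather than a rule.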
I hope this helped you understand how easy it is to train models in no-code machine learning software like Graphite Note. With just a few mouse clicks, we were able to predict lead conversion.
You can explore all other Graphite Models here. Feel free to train your machine learning model on any dataset with the same ease or schedule a demo if you need any help or have any questions.
Author Bio
Hrvoje Smolic, born in 1976 in Zagreb, Croatia, is the accomplished Founder and CEO of Graphite Note. He holds a Master's degree in Physics from the University of Zagreb. In 2010 Hrvoje founded Qualia, a company that created BusinessQ, an innovative SaaS data visualization software utilized by over 15,000 companies worldwide. Continuing his entrepreneurial journey, Hrvoje founded Graphite Note in 2020, a visionary company that seeks to redefine the business intelligence landscape by seamlessly integrating data analytics, predictive analytics algorithms, and effective human communication.
Graphite Note simplifies the use of Machine Learning in analytics by helping business users to generate no-code machine learning models - without writing a single line of code.