Using Logistic Regression to Predict Customer Retention
 Introduction
The aim of this project will be to generate insights on customer preference, create a logistic regression machine learning model, to equip both primary and secondary stakeholders with insights to make informed decisions on optimizing marketing strategies, sales targeting efforts, user experience on the firm's web platforms, and to innovate services and products, leading to the achievement of business goals and increase in revenue.Â
To identify the top-performing marketing channels.
To understand the top-performing(top selling and/or revenue generating) product category.
To measure customer conversion rates.
To create a machine learning model using logistic regression to make predictions on what segment of current customers are willing to make subsequent purchases from the web platform.
The logistic regression machine learning algorithm will be created directly from the database using BigqueryML.
Project Structure
 Exploratory Data Analysis
Perform data wrangling using SQL to access the number of site visits, measure the conversion rate and select a machine learning model to make predictions.
With the results of these queries, we can formulate a hypothesis on how our customers are attracted to our web platform and what exactly our customers expect from our service.Â
To obtain further knowledge as to how relevant a brand is to prospective customers, I need to understand how these web users find us. Answering the business question of what marketing channels and strategies are our most efficient.
Exploring on, I write a query to understand which of our products are in top demand, this helps us to further understand our brand identity among our customers. This query also helps me to understand the revenue rate generated from specific categories and products.
Model Selection
However, to increase the accuracy of these strategies, I selected a logistic regression machine learning algorithm. This algorithm will inform us of the probable likeliness of current customers making subsequent purchases from the firm's web platform.
Observing the training specifications of the regression model.
This is the schema of the ML model, these are the features I have selected for the model, and the expected output is a single column that is named 'predicted_will_buy_on_return_visits' which predicts the customer segment which will return to engage with our web platform and make a purchase from the site.
I have explored the data, created a regression model to make predictions, however, to understand how well our model performed, I have evaluated it and these are the results.
 Feature Engineering
The features selected for the algorithm is a combination of the time spent by each customer on the firm's web platform, the bounce rate for each web session and the unique visitor identification number. The goal of this algorithm will be to predict which customers will make subsequent purchases from the web platform.
This basically means re-engineering the model to be more accurate and to clearly understand the relationship which exists between the customer's first visit, total time spent on the site and how far the customer progressed into the conversion funnel.
Summary
The goal of this data analytics project is to identify the possibility that a customer segment will make subsequent purchases from the firm's web platform.Â
To achieve this, I performed exploratory data analysisÂ
To achieve this, I performed exploratory data analysisÂ
- To understand the top-selling products.
- The average amount of time spent on the site by each customer before making a purchase.
- Revenue generated from each product category.