Spring 2020 | Marketing Model | Retail Data Analysis
In this project, our team helped Retail Relay, a grocery delivery service company, improve its customer retention rate. We used a customer-level purchase database to identify the distinct characteristics of retained customers and to advise Retail Relay on how to increase retention among its not-so-regular customers.
Our research question was which of the eight independent variables (IVs) significantly influence customer retention. Because the dependent variable (DV), whether a customer was retained, is binary, we first fit Logit and Probit models and used the Wald test to determine which IVs significantly affect the DV.
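As a rough illustration of how a Wald test works on a binary outcome, the sketch below fits a one-predictor logit by Newton-Raphson in plain Python and reports the slope, its odds ratio, and a two-sided Wald p-value. The toy data (`emails_sent`, `retained`) and every function name are made up for illustration; this is not the Retail Relay data or the project's actual code, and the real analysis used eight predictors and would normally rely on a statistics package.

```python
import math

def score_and_info(b0, b1, x, y):
    """Score (gradient) and Fisher information of the logit log-likelihood."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # P(retained = 1 | x)
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    return (g0, g1), (i00, i01, i11)

def fit_logit(x, y, iters=25):
    """Newton-Raphson fit of P(y = 1) = sigmoid(b0 + b1 * x)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        (g0, g1), (i00, i01, i11) = score_and_info(b0, b1, x, y)
        det = i00 * i11 - i01 * i01
        # Newton step: beta += inverse(information) @ score (2x2 closed form).
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    # Standard errors: square roots of the diagonal of the inverse information.
    _, (i00, i01, i11) = score_and_info(b0, b1, x, y)
    det = i00 * i11 - i01 * i01
    return (b0, b1), (math.sqrt(i11 / det), math.sqrt(i00 / det))

def wald_p_value(coef, se):
    """Two-sided p-value for H0: coef = 0, using z = coef / se."""
    z = coef / se
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical toy data: emails sent per customer and whether they were retained.
emails_sent = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
retained = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

(b0, b1), (se0, se1) = fit_logit(emails_sent, retained)
odds_ratio = math.exp(b1)  # multiplicative change in odds per extra email
p_value = wald_p_value(b1, se1)
print(f"slope={b1:.3f}  odds ratio={odds_ratio:.3f}  Wald p={p_value:.3f}")
```

A positive slope corresponds to an odds ratio above 1, meaning each additional email multiplies the odds of retention by that factor. Whether the effect counts as significant depends on the Wald p-value and the chosen threshold.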
The results showed that both the Logit and Probit models are significant. The number of emails sent, the email open rate, a subscription to paperless communication, and a subscription to automatic refill all increase the odds of customer retention. On the other hand, average order size has a negative relationship with the odds of customer retention.
Based on the Logit and Probit results, we decided to explore further which variables significantly affect how long a customer stays with the store before churning. For this research question, the dependent variable is the duration (time), and the independent variables are the same as in the previous models.
We conducted the survival analysis using the Kaplan-Meier (KM) estimator, the Weibull AFT model, the Weibull PH model, and the Cox PH model. The KM plots indicated that a paperless subscription slightly improves the overall survival probability of customer retention; however, the difference between the survival curves is not pronounced. For automatic order refill and doorstep delivery, customers who subscribed to those services have a much higher survival probability. However, the variance of the survival estimates for those two groups is probably larger, and further analysis might be needed.
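To make the KM comparison concrete, here is a minimal plain-Python sketch of the Kaplan-Meier product-limit estimator applied to two groups. The durations, the group labels, and the function name `kaplan_meier` are hypothetical and chosen only for illustration; they are not the project's data or code.

```python
def kaplan_meier(durations, churned):
    """Return [(t, S(t))] at each distinct time where a churn was observed.

    churned[i] is 1 if customer i churned at durations[i], and 0 if the
    customer was still active at that time (right-censored).
    """
    event_times = sorted({t for t, c in zip(durations, churned) if c})
    surv, curve = 1.0, []
    for t in event_times:
        # Everyone whose duration reaches t is still at risk just before t.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, c in zip(durations, churned) if d == t and c)
        surv *= 1.0 - events / at_risk
        curve.append((t, surv))
    return curve

# Hypothetical durations in months for two made-up groups of five customers.
refill_curve = kaplan_meier([4, 6, 6, 9, 12], [1, 0, 1, 0, 0])
no_refill_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])

for label, curve in (("refill", refill_curve), ("no refill", no_refill_curve)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

In this toy example the refill group's curve stays above the other group's, which is the kind of gap the KM plots are read for. With only a handful of customers per group the estimates are very noisy, which is one reason a visual gap alone is not conclusive.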
Customers who had not churned by the end of the observation period are right-censored: we know only that they stayed at least as long as observed. Therefore, we used the Weibull AFT, Weibull PH, and Cox PH models, which account for censoring, to determine which variables increase the duration of customer retention. The models consistently show that the number of emails sent, a subscription to automatic refill, and order frequency increase the time that customers stay with Retail Relay. On the other hand, the paperless option decreases the time customers stay, even though it might increase the customer retention rate according to the Logit, Probit, and KM results.
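The role of right censoring can be sketched with a covariate-free Weibull model: a customer who churned contributes the density to the likelihood, while a customer who is still active contributes only the survival probability. The code below is an illustrative assumption, not the project's implementation; the data, the fixed shape `rho`, and the function names are hypothetical, and the parameterization S(t) = exp(-(t / lam) ** rho) is one common choice.

```python
import math

def weibull_loglik(durations, churned, lam, rho):
    """Log-likelihood of a Weibull model with right-censored observations."""
    total = 0.0
    for t, c in zip(durations, churned):
        if c:
            # Observed churn: add the log-hazard, log h(t).
            total += math.log(rho / lam) + (rho - 1.0) * math.log(t / lam)
        # Every customer contributes log S(t) = -(t / lam) ** rho.
        total -= (t / lam) ** rho
    return total

def weibull_scale_mle(durations, churned, rho):
    """Closed-form MLE of the scale lam when the shape rho is held fixed."""
    n_events = sum(churned)
    return (sum(t ** rho for t in durations) / n_events) ** (1.0 / rho)

# Hypothetical durations in months; churned = 0 marks a right-censored customer.
durations = [2, 3, 3, 5, 8]
churned = [1, 1, 0, 1, 0]
rho = 1.5  # assumed shape, fixed here to keep the sketch short

lam_hat = weibull_scale_mle(durations, churned, rho)
lam_naive = weibull_scale_mle(durations, [1] * len(durations), rho)
print(f"scale with censoring handled: {lam_hat:.2f}")
print(f"scale if censored customers are treated as churned: {lam_naive:.2f}")
```

Treating still-active customers as if they had churned shrinks the estimated scale, so ignoring censoring would understate how long customers stay. The AFT and PH models in the analysis extend this idea by letting the covariates shift the scale or the hazard.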
From the above survival analysis, we can conclude that customers who buy more frequently and spend more money each time stay longer with Retail Relay. Customers who subscribed to emails and automatic refills also stay longer. Even though the paperless option helps with customer retention, it does not help customers stay longer. Based on these findings, we recommended that Retail Relay offer a discount on the weekly or monthly automatic refill option. Meanwhile, it could also develop an integrated email marketing strategy that gives customers valuable information from a reliable source. This helps build a long-term relationship with customers, which would drive the customer retention rate up while keeping customers longer.