Soren DeOrlow

Leveraging multiple machine learning models to gain insights into evolving customer behavior during the transition to a post-pandemic new normal

December 8, 2022 · IDSN 590 · Directed Research · with Filiz Osman-Calvo, Katrina Manrique, Junyi Li, Gbemi Giwa, Terrance Hunter, Sydnie Klett, Jiwoo Jeon, Sarah Maines

Pandemic-era signal in consumer data: what Peloton's rise and reversal reveal about subscription retention when behavior snaps back.

CREDITS

The research and analytical models in this paper reflect contributions by analytics students at the USC Marshall School of Business. The research and analytics team consisted of Filiz Osman-Calvo, Katrina Manrique, Junyi Li, Gbemi Giwa, Terrance Hunter, Sydnie Klett, Jiwoo Jeon, and Sarah Maines. The data analysis was done in the following courses: DSO 566 Marketing Analytics and DSO 528 Data Warehousing, Business Intelligence and Data Mining.

ABSTRACT

A pandemic can be seen as a “black swan” event, an occurrence that causes a ripple effect globally, ecologically, socially and economically. This event presents a unique opportunity for researchers to observe shifts in behavior that have great consequences. Businesses such as Peloton were able to offer services that met customer needs during a unique period of time. Would thriving pandemic businesses be sustainable, or would they experience a reduction in customer retention once the pandemic was over?

RESEARCH PROBLEM

Students in a marketing analytics course at USC Marshall were tasked with identifying a unique business problem to research. The focus of the research was the at-home fitness equipment company Peloton. Peloton achieved great growth during the pandemic, when customers were forced to work out from home. During the spring of 2022, employers began mandating that workers return to the office. Together with new mask guidance from the CDC on February 25, this signaled that a broader shift in attitudes towards the pandemic was emerging. March 11, 2022 marked the two-year anniversary of the WHO declaring COVID-19 a global pandemic. Across America, the end of the pandemic was seen as being near, inspiring an embrace of the “new normal.” Many businesses that had experienced great growth during the pandemic, such as delivery services, online retailers, subscription-based entertainment, and at-home fitness equipment, were now facing competition and threats to customer retention.

The primary research problem was whether we could discern a loss in customer retention. The null hypothesis, H0, was that customer retention was not declining. Rejecting it would require the data to reveal a statistically significant shift in customer behavior.

If the data were to show a declining customer retention rate, the next research problem would be to determine why engagement with Peloton was declining. It would be important to determine what factors might drive customers away from Peloton. Would the in-home context be a determining factor that would cause customers to seek alternative ways to work out? Does Peloton’s portfolio of offerings sufficiently satisfy customers, or would they be drawn to gyms and bootcamps where they might find social connection and camaraderie?

It would also be important to evaluate and define customer engagement on the Peloton platform. Peloton users engage with the platform in a number of ways through products such as the bike, treadmill and app. When customers purchase a product, they also subscribe to a monthly membership that affords them access to workout programming. Product usage often confines customer workout behavior to the home, while the app affords greater flexibility and extends the workout occasion to anywhere a user would like to work out, including running outdoors. Could shifting behaviors across the platform indicate or predict the departure of a Peloton customer?

A final research problem or challenge would be to collect data from Peloton users within a short timeframe and with limited resources. Would it be possible to gather enough data, and the right data, to understand the nature of shifting behaviors?

Before any research planning began, an initial effort was made to evaluate what Peloton data sources might exist and be publicly available. Two datasets were published on Kaggle, but the subject matter was not relevant to the research challenge on customer retention. Another potential source of data is individual workout data from each Peloton product. In early 2021, Peloton’s API had a vulnerability that provided access to 1 million connected user accounts and visibility into a wide range of user behavior across the platform. This leaky API was reported on TechCrunch and later fixed, but the data that was available at the time revealed the scope of information that Peloton has on its user population.

Data made available through Peloton’s previous API.

Datapoint
User IDs
Instructor IDs
Group Membership
Location
Workout stats
Gender and age
If they are in the studio or not

With access to data from approximately one million riders across the platform, Peloton would theoretically be able to detect longitudinal shifts in user behavior. However, rider data alone would not reveal the full scope of customer retention. Additional data and insights would need to be gathered to understand the drivers behind changes in user behavior.

RESEARCH DESIGN

To gather insight into the factors impacting customer retention, we chose to create a survey so that we could directly engage Peloton customers. A survey would provide original data and yield results quickly. Conducting the survey required careful consideration of which sample population to involve in the study. The survey itself needed to collect data that would help determine whether a statistically significant change in customer retention was occurring and which factors were impacting customer behavior.

The survey was intended to gather four areas of data: user behaviors, user behavior over time, user attitudes and demographic information. Across the four areas of data collection, the survey had a total of 19 questions.

Dimension
User behaviors
User behavior over time
User attitudes
Demographics

The first area of data collection focused on user behaviors related to fitness activities. These survey questions probed the various fitness activities users engaged in, how frequently they engaged in them, and how Peloton products were used to support them. The broader objective of collecting these fitness behaviors was to determine the level of fitness engagement, the diversity of fitness activities and how well Peloton products were supporting users in their various fitness pursuits.

The second area of data collection focused on user behavior over time. The survey covered four time components: whether the participant's Peloton use had increased in the past six months, whether it had decreased in the past six months, whether it had not changed over six months, and whether the participant had owned or used Peloton for less than six months. The broader objective of collecting this longitudinal data was to determine an increase or decrease in Peloton usage over six months.

The third area of data collection focused on user attitudes towards Peloton as well as general attitudes and preferences pertaining to fitness activities. The survey questions probed a participant’s preferred location for fitness activities and their preferences or priorities for those activities. The broader objective of collecting this data was to determine overall perceptions of Peloton and the attitudinal relationship to fitness activities and preferences. Was Peloton perceived as a solution to participants’ preferred fitness priorities, activities and location? Were participants’ preferred fitness priorities, activities and location falling within or outside of Peloton’s strengths?

The fourth area of data collection focused on the demographic characteristics of each participant. The survey questions asked for a participant’s gender, age and annual income; location data was collected automatically by Qualtrics.

DATA COLLECTION

The research focused on determining whether customer churn was happening among Peloton users. In early 2022, there were three million active Peloton subscribers. To gain insight into this population, we chose to sample users across the full Peloton user journey. This meant including current Peloton users as well as sellers who were actively attempting to sell their Peloton bikes after having given up on the brand.

At the time of our research, we did not have access to any relevant existing user data and therefore had to collect data ourselves by engaging the public. There was no option other than to seek voluntary participation in our survey, knowing that this might introduce bias into the data. Because the survey would be shared in this way, best efforts were made to design it for ease of use. The survey was designed to be completed on a mobile device, and its questions were multiple choice, which made it more convenient for participants to take part in the study. In internal testing, completing the survey took less than 3 minutes. Once all the survey data had been collected, the average completion time proved to be a little over 5 minutes.

Another challenge of the research was collecting data and recruiting participants. Our initial statistical power study suggested that 50 participants would be needed. Without access to a pool of active users, research would not be possible. The quest for Peloton users led to the exploration of active online communities and later the discovery of Facebook groups. There were numerous vibrant communities containing thousands of users.

The recruitment strategy centered on deploying the survey to Peloton Facebook Groups. To deploy a survey to a Facebook Group, a Facebook user must first apply and be accepted into the group. Applications were made to four Peloton Facebook groups, and only one, Peloton Group 2, provided immediate acceptance. This Facebook group had 9.5 thousand members, and our outreach to them yielded 317 responses. Recruitment of Peloton sellers involved sending the survey to 100 Craigslist listings in metropolitan cities across the United States, and our outreach to these sellers yielded 5 responses. The seller dataset was ultimately abandoned because it contained incomplete responses and too few responses overall.

| Name of FB Group | # of Members | Acceptance into FB Group |
|---|---|---|
| Group 1 | 35.4K members | Delayed |
| Group 2 | 9.5K members | Yes |
| Group 3 | 3.2K members | Delayed |
| Group 4 | 1.3K members | Delayed |

| Name of FB Group | # of Members | Survey Responses |
|---|---|---|
| Peloton Runners Facebook Group | 9.5K Members | 317 Responses |
| Craigslist Sellers | 100 Listings | 5 Responses |

On the initial day that the survey was launched, there were exactly 100 responses and four days later that total exceeded 300. Ultimately, this response rate exceeded the initial goal of 50 responses.

It is important to acknowledge the potential biases and limitations of recruiting through Facebook Groups, which produces a voluntary response sample. There was an enthusiastic response to the survey on Facebook, with participants expressing their interest in the research. This enthusiasm suggests that our sample might be biased towards participants who are engaged or satisfied with their Peloton experience and may not accurately represent the population of Peloton users. Based on the limited responses from Peloton sellers, those who are less engaged with Peloton or have abandoned the brand may not be interested in participating in a survey about their experience. Ultimately, analysis of the data collected in this study may not fully reflect the extent of customer churn happening at Peloton.

DATA CLEANING

The Qualtrics survey contained four data collection areas with a total of 19 questions, all optimized for response on a mobile device. The resulting dataset contained both categorical and numerical variables. Fortunately, the survey had a high completion rate, which produced a relatively clean dataset. Beyond routine removal of extraneous data created by Qualtrics, the key data cleaning challenge involved managing categorical data. The various data types can be seen in the table below.

| Variable | Data Type | Sub Type |
|---|---|---|
| Gender | Categorical | Nominal |
| Location | Categorical | Nominal |
| Peloton Product | Categorical | Nominal |
| Fitness Behavior | Categorical | Nominal |
| Fitness Preference or Context | Categorical | Nominal |
| Fitness Frequency | Categorical | Ordinal |
| Age | Numerical | Continuous |
| Salary | Numerical | Continuous |

In some survey questions, participants were given the option to provide multiple answers, so that they could list all answers that applied to them. This created cells holding multiple comma-delimited categorical values. It presented a unique data cleaning challenge: expanding each question into multiple binary dimensions while maintaining the integrity of each tuple. In the example below, a participant can select up to three answers for the question, and each cell may contain any combination of those answers. In this form, the column could not be used effectively in analysis.

| Response ID | If you use Peloton, which format(s) do you use? |
|---|---|
| 1 | Bike, App Workout |
| 2 | Bike, Tread, App Workout |
| 3 | Bike, App Workout |

Therefore, this data would need to be converted into new dimensions with binary values. This can be seen in the example below.

| Response ID | App Workout | Bike | Tread |
|---|---|---|---|
| 1 | 1 | 1 | 0 |
| 2 | 1 | 1 | 1 |
| 3 | 1 | 1 | 0 |

Executing this data conversion in Python requires the pandas package, specifically the string method str.split with the option expand=True (referred to here as “series expand”) combined with the function get_dummies. The split evaluates every categorical value within each cell of a given column, and get_dummies converts each value into a new column of binary data.

In Python, the code to execute this data conversion begins by importing csv, pandas and numpy.

import csv
import numpy as np
import pandas as pd

Each column needing conversion must be listed.

columns = ['Response ID', 'Q1', 'Q2', 'Q3', 'Q4', 'Q5', 'Q6', 'Q7', 'Q8', 'Q9', 'Q10', 'Q11', 'Q12', 'Q13', 'Q14', 'Q15', 'Q16', 'Q18']

Pandas then reads the csv file and converts it into a dataframe.

sample = pd.read_csv('PelotonSurveyDataset_Cleaned_Headers_.csv')
sample

The pandas function get_dummies converts a column of categorical values into a set of binary indicator columns. Applied directly to a multi-answer column, it treats each whole cell, such as “Bike, App Workout,” as a single category, so the individual answers are not yet separated.

df = pd.get_dummies(sample['Q1'].apply(pd.Series), prefix='Q1')
df

To separate the answers within each cell, pandas must know how to divide each string. In this case the string is split on commas. The boolean parameter expand=True places each piece in its own column; the pieces are then converted to binary columns and collapsed back to one row per respondent.

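# Split each answer on commas, stack the pieces, trim stray spaces, one-hot encode,
# then take the max per respondent (index level 0) to get one row of 0/1 flags.
df = pd.get_dummies(sample['Q1'].str.split(',', expand=True).stack().str.strip(), dtype=int).groupby(level=0).max()
df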

Once this process is complete a new dataframe is created and saved as a new csv file.

display(df)
df.to_csv('PelotonSurveyDataset_Columnized_BinaryValuesQ1.csv')
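
As a cross-check on the conversion above, pandas also provides Series.str.get_dummies, which splits on a separator and builds the indicator columns in a single call. The sketch below is illustrative only: the column name formats and the three rows are assumptions that mirror the example table, not the survey data.

```python
import pandas as pd

# Toy stand-in for one multi-select survey column (hypothetical name and values).
toy = pd.DataFrame(
    {"formats": ["Bike, App Workout", "Bike, Tread, App Workout", "Bike, App Workout"]},
    index=pd.Index([1, 2, 3], name="Response ID"),
)

# Normalize ", " to "," so the separator is a single character, then let pandas
# split each cell and build one 0/1 indicator column per distinct answer.
binary = toy["formats"].str.replace(", ", ",", regex=False).str.get_dummies(sep=",")

print(binary)
```

If the two approaches disagree on a column, stray whitespace around an answer is the first thing to check.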

When dealing with a larger, more complicated series of data, this process becomes especially valuable for saving time. A good example is the very first question in the survey, which asks about preferred fitness activity. This question presents a more complicated and elusive landscape of data that becomes immediately clearer once the column is split and expanded with pandas.

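| | Biking | Fitness class | Flexibility training | Hiking | Other | Running | Swimming | Weight training |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 2 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 3 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 4 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 5 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 312 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 313 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 314 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 315 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 316 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |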

317 rows × 8 columns

A limitation of this split-and-expand approach is that it converts a single series or column at a time. The process must be repeated for each multi-answer column rather than applied to an entire dataset at once.
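
One way to ease the column-at-a-time repetition is to wrap the conversion in a small helper and loop over the multi-answer columns. The following is a minimal sketch under stated assumptions: the helper name expand_multiselect, the column names and the toy rows are all hypothetical and do not come from the survey file.

```python
import pandas as pd

def expand_multiselect(frame, columns, sep=","):
    """Replace each multi-answer column with prefixed 0/1 indicator columns."""
    parts = [frame.drop(columns=columns)]
    for col in columns:
        # One long Series of individual answers, indexed by (row, position).
        pieces = frame[col].fillna("").str.split(sep, expand=True).stack().dropna().str.strip()
        pieces = pieces[pieces != ""]
        # One-hot encode, then collapse back to a single row per respondent.
        dummies = pd.get_dummies(pieces, dtype=int).groupby(level=0).max()
        # Respondents who skipped the question get all zeros.
        dummies = dummies.reindex(frame.index, fill_value=0).add_prefix(f"{col}_")
        parts.append(dummies)
    return pd.concat(parts, axis=1)

# Hypothetical toy data with two multi-answer questions.
toy = pd.DataFrame({
    "Response ID": [1, 2, 3],
    "activity": ["Running, Hiking", "Running", None],
    "formats": ["Bike, App Workout", "Bike, Tread, App Workout", "Bike"],
})
wide = expand_multiselect(toy, ["activity", "formats"])
print(wide)
```

Prefixing each new column with its question name keeps answers such as “Other” from colliding when several questions share the same choice.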

Another data cleaning task was dealing with demographic data. One challenge was ensuring that the dataset was free of collinearity. Collinearity occurs when two dimensions have an inherent linear relationship, such as “wins” vs “losses.” This became a consideration with demographic data, specifically gender, where each binary column is fully determined by the other. To avoid this, the dimension “female” was retained and “male” was removed.
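
The redundancy between the two gender columns can be seen in a small sketch. The four responses below are illustrative values, not survey data.

```python
import pandas as pd

# Illustrative responses only; the real survey values are not reproduced here.
demo = pd.DataFrame({"gender": ["Female", "Male", "Female", "Female"]})

# Full one-hot encoding yields two columns that always sum to 1,
# so each column is perfectly determined by the other.
full = pd.get_dummies(demo["gender"], dtype=int)

# Keep a single indicator, as was done with "female" in this dataset.
reduced = full.drop(columns="Male").rename(columns={"Female": "female"})

print(full.sum(axis=1).unique())
print(reduced)
```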

Additional data cleaning tasks involved converting “age” and “salary” to numerical data types. These data were originally collected as multiple choice questions, where participants chose from a set of age ranges or salary ranges. The midpoint of the chosen range was substituted for each participant's answer. An alternate approach would have been to draw a random value within the chosen range. This could have been done with the randrange() function in Python’s random module, which returns a random integer for age or salary.
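
Both options, the midpoint substitution and the random draw, can be sketched as follows. The age brackets here are hypothetical placeholders, since the survey's actual answer choices are not listed in this paper.

```python
import random
import pandas as pd

# Hypothetical age brackets; the survey's actual answer choices may differ.
AGE_BRACKETS = {"18-24": (18, 24), "25-34": (25, 34), "35-44": (35, 44), "45-54": (45, 54)}

def bracket_midpoint(label):
    """Midpoint of the chosen range (the approach used in this study)."""
    low, high = AGE_BRACKETS[label]
    return (low + high) / 2

def bracket_random(label, rng):
    """Random integer inside the chosen range (the alternate approach)."""
    low, high = AGE_BRACKETS[label]
    return rng.randrange(low, high + 1)  # randrange excludes the upper bound

answers = pd.Series(["25-34", "45-54", "18-24"])
age_midpoint = answers.map(bracket_midpoint)

rng = random.Random(0)  # fixed seed so the draw is repeatable
age_random = answers.map(lambda label: bracket_random(label, rng))

print(age_midpoint.tolist())
print(age_random.tolist())
```

The midpoint keeps the data deterministic, while the random draw restores some within-bracket variation at the cost of adding noise that was never observed.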

The availability of Python packages such as pandas and numpy afforded seamless data conversion and ensured a clean dataset for analysis.

DATA ANALYSIS

The initial research problem called for descriptive, predictive and prescriptive approaches to analysis. Descriptive analytics are needed to determine if Peloton had a customer retention issue. Predictive analytics could provide insights into the drivers of customer churn. Ultimately, prescriptive analytics could provide strategies for addressing possible customer churn.

A number of machine learning models are relevant to the research problem and these approaches to analysis. The models considered were linear regression, k-means clustering, decision trees and neural networks.

The first challenge of the research involved descriptive analytics to determine whether customer churn was present. Within the survey, 31 participants responded that their Peloton use had been reduced over a 6 month period, which equated to roughly 10% of the research sample. This datapoint supported our alternative hypothesis that customers were leaving Peloton. This data dimension became the dependent variable for use in further analysis. The table below is an overview of the variables in the dataset.

| Dimension | Dependent Variable | Independent (Predictor) Variables |
|---|---|---|
| User behaviors | | 41 |
| User behavior over time | 1 (6 month reduction in Peloton use) | 3 |
| User attitudes | | 46 |
| Demographics | | 3 |

The second challenge of the research required a predictive approach to customer retention. This analytical approach required data mining to predict when a churn event would occur. The tools used for data mining were partitioning with decision trees and deep learning with neural networks. Decision trees, which partition data to reduce entropy, made it possible to see the impact of specific variables on prediction rates. Neural networks provided higher performance, but because of the opacity of this form of machine learning, the hidden layers of the black box provide less direct clarity on the actual drivers of prediction.
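
To make “partitioning to reduce entropy” concrete, the sketch below computes the entropy of a churn label and the information gain from one candidate split. The eight labels and the “every day” flag are invented for illustration and are not the survey data; the study itself did not necessarily use this code.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(labels, split_mask):
    """Entropy reduction from splitting labels into mask-true and mask-false groups."""
    left = [y for y, m in zip(labels, split_mask) if m]
    right = [y for y, m in zip(labels, split_mask) if not m]
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

# Invented example: 1 = reduced Peloton use, 0 = no reduction.
churn = [1, 1, 0, 0, 0, 0, 0, 0]
every_day = [False, False, True, True, True, True, False, False]

gain = information_gain(churn, every_day)
print(round(entropy(churn), 4), round(gain, 4))
```

A tree-building algorithm evaluates many such candidate splits and keeps the one with the largest gain at each node.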

The decision tree table below represents the best-performing model. This model included three variables: Every day, 3-5 times per week, and Exclusive. It correctly predicted 5 churn cases in the training set and 1 in the testing set. The training set produced a sensitivity of 33% and a specificity of 100%, while the testing set produced a sensitivity of 50% and a specificity of 90.44%. The decision tree therefore performed well in predicting non-churn but struggled to predict churn accurately, which was also visible in the imbalanced confusion matrix.

Attention therefore turned to a neural network, which was run in JMP software. Thirteen variables were used in the neural network (see appendix for the table of variables). The model used two hidden layers, with 3 and 60 neurons respectively, and the tanh activation function. The profiler did not show any irregular movement, and the area under the ROC curve was 0.8599 for the testing data. The neural network was effective in identifying churn; however, a higher number of false positives caused accuracy to drop to 70.3% in the test set.
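
The relationship between sensitivity, specificity and accuracy can be sketched from confusion-matrix counts. The counts below are hypothetical, chosen only to show how an imbalanced sample lets a model score high accuracy while missing most churners; they are not the study's confusion matrix.

```python
def classification_rates(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of actual churners caught
        "specificity": tn / (tn + fp),  # share of non-churners correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical test-set counts (not the study's results).
model = classification_rates(tp=1, fn=1, tn=50, fp=10)

# A trivial rule that never predicts churn, on the same hypothetical sample.
never_churn = classification_rates(tp=0, fn=2, tn=60, fp=0)

print(model)
print(never_churn)
```

Because churners are rare, the trivial rule scores higher accuracy than the model while catching no churn at all, which is why sensitivity is reported alongside accuracy.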

The final research challenge and analytical approach to customer retention involved understanding the characteristics of customer churn through customer segmentation. Segmentation was done with a machine learning method called k-means clustering. This method uses Euclidean distance to determine a centroid for each cluster or customer segment. The centroid is built on a limited set of variables. Once the set of variables is defined, the number of clusters must be chosen, such as k=3 or k=4. The segments are then generated through unsupervised learning, which uses the data itself to find patterns and creates segments based on mathematically calculated distances. Below are the clusters that were generated using k=4. Ten variables were used to generate the four clusters.

| Variable | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 |
|---|---|---|---|---|
| Average Age | 40 | 43 | 38 | 53 |
| % Female | 98% | 95% | 90% | 92% |
| Average Income | $ 60,417 | $ 81,959 | $ 100,000 | $ 100,000 |
| 6mo Reduc | 0.19 | 0.08 | 0.08 | 0.08 |
| Q5 App | 0.67 | 0.54 | 0.69 | 0.62 |
| Q7 Monthly | 0.02 | 0.02 | 0.03 | 0.01 |
| Q16 Everyday | 0.33 | 0.52 | 0.49 | 0.59 |
| Q2 Outdoors | 0.21 | 0.24 | 0.20 | 0.22 |
| Q3 Privacy | 0.31 | 0.30 | 0.24 | 0.16 |
| Q15 Expensive | 0.33 | 0.31 | 0.23 | 0.17 |
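
The assign-and-update loop behind k-means can be sketched in a few lines of NumPy. This is an illustration of the general technique, not the software used in the study; the eight two-dimensional points and the starting centroids are invented, and in practice the variables would be standardized and several random starts compared.

```python
import numpy as np

def assign(points, centroids):
    """Label each point with the index of its nearest centroid (Euclidean distance)."""
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

def kmeans(points, initial_centroids, n_iter=100):
    """Alternate assignment and centroid updates until the centroids stop moving."""
    centroids = np.asarray(initial_centroids, dtype=float)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(points, centroids), centroids

# Invented toy data: two well-separated groups of four points each.
points = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.],
                   [10., 10.], [10., 11.], [11., 10.], [11., 11.]])
labels, centroids = kmeans(points, initial_centroids=points[[0, 4]])

print(labels)
print(centroids)
```

Each row of the cluster table above corresponds to one coordinate of a centroid: the average of that variable across the respondents assigned to the cluster.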

DATA INSIGHTS AND MODELING

The benefit of k-means clustering is having the ability to uncover and identify shared attributes within a customer segment. This approach revealed four unique Peloton customer segments. Two of the segments presented a threat to customer retention. What was revelatory and fascinating is the dimensionality of the four segments. It was possible to see how each of the four segments varied in behavior, attitude and demographics. In the table below are descriptions of each segment which are informed by each of the dimensions in the cluster.

| Cluster | Segment Name | Description | Risk to retention |
|---|---|---|---|
| Cluster 1 | Peloton Equipment Reducers | This segment is younger and has less disposable income. They perceive Peloton as expensive and have used Peloton less frequently in the last 6 months. | High |
| Cluster 2 | Peloton Outsiders | This segment prefers working out in the outdoors more than other segments. They are also inclined to perceive Peloton as more expensive. Their workout behaviors tend to be of higher frequency. They are less inclined to use the app. | Moderate |
| Cluster 3 | Peloton App Engagers | This is the youngest segment with more men. This is an affluent group who engages highly with the app. | Low |
| Cluster 4 | Peloton Seasoned Loyalists | This is the oldest segment with most disposable income and engagement with the brand. This group is engaged across all platforms and exercises with Peloton frequently. | Low |

What this ultimately reveals is that Cluster 1 poses the greatest risk to retention. Cluster 1 has the lowest engagement with Peloton, the lowest disposable income, and a perception of Peloton as expensive. As a result of this analysis, Cluster 1 will require additional customer retention efforts if Peloton is to retain them. These targeted marketing efforts would focus on the broad segment or population of customers.

Data mining, by contrast, would focus in on specific customers to identify where customer churn is happening. The decision tree and neural network models were able to predict churn, but with low accuracy. The best model was 70% accurate, and any targeted marketing campaign using this method would be inefficient, as the campaign would be sent to both intended and unintended customers. The variables used in the best-performing neural network model do not provide inherent insight or explainability. This is the challenge of neural networks, whose inner workings are a black box. It is also important to remember that a relationship between variables may show correlation, but not causation.

| Variable | Type |
|---|---|
| I don’t workout with Peloton as much as I did 6 months ago. | Dependent Variable |
| 3-5 times per week | Independent Variable |
| About 30 minutes | Independent Variable |
| Exclusive | Independent Variable |
| Biking | Independent Variable |
| Gym | Independent Variable |
| Tonal | Independent Variable |
| App Workout | Independent Variable |
| Log(Income) | Independent Variable |
| Every day | Independent Variable |
| Log(Age) | Independent Variable |

Data analytics has provided insights into the original research problem, confirming that a customer retention issue is happening, identifying the population where retention is of most concern, and enabling several solutions that address the issue through product development and marketing.

Where the research and data analytics could improve is with better data. The old saying is true: garbage in, garbage out. The analytical models would perform better with additional data and more dimensions. Supplemental qualitative research would also help deepen understanding of the causes of customer churn. These additional research opportunities could provide Peloton with greater confidence in addressing churn.

SOURCES

Arora, Rohit. n.d. “Which Companies Did Well During The Coronavirus Pandemic?” Forbes. Accessed September 29, 2022

CDC. August 16, 2022. “CDC Museum COVID-19 Timeline.” Centers for Disease Control and Prevention

Chen, Al. 2022. “Get Peloton Workout Stats in Real-Time Using Peloton API · Peloton Analytics Tool for Workout Stats [+Template].” Coda | Everything Evolves, the Evolution of Documents. May 11, 2022

“Coronavirus: Timeline.” U.S. Department of Defense. Accessed September 29, 2022

Henry. 2022. “Every Fortune 100’s Return To Office Policy: Google, Apple + (AUG UPDATE).” Build a Better Company. Remotely. August 18, 2022

“Tour de Peloton: Exposed User Data | Pen Test Partners.” n.d. Accessed November 20, 2022

“Peloton Revenue and Usage Statistics (2022).” 2021. Business of Apps. November 11, 2021

Whittaker, Zack. 2021. “Peloton’s Leaky API Let Anyone Grab Riders’ Private Account Data.” TechCrunch (blog). May 5, 2021