2021/02/18

Visualizing and improving retention rates for each customer experience

Author: UmezawaKeisuke

TL;DR

Hi, I am @umechan from the CRE (Customer Reliability Engineering) team at Mercari.

Today, I would like to share an experiment that @shido and I carried out:

When we visualized the retention rate of customers who had a bad experience, we found that customers who did not complain had a lower retention rate than customers who complained, which is consistent with Goodman’s law [1].
Based on the visualization, we proposed a method for prioritizing measures.

Background

As you know, Mercari is a flea market app that allows individuals to buy and sell things easily, and it is a C2C service where both buyers and sellers are individuals. Thanks to this, everyone can easily enjoy buying and selling things, and we believe that we help individuals realize personal goals.

However, since sellers are also individuals, there are some problems that do not occur in general B2C services. For example, a product description may be inadequate or misleading, or a counterpart's behavior during a transaction may be unfriendly and leave a bad impression. These are new problems that come with transactions between individuals, so Mercari needs to support its customers even more.

For this reason, Mercari has a team called Customer Reliability Engineering (CRE), which is constantly improving customer experiences under the mission of "building a reliable platform that customers will want to continue using." One of its specific guidelines is to create a system that improves the customer’s experience so that customers continue to use Mercari.

Every day, Mercari receives inquiries and reports about our customers’ experiences. In response to such reports, our customer support team helps customers resolve their problems, and we analyze and improve the product to make fundamental service improvements.

Problem

To capture customers' bad experiences, we mainly relied on inquiries from customers. In fact, however, inquiries represent only a small part of our customers’ overall bad experiences. According to a survey by Tomoyasu Sato [1], 4% of dissatisfied customers complain; the remaining 96% do not, and most of them leave silently. We expected the same phenomenon to be occurring at Mercari, so we needed to identify customers who had bad experiences but never sent us an inquiry.

In addition, if we can identify customers whose bad experiences cannot be captured through inquiries alone, we believe we can proactively provide customer support, improve the product based on those findings, and evaluate the measures we take. Given these issues, we decided that the first step was to visualize those customers.

Requirement

To visualize "something," it is necessary to define it as a metric so that it can be constantly monitored on a dashboard. In addition, to improve "something," it is necessary to define the metric and verify the effect based on it.

In this case as well, if you want to visualize and improve "a bad experience that cannot be identified by inquiries alone," you need to define it as a metric, monitor that metric, and regularly verify the effect of your measures against it. To achieve this, the following process is necessary.

Define the segments of customers who had “a bad experience that cannot be identified by inquiries alone.”
Monitor metrics of the customer experience for each segment defined above.

By doing the above, it is possible to create metrics for each bad experience.

Solution

Define a segment of customers who had “a bad experience that cannot be identified by inquiry alone”

There are several ways to define a segment of customers who had a bad experience that cannot be identified by inquiries alone.

Define bad experiences in the log data and use the customers that match the definition for the segment.
Identify customers who are known to have had a bad experience, and build the segment from customers whose behavioral logs are similar to theirs.

In this project, we focused our visualization on the former, defining bad experiences in log data and using the customers that match the definition as segments. For example, we visualized the following bad experiences.

Customers who received a bad evaluation after a transaction
Customers who did not give us a rating after the transaction
Customers who cancelled or were cancelled during the transaction
Customers whose delivery was delayed after purchase

In the above experiences, the log data can be used to define whether or not the experience took place. This makes it possible to identify customers who have had a bad experience even if they have not contacted us.
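As a rough illustration of defining segments from log data, the sketch below labels each transaction record with the bad experiences listed above and collects the matching customer IDs per segment. The record layout, field names, and label names are all hypothetical and are not Mercari's actual log schema. The exact rules (for example, treating "did not rate" as a completed transaction without a rating, or "delayed" as shipped after a promised date) are assumptions made only for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Optional, Set


@dataclass
class Transaction:
    # Hypothetical log record; every field name here is an assumption.
    customer_id: str
    rating_received: Optional[str]    # "good", "bad", or None if not rated
    rating_given: bool                # did this customer rate the counterpart?
    cancelled: bool
    promised_ship_by: Optional[date]  # assumed deadline used to detect delays
    shipped_on: Optional[date]


def bad_experiences(tx: Transaction) -> Set[str]:
    """Return the labels of the bad experiences found in one transaction."""
    labels: Set[str] = set()
    if tx.rating_received == "bad":
        labels.add("received_bad_rating")
    if not tx.cancelled and not tx.rating_given:
        labels.add("did_not_rate")
    if tx.cancelled:
        labels.add("cancelled")
    if tx.promised_ship_by and tx.shipped_on and tx.shipped_on > tx.promised_ship_by:
        labels.add("delivery_delayed")
    return labels


def build_segments(transactions: Iterable[Transaction]) -> Dict[str, Set[str]]:
    """Map each bad-experience label to the set of customers who had it."""
    segments: Dict[str, Set[str]] = {}
    for tx in transactions:
        for label in bad_experiences(tx):
            segments.setdefault(label, set()).add(tx.customer_id)
    return segments
```

Because each rule looks only at logged facts, a customer lands in a segment whether or not they ever sent an inquiry.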

Define metrics for the customer experience

It is often said that you can’t improve what you can’t measure. There are several common metrics[2] for measuring customer experience. For example, the following metrics exist:

Net Promoter Score (NPS)
Customer Satisfaction Score (CSAT)
Churn Rate
Retention Rate, etc.

We chose "Retention Rate" as our metric this time. To define the retention rate, we first need to define what a retained user is. In an e-commerce business, we can define a retained user as a customer who takes a certain "action" within n days. For example, if we consider it important that a customer logs in to our app every month, we can treat a customer who has logged in within the last 28 days as a retained user.

Based on that definition, we then define the retention rate. In general, the retention rate is "the percentage of users retained at the beginning of a given period who are still retained at the end of that period.” So, if we treat customers who have logged in within 28 days as retained users, the retention rate is "the number of customers who had logged in within 28 days at both the beginning and the end of the period, as a percentage of the number who had logged in within 28 days at the beginning of the period.” If we use a month (28 days) as the period, it is calculated as:

Retention Rate (28 days) = (customers who logged in both 29~56 days ago and 1~28 days ago) / (customers who logged in 29~56 days ago)
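The formula above can be sketched in a few lines of Python. This is a minimal example, assuming the action log is available as a collection of (customer ID, date) pairs. The function and parameter names are made up for illustration, and returning None for an empty previous window is a choice made here, not something the definition prescribes.

```python
from datetime import date, timedelta
from typing import Collection, Optional, Set, Tuple

Event = Tuple[str, date]  # (customer_id, day on which the action happened)


def active_customers(events: Collection[Event], start: date, end: date) -> Set[str]:
    """Customers with at least one action on a day in [start, end]."""
    return {cid for cid, day in events if start <= day <= end}


def retention_rate(events: Collection[Event], today: date, n: int = 28) -> Optional[float]:
    """n-day retention rate as of `today`.

    Previous window: n+1 .. 2n days ago. Current window: 1 .. n days ago.
    """
    current = active_customers(
        events, today - timedelta(days=n), today - timedelta(days=1)
    )
    previous = active_customers(
        events, today - timedelta(days=2 * n), today - timedelta(days=n + 1)
    )
    if not previous:
        return None  # no one to retain, so the rate is undefined
    return len(previous & current) / len(previous)
```

Passing only the events of one action type (login, buying, listing, or selling) gives the retention rate for that action.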

As Mercari is a flea market app, we can pick up the following behaviors as examples of “actions”:

Login
Buying
Listing
Selling

The following viewpoints can be cited as criteria for selecting metrics:

The metric should be calculated for all customers, because we need to know the total number.
The measurement interval should be short, so that the numbers can be compared more frequently.
The time it takes for a change in the app to be reflected in the metric should be short, so that an A/B test yields verification results faster.

The first criterion is important because the purpose is to grasp the total number of customers who had bad experiences. NPS and CSAT do not meet this requirement because they rely on sampling. As for the second criterion, survey-based indicators also make it difficult to collect enough samples every day. As for the third criterion, the n-day retention rate takes n days before an effect can be measured. However, this delay is unavoidable because the other indicators have a similar delay. Finally, we did not use the churn rate because Mercari is not a subscription service, so more customers simply stop using the app than explicitly uninstall it. Those are the reasons why we chose "Retention Rate."

Result

As mentioned above, we defined the segments of "customers who had bad experiences that cannot be grasped only by inquiries," used the retention rate as the customer experience metric for each segment, and then visualized its transition in a graph. The following is a visualization of the customer segment that received a bad evaluation after the transaction. The x-axis is the date when they had the bad experience, and the y-axis is the retention rate. Note that I have obscured the original values by altering the actual values, erasing the axis values, changing the scale, and so on.

Looking at the graph, we can see that the retention rate of customers who had a bad experience is lower than that of customers who did not. We can also see that the retention rate of customers who made inquiries is equal to or higher than that of customers who did not have a bad experience. The result may include biases, for example that customers who make inquiries use Mercari more to begin with, but it may also indicate the effectiveness of responding to inquiries, and it is similar to Goodman’s Law [1].
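A comparison of this kind can be sketched as follows. The three-way grouping (bad experience with an inquiry, bad experience without an inquiry, no bad experience) is an assumption made for this example and may not match the grouping used in the graph. All names are hypothetical, and the inputs are plain sets of customer IDs.

```python
from typing import Dict, Optional, Set


def retention_by_group(
    active_before: Set[str],
    active_after: Set[str],
    bad_experience: Set[str],
    inquired: Set[str],
) -> Dict[str, Optional[float]]:
    """Retention rate per group among customers active in the earlier window."""
    groups = {
        "bad_experience_with_inquiry": active_before & bad_experience & inquired,
        "bad_experience_without_inquiry": (active_before & bad_experience) - inquired,
        "no_bad_experience": active_before - bad_experience,
    }
    rates: Dict[str, Optional[float]] = {}
    for name, members in groups.items():
        # An empty group has no defined rate, so report None instead of 0.
        rates[name] = len(members & active_after) / len(members) if members else None
    return rates
```

Computing this for each day on which the bad experience occurred yields one line per group, which is the shape of chart described above. The sketch does nothing to correct for the usage bias mentioned in the text.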

The following graph shows the transition in the number of customers in each group. I can’t give concrete values, but the number of customers who made inquiries is about 0.1 to 1% of the customers who did not have a bad experience, and about 5% of the customers who had a bad experience. I was particularly surprised that the latter value was close to the survey result [1] by Tomoyasu Sato.

By watching the transition of these values, you can monitor the impact of each bad experience.

Future work

We believe this method can also be applied to:

Evaluating the effect of measures with A/B tests, using customer lifetime value (CLTV) as the metric
Using machine learning to define the segment "customers who had a bad experience that cannot be grasped by inquiries alone"

At the next opportunity, I would like to report on the results of these measures and the effectiveness of the prioritization described here.