This is a technical overview of the techniques you'll use to collect Metrics on your SaaS Trial Users, so that you can use that information to stop those users from churning, which in turn will make your business a lot of money.
But be warned: There Is Math Here. If you'd prefer to step back a notch and learn what this whole Onboarding Analytics thing is all about, here's a much more entertaining overview I wrote about
How I Quadrupled My SaaS Trial Conversions (with math).
OK. Just us nerds left? Cool. Let me break out the Entity Relationship Diagrams…
Step One: Collecting Data
First, a bit of background. What we're doing here boils down to collecting data about everything your users do during their Trial. Every interaction with your site, every lifecycle mail they open, even every time they didn't show up for an entire week. If it happens, collect it.
We'll then use that data to find patterns about what sort of things your successful, gonna-be-paying users do versus what your unsuccessful, churn-bound users do. And once we have those patterns, we'll be able to flag failing users so that we can guide them back onto the happy path.
So again, step one: we need to store everything they do. How about this for a schema:
Here, a Participant is one of your trial users. You could skip this table if you want to hook actions directly to Users, but this gives us a bit of flexibility and keeps our example self-contained. The Label table holds the names of the Actions that our Participants can do, lets us attach extra information, and links off to any supplemental tables we'll build later. And, of course, Actions holds one record for each thing that happens, ever. ParticipantStatus is just a lookup table for "Trial", "Paid", "Expired" & "Cancelled".
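If it helps to see it in code rather than boxes and arrows, here's a minimal sketch of those four tables as C# entity classes. The column names beyond the ones described above are my assumptions; adapt them to whatever your data layer looks like:

```csharp
// A sketch of the schema as entity classes. TrialStartedAt, Name and
// Description are assumed columns, not gospel.
public class Participant
{
    public int ParticipantID { get; set; }
    public int UserID { get; set; }              // link back to your real Users table
    public int ParticipantStatusID { get; set; }
    public DateTime TrialStartedAt { get; set; }
}

public class Label
{
    public int LabelID { get; set; }
    public string Name { get; set; }             // e.g. "LoggedIn", "FilledOutProfile"
    public string Description { get; set; }      // the extra information mentioned above
}

public class Action
{
    public int ActionID { get; set; }
    public int ParticipantID { get; set; }
    public int LabelID { get; set; }
    public DateTime OccurredAt { get; set; }
}

public class ParticipantStatus
{
    public int ParticipantStatusID { get; set; }
    public string Name { get; set; }             // "Trial", "Paid", "Expired", "Cancelled"
}
```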
Now we can build a little library in the backend of our app to quickly stash Actions for us. Ideally, we'll want to expose a single function to our developers that is so simple that they can't come up with an excuse not to use it:
It can really be that simple, as we'll know who our user is (since he's logged in), and we can probably ask somebody for the current time.
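Something along these lines would do. The `Track` name comes from later in this article; the `CurrentParticipant`, `FindOrCreateLabel` and `SaveAction` helpers are stand-ins for whatever session and data-access plumbing your app already has:

```csharp
// A sketch of that one dead-simple function. Everything it calls is a
// placeholder for your existing infrastructure.
public static class Metrics
{
    public static void Track(string labelName)
    {
        var participant = CurrentParticipant();   // from the logged-in session
        var label = FindOrCreateLabel(labelName); // look up by name, create on first use
        SaveAction(new Action
        {
            ParticipantID = participant.ParticipantID,
            LabelID = label.LabelID,
            OccurredAt = DateTime.UtcNow,
        });
    }
}
```

So a call site is just `Metrics.Track("FilledOutProfile");` and nothing more, which is exactly why your developers will actually use it.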
Now all that's left is to sprinkle a few hundred of those calls throughout the codebase, every time anything even mildly interesting happens, and to expose a way for your marketing folk to manually add more Actions when they've interacted with your trialers in person. (Which, incidentally, is something that marketing folk actually do!)
Step Two: Finding Patterns
Here we go, straight into the Machine Learning stuff. Ready?
No, we're not ready. Sadly, it'll be months of boring data collection before we have enough for our Machine to Learn anything about. So unless you want to embarrass yourself with a Decision Forest that happily maps "User Logged In" to "Always Churns, 90% Confidence", we'll hold off on that for the moment. First, let's step back and just look at that data with our eyes.
Step 2A: Visualizing the data
Here are two of our Participants, cruising through their trials. Which would you say is more likely to convert?
It's amazing how much you'll learn about your users just by watching them. You'll find out that some people reset their password every single time they log in to your site. You'll find users who log in four times every day and check the same screen to see if anything has changed. And you'll find plenty of users who got stuck on something during the first 10 minutes and never came back.
Those are useful things to know, so it's worth building some simple reports right away to get that information to the folks who can use it. This is our Minimum Viable Product to justify collecting this data in the first place. Even though we know the Big Payoffs will come later on, it's surprising how big a win you can get just from this first step.
Here are a few report ideas to get you started:
Actions By User
Actions By Label
Trial Users (with action counts)
Paid Users (with action counts)
Expired Trials (with action counts)
Who Logged In This Week
Who Didn't Log In This Week
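To make one of those concrete, here's a sketch of that last report as a LINQ query. It assumes in-memory `Participants` and `Actions` collections shaped like the schema above, plus a "LoggedIn" Label and a convenience `Status` string on Participant; none of that is load-bearing:

```csharp
// "Who Didn't Log In This Week", assuming a "LoggedIn" Label exists.
var weekAgo = DateTime.UtcNow.AddDays(-7);

var loggedInThisWeek = Actions
    .Where(a => a.Label.Name == "LoggedIn" && a.OccurredAt >= weekAgo)
    .Select(a => a.ParticipantID)
    .Distinct()
    .ToList();

var silentThisWeek = Participants
    .Where(p => p.Status == "Trial"
             && !loggedInThisWeek.Contains(p.ParticipantID));
```

Every one of the reports above is a variation on this same two-table join, so once the first one exists the rest come cheap.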
Step 2B: Statistics
Even though true ML is still a while off, we'll have trialers start converting (or expiring) right away, so we can start building statistics.
For a given user at the end of his trial, we have a list of things that he has done, and we know his outcome. That gives us enough information to determine which Labels tend to affect outcomes, and (to a lesser extent) by how much. We can at least state something along the lines of
"38% of participants with the Label 'FilledOutProfile' converted to paid, whereas only 8% of participants without that Label converted."
We can build little stories like that around all of our Labels, to see which are the most interesting. We can even combine them to generate something like a Score for a given Participant based on which Labels he has seen and which he hasn't.
So, assuming we can compile a list of Participants that have finished their trials and either paid (which IsGood) or expired or explicitly cancelled (which !IsGood), we can calculate a few properties for our Labels like this:
foreach (var label in Labels)
{
    label.InGoodCount    = Participants.Count(p =>  p.IsGood &&  p.LabelIDs.Contains(label.LabelID));
    label.NotInGoodCount = Participants.Count(p =>  p.IsGood && !p.LabelIDs.Contains(label.LabelID));
    label.InBadCount     = Participants.Count(p => !p.IsGood &&  p.LabelIDs.Contains(label.LabelID));
    label.NotInBadCount  = Participants.Count(p => !p.IsGood && !p.LabelIDs.Contains(label.LabelID));
}
And then all we need are a couple helper methods on our Labels:
public double PercentIfIn()
{
    var denominator = InGoodCount + InBadCount;
    if (denominator == 0)
        return 0;
    return (double)InGoodCount / denominator;
}

public double PercentIfNotIn()
{
    var denominator = NotInGoodCount + NotInBadCount;
    if (denominator == 0)
        return 0;
    return (double)NotInGoodCount / denominator;
}
... and we can generate the little snippet of text describing the performance of a given Label. While we're at it, we can keep track of the expected conversion rate:
var goodCount = Participants.Count(p => p.IsGood);
var badCount = Participants.Count(p => !p.IsGood);
var expected = (double)goodCount / (goodCount + badCount);
... and we can use the variance from the mean to assign an "Interestingness" value to each label that we can sort by when presenting them or using them to "score" Participants:
// for each Label... (assumes the percentages above came out nonzero)
var up = Math.Max(PercentIfIn() / expected, expected / PercentIfIn());
var down = Math.Max(PercentIfNotIn() / expected, expected / PercentIfNotIn());
Interestingness = up + down;
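And here's one simple way to roll those per-Label numbers up into the per-Participant Score mentioned earlier: for each sufficiently interesting Label, nudge the score up or down by how far that Label's conversion rate sits from the overall mean. The `threshold` cutoff is an assumption; pick whatever keeps the noise out:

```csharp
// A sketch of scoring Participants from the Label statistics.
// "threshold" and the Score property are assumptions, not canon.
foreach (var p in Participants)
{
    p.Score = 0.0;
    foreach (var label in Labels.Where(l => l.Interestingness > threshold))
    {
        var rate = p.LabelIDs.Contains(label.LabelID)
            ? label.PercentIfIn()
            : label.PercentIfNotIn();
        p.Score += rate - expected;  // above the mean pushes up, below pulls down
    }
}
```

A Participant who has done lots of high-converting things ends up well above zero; one who's missed them all drifts below it, which is exactly the flag we wanted.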
Step Three: Intervention
Now that we have all the collection and analysis ticking away, we'll have a bit of Calendar Time on our hands, waiting for enough Participants to stumble through their trials for our conversion statistics to be meaningful. That gives us some time to automate things.
We're going to want a nightly job of some description to come through and calculate all those statistics for our Labels and apply them to our Participants. We'll want to generate a report showing which of them are likely to convert and which of them are pretty much hopeless (and why).
It'd also be cool if that nightly job fired off some webhooks for certain conditions so that we could set up worker tasks to do things like send rescue mails to problem users, or at the least notify that guy in Marketing so that he can do something drastic like actually call them on the phone. You can even build him something that plugs directly into his CRM and Calendar so that he'll have up-to-date information and action items ready to go each morning.
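In outline, the nightly job is just the pieces from earlier run in sequence plus a hook at the end. The webhook URL, payload shape, and `rescueThreshold` here are all placeholders; schedule it with cron, Hangfire, or whatever you already use:

```csharp
// A sketch of the nightly job. Everything named here that isn't from
// the statistics section above is a stand-in for your own plumbing.
public void NightlyRun()
{
    RecalculateLabelStatistics();  // the counts and percentages from Step 2B
    ScoreParticipants();           // apply them to active trialers

    var atRisk = Participants
        .Where(p => p.Status == "Trial" && p.Score < rescueThreshold);

    foreach (var p in atRisk)
    {
        // e.g. trigger a rescue mail, or ping Marketing's CRM
        PostWebhook("https://example.com/hooks/rescue",
                    new { p.ParticipantID, p.Score });
    }
}
```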
The important thing is that by this point you'll be the hero, having demonstrated that this whole Customer Lifecycle Metrics thing is real, and that it will in fact make a measurable-in-dollars difference to your business by steering problem trialers back onto the happy path.
It's also worth noting, and I'm almost embarrassed to mention it since we're both software folk and could easily build all this from scratch, but just putting it out there: This is all built already.
You can sign up for a Free Trial today over at Unwaffle.com, and have it all up and running 20 minutes from now. We have drop-in client libraries for every programming language known to man, so all you need to do is sprinkle those Track() calls around in your code then sit back and watch the data flow in.
But either way, build or buy, that's how it works. And you really should be doing it. It'll make you a lot of money.
Discuss on Hacker News