Speaking during the keynote session at Benefits Canada’s 2023 Defined Contribution Plan Summit, Seth Stephens-Davidowitz, New York Times bestselling author and former Google data scientist, told the story of a horse called American Pharoah, which became the first horse in 37 years to win the Triple Crown.
Before its big wins, the horse wasn’t anything special, he said. But 50 years before data analytics became mainstream, American Pharoah’s owner Jeff Seder was using data to predict horse-racing success. The data he looked at included the size of a horse’s nostrils, the volume of its fast-twitch muscles and the size of its feces, but none of these theories panned out until he started measuring the size of the ventricles in a horse’s heart.
“He found out [the size] of the left ventricle, put that in a spreadsheet, correlated that with how fast horses eventually ran, how much money they earned, how much they won and found the size of the left ventricle worked. It predicts horse-racing success.”
According to Stephens-Davidowitz, there are four lessons to learn here: the value of a dataset isn’t its size, it’s its newness — using a dataset that no one else has; the winners in the world of big data are entrepreneurial; these winners fail a lot on their way to success; and there are left ventricles out there.
“This is really, really exciting if you think about data science. When I talk to business audiences I leave them with the question: ‘What are the left ventricles waiting to be found?’ What’s the thing you can learn about your business, your industry, your customers that, if you know that, you’re not going to be five or 10 per cent better, you’re going to have a 10-times better understanding of everybody in the market [compared] to everybody else. Because if you find that left ventricle, your model is just going to be better than everybody else’s.”
Stephens-Davidowitz also talked about surveys, noting that people have conducted them for the past 80 years to better understand others. But there’s a dirty secret in survey research, he added, which is that people lie. “Even though it’s anonymous, people just don’t feel comfortable telling the truth. And we know this when we compare what people say in surveys to what they actually do.”
For example, when plan sponsors conduct research on pensions, respondents may feel pressure to say they’re saving for retirement, he said. “Everybody wants to sound good, even if they’re not really doing what they think they’re supposed to be doing.”
He shared a real-life example of predicting voter turnout. Instead of asking people if they’re going to vote, he suggested looking at Google searches: How to vote? Where to vote? “It’s highly predictive of how high turnout is going to be in that area. If people aren’t searching that, they aren’t actually going to turn out to vote.”
Returning to pensions, Stephens-Davidowitz noted the most Googled question about pensions is: ‘What’s a pension?’ He called this data profound because it points to the same theme he raised on other topics: people are embarrassed.
For example, if the topic was raised in a focus group, the participants probably wouldn’t ask that question, he said. “That’s embarrassing. They’re going to pretend they know more than they do. So just keep this in mind — how basic people’s questions can be.
“Everybody in this room has worked in the area of pensions or retirement for your entire lives,” he added. “So if you’re dealing with [plan members] and you’re trying to explain your [pension plan], there’s something called the ‘curse of knowledge.’ You know something so well it’s hard to remember how little the person you’re talking to knows about that topic.”