ABOUT THE SPEAKER
Sebastian Wernicke - Data scientist
After making a splash in the field of bioinformatics, Sebastian Wernicke moved on to the corporate sphere, where he motivates and manages multidimensional projects.

Why you should listen

Dr. Sebastian Wernicke is the Chief Data Scientist of ONE LOGIC, a data science boutique that helps organizations across industries make sense of their vast data collections to improve operations and gain strategic advantages. Wernicke originally studied bioinformatics and previously led strategy and growth at Seven Bridges Genomics, a Cambridge-based startup that builds platforms for genetic analysis.

Before his career in statistics began, Wernicke worked stints as both a paramedic and a successful short animated filmmaker. He's also the author of the TEDPad app, an irreverent tool for creating an infinite number of "amazing and really bad" and mostly completely meaningless talks, and of the statistically authoritative yet completely ridiculous "How to Give the Perfect TEDTalk."

TEDxCambridge

Sebastian Wernicke: How to use data to make a hit TV show

1,628,704 views

Does collecting more data lead to better decision-making? Competitive, data-savvy companies like Amazon, Google and Netflix have learned that data analysis alone doesn't always produce optimum results. In this talk, data scientist Sebastian Wernicke breaks down what goes wrong when we make decisions based purely on data -- and suggests a brainier way to use it.


00:12
Roy Price is a man that most of you have probably never heard about, even though he may have been responsible for 22 somewhat mediocre minutes of your life on April 19, 2013. He may have also been responsible for 22 very entertaining minutes, but for not very many of you. And all of that goes back to a decision that Roy had to make about three years ago.

00:35
So you see, Roy Price is a senior executive with Amazon Studios. That's the TV production company of Amazon. He's 47 years old, slim, spiky hair, describes himself on Twitter as "movies, TV, technology, tacos." And Roy Price has a very responsible job, because it's his responsibility to pick the shows, the original content that Amazon is going to make. And of course that's a highly competitive space. I mean, there are so many TV shows already out there that Roy can't just choose any show. He has to find shows that are really, really great.

01:12
So in other words, he has to find shows that are on the very right end of this curve here. So this curve here is the rating distribution of about 2,500 TV shows on the website IMDB, and the rating goes from one to 10, and the height here shows you how many shows get that rating. So if your show gets a rating of nine points or higher, that's a winner. Then you have a top two percent show. That's shows like "Breaking Bad," "Game of Thrones," "The Wire," so all of these shows that are addictive, where, after you've watched a season, your brain is basically like, "Where can I get more of these episodes?" That kind of show.

01:50
On the left side, just for clarity, here on that end, you have a show called "Toddlers and Tiaras" -- (Laughter) -- which should tell you enough about what's going on on that end of the curve. Now, Roy Price is not worried about getting on the left end of the curve, because I think you would have to have some serious brainpower to undercut "Toddlers and Tiaras." So what he's worried about is this middle bulge here, the bulge of average TV, you know, those shows that aren't really good or really bad; they don't really get you excited. So he needs to make sure that he's really on the right end of this.
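To make that curve concrete, here is a minimal sketch of the kind of histogram being described: it bins ratings from 1 to 10 for about 2,500 shows and plots how many shows fall in each bin. The ratings below are randomly generated stand-ins centered on the 7.4 average mentioned later in the talk, not real IMDB data.

```python
# A minimal sketch of the rating-distribution curve from the talk.
# The ratings are synthetic stand-ins, not real IMDB data: most shows
# cluster in a middle bulge around the talk's stated average of 7.4.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
ratings = np.clip(rng.normal(loc=7.4, scale=1.0, size=2500), 1, 10)

plt.hist(ratings, bins=np.arange(1, 10.5, 0.5), edgecolor="black")
plt.xlabel("Rating (1-10)")
plt.ylabel("Number of shows")
plt.title("Rating distribution of ~2,500 TV shows (synthetic data)")
plt.show()
```

A show rated nine or above sits in the thin right tail of this histogram, the top two percent the talk is aiming for.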
02:27
So the pressure is on, and of course it's also the first time that Amazon is even doing something like this, so Roy Price does not want to take any chances. He wants to engineer success. He needs a guaranteed success, and so what he does is, he holds a competition. So he takes a bunch of ideas for TV shows, and from those ideas, through an evaluation, they select eight candidates for TV shows, and then he just makes the first episode of each one of these shows and puts them online for free for everyone to watch.

02:59
And so when Amazon is giving out free stuff, you're going to take it, right? So millions of viewers are watching those episodes. What they don't realize is that, while they're watching their shows, actually, they are being watched. They are being watched by Roy Price and his team, who record everything. They record when somebody presses play, when somebody presses pause, what parts they skip, what parts they watch again. So they collect millions of data points, because they want to have those data points to then decide which show they should make.
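The talk doesn't give Amazon's actual logging format, so as a hedged sketch under assumed names, the viewing events it lists (play, pause, skip, rewatch) might be modeled and aggregated like this:

```python
# Sketch of the viewing-event stream described in the talk. The schema,
# event names and sample values are assumptions for illustration only.
from dataclasses import dataclass
from collections import Counter

@dataclass
class ViewingEvent:
    viewer_id: str
    pilot: str       # which of the eight pilot episodes
    action: str      # "play", "pause", "skip" or "rewatch"
    position_s: int  # playback position in seconds

events = [
    ViewingEvent("viewer-1", "Pilot A", "play", 0),
    ViewingEvent("viewer-1", "Pilot A", "pause", 312),
    ViewingEvent("viewer-2", "Pilot B", "skip", 95),
    ViewingEvent("viewer-2", "Pilot B", "rewatch", 610),
]

# Aggregate many such events into per-pilot counts: this is the kind of
# raw material the team would crunch to decide which show to make.
counts = Counter((e.pilot, e.action) for e in events)
for (pilot, action), n in sorted(counts.items()):
    print(f"{pilot}: {action} x{n}")
```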
03:30
And sure enough, they collect all the data, they do all the data crunching, and an answer emerges, and the answer is, "Amazon should do a sitcom about four Republican US Senators." They did that show. So does anyone know the name of the show? (Audience: "Alpha House.") Yes, "Alpha House," but it seems like not too many of you here remember that show, actually, because it didn't turn out that great. It's actually just an average show -- literally, in fact, because the average of this curve here is at 7.4, and "Alpha House" lands at 7.5, so a slightly above average show, but certainly not what Roy Price and his team were aiming for.

04:10
Meanwhile, however, at about the same time, at another company, another executive did manage to land a top show using data analysis, and his name is Ted, Ted Sarandos, who is the Chief Content Officer of Netflix, and just like Roy, he's on a constant mission to find that great TV show, and he uses data as well to do that, except he does it a little bit differently. So instead of holding a competition, what he did -- and his team of course -- was they looked at all the data they already had about Netflix viewers: the ratings they give their shows, the viewing histories, what shows people like, and so on. And then they use that data to discover all of these little bits and pieces about the audience: what kinds of shows they like, what kind of producers, what kind of actors. And once they had all of these pieces together, they took a leap of faith, and they decided to license not a sitcom about four Senators but a drama series about a single Senator. You guys know the show? (Laughter) Yes, "House of Cards," and Netflix, of course, nailed it with that show, at least for the first two seasons. (Laughter) (Applause)
05:17
"House of Cards" gets a 9.1 rating on this curve, so it's exactly where they wanted it to be. Now, the question of course is, what happened here? So you have two very competitive, data-savvy companies. They connect all of these millions of data points, and then it works beautifully for one of them, and it doesn't work for the other one. So why? Because logic kind of tells you that this should be working all the time. I mean, if you're collecting millions of data points on a decision you're going to make, then you should be able to make a pretty good decision. You have 200 years of statistics to rely on. You're amplifying it with very powerful computers. The least you could expect is good TV, right?

05:57
And if data analysis does not work that way, then it actually gets a little scary, because we live in a time where we're turning to data more and more to make very serious decisions that go far beyond TV. Does anyone here know the company Multi-Health Systems? No one. OK, that's good actually. OK, so Multi-Health Systems is a software company, and I hope that nobody here in this room ever comes into contact with that software, because if you do, it means you're in prison. (Laughter) If someone here in the US is in prison and they apply for parole, then it's very likely that data analysis software from that company will be used in determining whether to grant that parole. So it's the same principle as Amazon and Netflix, but now instead of deciding whether a TV show is going to be good or bad, you're deciding whether a person is going to be good or bad. And mediocre TV, 22 minutes, that can be pretty bad, but more years in prison, I guess, even worse.

07:02
And unfortunately, there is actually some evidence that this data analysis, despite having lots of data, does not always produce optimum results. And that's not because a company like Multi-Health Systems doesn't know what to do with data. Even the most data-savvy companies get it wrong. Yes, even Google gets it wrong sometimes. In 2009, Google announced that they were able, with data analysis, to predict outbreaks of influenza, the nasty kind of flu, by doing data analysis on their Google searches. And it worked beautifully, and it made a big splash in the news, including the pinnacle of scientific success: a publication in the journal "Nature." It worked beautifully year after year after year, until one year it failed. And nobody could even tell exactly why. It just didn't work that year, and of course that again made big news, including now a retraction of a publication from the journal "Nature."

07:58
So even the most data-savvy companies, Amazon and Google, they sometimes get it wrong. And despite all those failures, data is moving rapidly into real-life decision-making -- into the workplace, law enforcement, medicine. So we'd better make sure that data is helping.
08:19
Now, personally I've seen a lot of this struggle with data myself, because I work in computational genetics, which is also a field where lots of very smart people are using unimaginable amounts of data to make pretty serious decisions like deciding on a cancer therapy or developing a drug. And over the years, I've noticed a sort of pattern, or kind of rule, if you will, about the difference between successful decision-making with data and unsuccessful decision-making, and I find this a pattern worth sharing, and it goes something like this.

08:50
So whenever you're solving a complex problem, you're doing essentially two things. The first one is, you take that problem apart into its bits and pieces so that you can deeply analyze those bits and pieces, and then of course you do the second part. You put all of these bits and pieces back together again to come to your conclusion. And sometimes you have to do it over again, but it's always those two things: taking apart and putting back together again.

09:14
And now the crucial thing is that data and data analysis is only good for the first part. Data and data analysis, no matter how powerful, can only help you take a problem apart and understand its pieces. It's not suited to putting those pieces back together again and coming to a conclusion. There's another tool that can do that, and we all have it, and that tool is the brain. If there's one thing a brain is good at, it's putting bits and pieces back together again, even when you have incomplete information, and coming to a good conclusion, especially if it's the brain of an expert.
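As a toy illustration of that division of labor, here is a hedged sketch in which step one (taking the problem apart) is ordinary data analysis, while step two (putting the pieces back together into a decision) is deliberately missing from the code, because on the talk's account it belongs to an expert brain. All pilot names and figures are hypothetical.

```python
# Step 1, decomposition: summaries that data analysis can legitimately
# produce about each candidate show. (Hypothetical names and numbers.)
pilot_metrics = {
    "Pilot A": {"completion_rate": 0.61, "rewatch_rate": 0.08},
    "Pilot B": {"completion_rate": 0.54, "rewatch_rate": 0.15},
}

for pilot, m in pilot_metrics.items():
    print(f"{pilot}: {m['completion_rate']:.0%} finished, "
          f"{m['rewatch_rate']:.0%} rewatched scenes")

# Step 2, recombination: note that no line of code here can say which
# pilot to greenlight. Weighing a higher completion rate against a
# higher rewatch rate is the synthesis the talk assigns to the brain.
```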
09:48
And that's why I believe that Netflix was so successful, because they used data and brains where they belong in the process. They use data to first understand lots of pieces about their audience that they otherwise wouldn't have been able to understand at that depth, but then the decision to take all these bits and pieces and put them back together again and make a show like "House of Cards," that was nowhere in the data. Ted Sarandos and his team made that decision to license that show, which also meant, by the way, that they were taking a pretty big personal risk with that decision. And Amazon, on the other hand, they did it the wrong way around. They used data all the way to drive their decision-making, first when they held their competition of TV ideas, then when they selected "Alpha House" to make as a show. Which of course was a very safe decision for them, because they could always point at the data, saying, "This is what the data tells us." But it didn't lead to the exceptional results that they were hoping for.

10:42
So data is of course a massively useful tool to make better decisions, but I believe that things go wrong when data is starting to drive those decisions. No matter how powerful, data is just a tool, and to keep that in mind, I find this device here quite useful. Many of you will ... (Laughter) Before there was data, this was the decision-making device to use. (Laughter) Many of you will know this. This toy here is called the Magic 8 Ball, and it's really amazing, because if you have a decision to make, a yes or no question, all you have to do is shake the ball, and then you get an answer -- "Most Likely" -- right here in this window in real time. I'll have it out later for tech demos. (Laughter)

11:24
Now, the thing is, of course -- so I've made some decisions in my life where, in hindsight, I should have just listened to the ball. But, you know, of course, if you have the data available, you want to replace this with something much more sophisticated, like data analysis, to come to a better decision. But that does not change the basic setup. So the ball may get smarter and smarter and smarter, but I believe it's still on us to make the decisions if we want to achieve something extraordinary, on the right end of the curve. And I find that a very encouraging message, in fact: that even in the face of huge amounts of data, it still pays off to make decisions, to be an expert in what you're doing and take risks. Because in the end, it's not data, it's risks that will land you on the right end of the curve.

12:19
Thank you. (Applause)
