ABOUT THE SPEAKER

Supasorn Suwajanakorn - Computer scientist
Supasorn Suwajanakorn works on ways to reconstruct, preserve and reanimate anyone -- just from their existing photos and videos.

Why you should listen

Can we create a digital avatar that looks, acts and talks just like our sweet grandma? This question has inspired Supasorn Suwajanakorn, a recent PhD graduate from the University of Washington, to spend years developing new tools to make it a reality. He has developed a set of algorithms that can build a moving 3D face model of anyone from just photos, which was awarded the Innovation of the Year in 2016. He then introduced the first system that can replicate a person's speech and produce a realistic CG animation by analyzing only their existing video footage -- all without ever bringing the person into a Hollywood capture studio.

Suwajanakorn is working in the field of machine learning and computer vision. His goal is to bring vision algorithms out of the lab and make them work in the wild.

TED2018

Supasorn Suwajanakorn: Fake videos of real people -- and how to spot them

Filmed: 2018-04-10

1,453,308 views

Do you think you're good at spotting fake videos, where famous people say things they've never said in real life? See how they're made in this astonishing talk and tech demo. Computer scientist Supasorn Suwajanakorn shows how, as a grad student, he used AI and 3D modeling to create photorealistic fake videos of people synced to audio. Learn more about both the ethical implications and the creative possibilities of this tech -- and the steps being taken to fight against its misuse.

00:12

Look at these images.

00:14

Now, tell me which Obama here is real.

00:16

(Video) Barack Obama: To help families
refinance their homes,

00:19

to invest in things
like high-tech manufacturing,

00:22

clean energy

00:23

and the infrastructure
that creates good new jobs.

00:26

Supasorn Suwajanakorn: Anyone?

00:28

The answer is none of them.

00:30

(Laughter)

00:31

None of these is actually real.

00:33

So let me tell you how we got here.

00:35

My inspiration for this work

00:37

was a project meant to preserve our last
chance for learning about the Holocaust

00:42

from the survivors.

00:44

It's called New Dimensions in Testimony,

00:47

and it allows you to have
interactive conversations

00:50

with a hologram
of a real Holocaust survivor.

00:53

(Video) Man: How did you
survive the Holocaust?

00:55

(Video) Hologram: How did I survive?

00:57

I survived,

01:00

I believe,

01:01

because providence watched over me.

01:05

SS: Turns out these answers
were prerecorded in a studio.

01:09

Yet the effect is astounding.

01:11

You feel so connected to his story
and to him as a person.

01:16

I think there's something special
about human interaction

01:19

that makes it much more profound

01:22

and personal

01:24

than what books or lectures
or movies could ever teach us.

01:28

So I saw this and began to wonder,

01:30

can we create a model
like this for anyone?

01:33

A model that looks, talks
and acts just like them?

01:37

So I set out to see if this could be done

01:39

and eventually came up with a new solution

01:41

that can build a model of a person
using nothing but these:

01:45

existing photos and videos of a person.

01:48

If you can leverage
this kind of passive information,

01:51

just photos and video that are out there,

01:53

that's the key to scaling to anyone.

01:56

By the way, here's Richard Feynman,

01:57

who in addition to being
a Nobel Prize winner in physics

02:01

was also known as a legendary teacher.

02:05

Wouldn't it be great
if we could bring him back

02:07

to give his lectures
and inspire millions of kids,

02:10

perhaps not just in English
but in any language?

02:14

Or if you could ask our grandparents
for advice and hear those comforting words

02:19

even if they're no longer with us?

02:21

Or maybe using this tool,
book authors, alive or not,

02:25

could read aloud all of their books
for anyone interested.

02:29

The creative possibilities
here are endless,

02:31

and to me, that's very exciting.

02:34

And here's how it's working so far.

02:36

First, we introduce a new technique

02:38

that can reconstruct a high-detailed
3D face model from any image

02:42

without ever 3D-scanning the person.

02:45

And here's the same output model
from different views.

02:49

This also works on videos,

02:51

by running the same algorithm
on each video frame

02:54

and generating a moving 3D model.

02:57

And here's the same
output model from different angles.

03:01

It turns out this problem
is very challenging,

03:04

but the key trick
is that we are going to analyze

03:07

a large photo collection
of the person beforehand.

03:10

For George W. Bush,
we can just search on Google,

03:14

and from that, we are able
to build an average model,

03:16

an iterative, refined model
to recover the expression

03:19

in fine details,
like creases and wrinkles.

03:23

What's fascinating about this

03:24

is that the photo collection
can come from your typical photos.

03:28

It doesn't really matter
what expression you're making

03:30

or where you took those photos.

03:32

What matters is
that there are a lot of them.

03:35

And we are still missing color here,

03:36

so next, we develop
a new blending technique

03:39

that improves upon
a single averaging method

03:42

and produces sharp
facial textures and colors.

03:45

And this can be done for any expression.

03:49

Now we have a control
of a model of a person,

03:52

and the way it's controlled now
is by a sequence of static photos.

03:55

Notice how the wrinkles come and go,
depending on the expression.

04:00

We can also use a video
to drive the model.

04:02

(Video) Daniel Craig: Right, but somehow,

04:05

we've managed to attract
some more amazing people.

04:10

SS: And here's another fun demo.

04:11

So what you see here
are controllable models

04:13

of people I built
from their internet photos.

04:16

Now, if you transfer
the motion from the input video,

04:19

we can actually drive the entire party.

04:21

George W. Bush:
It's a difficult bill to pass,

04:23

because there's a lot of moving parts,

04:26

and the legislative processes can be ugly.

04:31

(Applause)

04:32

SS: So coming back a little bit,

04:34

our ultimate goal, rather,
is to capture their mannerisms

04:38

or the unique way each
of these people talks and smiles.

04:41

So to do that, can we
actually teach the computer

04:43

to imitate the way someone talks

04:45

by only showing it
video footage of the person?

04:48

And what I did exactly was,
I let a computer watch

04:51

14 hours of pure Barack Obama
giving addresses.

04:55

And here's what we can produce
given only his audio.

04:58

(Video) BO: The results are clear.

05:00

America's businesses have created
14.5 million new jobs

05:05

over 75 straight months.

05:07

SS: So what's being synthesized here
is only the mouth region,

05:10

and here's how we do it.

05:12

Our pipeline uses a neural network

05:14

to convert an input audio
into these mouth points.

05:18

(Video) BO: We get it through our job
or through Medicare or Medicaid.

05:22

SS: Then we synthesize the texture,
enhance details and teeth,

05:26

and blend it into the head
and background from a source video.

05:29

(Video) BO: Women can get free checkups,

05:31

and you can't get charged more
just for being a woman.

05:34

Young people can stay
on a parent's plan until they turn 26.

05:39

SS: I think these results
seem very realistic and intriguing,

05:42

but at the same time
frightening, even to me.

05:45

Our goal was to build an accurate model
of a person, not to misrepresent them.

05:49

But one thing that concerns me
is its potential for misuse.

05:53

People have been thinking
about this problem for a long time,

05:56

since the days when Photoshop
first hit the market.

05:59

As a researcher, I'm also working
on countermeasure technology,

06:03

and I'm part of an ongoing
effort at AI Foundation,

06:06

which uses a combination
of machine learning and human moderators

06:10

to detect fake images and videos,

06:12

fighting against my own work.

06:14

And one of the tools we plan to release
is called Reality Defender,

06:17

which is a web-browser plug-in
that can flag potentially fake content

06:21

automatically, right in the browser.

06:24

(Applause)

06:28

Despite all this, though,

06:30

fake videos could do a lot of damage,

06:32

even before anyone has a chance to verify,

06:35

so it's very important
that we make everyone aware

06:38

of what's currently possible

06:40

so we can have the right assumption
and be critical about what we see.

06:44

There's still a long way to go before
we can fully model individual people

06:49

and before we can ensure
the safety of this technology.

06:53

But I'm excited and hopeful,

06:54

because if we use it right and carefully,

06:58

this tool can allow any individual's
positive impact on the world

07:02

to be massively scaled

07:04

and really help shape our future
the way we want it to be.

07:07

Thank you.

07:08

(Applause)

THE ORIGINAL VIDEO ON TED.COM

Supasorn Suwajanakorn: Fake videos of real people -- and how to spot them | TED Talk | TED.com