ABOUT THE SPEAKER
Supasorn Suwajanakorn - Computer scientist
Supasorn Suwajanakorn works on ways to reconstruct, preserve and reanimate anyone -- just from their existing photos and videos.

Why you should listen

Can we create a digital avatar that looks, acts and talks just like our sweet grandma? This question has inspired Supasorn Suwajanakorn, a recent PhD graduate from the University of Washington, to spend years developing new tools to make it a reality. He has developed a set of algorithms that can build a moving 3D face model of anyone from just photos, which was awarded the Innovation of the Year in 2016. He then introduced the first system that can replicate a person's speech and produce a realistic CG-animation by only analyzing their existing video footage -- all without ever bringing in the person to a Hollywood capture studio.

Suwajanakorn is working in the field of machine learning and computer vision. His goal is to bring vision algorithms out of the lab and make them work in the wild.

TED2018

Supasorn Suwajanakorn: Fake videos of real people -- and how to spot them

Filmed:
1,453,308 views

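Do you think you're good at spotting fake videos, where famous people say things they never actually said? This talk and tech demo show how it's done. Computer scientist Supasorn Suwajanakorn shows how to use AI and 3D modeling to create fake videos synced to audio. Learn more about the ethical questions this technology raises, the possibilities it opens up, and the measures we should take to counter its misuse.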

00:12
Look at these images.
00:14
Now, tell me which Obama here is real.
00:16
(Video) Barack Obama: To help families refinance their homes,
00:19
to invest in things like high-tech manufacturing,
00:22
clean energy
00:23
and the infrastructure that creates good new jobs.
00:26
Supasorn Suwajanakorn: Anyone?
00:28
The answer is none of them.
00:30
(Laughter)
00:31
None of these is actually real.
00:33
So let me tell you how we got here.
00:35
My inspiration for this work
00:37
was a project meant to preserve our last chance for learning about the Holocaust
00:42
from the survivors.
00:44
It's called New Dimensions in Testimony,
00:47
and it allows you to have interactive conversations
00:50
with a hologram of a real Holocaust survivor.
00:53
(Video) Man: How did you survive the Holocaust?
00:55
(Video) Hologram: How did I survive?
00:57
I survived,
01:00
I believe,
01:01
because providence watched over me.
01:05
SS: Turns out these answers were prerecorded in a studio.
01:09
Yet the effect is astounding.
01:11
You feel so connected to his story and to him as a person.
01:16
I think there's something special about human interaction
01:19
that makes it much more profound
01:22
and personal
01:24
than what books or lectures or movies could ever teach us.
01:28
So I saw this and began to wonder,
01:30
can we create a model like this for anyone?
01:33
A model that looks, talks and acts just like them?
01:37
So I set out to see if this could be done
01:39
and eventually came up with a new solution
01:41
that can build a model of a person using nothing but these:
01:45
existing photos and videos of a person.
01:48
If you can leverage this kind of passive information,
01:51
just photos and video that are out there,
01:53
that's the key to scaling to anyone.
01:56
By the way, here's Richard Feynman,
01:57
who in addition to being a Nobel Prize winner in physics
02:01
was also known as a legendary teacher.
02:05
Wouldn't it be great if we could bring him back
02:07
to give his lectures and inspire millions of kids,
02:10
perhaps not just in English but in any language?
02:14
Or if you could ask our grandparents for advice and hear those comforting words
02:19
even if they're no longer with us?
02:21
Or maybe using this tool, book authors, alive or not,
02:25
could read aloud all of their books for anyone interested.
02:29
The creative possibilities here are endless,
02:31
and to me, that's very exciting.
02:34
And here's how it's working so far.
02:36
First, we introduce a new technique
02:38
that can reconstruct a high-detailed 3D face model from any image
02:42
without ever 3D-scanning the person.
02:45
And here's the same output model from different views.
02:49
This also works on videos,
02:51
by running the same algorithm on each video frame
02:54
and generating a moving 3D model.
02:57
And here's the same output model from different angles.
03:01
It turns out this problem is very challenging,
03:04
but the key trick is that we are going to analyze
03:07
a large photo collection of the person beforehand.
03:10
For George W. Bush, we can just search on Google,
03:14
and from that, we are able to build an average model,
03:16
an iterative, refined model to recover the expression
03:19
in fine details, like creases and wrinkles.
03:23
What's fascinating about this
03:24
is that the photo collection can come from your typical photos.
03:28
It doesn't really matter what expression you're making
03:30
or where you took those photos.
03:32
What matters is that there are a lot of them.
03:35
And we are still missing color here,
03:36
so next, we develop a new blending technique
03:39
that improves upon a single averaging method
03:42
and produces sharp facial textures and colors.
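
The talk does not spell out the blending technique, but the weakness of a single averaging method can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the "textures" are tiny 1-D lists of brightness values, the function names are hypothetical, and the weighted blend is only a simple stand-in for "something sharper than a plain mean", not the method described by the speaker.

```python
def average_blend(textures):
    """Per-pixel mean of equally sized textures (lists of floats)."""
    count = len(textures)
    return [sum(pixels) / count for pixels in zip(*textures)]


def weighted_blend(textures, weights):
    """Per-pixel weighted mean; weights are normalized by their sum."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, pixels)) / total
        for pixels in zip(*textures)
    ]


# Three toy "photos" of the same dark-to-bright edge, slightly misaligned.
photos = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]

# A plain mean smears the edge across the two middle pixels.
blurred = average_blend(photos)

# Favoring the photo judged closest to the target (hypothetical weights)
# keeps the edge much closer to the first photo's sharp step.
sharper = weighted_blend(photos, [8.0, 1.0, 1.0])
```

In this toy case the plain mean turns a one-pixel step into a gradual ramp, which is the kind of blur that makes averaged face textures look soft; how the weights would be chosen in a real system is left open here.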
03:45
And this can be done for any expression.
03:49
Now we have a control of a model of a person,
03:52
and the way it's controlled now is by a sequence of static photos.
03:55
Notice how the wrinkles come and go, depending on the expression.
04:00
We can also use a video to drive the model.
04:02
(Video) Daniel Craig: Right, but somehow,
04:05
we've managed to attract some more amazing people.
04:10
SS: And here's another fun demo.
04:11
So what you see here are controllable models
04:13
of people I built from their internet photos.
04:16
Now, if you transfer the motion from the input video,
04:19
we can actually drive the entire party.
04:21
George W. Bush: It's a difficult bill to pass,
04:23
because there's a lot of moving parts,
04:26
and the legislative processes can be ugly.
04:31
(Applause)
04:32
SS: So coming back a little bit,
04:34
our ultimate goal, rather, is to capture their mannerisms
04:38
or the unique way each of these people talks and smiles.
04:41
So to do that, can we actually teach the computer
04:43
to imitate the way someone talks
04:45
by only showing it video footage of the person?
04:48
And what I did exactly was, I let a computer watch
04:51
14 hours of pure Barack Obama giving addresses.
04:55
And here's what we can produce given only his audio.
04:58
(Video) BO: The results are clear.
05:00
America's businesses have created 14.5 million new jobs
05:05
over 75 straight months.
05:07
SS: So what's being synthesized here is only the mouth region,
05:10
and here's how we do it.
Our pipeline管道 uses使用 a neural神经 network网络
106
300764
1826
我们的处理系统使用神经网络
05:14
to convert兑换 and input输入 audio音频
into these mouth points.
107
302614
2936
来转换和输入音频到这些嘴巴的位置。
05:18
(Video视频) BOBO: We get it through通过 our job工作
or through通过 Medicare医保 or Medicaid医疗补助.
108
306547
4225
我们通过我们的工作或者医疗保险
或补助来实现这一目标。
05:22
SSSS: Then we synthesize合成 the texture质地,
enhance提高 details细节 and teeth,
109
310796
3420
然后我们合成纹理,
增强细节和牙齿,
05:26
and blend混合 it into the head
and background背景 from a source资源 video视频.
110
314240
3074
并将其与源视频中的
头部和背景混合在一起。
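
The stages named here (audio to mouth points, then blending a synthesized mouth into a source frame) can be sketched in miniature. This is a minimal sketch under loud assumptions: `audio_to_mouth_points` is a hypothetical stub that stands in for the neural network and simply maps loudness to mouth opening, the texture synthesis step is omitted, and `alpha_blend` is ordinary per-pixel alpha compositing on nested lists, which may differ from the blending actually used in the work described.

```python
def audio_to_mouth_points(audio_window):
    """Stub for the learned audio-to-mouth mapping (NOT a neural network).

    Maps the mean loudness of a non-empty window of samples to four
    hypothetical 2-D landmarks: left corner, right corner, upper lip,
    lower lip.
    """
    if not audio_window:
        raise ValueError("audio_window must contain at least one sample")
    loudness = sum(abs(sample) for sample in audio_window) / len(audio_window)
    opening = min(1.0, loudness)
    return [(-1.0, 0.0), (1.0, 0.0), (0.0, opening / 2), (0.0, -opening / 2)]


def alpha_blend(frame, patch, mask):
    """Composite patch over frame per pixel.

    mask = 1.0 keeps the patch, mask = 0.0 keeps the frame, and values
    in between mix the two. All inputs are equally sized 2-D lists.
    """
    return [
        [m * p + (1.0 - m) * f for f, p, m in zip(frame_row, patch_row, mask_row)]
        for frame_row, patch_row, mask_row in zip(frame, patch, mask)
    ]


# Louder audio opens the stub mouth wider.
points = audio_to_mouth_points([0.2, -0.4, 0.6])

# Blend a bright 2x2 "mouth patch" into a dark frame with a soft mask.
frame = [[0.0, 0.0], [0.0, 0.0]]
patch = [[1.0, 1.0], [1.0, 1.0]]
mask = [[0.0, 0.5], [1.0, 0.0]]
composited = alpha_blend(frame, patch, mask)
```

A soft mask (values between 0 and 1 near the boundary) is what lets the pasted region fade into the surrounding head and background instead of showing a hard seam.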
05:29
(Video) BO: Women can get free checkups,
05:31
and you can't get charged more just for being a woman.
05:34
Young people can stay on a parent's plan until they turn 26.
05:39
SS: I think these results seem very realistic and intriguing,
05:42
but at the same time frightening, even to me.
05:45
Our goal was to build an accurate model of a person, not to misrepresent them.
05:49
But one thing that concerns me is its potential for misuse.
05:53
People have been thinking about this problem for a long time,
05:56
since the days when Photoshop first hit the market.
05:59
As a researcher, I'm also working on countermeasure technology,
06:03
and I'm part of an ongoing effort at AI Foundation,
06:06
which uses a combination of machine learning and human moderators
06:10
to detect fake images and videos,
06:12
fighting against my own work.
06:14
And one of the tools we plan to release is called Reality Defender,
06:17
which is a web-browser plug-in that can flag potentially fake content
06:21
automatically, right in the browser.
06:24
(Applause)
06:28
Despite all this, though,
06:30
fake videos could do a lot of damage,
06:32
even before anyone has a chance to verify,
06:35
so it's very important that we make everyone aware
06:38
of what's currently possible
06:40
so we can have the right assumption and be critical about what we see.
06:44
There's still a long way to go before we can fully model individual people
06:49
and before we can ensure the safety of this technology.
06:53
But I'm excited and hopeful,
06:54
because if we use it right and carefully,
06:58
this tool can allow any individual's positive impact on the world
07:02
to be massively scaled
07:04
and really help shape our future the way we want it to be.
07:07
Thank you.
07:08
(Applause)
Translated by jacks peng
Reviewed by Kai Lu
