ABOUT THE SPEAKER
Deb Roy - Cognitive scientist
Deb Roy studies how children learn language, and designs machines that learn to communicate in human-like ways. On sabbatical from MIT Media Lab, he's working with the AI company Bluefin Labs.

Why you should listen

Deb Roy directs the Cognitive Machines group at the MIT Media Lab, where he studies how children learn language, and designs machines that learn to communicate in human-like ways. To enable this work, he has pioneered new data-driven methods for analyzing and modeling human linguistic and social behavior. He has authored numerous scientific papers on artificial intelligence, cognitive modeling, human-machine interaction, data mining, and information visualization.

Deb Roy co-founded and serves as CEO of Bluefin Labs, a venture-backed technology company. Built upon deep machine learning principles developed in his research over the past 15 years, Bluefin has created a technology platform that analyzes social media commentary to measure real-time audience response to TV ads and shows.

Roy adds some relevant papers:

Deb Roy. (2009). New Horizons in the Study of Child Language Acquisition. Proceedings of Interspeech 2009. Brighton, England. bit.ly/fSP4Qh

Brandon C. Roy, Michael C. Frank and Deb Roy. (2009). Exploring word learning in a high-density longitudinal corpus. Proceedings of the 31st Annual Meeting of the Cognitive Science Society. Amsterdam, Netherlands. bit.ly/e1qxej

Plenty more papers on our research including technology and methodology can be found here, together with other research from my lab at MIT: bit.ly/h3paSQ

The work that I mentioned on relationships between television content and the social graph is being done at Bluefin Labs (www.bluefinlabs.com). Details of this work have not been published. The social structures we are finding (and that I highlighted in my TED talk) are indeed new. The social media communication channels that are leading to their formation did not even exist a few years ago, and Bluefin's technology platform for discovering these kinds of structures is the first of its kind. We'll certainly have more to say about all this as we continue to dig into this fascinating new kind of data, and as new social structures continue to evolve!

TED2011

Deb Roy: The birth of a word

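MIT researcher Deb Roy wanted to understand how his newborn son learned language, so he installed cameras throughout his home to capture nearly every moment of his son's early life, then analyzed 90,000 hours of home video to watch "gaaaa" gradually turn into "water." This remarkable, data-rich research has profound implications for understanding how humans learn.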

00:15
Imagine if you could record your life -- everything you said, everything you did, available in a perfect memory store at your fingertips, so you could go back and find memorable moments and relive them, or sift through traces of time and discover patterns in your own life that previously had gone undiscovered.
00:38
Well that's exactly the journey that my family began five and a half years ago.
00:44
This is my wife and collaborator, Rupal. And on this day, at this moment, we walked into the house with our first child, our beautiful baby boy. And we walked into a house with a very special home video recording system.
01:07
(Video) Man: Okay.
01:10
Deb Roy: This moment and thousands of other moments special for us were captured in our home because in every room in the house, if you looked up, you'd see a camera and a microphone, and if you looked down, you'd get this bird's-eye view of the room. Here's our living room, the baby bedroom, kitchen, dining room and the rest of the house. And all of these fed into a disc array that was designed for a continuous capture.
01:41
So here we are flying through a day in our home as we move from sunlit morning through incandescent evening and, finally, lights out for the day.
01:53
Over the course of three years, we recorded eight to 10 hours a day, amassing roughly a quarter-million hours of multi-track audio and video. So you're looking at a piece of what is by far the largest home video collection ever made.
02:08
(Laughter)
02:11
And what this data represents for our family at a personal level, the impact has already been immense, and we're still learning its value. Countless moments of unsolicited natural moments, not posed moments, are captured there, and we're starting to learn how to discover them and find them.
02:32
But there's also a scientific reason that drove this project, which was to use this natural longitudinal data to understand the process of how a child learns language -- that child being my son. And so with many privacy provisions put in place to protect everyone who was recorded in the data, we made elements of the data available to my trusted research team at MIT so we could start teasing apart patterns in this massive data set, trying to understand the influence of social environments on language acquisition.
03:09
So we're looking here at one of the first things we started to do. This is my wife and I cooking breakfast in the kitchen, and as we move through space and through time, a very everyday pattern of life in the kitchen.
03:23
In order to convert this opaque, 90,000 hours of video into something that we could start to see, we use motion analysis to pull out, as we move through space and through time, what we call space-time worms. And this has become part of our toolkit for being able to look and see where the activities are in the data, and with it, trace the pattern of, in particular, where my son moved throughout the home, so that we could focus our transcription efforts, all of the speech environment around my son -- all of the words that he heard from myself, my wife, our nanny, and over time, the words he began to produce.
04:02
So with that technology and that data and the ability to, with machine assistance, transcribe speech, we've now transcribed well over seven million words of our home transcripts. And with that, let me take you now for a first tour into the data.
04:19
So you've all, I'm sure, seen time-lapse videos where a flower will blossom as you accelerate time. I'd like you to now experience the blossoming of a speech form. My son, soon after his first birthday, would say "gaga" to mean water. And over the course of the next half-year, he slowly learned to approximate the proper adult form, "water." So we're going to cruise through half a year in about 40 seconds. No video here, so you can focus on the sound, the acoustics, of a new kind of trajectory: gaga to water.
04:56
(Audio) Baby: Gagagagagaga Gaga gaga gaga guga guga guga wada gaga gaga guga gaga wader guga guga water water water water water water water water water.
05:41
DR: He sure nailed it, didn't he.
05:43
(Applause)
05:50
So he didn't just learn water. Over the course of the 24 months, the first two years that we really focused on, this is a map of every word he learned in chronological order. And because we have full transcripts, we've identified each of the 503 words that he learned to produce by his second birthday. He was an early talker. And so we started to analyze why. Why were certain words born before others?
06:16
This is one of the first results that came out of our study a little over a year ago that really surprised us. The way to interpret this apparently simple graph is, on the vertical is an indication of how complex caregiver utterances are based on the length of utterances. And the [horizontal] axis is time.
06:35
And all of the data, we aligned based on the following idea: Every time my son would learn a word, we would trace back and look at all of the language he heard that contained that word. And we would plot the relative length of the utterances. And what we found was this curious phenomena, that caregiver speech would systematically dip to a minimum, making language as simple as possible, and then slowly ascend back up in complexity. And the amazing thing was that bounce, that dip, lined up almost precisely with when each word was born -- word after word, systematically.
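The alignment described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the research team's code: the function name `mean_length_by_offset`, the `(day, text)` input format, the 30-day bins, and measuring utterance length in words are all hypothetical choices, since the talk does not give those details.

```python
from collections import defaultdict

def mean_length_by_offset(utterances, word, birth_day, bin_days=30):
    """Average length (in words) of utterances containing `word`,
    grouped by time offset from the day the child first produced it.

    utterances: iterable of (day, text) pairs (hypothetical format).
    Returns {bin_index: mean_length}; bin 0 covers the first `bin_days`
    days starting at birth_day, bin -1 the `bin_days` days before it.
    """
    totals = defaultdict(lambda: [0, 0])  # bin -> [sum of lengths, count]
    for day, text in utterances:
        tokens = text.lower().split()
        if word not in tokens:
            continue
        bin_index = (day - birth_day) // bin_days  # floor division, works for negatives
        totals[bin_index][0] += len(tokens)
        totals[bin_index][1] += 1
    return {b: s / n for b, (s, n) in sorted(totals.items())}

# Toy data, invented purely for illustration.
sample = [
    (100, "do you want some water with your lunch"),
    (130, "want water"),
    (150, "water"),
    (190, "here is your water cup"),
]
curve = mean_length_by_offset(sample, "water", birth_day=150)
```

With this toy input the shortest utterances fall in the bin that starts at the word's birth, which is the shape of the dip described in the talk; real data would need proper tokenization and many more utterances per bin.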
07:14
So it appears that all three primary caregivers -- myself, my wife and our nanny -- were systematically and, I would think, subconsciously restructuring our language to meet him at the birth of a word and bring him gently into more complex language. And the implications of this -- there are many, but one I just want to point out, is that there must be amazing feedback loops. Of course, my son is learning from his linguistic environment, but the environment is learning from him. That environment, people, are in these tight feedback loops and creating a kind of scaffolding that has not been noticed until now.
07:54
But that's looking at the speech context. What about the visual context? We're now looking at -- think of this as a dollhouse cutaway of our house. We've taken those circular fish-eye lens cameras, and we've done some optical correction, and then we can bring it into three-dimensional life.
08:11
So welcome to my home. This is a moment, one moment captured across multiple cameras. The reason we did this is to create the ultimate memory machine, where you can go back and interactively fly around and then breathe video-life into this system. What I'm going to do is give you an accelerated view of 30 minutes, again, of just life in the living room. That's me and my son on the floor. And there's video analytics that are tracking our movements. My son is leaving red ink. I am leaving green ink. We're now on the couch, looking out through the window at cars passing by. And finally, my son playing in a walking toy by himself.
08:52
Now we freeze the action, 30 minutes, we turn time into the vertical axis, and we open up for a view of these interaction traces we've just left behind. And we see these amazing structures -- these little knots of two colors of thread we call "social hot spots." The spiral thread we call a "solo hot spot." And we think that these affect the way language is learned. What we'd like to do is start understanding the interaction between these patterns and the language that my son is exposed to to see if we can predict how the structure of when words are heard affects when they're learned -- so in other words, the relationship between words and what they're about in the world.
09:37
So here's how we're approaching this. In this video, again, my son is being traced out. He's leaving red ink behind. And there's our nanny by the door.
09:47
(Video) Nanny: You want water? (Baby: Aaaa.) Nanny: All right. (Baby: Aaaa.)
09:53
DR: She offers water, and off go the two worms over to the kitchen to get water. And what we've done is use the word "water" to tag that moment, that bit of activity. And now we take the power of data and take every time my son ever heard the word water and the context he saw it in, and we use it to penetrate through the video and find every activity trace that co-occurred with an instance of water. And what this data leaves in its wake is a landscape. We call these wordscapes. This is the wordscape for the word water, and you can see most of the action is in the kitchen. That's where those big peaks are over to the left.
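One way to picture a wordscape is as a histogram over the floor plan: every time the word is heard, the grid cell where the activity took place gets a little taller. The sketch below assumes a hypothetical input of `(text, x, y)` tuples, where the coordinates are the child's tracked position; the function name `wordscape`, the grid `cell_size`, and the whitespace tokenization are illustrative assumptions, not details given in the talk.

```python
from collections import Counter

def wordscape(events, word, cell_size=1.0):
    """Count how often `word` was heard in each floor-plan grid cell.

    events: iterable of (text, x, y) tuples, where (x, y) is the child's
    tracked position when the utterance was heard (hypothetical format).
    Returns a Counter mapping (col, row) cells to counts; tall cells are
    the "peaks" of the landscape.
    """
    heights = Counter()
    for text, x, y in events:
        if word in text.lower().split():
            cell = (int(x // cell_size), int(y // cell_size))
            heights[cell] += 1
    return heights

# Toy data, invented for illustration: three hits in one cell, one elsewhere.
events = [
    ("you want water", 1.2, 0.5),
    ("more water", 1.7, 0.9),
    ("water please", 1.1, 0.2),
    ("bye bye", 6.3, 4.0),
    ("water bottle", 6.8, 4.4),
]
peaks = wordscape(events, "water")
```

Running the same function with a different word, such as "bye", would give a different landscape over the same floor plan, which is the contrast the talk goes on to draw.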
10:34
And just for contrast, we can do this with any word. We can take the word "bye" as in "good bye." And we're now zoomed in over the entrance to the house. And we look, and we find, as you would expect, a contrast in the landscape where the word "bye" occurs much more in a structured way. So we're using these structures to start predicting the order of language acquisition, and that's ongoing work now.
11:00
In my lab, which we're peering into now, at MIT -- this is at the media lab. This has become my favorite way of videographing just about any space. Three of the key people in this project, Philip DeCamp, Rony Kubat and Brandon Roy are pictured here. Philip has been a close collaborator on all the visualizations you're seeing. And Michael Fleischman was another Ph.D. student in my lab who worked with me on this home video analysis, and he made the following observation: that "just the way that we're analyzing how language connects to events which provide common ground for language, that same idea we can take out of your home, Deb, and we can apply it to the world of public media." And so our effort took an unexpected turn.
11:46
Think of mass media as providing common ground and you have the recipe for taking this idea to a whole new place. We've started analyzing television content using the same principles -- analyzing event structure of a TV signal -- episodes of shows, commercials, all of the components that make up the event structure. And we're now, with satellite dishes, pulling and analyzing a good part of all the TV being watched in the United States. And you don't have to now go and instrument living rooms with microphones to get people's conversations, you just tune into publicly available social media feeds. So we're pulling in about three billion comments a month, and then the magic happens.
12:30
You have the event structure, the common ground that the words are about, coming out of the television feeds; you've got the conversations that are about those topics; and through semantic analysis -- and this is actually real data you're looking at from our data processing -- each yellow line is showing a link being made between a comment in the wild and a piece of event structure coming out of the television signal.
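The talk says only that the links come from "semantic analysis" and gives no further detail (the Bluefin work is described above as unpublished). As a rough stand-in, the sketch below links a comment to a broadcast event when the comment appears shortly after the event starts and shares at least one keyword with it. The function `link_comments`, the dictionary fields, and the 300-second window are all hypothetical; keyword overlap is a much simpler technique than whatever the actual platform uses.

```python
def link_comments(events, comments, window=300):
    """Link each comment to broadcast events it plausibly refers to.

    events: list of dicts with 'id', 'start' (seconds), 'keywords' (set).
    comments: list of dicts with 'id', 'time' (seconds), 'text'.
    A link is made when the comment falls within `window` seconds after
    the event start and shares at least one keyword with it.
    Returns a list of (comment_id, event_id) pairs.
    """
    links = []
    for comment in comments:
        tokens = set(comment["text"].lower().split())
        for event in events:
            delay = comment["time"] - event["start"]
            if 0 <= delay <= window and tokens & event["keywords"]:
                links.append((comment["id"], event["id"]))
    return links

# Toy data, invented for illustration.
events = [
    {"id": "ad-1", "start": 0, "keywords": {"truck", "pickup"}},
    {"id": "show-7", "start": 120, "keywords": {"finale", "twist"}},
]
comments = [
    {"id": "c1", "time": 30, "text": "that truck ad again"},
    {"id": "c2", "time": 200, "text": "did not see that twist coming"},
    {"id": "c3", "time": 900, "text": "great finale"},
]
links = link_comments(events, comments)
```

Each returned pair corresponds to one of the yellow lines in the visualization: an edge between something someone said and a piece of content. The third comment is left unlinked because it arrives outside the time window.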
12:57
And the same idea now can be built up. And we get this wordscape, except now words are not assembled in my living room. Instead, the context, the common ground activities, are the content on television that's driving the conversations. And what we're seeing here, these skyscrapers now, are commentary that are linked to content on television. Same concept, but looking at communication dynamics in a very different sphere.
13:26
And so fundamentally, rather than, for example, measuring content based on how many people are watching, this gives us the basic data for looking at engagement properties of content. And just like we can look at feedback cycles and dynamics in a family, we can now open up the same concepts and look at much larger groups of people.
13:48
This is a subset of data from our database -- just 50,000 out of several million -- and the social graph that connects them through publicly available sources. And if you put them on one plane, a second plane is where the content lives. So we have the programs and the sporting events and the commercials, and all of the link structures that tie them together make a content graph. And then the important third dimension. Each of the links that you're seeing rendered here is an actual connection made between something someone said and a piece of content. And there are, again, now tens of millions of these links that give us the connective tissue of social graphs and how they relate to content.
14:37
And we can now start to probe the structure in interesting ways. So if we, for example, trace the path of one piece of content that drives someone to comment on it, and then we follow where that comment goes, and then look at the entire social graph that becomes activated and then trace back to see the relationship between that social graph and content, a very interesting structure becomes visible. We call this a co-viewing clique, a virtual living room if you will. And there are fascinating dynamics at play. It's not one way. A piece of content, an event, causes someone to talk. They talk to other people. That drives tune-in behavior back into mass media, and you have these cycles that drive the overall behavior.
15:22
Another example -- very different -- another actual person in our database -- and we're finding at least hundreds, if not thousands, of these. We've given this person a name. This is a pro-amateur, or pro-am media critic who has this high fan-out rate. So a lot of people are following this person -- very influential -- and they have a propensity to talk about what's on TV. So this person is a key link in connecting mass media and social media together.
15:49
One last example from this data: Sometimes it's actually a piece of content that is special. So if we go and look at this piece of content, President Obama's State of the Union address from just a few weeks ago, and look at what we find in this same data set, at the same scale, the engagement properties of this piece of content are truly remarkable. A nation exploding in conversation in real time in response to what's on the broadcast. And of course, through all of these lines are flowing unstructured language. We can X-ray and get a real-time pulse of a nation, real-time sense of the social reactions in the different circuits in the social graph being activated by content.
16:37
So, to summarize, the idea is this:
16:40
As our world becomes increasingly instrumented
16:43
and we have the capabilities
16:45
to collect and connect the dots
16:47
between what people are saying
16:49
and the context they're saying it in,
16:51
what's emerging is an ability
16:53
to see new social structures and dynamics
16:56
that have previously not been seen.
16:58
It's like building a microscope or telescope
17:00
and revealing new structures
17:02
about our own behavior around communication.
17:05
And I think the implications here are profound,
17:08
whether it's for science,
17:10
for commerce, for government,
17:12
or perhaps most of all,
17:14
for us as individuals.
17:17
And so just to return to my son,
17:20
when I was preparing this talk, he was looking over my shoulder,
17:23
and I showed him the clips I was going to show to you today,
17:25
and I asked him for permission -- granted.
17:28
And then I went on to reflect,
17:30
"Isn't it amazing,
17:33
this entire database, all these recordings,
17:36
I'm going to hand off to you and to your sister" --
17:38
who arrived two years later --
17:41
"and you guys are going to be able to go back and re-experience moments
17:44
that you could never, with your biological memory,
17:47
possibly remember the way you can now?"
17:49
And he was quiet for a moment.
17:51
And I thought, "What am I thinking?
17:53
He's five years old. He's not going to understand this."
17:55
And just as I was having that thought, he looked up at me and said,
17:58
"So that when I grow up,
18:00
I can show this to my kids?"
18:02
And I thought, "Wow, this is powerful stuff."
18:05
So I want to leave you
18:07
with one last memorable moment
18:09
from our family.
18:12
This is the first time our son
18:14
took more than two steps at once --
18:16
captured on film.
18:18
And I really want you to focus on something
18:21
as I take you through.
18:23
It's a cluttered environment; it's natural life.
18:25
My mother's in the kitchen, cooking,
18:27
and, of all places, in the hallway,
18:29
I realize he's about to do it, about to take more than two steps.
18:32
And so you hear me encouraging him,
18:34
realizing what's happening,
18:36
and then the magic happens.
18:38
Listen very carefully.
18:40
About three steps in,
18:42
he realizes something magic is happening,
18:44
and the most amazing feedback loop of all kicks in,
18:47
and he takes a breath in,
18:49
and he whispers "wow"
18:51
and instinctively I echo back the same.
18:56
And so let's fly back in time
18:59
to that memorable moment.
19:05
(Video) DR: Hey.
19:07
Come here.
19:09
Can you do it?
19:13
Oh, boy.
19:15
Can you do it?
19:18
Baby: Yeah.
19:20
DR: Ma, he's walking.
19:24
(Laughter)
19:26
(Applause)
19:28
DR: Thank you.
19:30
(Applause)
Translated by Jenny Yang
Reviewed by Ralph Jin
