Lecture 1, Segment 9: What can AI do? Perception (English subtitles)

Okay, other areas. Another big sub-area of artificial intelligence is perceiving the world, and in large part this is vision, but there are other kinds of perception. So, things like object recognition and face recognition. You probably have a lot of this technology closer than you think. You probably have face recognition built into your cameras; that's actually face detection, not always face recognition, though some cameras do that too. Segmenting scenes into pieces, figuring out for a given image what it means, what's going on.

Here's an example on the left of an image, and overlaid on the image is the machine's reconstruction of kind of the underlying 3D outline and mesh, and you can see that reconstructed on the right. This is from an image, but actually it turns out that in addition to being able to do a bunch of cool things in vision with the image, one realization we've had in cases like autonomous driving and vision is we don't have to use the tools that humans use. We've spent a long time with vision just trying to use, like, a camera, or maybe two cameras slightly apart, because that's what we have: we've got two cameras slightly apart. But then we realized we can do other stuff.

So what's this, anybody recognize this? It's a Kinect. The Kinect's got sensors that you don't. Sorry, you didn't get a rangefinder, a depth detector, you just didn't. But, you know, we can build them, so why not. And so now we can do cool things like take an image and produce a depth map that isn't just about parallax, looking at the difference between the two eyes, or about kind of inferring from occlusion. You'll notice, people like to think that vision is all about having two images, but if you close one eye, you can still see depth. It's not like the world suddenly goes flat and you shriek. I mean, you close one eye, you still have a sense of depth. We want to be able to build machines that do that, but at the moment we do pretty well by using things like depth detectors, 'cause why not.

Let's take a look at science fiction again. Does anybody recognize this movie? Does anybody know what this is gonna be? Yeah, this is Terminator here, and let's take a look at what it's like to be a Terminator; it's relevant to vision. So here's what it's like to be a Terminator. It's actually a lot like being Governor of California, apparently.

Okay, so he looks around: okay, motorcycle, motorcycle, motorcycle, car, motorcycle, some place, target acquired. Okay, so looking around, outlines, detection. Identifying what the objects are, figuring out what the target is. That's in the movies. Straight out of science fiction.

Let's look at a vision recognition system. This is a cute demo from Al Rahimi's lab. So here we have a camera panning around, and we can do exactly the same thing but for real. So here we have the cat. Cat... Frog... Fox... Dalmatian... Bulldog... Terminate bulldog, right. Okay.

So, this is a case where I think it's amazing how close we can get to what people thought it might be like if this technology were possible. This is not robots from the future detecting the bulldog, this is today.
