All language subtitles for 028 Object Detection - Step 2-en

1
00:00:00,510 --> 00:00:05,250
Hello and welcome to this new tutorial. In the previous tutorial we described the challenge that

2
00:00:05,250 --> 00:00:05,890
we have.

3
00:00:05,970 --> 00:00:12,750
We have to detect a funny dog in a two-second video, and we will do it through computer vision based

4
00:00:12,840 --> 00:00:14,730
on deep learning neural networks.

5
00:00:14,790 --> 00:00:18,050
So we already found the right folder. Now,

6
00:00:18,070 --> 00:00:22,850
in this quick tutorial I'm going to explain the libraries that we're going to use.

7
00:00:22,890 --> 00:00:27,480
They're all already installed, and I already prepared the code that will import them.

8
00:00:27,480 --> 00:00:33,000
So there is nothing to do, but I think it's important that you understand what we will be using them

9
00:00:33,000 --> 00:00:33,840
for.

10
00:00:33,840 --> 00:00:39,060
So let's start with the first one. As you can see, the first library we import is torch. That's of course

11
00:00:39,150 --> 00:00:47,220
the torch library that contains PyTorch, which is definitely by far our best weapon to build a neural

12
00:00:47,220 --> 00:00:52,650
network and do some computer vision. And that's for a specific reason: it's because PyTorch contains

13
00:00:52,650 --> 00:00:58,890
the dynamic graphs, thanks to which we are able to compute very efficiently the gradients of composition

14
00:00:58,890 --> 00:01:01,590
functions in backward propagation.

15
00:01:01,590 --> 00:01:06,870
You know, when we have to update the weights through stochastic gradient descent, well, we have to compute

16
00:01:06,960 --> 00:01:11,380
the gradient of some composition functions because we have several layers.

17
00:01:11,400 --> 00:01:14,240
You know, it's a deep neural network, so we have several layers.

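The point above about gradients of composition functions can be made concrete with a tiny hand-written example. This is only a sketch in plain Python, not the tutorial's code: the two "layers", the squared-error loss, and every name (`layer1`, `layer2`, `gradients`, `sgd_step`) are assumptions chosen for illustration. It applies the chain rule through two stacked functions and then takes one gradient descent step.

```python
import math


def layer1(x, w1):
    # First "layer": a weighted input passed through tanh.
    return math.tanh(w1 * x)


def layer2(h, w2):
    # Second "layer": a plain weighted output.
    return w2 * h


def loss(x, target, w1, w2):
    # Squared error between the prediction and the target.
    prediction = layer2(layer1(x, w1), w2)
    return 0.5 * (prediction - target) ** 2


def gradients(x, target, w1, w2):
    # Forward pass, keeping the intermediate values.
    h = layer1(x, w1)
    y = layer2(h, w2)
    # Backward pass: chain rule applied from the loss back to each weight.
    dloss_dy = y - target
    dloss_dw2 = dloss_dy * h            # y = w2 * h
    dloss_dh = dloss_dy * w2
    dh_dw1 = (1.0 - h ** 2) * x         # derivative of tanh(w1 * x) w.r.t. w1
    dloss_dw1 = dloss_dh * dh_dw1
    return dloss_dw1, dloss_dw2


def sgd_step(x, target, w1, w2, lr=0.1):
    # One gradient descent update of both weights.
    g1, g2 = gradients(x, target, w1, w2)
    return w1 - lr * g1, w2 - lr * g2
```

A library such as PyTorch automates exactly this bookkeeping for networks with many layers, so the backward pass never has to be written by hand.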
18
00:01:14,370 --> 00:01:19,740
And therefore it's like one layer is some function of the previous layer, which is some function of the previous

19
00:01:19,740 --> 00:01:23,160
previous layer, so that generates some composition functions.

20
00:01:23,160 --> 00:01:28,140
We have to compute the gradient of these composition functions to update the weights according to how

21
00:01:28,140 --> 00:01:32,980
much they are responsible for the error between the target and the predictions.

22
00:01:33,120 --> 00:01:34,860
So that's where it comes into play.

23
00:01:35,010 --> 00:01:41,400
And the dynamic graph is a highly advanced graph structure that allows some very fast and efficient

24
00:01:41,610 --> 00:01:43,550
computation of the gradients.

25
00:01:43,560 --> 00:01:49,980
So that's why torch is our first choice. Then, from torch.autograd, which is the module responsible

26
00:01:49,980 --> 00:01:51,240
for gradient computation,

27
00:01:51,360 --> 00:01:58,710
we are importing the Variable class, which will be used to convert the tensors into some torch Variables

28
00:01:58,710 --> 00:02:02,030
that will contain both the tensor and a gradient.

29
00:02:02,160 --> 00:02:07,710
And then the torch Variable containing the tensor and the gradient will be one element of the graph.

30
00:02:07,710 --> 00:02:14,550
Then of course we import cv2, and that's even if we're not going to implement a model based on OpenCV Haar

31
00:02:14,580 --> 00:02:20,310
cascades. We're just importing cv2 because we will be drawing some rectangles around the dog.

32
00:02:20,390 --> 00:02:24,860
But the detection of the dog will not be based on OpenCV Haar cascades.

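The idea described above, a variable that holds both a value and a gradient and acts as one element of a dynamically built graph, can be sketched in a few lines of plain Python. This is a toy illustration, not PyTorch's implementation: the class name `Var` and its methods are hypothetical, and it handles scalars only. (In PyTorch 0.4 and later, tensors track gradients themselves and `Variable` remains only as a deprecated wrapper.)

```python
class Var:
    """Toy graph node holding a scalar value and its gradient (hypothetical)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        # Each entry is (parent_node, local_gradient_of_this_node_wrt_parent).
        self._parents = parents

    def __add__(self, other):
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Var(self.data + other.data, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        # d(a * b)/da = b and d(a * b)/db = a
        return Var(self.data * other.data,
                   ((self, other.data), (other, self.data)))

    def backward(self):
        # Order the graph so every node is processed before its parents.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Propagate gradients backward with the chain rule.
        for node in reversed(order):
            for parent, local_grad in node._parents:
                parent.grad += local_grad * node.grad
```

The graph is "dynamic" in the sense that it is recorded while the arithmetic runs: writing `y = w * x + b` builds the nodes, and `y.backward()` then fills in `w.grad`, `x.grad`, and `b.grad`.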
33
00:02:25,020 --> 00:02:33,160
It will be based on SSD, the neural network that is Single Shot MultiBox Detection. So OpenCV is just

34
00:02:33,180 --> 00:02:41,490
to draw the rectangles. Then here, from data import BaseTransform and VOC_CLASSES as labelmap. data is just

35
00:02:41,490 --> 00:02:50,010
a folder that contains the classes BaseTransform and VOC_CLASSES. Then, BaseTransform is a class that will

36
00:02:50,010 --> 00:02:56,850
do the required transformations so that the input images will be compatible with the neural

37
00:02:56,850 --> 00:02:57,450
network.

38
00:02:57,510 --> 00:03:02,370
You know, when we feed the neural network with the input images, they have to have a certain format, and

39
00:03:02,490 --> 00:03:08,880
BaseTransform will be used to transform the images into this format so that they can be accepted into

40
00:03:08,880 --> 00:03:10,100
the network.

41
00:03:10,380 --> 00:03:12,360
And then VOC_CLASSES.

42
00:03:12,630 --> 00:03:17,160
Well, VOC_CLASSES is just a dictionary that will do the encoding of the classes.

43
00:03:17,280 --> 00:03:24,390
So for example planes will be encoded as one, dogs will be encoded as two. It's just an example, it's not

44
00:03:24,540 --> 00:03:26,710
exactly this mapping, but that's the idea.

45
00:03:26,730 --> 00:03:31,170
We'll do a mapping because of course we want to work with numbers and not text.

46
00:03:31,230 --> 00:03:36,480
So that's just a very simple dictionary doing the mapping between the text fields of the classes and

47
00:03:36,540 --> 00:03:37,860
some integers.

48
00:03:37,860 --> 00:03:42,400
All right. Then, from ssd import build_ssd.

49
00:03:42,720 --> 00:03:51,600
So first, ssd is the library of the Single Shot MultiBox Detection model, and then build_ssd, which we import

50
00:03:51,600 --> 00:03:57,120
from the ssd library, will be the constructor of the SSD neural network.

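The label map described above, a simple mapping between class names and integers, might look like the following sketch. The class names, the numbering starting at one, and the helper names `encode` and `decode` are all illustrative assumptions that follow the spoken example; they are not the actual contents of the `data` folder.

```python
# Hypothetical subset of class names, for illustration only.
CLASS_NAMES = ("plane", "dog", "person")

# Text label -> integer, numbered from 1 as in the spoken example.
NAME_TO_INDEX = {name: index for index, name in enumerate(CLASS_NAMES, start=1)}

# Integer -> text label, the inverse mapping, used to print readable results.
INDEX_TO_NAME = {index: name for name, index in NAME_TO_INDEX.items()}


def encode(name):
    # Convert a class name into the integer the network works with.
    return NAME_TO_INDEX[name]


def decode(index):
    # Convert a predicted integer back into a readable class name.
    return INDEX_TO_NAME[index]
```

The network only ever sees the integers; the inverse mapping is what lets a detection be displayed as the word "dog" next to its rectangle.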
51
00:03:57,120 --> 00:04:03,090
And so if you want to have a look, you can have a look in this ssd.py file to see how it

52
00:04:03,090 --> 00:04:03,680
works.

53
00:04:03,840 --> 00:04:09,300
But it is just a constructor that will build the architecture of this Single Shot MultiBox Detection

54
00:04:09,300 --> 00:04:10,100
model.

55
00:04:10,230 --> 00:04:19,860
And finally, imageio is just the library that we'll use to process the images of the video and apply

56
00:04:19,980 --> 00:04:23,250
the detect function that we'll implement on the images.

57
00:04:23,250 --> 00:04:30,960
So at first we wanted to import PIL, P-I-L, which is another library, but imageio actually turns out to

58
00:04:30,960 --> 00:04:34,090
be a much better choice in terms of lines of code.

59
00:04:34,170 --> 00:04:40,350
You'll see, we will only have to type two or three lines of code to be able to process the images of

60
00:04:40,350 --> 00:04:41,100
the video,

61
00:04:41,220 --> 00:04:47,430
that is funny_dog.mp4, and apply the detect function that we'll implement to detect the dog and

62
00:04:47,430 --> 00:04:49,430
the humans in the video.

63
00:04:50,010 --> 00:04:50,370
All right.

64
00:04:50,370 --> 00:04:56,490
So I hope you now have a clear understanding of the libraries that will be used.

65
00:04:56,490 --> 00:04:58,360
It's important to know how they work.

66
00:04:58,560 --> 00:05:05,820
And now what we're going to do is define the detect function that will do the detections.

67
00:05:05,840 --> 00:05:08,650
So let's take a fresh start in the next tutorial.

68
00:05:08,750 --> 00:05:10,790
And so until then, enjoy computer vision.
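The frame-by-frame processing mentioned above could be organised roughly as follows. This is a sketch under assumptions, not the code written in the next tutorial: the function names `annotate_frames` and `annotate_video`, the file paths, and the one-argument `detect` callable are hypothetical. The imageio calls used (`get_reader`, `get_meta_data`, `get_writer`, `append_data`, `close`) belong to imageio's v2-style API, and reading an mp4 also requires an ffmpeg plugin to be installed.

```python
def annotate_frames(frames, detect):
    # Apply the detection function to every frame, one at a time.
    for frame in frames:
        yield detect(frame)


def annotate_video(in_path, out_path, detect):
    # Imported here so the helper above can be used without imageio installed.
    import imageio

    reader = imageio.get_reader(in_path)        # iterating yields one frame at a time
    fps = reader.get_meta_data()["fps"]         # keep the original frame rate
    writer = imageio.get_writer(out_path, fps=fps)
    try:
        for annotated in annotate_frames(reader, detect):
            writer.append_data(annotated)       # write each processed frame
    finally:
        writer.close()
        reader.close()
```

Keeping the per-frame loop separate from the file handling means the detection step can be tried on any iterable of frames before a real video is involved.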
