All language subtitles for 030 Object Detection - Step 4-en

af Afrikaans
ak Akan
sq Albanian
am Amharic
ar Arabic
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bem Bemba
bn Bengali
bh Bihari
bs Bosnian
br Breton
bg Bulgarian
km Cambodian
ca Catalan
ceb Cebuano
chr Cherokee
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
ee Ewe
fo Faroese
tl Filipino
fi Finnish
fr French
fy Frisian
gaa Ga
gl Galician
ka Georgian
de German
el Greek
gn Guarani
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ia Interlingua
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
rw Kinyarwanda
rn Kirundi
kg Kongo
ko Korean
kri Krio (Sierra Leone)
ku Kurdish
ckb Kurdish (Soranî)
ky Kyrgyz
lo Laothian
la Latin
lv Latvian
ln Lingala
lt Lithuanian
loz Lozi
lg Luganda
ach Luo
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mfe Mauritian Creole
mo Moldavian
mn Mongolian
my Myanmar (Burmese)
sr-ME Montenegrin
ne Nepali
pcm Nigerian Pidgin
nso Northern Sotho
no Norwegian
nn Norwegian (Nynorsk)
oc Occitan
or Oriya
om Oromo
ps Pashto
fa Persian
pl Polish
pt-BR Portuguese (Brazil)
pt Portuguese (Portugal)
pa Punjabi
qu Quechua
ro Romanian
rm Romansh
nyn Runyakitara
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
sh Serbo-Croatian
st Sesotho
tn Setswana
crs Seychellois Creole
sn Shona
sd Sindhi
si Sinhalese
sk Slovak
sl Slovenian
so Somali
es Spanish
es-419 Spanish (Latin American)
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
tt Tatar
te Telugu
th Thai
ti Tigrinya
to Tonga
lua Tshiluba
tum Tumbuka
tr Turkish
tk Turkmen
tw Twi
ug Uighur
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese Download
cy Welsh
wo Wolof
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 00:00:00,420 --> 00:00:02,700 Hello and welcome to listen to Tauriel. 2 00:00:02,700 --> 00:00:08,220 All right so we have a series of transformations to make so that the input frame can be fed into the 3 00:00:08,220 --> 00:00:09,130 neural network. 4 00:00:09,270 --> 00:00:14,300 Let's do these four transformations in this tutorial and let's start immediately with the first one. 5 00:00:14,580 --> 00:00:20,910 So as we said in the previous Statoil the first one is to apply the transform transformation so that 6 00:00:21,120 --> 00:00:26,990 our input frame has a right format meaning the right time dominations and the right colors. 7 00:00:27,180 --> 00:00:34,710 So to apply this transformation we're going to introduce a new Voivode which will be frame T and that 8 00:00:34,710 --> 00:00:38,680 will respond to this future transform frame. 9 00:00:38,760 --> 00:00:44,190 But I'm calling it Frente in that frame because I want to keep the original frame because we will use 10 00:00:44,460 --> 00:00:47,700 the original frame and to put the rectangles inside. 11 00:00:47,960 --> 00:00:53,700 So Frente will be the transformed frame after applying the transform transformation and to apply this 12 00:00:53,700 --> 00:00:55,300 transform transformation. 13 00:00:55,500 --> 00:01:02,130 Well we simply need to take that transformation transform and as input as you might guess when put the 14 00:01:02,130 --> 00:01:11,340 frame frame then we must understand that this transform function returns two elements and we are only 15 00:01:11,340 --> 00:01:16,260 interested in the first element which is actually the transform frame the frame with the right format 16 00:01:16,710 --> 00:01:21,410 and therefore to get the first element returned by this function. 17 00:01:21,480 --> 00:01:27,960 We simply need to add some brackets here and get the index of the first element which is zero. 18 00:01:28,410 --> 00:01:32,140 All right so then we get this new transform frame. 19 00:01:32,160 --> 00:01:39,720 Now Frente has the right format that has the right dimensions and the right color values good first 20 00:01:39,720 --> 00:01:41,500 transformation done now. 21 00:01:41,520 --> 00:01:43,080 Second transformation. 22 00:01:43,170 --> 00:01:45,810 Remember Frente is an umpire. 23 00:01:45,930 --> 00:01:50,210 We applied the first transformation but it still returns a number by Array. 24 00:01:50,340 --> 00:01:56,690 And so the second transformation to make now is to convert this number by Array into the torch tensor. 25 00:01:56,740 --> 00:02:04,140 I remind the torch tensor is a much more advanced matrix of a single type than an array. 26 00:02:04,260 --> 00:02:08,310 It is very useful for neural networks but it will not be useful enough. 27 00:02:08,310 --> 00:02:12,040 You will see we will have still two transformations to make. 28 00:02:12,090 --> 00:02:17,240 So let's do this second transformation converting the non-bio right into a torch sensor. 29 00:02:17,280 --> 00:02:24,780 So since we're getting closer and closer to the final input that will be accepted by our as is the neural 30 00:02:24,780 --> 00:02:25,480 network. 31 00:02:25,650 --> 00:02:32,340 I'm going to start to call this input X because then I will just override the x variable with the same 32 00:02:32,340 --> 00:02:32,870 name. 33 00:02:33,060 --> 00:02:39,520 So X will be this tortured answer that will just be converted from the non-firing. 34 00:02:39,810 --> 00:02:45,150 So now it's very simple to convert a non binary into a tortured answer. 35 00:02:45,150 --> 00:02:53,070 We just need to take our torched library and then a simple function very intuitive to remember which 36 00:02:53,070 --> 00:02:55,840 is from none. 37 00:02:56,130 --> 00:03:02,970 All right from run by the function that converts numbers into torsion answers and therefore very obviously 38 00:03:03,090 --> 00:03:07,240 what we have to input here in this function is of course our number. 39 00:03:07,250 --> 00:03:07,650 Right. 40 00:03:07,650 --> 00:03:10,870 That is Frente Okay perfect. 41 00:03:10,950 --> 00:03:16,250 But now there is another slight thing to do we could call it a sub transformation. 42 00:03:16,250 --> 00:03:23,400 This wasn't a transformation that I mentioned because it's a small thing but anyway this small transformation 43 00:03:23,430 --> 00:03:31,340 is to convert the sequence red blue green into green red blue because right now the order of the indexes 44 00:03:31,340 --> 00:03:34,560 or the color for our image is red blue green. 45 00:03:34,680 --> 00:03:40,890 But the neural network says he was trained under the convention green red blue. 46 00:03:40,920 --> 00:03:46,260 So we just need to inverse the order but that's a specific thing to the neural network and therefore 47 00:03:46,530 --> 00:03:52,320 this is not the general transformations that we have to make each time remember that the transformations 48 00:03:52,320 --> 00:03:58,920 that we're making except the one we're about to make right now is the general process before finding 49 00:03:58,970 --> 00:04:04,710 and then put into a new network implemented with torche that's very important to remember but by doing 50 00:04:04,710 --> 00:04:07,330 it two or three times it will be very easy for you. 51 00:04:07,380 --> 00:04:12,360 So let's do this small and quick transformation that is a permanent nation of colors. 52 00:04:12,540 --> 00:04:18,510 So we're going to add a dot here and then we're going to use the permute function and we want to go 53 00:04:18,510 --> 00:04:21,870 from red blue green to green red blue. 54 00:04:21,930 --> 00:04:29,550 So since Green is the last index number two we always to hear then since Red is the first one we put 55 00:04:29,620 --> 00:04:36,000 a zero and since Blue is the second index that is one we put one right. 56 00:04:36,000 --> 00:04:39,220 We want to go from red blue green to green red blue. 57 00:04:39,220 --> 00:04:43,030 So we want to go from 0 1 to 2 2 0 1. 58 00:04:43,170 --> 00:04:44,040 Perfect. 59 00:04:44,040 --> 00:04:50,850 So now we have the right torch sensor format with the right order of colors. 60 00:04:50,850 --> 00:04:51,570 Perfect. 61 00:04:51,570 --> 00:04:56,220 Now next transformation this is the third transformation out of the four. 62 00:04:56,310 --> 00:05:01,450 But actually we're going to make the last two transformations in the same one line of code. 63 00:05:01,470 --> 00:05:07,810 So the third transformation is to add this fake dimension corresponding to the batch. 64 00:05:07,830 --> 00:05:15,300 And the reason for doing this is that the neural network cannot actually accept single inputs like a 65 00:05:15,300 --> 00:05:18,810 single input vector or a single input image. 66 00:05:18,900 --> 00:05:22,540 It only accepts them in to some batches. 67 00:05:22,560 --> 00:05:27,040 So basically the neural network only except batches of inputs. 68 00:05:27,180 --> 00:05:33,390 And that's why now we have to create a structure with the first time engine correspond to the batch 69 00:05:33,750 --> 00:05:37,480 and then the other dimensions corresponding to the input. 70 00:05:37,680 --> 00:05:40,440 That's very very important in PI torch. 71 00:05:40,470 --> 00:05:44,500 You will always see that we do it with the squeeze function. 72 00:05:44,580 --> 00:05:50,490 So each time you see the squeeze function it's definitely just before feeding the neural network with 73 00:05:50,490 --> 00:05:54,820 the input we use the N squeeze function to create that thing domination of the batch. 74 00:05:54,840 --> 00:06:00,940 So good thing to keep in mind if you're going to use torture in the future which I highly recommend. 75 00:06:01,110 --> 00:06:03,700 But let's do this third transformation. 76 00:06:03,830 --> 00:06:07,440 The Squeeze and the good news is that it's very simple. 77 00:06:07,440 --> 00:06:12,350 We take our input image which is now a torch sensor. 78 00:06:12,450 --> 00:06:20,340 Thanks to the previous transformation we are added and then we use the squeeze function. 79 00:06:20,910 --> 00:06:28,950 And now this squeeze function takes one argument which is exactly the index of the dimension of the 80 00:06:28,950 --> 00:06:33,290 batch and the batch should always be the first time engine. 81 00:06:33,480 --> 00:06:39,700 So it you always have the first index and therefore since indexes and bytes and start at zero. 82 00:06:39,870 --> 00:06:42,700 Well we put a zero that zero. 83 00:06:42,700 --> 00:06:48,780 Here is the Index of this first diamond and corresponding to the bet that we're adding to our structure 84 00:06:48,900 --> 00:06:50,300 of input images. 85 00:06:51,460 --> 00:06:53,810 All right so that's the third transformation that's done. 86 00:06:53,860 --> 00:06:54,760 Perfect. 87 00:06:54,850 --> 00:06:58,350 And now we're going to do the last transformation in that same line. 88 00:06:58,600 --> 00:07:00,980 And that last transformation. 89 00:07:01,030 --> 00:07:06,970 The final one and then we'll be ready to feed the new one that work with the input image is to convert 90 00:07:07,240 --> 00:07:16,180 this batch of torture and sort of input into a torch viable I remind a torch viable is a highly advanced 91 00:07:16,180 --> 00:07:20,080 variable that contains both a tensor and a gradient. 92 00:07:20,080 --> 00:07:25,810 This torch viable will become an element of the dynamic graph which will compute very efficiently the 93 00:07:25,810 --> 00:07:30,480 gradients of any composition functions during backward propagation. 94 00:07:30,490 --> 00:07:31,340 So there we go. 95 00:07:31,360 --> 00:07:33,150 That's the final transformation. 96 00:07:33,190 --> 00:07:34,880 And that is very easy to do. 97 00:07:34,930 --> 00:07:41,880 We simply need to take the variable class like that. 98 00:07:41,920 --> 00:07:48,630 So this is the variable class which will create an object which will be this towards variable. 99 00:07:48,760 --> 00:07:54,730 And therefore since we're creating a new object we need to override the previous variable X because 100 00:07:54,730 --> 00:08:01,720 I'm going to call it X again and I'm going to add here X equals variable x and zero. 101 00:08:02,080 --> 00:08:06,870 And that's exactly an object which is the torch viable. 102 00:08:06,890 --> 00:08:10,090 So now purrfect are four transformations are done. 103 00:08:10,150 --> 00:08:17,590 So congratulations you are ready to feed the is the neural network with the input images that are now 104 00:08:17,770 --> 00:08:18,940 towards variables. 105 00:08:19,180 --> 00:08:20,440 So that's wonderful. 106 00:08:20,440 --> 00:08:21,900 We're done with this material. 107 00:08:21,910 --> 00:08:24,730 We did very well our four transformations. 108 00:08:24,730 --> 00:08:31,600 So in the next time we will feed this torch very well to the neural network SSD which is already pre-trained 109 00:08:31,630 --> 00:08:37,870 because you know it's a neural network that was pre-trained to detect between 30 to 40 objects including 110 00:08:37,870 --> 00:08:42,500 planes horses dogs cars boats a lot more. 111 00:08:42,580 --> 00:08:44,820 So we're not going to do the whole training again. 112 00:08:44,830 --> 00:08:45,930 That's the beauty of it. 113 00:08:45,940 --> 00:08:47,860 We have a pre-trained model. 114 00:08:47,880 --> 00:08:51,730 We're going to feed the input image to this pre-trained model and we will get the output. 115 00:08:51,790 --> 00:08:54,470 That is the prediction detection. 116 00:08:54,970 --> 00:08:58,720 So let's do this in the next little while and until then enjoy computer vision. 12714

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.