1
00:00:00,510 --> 00:00:05,250
Hello and welcome to this new tutorial. In this tutorial we describe the challenge that
2
00:00:05,250 --> 00:00:05,890
we have.
3
00:00:05,970 --> 00:00:12,750
We have to detect a funny dog in a two-second video, and we will do it through computer vision based
4
00:00:12,840 --> 00:00:14,730
on deep learning neural networks.
5
00:00:14,790 --> 00:00:18,050
So we have already found the right folder.
6
00:00:18,070 --> 00:00:22,850
In this quick tutorial I'm going to explain the libraries that we're going to use.
7
00:00:22,890 --> 00:00:27,480
They're all already installed, and I already prepared the code that will import them.
8
00:00:27,480 --> 00:00:33,000
So there is nothing to do but I think it's important that you understand what we will be using them
9
00:00:33,000 --> 00:00:33,840
for.
10
00:00:33,840 --> 00:00:39,060
So let's start with the first one. As you can see, the first library we import is torch. That's of course
11
00:00:39,150 --> 00:00:47,220
the torch library that contains PyTorch, which is definitely by far our best weapon to build a neural
12
00:00:47,220 --> 00:00:52,650
network and do some computer vision, and that's for a specific reason: it's because PyTorch contains
13
00:00:52,650 --> 00:00:58,890
the dynamic graphs, thanks to which we are able to compute very efficiently the gradients of composed
14
00:00:58,890 --> 00:01:01,590
functions during backward propagation.
15
00:01:01,590 --> 00:01:06,870
You know, when we have to update the weights through stochastic gradient descent, well, we have to compute
16
00:01:06,960 --> 00:01:11,380
the gradients of some composed functions, because we have several layers.
17
00:01:11,400 --> 00:01:14,240
You know, it's a deep neural network, so we have several layers.
18
00:01:14,370 --> 00:01:19,740
And therefore it's like each layer is some function of the previous layer, which is some function of the previous
19
00:01:19,740 --> 00:01:23,160
previous layer, so that generates some composed functions.
20
00:01:23,160 --> 00:01:28,140
We have to compute the gradients of these composed functions to update the weights according to how
21
00:01:28,140 --> 00:01:32,980
much they are responsible for the error between the target and the predictions.
22
00:01:33,120 --> 00:01:34,860
So that's where it comes into play.
23
00:01:35,010 --> 00:01:41,400
And the dynamic graph is a highly advanced graph structure that allows us to have some very fast and efficient
24
00:01:41,610 --> 00:01:43,550
computation of the gradients.
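Just to give you an idea of what that buys us, here is a minimal sketch (it uses the modern requires_grad flag; nothing here is specific to our project):

import torch

# y = (3x)^2 is a composition of functions, like two stacked layers
x = torch.tensor([2.0], requires_grad=True)  # leaf node of the dynamic graph
g = x * 3                                    # first "layer"
y = (g ** 2).sum()                           # second "layer": y = 9x^2

# backward() walks the graph built on the fly and applies the chain rule
y.backward()
print(x.grad)  # dy/dx = 18x -> tensor([36.])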
25
00:01:43,560 --> 00:01:49,980
So that's why torch is our first choice. Then, from torch.autograd, which is the module responsible
26
00:01:49,980 --> 00:01:51,240
for gradient descent,
27
00:01:51,360 --> 00:01:58,710
we are importing the Variable class, which will be used to convert the tensors into some torch Variables
28
00:01:58,710 --> 00:02:02,030
that will contain both the tensor and a gradient.
29
00:02:02,160 --> 00:02:07,710
And then this torch Variable containing the tensor and the gradient will be one element of the graph.
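A minimal sketch of that wrapping (the Variable class lives in torch.autograd; in recent PyTorch versions tensors track gradients themselves, so treat this as the older-style API used in this course):

import torch
from torch.autograd import Variable

t = torch.Tensor([[1.0, 2.0], [3.0, 4.0]])  # a plain tensor
v = Variable(t)                             # wrap it into one node of the graph
print(v.data)  # the tensor it contains
print(v.grad)  # the gradient slot, still None before any backward pass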
30
00:02:07,710 --> 00:02:14,550
Then of course we import cv2, and that's even if we're not going to implement a model based on OpenCV Haar
31
00:02:14,580 --> 00:02:20,310
Cascades; we're just importing cv2 because we will be drawing some rectangles around the detections.
32
00:02:20,390 --> 00:02:24,860
But the detection of the dog will not be based on OpenCV Haar Cascades.
33
00:02:25,020 --> 00:02:33,160
It will be based on SSD, the neural network that does single shot multibox detection; OpenCV is used just to draw the rectangles.
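A minimal sketch of that drawing step, on a placeholder image (the coordinates and the label are made up for illustration):

import cv2
import numpy as np

frame = np.zeros((300, 300, 3), dtype=np.uint8)  # placeholder black image
# cv2.rectangle(image, top-left corner, bottom-right corner, BGR color, thickness)
cv2.rectangle(frame, (50, 60), (200, 220), (255, 0, 0), 2)
cv2.putText(frame, 'dog', (50, 55), cv2.FONT_HERSHEY_SIMPLEX,
            0.6, (255, 255, 255), 2, cv2.LINE_AA)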
34
00:02:33,180 --> 00:02:41,490
Then here, from data, we import BaseTransform and the classes as labelmap. data is just
35
00:02:41,490 --> 00:02:50,010
a folder that contains BaseTransform and the classes. BaseTransform is a class that will
36
00:02:50,010 --> 00:02:56,850
do the required transformations so that the input images will be compatible with the neural
37
00:02:56,850 --> 00:02:57,450
network.
38
00:02:57,510 --> 00:03:02,370
You know, when we feed the neural network with the input images, they have to have a certain format, and
39
00:03:02,490 --> 00:03:08,880
BaseTransform will be used to transform the images into this format so that they can be accepted into
40
00:03:08,880 --> 00:03:10,100
the network.
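A sketch of how it is typically used, assuming BaseTransform takes a target size plus a color mean and returns the transformed image as its first result (the signature is assumed from the course's data module, and the mean values are just an example):

import numpy as np
import torch
from data import BaseTransform

# 300 is the input size expected by SSD300; the triplet is an assumed color mean
transform = BaseTransform(300, (104.0, 117.0, 123.0))
frame = np.zeros((480, 640, 3), dtype=np.uint8)    # placeholder video frame
frame_t = transform(frame)[0]                      # resized, mean-subtracted array
x = torch.from_numpy(frame_t).permute(2, 0, 1)     # HxWxC -> CxHxW for the network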
41
00:03:10,380 --> 00:03:12,360
And then, what is labelmap?
42
00:03:12,630 --> 00:03:17,160
Well, labelmap is just a dictionary that will do the encoding of the classes.
43
00:03:17,280 --> 00:03:24,390
So for example, planes will be encoded as 1 and dogs will be encoded as 2. That's just an example; it's not
44
00:03:24,540 --> 00:03:26,710
exactly this mapping, but that's the idea.
45
00:03:26,730 --> 00:03:31,170
It will do a mapping, because of course we want to work with numbers and not text.
46
00:03:31,230 --> 00:03:36,480
So that's just a very simple dictionary doing the mapping between the text names of the classes and
47
00:03:36,540 --> 00:03:37,860
some integers.
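For example, something like this (the class names and numbers here are purely illustrative, not the real encoding):

# illustrative only: text class names mapped to integers
classes = ('plane', 'bicycle', 'bird', 'cat', 'dog')
encoding = {name: i + 1 for i, name in enumerate(classes)}  # 0 left for background
print(encoding['dog'])  # -> 5: we work with this number, not the string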
48
00:03:37,860 --> 00:03:42,400
All right. Then, from ssd, we import build_ssd.
49
00:03:42,720 --> 00:03:51,600
So first, ssd is the library of the single shot multibox detection model, and then build_ssd, which we import
50
00:03:51,600 --> 00:03:57,120
from the ssd library, will be the constructor of the SSD neural network.
51
00:03:57,120 --> 00:04:03,090
And so if you want to have a look, you can have a look at this ssd.py file to see how it
52
00:04:03,090 --> 00:04:03,680
works.
53
00:04:03,840 --> 00:04:09,300
But it is just a constructor that will build the architecture of this single shot multibox detection
54
00:04:09,300 --> 00:04:10,100
model.
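A sketch of how the constructor is typically called, assuming a pretrained weights file is available (the 'test' phase argument and the file name are assumptions based on common SSD implementations, not guaranteed to match this exact code):

import torch
from ssd import build_ssd

net = build_ssd('test')  # 'test' builds the network in inference mode
# load pretrained weights onto the CPU; the file name is just an example
net.load_state_dict(torch.load('ssd300_mAP_77.43_v2.pth',
                               map_location=lambda storage, loc: storage))
net.eval()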
55
00:04:10,230 --> 00:04:19,860
And finally, imageio is just the library that we'll use to process the images of the video, applying
56
00:04:19,980 --> 00:04:23,250
the detect function that we will implement on the images.
57
00:04:23,250 --> 00:04:30,960
So at first we wanted to import PIL, which is another library, but imageio actually turns out to
58
00:04:30,960 --> 00:04:34,090
be a much better choice in terms of lines of code.
59
00:04:34,170 --> 00:04:40,350
You'll see we will only have to type two or three lines of code to be able to process the images of
60
00:04:40,350 --> 00:04:41,100
the video.
61
00:04:41,220 --> 00:04:47,430
That is, the video of the funny dog, and to apply the detect function that we will implement to detect the dog and
62
00:04:47,430 --> 00:04:49,430
the humans in the video.
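Those two or three lines will look roughly like this (a sketch: the video file name is assumed, and detect(), net and transform come from the rest of the code, with detect() being defined in the next tutorial):

import imageio

reader = imageio.get_reader('funny_dog.mp4')       # open the input video
fps = reader.get_meta_data()['fps']                # keep the original frame rate
writer = imageio.get_writer('output.mp4', fps=fps)
for frame in reader:
    frame = detect(frame, net.eval(), transform)   # annotate each frame
    writer.append_data(frame)                      # write it to the output video
writer.close()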
63
00:04:50,010 --> 00:04:50,370
All right.
64
00:04:50,370 --> 00:04:56,490
So I hope you now have a clear understanding of the libraries that will be used.
65
00:04:56,490 --> 00:04:58,360
It's important to know how they work.
66
00:04:58,560 --> 00:05:05,820
And now what we're going to do is define the detect function that will do the detections.
67
00:05:05,840 --> 00:05:08,650
So let's take a fresh start in the next tutorial.
68
00:05:08,750 --> 00:05:10,790
And so until then enjoy computer vision.