Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated:
1
00:00:00,550 --> 00:00:05,560
Hello and welcome to this new tutorial in the previous tutorial we loaded the Cascades and now that
2
00:00:05,560 --> 00:00:09,480
we did that we were ready to define a function that will do the detections.
3
00:00:09,610 --> 00:00:12,960
So we will make that function in two or three tutorials.
4
00:00:13,060 --> 00:00:18,970
We will start by defining it and getting the coordinates of the rectangle that will detect the face
5
00:00:19,060 --> 00:00:24,520
and then the eyes then we will have a for loop that will iterate through the different faces there are
6
00:00:24,520 --> 00:00:26,530
detected in the video.
7
00:00:26,560 --> 00:00:31,620
And so for each of these faces we will draw a rectangle indicating that's a face.
8
00:00:31,810 --> 00:00:36,330
And also inside each of these faces we will detect some eyes.
9
00:00:36,340 --> 00:00:38,000
So that's what this function is about.
10
00:00:38,050 --> 00:00:44,320
And let's implement it right now to define a function in python we have to start with Teff and then
11
00:00:44,320 --> 00:00:50,500
we need to give a name to this function we we're going to call it detect that we go detect and now we
12
00:00:50,500 --> 00:00:52,270
need to input some arguments.
13
00:00:52,270 --> 00:00:58,630
So since this function is going to be applied on single images and that is going to return for each
14
00:00:58,630 --> 00:01:04,880
of these images the same image with the rectangle that is detecting the faces and the eyes.
15
00:01:05,110 --> 00:01:10,750
Well it's going to take as inputs an image it's going to be the different images of the video coming
16
00:01:10,750 --> 00:01:11,680
from the webcam.
17
00:01:11,830 --> 00:01:15,720
But this function works on single images one by one.
18
00:01:15,850 --> 00:01:20,540
So that's why we can put the video directly we have to play it on images.
19
00:01:20,860 --> 00:01:26,730
But as you saw in the intuition lectures a cascade works on black and white image.
20
00:01:26,740 --> 00:01:32,440
So what we need to do is first get the black and white version of the image but also the original image
21
00:01:32,440 --> 00:01:36,940
because at the end this function will return the original image with the rectangles.
22
00:01:36,940 --> 00:01:39,790
It will now return the image in black and white.
23
00:01:39,790 --> 00:01:46,000
So that's why right now what we need to take as arguments is grey which is the black and white image
24
00:01:46,000 --> 00:01:52,360
that is image and black and white and also the original frame the original image which we will call
25
00:01:52,360 --> 00:01:58,960
frame and that's are two arguments perfekt we need to add some color now and then enter to go inside
26
00:01:58,960 --> 00:02:01,800
the function and define what we want it to do.
27
00:02:02,230 --> 00:02:02,530
All right.
28
00:02:02,530 --> 00:02:07,810
So as I said at the beginning of this tutorial The first thing we need to do is get the coordinates
29
00:02:07,960 --> 00:02:10,400
of the rectangle that will detect the face.
30
00:02:10,540 --> 00:02:17,650
And so we're going to get some tables of four elements x y w h x and y are the coordinates of the upper
31
00:02:17,650 --> 00:02:18,460
left corner.
32
00:02:18,490 --> 00:02:23,060
W will be the width of the rectangle and age will be the height of the rectangle.
33
00:02:23,260 --> 00:02:28,510
But since we're going to get several tables of these four elements Well we're going to put all these
34
00:02:28,510 --> 00:02:35,680
troubles in a variable that we're going to call faces and we are going to get these supples things to
35
00:02:35,680 --> 00:02:41,710
a method of the Cascade classifier class which is to detect multi-skilled method.
36
00:02:41,710 --> 00:02:42,430
All right.
37
00:02:42,430 --> 00:02:48,010
And in order to get this method we need to take the object of the Cascade classifier class and since
38
00:02:48,010 --> 00:02:53,060
right now we're working with the faces we're trying to get the coordinates of the rectangles detecting
39
00:02:53,060 --> 00:02:53,950
the faces.
40
00:02:54,190 --> 00:02:58,730
Well we're going to get this method from Phase cascade.
41
00:02:58,750 --> 00:03:06,430
That's our object for the face to face cascade and then dart and then that's where we can use to detect
42
00:03:07,120 --> 00:03:15,110
multiday scale method that will get us to the coordinates of the upper left corner and the width and
43
00:03:15,110 --> 00:03:18,560
the height of the rectangles detecting the faces.
44
00:03:18,580 --> 00:03:21,920
So now does detect Martell's kill method takes several arguments.
45
00:03:22,090 --> 00:03:25,080
The first one is of course the emergent like in white.
46
00:03:25,090 --> 00:03:29,620
Because as we said Cascades work on black and white images.
47
00:03:29,740 --> 00:03:35,350
So Gray that's the first argument and then it's going to take two other arguments which are going to
48
00:03:35,350 --> 00:03:43,450
be first the scale factor which tells by how much the size of the image is going to be reduced or equivalently
49
00:03:43,870 --> 00:03:48,290
by how much the size of the filter is that it's the Colonel's will be increased.
50
00:03:48,290 --> 00:03:54,670
So that's the same and we are going to take a scaling factor of 1.3 which means that the size of the
51
00:03:54,670 --> 00:04:01,700
image will be reduced 1.3 times and then a last argument which is the minimum number of neighbors.
52
00:04:01,900 --> 00:04:07,480
So if you remember the intuition lectures you saw that in order for a zone of pixels to be accepted
53
00:04:07,650 --> 00:04:13,310
we all need to have at least a certain number of neighbor zones that are also accepted and that certain
54
00:04:13,370 --> 00:04:17,110
number is exactly the minimum number that we're going to input right now.
55
00:04:17,110 --> 00:04:18,420
And that is going to be five.
56
00:04:18,640 --> 00:04:26,110
So that means that in order for a zone of pixels to be accepted one at least five neighbor zones must
57
00:04:26,110 --> 00:04:27,360
also be accepted.
58
00:04:27,580 --> 00:04:29,310
So that's what this is about.
59
00:04:29,320 --> 00:04:33,850
And so now I suspect some of you will ask why 1.3 and five.
60
00:04:34,000 --> 00:04:39,340
Well the reason is that we get some really good results with these numbers and that's why we're getting
61
00:04:39,340 --> 00:04:39,820
them.
62
00:04:39,820 --> 00:04:45,970
But of course that's due to experience we experience them and it turns out that 1.3 in five is a good
63
00:04:45,970 --> 00:04:49,540
combo to detect some faces with the web cam.
64
00:04:49,540 --> 00:04:51,170
All right so there we go.
65
00:04:51,190 --> 00:05:00,450
We have our faces so as a reminder faces our troubles of four elements x and y which are ordinated the
66
00:05:00,520 --> 00:05:07,530
upper left corner of the rectangle that will detect the face and W and age which are respectively the
67
00:05:07,530 --> 00:05:10,020
width and height of these rectangles.
68
00:05:10,350 --> 00:05:10,800
Perfect.
69
00:05:10,800 --> 00:05:15,240
So now we're going to start a full loop and actually we're going to iterate through the different faces
70
00:05:15,240 --> 00:05:22,290
because we have several tables and for each of these faces we will draw the rectangle and inside these
71
00:05:22,290 --> 00:05:26,300
rectangles will detect some eyes but we'll do that in the next tutorial.
72
00:05:26,310 --> 00:05:28,200
So until then enjoy computer vision.
8216
Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.