1
00:00:00,510 --> 00:00:05,880
Hello and welcome to this tutorial. Today we're going to do the face recognition with the webcam
2
00:00:06,210 --> 00:00:13,350
by applying the detect function that we made in the previous tutorials on each of the images coming from
3
00:00:13,350 --> 00:00:14,520
the webcam.
4
00:00:14,520 --> 00:00:14,850
All right.
5
00:00:14,850 --> 00:00:16,050
So let's do this.
6
00:00:16,050 --> 00:00:21,680
And if that doesn't take a while we might actually see the results at the end of this tutorial.
7
00:00:21,720 --> 00:00:27,510
So the first thing we need is the last frame coming from the webcam because that's on this last frame
8
00:00:27,510 --> 00:00:33,930
that we will apply the detect function. And so, to get this last frame, we're going to create an object
9
00:00:34,080 --> 00:00:36,670
of another class, which is VideoCapture.
10
00:00:36,840 --> 00:00:43,260
So that's a class working on the webcam, and basically this object will contain several entities, several
11
00:00:43,260 --> 00:00:46,260
pieces of information, and one of them will be the frame.
12
00:00:46,260 --> 00:00:49,650
The last image, the last frame coming from the webcam.
13
00:00:49,830 --> 00:00:57,420
So this object is going to be called video_capture, because this is actually an object from
14
00:00:58,200 --> 00:01:07,190
the VideoCapture class, which is a class of the cv2 module.
15
00:01:07,200 --> 00:01:14,190
So let's not forget cv2 here, then the VideoCapture class. And that class takes one argument, actually, which is
16
00:01:14,340 --> 00:01:16,220
0 or 1: 0
17
00:01:16,260 --> 00:01:22,140
if it's the webcam of the computer, and 1 if it's a webcam coming from an external device, that is, if
18
00:01:22,140 --> 00:01:26,040
you have an external webcam that you plugged into your computer.
19
00:01:26,340 --> 00:01:32,460
I'm going to use the webcam of my computer, the internal webcam, so that's going to be 0.
20
00:01:32,520 --> 00:01:36,330
But if you have an external webcam, you will have to put 1 here.
21
00:01:36,330 --> 00:01:37,260
All right.
22
00:01:37,260 --> 00:01:44,400
So that's basically all we need to do to get this video capture object that contains the last frame
23
00:01:44,580 --> 00:01:45,930
coming from the webcam.
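The step just described can be sketched as below. This is a minimal sketch rather than the course's exact file: the helper name `open_webcam` and the `isOpened()` check are additions for illustration, and the import sits inside the function only so the snippet loads on a machine without OpenCV. The index is the camera's position in the operating system's device list, so 0 is usually, but not necessarily, the built-in webcam.

```python
def open_webcam(camera_index=0):
    """Return a cv2.VideoCapture for the given camera index (hypothetical helper)."""
    import cv2  # imported lazily so this snippet loads even without OpenCV installed

    video_capture = cv2.VideoCapture(camera_index)
    # isOpened() reports whether the device could actually be opened.
    if not video_capture.isOpened():
        video_capture.release()
        raise RuntimeError(f"could not open camera {camera_index}")
    return video_capture
```

Calling `open_webcam(0)` would then give the `video_capture` object used in the rest of the tutorial, and `open_webcam(1)` would be the first thing to try for an external camera.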
24
00:01:46,170 --> 00:01:54,540
And now we are ready to start the loop to apply the detect function on all the images coming from the
25
00:01:54,540 --> 00:01:59,640
webcam that we get thanks to this video_capture object, and on all these images
26
00:01:59,640 --> 00:02:04,060
we will apply the detect function to detect the face and the eyes.
27
00:02:04,170 --> 00:02:09,270
So that's why we're going to start a loop now which is not going to be a for loop because it's going
28
00:02:09,270 --> 00:02:15,180
to be an infinite loop and we will break it by using the break trick.
29
00:02:15,180 --> 00:02:20,720
And in Python, you know, you can end the loop by just typing break, and actually break will be applied
30
00:02:20,910 --> 00:02:24,920
if we press q on the keyboard to turn off the webcam.
31
00:02:25,320 --> 00:02:27,590
So this while loop.
32
00:02:27,660 --> 00:02:36,140
The trick is to write while here and then True, which means we repeat infinitely until the break.
33
00:02:36,210 --> 00:02:36,720
All right.
34
00:02:36,840 --> 00:02:39,050
So what is the first step we need to do?
35
00:02:39,210 --> 00:02:43,560
Well, the first step is to get the last frame coming from the webcam, and to get it
36
00:02:43,620 --> 00:02:50,100
we're going to use a method from our video_capture object, which is the read method. And this read method
37
00:02:50,160 --> 00:02:55,530
returns two elements and the second element is the last frame coming from the webcam.
38
00:02:55,590 --> 00:03:02,700
And since we're only interested in that, we're going to use this trick here: underscore, comma, frame, because
39
00:03:02,700 --> 00:03:06,600
the read method returns two elements and we are only interested in this one.
40
00:03:06,600 --> 00:03:11,970
And the fact that we're putting an underscore here means that we won't get the first element returned
41
00:03:11,970 --> 00:03:13,190
by this read method.
42
00:03:14,420 --> 00:03:19,130
All right and so now we need to call this read method to get the last frame of the webcam and to do
43
00:03:19,130 --> 00:03:23,510
this, since this method is a method of the VideoCapture class,
44
00:03:23,510 --> 00:03:32,050
well, we need to take our video_capture object, then a dot, and then read with some parentheses.
45
00:03:32,060 --> 00:03:32,810
Perfect.
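As a small illustration of the two values returned by read, here is a hedged sketch. The helper name `grab_frame` is hypothetical, and unlike the `_, frame` form used in the tutorial it keeps the first element (a success flag) so that a failed read can be noticed.

```python
def grab_frame(video_capture):
    """Return the latest frame from a cv2.VideoCapture-like object, or None on failure."""
    # read() returns a pair: a success flag and the frame itself.
    # The tutorial discards the flag with `_, frame = video_capture.read()`;
    # this sketch keeps it so an unplugged or busy camera does not go unnoticed.
    ok, frame = video_capture.read()
    return frame if ok else None
```

Any object with a compatible `read()` method works here, which also makes the helper easy to try without a camera attached.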
46
00:03:32,990 --> 00:03:39,160
That gets us the last frame of the webcam, and now we are almost ready to apply the detect method.
47
00:03:39,170 --> 00:03:44,840
Remember, the detect method not only takes the frame as input but also gray, that is, the black and
48
00:03:44,840 --> 00:03:50,210
white version of the frame and therefore we need to do something here to get this black and white version
49
00:03:50,210 --> 00:03:55,100
of this frame that we just got thanks to the read method of our video capture object.
50
00:03:55,280 --> 00:04:03,140
And to do this we're going to use the cvtColor function. cvtColor is a function of our cv2 module
51
00:04:03,320 --> 00:04:06,600
that allows us to do some transformations on some frames.
52
00:04:06,650 --> 00:04:12,650
So there will be some transformations on our original colored frame and this transformation will be
53
00:04:12,650 --> 00:04:19,190
of course to convert this color image into black and white so that we can get the second argument here
54
00:04:19,610 --> 00:04:22,850
gray, which will be the black and white version of this frame.
55
00:04:23,180 --> 00:04:28,390
So let's call it, actually, gray: the black and white version of the frame.
56
00:04:28,530 --> 00:04:36,530
And now we're going to take our cv2 module, because we want to take the cvtColor function, which takes
57
00:04:36,920 --> 00:04:42,530
two arguments the first one is of course frame because that's the image we want to convert into black
58
00:04:42,530 --> 00:04:43,260
and white.
59
00:04:43,550 --> 00:04:54,060
And the second one will be cv2, dot, then all in capital letters, COLOR_BGR2GRAY.
60
00:04:54,560 --> 00:04:56,170
So what is this?
61
00:04:56,180 --> 00:05:01,320
This argument, cv2.COLOR_BGR2GRAY, serves
62
00:05:01,340 --> 00:05:08,060
in fact to do an average on blue, green and red, and that's to get a scale of gray, because, you know,
63
00:05:08,120 --> 00:05:11,310
the cascades are applied on the gray images.
64
00:05:11,480 --> 00:05:16,670
But in order to get the right contrast of black and white well we need to do some average on the blue
65
00:05:16,760 --> 00:05:19,090
red and green to get this contrast.
66
00:05:19,160 --> 00:05:24,590
You know if we have some parts of the images that are with dark colors and some dark red blue green
67
00:05:24,830 --> 00:05:28,000
well, we will get some dark gray in the gray image.
68
00:05:28,010 --> 00:05:32,350
So that's just a trick to obtain the right shades of gray.
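To make the averaging concrete: the gray level produced by the BGR-to-gray conversion is a weighted average of the three channels rather than a plain mean, with green counting most and blue least (Y = 0.299 R + 0.587 G + 0.114 B). The sketch below applies those weights to a single pixel in plain Python. The function name `bgr_pixel_to_gray` is hypothetical, and OpenCV's own integer arithmetic may differ from this floating-point version by one gray level.

```python
def bgr_pixel_to_gray(blue, green, red):
    """Convert one 8-bit BGR pixel to a gray level using the standard luma weights."""
    # Weighted average: green contributes most, blue least.
    gray = 0.114 * blue + 0.587 * green + 0.299 * red
    return int(round(gray))
```

A dark color such as `(40, 40, 40)` stays dark in gray, while pure green comes out much brighter than pure blue, which is the contrast behavior described above.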
69
00:05:32,360 --> 00:05:40,370
All right, then I think you can guess what we have to do. We have frame, the original color image, and gray,
70
00:05:40,610 --> 00:05:43,060
which is the black and white version of the image.
71
00:05:43,130 --> 00:05:46,700
So we are ready to apply the detect function.
72
00:05:46,700 --> 00:05:48,830
We have the two arguments, gray and frame.
73
00:05:48,830 --> 00:05:55,200
So basically we're ready to get the original image with the detector rectangles detecting the face and
74
00:05:55,200 --> 00:05:56,100
the eyes.
75
00:05:56,150 --> 00:06:03,200
So we're going to introduce a new variable here that we're going to call canvas, and that will be the
76
00:06:03,200 --> 00:06:11,840
result of the detect function applied to our frame which is the last frame captured by the webcam and
77
00:06:12,140 --> 00:06:14,450
gray the black and white version of this frame.
78
00:06:14,750 --> 00:06:22,720
So what we simply need to do here is to call our detect function and then pass gray and frame: gray
79
00:06:23,450 --> 00:06:32,120
and frame. And that will get us the last streamed image of the webcam with the rectangle detectors. Okay.
80
00:06:32,290 --> 00:06:35,170
Then we need to use some more tricks.
81
00:06:35,170 --> 00:06:41,410
The first trick we need to do is to display in an animated way all the successive outputs, that is, all
82
00:06:41,410 --> 00:06:44,760
the successive processed images in a window.
83
00:06:44,830 --> 00:06:51,450
And to do this we're going to use, from our cv2 module, the imshow function.
84
00:06:51,710 --> 00:06:54,960
So that is just a function that will do that for you.
85
00:06:55,070 --> 00:07:00,080
Display all these processed images in an animated way in a window.
86
00:07:00,170 --> 00:07:02,550
And so this imshow function takes two arguments.
87
00:07:02,570 --> 00:07:10,910
The first one is the window name, 'Video', and the second one is the canvas, canvas, which is the output of the detect function.
88
00:07:10,910 --> 00:07:18,900
That is the original image coming from the webcam but with the detector rectangles. Right, then we need
89
00:07:18,900 --> 00:07:26,070
to use another trick, which will be used to stop the webcam and the face detection process if we press
90
00:07:26,070 --> 00:07:27,490
on q on the keyboard.
91
00:07:27,780 --> 00:07:40,910
So the trick to do that is to write an if condition saying that if cv2.waitKey, and you put 1, &
92
00:07:41,150 --> 00:07:47,320
0xFF, equals equals, ord, and in some quotes,
93
00:07:47,420 --> 00:07:49,930
q. That's q on the keyboard.
94
00:07:50,300 --> 00:07:53,380
So if this, then what do we want to do?
95
00:07:53,390 --> 00:07:57,370
Remember what I told you about the while loop, the infinite loop.
96
00:07:57,560 --> 00:08:01,680
Well, we want to break the process, that is, stop everything.
97
00:08:01,790 --> 00:08:07,250
So this while loop will be over, and that will allow us to end the process.
98
00:08:07,250 --> 00:08:12,380
So basically it's not very important to understand the techniques and syntax of this.
99
00:08:12,380 --> 00:08:18,500
This just means that if we press q on the keyboard, well, this will break the process.
100
00:08:18,500 --> 00:08:20,610
No more facial recognition.
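For readers who do want to see what the key test does, here is a small sketch in plain Python. The helper name `is_quit_key` is hypothetical. It takes the integer that cv2.waitKey would return (-1 when no key was pressed within the delay), keeps only its lowest 8 bits with `& 0xFF`, and compares the result with the character code of q obtained from `ord`.

```python
def is_quit_key(key_code, quit_char="q"):
    """Return True if a waitKey-style key code corresponds to quit_char."""
    # Keep only the lowest 8 bits: on some platforms the returned value
    # carries extra high bits (for example modifier state) above the key code.
    return (key_code & 0xFF) == ord(quit_char)
```

Inside the loop the equivalent one-liner is the condition described above, followed by `break`.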
101
00:08:21,110 --> 00:08:21,800
OK.
102
00:08:21,860 --> 00:08:22,780
Perfect.
103
00:08:22,790 --> 00:08:31,220
So actually the while loop is over so we can just go back to the main code and we have only two things
104
00:08:31,220 --> 00:08:32,390
to do left.
105
00:08:32,400 --> 00:08:40,460
The first thing is to turn off the webcam and to do this we take our video capture object and we're
106
00:08:40,460 --> 00:08:49,310
going to use this release method to, exactly, turn off the webcam. And finally we need to destroy the
107
00:08:49,310 --> 00:08:56,300
window inside which all the images were displayed, and to do this we take our cv2 module and then a dot,
108
00:08:56,690 --> 00:09:06,350
and then we're going to use the destroyAllWindows function. And there we go, the code is over.
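Put together, the steps walked through above might look like the sketch below. It is a reconstruction under assumptions, not the course's exact file. The wrapper name `run_webcam_detection`, its `camera_index` parameter, the check on the success flag, the `try`/`finally` and the returned frame count are all additions for illustration. `detect` stands for the function built in the previous tutorials, called as `detect(gray, frame)`. The import is placed inside the function only so the snippet loads on a machine without OpenCV.

```python
def run_webcam_detection(detect, camera_index=0):
    """Show webcam frames annotated by detect(gray, frame) until q is pressed.

    Returns the number of frames displayed (an addition made for this sketch).
    """
    import cv2  # imported lazily so this snippet loads even without OpenCV installed

    video_capture = cv2.VideoCapture(camera_index)  # 0 is usually the built-in webcam
    frames_shown = 0
    try:
        while True:  # infinite loop, ended only by break
            ok, frame = video_capture.read()  # (success flag, last frame)
            if not ok:  # no frame available, e.g. camera unplugged
                break
            # Black and white version of the frame, needed by the cascades.
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # Original image with the detector rectangles drawn on it.
            canvas = detect(gray, frame)
            cv2.imshow("Video", canvas)
            frames_shown += 1
            # Wait 1 ms for a key press and stop on q.
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        video_capture.release()  # turn off the webcam
        cv2.destroyAllWindows()  # close the display window
    return frames_shown
```

With OpenCV installed, calling `run_webcam_detection(detect)` from the folder that contains the cascade files would start the live view. A placeholder such as `lambda gray, frame: frame` can stand in for `detect` to check that the camera and window work before any detection is added.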
109
00:09:06,450 --> 00:09:07,740
Congratulations.
110
00:09:07,740 --> 00:09:11,580
Now we're ready to detect our face and our eyes.
111
00:09:11,580 --> 00:09:14,870
So let's do it right now because everything is ready.
112
00:09:15,120 --> 00:09:15,980
So here we go.
113
00:09:16,020 --> 00:09:17,460
I'm selecting all this.
114
00:09:17,460 --> 00:09:19,080
You have to select all this code.
115
00:09:19,170 --> 00:09:22,270
Make sure that you are in this folder to be able to load the cascades.
116
00:09:22,500 --> 00:09:26,020
And if that's the case just press enter.
117
00:09:26,040 --> 00:09:26,790
There we go.
118
00:09:26,820 --> 00:09:31,150
The code is well executed and hi guys.
119
00:09:31,150 --> 00:09:32,320
Welcome to my place.
120
00:09:32,320 --> 00:09:35,510
This is actually where I'm recording all these tutorials for you.
121
00:09:35,510 --> 00:09:37,660
So super happy to have you here.
122
00:09:38,080 --> 00:09:43,300
As you can see well we are detecting very well my face and my eyes.
123
00:09:43,300 --> 00:09:46,580
I think that's the case for you too if you execute your code.
124
00:09:46,690 --> 00:09:49,120
So of course sometimes we see some false positives.
125
00:09:49,120 --> 00:09:55,030
We can see some rectangle around my mouth detecting the eyes; OpenCV thinks that's my eye, but
126
00:09:56,110 --> 00:10:02,360
if I don't make too much expression or if I don't talk, yeah, there we go.
127
00:10:02,360 --> 00:10:08,310
We have this blue rectangle detecting my face and these two green rectangles detecting my eyes.
128
00:10:08,450 --> 00:10:09,190
So there we go.
129
00:10:09,200 --> 00:10:10,400
That works pretty well.
130
00:10:10,430 --> 00:10:18,650
If you are not 100 percent convinced of the power of this model Well remember that in the next modules
131
00:10:18,650 --> 00:10:22,570
we'll be using some much more powerful models. OpenCV
132
00:10:22,610 --> 00:10:25,810
definitely doesn't contain the most powerful models.
133
00:10:25,880 --> 00:10:31,750
We actually saw it for module 2. Module 2 will be about object detection, and we actually tried OpenCV
134
00:10:31,750 --> 00:10:37,540
to detect some very cute bouncing dog that we actually filmed ourselves.
135
00:10:37,700 --> 00:10:40,000
But it really turned out to work very badly.
136
00:10:40,160 --> 00:10:44,330
So instead we implemented a much more powerful model based on neural networks.
137
00:10:44,330 --> 00:10:46,050
So hold onto that.
138
00:10:46,070 --> 00:10:49,880
This is going to be interesting and mostly much more powerful.
139
00:10:49,880 --> 00:10:52,330
So I was very happy to show you this.
140
00:10:52,340 --> 00:10:54,170
This was just a warm up.
141
00:10:54,170 --> 00:10:59,270
Keep up the good energy for what's coming next because it's going to be much more challenging but also
142
00:10:59,270 --> 00:11:01,080
much more fascinating.
143
00:11:01,580 --> 00:11:04,100
So I'll see you soon for the next module object detection.
144
00:11:04,100 --> 00:11:06,200
But first I'll give you a little homework.
145
00:11:06,200 --> 00:11:11,860
This will be the homework of this module 1, face detection, and this will be about detecting some of
146
00:11:11,890 --> 00:11:13,510
the features in the face.
147
00:11:13,520 --> 00:11:14,590
So good luck with that.
148
00:11:14,630 --> 00:11:16,610
And until then enjoy computer vision.