0
00:00:01,040 --> 00:00:01,889
[Autogenerated] What are the steps
1
00:00:01,889 --> 00:00:03,710
involved in building and training a
2
00:00:03,710 --> 00:00:05,410
machine learning model? Let's take a
3
00:00:05,410 --> 00:00:07,910
quick look in this clip at the machine
4
00:00:07,910 --> 00:00:11,130
learning workflow. Here is what the basic
5
00:00:11,130 --> 00:00:13,490
machine learning workflow looks like.
6
00:00:13,490 --> 00:00:15,630
Don't be intimidated. There are lots of
7
00:00:15,630 --> 00:00:17,460
processes involved here. We'll walk
8
00:00:17,460 --> 00:00:19,570
through each of these in detail. Once
9
00:00:19,570 --> 00:00:21,519
you've understood what you want to build,
10
00:00:21,519 --> 00:00:23,410
you first need to look at the raw data
11
00:00:23,410 --> 00:00:25,429
that you have available. What data do you
12
00:00:25,429 --> 00:00:27,420
have to work with? Is it sufficient to train
13
00:00:27,420 --> 00:00:30,239
your machine learning model? If not, you
14
00:00:30,239 --> 00:00:32,020
won't be able to proceed further. You
15
00:00:32,020 --> 00:00:34,250
might need to go back and look for new
16
00:00:34,250 --> 00:00:37,000
data sources. Once you know that you have
17
00:00:37,000 --> 00:00:39,649
the data that you need, you can move on
18
00:00:39,649 --> 00:00:42,689
and load and store the data. Get it ready
19
00:00:42,689 --> 00:00:44,820
for machine learning. Make sure that it's
20
00:00:44,820 --> 00:00:47,140
located in a database or a data warehouse
21
00:00:47,140 --> 00:00:49,420
where you can access the data. You need to
22
00:00:49,420 --> 00:00:51,929
set up pipelines to extract the data from
23
00:00:51,929 --> 00:00:53,820
where you have it stored. And once you
24
00:00:53,820 --> 00:00:56,700
have the data with you, you need to clean
25
00:00:56,700 --> 00:00:58,950
and prepare the data. This is the data
26
00:00:58,950 --> 00:01:01,740
preprocessing stage. Data in the real world
27
00:01:01,740 --> 00:01:03,899
cannot be used directly to train your
28
00:01:03,899 --> 00:01:05,609
machine learning models. It needs to be
29
00:01:05,609 --> 00:01:08,129
preprocessed. You need to get rid of missing
30
00:01:08,129 --> 00:01:10,530
values. Take care of outliers. If you
31
00:01:10,530 --> 00:01:13,140
have non-numeric representations of data,
32
00:01:13,140 --> 00:01:15,000
they have to be encoded into numeric
33
00:01:15,000 --> 00:01:17,500
form. These three processing steps that we
34
00:01:17,500 --> 00:01:19,829
just discussed, when you go from raw
35
00:01:19,829 --> 00:01:22,480
data to clean data that you can feed into
36
00:01:22,480 --> 00:01:25,049
a machine learning model, can together be
37
00:01:25,049 --> 00:01:27,010
thought of as the process of selecting
38
00:01:27,010 --> 00:01:29,579
and extracting features that exist in your
39
00:01:29,579 --> 00:01:31,969
data. When you are a student of machine
40
00:01:31,969 --> 00:01:34,480
learning, your attention is mostly focused
41
00:01:34,480 --> 00:01:36,109
on understanding the machine learning
42
00:01:36,109 --> 00:01:39,239
algorithm and how it works. But these
43
00:01:39,239 --> 00:01:41,489
three steps are critical and
44
00:01:41,489 --> 00:01:43,900
time-consuming steps. In the real world, they
45
00:01:43,900 --> 00:01:46,650
take up an inordinate amount of time. It's
46
00:01:46,650 --> 00:01:48,659
quite possible that you spend more time on
47
00:01:48,659 --> 00:01:50,980
these steps than on building a machine
48
00:01:50,980 --> 00:01:53,109
learning model. Once you have your data
49
00:01:53,109 --> 00:01:55,819
ready, the next step is to choose the
50
00:01:55,819 --> 00:01:57,799
right algorithm for your use case. Do you
51
00:01:57,799 --> 00:02:00,180
want a decision tree? Do you want to use support
52
00:02:00,180 --> 00:02:02,859
vector machines? Do you want to use naive
53
00:02:02,859 --> 00:02:05,930
Bayes or k-nearest neighbors? The choice
54
00:02:05,930 --> 00:02:08,360
of algorithm is up to you and dependent
55
00:02:08,360 --> 00:02:10,729
on your use case. Once you've chosen an
56
00:02:10,729 --> 00:02:13,900
algorithm, you'll then train your model
57
00:02:13,900 --> 00:02:15,900
on the data that you have. This is what is
58
00:02:15,900 --> 00:02:18,389
referred to as fitting a model. The
59
00:02:18,389 --> 00:02:21,080
training process tries to find the best
60
00:02:21,080 --> 00:02:23,430
possible model parameters so that you can
61
00:02:23,430 --> 00:02:25,689
use your model for prediction. Once you
62
00:02:25,689 --> 00:02:28,409
have a model, you need to validate and
63
00:02:28,409 --> 00:02:30,259
evaluate the model to see whether it's a
64
00:02:30,259 --> 00:02:32,849
good one. There are many validation
65
00:02:32,849 --> 00:02:34,610
techniques available. You'll choose a
66
00:02:34,610 --> 00:02:37,280
validation method and apply the validation
67
00:02:37,280 --> 00:02:39,900
method to evaluate your model. You'll
68
00:02:39,900 --> 00:02:42,560
examine the fit of your model and then
69
00:02:42,560 --> 00:02:45,300
update the model if needed. Examining the
70
00:02:45,300 --> 00:02:47,939
fit of your model is also referred to as
71
00:02:47,939 --> 00:02:49,680
scoring the model. You have different
72
00:02:49,680 --> 00:02:51,479
metrics that you can use for different
73
00:02:51,479 --> 00:02:53,860
kinds of models. You have the R-squared
74
00:02:53,860 --> 00:02:56,199
for regression models, accuracy, precision
75
00:02:56,199 --> 00:02:59,000
and recall for classifiers. Once you've
76
00:02:59,000 --> 00:03:01,330
evaluated and scored your model, you'll
77
00:03:01,330 --> 00:03:03,460
check to see whether you're satisfied with
78
00:03:03,460 --> 00:03:06,139
the result. If you're not satisfied, you
79
00:03:06,139 --> 00:03:08,599
might want to update the model. Maybe you'll
80
00:03:08,599 --> 00:03:10,439
choose a different algorithm. Maybe you'll
81
00:03:10,439 --> 00:03:12,900
use more data for training. Maybe you'll
82
00:03:12,900 --> 00:03:15,469
train for longer. This is an
83
00:03:15,469 --> 00:03:17,379
iterative process that continues till
84
00:03:17,379 --> 00:03:19,389
you're satisfied with the model that you
85
00:03:19,389 --> 00:03:21,430
have. Update the model. Choose a
86
00:03:21,430 --> 00:03:24,120
validation method, examine the model,
87
00:03:24,120 --> 00:03:26,409
evaluate it, see whether you're satisfied,
88
00:03:26,409 --> 00:03:29,360
and repeat till you're done. Once you're
89
00:03:29,360 --> 00:03:31,889
satisfied, your model is ready to be
90
00:03:31,889 --> 00:03:34,349
deployed in production and to be used for
91
00:03:34,349 --> 00:03:36,210
predictions. You use your model for
92
00:03:36,210 --> 00:03:38,800
predictions, and these prediction instances
93
00:03:38,800 --> 00:03:40,860
are new data points that you can then
94
00:03:40,860 --> 00:03:43,590
store in your database to improve your
95
00:03:43,590 --> 00:03:45,780
model. In the real world, prediction
96
00:03:45,780 --> 00:03:48,520
instances often become part of the training
97
00:03:48,520 --> 00:03:51,060
data for your model. This is the basic
98
00:03:51,060 --> 00:03:53,000
machine learning workflow, and in this
99
00:03:53,000 --> 00:03:55,750
course we'll focus our attention on feature
100
00:03:55,750 --> 00:03:59,000
selection and extraction, the initial few steps.