9. Modelling - Picking the Model

Step five: modeling, part two. Choosing a model.

As mentioned in the last lesson, there were three parts to modeling: choosing a model, tuning a model and comparing models. Once you've got your data split into training, validation and test sets, you can start to go through each of these steps.

In this lesson we're going to cover some points on choosing a model, which is where you choose a model and train it on your training data. Rather than creating your own algorithms from scratch, there are many pre-built machine learning models which you can take advantage of when you first begin. Your main goal will be knowing what kind of machine learning algorithm to use with what kind of problem. This is because some algorithms work better than others on different types of data. We'll have a look at this specifically when we get hands-on with our projects.

But for now, a tidbit to remember: if you're working with structured data, decision-tree-based models such as random forests, and gradient boosting algorithms like CatBoost and XGBoost, tend to work best. And if you're working with unstructured data, deep learning, neural networks and transfer learning tend to work best.

Once you've chosen a model, your next step is to train it. The main goal here will be to line up the inputs and outputs.
For example, in our heart disease problem, we want our model to look at the feature variables (the inputs), find the patterns, and use them to predict the target variable. So remember from the previous lesson: these variables here are the feature variables, and we want to use these to predict the target variable. Another common naming convention is to use X, which is the data, to predict y, which is the labels. Different machine learning algorithms have different ways of doing this. We'll see how to do it for a handful of useful ones in future projects.

And remember, training a model takes place on the training data split. This is where your model learns the course material. We don't want to let our models cheat and see the final exam before they do their study.

Depending on how much data you have and how complex your model is, training may take a while. One of your biggest goals when training a model is to minimize the time between experiments. So sometimes this will mean using a small portion of your data first. For example, if your training dataset had 100,000 examples, you might start training a model with only the first 10,000 and see how it goes. You might also decide to use a less complicated model to begin with. Deep models such as neural networks generally take longer to train than other kinds of models. Now, this is something worth considering when it comes to training your own models.
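The X and y naming, and the idea of starting with the first 10,000 of 100,000 training examples, can be sketched in plain Python. This is only an illustration under stated assumptions: the rows are randomly generated stand-ins, the column names `age`, `chol` and `target` are hypothetical, and the helper functions are made up for this sketch rather than taken from the course.

```python
import random


def split_features_and_target(rows, target_name):
    """Separate each row into its feature values (X) and its label (y)."""
    X = [{k: v for k, v in row.items() if k != target_name} for row in rows]
    y = [row[target_name] for row in rows]
    return X, y


def first_n(X, y, n):
    """Keep only the first n examples so an early experiment runs faster."""
    return X[:n], y[:n]


# Synthetic stand-in for a training split (hypothetical columns, not real patient data).
rng = random.Random(42)
train_rows = [
    {
        "age": rng.randint(29, 77),
        "chol": rng.randint(120, 400),
        "target": rng.randint(0, 1),
    }
    for _ in range(100_000)
]

# X holds the inputs (feature variables), y holds the outputs (target variable).
X_train, y_train = split_features_and_target(train_rows, "target")

# Start with a small portion of the training data for the first experiment.
X_small, y_small = first_n(X_train, y_train, 10_000)

print(len(X_train), len(X_small))  # 100000 10000
```

Only the training split is touched here; the validation and test sets would stay out of reach until it is time to evaluate.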
For example, if an experiment takes you three hours, or even up to a couple of days, for a small percentage boost in the performance of your model, you might consider: is this experiment actually worth it? Because machine learning is highly iterative, we want to minimize this experimentation time so that we can go from step 1 to step 2 to step 3. But again, if this looks confusing, we'll see it in practice in the hands-on projects.

And things to remember. Some models work better than others on different problems. Don't be afraid to try things, and this is really important. Machine learning is, as we said before, a highly iterative process. So some things that you try out may not work the first time, and that means you just set that thing aside going forward and try something else.

Start small and build up; add complexity as you need. What this means is that, for example, if you had one hundred thousand examples, we're going to start on ten thousand, and we're going to use a simple model to begin with rather than going to the biggest, latest and greatest model, because what we're after is practical results, not something that's the best on paper. We want to see something that can actually be used in the real world.

And part of your experiments will involve tuning a model which shows good initial results to get better results.
Like tuning a car: if your car does well on one track, it might not do well on another track. So you tune it up. Let's have a look at how to do that, but instead of for cars, we'll do it for machine learning models.
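The "start small and build up" advice from this lesson, together with keeping the time between experiments short, can be sketched as follows. Everything here is an assumption made for illustration: the data is synthetic, the two models (a majority-class baseline and a one-split decision stump) are hand-written toys rather than the pre-built models the course uses later, and `run_experiment` is a hypothetical helper.

```python
import random
import time


def make_data(n, rng):
    """Synthetic data: the label is 1 when x > 0.5, with about 10% of labels flipped."""
    X, y = [], []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.1:
            label = 1 - label
        X.append(x)
        y.append(label)
    return X, y


def fit_majority(X, y):
    """Simplest baseline: always predict the most common training label."""
    majority = int(sum(y) * 2 >= len(y))
    return lambda x: majority


def fit_stump(X, y):
    """One-split decision stump: keep the threshold with the best training accuracy."""
    best_t, best_correct = 0.0, -1
    for i in range(1, 20):
        t = i / 20
        correct = sum(int(x > t) == label for x, label in zip(X, y))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return lambda x: int(x > best_t)


def accuracy(model, X, y):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == label for x, label in zip(X, y)) / len(y)


def run_experiment(name, fit_fn, X_train, y_train, X_val, y_val):
    """Fit on training data only, score on validation data, and time the fit."""
    start = time.perf_counter()
    model = fit_fn(X_train, y_train)
    seconds = time.perf_counter() - start
    return {
        "name": name,
        "val_accuracy": accuracy(model, X_val, y_val),
        "fit_seconds": seconds,
    }


rng = random.Random(0)
X_train, y_train = make_data(100_000, rng)
X_val, y_val = make_data(2_000, rng)

# Start small: the first 10,000 training examples and the simplest models first.
X_small, y_small = X_train[:10_000], y_train[:10_000]
results = [
    run_experiment("majority baseline", fit_majority, X_small, y_small, X_val, y_val),
    run_experiment("decision stump", fit_stump, X_small, y_small, X_val, y_val),
]
for r in results:
    print(f"{r['name']}: val accuracy {r['val_accuracy']:.3f}, fit time {r['fit_seconds']:.4f}s")
```

Recording the fit time next to the validation score makes it easier to judge whether a slower, more complex experiment is worth running before scaling up to the full training set.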
