English subtitles for 01_multiple-features.en

Welcome back. This week, we'll learn to make linear regression much faster and much more powerful, and by the end of the week, you'll be two thirds of the way to finishing this first course. Let's start by looking at a version of linear regression that looks at not just one feature, but a lot of different features. Let's take a look.

In the original version of linear regression, you had a single feature x, the size of the house, and you were able to predict y, the price of the house. The model was f_{w,b}(x) = wx + b. But now, what if you not only had the size of the house as a feature with which to try to predict the price, but also knew the number of bedrooms, the number of floors, and the age of the home in years? It seems like this would give you a lot more information with which to predict the price.

To introduce a little bit of new notation, we're going to use the variables x_1, x_2, x_3 and x_4 to denote the four features. For simplicity, let's introduce a little bit more notation. We'll write x subscript j, or sometimes just x sub j for short, to represent the list of features. Here, j will go from one to four, because we have four features. I'm going to use lowercase n to denote the total number of features, so in this example, n is equal to 4. As before, we'll use x superscript (i) to denote the ith training example. Here x^(i) is actually going to be a list of four numbers, or sometimes we'll call this a vector, that includes all the features of the ith training example.
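To make this indexing concrete, here is a minimal NumPy sketch. Only the second row, x^(2) = (1416, 3, 2, 40), comes from this video; the other rows are hypothetical values added for illustration:

```python
import numpy as np

# Hypothetical training set: each row is one training example x^(i),
# each column is one feature x_j (size in sq ft, bedrooms, floors, age in years).
X_train = np.array([
    [2104, 5, 1, 45],   # made-up first example
    [1416, 3, 2, 40],   # x^(2), the example used in this video
    [852,  2, 1, 35],   # made-up third example
])

n = X_train.shape[1]    # total number of features, n = 4
x_2 = X_train[1]        # the 2nd training example (row index 1 with 0-based indexing)
print(n, x_2)           # -> 4 [1416 3 2 40]
```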
As a concrete example, x superscript (2) will be a vector of the features for the second training example, so it will be equal to 1416, 3, 2 and 40. Technically, I'm writing these numbers in a row, so sometimes this is called a row vector rather than a column vector. But if you don't know what the difference is, don't worry about it; it's not that important for our purposes.

To refer to a specific feature in the ith training example, I will write x superscript (i), subscript j. So for example, x superscript (2) subscript 3 will be the value of the third feature, that is, the number of floors in the second training example, and so that's going to be equal to 2. Sometimes, in order to emphasize that this x^(2) is not a number but is actually a list of numbers, that is, a vector, we'll draw an arrow on top of it just to visually show that it is a vector, but you don't have to draw this arrow in your own notation. You can think of the arrow as an optional signifier, sometimes used just to emphasize that this is a vector and not a number.

Now that we have multiple features, let's take a look at what a model would look like. Previously, this is how we defined the model, where x was a single feature, so a single number. But now with multiple features, we're going to define it differently. Instead, the model will be f_{w,b}(x) = w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + b. Concretely, for housing price prediction, one possible model may be that we estimate the price of the house as 0.1 times x_1, the size of the house, plus 4 times x_2, the number of bedrooms, plus 10 times x_3, the number of floors, minus 2 times x_4, the age of the house in years, plus 80. Let's think a bit about how you might interpret these parameters.
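As a sketch of that four-feature model in code, using the illustrative parameters above (this is not course-provided code, just one way to write the sum out):

```python
# Minimal sketch of the four-feature model with the illustrative
# parameters w = (0.1, 4, 10, -2) and b = 80 from the example above.
def predict(x, w, b):
    # f_{w,b}(x) = w_1*x_1 + w_2*x_2 + w_3*x_3 + w_4*x_4 + b
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b

w = [0.1, 4, 10, -2]     # weights for size, bedrooms, floors, age
b = 80                   # base price, in thousands of dollars
x = [1416, 3, 2, 40]     # features of x^(2)
print(predict(x, w, b))  # 0.1*1416 + 4*3 + 10*2 + (-2)*40 + 80 = 173.6
```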
If the model is trying to predict the price of the house in thousands of dollars, you can think of this b = 80 as saying that the base price of a house starts off at maybe $80,000, assuming it has no size, no bedrooms, no floors and no age. You can think of this 0.1 as saying that maybe for every additional square foot, the price will increase by 0.1 times $1,000, or by $100, because we're saying that for each square foot, the price increases by 0.1 times $1,000, which is $100. Maybe for each additional bedroom, the price increases by $4,000; for each additional floor, the price may increase by $10,000; and for each additional year of the house's age, the price may decrease by $2,000, because the parameter is negative 2.

In general, if you have n features, then the model will look like this: f_{w,b}(x) = w_1 x_1 + w_2 x_2 + ... + w_n x_n + b. What we're going to do next is introduce a little bit of notation to rewrite this expression in a simpler but equivalent way. Let's define w as a list of numbers that lists the parameters w_1, w_2, w_3, all the way through w_n. In mathematics, this is called a vector, and sometimes, to designate that this is a vector, which just means a list of numbers, I'm going to draw a little arrow on top. You don't always have to draw this arrow, and you can do so or not in your own notation, so you can think of this little arrow as just an optional signifier to remind us that this is a vector. If you've taken a linear algebra class before, you might recognize this as a row vector as opposed to a column vector. But if you don't know what those terms mean, you don't need to worry about it. Next, same as before, b is a single number and not a vector, and so this vector w together with this number b are the parameters of the model.
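Here is a small sketch of that interpretation, assuming the same illustrative parameters and prices in thousands of dollars; each weight is exactly the change in predicted price per unit of its feature:

```python
w = [0.1, 4, 10, -2]   # per sq ft, per bedroom, per floor, per year of age
b = 80                 # base price (thousands of dollars): $80,000

# Changing feature j by one unit changes the prediction by exactly w[j] thousand:
for name, w_j in zip(["square foot", "bedroom", "floor", "year of age"], w):
    print(f"each additional {name}: {w_j * 1000:+,.0f} dollars")
# each additional square foot: +100 dollars
# each additional bedroom: +4,000 dollars
# each additional floor: +10,000 dollars
# each additional year of age: -2,000 dollars
```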
Let me also write x as a list, or a vector, again a row vector, that lists all of the features x_1, x_2, x_3, up to x_n. This is again a vector, so I'm going to add a little arrow up on top to signify it. In the notation up on top, we can also add little arrows to signify that that w and that x are actually these lists of numbers, that they're actually these vectors.

With this notation, the model can now be rewritten more succinctly as f(x) = w · x + b, where the dot refers to the dot product, from linear algebra, of the vector w with the vector x. What is this dot product thing? Well, the dot product of two vectors, of two lists of numbers w and x, is computed by taking the corresponding pairs of numbers, w_1 and x_1, and multiplying them, w_2 and x_2, multiplying them, w_3 and x_3, multiplying them, all the way up to w_n and x_n, multiplying them, and then summing up all of these products. Writing that out, this means that the dot product is equal to w_1 x_1 + w_2 x_2 + w_3 x_3 + ... + w_n x_n. Then finally we add back in the b on top. You'll notice that this gives us exactly the same expression as we had before. The dot product notation lets you write the model in a more compact form with fewer characters.

The name for this type of linear regression model with multiple input features is multiple linear regression. This is in contrast to univariate regression, which has just one feature. By the way, you might think this algorithm is called multivariate regression, but that term actually refers to something else that we won't be using here, so I'm going to refer to this model as multiple linear regression. That's it for linear regression with multiple features, which is also called multiple linear regression.
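A minimal NumPy sketch of that equivalence, writing the same prediction both as the explicit sum and as the dot product w · x plus b:

```python
import numpy as np

w = np.array([0.1, 4.0, 10.0, -2.0])
x = np.array([1416.0, 3.0, 2.0, 40.0])
b = 80.0

# The dot product written out as the sum described above ...
f_sum = sum(w[j] * x[j] for j in range(len(w))) + b

# ... and the same model written compactly as f(x) = w . x + b.
f_dot = np.dot(w, x) + b

print(f_sum, f_dot)  # both equal 173.6 (up to floating-point rounding)
```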
In order to implement this, there's a really neat trick called vectorization, which will make it much simpler to implement this and many other learning algorithms. Let's go on to the next video to take a look at what vectorization is.
