005 Binary and One-Hot Encoding

Instructor: Hi, and welcome back. Let's begin this lesson by introducing binary encoding. We will start from the ordinal numbers we assigned earlier. Bread is represented by the number one, yogurt by the number two, and muffin is designated with three.

Binary encoding implies we should turn these numbers into binary. One in binary is zero one, so bread would be zero one. Two in binary is one zero. So, yogurt would be one zero. Three in binary is one one. So, muffins would be one one.

The next step of the process is to divide these into different columns, as if we were creating two new variables. For the first one, bread is zero, yogurt is one and muffins are one. For the second variable, bread is one, yogurt is zero, and muffins are one. We have differentiated between the three categories and have removed the order. However, there are still some implied correlations between them.

For instance, bread and yogurt seem exactly the opposite of each other.
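The binary encoding steps described so far can be sketched in a few lines of Python. This is only a minimal illustration, not the course's own code: the names `ordinal`, `binary_encode`, and `binary_columns` are assumptions introduced here, and the ordinal codes are the ones given in the lesson.

```python
# Ordinal codes from the lesson: bread = 1, yogurt = 2, muffin = 3.
# (The dictionary and function names are hypothetical, chosen for illustration.)
ordinal = {"bread": 1, "yogurt": 2, "muffin": 3}


def binary_encode(code, width):
    """Return the binary digits of `code` as a list of ints, most significant first."""
    return [int(bit) for bit in format(code, f"0{width}b")]


# Two binary digits are enough to write the codes 1, 2 and 3.
width = max(ordinal.values()).bit_length()

# Each product becomes one value per new column (variable).
binary_columns = {name: binary_encode(code, width) for name, code in ordinal.items()}

for name, bits in binary_columns.items():
    print(name, bits)
# bread [0, 1]
# yogurt [1, 0]
# muffin [1, 1]
```

In this output, bread and yogurt differ in both columns, which is the mirror-image pattern the instructor points out.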
It's like we are saying, whatever is bread is not yogurt and vice versa. Even if this makes sense, if we encode them in a different way, this opposite correlation would be true for muffins and yogurt, but no longer for bread. Therefore, binary encoding proves problematic, but is a great improvement over the initial ordinal method.

All right. Finally, we have the so-called one-hot encoding. One-hot is very simple and widely adopted. It consists of creating as many columns as there are possible values. Here, we have three products. Thus, we need three columns or three variables. Let's call them bread, yogurt, and muffins.

Imagine these variables as asking the question: Is this product bread? Is this product yogurt? And is this product muffins? One means yes, zero means no. So for a product that is bread, we will have one zero zero. For a product that is yogurt, zero one zero, and for a product that is muffin, zero zero one.
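The one-hot scheme just described can be sketched as follows. This is a minimal, dependency-free illustration rather than the course's implementation; the names `categories`, `one_hot`, and `encoded` are assumptions introduced here, and the column order (bread, yogurt, muffins) follows the lesson.

```python
# One column per possible value, in the order used in the lesson.
# (All names below are hypothetical, chosen for illustration.)
categories = ["bread", "yogurt", "muffins"]


def one_hot(value, categories):
    """Return a list with 1 in the column matching `value` and 0 everywhere else."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    # Each column answers the question "is this product <category>?"
    return [1 if value == category else 0 for category in categories]


encoded = {name: one_hot(name, categories) for name in categories}

for name, vector in encoded.items():
    print(name, vector)
# bread [1, 0, 0]
# yogurt [0, 1, 0]
# muffins [0, 0, 1]
```

Every vector contains exactly one 1, because a product can belong to only one category at a time.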
This is very intuitive as a product can only be of one type at the same time. Thus, there will be only one value one and everything else will be zero. This means the products are uncorrelated and unequivocal, which is useful and usually works like a charm.

Many lessons ago we were talking about cats, dogs, and horses classification. The target vectors there were one-hot encoded, so we had the same type of vectors.

There is one big problem with one-hot encoding, though. One-hot encoding requires a lot of new variables. For example, Ikea offers around 12,000 products. Do we want to include 12,000 columns in our inputs? Definitely not.

If we used binary, the 12,000 products would be represented by 16 columns only since the 12,000th product would be written like this in binary. This is exponentially lower than the 12,000 columns we would need for one-hot encoding. In such cases, we must use binary, even though that would introduce some unjustified correlations between the products.
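The difference in column counts can be checked with a short calculation. The sketch below is an illustration under stated assumptions: the helper names `one_hot_column_count` and `binary_column_count` are hypothetical, and products are assumed to be numbered from 1 to 12,000 as in the ordinal scheme. By this count, 14 binary digits already cover codes up to 12,000 (since 2^14 = 16,384), so the 16 columns quoted in the lesson leave some headroom.

```python
def one_hot_column_count(n_categories):
    """One-hot encoding needs one column per category."""
    return n_categories


def binary_column_count(n_categories):
    """Binary digits needed to write the ordinal codes 1..n_categories."""
    return n_categories.bit_length()


n_products = 12_000  # approximate figure quoted in the lesson

print(format(n_products, "b"))           # 10111011100000
print(binary_column_count(n_products))   # 14
print(one_hot_column_count(n_products))  # 12000
```

The binary column count grows with the logarithm of the number of categories, while the one-hot column count grows linearly with it.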
Clearly, there is a trade-off between binary and one-hot encoding. We would prefer one-hot when we have a few categories and binary when dealing with many categories.

All right, that was all. Thanks for watching.
