005 Test for the Mean. Population Variance Known

All right. Now that we've covered the necessary theory, it is time for some testing.

We're going to explore two types of tests: tests for samples drawn from a single population and tests for samples drawn from multiple populations. This is very similar to the confidence intervals for a single population and the confidence intervals for two populations that we covered previously.

In the next few videos, we will run tests for a single mean with both known variance and unknown variance. Let's start with a test in which the variance is known, shall we?

For this test, we will use our good old data scientist salary example. Here's the data set one more time. By now, I hope you are able to calculate the sample mean: it is $100,200. The population variance is known, and its standard deviation is equal to $15,000. Moreover, the sample size is 30.

However, you saw that according to Glassdoor, the popular salary information website, the mean data scientist salary is $113,000.

The sample that is available on Glassdoor is based on self-reported numbers, and you would like to see if its value is correct. We need a two-sided test, as we are interested in knowing whether the salary is significantly less than that or significantly more than that.

The null hypothesis is that the population mean salary is $113,000. We denote it as mu zero equals $113,000. The alternative hypothesis is that the population mean salary is different from $113,000.

All right. Formula time, almost.

Testing is done by standardizing the variable at hand and comparing it to the z, which follows a standard normal distribution. Remember standardization? We learned about it in the previous section. Back then, I told you it was very important, and you will now see why. For those who don't remember, I suggest watching the video on standardization once again. For the others, I will quickly go through it.

We standardize a variable by subtracting the mean and dividing by the standard deviation.

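Written out, the hypotheses and the general standardization described above might look like the following. The symbols \mu (true population mean), x, and \sigma are notation assumed here for illustration; the transcript itself only names mu zero.

```latex
H_0 : \mu = \mu_0, \qquad H_1 : \mu \neq \mu_0, \qquad \mu_0 = 113{,}000

z = \frac{x - \mu}{\sigma}
```
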
Since it is a sample, we use the standard error. Thus, the formula for standardization becomes: Z is equal to the sample mean minus the value of interest from the null hypothesis, divided by the standard error. In this way, we obtain a distribution with a mean of 0 and a standard deviation of 1.

This Z should not be mistaken for z. The Z is the standardized variable associated with the test and will be called the Z-score from now on. The z is the one from the table that we've talked about before, and henceforth will be referred to as the critical value.

All right, how does testing work? Think about this. The z is normally distributed with a mean of 0 and a standard deviation of 1. The Z is normally distributed with a standard deviation of 1 and a mean equal to the standardized difference between the true population mean and mu zero, which is 0 when the null hypothesis holds.

Standardization lets us compare the means. The closer the difference of X bar and mu zero is to zero, the closer the Z-score itself is to zero. This implies a higher chance of not rejecting the null hypothesis.

Let's go back to the example. So what is the value of our standardized variable?

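For reference, the spoken formula for the Z-score can be written as below. The symbols \sigma (known population standard deviation) and n (sample size) are names assumed here; the standard error of the mean with known variance is \sigma / \sqrt{n}.

```latex
Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
```
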
We plug in the numbers that we have from the beginning of the lesson. What we get is a Z-score of minus 4.67.

Now, we will compare the absolute value of minus 4.67 with z of alpha divided by 2, where alpha is the significance level. Note that we use the absolute value, as it is much easier to always compare positive Z's with positive z's. Moreover, some z tables don't include negative values. You should be aware that the two statements, "minus 4.67 is lower than the negative critical value" and "4.67 is higher than the positive critical value," are equivalent.

Thus, our decision rule becomes: reject the null hypothesis if the absolute value of the Z-score is higher than the absolute value of the critical value.

Using 5% significance, our alpha is 0.05. Since it is a two-sided test, we check the table for z of 0.025. The corresponding value is 1.96.

The last thing we need to do is compare our standardized variable to the critical value. If the absolute value of the Z-score is higher than 1.96, we reject the null hypothesis. If it is lower, we do not reject it.

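The calculation just described can be sketched in a few lines of Python. This is an illustration only, not part of the lesson: the function names `z_score` and `critical_value` are hypothetical, and the critical value is taken from the standard library's `statistics.NormalDist` instead of a printed z table.

```python
from math import sqrt
from statistics import NormalDist


def z_score(sample_mean, mu_0, sigma, n):
    """Standardize the sample mean using the standard error sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu_0) / standard_error


def critical_value(alpha):
    """Two-sided critical value: the z that leaves alpha / 2 in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Numbers from the lesson: sample mean, hypothesized mean,
# known population standard deviation, and sample size.
z = z_score(sample_mean=100_200, mu_0=113_000, sigma=15_000, n=30)
z_crit = critical_value(alpha=0.05)

print(f"Z-score: {z:.2f}")                 # Z-score: -4.67
print(f"Critical value: {z_crit:.2f}")     # Critical value: 1.96
print(f"Reject H0: {abs(z) > z_crit}")     # Reject H0: True
```
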
4.67 is higher than 1.96; therefore, we reject the null hypothesis. The answer is that at the 5% significance level, we have rejected the null hypothesis, or, at 5% significance, there is statistical evidence that the mean salary is not $113,000. There are many other ways to express this, and you'll probably hear more about this later on in the course.

What if we had a different significance level? Using 1% significance, we have an alpha of 0.01. So z of alpha divided by 2 is 2.58. Once again, the absolute value of our Z-score, 4.67, is higher than 2.58, so we would reject the null hypothesis even at the 1% significance level.

But how much further can we go before we could not reject the null hypothesis anymore? 0.5%? 0.1%? There is a special technique that allows us to see what the significance level is, after which we will be unable to reject the null hypothesis. We will see it in our next video. Stay tuned.
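
The comparison at the 5% and 1% significance levels can be repeated with a short Python sketch. The helper `two_sided_z_test` is a hypothetical name introduced for illustration, and the critical values come from `statistics.NormalDist` in place of a z table.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, mu_0, sigma, n, alpha):
    """Return (Z-score, critical value, reject?) for a two-sided z-test."""
    z = (sample_mean - mu_0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit


# Salary example from the lesson, tested at two significance levels.
results = {}
for alpha in (0.05, 0.01):
    z, z_crit, reject = two_sided_z_test(100_200, 113_000, 15_000, 30, alpha)
    results[alpha] = (round(z, 2), round(z_crit, 2), reject)
    print(f"alpha={alpha}: |Z|={abs(z):.2f}, critical={z_crit:.2f}, reject H0: {reject}")
```

In both cases the absolute Z-score of 4.67 exceeds the critical value, so the null hypothesis is rejected at either level.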
