1
00:00:00,480 --> 00:00:02,490
Alright, great.
2
00:00:02,490 --> 00:00:05,400
Now that we know what the P-value is and how to use it,
3
00:00:05,400 --> 00:00:07,503
we will get back to hypothesis testing.
4
00:00:08,430 --> 00:00:11,970
We saw only one of two possible cases. Remember?
5
00:00:11,970 --> 00:00:14,280
We haven't covered the more commonly observed case
6
00:00:14,280 --> 00:00:16,533
when the population variance is unknown.
7
00:00:17,880 --> 00:00:21,960
Alright. Imagine you are the marketing analyst of a company
8
00:00:21,960 --> 00:00:23,580
and that you've been asked to estimate
9
00:00:23,580 --> 00:00:26,100
if the email open rate of one of the firm's competitors
10
00:00:26,100 --> 00:00:27,603
is above your company's.
11
00:00:29,130 --> 00:00:32,970
Your company has an open rate of 40%.
12
00:00:32,970 --> 00:00:34,650
An email open rate is a measure
13
00:00:34,650 --> 00:00:37,350
of how many people on the email list actually open
14
00:00:37,350 --> 00:00:38,850
the emails they have received.
15
00:00:40,020 --> 00:00:41,790
At first, you struggle to figure out
16
00:00:41,790 --> 00:00:43,830
how to get such specific information
17
00:00:43,830 --> 00:00:45,720
about a competitor company,
18
00:00:45,720 --> 00:00:48,420
but then you see that an employee of that competitor company
19
00:00:48,420 --> 00:00:52,200
posted a selfie on Facebook saying "LOL.
20
00:00:52,200 --> 00:00:55,977
The email management software we are using drives me nuts."
21
00:00:57,060 --> 00:00:59,700
In the background, you can see her screen
22
00:00:59,700 --> 00:01:01,650
and it shows clearly the summaries
23
00:01:01,650 --> 00:01:04,769
of the last 10 email campaigns that were sent
24
00:01:04,769 --> 00:01:08,073
and their corresponding open rates. Bingo!
25
00:01:09,210 --> 00:01:12,360
With your statistical skills, that's all you need.
26
00:01:12,360 --> 00:01:14,103
A little help from Facebook.
27
00:01:15,840 --> 00:01:18,090
Let's state the hypotheses.
28
00:01:18,090 --> 00:01:23,090
Null hypothesis: the mean open rate is lower than or equal to 40%.
29
00:01:24,360 --> 00:01:29,360
Alternative hypothesis: the mean open rate is higher than 40%.
30
00:01:29,730 --> 00:01:31,800
Note that in hypothesis testing,
31
00:01:31,800 --> 00:01:34,710
we are aiming to reject the null hypothesis.
32
00:01:34,710 --> 00:01:38,070
When we wanna test if the open rate is higher than 40%,
33
00:01:38,070 --> 00:01:41,313
the null hypothesis actually states the opposite.
34
00:01:42,330 --> 00:01:44,700
Also, pay attention that this time,
35
00:01:44,700 --> 00:01:46,953
we are dealing with a one-sided test.
36
00:01:48,930 --> 00:01:50,910
Alright. Your boss told you
37
00:01:50,910 --> 00:01:55,110
that 0.05 is an adequate significance level for this test,
38
00:01:55,110 --> 00:01:56,433
so that's what you'll use.
39
00:01:58,740 --> 00:02:00,300
Here's the data set.
40
00:02:00,300 --> 00:02:05,280
You calculate the sample mean and get 37.7%.
41
00:02:05,280 --> 00:02:08,880
The sample standard deviation is 13.74%,
42
00:02:08,880 --> 00:02:13,560
thus the standard error is 4.34%.
43
00:02:13,560 --> 00:02:16,710
You assume that the population of open rates of sent emails
44
00:02:16,710 --> 00:02:18,063
is normally distributed.
45
00:02:19,050 --> 00:02:21,390
As with confidence intervals when the variance is unknown
46
00:02:21,390 --> 00:02:22,680
and the sample is small,
47
00:02:22,680 --> 00:02:25,563
the correct statistic to use is the t-statistic.
48
00:02:27,090 --> 00:02:29,580
Remember, you do not know the variance
49
00:02:29,580 --> 00:02:31,293
and the sample is not big enough.
50
00:02:32,160 --> 00:02:34,080
This means that the variable follows
51
00:02:34,080 --> 00:02:35,940
the Student's t-distribution
52
00:02:35,940 --> 00:02:38,313
and you must employ the t-statistic.
53
00:02:40,020 --> 00:02:41,760
Let's calculate it then.
54
00:02:41,760 --> 00:02:44,853
We calculate the T score the same way as the Z score.
55
00:02:46,290 --> 00:02:48,810
The T score is equal to the sample mean,
56
00:02:48,810 --> 00:02:51,450
minus the hypothesized mean value,
57
00:02:51,450 --> 00:02:53,313
divided by the standard error.
58
00:02:54,900 --> 00:02:58,263
The result that we get is -0.53.
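The calculation just described can be sketched in a few lines of Python. This is only an illustration using the figures quoted in the lesson (n = 10, sample mean 37.7%, sample standard deviation 13.74%, hypothesized mean 40%); the variable names are hypothetical.

```python
import math

# Figures quoted in the lesson (all in percentage points)
n = 10                    # number of email campaigns in the sample
sample_mean = 37.7
sample_std = 13.74
hypothesized_mean = 40.0  # the company's own open rate

# Standard error of the mean: s / sqrt(n)
standard_error = sample_std / math.sqrt(n)

# T score: (sample mean - hypothesized mean) / standard error
t_score = (sample_mean - hypothesized_mean) / standard_error

print(f"standard error = {standard_error:.2f}")  # 4.34
print(f"t-score = {t_score:.2f}")                # -0.53
```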
59
00:03:00,090 --> 00:03:01,020
As we said earlier,
60
00:03:01,020 --> 00:03:03,600
it is easier to work with positive numbers.
61
00:03:03,600 --> 00:03:07,920
So, we should compare the absolute value of -0.53
62
00:03:07,920 --> 00:03:11,700
with the appropriate T with N minus one degrees of freedom
63
00:03:11,700 --> 00:03:15,153
at 0.05 one-sided significance.
64
00:03:16,650 --> 00:03:18,390
We quickly navigate through the table
65
00:03:18,390 --> 00:03:22,773
and get 1.83 as the critical value at 5% significance.
66
00:03:24,690 --> 00:03:29,690
Okay, 0.53 is lower than 1.83.
67
00:03:30,060 --> 00:03:31,890
Remember the decision rule?
68
00:03:31,890 --> 00:03:34,110
If the absolute value of the T score
69
00:03:34,110 --> 00:03:36,750
is lower than the statistic from the table,
70
00:03:36,750 --> 00:03:38,883
we cannot reject the null hypothesis.
71
00:03:39,840 --> 00:03:41,763
So, we keep the null hypothesis.
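As a rough sketch, the decision rule stated here can be written as a small Python function. The function name is hypothetical, and the critical value 1.83 is simply the table value quoted in the lesson (9 degrees of freedom, 0.05 one-sided), not something computed here.

```python
def reject_null(t_score, critical_value):
    """Lesson's rule: reject only if |T| exceeds the table value."""
    return abs(t_score) > critical_value

t_score = -0.53        # T score from the lesson
critical_value = 1.83  # table value: 9 degrees of freedom, 0.05 one-sided

decision = reject_null(t_score, critical_value)
print("reject the null hypothesis" if decision else "cannot reject the null hypothesis")
```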
72
00:03:43,830 --> 00:03:46,380
What you do next is you go and tell your boss
73
00:03:46,380 --> 00:03:49,080
that at this level of significance, statistically,
74
00:03:49,080 --> 00:03:51,690
we cannot say that the email open rate of our competitors
75
00:03:51,690 --> 00:03:53,693
is higher than 40%.
76
00:03:55,694 --> 00:03:59,340
Okay. What about the second measurement we saw?
77
00:03:59,340 --> 00:04:00,870
What was that?
78
00:04:00,870 --> 00:04:02,190
Ah, yes.
79
00:04:02,190 --> 00:04:03,393
The P-value.
80
00:04:04,440 --> 00:04:08,283
The P-value of this statistic is 0.304.
81
00:04:09,510 --> 00:04:12,090
As the P-value is greater than the significance level
82
00:04:12,090 --> 00:04:16,140
of 0.05, we come to the same conclusion.
83
00:04:16,140 --> 00:04:18,543
We cannot reject the null hypothesis.
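For readers who want to see where a figure like 0.304 comes from, here is a minimal sketch that integrates the Student's t density numerically with only the standard library. It assumes the quoted P-value is the tail area beyond the absolute value of the T score, in line with the absolute-value convention used in this lesson. The helper names are hypothetical, and the result should be read as approximate.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t  # the density is symmetric around 0

alpha = 0.05
p_value = upper_tail(abs(-0.53), df=9)  # roughly 0.30

print(f"p-value is about {p_value:.2f}")
print("cannot reject the null hypothesis" if p_value > alpha else "reject the null hypothesis")
```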
84
00:04:20,310 --> 00:04:21,632
Let's do a quick check.
85
00:04:22,710 --> 00:04:25,470
If the significance level was 0.01,
86
00:04:25,470 --> 00:04:27,270
the P-value would still be higher
87
00:04:27,270 --> 00:04:30,240
and we wouldn't reject the null hypothesis.
88
00:04:30,240 --> 00:04:32,070
This is an important observation
89
00:04:32,070 --> 00:04:34,110
that we haven't noted before.
90
00:04:34,110 --> 00:04:37,890
If we cannot reject the null hypothesis at 0.05 significance,
91
00:04:37,890 --> 00:04:40,290
we cannot reject it at smaller levels either.
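This observation is easy to check with a short, purely illustrative Python loop. It takes the P-value of 0.304 quoted in the lesson and compares it with progressively smaller significance levels; the list of levels is an arbitrary choice for the example.

```python
p_value = 0.304  # P-value quoted in the lesson

# Reject only when the P-value is at or below the significance level
decisions = {alpha: p_value <= alpha for alpha in (0.05, 0.01, 0.001)}

for alpha, reject in decisions.items():
    outcome = "reject" if reject else "cannot reject"
    print(f"alpha = {alpha}: {outcome} the null hypothesis")
```

Because the P-value is already above 0.05, it is above every smaller level as well, so none of the comparisons leads to a rejection.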
92
00:04:41,490 --> 00:04:43,560
Alright. That's all for now.
93
00:04:43,560 --> 00:04:46,110
Make sure you learn the material by doing the exercises
94
00:04:46,110 --> 00:04:47,580
after this lesson.
95
00:04:47,580 --> 00:04:48,580
Thanks for watching.