1
00:00:00,004 --> 00:00:02,004
- [Instructor] Statistical analysis might seem like
2
00:00:02,004 --> 00:00:04,009
an exact science, but if you've ever tried
3
00:00:04,009 --> 00:00:06,006
to apply statistics to real life,
4
00:00:06,006 --> 00:00:09,003
you know that, in fact, it is not.
5
00:00:09,003 --> 00:00:10,007
In this movie, I'd like to review
6
00:00:10,007 --> 00:00:12,009
a number of potential sources of error
7
00:00:12,009 --> 00:00:15,006
that might creep into your own analyses,
8
00:00:15,006 --> 00:00:16,009
because once you know what they are,
9
00:00:16,009 --> 00:00:19,006
you can do your best to avoid them.
10
00:00:19,006 --> 00:00:23,001
Not using random samples can be a huge source of error.
11
00:00:23,001 --> 00:00:26,000
One famous example comes from the 1936
12
00:00:26,000 --> 00:00:28,000
U.S. Presidential election,
13
00:00:28,000 --> 00:00:31,002
where a telephone poll of "Literary Digest" subscribers
14
00:00:31,002 --> 00:00:33,001
projected that Alfred Landon
15
00:00:33,001 --> 00:00:36,007
would beat Franklin Delano Roosevelt by a wide margin.
16
00:00:36,007 --> 00:00:40,008
In fact, Roosevelt won nearly 2/3 of the popular vote.
17
00:00:40,008 --> 00:00:42,006
The error came from two sources.
18
00:00:42,006 --> 00:00:45,005
"Literary Digest" was a conservative publication,
19
00:00:45,005 --> 00:00:47,001
which biased the results,
20
00:00:47,001 --> 00:00:49,009
and the poll was conducted by telephone.
21
00:00:49,009 --> 00:00:53,000
In 1936, only the financially well-off
22
00:00:53,000 --> 00:00:54,007
had telephones in their homes,
23
00:00:54,007 --> 00:00:57,004
so that biased the results, as well.
24
00:00:57,004 --> 00:01:00,003
You can also run into investigator bias.
25
00:01:00,003 --> 00:01:03,003
It's easy to anticipate what your data will tell you;
26
00:01:03,003 --> 00:01:06,003
that's normal, but you shouldn't let those expectations
27
00:01:06,003 --> 00:01:07,008
affect your judgment.
28
00:01:07,008 --> 00:01:10,005
Many interesting discoveries come from the moment
29
00:01:10,005 --> 00:01:13,007
when you look at your data and think, "That's strange,"
30
00:01:13,007 --> 00:01:18,000
because the results don't fit your preconceived notions.
31
00:01:18,000 --> 00:01:21,003
You can also run into trouble working with old data.
32
00:01:21,003 --> 00:01:24,003
The world changes, and just because your customers
33
00:01:24,003 --> 00:01:26,008
might have been ready for a product two years ago,
34
00:01:26,008 --> 00:01:29,004
doesn't mean they are now.
35
00:01:29,004 --> 00:01:33,004
And finally, you can run into trouble basing your policy
36
00:01:33,004 --> 00:01:36,008
on a survey or experiment with a small sample.
37
00:01:36,008 --> 00:01:38,009
The more data you can get, the better,
38
00:01:38,009 --> 00:01:42,006
and the greater variety of people that you ask,
39
00:01:42,006 --> 00:01:47,008
again, selected randomly, the better your analysis will be.
40
00:01:47,008 --> 00:01:49,005
Hopefully, pointing out these sources of error
41
00:01:49,005 --> 00:01:51,009
will help you in your own analysis
42
00:01:51,009 --> 00:01:54,000
and you can use random data,
43
00:01:54,000 --> 00:01:56,003
eliminate your own personal bias,
44
00:01:56,003 --> 00:01:58,002
work with the newest data that you have,
45
00:01:58,002 --> 00:02:00,000
and have an adequate sample size.