1
00:00:03,030 --> 00:00:04,500
Instructor: Welcome back!
2
00:00:04,500 --> 00:00:05,790
This lecture is going to serve
3
00:00:05,790 --> 00:00:09,210
as an overview of what a probability distribution is
4
00:00:09,210 --> 00:00:12,210
and what some of its main characteristics are.
5
00:00:12,210 --> 00:00:15,810
Simply put, a distribution shows the possible values
6
00:00:15,810 --> 00:00:18,783
a variable can take and how frequently they occur.
7
00:00:19,680 --> 00:00:20,790
Before we start,
8
00:00:20,790 --> 00:00:23,760
let us introduce some important notation we use
9
00:00:23,760 --> 00:00:26,160
for the remainder of the course.
10
00:00:26,160 --> 00:00:30,060
Assume that uppercase Y represents the actual outcome
11
00:00:30,060 --> 00:00:33,600
of an event and lowercase Y represents one
12
00:00:33,600 --> 00:00:34,863
of the possible outcomes.
13
00:00:36,000 --> 00:00:37,440
One way to denote the likelihood
14
00:00:37,440 --> 00:00:39,870
of reaching a particular outcome, lowercase y,
15
00:00:39,870 --> 00:00:43,863
is P of uppercase Y equals lowercase y.
16
00:00:44,790 --> 00:00:48,063
We can also express it as P of lowercase y.
17
00:00:49,590 --> 00:00:53,070
For example, uppercase Y could represent the number
18
00:00:53,070 --> 00:00:55,890
of red marbles we draw out of a bag
19
00:00:55,890 --> 00:00:58,650
and lowercase Y would be a specific number
20
00:00:58,650 --> 00:01:00,603
like three or five.
21
00:01:01,620 --> 00:01:03,390
Then we express the probability
22
00:01:03,390 --> 00:01:05,430
of getting exactly five red marbles
23
00:01:05,430 --> 00:01:10,430
as P of Y equals five or P of five.
24
00:01:12,270 --> 00:01:14,970
Since P of Y expresses the probability
25
00:01:14,970 --> 00:01:16,770
for each distinct outcome,
26
00:01:16,770 --> 00:01:19,233
we call this the probability function.
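As a rough sketch of this idea (the draw counts below are assumed for illustration, not data from the lecture), a probability function over a finite set of outcomes can be written as a mapping from each outcome y to P(Y = y):

```python
# Sketch of a probability function P(Y = y) for the marble example.
# The frequencies below are assumed purely for illustration.
from fractions import Fraction

draw_counts = {3: 4, 5: 6}            # hypothetical: outcome y -> times observed
total = sum(draw_counts.values())     # total number of recorded outcomes

# p[y] holds P(Y = y), also written p(y)
p = {y: Fraction(count, total) for y, count in draw_counts.items()}

print(p[5])  # P(Y = 5)
```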
27
00:01:20,850 --> 00:01:22,680
Good job, folks!
28
00:01:22,680 --> 00:01:24,660
So probability distributions
29
00:01:24,660 --> 00:01:27,540
or simply probabilities measure the likelihood
30
00:01:27,540 --> 00:01:28,920
of an outcome depending
31
00:01:28,920 --> 00:01:31,653
on how often it is featured in the sample space.
32
00:01:32,520 --> 00:01:34,320
Recall that we constructed the probability
33
00:01:34,320 --> 00:01:36,480
frequency distribution of an event
34
00:01:36,480 --> 00:01:38,733
in the introductory section of the course.
35
00:01:40,080 --> 00:01:41,520
We recorded the frequency
36
00:01:41,520 --> 00:01:43,860
for each unique value and divided it
37
00:01:43,860 --> 00:01:46,473
by the total number of elements in the sample space.
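The construction just described, counting each unique value and dividing by the size of the sample space, can be sketched in a few lines; the sample space below is an assumed example, not data from the course:

```python
# Construct a probability frequency distribution for a finite sample space:
# record the frequency of each unique value, then divide by the total count.
from collections import Counter

sample_space = [1, 2, 2, 3, 3, 3, 4]   # assumed example outcomes
freq = Counter(sample_space)
total = len(sample_space)

probabilities = {value: count / total for value, count in freq.items()}
print(probabilities[3])  # 3 occurs 3 times out of 7 elements
```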
38
00:01:47,370 --> 00:01:50,340
Usually, that is the way we construct these probabilities
39
00:01:50,340 --> 00:01:53,673
when we have a finite number of possible outcomes.
40
00:01:54,690 --> 00:01:57,180
If we had an infinite number of possibilities
41
00:01:57,180 --> 00:02:00,960
then recording the frequency for each one becomes impossible
42
00:02:00,960 --> 00:02:04,320
because there are infinitely many of them.
43
00:02:04,320 --> 00:02:07,680
For instance, imagine you are a data scientist
44
00:02:07,680 --> 00:02:10,983
and want to analyze the time it takes for your code to run.
45
00:02:11,910 --> 00:02:13,980
Any single compilation could take anywhere
46
00:02:13,980 --> 00:02:17,280
from a few milliseconds to several days.
47
00:02:17,280 --> 00:02:20,700
Often, the result will be between a few milliseconds
48
00:02:20,700 --> 00:02:21,663
and a few minutes.
49
00:02:22,650 --> 00:02:26,010
If we record time in seconds, we lose precision
50
00:02:26,010 --> 00:02:27,660
which is something to be avoided.
51
00:02:29,010 --> 00:02:30,180
To avoid losing precision, we need
52
00:02:30,180 --> 00:02:32,673
to use the smallest possible measurement of time.
53
00:02:33,510 --> 00:02:37,380
Since every millisecond, microsecond or even nanosecond could be split
54
00:02:37,380 --> 00:02:41,460
in half for greater accuracy, no such smallest unit exists.
55
00:02:41,460 --> 00:02:43,050
In less than an hour from now
56
00:02:43,050 --> 00:02:46,500
we will talk in more detail about continuous distributions
57
00:02:46,500 --> 00:02:47,800
and how to deal with them.
58
00:02:49,380 --> 00:02:51,903
Now is the time to introduce some key definitions.
59
00:02:52,860 --> 00:02:54,690
Regardless of whether we have a finite
60
00:02:54,690 --> 00:02:56,880
or infinite number of possibilities,
61
00:02:56,880 --> 00:02:58,620
we define distributions using
62
00:02:58,620 --> 00:03:00,690
only two characteristics:
63
00:03:00,690 --> 00:03:03,360
mean and variance.
64
00:03:03,360 --> 00:03:04,950
Simply put, the mean
65
00:03:04,950 --> 00:03:07,983
of the distribution is its average value.
66
00:03:08,940 --> 00:03:10,710
Variance, on the other hand
67
00:03:10,710 --> 00:03:12,993
is essentially how spread out the data is.
68
00:03:13,980 --> 00:03:16,170
We measure this spread by how far away
69
00:03:16,170 --> 00:03:18,213
from the mean all the values are.
70
00:03:19,860 --> 00:03:21,540
The more dispersed the data is
71
00:03:21,540 --> 00:03:23,433
the higher its variance will be.
72
00:03:24,600 --> 00:03:26,760
We denote the mean of a distribution
73
00:03:26,760 --> 00:03:28,950
with the Greek letter mu
74
00:03:28,950 --> 00:03:31,503
and its variance with sigma-squared.
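With made-up numbers, the mean mu and variance sigma-squared of a small population can be computed directly from these definitions:

```python
# Population mean (mu) and variance (sigma squared) for a small
# illustrative data set (values assumed for the example).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mu = sum(data) / len(data)                               # mean: the average value
sigma_sq = sum((y - mu) ** 2 for y in data) / len(data)  # variance: average squared
                                                         # distance from the mean
print(mu, sigma_sq)  # 5.0 4.0
```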
75
00:03:33,780 --> 00:03:36,450
Okay, when analyzing distributions
76
00:03:36,450 --> 00:03:38,280
it is important to understand what kind
77
00:03:38,280 --> 00:03:42,753
of data we are dealing with, population or sample data.
78
00:03:43,980 --> 00:03:46,290
Population data is the formal way of referring
79
00:03:46,290 --> 00:03:50,673
to all the data while sample data is just a part of it.
80
00:03:51,810 --> 00:03:54,900
For example, if an employer surveys an entire department
81
00:03:54,900 --> 00:03:56,700
about how they travel to work
82
00:03:56,700 --> 00:03:58,950
the data would represent the population
83
00:03:58,950 --> 00:04:00,570
of the department.
84
00:04:00,570 --> 00:04:03,750
However, this same data would also just be a sample
85
00:04:03,750 --> 00:04:05,763
of the employees in the whole company.
86
00:04:07,560 --> 00:04:10,050
Something to remember when using sample data is
87
00:04:10,050 --> 00:04:11,970
that we adopt different notations
88
00:04:11,970 --> 00:04:13,743
for the mean and variance.
89
00:04:14,670 --> 00:04:17,640
We denote sample mean as x-bar
90
00:04:17,640 --> 00:04:20,673
and sample variance as s-squared.
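If the same kind of data were treated as a sample, the standard library computes x-bar and s-squared directly. One caveat worth hedging: `statistics.variance` divides by n - 1 rather than n, the usual convention for samples, which the lecture does not go into here. The data itself is assumed for illustration:

```python
# Sample mean (x-bar) and sample variance (s-squared) via the stdlib.
# Note: statistics.variance uses the n - 1 divisor conventional for samples.
import statistics

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # assumed sample data
x_bar = statistics.mean(sample)
s_sq = statistics.variance(sample)
print(x_bar, s_sq)
```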
91
00:04:22,260 --> 00:04:23,880
One flaw of variance is
92
00:04:23,880 --> 00:04:26,610
that it is measured in squared units.
93
00:04:26,610 --> 00:04:29,580
For example, if you are measuring time in seconds,
94
00:04:29,580 --> 00:04:32,080
the variance would be measured in seconds-squared.
95
00:04:32,940 --> 00:04:35,823
Usually, there is no direct interpretation of that value.
96
00:04:36,810 --> 00:04:39,060
To make further sense of variance, we introduce
97
00:04:39,060 --> 00:04:41,670
a third characteristic of the distribution
98
00:04:41,670 --> 00:04:43,263
called standard deviation.
99
00:04:44,400 --> 00:04:47,220
Standard deviation is simply the positive square root
100
00:04:47,220 --> 00:04:48,183
of variance.
101
00:04:49,290 --> 00:04:52,950
As you may suspect, we denote it as sigma when dealing
102
00:04:52,950 --> 00:04:57,423
with a population and as S when dealing with a sample.
103
00:04:59,400 --> 00:05:01,860
Unlike variance, standard deviation is measured
104
00:05:01,860 --> 00:05:04,470
in the same units as the mean.
105
00:05:04,470 --> 00:05:08,640
Thus, we can interpret it directly, which often makes it preferable.
106
00:05:08,640 --> 00:05:11,700
One idea, which we will use a lot, is that any value
107
00:05:11,700 --> 00:05:16,080
between mu minus sigma and mu plus sigma falls
108
00:05:16,080 --> 00:05:19,830
within one standard deviation away from the mean.
109
00:05:19,830 --> 00:05:22,260
The more congested the middle of the distribution,
110
00:05:22,260 --> 00:05:24,393
the more data falls within that interval.
111
00:05:25,230 --> 00:05:27,330
Similarly, the less data that falls
112
00:05:27,330 --> 00:05:30,393
within the interval the more dispersed the data is.
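Under the same assumed-data caveat as before, the share of values falling within one standard deviation of the mean, that is, inside [mu - sigma, mu + sigma], can be checked like this:

```python
# Fraction of values inside [mu - sigma, mu + sigma],
# i.e. within one standard deviation of the mean (data assumed).
import math

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu = sum(data) / len(data)
sigma = math.sqrt(sum((y - mu) ** 2 for y in data) / len(data))

within = [y for y in data if mu - sigma <= y <= mu + sigma]
print(len(within) / len(data))  # share of data within one sigma of the mean
```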
113
00:05:31,740 --> 00:05:33,360
Fantastic!
114
00:05:33,360 --> 00:05:36,240
It is important to know that a constant relationship exists
115
00:05:36,240 --> 00:05:39,840
between mean and variance for any distribution.
116
00:05:39,840 --> 00:05:43,230
By definition, the variance equals the expected value
117
00:05:43,230 --> 00:05:46,590
of the squared difference from the mean for any value.
118
00:05:46,590 --> 00:05:50,790
We denote this as sigma-squared equals the expected value
119
00:05:50,790 --> 00:05:52,713
of Y minus mu, all squared.
120
00:05:53,820 --> 00:05:56,190
After some simplification, this is equal
121
00:05:56,190 --> 00:06:01,190
to the expected value of Y-squared, minus mu-squared.
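The relationship just stated, sigma-squared = E[(Y - mu)^2] = E[Y^2] - mu^2, can be verified numerically on an assumed finite distribution with equally likely outcomes:

```python
# Verify sigma^2 = E[(Y - mu)^2] = E[Y^2] - mu^2 for equally likely outcomes.
outcomes = [1.0, 2.0, 2.0, 3.0, 6.0]   # assumed example values
n = len(outcomes)

mu = sum(outcomes) / n
var_definition = sum((y - mu) ** 2 for y in outcomes) / n   # E[(Y - mu)^2]
var_shortcut = sum(y ** 2 for y in outcomes) / n - mu ** 2  # E[Y^2] - mu^2
print(var_definition, var_shortcut)
```

Both expressions give the same value up to floating-point rounding, which is exactly what the identity promises.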
122
00:06:01,470 --> 00:06:03,330
As you will see in the coming lectures
123
00:06:03,330 --> 00:06:05,790
if we are dealing with a specific distribution
124
00:06:05,790 --> 00:06:08,073
we can find a much more precise formula.
125
00:06:10,380 --> 00:06:12,120
Okay, when we are getting acquainted
126
00:06:12,120 --> 00:06:13,410
with a certain data set
127
00:06:13,410 --> 00:06:15,960
we want to analyze or make predictions with,
128
00:06:15,960 --> 00:06:17,070
we are most interested
129
00:06:17,070 --> 00:06:21,060
in the mean, variance and type of the distribution.
130
00:06:21,060 --> 00:06:24,360
In our next video, we will introduce several distributions
131
00:06:24,360 --> 00:06:26,223
and the characteristics they possess.
132
00:06:27,090 --> 00:06:28,173
Thanks for watching!