1
00:00:03,030 --> 00:00:04,320
Instructor: Welcome back.
2
00:00:04,320 --> 00:00:05,790
In this practical example
3
00:00:05,790 --> 00:00:07,500
we will explore several scenarios
4
00:00:07,500 --> 00:00:09,810
where understanding how a data set is distributed
5
00:00:09,810 --> 00:00:11,790
is truly beneficial.
6
00:00:11,790 --> 00:00:13,620
We will examine different data samples
7
00:00:13,620 --> 00:00:16,170
which follow a normal, a Student's T,
8
00:00:16,170 --> 00:00:18,423
and a Poisson distribution.
9
00:00:19,290 --> 00:00:21,450
Furthermore, we will analyze instances
10
00:00:21,450 --> 00:00:24,180
of exponential and binomial data
11
00:00:24,180 --> 00:00:26,610
to help us appreciate the elegant statistics
12
00:00:26,610 --> 00:00:28,353
these distributions possess.
13
00:00:29,670 --> 00:00:31,050
Let's begin.
14
00:00:31,050 --> 00:00:33,780
Imagine you are working as a head project manager
15
00:00:33,780 --> 00:00:35,340
for one of the most renowned companies
16
00:00:35,340 --> 00:00:38,550
in the world of video games, EA Games.
17
00:00:38,550 --> 00:00:40,380
Your various responsibilities include:
18
00:00:40,380 --> 00:00:43,710
supervising the development and release of the 2018 edition
19
00:00:43,710 --> 00:00:46,497
of the soccer game titled "FIFA 19".
20
00:00:47,640 --> 00:00:50,790
Above all else, you need to ensure the game is well rounded
21
00:00:50,790 --> 00:00:53,190
and provides a genuinely enjoyable experience
22
00:00:53,190 --> 00:00:55,080
for all the customers.
23
00:00:55,080 --> 00:00:57,300
The game has a professional competitive scene
24
00:00:57,300 --> 00:00:59,610
so it needs to be balanced.
25
00:00:59,610 --> 00:01:03,150
By balanced, we mean that no team or individual player
26
00:01:03,150 --> 00:01:05,459
should invariably be a preferred option
27
00:01:05,459 --> 00:01:07,323
regardless of the opposition.
28
00:01:08,430 --> 00:01:10,950
Therefore, we expect to have an equal number
29
00:01:10,950 --> 00:01:14,010
of good players and poor players in the game.
30
00:01:14,010 --> 00:01:15,633
Let's see if that's the case.
31
00:01:17,070 --> 00:01:19,440
We provided you with access to a data set
32
00:01:19,440 --> 00:01:21,900
containing the stats for each individual player
33
00:01:21,900 --> 00:01:23,520
in "FIFA 19".
34
00:01:23,520 --> 00:01:25,053
So let's have a closer look.
35
00:01:25,890 --> 00:01:29,790
You can use Microsoft Excel to open the FIFA 19 file
36
00:01:29,790 --> 00:01:31,113
accompanying this lecture.
37
00:01:33,450 --> 00:01:36,900
To begin with, examine the overall column.
38
00:01:36,900 --> 00:01:38,760
It represents the quality of a player
39
00:01:38,760 --> 00:01:42,063
in their natural position on a scale from one to 100.
40
00:01:42,900 --> 00:01:45,390
This value is a sort of weighted average
41
00:01:45,390 --> 00:01:48,153
of the many individual stats each player has.
42
00:01:49,410 --> 00:01:52,650
As you probably know, the importance of attributes varies
43
00:01:52,650 --> 00:01:54,660
for different positions on the field.
44
00:01:54,660 --> 00:01:57,870
For instance, acceleration and top speed
45
00:01:57,870 --> 00:02:00,573
are more important for a winger than tackling.
46
00:02:01,590 --> 00:02:04,440
However, the inverse is true for center backs.
47
00:02:04,440 --> 00:02:07,290
Thus, we alter the weight for each stat
48
00:02:07,290 --> 00:02:09,690
based on the position of the player.
49
00:02:09,690 --> 00:02:11,880
Therefore, we do not have a single formula
50
00:02:11,880 --> 00:02:14,583
which calculates the overall evaluation.
51
00:02:16,050 --> 00:02:17,940
To get an idea of how well distributed
52
00:02:17,940 --> 00:02:19,530
the overall values are,
53
00:02:19,530 --> 00:02:23,583
we can construct a histogram and set the bin size to one.
54
00:02:24,840 --> 00:02:27,393
We do so by selecting the overall column,
55
00:02:28,800 --> 00:02:30,960
clicking on insert,
56
00:02:30,960 --> 00:02:35,493
then insert statistics chart and selecting histogram.
57
00:02:36,990 --> 00:02:38,880
To adjust the size of the bins,
58
00:02:38,880 --> 00:02:41,400
right click on the x-axis of the graph
59
00:02:41,400 --> 00:02:46,293
and press format axis before setting bin width to one.
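Outside Excel, the same kind of histogram can be sketched in a few lines of Python. This is only a minimal sketch, not the lecture's method: `histogram` is a hypothetical helper name, and the `overall` list is made-up illustrative data rather than values from the FIFA 19 file.

```python
from collections import Counter

def histogram(values, bin_width=1):
    """Count how many values fall into each bin of the given width.

    Bins are left-closed: a value v goes into the bin that starts at
    (v // bin_width) * bin_width.
    """
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Made-up ratings for illustration only; not taken from the FIFA 19 file.
overall = [62, 64, 65, 65, 66, 66, 66, 67, 67, 68, 70]

# With a bin width of one, every distinct rating gets its own bar.
for start, count in histogram(overall, bin_width=1).items():
    print(f"{start:>3} | {'#' * count}")
```

Passing a larger `bin_width` groups neighbouring ratings into one bar, which is useful when there are only a few observations.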
60
00:02:47,910 --> 00:02:49,290
The graph is bell-shaped
61
00:02:49,290 --> 00:02:51,960
and resembles a normal distribution.
62
00:02:51,960 --> 00:02:55,710
But wait, aren't we dealing with discrete values?
63
00:02:55,710 --> 00:02:57,783
How can this be a normal distribution?
64
00:02:58,755 --> 00:03:00,270
Although that may be true,
65
00:03:00,270 --> 00:03:03,630
continuous variables can take discrete values
66
00:03:03,630 --> 00:03:05,343
but not vice versa.
67
00:03:06,180 --> 00:03:08,820
Furthermore, since we are dealing with rounded averages
68
00:03:08,820 --> 00:03:10,170
we are inclined to believe
69
00:03:10,170 --> 00:03:12,930
that the overall value is not entirely discrete
70
00:03:12,930 --> 00:03:14,853
but rather an approximation.
71
00:03:16,050 --> 00:03:17,973
Let's take a closer look at the graph.
72
00:03:20,280 --> 00:03:22,260
Now we can notice its thin tails
73
00:03:22,260 --> 00:03:25,410
which suggest a smaller number of outliers.
74
00:03:25,410 --> 00:03:27,810
This reflects real life quite accurately
75
00:03:27,810 --> 00:03:31,110
since very few professional players are exceptionally good
76
00:03:31,110 --> 00:03:33,993
or bad at every single aspect of the sport.
77
00:03:35,610 --> 00:03:38,550
Besides, even the least skilled professional soccer players
78
00:03:38,550 --> 00:03:41,130
are far superior to the average person.
79
00:03:41,130 --> 00:03:43,980
That explains why the lowest overall values
80
00:03:43,980 --> 00:03:46,743
start from around 50 rather than zero.
81
00:03:48,390 --> 00:03:50,430
The stats should reflect the performance of players
82
00:03:50,430 --> 00:03:51,690
in the real world.
83
00:03:51,690 --> 00:03:54,450
As the normal distribution is the most frequently observed
84
00:03:54,450 --> 00:03:56,730
in nature, it is only logical
85
00:03:56,730 --> 00:03:58,923
that the data resembles this distribution.
86
00:04:00,240 --> 00:04:03,180
Moreover, the bell-shaped graph with thin tails
87
00:04:03,180 --> 00:04:04,743
further supports this idea.
88
00:04:06,390 --> 00:04:07,950
Since one of the main characteristics
89
00:04:07,950 --> 00:04:10,500
of a normal distribution is symmetry,
90
00:04:10,500 --> 00:04:13,710
the overall values are symmetrically distributed.
91
00:04:13,710 --> 00:04:16,079
Thus, we can safely consider the game balanced
92
00:04:16,079 --> 00:04:18,333
and acceptable for competitive play.
93
00:04:19,589 --> 00:04:21,360
It is also worth noting that players
94
00:04:21,360 --> 00:04:25,380
within a single team or division share similar stats.
95
00:04:25,380 --> 00:04:27,330
This skews the data a certain way
96
00:04:27,330 --> 00:04:29,400
and explains why we cannot expect the values
97
00:04:29,400 --> 00:04:31,203
to follow a normal distribution exactly.
98
00:04:32,430 --> 00:04:34,410
Now if we wish to further test the balance
99
00:04:34,410 --> 00:04:35,850
of the overall stats,
100
00:04:35,850 --> 00:04:39,570
we can examine a small sample of random players.
101
00:04:39,570 --> 00:04:41,597
For instance, we can construct a histogram
102
00:04:41,597 --> 00:04:44,310
of the first 30 players in the data set
103
00:04:44,310 --> 00:04:46,053
based on their ID number.
104
00:04:47,760 --> 00:04:49,080
Since our data is limited
105
00:04:49,080 --> 00:04:51,390
we need to adjust the size of the bins,
106
00:04:51,390 --> 00:04:53,970
otherwise it is possible for each value to occur
107
00:04:53,970 --> 00:04:55,800
only once or twice.
108
00:04:55,800 --> 00:04:58,440
That would result in many bins of one or two
109
00:04:58,440 --> 00:05:00,153
and make the histogram redundant.
110
00:05:01,770 --> 00:05:03,810
If we adjust the bin size to three,
111
00:05:03,810 --> 00:05:05,700
we will see that the graph slightly resembles
112
00:05:05,700 --> 00:05:07,650
a normal distribution.
113
00:05:07,650 --> 00:05:11,580
However, we will also notice the fatter tails.
114
00:05:11,580 --> 00:05:13,770
Since the number of observations is limited
115
00:05:13,770 --> 00:05:16,110
we can safely consider that this sample follows
116
00:05:16,110 --> 00:05:18,153
a Student's t-distribution.
117
00:05:20,400 --> 00:05:23,790
Recall that the Student's t-distribution is also symmetric.
118
00:05:23,790 --> 00:05:24,930
So we are confident
119
00:05:24,930 --> 00:05:27,090
that even the small sample we are examining
120
00:05:27,090 --> 00:05:29,283
confirms our goal of a balanced game.
121
00:05:30,660 --> 00:05:32,490
Before we move on to other aspects
122
00:05:32,490 --> 00:05:33,900
of the development of the game,
123
00:05:33,900 --> 00:05:36,690
let's explore how a single stat is distributed
124
00:05:36,690 --> 00:05:38,940
among the players in the game.
125
00:05:38,940 --> 00:05:41,103
Take the shot power column for example.
126
00:05:42,300 --> 00:05:45,840
If we construct a histogram and set the bin size to one,
127
00:05:45,840 --> 00:05:49,260
we will see a distribution with two peaks.
128
00:05:49,260 --> 00:05:51,873
It resembles two graphs placed side by side.
129
00:05:52,770 --> 00:05:53,970
A way to interpret this
130
00:05:53,970 --> 00:05:56,370
is having two distinct groups of players,
131
00:05:56,370 --> 00:05:58,530
one with a mean of around 21
132
00:05:58,530 --> 00:06:00,783
and another one with a mean of around 65.
133
00:06:02,610 --> 00:06:03,840
The reason behind this
134
00:06:03,840 --> 00:06:06,270
is the presence of goalkeepers in the game.
135
00:06:06,270 --> 00:06:08,670
The stats important for them are completely different
136
00:06:08,670 --> 00:06:11,460
from the stats essential for outfield players.
137
00:06:11,460 --> 00:06:12,840
Thus, it only makes sense
138
00:06:12,840 --> 00:06:14,910
that they will have distinctly lower values
139
00:06:14,910 --> 00:06:17,433
for many of the non-goalkeeper specific stats.
140
00:06:18,510 --> 00:06:22,290
If we examine a goalkeeping trait like GK diving,
141
00:06:22,290 --> 00:06:25,800
we will be able to see the division into types more clearly.
142
00:06:25,800 --> 00:06:28,440
We have two completely different clusters.
143
00:06:28,440 --> 00:06:30,660
The low value represents how outfield players
144
00:06:30,660 --> 00:06:32,220
would perform in goal...
145
00:06:32,220 --> 00:06:33,900
And the higher one represents
146
00:06:33,900 --> 00:06:35,943
the actual goalkeepers' performance.
147
00:06:36,810 --> 00:06:38,880
If we only examine the goalies
148
00:06:38,880 --> 00:06:41,880
we will see the values are normally distributed once again
149
00:06:41,880 --> 00:06:44,223
so the game is indeed balanced.
150
00:06:45,840 --> 00:06:47,520
Great job.
151
00:06:47,520 --> 00:06:49,920
Another aspect which makes the game more enjoyable
152
00:06:49,920 --> 00:06:52,410
is creating a sense of realism.
153
00:06:52,410 --> 00:06:54,840
For instance, the young professional soccer players
154
00:06:54,840 --> 00:06:56,670
outnumber the veterans.
155
00:06:56,670 --> 00:06:57,920
Here are the reasons why.
156
00:06:58,860 --> 00:07:02,070
First, a significant number of promising young players
157
00:07:02,070 --> 00:07:03,510
suffer bad injuries,
158
00:07:03,510 --> 00:07:05,730
which significantly slow down their progress
159
00:07:05,730 --> 00:07:07,323
or even halt it altogether.
160
00:07:08,340 --> 00:07:10,650
Second, some are forced to retire
161
00:07:10,650 --> 00:07:12,510
while others simply decide to quit
162
00:07:12,510 --> 00:07:15,360
after spending too much time off the field.
163
00:07:15,360 --> 00:07:16,350
Last but not least,
164
00:07:16,350 --> 00:07:19,050
young players who are not given the opportunity to play
165
00:07:19,050 --> 00:07:20,850
often decide to go to university
166
00:07:20,850 --> 00:07:22,953
instead of pursuing a career in soccer.
167
00:07:24,090 --> 00:07:26,130
All of these factors lead to attrition
168
00:07:26,130 --> 00:07:29,850
which results in having fewer players above the age of 35
169
00:07:29,850 --> 00:07:31,683
than players below the age of 20.
170
00:07:32,880 --> 00:07:35,910
To make sure the game captures this aspect of the sport
171
00:07:35,910 --> 00:07:37,383
check out the age column.
172
00:07:39,030 --> 00:07:41,640
Once again, we can construct a histogram
173
00:07:41,640 --> 00:07:43,353
and set the bin size to one.
174
00:07:44,220 --> 00:07:46,110
We already demonstrated how to do this
175
00:07:46,110 --> 00:07:49,263
for the overall column, so just follow the same steps.
176
00:07:50,760 --> 00:07:52,710
By setting the bin width to one
177
00:07:52,710 --> 00:07:55,623
every age gets represented by a separate bar on the graph.
178
00:07:56,640 --> 00:07:58,980
Age is a discrete variable
179
00:07:58,980 --> 00:08:01,560
representing the age of each player.
180
00:08:01,560 --> 00:08:04,380
In addition, age has a minimum value of 16
181
00:08:04,380 --> 00:08:07,170
since the game only consists of first team players
182
00:08:07,170 --> 00:08:09,930
who have signed a professional contract.
183
00:08:09,930 --> 00:08:12,840
Thus, you can consider 16 as the starting point
184
00:08:12,840 --> 00:08:16,110
for any player who can sign a professional contract.
185
00:08:16,110 --> 00:08:18,330
You may view it as sort of an origin
186
00:08:18,330 --> 00:08:20,820
for a Poisson distribution.
187
00:08:20,820 --> 00:08:23,370
Then each bar in the graph would showcase the likelihood
188
00:08:23,370 --> 00:08:26,853
of a certain player within the data to be a specific age.
189
00:08:27,690 --> 00:08:29,880
Since a Poisson distribution is skewed,
190
00:08:29,880 --> 00:08:32,403
the younger players outnumber the older ones.
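The idea of treating 16 as an origin can be sketched by shifting each age down by 16 and evaluating the Poisson probability mass function. This is a hedged illustration only: the mean `LAM` is an assumed value chosen for the example, not one estimated from the FIFA 19 file.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

ORIGIN = 16   # youngest age in the data, per the lecture
LAM = 9.0     # assumed mean of (age - 16); an illustrative value only

# Probability attached to a few ages under this assumed model.
for age in range(16, 41, 4):
    p = poisson_pmf(age - ORIGIN, LAM)
    print(f"age {age}: {p:.4f}")

# Compare the two tails mentioned in the lecture.
below_20 = sum(poisson_pmf(age - ORIGIN, LAM) for age in range(16, 20))
above_35 = sum(poisson_pmf(age - ORIGIN, LAM) for age in range(36, 80))
print(f"below 20: {below_20:.4f}, above 35: {above_35:.4f}")
```

Under these assumptions the right tail is long but thin, so players under 20 carry more probability than players over 35.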
191
00:08:33,299 --> 00:08:36,780
As we mentioned before, that is also true in real life.
192
00:08:36,780 --> 00:08:39,179
Therefore, this creates an additional layer of realism
193
00:08:39,179 --> 00:08:41,309
to the game, and should make it more enjoyable
194
00:08:41,309 --> 00:08:42,363
for the customers.
195
00:08:44,910 --> 00:08:47,190
Do you remember that as a head project manager,
196
00:08:47,190 --> 00:08:48,780
apart from the development of the game,
197
00:08:48,780 --> 00:08:52,110
you also need to supervise the official release?
198
00:08:52,110 --> 00:08:55,353
One of its most important aspects is social media marketing.
199
00:08:56,190 --> 00:08:58,110
Now, imagine your main competitor
200
00:08:58,110 --> 00:09:00,210
is trying to expand their customer base
201
00:09:00,210 --> 00:09:02,160
by uploading free video previews
202
00:09:02,160 --> 00:09:06,090
of their new games each Monday prior to their launch.
203
00:09:06,090 --> 00:09:08,550
A month ago, you assigned one of the interns
204
00:09:08,550 --> 00:09:11,400
to keep track of the progress of their views.
205
00:09:11,400 --> 00:09:13,260
You can find the recorded viewership values
206
00:09:13,260 --> 00:09:16,983
in the Daily Views Excel file accompanying this lecture.
207
00:09:19,110 --> 00:09:20,880
Before we proceed with the analysis
208
00:09:20,880 --> 00:09:23,280
I recommend that you download and open the file.
209
00:09:26,130 --> 00:09:30,390
Okay, the Excel file contains a single sheet titled Views,
210
00:09:30,390 --> 00:09:31,983
which comprises two columns.
211
00:09:33,030 --> 00:09:36,090
The first one indicates the number of days post-release
212
00:09:36,090 --> 00:09:38,010
when the value was recorded.
213
00:09:38,010 --> 00:09:40,290
The second one shows the number of views
214
00:09:40,290 --> 00:09:41,523
since the last check.
215
00:09:42,630 --> 00:09:44,490
To get a better understanding of the data
216
00:09:44,490 --> 00:09:47,140
you would wanna see how viewership changes over time.
217
00:09:48,150 --> 00:09:51,363
In order to do so, you decide to graph the data set.
218
00:09:52,740 --> 00:09:54,570
The easiest way to do this
219
00:09:54,570 --> 00:09:58,713
is by marking columns A and B and clicking on insert.
220
00:09:59,820 --> 00:10:01,890
The next step is going to charts
221
00:10:01,890 --> 00:10:03,693
and selecting a scatter plot.
222
00:10:05,160 --> 00:10:08,100
Since most of the views occur within the first few days
223
00:10:08,100 --> 00:10:10,680
the graph starts off at a very high point
224
00:10:10,680 --> 00:10:12,333
and drops down rather quickly.
225
00:10:13,500 --> 00:10:16,530
We can see that daily views start around 100,000
226
00:10:16,530 --> 00:10:19,413
but fall to about 20,000 within a week.
227
00:10:20,700 --> 00:10:22,980
Once the new video is released and promoted
228
00:10:22,980 --> 00:10:25,770
viewership drops to around 10,000 per day
229
00:10:25,770 --> 00:10:28,743
and steadily decreases as it loses relevancy.
230
00:10:30,510 --> 00:10:32,310
By the time a second video has been released
231
00:10:32,310 --> 00:10:33,900
around the 14th day,
232
00:10:33,900 --> 00:10:36,450
the video gets barely a few thousand views per day.
233
00:10:37,560 --> 00:10:40,713
This kind of behavior resembles an exponential distribution.
234
00:10:41,940 --> 00:10:44,070
To check how accurate our assumption is,
235
00:10:44,070 --> 00:10:46,890
we can click on the chart elements button
236
00:10:46,890 --> 00:10:48,243
and select a trend line.
237
00:10:50,070 --> 00:10:52,920
If we do not specify the type of relationship we expect,
238
00:10:52,920 --> 00:10:54,930
Excel is going to assume a linear one
239
00:10:54,930 --> 00:10:57,003
and create a straight trend line.
240
00:10:58,800 --> 00:11:01,440
Since this distribution resembles an exponential one
241
00:11:01,440 --> 00:11:03,993
we pick an exponential trend line instead.
242
00:11:04,890 --> 00:11:08,640
The curve of the trend line fits the data points accurately.
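One common way to fit an exponential curve of this kind is a least-squares line through the logarithm of the values. The sketch below assumes that approach; `fit_exponential` is a hypothetical helper, and the daily views are made-up numbers rather than the values recorded in the Daily Views file.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by ordinary least squares on ln(y).

    Requires every y to be positive, since the fit works on ln(y).
    """
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxy / sxx                       # slope of ln(y) against x
    a = math.exp(mean_l - b * mean_x)   # intercept, mapped back from log space
    return a, b

# Made-up daily views that decay by roughly 20% per day; not the intern's data.
days = list(range(1, 11))
views = [100_000 * math.exp(-0.22 * d) for d in days]

a, b = fit_exponential(days, views)
print(f"views ~ {a:.0f} * exp({b:.3f} * day)")
```

A negative `b` means the views shrink by a constant percentage each day, which is what the steep early drop in the scatter plot suggests.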
243
00:11:08,640 --> 00:11:10,860
If we assume that the views in fact follow
244
00:11:10,860 --> 00:11:12,090
such a distribution,
245
00:11:12,090 --> 00:11:14,670
then the trend line would represent the PDF
246
00:11:14,670 --> 00:11:16,983
for a view occurring on a specific day.
247
00:11:18,270 --> 00:11:19,770
To test whether views really follow
248
00:11:19,770 --> 00:11:21,390
an exponential distribution,
249
00:11:21,390 --> 00:11:23,943
we should look at the CDF graph as well.
250
00:11:24,780 --> 00:11:26,970
We can graph the relationship between the first
251
00:11:26,970 --> 00:11:28,410
and third columns.
252
00:11:28,410 --> 00:11:31,620
Since total views represents the cumulative number of views
253
00:11:31,620 --> 00:11:33,510
up to a given period in time,
254
00:11:33,510 --> 00:11:37,740
it shows the aggregated number of views the video got.
255
00:11:37,740 --> 00:11:39,480
Let's create another scatter plot
256
00:11:39,480 --> 00:11:41,380
following the same steps as last time.
257
00:11:43,080 --> 00:11:46,170
We can notice that the curve goes up at a decreasing rate
258
00:11:46,170 --> 00:11:48,510
before eventually plateauing.
259
00:11:48,510 --> 00:11:51,660
This also matches our expectation of the CDF
260
00:11:51,660 --> 00:11:53,433
of an exponential distribution.
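The comparison with the CDF can be sketched by turning the running total of views into a share of all views and placing it next to 1 - exp(-rate * t). Both the decay rate and the daily views below are assumptions made for illustration, not values from the lecture's file.

```python
import math
from itertools import accumulate

def exponential_cdf(t, rate):
    """CDF of the exponential distribution: 1 - exp(-rate * t) for t >= 0."""
    return 1.0 - math.exp(-rate * t)

RATE = 0.22  # assumed decay rate per day; illustrative only

# Made-up daily views for days 1..30; not the values from the Excel file.
days = list(range(1, 31))
daily_views = [100_000 * math.exp(-RATE * d) for d in days]

# Running total of views, then each day's share of the 30-day total.
total_views = list(accumulate(daily_views))
share = [t / total_views[-1] for t in total_views]

for d in (1, 7, 14, 30):
    print(f"day {d}: observed share {share[d - 1]:.3f}, "
          f"exponential CDF {exponential_cdf(d, RATE):.3f}")
```

The share climbs quickly at first and then flattens, the same rise-then-plateau shape described for the cumulative scatter plot.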
261
00:11:55,020 --> 00:11:57,720
Now that we know the viewership fluctuates each day
262
00:11:57,720 --> 00:12:02,040
we can state that each video loses relevancy rather quickly.
263
00:12:02,040 --> 00:12:04,470
This means that such a campaign is only beneficial
264
00:12:04,470 --> 00:12:05,970
in the short term.
265
00:12:05,970 --> 00:12:07,710
Therefore, you advise your marketing team
266
00:12:07,710 --> 00:12:10,770
to release similar videos only during the last month
267
00:12:10,770 --> 00:12:12,510
before launching the game.
268
00:12:12,510 --> 00:12:15,360
That way, all the videos will generate enough attention
269
00:12:15,360 --> 00:12:17,010
to make the game feel immense
270
00:12:17,010 --> 00:12:19,443
without losing customer interest in the process.
271
00:12:21,060 --> 00:12:22,143
Fantastic.
272
00:12:22,980 --> 00:12:24,810
In addition to competitor analysis
273
00:12:24,810 --> 00:12:27,423
you need to conduct some customer analysis as well.
274
00:12:29,100 --> 00:12:31,170
You certainly care which of your clients can afford
275
00:12:31,170 --> 00:12:33,420
to spend more on in-game purchases,
276
00:12:33,420 --> 00:12:35,520
so you send out a survey.
277
00:12:35,520 --> 00:12:36,990
One of the survey questions
278
00:12:36,990 --> 00:12:38,940
is whether the customer is a premium member
279
00:12:38,940 --> 00:12:41,583
of the official fan club of any team in the game.
280
00:12:42,720 --> 00:12:45,870
Since these fans are more devoted and financially capable
281
00:12:45,870 --> 00:12:48,360
you wanna find out if there is any other feature
282
00:12:48,360 --> 00:12:50,970
you could use to target this group.
283
00:12:50,970 --> 00:12:53,340
You decide to examine a small sample of the data
284
00:12:53,340 --> 00:12:55,350
which contains the age of the customer
285
00:12:55,350 --> 00:12:57,180
according to their EA sports account
286
00:12:57,180 --> 00:12:59,163
and whether they are a premium member.
287
00:13:01,140 --> 00:13:04,200
This data is stored in the Customer's Membership Excel file
288
00:13:04,200 --> 00:13:05,523
accompanying this lecture.
289
00:13:06,630 --> 00:13:09,570
After opening the file, we see two columns,
290
00:13:09,570 --> 00:13:11,310
one with numeric values
291
00:13:11,310 --> 00:13:13,653
and the other one with ones and zeros.
292
00:13:15,000 --> 00:13:17,970
The first column represents the age of the customer.
293
00:13:17,970 --> 00:13:20,883
The second one shows whether they are a member or not.
294
00:13:21,750 --> 00:13:24,180
If the customer is also a member of the fan club
295
00:13:24,180 --> 00:13:27,360
we put one in the second column.
296
00:13:27,360 --> 00:13:30,753
Alternatively, if they are not, we write down a zero.
297
00:13:31,650 --> 00:13:33,870
Now, if we construct the scatter plot
298
00:13:33,870 --> 00:13:36,780
we are going to see that most people under the age of 34
299
00:13:36,780 --> 00:13:38,370
don't have a membership.
300
00:13:38,370 --> 00:13:42,030
Whilst most people over the age of 34 do.
301
00:13:42,030 --> 00:13:44,130
Of course, there are exceptions to this rule,
302
00:13:44,130 --> 00:13:46,980
which is normal when we are dealing with real world data.
303
00:13:48,750 --> 00:13:50,430
That being said, the data looks
304
00:13:50,430 --> 00:13:52,890
like it follows a logistic distribution
305
00:13:52,890 --> 00:13:55,920
since the likelihood of having a membership sharply rises
306
00:13:55,920 --> 00:13:57,663
after nearing a specific value.
307
00:13:58,560 --> 00:14:01,290
In this case, we can think about 34
308
00:14:01,290 --> 00:14:03,753
as the location of the distribution.
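To see what a location of 34 means, the logistic CDF can be evaluated at a few ages. This is a sketch under stated assumptions: the scale of 2 is an arbitrary illustrative choice, not a value estimated from the membership file.

```python
import math

def logistic_cdf(x, location, scale):
    """CDF of the logistic distribution: 1 / (1 + exp(-(x - location) / scale))."""
    return 1.0 / (1.0 + math.exp(-(x - location) / scale))

LOCATION = 34.0  # the age highlighted in the lecture
SCALE = 2.0      # assumed spread; an illustrative value, not estimated from the file

for age in (20, 30, 34, 38, 48):
    p = logistic_cdf(age, LOCATION, SCALE)
    print(f"age {age}: estimated share with a membership = {p:.2f}")
```

At the location itself the curve passes through one half, and a smaller scale would make the jump around 34 sharper.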
309
00:14:04,890 --> 00:14:08,040
This leads us to believe that 34 is the approximate age
310
00:14:08,040 --> 00:14:11,130
at which customers have already reached financial stability
311
00:14:11,130 --> 00:14:13,203
and can afford higher membership fees.
312
00:14:14,400 --> 00:14:17,100
This insight suggests we should target customers
313
00:14:17,100 --> 00:14:21,063
above the age of 34 since they're more likely to spend more.
314
00:14:22,170 --> 00:14:23,580
One way to use this information
315
00:14:23,580 --> 00:14:27,660
is to release more expensive legend FIFA ultimate team cards
316
00:14:27,660 --> 00:14:30,303
for players who have retired in the past 20 years.
317
00:14:32,940 --> 00:14:34,920
Fantastic work.
318
00:14:34,920 --> 00:14:37,680
In this lecture, you were able to see numerous examples
319
00:14:37,680 --> 00:14:39,630
where knowing how to deal with distributions
320
00:14:39,630 --> 00:14:41,400
is truly beneficial.
321
00:14:41,400 --> 00:14:42,720
You developed an understanding
322
00:14:42,720 --> 00:14:44,940
of the practical aspect of probability,
323
00:14:44,940 --> 00:14:47,760
and discovered why knowing how the data is distributed
324
00:14:47,760 --> 00:14:50,910
can help us make correct business decisions.
325
00:14:50,910 --> 00:14:52,650
In the next section of the course
326
00:14:52,650 --> 00:14:55,290
we will further talk about how probability ties
327
00:14:55,290 --> 00:14:59,790
into other important fields, like finance and data science.
328
00:14:59,790 --> 00:15:02,223
See you all there and thanks for watching.