In this clip, let's first move on to discussing feature combination. Now, it's quite possible that the raw data that you're working with contains very granular information which doesn't have much predictive power. Feature combination may involve aggregating and bringing features together to get a feature with more predictive power. Now, you might find that in the real world, some features naturally work better when they're considered together. A feature by itself may not contain much information, but when considered in conjunction with another feature, the features in combination might contain information that is relevant to your model. It's quite possible that the original feature might be too raw or too granular. Bringing features together can help improve the predictive power of features.
Let's say you're building a machine learning model to predict traffic patterns in a city. Let's say the city is Bangalore. Now, you might get information from the day of the week that it is. You might also get information from the time of the day. But when taken in conjunction, when you use a feature cross, day of the week plus time of the day, you might get a resulting feature that has more predictive power. If you're looking at traffic on a Friday evening at 6 PM, you know it's going to be terrible. But if you're looking at the same time, 6 PM, on a Sunday, traffic is quite likely not as bad.
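To make this concrete, here is a minimal sketch of such a feature cross in Python, assuming pandas is available; the DataFrame and its column names are made up for illustration and are not from the original example:

```python
import pandas as pd

# Hypothetical traffic observations; columns are illustrative only.
df = pd.DataFrame({
    "day_of_week": ["Fri", "Fri", "Sun", "Sun"],
    "hour_of_day": [18, 9, 18, 9],
})

# The feature cross: concatenate the two raw features so a model can
# learn a separate weight for each (day of week, hour of day) pair.
df["day_x_hour"] = df["day_of_week"] + "_" + df["hour_of_day"].astype(str)

# One-hot encode the crossed feature for use in a linear model.
crossed = pd.get_dummies(df["day_x_hour"])
print(crossed)
```

The same pattern applies to the season and time-of-day combination discussed next: the cross gives the model a distinct signal for "Friday at 18:00" that neither raw feature carries on its own.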
Let's say you want to combine features together to predict temperature. You can take into account the current season, whether it's spring, summer, fall, or winter. You can also take into account the time of the day. But when taken together, you might find that the feature combination is more than the sum of its parts.
And finally, let's move on to the last component that we'll discuss within feature engineering, and that is dimensionality reduction. When you're working with data in the real world, you will find that a common problem to have is that you have too much data. This is a curse and not a blessing, and it's often referred to as the curse of dimensionality. This is where you would apply preprocessing algorithms to reduce the complexity of raw features, and the specific aim of these algorithms is to reduce the number of input features, so you have fewer features to work with. Having too many features to work with in your data is referred to as the curse of dimensionality, and it leads to several problems. You have problems visualizing your data. You encounter problems during training as well as during prediction. When you work with higher-dimensionality data, machine learning models find it hard to find patterns within your data, leading to poor-quality models and overfitted models. Overfitted models are those that perform well in training but poorly in the real world, in production. Dimensionality reduction explicitly aims to solve the curse of dimensionality while preserving as much information as possible from the underlying features. You don't want to lose too much information.
Dimensionality reduction is a form of unsupervised learning; you're working with an unlabeled corpus of data. Based on the kind of data that you're working with, there are many different techniques that you can use for dimensionality reduction. When you're working with linear data, you can choose principal components analysis, which involves re-orienting your original data so that it's projected along new, better axes.
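As a rough illustration, here is what principal components analysis might look like with scikit-learn (an assumed library choice; the clip doesn't prescribe one), using the bundled digits dataset:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# 64-dimensional digit images, projected onto 10 new axes.
X, _ = load_digits(return_X_y=True)

pca = PCA(n_components=10)         # keep the 10 most informative axes
X_reduced = pca.fit_transform(X)   # re-orient and project the data

print(X.shape, "->", X_reduced.shape)         # (1797, 64) -> (1797, 10)
print(pca.explained_variance_ratio_.sum())    # fraction of variance kept
```

The explained variance ratio is how you check that the projection isn't losing too much information.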
If you're working with non-linear data, you can apply manifold learning techniques. This involves unrolling complex forms of data in higher dimensions to express the data in a simpler form with lower dimensionality. Manifold learning techniques are similar to unrolling a carpet so that it's represented in two dimensions.
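To stay with the carpet analogy, here is a brief scikit-learn sketch (again an assumed library choice) that unrolls a synthetic "swiss roll", a two-dimensional sheet rolled up in three dimensions:

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# A 2D sheet rolled up in 3D, much like the rolled-up carpet.
X, _ = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# Isomap, one manifold learning technique, unrolls it into 2 dimensions.
embedding = Isomap(n_neighbors=10, n_components=2)
X_unrolled = embedding.fit_transform(X)

print(X.shape, "->", X_unrolled.shape)   # (1000, 3) -> (1000, 2)
```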
Latent semantic analysis is a topic modeling and dimensionality reduction technique that you can use to work with text data.
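Here is a small sketch of latent semantic analysis, assuming scikit-learn, where LSA amounts to truncated SVD over a TF-IDF matrix; the documents are toy examples:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "traffic is heavy on friday evenings",
    "friday evening traffic in the city is terrible",
    "temperatures drop in winter",
    "winter and fall bring cooler temperatures",
]

# TF-IDF gives one dimension per word; LSA (truncated SVD) compresses
# that sparse space into a few latent topics.
tfidf = TfidfVectorizer().fit_transform(docs)
lsa = TruncatedSVD(n_components=2, random_state=0)
topics = lsa.fit_transform(tfidf)

print(tfidf.shape, "->", topics.shape)   # (4, vocab size) -> (4, 2)
```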
If you're working with images, autoencoders can find efficient, lower-dimensionality representations for your images.
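Finally, a minimal autoencoder sketch, assuming TensorFlow/Keras is available and using random placeholder pixels in place of real images:

```python
import numpy as np
from tensorflow import keras

# Placeholder "images": 256 flattened 28x28 inputs with values in [0, 1].
x = np.random.rand(256, 784).astype("float32")

# The autoencoder squeezes each image through a 32-value bottleneck
# and learns to reconstruct the original 784 values from it.
autoencoder = keras.Sequential([
    keras.Input(shape=(784,)),
    keras.layers.Dense(32, activation="relu"),      # encoder / bottleneck
    keras.layers.Dense(784, activation="sigmoid"),  # decoder
])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.fit(x, x, epochs=5, batch_size=32, verbose=0)

# The bottleneck activations are the compressed representation.
encoder = keras.Model(autoencoder.inputs, autoencoder.layers[0].output)
print(encoder.predict(x, verbose=0).shape)   # (256, 32)
```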