1
00:00:00,470 --> 00:00:02,910
In the last video, you saw
2
00:00:02,910 --> 00:00:04,695
what is unsupervised learning,
3
00:00:04,695 --> 00:00:08,025
and one type of unsupervised
learning called clustering.
4
00:00:08,025 --> 00:00:11,220
Let's give a slightly
more formal definition
5
00:00:11,220 --> 00:00:12,645
of unsupervised learning
6
00:00:12,645 --> 00:00:14,160
and take a quick look at
7
00:00:14,160 --> 00:00:15,405
some other types of
8
00:00:15,405 --> 00:00:17,640
unsupervised learning
other than clustering.
9
00:00:17,640 --> 00:00:19,365
Whereas in supervised learning,
10
00:00:19,365 --> 00:00:21,915
the data comes with
both inputs x and
11
00:00:21,915 --> 00:00:25,230
input labels y, in
unsupervised learning,
12
00:00:25,230 --> 00:00:27,390
the data comes only with inputs
13
00:00:27,390 --> 00:00:29,850
x but not output labels y,
14
00:00:29,850 --> 00:00:32,085
and the algorithm has to find
15
00:00:32,085 --> 00:00:34,140
some structure or some pattern
16
00:00:34,140 --> 00:00:36,600
or something interesting
in the data.
17
00:00:36,600 --> 00:00:39,440
We're seeing just one example of
18
00:00:39,440 --> 00:00:43,100
unsupervised learning called
a clustering algorithm,
19
00:00:43,100 --> 00:00:45,805
which groups similar
data points together.
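To make "groups similar data points together" concrete, here is a minimal sketch of clustering with scikit-learn's KMeans. The data and parameter values are made up for illustration and are not from the lecture:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy 2-D points: two loose groups (made-up data for illustration)
X = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
              [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]])

# Ask for two clusters; KMeans assigns nearby points to the same group
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # points in the same group share a label
```

Note that no labels y are given to the algorithm; it discovers the two groups from the inputs X alone.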
20
00:00:45,805 --> 00:00:48,740
In this specialization,
you'll learn about
21
00:00:48,740 --> 00:00:50,480
clustering as well as
22
00:00:50,480 --> 00:00:53,390
two other types of
unsupervised learning.
23
00:00:53,390 --> 00:00:56,555
One is called anomaly detection,
24
00:00:56,555 --> 00:00:59,870
which is used to
detect unusual events.
25
00:00:59,870 --> 00:01:02,570
This turns out to be
really important for
26
00:01:02,570 --> 00:01:05,270
fraud detection in
the financial system,
27
00:01:05,270 --> 00:01:08,690
where unusual events,
unusual transactions could
28
00:01:08,690 --> 00:01:13,030
be signs of fraud and for
many other applications.
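As a rough sketch of the anomaly detection idea, one simple approach is to flag points that lie far from the rest of the data. The transaction amounts below are invented for illustration; production fraud systems use more sophisticated detectors, but the underlying idea is the same:

```python
import numpy as np

# Made-up transaction amounts; the last one is unusually large
amounts = np.array([12.0, 9.5, 11.2, 10.8, 9.9, 250.0])

# Flag points far from the mean (a simple z-score rule)
z = (amounts - amounts.mean()) / amounts.std()
anomalies = np.abs(z) > 2
print(anomalies)  # only the 250.0 transaction is flagged
```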
29
00:01:13,030 --> 00:01:17,105
You also learn about
dimensionality reduction.
30
00:01:17,105 --> 00:01:18,425
This lets you take
31
00:01:18,425 --> 00:01:21,860
a big dataset and almost
magically compress it
32
00:01:21,860 --> 00:01:24,080
to a much smaller dataset while
33
00:01:24,080 --> 00:01:26,935
losing as little
information as possible.
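A small sketch of dimensionality reduction using PCA from scikit-learn. The synthetic dataset below is constructed (for illustration only) so that its three features really vary along one direction, which is what lets it compress well:

```python
import numpy as np
from sklearn.decomposition import PCA

# Made-up 3-D data that mostly varies along a single direction
rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = np.hstack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(100, 3))

# Compress 3 features down to 1 while keeping most of the variance
pca = PCA(n_components=1)
X_small = pca.fit_transform(X)
print(X_small.shape)                     # (100, 1)
print(pca.explained_variance_ratio_[0])  # close to 1.0: little info lost
```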
34
00:01:26,935 --> 00:01:29,360
In case anomaly detection and
35
00:01:29,360 --> 00:01:31,370
dimensionality
reduction don't seem
36
00:01:31,370 --> 00:01:33,020
to make too much
sense to you yet,
37
00:01:33,020 --> 00:01:34,670
don't worry about
it. We'll get to
38
00:01:34,670 --> 00:01:37,120
this later in the
specialization.
39
00:01:37,120 --> 00:01:39,050
Now, I'd like to ask you
40
00:01:39,050 --> 00:01:43,040
another question to help you
check your understanding,
41
00:01:43,040 --> 00:01:45,110
and no pressure, if
you don't get it
42
00:01:45,110 --> 00:01:47,620
right on the first
try, that's totally fine.
43
00:01:47,620 --> 00:01:50,780
Please select any
of the following
44
00:01:50,780 --> 00:01:54,200
that you think are examples
of unsupervised learning.
45
00:01:54,200 --> 00:01:57,260
Two are unsupervised
learning examples and two
46
00:01:57,260 --> 00:02:01,650
are supervised learning
examples. Please take a look.
47
00:02:02,930 --> 00:02:06,670
Maybe you remember the
spam filtering problem.
48
00:02:06,670 --> 00:02:09,085
If you have labeled data,
49
00:02:09,085 --> 00:02:11,800
labeled as spam or
non-spam email,
50
00:02:11,800 --> 00:02:15,275
you can treat this as a
supervised learning problem.
51
00:02:15,275 --> 00:02:18,175
The second example, the
news story example.
52
00:02:18,175 --> 00:02:20,620
That's exactly the
Google News
53
00:02:20,620 --> 00:02:23,690
example that you
saw in the last video.
54
00:02:23,690 --> 00:02:26,200
You can approach that using
55
00:02:26,200 --> 00:02:29,515
a clustering algorithm to
group news articles together.
56
00:02:29,515 --> 00:02:32,465
That will use
unsupervised learning.
57
00:02:32,465 --> 00:02:35,020
The market segmentation example
58
00:02:35,020 --> 00:02:36,810
that I talked about a
little bit earlier.
59
00:02:36,810 --> 00:02:38,350
You can do that as
60
00:02:38,350 --> 00:02:41,680
an unsupervised learning
problem as well because you can
61
00:02:41,680 --> 00:02:44,195
give your algorithm
some data and ask it
62
00:02:44,195 --> 00:02:47,600
to discover market
segments automatically.
63
00:02:47,600 --> 00:02:51,620
The final example on
diagnosing diabetes.
64
00:02:51,620 --> 00:02:53,420
Well, actually that's a lot like
65
00:02:53,420 --> 00:02:55,100
our breast cancer example
66
00:02:55,100 --> 00:02:57,775
from the supervised
learning videos.
67
00:02:57,775 --> 00:03:00,980
Only instead of benign
or malignant tumors,
68
00:03:00,980 --> 00:03:04,205
we have diabetes
or not diabetes.
69
00:03:04,205 --> 00:03:07,130
You can approach this as a
supervised learning problem,
70
00:03:07,130 --> 00:03:08,330
just like we did for the
71
00:03:08,330 --> 00:03:10,990
breast tumor
classification problem.
72
00:03:10,990 --> 00:03:14,210
Even though in the last video,
73
00:03:14,210 --> 00:03:17,660
we've talked mainly about
clustering, in later videos,
74
00:03:17,660 --> 00:03:20,720
in this specialization, we'll
dive much more deeply into
75
00:03:20,720 --> 00:03:25,135
anomaly detection and
dimensionality reduction as well.
76
00:03:25,135 --> 00:03:27,885
That's unsupervised learning.
77
00:03:27,885 --> 00:03:29,630
Before we wrap up this section,
78
00:03:29,630 --> 00:03:30,890
I want to share
with you something
79
00:03:30,890 --> 00:03:32,540
that I find really exciting,
80
00:03:32,540 --> 00:03:34,340
and useful, which is the use of
81
00:03:34,340 --> 00:03:36,545
Jupyter Notebooks in
machine learning.
82
00:03:36,545 --> 00:03:39,510
Let's take a look at
that in the next video.