1
00:00:05,420 --> 00:00:05,990
In this lesson,
2
00:00:05,990 --> 00:00:10,090
I wanted to take a little bit of time to speak about correlation analysis.
3
00:00:10,100 --> 00:00:15,380
So correlation analysis is when you've got two variables and you want to see if there is a correlation
4
00:00:15,380 --> 00:00:16,780
between the two variables.
5
00:00:16,790 --> 00:00:20,230
So if one variable goes up, does the other variable also go up?
6
00:00:20,240 --> 00:00:24,890
Sometimes you can have a negative correlation as well, where one variable goes up, but the
7
00:00:24,890 --> 00:00:26,590
other variable then goes down.
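As a rough illustration of the idea above, here is a minimal sketch that measures how strongly two variables move together using the Pearson correlation coefficient. The numbers are made up for illustration and are not the lesson's data set, and the function name `pearson` is my own. A result near +1 means the two go up together, and a result near -1 means one goes down as the other goes up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative numbers (assumption, not from the lesson).
sales = [100, 200, 300, 400, 500]
profit_up = [12, 19, 33, 41, 48]    # rises as sales rise
profit_down = [50, 41, 30, 22, 9]   # falls as sales rise

r_up = pearson(sales, profit_up)      # close to +1: positive correlation
r_down = pearson(sales, profit_down)  # close to -1: negative correlation
print(round(r_up, 3), round(r_down, 3))
```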
8
00:00:26,600 --> 00:00:30,920
So we're going to have a look at our scatter chart and just see some of the functionality that we've
9
00:00:30,920 --> 00:00:33,920
got available to us when it comes to analyzing your data.
10
00:00:34,100 --> 00:00:38,390
Again, you might find this very useful when you're working with your own data and you want
11
00:00:38,390 --> 00:00:41,810
to take two variables and see if there is actually a correlation between the two.
12
00:00:41,840 --> 00:00:45,770
Now, please note that this training data is actually quite highly correlated.
13
00:00:45,770 --> 00:00:48,260
So you're going to see that there is a very strong correlation.
14
00:00:48,260 --> 00:00:52,520
But I just wanted to really show you how the scatterplot can work and also some of the benefits you
15
00:00:52,520 --> 00:00:54,470
can get from this type of analysis.
16
00:00:54,770 --> 00:00:59,720
So let's move on and we're going to, first of all, go across to the scatter chart and we're going to
17
00:00:59,720 --> 00:01:02,520
pick this visualization like we've done previously.
18
00:01:02,540 --> 00:01:06,620
What we're going to do is we're just going to select it, make it nice and big, and you see that you
19
00:01:06,620 --> 00:01:11,600
get your values, X axis and Y axis as options that you might want to work with.
20
00:01:11,640 --> 00:01:15,650
Now, in this case, what we're going to do is we're going to take our X axis and we're going to say
21
00:01:15,650 --> 00:01:18,500
we want to see our sales values on there.
22
00:01:18,830 --> 00:01:22,520
Now, what you can see is it's taken one value, which is my total sum of sales.
23
00:01:22,790 --> 00:01:26,270
Then in the Y axis, we want to say we want to look at our profit.
24
00:01:26,690 --> 00:01:32,000
And now what it's done is it's plotted one data point: it's taken the total sales and the total
25
00:01:32,000 --> 00:01:35,720
profit, and it's created that data point for everything that is in the data set.
26
00:01:36,020 --> 00:01:39,050
So really what we want to do from here is we want to break this out a bit.
27
00:01:39,050 --> 00:01:41,470
We want to be able to see this from different aspects.
28
00:01:41,480 --> 00:01:43,910
So what we're going to look at is our product category.
29
00:01:43,910 --> 00:01:46,970
So we're going to take our product category and drop it into our values.
30
00:01:46,970 --> 00:01:51,650
So now you can see that it's actually created a data point for each of my product categories that we
31
00:01:51,650 --> 00:01:52,070
have.
32
00:01:52,070 --> 00:01:57,560
And you can see now it's taken the sum of the sales and the sum of the profit, and it's basically now
33
00:01:57,560 --> 00:02:00,050
plotted those into this graph.
34
00:02:00,500 --> 00:02:05,300
And what you can see from this is you actually do have a sort of a 45 degree line going upwards, which
35
00:02:05,300 --> 00:02:10,520
shows the strong correlation, because what it means is that as your sales go up, so does your
36
00:02:10,520 --> 00:02:11,240
profit.
37
00:02:11,360 --> 00:02:14,210
So that's basically what we're seeing with this correlation.
38
00:02:14,210 --> 00:02:19,250
So over here we've got quite high sales and we can also see that there is quite high profit.
39
00:02:19,730 --> 00:02:23,960
And that's really what we're looking at from the scatter chart is to understand is there a correlation
40
00:02:23,960 --> 00:02:25,700
between these two variables?
41
00:02:25,700 --> 00:02:27,020
And quite clearly there is.
42
00:02:27,320 --> 00:02:28,820
We could look at this in more detail.
43
00:02:28,820 --> 00:02:33,560
For example, you could take more data points so we could say, let's have a look at our product subcategory.
44
00:02:33,770 --> 00:02:37,970
So you can see now we've got a product subcategory and you can see again that there's a very strong correlation
45
00:02:37,970 --> 00:02:38,990
between these two.
46
00:02:39,290 --> 00:02:44,480
Now, if we had values that were set over here, then these are probably items that you might want to
47
00:02:44,480 --> 00:02:49,250
look at in more detail because these are items where you've got a large amount of sales, but you've got
48
00:02:49,250 --> 00:02:50,390
very little profit.
49
00:02:50,570 --> 00:02:55,550
However, if you have items that are over here, this would be good for you because the sales value
50
00:02:55,550 --> 00:02:57,830
is small, but the profit value is quite high.
51
00:02:57,830 --> 00:03:02,180
So you're obviously making quite high profit from low sales values there.
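The two kinds of off-trend items described above can be sketched as a simple profit-margin check. This is only an illustration under stated assumptions: the item names, figures, thresholds and the function name `flag_by_margin` are all hypothetical and are not taken from the lesson's data.

```python
def flag_by_margin(items, low=0.05, high=0.40):
    """Split items into low-margin and high-margin groups.

    items: list of (name, sales, profit) tuples.
    low, high: hypothetical margin thresholds, chosen only for illustration.
    """
    low_margin, high_margin = [], []
    for name, sales, profit in items:
        if sales <= 0:
            continue  # margin is undefined without positive sales
        margin = profit / sales
        if margin < low:
            low_margin.append(name)    # lots of sales, very little profit
        elif margin > high:
            high_margin.append(name)   # small sales, comparatively high profit
    return low_margin, high_margin


# Made-up subcategories and figures (assumption, not from the lesson).
items = [
    ("Tables", 9000, 150),   # margin of about 0.017
    ("Paper", 1200, 600),    # margin of 0.5
    ("Phones", 8000, 1600),  # margin of 0.2, fits the trend
]
worth_a_look, doing_well = flag_by_margin(items)
print(worth_a_look, doing_well)
```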
52
00:03:02,910 --> 00:03:08,530
So again, you can see that we've got these dots now, each one representing a different product subcategory.
53
00:03:08,550 --> 00:03:12,780
Now we know that we've got quite a lot of product names and you could actually pop this in there as
54
00:03:12,780 --> 00:03:13,260
well.
55
00:03:13,290 --> 00:03:17,580
Because remember, what we're trying to find here is we're just looking for anomalies or we're looking
56
00:03:17,580 --> 00:03:19,440
for things that don't fit into the trend.
57
00:03:19,470 --> 00:03:21,770
And again, most of these are actually fitting into the trend.
58
00:03:21,780 --> 00:03:26,730
Again, we can see that there's a 45 degree line basically showing the options here.
59
00:03:27,030 --> 00:03:29,540
That's the first part that we can look at from here.
60
00:03:29,550 --> 00:03:32,310
There are some formatting options that you could have a look at here.
61
00:03:32,310 --> 00:03:38,790
Very much kind of like what we've seen previously, and you've got your x axis, your y axis options
62
00:03:38,790 --> 00:03:41,100
that are pretty much exactly the same.
63
00:03:41,100 --> 00:03:42,380
You've got grid lines.
64
00:03:42,390 --> 00:03:43,860
We've also got markers.
65
00:03:43,980 --> 00:03:49,110
If you want to put a marker onto your options, it probably wouldn't make too much sense here.
66
00:03:49,860 --> 00:03:51,210
This category label
67
00:03:51,210 --> 00:03:52,770
I want to leave until a little bit later.
68
00:03:52,770 --> 00:03:54,720
We're going to look at something called the bubble plot.
69
00:03:54,720 --> 00:03:59,610
And I'm going to show you how you can change the settings just to allow us to see the correct categories
70
00:03:59,610 --> 00:04:00,240
on this.
71
00:04:00,720 --> 00:04:04,980
In terms of general as well, you'll see that there's not too much that's different: properties, title, effects,
72
00:04:04,980 --> 00:04:06,420
the ones that we're used to.
73
00:04:06,750 --> 00:04:12,660
However, if we go across to our analytics, you will see that there are some analytics that we can see
74
00:04:12,660 --> 00:04:13,230
from here.
75
00:04:13,230 --> 00:04:18,329
So we can pick a trend line and this can be very useful to see what that actual trend is.
76
00:04:18,329 --> 00:04:23,160
And as I was saying, we've got a 45 degree line which is showing me that as the sales get higher, so
77
00:04:23,160 --> 00:04:24,060
does the profit.
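To make the trend line idea concrete, here is a minimal sketch that fits a straight line through sales and profit points. It assumes an ordinary least-squares fit, which may not be how the charting tool computes its own trend line, and the numbers and the function name `fit_trend_line` are made up for illustration. A positive slope means profit rises as sales rise. How steep the line looks on screen also depends on how the two axes are scaled.

```python
def fit_trend_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative numbers (assumption, not from the lesson).
sales = [100, 200, 300, 400]
profit = [10, 20, 30, 40]

slope, intercept = fit_trend_line(sales, profit)
print(slope > 0)  # a positive slope: profit goes up as sales go up
```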
78
00:04:24,560 --> 00:04:25,680
Now this can be quite useful.
79
00:04:25,680 --> 00:04:32,550
For example, say that you wanted to pair this with a slicer. Now, because this
80
00:04:32,550 --> 00:04:38,640
is my product names, what I could do is I could actually use my product category and I could see how
81
00:04:38,640 --> 00:04:41,700
all my different product names and my product categories are performing.
82
00:04:41,700 --> 00:04:47,310
So if I drop that in there, I could see if there's any noticeable difference in the trends between
83
00:04:47,310 --> 00:04:48,750
these different product categories.
84
00:04:48,750 --> 00:04:53,940
As you can see in this data, pretty much they are all pretty good: as sales go up, so does profit
85
00:04:53,940 --> 00:04:54,960
go up as well.
86
00:04:55,380 --> 00:04:59,790
But that's just one of the options that you could use, is to pair it with a slicer and to be able to
87
00:04:59,790 --> 00:05:00,960
use your trend line.
88
00:05:01,140 --> 00:05:03,240
You will see that there's other options as well.
89
00:05:03,390 --> 00:05:07,350
You could turn on your symmetry shading, which would allow you to play with that.
90
00:05:07,770 --> 00:05:12,870
You can see your upper shading and lower shading options, and ratio lines as well.
91
00:05:12,880 --> 00:05:17,130
You might want to have a ratio line in there that shows you the ratio that is on there.
92
00:05:17,400 --> 00:05:21,030
So as I say, these are options for you to be able to play with.
93
00:05:21,150 --> 00:05:25,710
Now what I want to move to is just another option, which is called a bubble plot.
94
00:05:26,650 --> 00:05:28,980
So I'm going to remove the trend line just to make this simpler.
95
00:05:28,990 --> 00:05:33,640
And what we're going to do is we're going to go back to our product category for a bubble plot,
96
00:05:33,940 --> 00:05:38,230
because we're going to use the size of the bubbles to represent something.
97
00:05:38,470 --> 00:05:41,350
So basically, in this case, we're going to use order quantity.
98
00:05:41,650 --> 00:05:44,130
So let's pop our order quantity into our size.
99
00:05:44,140 --> 00:05:49,000
And now you're going to see that the size will change according to the order quantity.
100
00:05:49,000 --> 00:05:50,950
So basically we're using a third variable.
101
00:05:51,190 --> 00:05:55,490
But what we can do is we can go to our formatting and you can also turn your category label on here.
102
00:05:55,510 --> 00:06:01,810
So basically now you can see the name of the product category and you can basically now see the different
103
00:06:01,810 --> 00:06:07,030
product categories with the labels on, and also the bubble size, which will show you your order
104
00:06:07,030 --> 00:06:07,540
quantity.
105
00:06:07,900 --> 00:06:09,760
So this is called the bubble plot.
106
00:06:10,180 --> 00:06:16,210
What you do get is also an option that you can actually show a visualization of this changing over time.
107
00:06:16,960 --> 00:06:17,100
Okay.
108
00:06:17,140 --> 00:06:21,880
So one of the options that we've got is we can take our order date and we can actually drop it into
109
00:06:21,880 --> 00:06:23,020
our play axis.
110
00:06:23,200 --> 00:06:27,940
So when we drop it in here and press play, it will actually show you, day by day,
111
00:06:27,970 --> 00:06:31,540
what is actually happening with our products.
112
00:06:36,280 --> 00:06:39,850
Now, another option you could try with this is to use the drop-down,
113
00:06:43,260 --> 00:06:48,570
go to our date hierarchy, and you'll see that it can be changed to year.
114
00:06:48,600 --> 00:06:50,430
So this is just a different view.
115
00:06:50,670 --> 00:06:56,160
And with this, now we press play and then you can sort of see year by year what the changes have
116
00:06:56,160 --> 00:06:56,640
been.
117
00:06:57,520 --> 00:06:59,380
So as I said, this is just a different view.
118
00:06:59,410 --> 00:07:02,900
It just allows you to do a bit of correlation analysis with your data.
119
00:07:02,920 --> 00:07:07,810
You can look at different values in your X and Y axis and see how they're correlated.
120
00:07:07,810 --> 00:07:12,490
And then there are the tools such as the play axis and the size, where you can actually have a look at
121
00:07:12,490 --> 00:07:14,470
your data from a different perspective.
122
00:07:15,050 --> 00:07:15,250
Okay.
123
00:07:15,250 --> 00:07:16,950
We're going to conclude the lesson there.
124
00:07:17,110 --> 00:07:18,070
We'll see you in the next one.