1
00:00:01,480 --> 00:00:03,880
In this lesson, we're going
to take a look at the concept
2
00:00:03,880 --> 00:00:05,890
of data aggregation.
3
00:00:05,890 --> 00:00:08,830
That is taking large
amounts of data
4
00:00:08,830 --> 00:00:11,650
and using some
computational method
5
00:00:11,650 --> 00:00:15,460
to arrive at a single value
from that group of data.
6
00:00:15,460 --> 00:00:19,070
And to do this, we're going
to use multirow functions.
7
00:00:19,070 --> 00:00:22,300
And multirow functions will
take a number of values
8
00:00:22,300 --> 00:00:25,960
and return only one
value in return.
9
00:00:25,960 --> 00:00:28,060
So that would be something
like, for instance,
10
00:00:28,060 --> 00:00:31,490
and we'll look at how to do
this, how to take an average.
11
00:00:31,490 --> 00:00:34,300
So an average takes a
large number of values
12
00:00:34,300 --> 00:00:36,220
and returns one value.
13
00:00:36,220 --> 00:00:38,840
And that's what
multirow functions do.
14
00:00:38,840 --> 00:00:42,910
And those are an important part
of data aggregation in SQL.
15
00:00:42,910 --> 00:00:46,730
Let's connect to our database
and take a look at this.
16
00:00:46,730 --> 00:00:48,730
So we've got our connection.
17
00:00:48,730 --> 00:00:51,280
Let's start with
probably the simplest
18
00:00:51,280 --> 00:00:54,880
of the multirow functions.
19
00:00:54,880 --> 00:00:58,780
Let's say select star from EMP.
20
00:00:58,780 --> 00:01:01,660
And we have our rows returned.
21
00:01:01,660 --> 00:01:04,270
Now, SQL Developer
gives us the benefit
22
00:01:04,270 --> 00:01:07,900
of these row numbers that
tell us how many rows we have.
23
00:01:07,900 --> 00:01:10,270
But if we didn't have
that or we wanted
24
00:01:10,270 --> 00:01:13,600
to know the number of rows
that met a certain condition,
25
00:01:13,600 --> 00:01:16,650
we could use the count
multirow function.
26
00:01:20,020 --> 00:01:24,210
So here we're saying count
star, star meaning all the rows
27
00:01:24,210 --> 00:01:26,270
from the EMP table.
28
00:01:26,270 --> 00:01:29,670
We might change this
to dept, count star
29
00:01:29,670 --> 00:01:31,500
from the dept table--
30
00:01:31,500 --> 00:01:33,910
that has four rows--
31
00:01:33,910 --> 00:01:39,400
or the bonus table,
which has only one.
32
00:01:39,400 --> 00:01:42,910
So count star is going
to allow us to count
33
00:01:42,910 --> 00:01:46,100
the number of rows in a table.
34
00:01:46,100 --> 00:01:51,210
We could also do this
with a limiting condition,
35
00:01:51,210 --> 00:01:54,960
where job equals clerk.
36
00:01:54,960 --> 00:02:00,310
So how many people in the
EMP table have the job clerk?
37
00:02:00,310 --> 00:02:02,120
There's four.
38
00:02:02,120 --> 00:02:05,960
How many are managers?
39
00:02:05,960 --> 00:02:06,930
Three.
40
00:02:06,930 --> 00:02:08,970
So count is a
multirow function that
41
00:02:08,970 --> 00:02:13,560
takes in all of that data in
the column, the job column,
42
00:02:13,560 --> 00:02:17,300
and returns a count
based on that.
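The count queries described here can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the Oracle database in the lesson, and the rows in ROWS are assumed sample data, since the transcript does not show the table contents.

```python
import sqlite3

# Assumed sample rows (ename, job, sal, deptno); the lesson's real data is not shown.
ROWS = [
    ("SMITH", "CLERK", 800, 20), ("ALLEN", "SALESMAN", 1600, 30),
    ("WARD", "SALESMAN", 1250, 30), ("JONES", "MANAGER", 2975, 20),
    ("MARTIN", "SALESMAN", 1250, 30), ("BLAKE", "MANAGER", 2850, 30),
    ("CLARK", "MANAGER", 2450, 10), ("SCOTT", "ANALYST", 3000, 20),
    ("KING", "PRESIDENT", 5000, 10), ("TURNER", "SALESMAN", 1500, 30),
    ("ADAMS", "CLERK", 1100, 20), ("JAMES", "CLERK", 950, 30),
    ("FORD", "ANALYST", 3000, 20), ("MILLER", "CLERK", 1300, 10),
]

# In-memory SQLite database standing in for the Oracle connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, job TEXT, sal INTEGER, deptno INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", ROWS)

# COUNT(*) counts every row in the table.
total = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]

# A WHERE clause limits the count to rows that meet the condition.
clerks = conn.execute("SELECT COUNT(*) FROM emp WHERE job = 'CLERK'").fetchone()[0]
managers = conn.execute("SELECT COUNT(*) FROM emp WHERE job = 'MANAGER'").fetchone()[0]

print(total, clerks, managers)
```

With these assumed rows, the clerk and manager counts come out to four and three, which matches the numbers given in the lesson.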
43
00:02:17,300 --> 00:02:19,470
Let's look at another
multirow function,
44
00:02:19,470 --> 00:02:22,200
which we mentioned
earlier, average.
45
00:02:24,970 --> 00:02:31,580
So sal being the salary column,
and we can take a look at that,
46
00:02:31,580 --> 00:02:33,670
here's the list of
the salaries we have.
47
00:02:33,670 --> 00:02:37,160
Well, what's the average
salary in that list?
48
00:02:37,160 --> 00:02:41,530
Well, we use the average
function to calculate that.
49
00:02:41,530 --> 00:02:45,810
And notice that it brings
back a long decimal.
50
00:02:45,810 --> 00:02:49,530
Here's where we could bring in a
single-row function, round,
51
00:02:49,530 --> 00:02:53,840
to round it to maybe two decimal places,
just to shorten it a little bit.
52
00:02:56,440 --> 00:03:00,010
Notice this brings in the
idea of a nested function.
53
00:03:00,010 --> 00:03:03,940
So here we have the
average function.
54
00:03:03,940 --> 00:03:07,710
And it's nested inside
the round function.
55
00:03:07,710 --> 00:03:09,480
And we can do this
because average
56
00:03:09,480 --> 00:03:11,670
will bring back a single row.
57
00:03:11,670 --> 00:03:14,160
And round is a
single-row function.
58
00:03:14,160 --> 00:03:19,820
So this calculates the average
salary in our EMP table.
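A minimal sketch of the nested call, again using Python's sqlite3 module in place of Oracle and an assumed set of sample rows. The choice of two decimal places is an assumption about what the instructor typed.

```python
import sqlite3

# Assumed sample rows (ename, job, sal, deptno); the lesson's real data is not shown.
ROWS = [
    ("SMITH", "CLERK", 800, 20), ("ALLEN", "SALESMAN", 1600, 30),
    ("WARD", "SALESMAN", 1250, 30), ("JONES", "MANAGER", 2975, 20),
    ("MARTIN", "SALESMAN", 1250, 30), ("BLAKE", "MANAGER", 2850, 30),
    ("CLARK", "MANAGER", 2450, 10), ("SCOTT", "ANALYST", 3000, 20),
    ("KING", "PRESIDENT", 5000, 10), ("TURNER", "SALESMAN", 1500, 30),
    ("ADAMS", "CLERK", 1100, 20), ("JAMES", "CLERK", 950, 30),
    ("FORD", "ANALYST", 3000, 20), ("MILLER", "CLERK", 1300, 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, job TEXT, sal INTEGER, deptno INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", ROWS)

# AVG is the multirow function: many salaries in, one value out.
raw_avg = conn.execute("SELECT AVG(sal) FROM emp").fetchone()[0]

# ROUND is the single-row function wrapped around it; the inner AVG
# runs first and hands its single result to ROUND.
rounded_avg = conn.execute("SELECT ROUND(AVG(sal), 2) FROM emp").fetchone()[0]

print(raw_avg, rounded_avg)
```

The nesting works because the inner function produces exactly one value, which is all the outer single-row function needs.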
59
00:03:19,820 --> 00:03:22,060
Let's take a look at
another function that
60
00:03:22,060 --> 00:03:24,160
works against the EMP table.
61
00:03:24,160 --> 00:03:27,370
We'll look at sal again.
62
00:03:27,370 --> 00:03:30,040
And we'll use the min function.
63
00:03:30,040 --> 00:03:32,830
So min is a multirow
function that
64
00:03:32,830 --> 00:03:35,380
looks at all the values
in the sal column
65
00:03:35,380 --> 00:03:38,150
and brings back
the minimum value.
66
00:03:38,150 --> 00:03:40,510
And so if we were to
look at all the data,
67
00:03:40,510 --> 00:03:45,720
we'll see, in fact, that
800 is the minimum salary.
68
00:03:45,720 --> 00:03:51,790
We can also use the max function
to find the largest value, that
69
00:03:51,790 --> 00:03:53,930
being 5,000.
70
00:03:53,930 --> 00:03:58,890
And again, there's
the max value.
71
00:03:58,890 --> 00:04:02,010
What if we wanted to
know the total salary
72
00:04:02,010 --> 00:04:04,900
for all of our employees?
73
00:04:04,900 --> 00:04:06,820
Then we would use
the sum function.
74
00:04:10,470 --> 00:04:16,130
Again, a multirow function takes
many values and returns one.
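The min, max, and sum queries can be sketched together. As before, this is an illustration with Python's sqlite3 module and assumed sample rows, not the lesson's actual Oracle session.

```python
import sqlite3

# Assumed sample rows (ename, job, sal, deptno); the lesson's real data is not shown.
ROWS = [
    ("SMITH", "CLERK", 800, 20), ("ALLEN", "SALESMAN", 1600, 30),
    ("WARD", "SALESMAN", 1250, 30), ("JONES", "MANAGER", 2975, 20),
    ("MARTIN", "SALESMAN", 1250, 30), ("BLAKE", "MANAGER", 2850, 30),
    ("CLARK", "MANAGER", 2450, 10), ("SCOTT", "ANALYST", 3000, 20),
    ("KING", "PRESIDENT", 5000, 10), ("TURNER", "SALESMAN", 1500, 30),
    ("ADAMS", "CLERK", 1100, 20), ("JAMES", "CLERK", 950, 30),
    ("FORD", "ANALYST", 3000, 20), ("MILLER", "CLERK", 1300, 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, job TEXT, sal INTEGER, deptno INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", ROWS)

# Several multirow functions can share one SELECT; each collapses
# the whole sal column into a single value.
lowest, highest, total = conn.execute(
    "SELECT MIN(sal), MAX(sal), SUM(sal) FROM emp"
).fetchone()

print(lowest, highest, total)
```

The minimum of 800 and maximum of 5,000 agree with the lesson; the total depends on the assumed rows.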
75
00:04:16,130 --> 00:04:17,590
So we're starting
to see how we can
76
00:04:17,590 --> 00:04:20,830
use data aggregation to
answer some questions that we
77
00:04:20,830 --> 00:04:21,970
have about the database.
78
00:04:21,970 --> 00:04:23,140
And these are simple.
79
00:04:23,140 --> 00:04:26,290
But they do show us how we
can use the SQL programming
80
00:04:26,290 --> 00:04:29,110
language to answer questions.
81
00:04:29,110 --> 00:04:31,900
Well, let's take this a
little bit further and add
82
00:04:31,900 --> 00:04:34,690
a new clause to our statement.
83
00:04:38,860 --> 00:04:40,710
So here's our data.
84
00:04:40,710 --> 00:04:46,530
Let's say we want to know the
average salary of our managers.
85
00:04:50,720 --> 00:04:51,950
Here's the average function.
86
00:04:55,350 --> 00:04:57,180
Manager is a value in the job column.
87
00:05:01,240 --> 00:05:05,710
So this shows us how we can
use the group by clause.
88
00:05:05,710 --> 00:05:08,980
So group by is a clause
in our select statement
89
00:05:08,980 --> 00:05:12,730
that we can use to get
average salaries in this case.
90
00:05:12,730 --> 00:05:15,490
So we're finding the
average salary for each job.
91
00:05:15,490 --> 00:05:17,350
And we wanted to know manager.
92
00:05:17,350 --> 00:05:18,490
And that's here.
93
00:05:18,490 --> 00:05:21,090
And so we could apply a
round to that, if we wanted.
94
00:05:21,090 --> 00:05:23,830
But that shows us
the average salary
95
00:05:23,830 --> 00:05:27,790
for each one of the
jobs in our EMP table.
96
00:05:27,790 --> 00:05:33,820
So the group by clause is always
after the from clause and the
97
00:05:33,820 --> 00:05:35,080
where clause.
98
00:05:35,080 --> 00:05:38,320
So we're actually leveraging
a data aggregation
99
00:05:38,320 --> 00:05:43,730
function, a multirow function,
with the group by clause.
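Here is a sketch of the average salary per job, using Python's sqlite3 module and assumed sample rows. The rounding and the ORDER BY are additions for readable output and are not stated in the lesson.

```python
import sqlite3

# Assumed sample rows (ename, job, sal, deptno); the lesson's real data is not shown.
ROWS = [
    ("SMITH", "CLERK", 800, 20), ("ALLEN", "SALESMAN", 1600, 30),
    ("WARD", "SALESMAN", 1250, 30), ("JONES", "MANAGER", 2975, 20),
    ("MARTIN", "SALESMAN", 1250, 30), ("BLAKE", "MANAGER", 2850, 30),
    ("CLARK", "MANAGER", 2450, 10), ("SCOTT", "ANALYST", 3000, 20),
    ("KING", "PRESIDENT", 5000, 10), ("TURNER", "SALESMAN", 1500, 30),
    ("ADAMS", "CLERK", 1100, 20), ("JAMES", "CLERK", 950, 30),
    ("FORD", "ANALYST", 3000, 20), ("MILLER", "CLERK", 1300, 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, job TEXT, sal INTEGER, deptno INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", ROWS)

# GROUP BY job splits the rows into one group per job;
# AVG then returns one value for each group.
avg_by_job = dict(conn.execute(
    "SELECT job, ROUND(AVG(sal), 2) FROM emp GROUP BY job ORDER BY job"
).fetchall())

for job, avg_sal in avg_by_job.items():
    print(job, avg_sal)
```

Each job appears once in the result, with the manager row answering the original question.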
100
00:05:43,730 --> 00:05:48,260
Let's see what else, other kinds
of questions, we can ask here.
101
00:05:51,180 --> 00:05:55,040
Let's find the minimum
salary in a department.
102
00:06:02,650 --> 00:06:05,640
We list out our departments
and then the minimum salary
103
00:06:05,640 --> 00:06:08,570
in each one of the departments.
104
00:06:08,570 --> 00:06:12,680
Let's say, for instance, that
we forgot about the group
105
00:06:12,680 --> 00:06:14,240
by clause.
106
00:06:14,240 --> 00:06:17,890
And we think, well, what I want
to know is the minimum salary--
107
00:06:17,890 --> 00:06:21,810
I want to change it to
the maximum salary--
108
00:06:21,810 --> 00:06:25,940
in the EMP table for
different departments.
109
00:06:25,940 --> 00:06:28,630
Click this.
110
00:06:28,630 --> 00:06:33,460
And we'll get this error, not
a single-group group function.
111
00:06:33,460 --> 00:06:36,370
And what that means
is this function
112
00:06:36,370 --> 00:06:40,420
has no way of displaying
the data and this column
113
00:06:40,420 --> 00:06:43,900
because it's not been
directed how to group them.
114
00:06:43,900 --> 00:06:47,550
And that's why we need to
have the group by clause.
115
00:06:50,240 --> 00:06:51,580
We tell it how to group it.
116
00:06:51,580 --> 00:06:53,690
We didn't tell it, in
this case, to group it
117
00:06:53,690 --> 00:06:56,870
by the name, group
it by the job.
118
00:06:56,870 --> 00:06:58,430
We showed it the column.
119
00:06:58,430 --> 00:07:00,170
We said, well, we want
the deptno column.
120
00:07:00,170 --> 00:07:04,480
But we never gave the specific
direction on how to group it.
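A sketch of the corrected query, with the failing form kept as a string for comparison. It uses Python's sqlite3 module and assumed sample rows. One caveat: SQLite is more lenient than Oracle and does not reject the ungrouped form, so that query is shown but not executed here.

```python
import sqlite3

# Assumed sample rows (ename, job, sal, deptno); the lesson's real data is not shown.
ROWS = [
    ("SMITH", "CLERK", 800, 20), ("ALLEN", "SALESMAN", 1600, 30),
    ("WARD", "SALESMAN", 1250, 30), ("JONES", "MANAGER", 2975, 20),
    ("MARTIN", "SALESMAN", 1250, 30), ("BLAKE", "MANAGER", 2850, 30),
    ("CLARK", "MANAGER", 2450, 10), ("SCOTT", "ANALYST", 3000, 20),
    ("KING", "PRESIDENT", 5000, 10), ("TURNER", "SALESMAN", 1500, 30),
    ("ADAMS", "CLERK", 1100, 20), ("JAMES", "CLERK", 950, 30),
    ("FORD", "ANALYST", 3000, 20), ("MILLER", "CLERK", 1300, 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, job TEXT, sal INTEGER, deptno INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", ROWS)

# Missing GROUP BY: Oracle rejects this with "not a single-group group
# function" because deptno is neither aggregated nor grouped.
# Kept as text only; it is not executed in this sketch.
ungrouped = "SELECT deptno, MAX(sal) FROM emp"

# Corrected form: the plain column in the SELECT list also appears in GROUP BY.
grouped = "SELECT deptno, MAX(sal) FROM emp GROUP BY deptno ORDER BY deptno"
max_by_dept = conn.execute(grouped).fetchall()

print(max_by_dept)
```

The working rule is that any column selected alongside a multirow function must also be listed in the group by clause.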