1
00:00:00,720 --> 00:00:08,550
So in the last lecture we discussed that we wanted to create a list of top three customers with maximum
2
00:00:08,550 --> 00:00:10,860
orders from each state.
3
00:00:11,640 --> 00:00:14,970
Now, what should be our steps to perform this?
4
00:00:16,380 --> 00:00:17,060
Let's see.
5
00:00:17,070 --> 00:00:25,620
So first we will combine our customer tables and order tables so that we get the order detail of each
6
00:00:25,620 --> 00:00:29,580
customer, how much each customer is ordering.
7
00:00:30,540 --> 00:00:36,900
Then on this combined data, we will add row numbers using the row number function.
8
00:00:37,650 --> 00:00:47,040
And then once we have row number in front of each customer, we will filter only top three customers
9
00:00:47,040 --> 00:00:51,360
with a WHERE clause on this newly created row number.
10
00:00:51,810 --> 00:00:54,660
So these are the steps we want to perform.
11
00:00:55,080 --> 00:00:57,420
Now let's look at our data again.
12
00:00:58,590 --> 00:01:01,090
So this is a sample of our customer table.
13
00:01:01,110 --> 00:01:03,570
This is a sample of our sales table.
14
00:01:04,380 --> 00:01:11,460
Now, in sales table, we have multiple order IDs corresponding to customer IDs.
15
00:01:12,330 --> 00:01:15,950
So we have multiple rows with the same customer ID.
16
00:01:17,030 --> 00:01:26,030
We want to aggregate this data on customer level and get the total number of order IDs, total number
17
00:01:26,030 --> 00:01:29,360
of sales, total quantity, etc.
18
00:01:31,490 --> 00:01:33,950
Now let's go into PG.
19
00:01:35,950 --> 00:01:40,870
So let's first select the top ten rows of our customer table.
20
00:01:53,670 --> 00:01:54,870
Let's run this.
21
00:01:56,200 --> 00:01:59,960
So we have all this data, and in our final table
22
00:01:59,980 --> 00:02:07,150
we don't need segment, etc. So in our final table we will just be keeping customer ID, customer
23
00:02:07,150 --> 00:02:09,430
name and state.
24
00:02:12,240 --> 00:02:19,140
Now let's select the top ten rows from our sales table.
25
00:02:32,220 --> 00:02:35,400
So here we have our order line,
26
00:02:35,430 --> 00:02:39,060
order ID, order date, ship date, etc.
27
00:02:40,110 --> 00:02:45,330
And then we have sales quantity and discount and profit.
28
00:02:46,760 --> 00:02:56,720
So to get the total number of orders for each customer, we want to count the distinct order IDs each
29
00:02:56,720 --> 00:02:58,460
customer ID has.
30
00:02:58,760 --> 00:03:04,940
So for example, for first two rows, order ID is same and the customer ID is also same.
31
00:03:06,260 --> 00:03:08,620
So there is no point in counting
32
00:03:08,630 --> 00:03:15,770
this as two, since these two are the details of the same order and we want to calculate the total number
33
00:03:15,770 --> 00:03:16,250
of orders.
34
00:03:16,250 --> 00:03:19,430
So we'll use distinct with order IDs.
35
00:03:20,740 --> 00:03:23,620
Now let's combine these two tables.
36
00:03:28,520 --> 00:03:37,610
So suppose our first table is customer table and we want to select all the columns for now.
37
00:03:40,970 --> 00:03:48,050
Now, I will tell you what all we want to select from the order table after we write the next part of
38
00:03:48,050 --> 00:03:48,770
our query.
39
00:03:53,330 --> 00:03:53,990
From.
40
00:03:54,870 --> 00:03:57,420
Customer as a.
41
00:03:59,320 --> 00:04:05,710
Then we want to use the left join since we want all the data of our customer table and only limited
42
00:04:05,710 --> 00:04:07,480
data of our order table.
43
00:04:10,190 --> 00:04:10,820
So left.
44
00:04:10,820 --> 00:04:11,360
Join.
45
00:04:12,630 --> 00:04:21,360
And then instead of using the order table, I will do some aggregation on customer level on the order table.
46
00:04:21,510 --> 00:04:22,650
So we'll write.
47
00:04:24,470 --> 00:04:25,340
Select.
48
00:04:25,340 --> 00:04:27,200
I am starting a sub query.
49
00:04:28,110 --> 00:04:34,470
First, we want the customer ID since we want to match on customer ID, so.
50
00:04:35,410 --> 00:04:36,490
Customer ID.
51
00:04:37,890 --> 00:04:43,830
Then for each customer ID, I want the count of distinct order IDs.
52
00:04:43,890 --> 00:04:47,610
So I will write count, distinct,
53
00:04:50,940 --> 00:04:52,590
order ID.
54
00:04:56,680 --> 00:04:58,300
Now for this problem.
55
00:04:58,720 --> 00:05:06,700
This data is sufficient for us, but we will use the same data for our other window functions as well.
56
00:05:07,000 --> 00:05:15,220
So I am also importing sales, quantity and total profit from each customer into our combined table.
57
00:05:15,490 --> 00:05:24,760
These two columns are enough for our row number problem, but I am importing these new sales, quantity
58
00:05:24,760 --> 00:05:28,270
and discount for our future problem statements.
59
00:05:30,870 --> 00:05:35,160
So for sales, we can have total sales.
60
00:05:35,870 --> 00:05:38,270
So we can write sum of
61
00:05:39,920 --> 00:05:41,120
Sales.
62
00:05:42,370 --> 00:05:43,000
Comma.
63
00:05:43,000 --> 00:05:45,460
Similarly, sum of quantity.
64
00:05:50,540 --> 00:05:52,820
I will also rename this.
65
00:05:53,660 --> 00:05:56,090
as order_num.
66
00:05:57,810 --> 00:06:00,150
Sales: sum of sales as
67
00:06:02,160 --> 00:06:03,300
Sales.
68
00:06:03,930 --> 00:06:04,620
Total.
69
00:06:05,880 --> 00:06:08,160
Sum of quantity as
70
00:06:15,320 --> 00:06:16,670
quantity
71
00:06:17,090 --> 00:06:17,720
Total.
72
00:06:19,600 --> 00:06:20,800
And then.
73
00:06:24,170 --> 00:06:27,890
Sum of profit as profit total.
74
00:06:36,240 --> 00:06:41,100
Now let's look again what information we are getting from our order table.
75
00:06:41,140 --> 00:06:45,840
We are getting customer ID, count of distinct order IDs,
76
00:06:46,800 --> 00:06:53,880
Total sales for each customer, total number of quantities for each customer, and total profit from
77
00:06:53,880 --> 00:06:54,810
each customer.
78
00:06:55,890 --> 00:06:59,850
From where we want this data, we want this data from our sales table.
79
00:07:00,760 --> 00:07:07,630
And we want this data to be aggregated on customer ID, so we'll write.
80
00:07:10,400 --> 00:07:10,940
Group by
81
00:07:12,460 --> 00:07:13,040
customer
82
00:07:13,210 --> 00:07:13,840
ID.
83
00:07:16,260 --> 00:07:17,310
As B.
84
00:07:18,800 --> 00:07:20,390
So this is our first table.
85
00:07:20,420 --> 00:07:24,170
This is our second table. From the second table,
86
00:07:24,320 --> 00:07:26,600
We want these many variables.
87
00:07:26,600 --> 00:07:27,890
So we'll write B
88
00:07:29,870 --> 00:07:30,590
dot
89
00:07:31,510 --> 00:07:32,260
order_num.
90
00:07:36,750 --> 00:07:37,440
Comma,
91
00:07:38,760 --> 00:07:41,370
B dot sales total.
92
00:07:43,900 --> 00:07:47,740
Comma, B dot quantity total.
93
00:07:50,760 --> 00:07:52,610
And then B dot.
94
00:07:53,760 --> 00:07:54,140
Profit.
95
00:08:04,280 --> 00:08:06,200
So we have two tables.
96
00:08:06,200 --> 00:08:07,550
We are joining two tables.
97
00:08:08,240 --> 00:08:13,070
Let's just join this. We'll write ON A dot
98
00:08:13,850 --> 00:08:19,940
customer ID is equal to B dot customer ID.
99
00:08:24,160 --> 00:08:26,220
So now our query is ready.
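For readers following the text only, the query dictated above can be sketched roughly as follows. This is a minimal sketch, not the instructor's exact script: it runs the SQL through Python's built-in sqlite3 module rather than PostgreSQL so that it works without a server, and the column names (customer_id, order_id, sales_total and so on), the sample rows, and the final ORDER BY are assumptions made for illustration.

```python
import sqlite3

# In-memory database with tiny, made-up versions of the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id TEXT, customer_name TEXT, state TEXT);
CREATE TABLE sales (order_id TEXT, customer_id TEXT,
                    sales REAL, quantity INTEGER, profit REAL);
INSERT INTO customer VALUES ('C1', 'Ann', 'Alabama'), ('C2', 'Bob', 'Arizona');
-- Order O1 has two line items, so it must be counted only once.
INSERT INTO sales VALUES
  ('O1', 'C1', 10.0, 1, 2.0),
  ('O1', 'C1', 20.0, 2, 4.0),
  ('O2', 'C1', 30.0, 3, 6.0);
""")

# Customer table (a) left-joined to a per-customer aggregate of sales (b).
query = """
SELECT a.*, b.order_num, b.sales_total, b.quantity_total, b.profit_total
FROM customer AS a
LEFT JOIN (
    SELECT customer_id,
           COUNT(DISTINCT order_id) AS order_num,
           SUM(sales)    AS sales_total,
           SUM(quantity) AS quantity_total,
           SUM(profit)   AS profit_total
    FROM sales
    GROUP BY customer_id
) AS b
ON a.customer_id = b.customer_id
ORDER BY a.customer_id  -- added only to make the printed order stable
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because it is a left join, a customer with no rows in sales (C2 in this sample) is still returned, with empty aggregate columns.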
100
00:08:27,960 --> 00:08:35,260
This query is just for combining the two tables and bringing all the important information from the
101
00:08:35,280 --> 00:08:35,940
order table.
102
00:08:36,790 --> 00:08:39,340
So you can see that in front of.
103
00:08:40,040 --> 00:08:41,150
Each customer.
104
00:08:41,240 --> 00:08:43,850
Now I have total number of orders.
105
00:08:44,210 --> 00:08:45,290
Total sales.
106
00:08:45,290 --> 00:08:47,540
Total quantity and profit total.
107
00:08:49,190 --> 00:08:52,250
Now let us verify this data once.
108
00:08:53,080 --> 00:09:02,140
So let's select this customer ID and let's see if the total number of distinct orders is five or not.
109
00:09:02,140 --> 00:09:04,470
So limit.
110
00:09:07,170 --> 00:09:09,830
This is the customer ID I want to inspect.
111
00:09:11,340 --> 00:09:11,670
Write
112
00:09:11,770 --> 00:09:14,100
select star from customer
113
00:09:15,480 --> 00:09:16,320
where
114
00:09:19,090 --> 00:09:22,450
customer ID is equal to this.
115
00:09:24,750 --> 00:09:30,360
The number of distinct order IDs for this customer should be five.
116
00:09:30,750 --> 00:09:33,540
So let's run this query.
117
00:09:38,350 --> 00:09:44,890
We should select sales instead of customer since we want to get the orders from this customer.
118
00:09:46,120 --> 00:09:47,500
Let's just run this.
119
00:09:49,030 --> 00:09:52,000
You can see that I will also.
120
00:09:53,260 --> 00:09:55,030
order it by
121
00:09:57,210 --> 00:09:58,050
order ID.
122
00:10:03,870 --> 00:10:05,190
Let's run this.
123
00:10:07,790 --> 00:10:13,970
So you can see that first order, second order, third order, fourth order and then fifth order.
124
00:10:16,150 --> 00:10:20,230
One, two, three, four.
125
00:10:20,560 --> 00:10:21,550
And then five.
126
00:10:21,580 --> 00:10:28,120
So the total distinct orders on this customer are just five.
127
00:10:28,480 --> 00:10:30,940
So you can see that our query is correct.
128
00:10:32,020 --> 00:10:33,220
Let's run this again.
129
00:10:36,590 --> 00:10:38,870
Now we want to use this data.
130
00:10:39,320 --> 00:10:40,300
On this data.
131
00:10:40,310 --> 00:10:41,540
We want to.
132
00:10:42,340 --> 00:10:45,520
put row numbers on the basis of order_num.
133
00:10:46,090 --> 00:10:52,510
So what I will do is I will create another table and save this information in that table.
134
00:10:53,470 --> 00:10:55,390
So I will write Create table.
135
00:10:56,580 --> 00:10:57,580
We'll name it
136
00:10:59,780 --> 00:11:00,710
Customer.
137
00:11:01,620 --> 00:11:02,300
Order.
138
00:11:02,790 --> 00:11:09,780
Since it contains the detail of aggregate order for each customer, I am naming it as customer order.
139
00:11:11,380 --> 00:11:12,130
As.
140
00:11:14,570 --> 00:11:23,600
Now you can create tables like this by putting all the data that you are getting inside these brackets.
141
00:11:31,040 --> 00:11:32,170
If I run this.
142
00:11:32,500 --> 00:11:38,830
All the data that I was getting from this select statement is now stored in this customer order.
143
00:11:39,640 --> 00:11:42,400
Let's look at customer order once more.
144
00:11:44,150 --> 00:11:45,200
Select.
145
00:11:46,590 --> 00:11:47,590
star from
146
00:11:49,240 --> 00:11:50,050
customer
147
00:11:51,220 --> 00:11:51,850
order.
148
00:11:55,880 --> 00:11:57,770
Let's run this.
149
00:12:00,420 --> 00:12:03,540
You can see that we are getting the same data.
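The step of saving the combined result into a new table can be sketched like this. It is a hedged illustration only: it uses Python's built-in sqlite3 in place of PostgreSQL, and the table name customer_order, the column names, and the sample rows are assumptions based on the narration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id TEXT, customer_name TEXT, state TEXT);
CREATE TABLE sales (order_id TEXT, customer_id TEXT,
                    sales REAL, quantity INTEGER, profit REAL);
INSERT INTO customer VALUES ('C1', 'Ann', 'Alabama'), ('C2', 'Bob', 'Arizona');
INSERT INTO sales VALUES
  ('O1', 'C1', 10.0, 1, 2.0),
  ('O2', 'C1', 30.0, 3, 6.0),
  ('O3', 'C2',  5.0, 1, 1.0);

-- CREATE TABLE ... AS stores the result of the SELECT as a new table.
CREATE TABLE customer_order AS
SELECT a.*, b.order_num, b.sales_total, b.quantity_total, b.profit_total
FROM customer AS a
LEFT JOIN (
    SELECT customer_id,
           COUNT(DISTINCT order_id) AS order_num,
           SUM(sales)    AS sales_total,
           SUM(quantity) AS quantity_total,
           SUM(profit)   AS profit_total
    FROM sales
    GROUP BY customer_id
) AS b
ON a.customer_id = b.customer_id;
""")

# Reading the new table back returns the same rows the SELECT produced.
saved = conn.execute(
    "SELECT * FROM customer_order ORDER BY customer_id"
).fetchall()
for row in saved:
    print(row)
```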
150
00:12:09,820 --> 00:12:14,650
Now we want to provide row numbers to each customer.
151
00:12:15,880 --> 00:12:19,510
On the basis of their number of orders in each state.
152
00:12:21,000 --> 00:12:23,820
And we really don't want these additional columns.
153
00:12:23,820 --> 00:12:26,340
So I will just select.
154
00:12:28,050 --> 00:12:29,490
Customer ID.
155
00:12:30,590 --> 00:12:31,310
comma,
156
00:12:32,900 --> 00:12:34,190
customer name,
157
00:12:35,270 --> 00:12:38,540
comma, state, comma,
158
00:12:40,060 --> 00:12:40,830
order_num.
159
00:12:40,840 --> 00:12:45,100
order_num is the variable which we created just a while ago,
160
00:12:46,730 --> 00:12:49,850
which contains the number of distinct order IDs.
161
00:12:49,850 --> 00:12:54,380
So order_num, and then we'll use the row number function.
162
00:12:55,350 --> 00:12:57,330
Row number
163
00:12:58,710 --> 00:12:59,430
over.
164
00:13:00,260 --> 00:13:05,330
Then we start the bracket partition by.
165
00:13:06,970 --> 00:13:12,460
Now we want to partition by state since we want top three customer from each state.
166
00:13:12,460 --> 00:13:15,790
So partition by state and then.
167
00:13:16,770 --> 00:13:17,760
order by.
168
00:13:18,000 --> 00:13:22,680
We want to order our customers on the basis of number of orders.
169
00:13:23,660 --> 00:13:28,970
And we created the order_num variable for exactly that purpose.
170
00:13:29,270 --> 00:13:36,950
And we want the customers with maximum number of orders to be placed in first row.
171
00:13:36,980 --> 00:13:39,710
So we want order by descending.
172
00:13:42,610 --> 00:13:47,230
So write DESC. We'll save this
173
00:13:48,090 --> 00:13:49,470
variable as
174
00:13:51,200 --> 00:13:51,780
row
175
00:13:53,030 --> 00:13:53,570
N.
176
00:13:54,560 --> 00:13:56,360
So this is the window function.
177
00:13:56,420 --> 00:13:59,970
Row number: row number is the keyword.
178
00:13:59,990 --> 00:14:02,870
Then over is also a keyword. Partition
179
00:14:02,870 --> 00:14:04,220
by is also a keyword.
180
00:14:04,490 --> 00:14:13,370
Then we have to mention the variable on which we want to group, and then order by is also a keyword.
181
00:14:13,400 --> 00:14:20,120
After that we want to mention the variable on which we want to order the data or provide row number.
182
00:14:21,960 --> 00:14:27,260
And at last we are providing an alias to this variable.
183
00:14:29,250 --> 00:14:33,060
So we want all these columns from
184
00:14:34,010 --> 00:14:34,870
Our data.
185
00:14:42,900 --> 00:14:44,040
Let's run this.
186
00:14:51,820 --> 00:14:52,990
It should be order_num.
187
00:14:54,670 --> 00:14:55,930
Let's run this again.
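The window function query built up in this part of the lecture can be sketched as below. This is an illustration under stated assumptions, not the instructor's script: it runs in Python's built-in sqlite3 (which needs SQLite 3.25 or newer for window functions) instead of PostgreSQL, and the alias row_n, the other column names, the sample rows, and the outer ORDER BY are assumed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_order (customer_id TEXT, customer_name TEXT,
                             state TEXT, order_num INTEGER);
INSERT INTO customer_order VALUES
  ('A1', 'Ann', 'Alabama', 9),
  ('A2', 'Bob', 'Alabama', 8),
  ('A3', 'Cy',  'Alabama', 8),
  ('A4', 'Dee', 'Alabama', 5),
  ('D1', 'Eve', 'District of Columbia', 3);
""")

# Numbering restarts at 1 for every state; within a state the customer
# with the most orders gets row number 1.
query = """
SELECT customer_id, customer_name, state, order_num,
       ROW_NUMBER() OVER (PARTITION BY state ORDER BY order_num DESC) AS row_n
FROM customer_order
ORDER BY state, row_n  -- added only to make the printed order stable
"""
numbered = conn.execute(query).fetchall()
for row in numbered:
    print(row)

# Map each customer to the row number it received.
row_n_by_customer = {row[0]: row[4] for row in numbered}
```

Customers tied on order_num (A2 and A3 here) still receive different row numbers, but which of them gets 2 and which gets 3 is not defined by this ORDER BY.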
188
00:15:02,430 --> 00:15:03,510
This is our data.
189
00:15:03,960 --> 00:15:07,950
Now, you can see that for the state of Alabama.
190
00:15:09,590 --> 00:15:12,980
With customer ID DC 12850.
191
00:15:13,010 --> 00:15:16,160
the total number of orders is nine.
192
00:15:16,370 --> 00:15:19,250
So here in the row number we are getting one.
193
00:15:20,520 --> 00:15:22,350
For the second customer.
194
00:15:22,510 --> 00:15:23,870
the state is the same.
195
00:15:23,880 --> 00:15:25,530
The number of orders is eight.
196
00:15:25,740 --> 00:15:27,450
That's why we are getting two.
197
00:15:28,370 --> 00:15:29,150
In the third.
198
00:15:29,150 --> 00:15:31,430
The number of orders is again eight.
199
00:15:31,610 --> 00:15:33,930
That's why we are getting three.
200
00:15:33,950 --> 00:15:35,510
So one, two, three.
201
00:15:36,510 --> 00:15:41,430
And similarly, if the state changes, the numbering again starts from one.
202
00:15:41,430 --> 00:15:50,700
So for Arizona, the maximum number of orders is ten and we are providing this customer row_n as one.
203
00:15:51,660 --> 00:15:53,760
So you can check other rows as well.
204
00:15:55,200 --> 00:16:04,260
And see that we have many customers from California and the maximum number of orders is 13.
205
00:16:05,580 --> 00:16:06,750
From California.
206
00:16:07,050 --> 00:16:14,520
So from each state, I'm getting the customer with maximum number of orders.
207
00:16:15,120 --> 00:16:18,840
For example, in the District of Columbia, there is only one customer.
208
00:16:19,230 --> 00:16:22,050
And that's why we are getting row_n as one.
209
00:16:23,480 --> 00:16:27,470
There are no entries of two and three since there is only one customer.
210
00:16:28,990 --> 00:16:30,190
So you can see that.
211
00:16:31,650 --> 00:16:37,080
We are getting this row_n variable for each state and for each customer.
212
00:16:37,890 --> 00:16:42,960
Now we want top three customers from each state.
213
00:16:43,230 --> 00:16:45,180
And how can we find that out?
214
00:16:46,010 --> 00:16:53,180
We can use the row_n variable to segregate such customers. For example, in New Mexico,
215
00:16:53,210 --> 00:16:55,160
the top three customers are these.
216
00:16:56,220 --> 00:16:57,060
In
217
00:16:58,880 --> 00:17:02,600
Minnesota, the top three customers are these customers.
218
00:17:02,630 --> 00:17:12,230
So we can clearly use a WHERE clause on our row number variable where I can only get numbers 1 to 3.
219
00:17:12,560 --> 00:17:14,000
So if I write.
220
00:17:17,300 --> 00:17:18,350
The same query.
221
00:17:18,920 --> 00:17:19,300
write
222
00:17:19,310 --> 00:17:20,090
where
223
00:17:22,160 --> 00:17:26,510
row_n is less than or equal to three,
224
00:17:26,540 --> 00:17:30,230
I should get three customers from each state.
225
00:17:30,710 --> 00:17:31,790
Let's run this.
226
00:17:34,630 --> 00:17:40,520
So I'm getting an error because row_n is not available currently.
227
00:17:40,540 --> 00:17:47,590
I'm getting an error because in the same statement we are creating it and we are providing the
228
00:17:47,590 --> 00:17:48,490
WHERE clause.
229
00:17:48,640 --> 00:17:57,280
So again, you can save this information into some other table and then use this condition.
230
00:17:57,280 --> 00:18:05,020
Or you can write this whole definition of row_n in the WHERE clause, or another way is to
231
00:18:05,830 --> 00:18:09,700
put it inside a subquery. Select
232
00:18:10,820 --> 00:18:13,220
star from. Now,
233
00:18:13,220 --> 00:18:14,370
This is my table name.
234
00:18:14,420 --> 00:18:18,950
And after we are creating this table name, I'm using a WHERE clause.
235
00:18:18,950 --> 00:18:19,430
So.
236
00:18:22,130 --> 00:18:23,270
Let's run this.
237
00:18:26,390 --> 00:18:26,720
Again.
238
00:18:26,750 --> 00:18:31,880
Since we are using a subquery, we have to write AS A. And here,
239
00:18:31,940 --> 00:18:33,020
A dot this.
240
00:18:36,710 --> 00:18:37,970
Let's run this again.
241
00:18:38,880 --> 00:18:40,570
Now our query is working fine.
242
00:18:40,590 --> 00:18:42,960
You can see that from each state
243
00:18:42,960 --> 00:18:46,920
we are getting only the three rows with the maximum number of orders.
244
00:18:47,220 --> 00:18:53,250
So from Alabama, we have these three customers. From Colorado,
245
00:18:53,280 --> 00:18:55,320
we have these three customers.
246
00:18:56,140 --> 00:18:57,250
From Kentucky,
247
00:18:57,250 --> 00:18:59,530
we have these three customers and so on.
248
00:18:59,530 --> 00:19:03,130
So from each state we are getting just three customers.
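The final top-three filter can be sketched as follows: the row number is computed inside a subquery, and the WHERE clause is applied outside it, where the alias is visible. As with the lecture's earlier steps, this is a minimal sketch run through Python's built-in sqlite3 (SQLite 3.25 or newer) rather than PostgreSQL, and the names row_n and customer_order columns, the sample rows, and the outer ORDER BY are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_order (customer_id TEXT, customer_name TEXT,
                             state TEXT, order_num INTEGER);
INSERT INTO customer_order VALUES
  ('A1', 'Ann', 'Alabama', 9),
  ('A2', 'Bob', 'Alabama', 8),
  ('A3', 'Cy',  'Alabama', 7),
  ('A4', 'Dee', 'Alabama', 5),
  ('D1', 'Eve', 'District of Columbia', 3);
""")

# The inner query assigns row_n; the outer query keeps only rows 1 to 3.
query = """
SELECT *
FROM (
    SELECT customer_id, customer_name, state, order_num,
           ROW_NUMBER() OVER (PARTITION BY state ORDER BY order_num DESC) AS row_n
    FROM customer_order
) AS a
WHERE a.row_n <= 3
ORDER BY a.state, a.row_n  -- added only to make the printed order stable
"""
top3 = conn.execute(query).fetchall()
for row in top3:
    print(row)
```

A state with fewer than three customers (District of Columbia in this sample) simply returns the rows it has.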
249
00:19:05,000 --> 00:19:07,460
So that's how we use row number.
250
00:19:08,270 --> 00:19:10,480
This is our first window function.
251
00:19:10,490 --> 00:19:17,030
We'll be using the same dataset that we created, that is customer order, for the rest of our window functions
252
00:19:17,030 --> 00:19:17,600
as well.
253
00:19:17,960 --> 00:19:18,590
Thank you.