Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated:
0
1
00:00:00,300 --> 00:00:05,310
In this video, I will be covering the theoretical concept behind joining tables.
1
2
00:00:06,320 --> 00:00:09,020
Why this is done and how it is used.
2
3
00:00:10,880 --> 00:00:17,030
We will then cover each and every type of joint and their implementation in the upcoming videos.
3
4
00:00:19,030 --> 00:00:21,010
Let me take an example business scenario.
4
5
00:00:22,110 --> 00:00:30,090
Suppose the management wants to know about the performance of each region so that they can decide on
5
6
00:00:30,090 --> 00:00:32,450
the marketing budget for each region.
6
7
00:00:34,580 --> 00:00:38,950
So they ask you to find the total sales done in each region.
7
8
00:00:41,710 --> 00:00:43,690
There are two things I need for this.
8
9
00:00:45,440 --> 00:00:49,700
First, I need the sales data, which is in the sales table.
9
10
00:00:50,720 --> 00:00:55,490
And second, I need the region, which is part of the customer table.
10
11
00:00:57,710 --> 00:01:03,140
If we could somehow assign region values to each of these transactions.
11
12
00:01:04,490 --> 00:01:08,300
That is, if we could assign that this transaction.
12
13
00:01:09,740 --> 00:01:12,110
Belongs to the south region.
13
14
00:01:13,140 --> 00:01:16,680
And this transaction belongs to the West region.
14
15
00:01:18,470 --> 00:01:27,270
Then we can simply aggregate the data on these regions to find the of sales for each region.
15
16
00:01:28,730 --> 00:01:32,420
But how do we know the region for this transaction?
16
17
00:01:33,950 --> 00:01:39,020
You must have noticed that there is one field common between these two tables.
17
18
00:01:40,440 --> 00:01:42,480
That is the customer, I'd call them.
18
19
00:01:43,860 --> 00:01:50,730
Using this custom I'd, we can match the customer who did this transaction to which region this
19
20
00:01:50,730 --> 00:01:52,140
particular customer belongs.
20
21
00:01:54,560 --> 00:02:03,080
This is the concept of joins, you can join the information present in two or more tables using a
21
22
00:02:03,080 --> 00:02:06,080
common key column or a group of columns.
22
23
00:02:08,640 --> 00:02:17,760
We can also join more than two tables, for example, here we can join sales table and customer table
23
24
00:02:17,760 --> 00:02:23,700
based on customer I'd, and we can join sales table and product table based on product I'd.
24
25
00:02:25,350 --> 00:02:32,640
Once we have this join data, we can then find things like how much our customers from the state of
25
26
00:02:32,640 --> 00:02:36,510
California spending on each of the product category.
26
27
00:02:38,750 --> 00:02:45,200
Now, let us see what we need to know so that we can create join data from two tables.
27
28
00:02:47,550 --> 00:02:51,690
First, we need to know the name of the tables that we want to join.
28
29
00:02:54,210 --> 00:03:00,510
For simplicity's sake, let's say we are joining only two tables, so we need to know the name of these
29
30
00:03:00,510 --> 00:03:01,260
two tables.
30
31
00:03:03,180 --> 00:03:08,130
So this is our sales table and this one is our customer's table.
31
32
00:03:10,220 --> 00:03:15,710
Next, we need to know the common column based on which we will join these two tables.
32
33
00:03:17,310 --> 00:03:21,180
For example, here, the common column was customer id.
33
34
00:03:22,660 --> 00:03:28,540
Lastly, we need the list of columns that we want from each table in our join data.
34
35
00:03:29,910 --> 00:03:36,030
Suppose we are trying to find out region wise sales, we can choose to have all columns from both tables
35
36
00:03:36,690 --> 00:03:45,540
or we can choose which columns we want from each table so I can choose only order I'd customer I'd and
36
37
00:03:45,540 --> 00:03:47,520
sales value from the sales table.
37
38
00:03:48,890 --> 00:03:55,880
And region value from the customer table to get a table like this and then use it to find the region
38
39
00:03:55,880 --> 00:03:56,570
wise sales
39
40
00:03:58,350 --> 00:04:00,180
This output will be lighter.
40
41
00:04:01,130 --> 00:04:03,800
But this will have limited functionality only.
41
42
00:04:05,240 --> 00:04:11,750
So which columns should come in, the output from which table, this is an option that we have while
42
43
00:04:11,750 --> 00:04:12,650
joining the tables
43
44
00:04:14,760 --> 00:04:17,520
Now, let me tell you about the different types of joins.
44
45
00:04:19,310 --> 00:04:25,160
We will cover each type of join in detail in the coming videos, but here I just want to explain the
45
46
00:04:25,160 --> 00:04:28,040
logic behind all these different types of joint.
46
47
00:04:29,940 --> 00:04:36,570
For the purpose of this example, suppose this is my sales data, I have taken only a few rows and columns
47
48
00:04:37,050 --> 00:04:39,440
so that we can clearly see what is happening.
48
49
00:04:40,490 --> 00:04:45,950
Now we want to join these two tables based on the Common key, which is the customer I'd.
49
50
00:04:48,220 --> 00:04:52,360
Let us look closely at the customer IDs in these two tables.
50
51
00:04:54,340 --> 00:05:02,800
What do we notice, We see that these customer IDs CG-12520, DV-13045,
51
52
00:05:02,800 --> 00:05:09,790
and SO-20335, these three customer IDs are present in both the tables.
52
53
00:05:11,070 --> 00:05:13,960
This customer I'd PG-18895
53
54
00:05:14,370 --> 00:05:19,860
This is present only in the sales table and it is not present in the customer table.
54
55
00:05:21,430 --> 00:05:28,530
And this I'd BH-11710 is present in customer table, but not in the sales table.
55
56
00:05:30,120 --> 00:05:34,620
Now, we have four options here, which results in four types of joins.
56
57
00:05:36,760 --> 00:05:45,610
The first option is that in the result, we get only those results for which customer Id is present
57
58
00:05:45,610 --> 00:05:46,750
in both the tables.
58
59
00:05:47,820 --> 00:05:52,810
So in the first point, we have these three customer Ids which are present in both the tables.
59
60
00:05:53,550 --> 00:06:00,450
So in the result, we will get only these three customer ids and all the data from both the tables.
60
61
00:06:02,480 --> 00:06:10,010
So you can see we have order line, order Id, order date, customer id, product id and sales value, all of these
61
62
00:06:10,010 --> 00:06:16,000
columns are coming from the sales table and these three columns, customer names, state and region.
62
63
00:06:16,430 --> 00:06:18,080
This is coming from the customer table.
63
64
00:06:19,080 --> 00:06:24,660
So you can see that the data is joint of the two tables, but the number of rows are less.
64
65
00:06:24,810 --> 00:06:28,890
Only those rows are there for which we have data from both the tables.
65
66
00:06:30,550 --> 00:06:37,660
This type of join is called inner join, and the name is coming from this Venn diagram.
66
67
00:06:40,130 --> 00:06:49,060
This circle, the present customer IDs of sales table, this second circle represents the customer ID
67
68
00:06:49,250 --> 00:06:50,390
of customer table.
68
69
00:06:51,500 --> 00:06:57,890
This center shaded part is those customer IDs, which are present in both the labels.
69
70
00:06:59,370 --> 00:07:03,690
This is the inner part of the venn diagram, so we are calling it inner join.
70
71
00:07:06,530 --> 00:07:14,420
Other option is if we decide that in the output, we want all the customer I.D. from the sales table.
71
72
00:07:15,580 --> 00:07:22,000
They may or may not be present in the customer table if they are present in the customer table,
72
73
00:07:22,000 --> 00:07:27,670
For example, the first id CG-12520, this is present in the customer table also
73
74
00:07:27,670 --> 00:07:31,410
Also, if it is there, we will get the customer data.
74
75
00:07:32,290 --> 00:07:40,000
But if that customer ID is not present in the customer table, for example, this ID
75
76
00:07:40,000 --> 00:07:40,600
PG-18895
76
77
00:07:42,030 --> 00:07:49,920
Since this is not available in the table, we will still have it in the output and in the places where
77
78
00:07:49,920 --> 00:07:51,430
we are getting the customer data.
78
79
00:07:51,660 --> 00:07:52,980
We will have null value.
79
80
00:07:54,720 --> 00:08:02,460
So you can see the output of left joining, we have sales data for all these ids because we are getting
80
81
00:08:02,460 --> 00:08:04,260
all the Ids from the sales table.
81
82
00:08:05,260 --> 00:08:08,840
But for that particular id Where we do not have the customer data.
82
83
00:08:09,340 --> 00:08:10,300
The data is null.
83
84
00:08:12,180 --> 00:08:15,210
You can see the Venn diagram for left join.
84
85
00:08:16,820 --> 00:08:21,170
All the customer ids from sales table are picked so that is shaded
85
86
00:08:22,370 --> 00:08:28,040
But those customer IDs, which are not part of the sales table, those are left out.
86
87
00:08:29,170 --> 00:08:32,650
So since the left part is shaded, this is called left join.
87
88
00:08:35,520 --> 00:08:37,020
Similar Is right join.
88
89
00:08:37,950 --> 00:08:40,770
If you want all customer I.D. from customer table.
89
90
00:08:41,710 --> 00:08:48,010
Even if no data is available for some Ids in the sales table, we can do a right join.
90
91
00:08:49,580 --> 00:08:56,870
This is the output, if we do a right join we will not have sales data for this particular ID
91
92
00:08:57,150 --> 00:09:02,120
BH-11710 because this is not present in the sales table.
92
93
00:09:03,170 --> 00:09:10,300
But this will be part of the output because it is in customer table and we are doing a right join.
93
94
00:09:12,200 --> 00:09:18,830
You can look at the Venn diagram. This time, we are taking all customer Ids from customer table and excluding
94
95
00:09:18,830 --> 00:09:21,740
those I'ds which are not present in the customer table.
95
96
00:09:25,710 --> 00:09:33,180
Fourth option is if you want to include all customer Ids present in both the tables in the situation,
96
97
00:09:33,390 --> 00:09:35,460
the result looks like this.
97
98
00:09:36,500 --> 00:09:40,240
We have all the customer ids for the customer Id where.
98
99
00:09:40,280 --> 00:09:42,230
We do not have customer data, there
99
100
00:09:42,240 --> 00:09:46,440
We get null for customer details and for that customer Id
100
101
00:09:46,790 --> 00:09:48,160
Where we do not have sales data.
101
102
00:09:48,170 --> 00:09:51,050
We get null for the sales transaction details.
102
103
00:09:51,740 --> 00:09:53,110
You can see the venn diagram.
103
104
00:09:54,290 --> 00:09:55,710
All the parts are shaded.
104
105
00:09:56,120 --> 00:10:02,090
This is why this particular join is called full outer join or full join.
105
106
00:10:04,220 --> 00:10:10,850
Now, one question may come to your mind if we keep sales data on the left and do left join.
106
107
00:10:11,890 --> 00:10:18,860
Or if we keep sales stable on the right and the right, join, will these to give same results.
107
108
00:10:20,180 --> 00:10:26,270
Short answer is yes, but once we have covered the topics in detail, you should try it out yourself
108
109
00:10:26,450 --> 00:10:28,870
to confirm that this holds true.
109
110
00:10:30,070 --> 00:10:37,030
There is one more join called Cross Join, but that is quite different from these ones, so we will
110
111
00:10:37,030 --> 00:10:39,880
learn more about it in its respective video.
111
112
00:10:42,340 --> 00:10:47,740
So this was about joining two different tables with different data, using a common column.
112
113
00:10:49,450 --> 00:10:53,860
There is another way of combining data, which is combining the same type of tables.
113
114
00:10:55,300 --> 00:11:02,080
For example, suppose the online team saves customer data in one table, so, for example, these four
114
115
00:11:02,080 --> 00:11:08,690
customers are online customers and the second table is showing our offline customers.
115
116
00:11:08,710 --> 00:11:10,960
So these three are the offline customers list.
116
117
00:11:12,380 --> 00:11:19,460
Now, to get data of all the customers, we may want to combine the entries in these two tables, this
117
118
00:11:19,460 --> 00:11:26,720
type of combining data is handled by operators like Union Accept and intersect.
118
119
00:11:27,860 --> 00:11:30,050
These are called combining inquiries.
119
120
00:11:31,300 --> 00:11:38,320
So this is different than joining this is combining the data and these queries will be discussed after
120
121
00:11:38,320 --> 00:11:40,090
we have discussed the joining queries.
121
122
00:11:41,010 --> 00:11:47,340
With this background about the topic, let's jump right in and see each of these joining and combining
122
123
00:11:47,340 --> 00:11:48,260
queries in detail.
123
124
00:11:49,460 --> 00:11:55,910
Any questions coming to your mind now will most probably get answered in these upcoming videos?
124
125
00:11:56,970 --> 00:11:58,280
I'll see you in the next video.
13674
Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.