All language subtitles for 2. Concepts of Joining and Combining Data

af Afrikaans
sq Albanian
am Amharic
ar Arabic
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bn Bengali
bs Bosnian
bg Bulgarian
ca Catalan
ceb Cebuano
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
tl Filipino
fi Finnish
fr French Download
fy Frisian
gl Galician
ka Georgian
de German
el Greek
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
km Khmer
ko Korean
ku Kurdish (Kurmanji)
ky Kyrgyz
lo Lao
la Latin
lv Latvian
lt Lithuanian
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mn Mongolian
my Myanmar (Burmese)
ne Nepali
no Norwegian
ps Pashto
fa Persian
pl Polish
pt Portuguese
pa Punjabi
ro Romanian
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
st Sesotho
sn Shona
sd Sindhi
si Sinhala
sk Slovak
sl Slovenian
so Somali
es Spanish
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
te Telugu
th Thai
tr Turkish
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
or Odia (Oriya)
rw Kinyarwanda
tk Turkmen
tt Tatar
ug Uyghur
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 0 1 00:00:00,300 --> 00:00:05,310 In this video, I will be covering the theoretical concept behind joining tables. 1 2 00:00:06,320 --> 00:00:09,020 Why this is done and how it is used. 2 3 00:00:10,880 --> 00:00:17,030 We will then cover each and every type of joint and their implementation in the upcoming videos. 3 4 00:00:19,030 --> 00:00:21,010 Let me take an example business scenario. 4 5 00:00:22,110 --> 00:00:30,090 Suppose the management wants to know about the performance of each region so that they can decide on 5 6 00:00:30,090 --> 00:00:32,450 the marketing budget for each region. 6 7 00:00:34,580 --> 00:00:38,950 So they ask you to find the total sales done in each region. 7 8 00:00:41,710 --> 00:00:43,690 There are two things I need for this. 8 9 00:00:45,440 --> 00:00:49,700 First, I need the sales data, which is in the sales table. 9 10 00:00:50,720 --> 00:00:55,490 And second, I need the region, which is part of the customer table. 10 11 00:00:57,710 --> 00:01:03,140 If we could somehow assign region values to each of these transactions. 11 12 00:01:04,490 --> 00:01:08,300 That is, if we could assign that this transaction. 12 13 00:01:09,740 --> 00:01:12,110 Belongs to the south region. 13 14 00:01:13,140 --> 00:01:16,680 And this transaction belongs to the West region. 14 15 00:01:18,470 --> 00:01:27,270 Then we can simply aggregate the data on these regions to find the of sales for each region. 15 16 00:01:28,730 --> 00:01:32,420 But how do we know the region for this transaction? 16 17 00:01:33,950 --> 00:01:39,020 You must have noticed that there is one field common between these two tables. 17 18 00:01:40,440 --> 00:01:42,480 That is the customer, I'd call them. 18 19 00:01:43,860 --> 00:01:50,730 Using this custom I'd, we can match the customer who did this transaction to which region this 19 20 00:01:50,730 --> 00:01:52,140 particular customer belongs. 20 21 00:01:54,560 --> 00:02:03,080 This is the concept of joins, you can join the information present in two or more tables using a 21 22 00:02:03,080 --> 00:02:06,080 common key column or a group of columns. 22 23 00:02:08,640 --> 00:02:17,760 We can also join more than two tables, for example, here we can join sales table and customer table 23 24 00:02:17,760 --> 00:02:23,700 based on customer I'd, and we can join sales table and product table based on product I'd. 24 25 00:02:25,350 --> 00:02:32,640 Once we have this join data, we can then find things like how much our customers from the state of 25 26 00:02:32,640 --> 00:02:36,510 California spending on each of the product category. 26 27 00:02:38,750 --> 00:02:45,200 Now, let us see what we need to know so that we can create join data from two tables. 27 28 00:02:47,550 --> 00:02:51,690 First, we need to know the name of the tables that we want to join. 28 29 00:02:54,210 --> 00:03:00,510 For simplicity's sake, let's say we are joining only two tables, so we need to know the name of these 29 30 00:03:00,510 --> 00:03:01,260 two tables. 30 31 00:03:03,180 --> 00:03:08,130 So this is our sales table and this one is our customer's table. 31 32 00:03:10,220 --> 00:03:15,710 Next, we need to know the common column based on which we will join these two tables. 32 33 00:03:17,310 --> 00:03:21,180 For example, here, the common column was customer id. 33 34 00:03:22,660 --> 00:03:28,540 Lastly, we need the list of columns that we want from each table in our join data. 34 35 00:03:29,910 --> 00:03:36,030 Suppose we are trying to find out region wise sales, we can choose to have all columns from both tables 35 36 00:03:36,690 --> 00:03:45,540 or we can choose which columns we want from each table so I can choose only order I'd customer I'd and 36 37 00:03:45,540 --> 00:03:47,520 sales value from the sales table. 37 38 00:03:48,890 --> 00:03:55,880 And region value from the customer table to get a table like this and then use it to find the region 38 39 00:03:55,880 --> 00:03:56,570 wise sales 39 40 00:03:58,350 --> 00:04:00,180 This output will be lighter. 40 41 00:04:01,130 --> 00:04:03,800 But this will have limited functionality only. 41 42 00:04:05,240 --> 00:04:11,750 So which columns should come in, the output from which table, this is an option that we have while 42 43 00:04:11,750 --> 00:04:12,650 joining the tables 43 44 00:04:14,760 --> 00:04:17,520 Now, let me tell you about the different types of joins. 44 45 00:04:19,310 --> 00:04:25,160 We will cover each type of join in detail in the coming videos, but here I just want to explain the 45 46 00:04:25,160 --> 00:04:28,040 logic behind all these different types of joint. 46 47 00:04:29,940 --> 00:04:36,570 For the purpose of this example, suppose this is my sales data, I have taken only a few rows and columns 47 48 00:04:37,050 --> 00:04:39,440 so that we can clearly see what is happening. 48 49 00:04:40,490 --> 00:04:45,950 Now we want to join these two tables based on the Common key, which is the customer I'd. 49 50 00:04:48,220 --> 00:04:52,360 Let us look closely at the customer IDs in these two tables. 50 51 00:04:54,340 --> 00:05:02,800 What do we notice, We see that these customer IDs CG-12520, DV-13045, 51 52 00:05:02,800 --> 00:05:09,790 and SO-20335, these three customer IDs are present in both the tables. 52 53 00:05:11,070 --> 00:05:13,960 This customer I'd PG-18895 53 54 00:05:14,370 --> 00:05:19,860 This is present only in the sales table and it is not present in the customer table. 54 55 00:05:21,430 --> 00:05:28,530 And this I'd BH-11710 is present in customer table, but not in the sales table. 55 56 00:05:30,120 --> 00:05:34,620 Now, we have four options here, which results in four types of joins. 56 57 00:05:36,760 --> 00:05:45,610 The first option is that in the result, we get only those results for which customer Id is present 57 58 00:05:45,610 --> 00:05:46,750 in both the tables. 58 59 00:05:47,820 --> 00:05:52,810 So in the first point, we have these three customer Ids which are present in both the tables. 59 60 00:05:53,550 --> 00:06:00,450 So in the result, we will get only these three customer ids and all the data from both the tables. 60 61 00:06:02,480 --> 00:06:10,010 So you can see we have order line, order Id, order date, customer id, product id and sales value, all of these 61 62 00:06:10,010 --> 00:06:16,000 columns are coming from the sales table and these three columns, customer names, state and region. 62 63 00:06:16,430 --> 00:06:18,080 This is coming from the customer table. 63 64 00:06:19,080 --> 00:06:24,660 So you can see that the data is joint of the two tables, but the number of rows are less. 64 65 00:06:24,810 --> 00:06:28,890 Only those rows are there for which we have data from both the tables. 65 66 00:06:30,550 --> 00:06:37,660 This type of join is called inner join, and the name is coming from this Venn diagram. 66 67 00:06:40,130 --> 00:06:49,060 This circle, the present customer IDs of sales table, this second circle represents the customer ID 67 68 00:06:49,250 --> 00:06:50,390 of customer table. 68 69 00:06:51,500 --> 00:06:57,890 This center shaded part is those customer IDs, which are present in both the labels. 69 70 00:06:59,370 --> 00:07:03,690 This is the inner part of the venn diagram, so we are calling it inner join. 70 71 00:07:06,530 --> 00:07:14,420 Other option is if we decide that in the output, we want all the customer I.D. from the sales table. 71 72 00:07:15,580 --> 00:07:22,000 They may or may not be present in the customer table if they are present in the customer table, 72 73 00:07:22,000 --> 00:07:27,670 For example, the first id CG-12520, this is present in the customer table also 73 74 00:07:27,670 --> 00:07:31,410 Also, if it is there, we will get the customer data. 74 75 00:07:32,290 --> 00:07:40,000 But if that customer ID is not present in the customer table, for example, this ID 75 76 00:07:40,000 --> 00:07:40,600 PG-18895 76 77 00:07:42,030 --> 00:07:49,920 Since this is not available in the table, we will still have it in the output and in the places where 77 78 00:07:49,920 --> 00:07:51,430 we are getting the customer data. 78 79 00:07:51,660 --> 00:07:52,980 We will have null value. 79 80 00:07:54,720 --> 00:08:02,460 So you can see the output of left joining, we have sales data for all these ids because we are getting 80 81 00:08:02,460 --> 00:08:04,260 all the Ids from the sales table. 81 82 00:08:05,260 --> 00:08:08,840 But for that particular id Where we do not have the customer data. 82 83 00:08:09,340 --> 00:08:10,300 The data is null. 83 84 00:08:12,180 --> 00:08:15,210 You can see the Venn diagram for left join. 84 85 00:08:16,820 --> 00:08:21,170 All the customer ids from sales table are picked so that is shaded 85 86 00:08:22,370 --> 00:08:28,040 But those customer IDs, which are not part of the sales table, those are left out. 86 87 00:08:29,170 --> 00:08:32,650 So since the left part is shaded, this is called left join. 87 88 00:08:35,520 --> 00:08:37,020 Similar Is right join. 88 89 00:08:37,950 --> 00:08:40,770 If you want all customer I.D. from customer table. 89 90 00:08:41,710 --> 00:08:48,010 Even if no data is available for some Ids in the sales table, we can do a right join. 90 91 00:08:49,580 --> 00:08:56,870 This is the output, if we do a right join we will not have sales data for this particular ID 91 92 00:08:57,150 --> 00:09:02,120 BH-11710 because this is not present in the sales table. 92 93 00:09:03,170 --> 00:09:10,300 But this will be part of the output because it is in customer table and we are doing a right join. 93 94 00:09:12,200 --> 00:09:18,830 You can look at the Venn diagram. This time, we are taking all customer Ids from customer table and excluding 94 95 00:09:18,830 --> 00:09:21,740 those I'ds which are not present in the customer table. 95 96 00:09:25,710 --> 00:09:33,180 Fourth option is if you want to include all customer Ids present in both the tables in the situation, 96 97 00:09:33,390 --> 00:09:35,460 the result looks like this. 97 98 00:09:36,500 --> 00:09:40,240 We have all the customer ids for the customer Id where. 98 99 00:09:40,280 --> 00:09:42,230 We do not have customer data, there 99 100 00:09:42,240 --> 00:09:46,440 We get null for customer details and for that customer Id 100 101 00:09:46,790 --> 00:09:48,160 Where we do not have sales data. 101 102 00:09:48,170 --> 00:09:51,050 We get null for the sales transaction details. 102 103 00:09:51,740 --> 00:09:53,110 You can see the venn diagram. 103 104 00:09:54,290 --> 00:09:55,710 All the parts are shaded. 104 105 00:09:56,120 --> 00:10:02,090 This is why this particular join is called full outer join or full join. 105 106 00:10:04,220 --> 00:10:10,850 Now, one question may come to your mind if we keep sales data on the left and do left join. 106 107 00:10:11,890 --> 00:10:18,860 Or if we keep sales stable on the right and the right, join, will these to give same results. 107 108 00:10:20,180 --> 00:10:26,270 Short answer is yes, but once we have covered the topics in detail, you should try it out yourself 108 109 00:10:26,450 --> 00:10:28,870 to confirm that this holds true. 109 110 00:10:30,070 --> 00:10:37,030 There is one more join called Cross Join, but that is quite different from these ones, so we will 110 111 00:10:37,030 --> 00:10:39,880 learn more about it in its respective video. 111 112 00:10:42,340 --> 00:10:47,740 So this was about joining two different tables with different data, using a common column. 112 113 00:10:49,450 --> 00:10:53,860 There is another way of combining data, which is combining the same type of tables. 113 114 00:10:55,300 --> 00:11:02,080 For example, suppose the online team saves customer data in one table, so, for example, these four 114 115 00:11:02,080 --> 00:11:08,690 customers are online customers and the second table is showing our offline customers. 115 116 00:11:08,710 --> 00:11:10,960 So these three are the offline customers list. 116 117 00:11:12,380 --> 00:11:19,460 Now, to get data of all the customers, we may want to combine the entries in these two tables, this 117 118 00:11:19,460 --> 00:11:26,720 type of combining data is handled by operators like Union Accept and intersect. 118 119 00:11:27,860 --> 00:11:30,050 These are called combining inquiries. 119 120 00:11:31,300 --> 00:11:38,320 So this is different than joining this is combining the data and these queries will be discussed after 120 121 00:11:38,320 --> 00:11:40,090 we have discussed the joining queries. 121 122 00:11:41,010 --> 00:11:47,340 With this background about the topic, let's jump right in and see each of these joining and combining 122 123 00:11:47,340 --> 00:11:48,260 queries in detail. 123 124 00:11:49,460 --> 00:11:55,910 Any questions coming to your mind now will most probably get answered in these upcoming videos? 124 125 00:11:56,970 --> 00:11:58,280 I'll see you in the next video. 13674

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.