All language subtitles for 4. Understanding Data Tables vs. Lookup Tables

af Afrikaans
ak Akan
sq Albanian
am Amharic
ar Arabic Download
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bem Bemba
bn Bengali
bh Bihari
bs Bosnian
br Breton
bg Bulgarian
km Cambodian
ca Catalan
ceb Cebuano
chr Cherokee
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
ee Ewe
fo Faroese
tl Filipino
fi Finnish
fr French
fy Frisian
gaa Ga
gl Galician
ka Georgian
de German
el Greek
gn Guarani
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ia Interlingua
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
rw Kinyarwanda
rn Kirundi
kg Kongo
ko Korean
kri Krio (Sierra Leone)
ku Kurdish
ckb Kurdish (Soranî)
ky Kyrgyz
lo Laothian
la Latin
lv Latvian
ln Lingala
lt Lithuanian
loz Lozi
lg Luganda
ach Luo
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mfe Mauritian Creole
mo Moldavian
mn Mongolian
my Myanmar (Burmese)
sr-ME Montenegrin
ne Nepali
pcm Nigerian Pidgin
nso Northern Sotho
no Norwegian
nn Norwegian (Nynorsk)
oc Occitan
or Oriya
om Oromo
ps Pashto
fa Persian
pl Polish
pt-BR Portuguese (Brazil)
pt Portuguese (Portugal)
pa Punjabi
qu Quechua
ro Romanian
rm Romansh
nyn Runyakitara
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
sh Serbo-Croatian
st Sesotho
tn Setswana
crs Seychellois Creole
sn Shona
sd Sindhi
si Sinhalese
sk Slovak
sl Slovenian
so Somali
es Spanish
es-419 Spanish (Latin American)
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
tt Tatar
te Telugu
th Thai
ti Tigrinya
to Tonga
lua Tshiluba
tum Tumbuka
tr Turkish
tk Turkmen
tw Twi
ug Uighur
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
wo Wolof
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 00:00:00,640 --> 00:00:00,980 All right. 2 00:00:00,990 --> 00:00:04,040 Continue on with this concept of normalization. 3 00:00:04,050 --> 00:00:07,560 Let's talk about data tables versus lookup tables. 4 00:00:07,740 --> 00:00:13,920 Now generally speaking data models contain two different types of tables data sometimes known as fact 5 00:00:13,920 --> 00:00:14,660 tables. 6 00:00:14,880 --> 00:00:21,630 And look up or dimension tables and the way to kind of differentiate between the two is that data tables 7 00:00:21,990 --> 00:00:27,410 contain those quantitative metrics the numbers or values that you really care about. 8 00:00:27,720 --> 00:00:34,560 Typically these data tables are pretty granular detailed level with ID or key columns that can be used 9 00:00:34,560 --> 00:00:38,060 to create table relationships to look ups. 10 00:00:38,160 --> 00:00:45,990 Now on the other hand look up tables they provide descriptive often text based attributes or characteristics 11 00:00:46,470 --> 00:00:48,190 about each dimension and table. 12 00:00:48,330 --> 00:00:54,240 So you remember in the last lecture I talked about each table in a model serving a specific and distinct 13 00:00:54,240 --> 00:00:55,430 purpose. 14 00:00:55,440 --> 00:01:02,310 Well that's exactly what lookup tables are doing they're providing rich information and attributes about 15 00:01:02,310 --> 00:01:06,370 things like customers like products like dates. 16 00:01:06,630 --> 00:01:10,280 And that's exactly the role that they play within our model. 17 00:01:10,290 --> 00:01:17,430 So let's walk through an example here this blue table is a data table that contains numerical values 18 00:01:17,440 --> 00:01:24,670 those quantities along with two key columns that we can use to connect this data table with some lookups 19 00:01:24,900 --> 00:01:27,580 a date column and a product ID column. 20 00:01:27,910 --> 00:01:34,030 So what we'll be able to do is take that date column and connect it to a date column within a calendar 21 00:01:34,030 --> 00:01:38,300 lookup table and the entire purpose of the calendar lookup table. 22 00:01:38,430 --> 00:01:44,590 It's entire reason for existing is to provide additional information about each date. 23 00:01:44,800 --> 00:01:50,740 So it's got one row per date and additional attributes like the day of the month the month number the 24 00:01:50,740 --> 00:01:53,920 year the week day and so on and so forth. 25 00:01:53,920 --> 00:01:58,780 So the bottom line is that if you know the date you also know all of this additional information about 26 00:01:58,780 --> 00:02:05,850 months quarters years weeks etc. and the calendar Look-Up provides or gives us access to that information. 27 00:02:06,280 --> 00:02:13,120 Same story holds for the Product ID that product ID column in our data table can connect to or relate 28 00:02:13,120 --> 00:02:20,590 to it matching product ID in a product lookup table and that product lookup tables purpose is to provide 29 00:02:20,650 --> 00:02:25,130 all sorts of product specific information about each ID. 30 00:02:25,150 --> 00:02:31,900 So similar concept to the calendar table just in this case if you know the product ID then by definition 31 00:02:31,900 --> 00:02:37,390 you know that brand you know the product name you know the skew the retail price the cost the weight 32 00:02:37,720 --> 00:02:42,800 and any other product specific attributes captured by this product lookup table. 33 00:02:43,210 --> 00:02:47,930 So in order to understand the real mechanics of how these relationships work. 34 00:02:47,950 --> 00:02:53,910 One of the first things we need to dig into is the idea of primary versus foreign keys. 35 00:02:53,920 --> 00:02:57,720 Now consider the same set of three tables or blue data table. 36 00:02:57,890 --> 00:03:04,360 Our green calendar and our orange product lookup table consider the date and product ID columns in our 37 00:03:04,360 --> 00:03:05,660 data table. 38 00:03:05,710 --> 00:03:11,560 These are called foreign keys they're keys because they're used to create relationships with our lookup 39 00:03:11,560 --> 00:03:17,440 tables and they're foreign because they contain multiple instances of each value. 40 00:03:17,530 --> 00:03:20,140 And this is the case with many data tables. 41 00:03:20,170 --> 00:03:27,040 Here we have multiple transactions and multiple quantity values per day and multiple transactions and 42 00:03:27,040 --> 00:03:29,720 quantity values per product ID. 43 00:03:29,800 --> 00:03:36,130 So we could potentially see many many duplicate values in both of those date and product ID columns. 44 00:03:36,130 --> 00:03:42,190 Now when we look at the same columns in the lookup tables date in the calendar table and product ID 45 00:03:42,310 --> 00:03:49,330 in the product lookup table These are called primary keys because they uniquely identify each row of 46 00:03:49,330 --> 00:03:54,490 the table and they match the foreign keys in any related data tables. 47 00:03:54,490 --> 00:03:56,260 So it's easy to mix these up. 48 00:03:56,410 --> 00:04:03,110 But as a rule of thumb primary keys are unique foreign keys often contain duplicates. 49 00:04:03,280 --> 00:04:05,140 So let's jump in a power be-I. 50 00:04:05,290 --> 00:04:10,210 I want to open up the relationships view and see if we can identify some of these primary and foreign 51 00:04:10,210 --> 00:04:14,280 keys in the Adventure Works tables that we're working with. 52 00:04:14,410 --> 00:04:14,820 All right. 53 00:04:14,860 --> 00:04:20,040 So once you've got your file open we're going to go ahead and navigate to the relationships. 54 00:04:20,650 --> 00:04:25,720 And we're going to be spending a lot of time here because this is where the data modeling magic really 55 00:04:25,720 --> 00:04:26,680 happens. 56 00:04:26,950 --> 00:04:32,500 And the first thing I want to do is just looking at these tables that we've got I identify which are 57 00:04:32,500 --> 00:04:35,260 data tables and which are look at tables. 58 00:04:35,260 --> 00:04:37,960 Now we've already started hinting at this quite a bit. 59 00:04:37,990 --> 00:04:43,870 We know that the one table that contains our quantitative metrics the values that we care about the 60 00:04:43,870 --> 00:04:47,290 order quantities is our A.W. sales table. 61 00:04:47,290 --> 00:04:54,010 So let's grab that and we're going to pull it down to kind of distinguish it from the lookup tables. 62 00:04:54,010 --> 00:04:58,720 So now we know this is our data table looking up at our other tables. 63 00:04:58,720 --> 00:05:02,530 Product Look-Up contains all sorts of information about products. 64 00:05:02,830 --> 00:05:07,300 Customer is all obviously focused on customer level information. 65 00:05:07,550 --> 00:05:13,810 Calendar is all about dates territory has information about sales regions countries and continents. 66 00:05:13,960 --> 00:05:20,080 And then we have some additional product related category and subcategory tables that are almost an 67 00:05:20,140 --> 00:05:22,720 orphan off here so come on guys. 68 00:05:24,020 --> 00:05:29,510 So all of these tables serve very specific distinct purposes. 69 00:05:29,720 --> 00:05:36,080 And I'm kind of an OCD person so I'm going to drag the product lookup table over here with its friends 70 00:05:36,860 --> 00:05:39,510 and click and drag and just rearrange things a bit. 71 00:05:39,690 --> 00:05:44,430 We're going to be doing more designing in here but there we go. 72 00:05:44,450 --> 00:05:50,050 So at this point we know which tables are lookup tables and we know which tables or data table. 73 00:05:50,300 --> 00:05:55,600 Let's take a quick pass through and see if we can identify our primary and foreign keys. 74 00:05:55,640 --> 00:06:02,600 So starting with our look ups customer look up clearly has a customer key that uniquely identify each 75 00:06:02,600 --> 00:06:10,670 row within this table and therefore uniquely identify as all of this information about customers calendar 76 00:06:10,670 --> 00:06:11,380 look up. 77 00:06:11,450 --> 00:06:12,370 Same story here. 78 00:06:12,380 --> 00:06:15,890 It's not called a key but the date serves that purpose. 79 00:06:15,890 --> 00:06:21,080 If you know the date you know all of this information about that date the month that week the day of 80 00:06:21,080 --> 00:06:27,710 the week and so on and so forth territories we've got a clear sales territory key here which maps to 81 00:06:27,710 --> 00:06:32,310 regions countries and continents and then products are a little bit more complicated. 82 00:06:32,330 --> 00:06:36,040 We're going to get into this later when we talk about snowflake's agreements. 83 00:06:36,320 --> 00:06:41,690 But what you may notice is that some of these tables actually contain more than one key but for the 84 00:06:41,690 --> 00:06:44,480 time being pretend like you never even saw that. 85 00:06:45,050 --> 00:06:48,840 Now shifting down to the data table let's talk about foreign keys. 86 00:06:49,100 --> 00:06:52,340 So we know that multiple sales take place per day. 87 00:06:52,430 --> 00:06:58,490 So both of these date fields order date and stock date are both foreign keys because there are multiple 88 00:06:58,490 --> 00:07:00,160 instances of each value. 89 00:07:00,410 --> 00:07:06,410 And those dates can tie into the date field or calendar look at similar starting with product key got 90 00:07:06,410 --> 00:07:08,230 multiple sales per product. 91 00:07:08,300 --> 00:07:09,760 Same with customer key. 92 00:07:09,770 --> 00:07:11,210 Same with territory. 93 00:07:11,210 --> 00:07:17,220 So all three of these keys plus the two dates make up our foreign keys within the data table. 94 00:07:17,270 --> 00:07:19,180 So that's about enough for now. 95 00:07:19,250 --> 00:07:24,520 That's your crash course in data versus look up tables and primary versus foreign keys. 10415

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.