All language subtitles for 3. Data Frame Indexing and Selection

af Afrikaans
ak Akan
sq Albanian
am Amharic
ar Arabic
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bem Bemba
bn Bengali
bh Bihari
bs Bosnian
br Breton
bg Bulgarian
km Cambodian
ca Catalan
ceb Cebuano
chr Cherokee
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
ee Ewe
fo Faroese
tl Filipino
fi Finnish
fr French
fy Frisian
gaa Ga
gl Galician
ka Georgian
de German
el Greek
gn Guarani
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian Download
ia Interlingua
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
rw Kinyarwanda
rn Kirundi
kg Kongo
ko Korean
kri Krio (Sierra Leone)
ku Kurdish
ckb Kurdish (Soranî)
ky Kyrgyz
lo Laothian
la Latin
lv Latvian
ln Lingala
lt Lithuanian
loz Lozi
lg Luganda
ach Luo
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mfe Mauritian Creole
mo Moldavian
mn Mongolian
my Myanmar (Burmese)
sr-ME Montenegrin
ne Nepali
pcm Nigerian Pidgin
nso Northern Sotho
no Norwegian
nn Norwegian (Nynorsk)
oc Occitan
or Oriya
om Oromo
ps Pashto
fa Persian
pl Polish
pt-BR Portuguese (Brazil)
pt Portuguese (Portugal)
pa Punjabi
qu Quechua
ro Romanian
rm Romansh
nyn Runyakitara
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
sh Serbo-Croatian
st Sesotho
tn Setswana
crs Seychellois Creole
sn Shona
sd Sindhi
si Sinhalese
sk Slovak
sl Slovenian
so Somali
es Spanish
es-419 Spanish (Latin American)
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
tt Tatar
te Telugu
th Thai
ti Tigrinya
to Tonga
lua Tshiluba
tum Tumbuka
tr Turkish
tk Turkmen
tw Twi
ug Uighur
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
wo Wolof
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 00:00:00,420 --> 00:00:05,250 Hello everyone and welcome to the lecture on data frame's selection and indexing. 2 00:00:05,530 --> 00:00:08,910 And this lecture we're going to learn how to grab data out of our data frame. 3 00:00:09,120 --> 00:00:12,410 Let's go ahead and jump to our studio to see how to do this. 4 00:00:12,450 --> 00:00:12,810 All right. 5 00:00:12,810 --> 00:00:19,230 So here our studio and I'm going to go ahead and just move the environment tabs and the file tabs to 6 00:00:19,230 --> 00:00:22,880 the side so we can focus on the scripting and console. 7 00:00:23,090 --> 00:00:27,200 Notice I'm going to be using the same data we used in a previous lecture. 8 00:00:27,240 --> 00:00:33,300 This weather data of the days the temperature and the rain to go ahead and pass this as ADF going to 9 00:00:33,300 --> 00:00:35,400 go ahead and run the source. 10 00:00:35,440 --> 00:00:39,750 And now on a call if it's going to reveal that data frame. 11 00:00:39,750 --> 00:00:45,030 Let's go ahead and learn how we can use this bracket notation we're familiar with to actually grab information 12 00:00:45,390 --> 00:00:52,050 from a data frame when we were learning about Matrix indexing and selection of elements from a matrix 13 00:00:52,060 --> 00:00:52,260 . 14 00:00:52,350 --> 00:00:55,350 We learned that we could just say something like Matt. 15 00:00:55,650 --> 00:01:00,750 And then the Rose who want it and the columns he wanted were actually going to be able to do the same 16 00:01:00,750 --> 00:01:02,550 thing with data frames. 17 00:01:02,550 --> 00:01:05,680 So imagine that we wanted everything from the first row. 18 00:01:05,790 --> 00:01:10,870 We would just input row number one comma and then blank in the killing. 19 00:01:11,010 --> 00:01:14,980 All the columns from the first row and that would actually return. 20 00:01:15,000 --> 00:01:20,330 Likewise the first row of our data frame and notice how the labels of the columns are kept in the results 21 00:01:20,330 --> 00:01:20,920 . 22 00:01:20,970 --> 00:01:24,900 They can also do the same thing to get everything from the first column. 23 00:01:25,080 --> 00:01:32,700 So I can say something like blank comma 1 instead of that square bracket notation and slow return the 24 00:01:32,700 --> 00:01:36,590 days notice that the levels are included here. 25 00:01:37,620 --> 00:01:42,600 Now something to note here is that one of the main points of using data frames is actually being able 26 00:01:42,600 --> 00:01:49,050 to label your columns and your rows by some sort of character or string names such as days temp or rain 27 00:01:49,050 --> 00:01:49,940 . 28 00:01:50,040 --> 00:01:55,230 We're going to be able to actually index using those names instead of having to remember which column 29 00:01:55,230 --> 00:01:57,270 numbers are which row numbers you want. 30 00:01:57,480 --> 00:01:59,220 Let's go ahead and show how we can do this. 31 00:01:59,340 --> 00:02:01,530 Going to go ahead and clear this. 32 00:02:01,530 --> 00:02:03,170 We have our data frame. 33 00:02:03,210 --> 00:02:07,240 Let's go ahead and select all range values from the rain column. 34 00:02:07,380 --> 00:02:15,330 I can say Blinkx comma and then input the string rain instead of inputting the number three for the 35 00:02:15,330 --> 00:02:21,230 third index column and that's going to return all the values of the rain call. 36 00:02:21,300 --> 00:02:26,770 For example let's say I wanted to get the first five rows for days and temps. 37 00:02:27,030 --> 00:02:33,600 Well I can say D-F if I want the first five rows That's actually all the rows in this case going to 38 00:02:33,600 --> 00:02:36,260 go ahead in the use slicing notation here. 39 00:02:36,300 --> 00:02:41,710 One colon five comma and then I want the columns days and temp. 40 00:02:41,730 --> 00:02:45,640 So I want all five row values for the columns days and temp. 41 00:02:45,780 --> 00:02:54,540 I can go ahead and pass and a vector is in the combined function with days and temp inside of it. 42 00:02:55,260 --> 00:03:00,350 And there you have that subsection of the data frame with just those two columns. 43 00:03:01,140 --> 00:03:05,030 We also have the ability to just grab all the values of a particular column. 44 00:03:05,130 --> 00:03:08,250 Now we can do this in a couple of ways. 45 00:03:08,280 --> 00:03:11,940 One way we can do this is by using the dollar sign notation. 46 00:03:11,940 --> 00:03:17,480 So you put the name of your data frame a dollar sign and then the name of the column. 47 00:03:17,670 --> 00:03:22,950 And you'll notice that our studio actually provides little clips for the column names. 48 00:03:23,160 --> 00:03:27,430 So I can just go ahead and say days will return all the values for days. 49 00:03:27,660 --> 00:03:32,630 The dollar sign and I begin to get the other column name option's temp etc.. 50 00:03:32,960 --> 00:03:36,240 We can also use bracket notation to do the same thing. 51 00:03:36,270 --> 00:03:42,140 I can put India off and then just pass in a string of the column I want. 52 00:03:42,150 --> 00:03:44,050 Notice how this is different though. 53 00:03:44,400 --> 00:03:50,040 If I use the bracket notation I'm actually getting the data returns in a data frame format. 54 00:03:50,040 --> 00:03:58,230 So I keep this index notation with the column days and the row number values if I use the dollar sign 55 00:03:58,230 --> 00:04:01,130 format actually getting back a vector. 56 00:04:01,170 --> 00:04:07,050 So keep that in mind when you're using this notation to callback columns completely. 57 00:04:07,050 --> 00:04:07,550 All right. 58 00:04:07,620 --> 00:04:13,020 Let's go ahead and learn how we can use the subset function to grab a subset of values from our data 59 00:04:13,030 --> 00:04:15,280 Freyne based off some condition. 60 00:04:15,610 --> 00:04:19,120 For example imagine I wanted to grab the days where it rained. 61 00:04:19,170 --> 00:04:24,540 That means we want to grab the days where rain is equal to true but use a subset condition to do that 62 00:04:25,440 --> 00:04:29,450 you put in subset you pass in your data frame source. 63 00:04:29,450 --> 00:04:31,890 So in our case it's día. 64 00:04:32,910 --> 00:04:39,350 Then you pass in the subset argument and you go ahead and pasand some sort of condition statement that 65 00:04:39,380 --> 00:04:43,220 these conditions statements pretty much always use comparison operators. 66 00:04:43,440 --> 00:04:47,580 And we want our condition to be rain is equal to true. 67 00:04:47,580 --> 00:04:51,780 Notice that rain doesn't actually have to be a character string. 68 00:04:51,870 --> 00:04:56,190 You can just put in range without any quotes and that's because subset knows that you're going to be 69 00:04:56,190 --> 00:05:01,580 referring to column names in the court and subset. 70 00:05:01,580 --> 00:05:08,090 If we go ahead and do that we get back a data frame where the values of rain was equal to true again 71 00:05:08,110 --> 00:05:08,190 . 72 00:05:08,220 --> 00:05:12,960 Notice how the condition uses some sort of comparison operator in the above case it was the equality 73 00:05:12,960 --> 00:05:13,700 operator. 74 00:05:13,800 --> 00:05:15,130 Equals equals. 75 00:05:15,480 --> 00:05:20,310 Let's go ahead and show another example where we grab days where the temperature was greater than 23 76 00:05:20,310 --> 00:05:21,960 degrees. 77 00:05:22,170 --> 00:05:30,120 Again I say subset pass in my data frame source put in subset as my argument and then put in some sort 78 00:05:30,120 --> 00:05:31,040 of condition. 79 00:05:31,170 --> 00:05:39,370 This case I want temp column to be greater than 23 so return all instances where that's true. 80 00:05:39,660 --> 00:05:40,470 And there you have it. 81 00:05:40,470 --> 00:05:44,320 It's rows 4 and 5 and it's returned as a data frame. 82 00:05:45,030 --> 00:05:51,300 OK finally let's learn about ordering a data from we can sort the order of our data frame by using the 83 00:05:51,390 --> 00:05:56,250 order function you pass in the column you want to sort by into the order function. 84 00:05:56,430 --> 00:05:59,550 Then you use that vector to select from the data frame. 85 00:05:59,550 --> 00:06:03,000 Let's go ahead and see an example of this. 86 00:06:03,320 --> 00:06:09,430 I'm going to make a variable called sorted temp. 87 00:06:10,080 --> 00:06:17,960 I'm going to go ahead and call the order function and then Pessin ZF. 88 00:06:18,090 --> 00:06:28,740 Let's go ahead and pasand DFA temp in if I call sort of temp. 89 00:06:28,740 --> 00:06:29,490 All right. 90 00:06:29,490 --> 00:06:36,180 Notice how I get back the index notation for the temp vector. 91 00:06:36,180 --> 00:06:41,510 Basically what this means is it's the sorted temperatures by index. 92 00:06:41,550 --> 00:06:49,030 So temperature 21 should be sorted first if you're going to sort them from least to greatest. 93 00:06:49,050 --> 00:06:53,650 Then it goes twenty two point two and the remaining three were already in order. 94 00:06:53,670 --> 00:06:58,160 Let me go ahead and show you how you can use this sort of vector to actually sort your data frame. 95 00:06:58,340 --> 00:07:08,220 You would just go DMF pass in sort of temp comma and this will go ahead and sort your data frame by 96 00:07:08,220 --> 00:07:17,010 those temperature values and you can notice now how these rows values were connected to sort of temp 97 00:07:17,010 --> 00:07:17,970 . 98 00:07:18,300 --> 00:07:24,000 When we think of what's actually going on here we're essentially just asking for the index elements 99 00:07:24,090 --> 00:07:26,760 in that order and by default it's ascending. 100 00:07:26,760 --> 00:07:30,570 You can also pass a negative sign to do a descending order. 101 00:07:30,600 --> 00:07:33,060 Let me go ahead and show you how we can do that. 102 00:07:33,110 --> 00:07:45,150 All say sending 10 order that a negative term. 103 00:07:45,270 --> 00:07:48,010 Take a look at the ascending temp. 104 00:07:48,210 --> 00:07:50,290 Now it's 5 4 3 1 2. 105 00:07:50,430 --> 00:07:59,310 And when we pass that into our data frame the sending temp column or comics Scuse me because all the 106 00:07:59,310 --> 00:08:01,170 columns. 107 00:08:01,170 --> 00:08:05,310 Now we get a descending temperature again with this sort of notation. 108 00:08:05,310 --> 00:08:07,680 All you're doing is you're basically just asking. 109 00:08:07,800 --> 00:08:15,660 Give me back the rows of the data frame in this order 5 4 3 1 2 because you use the order function to 110 00:08:15,660 --> 00:08:17,900 sort by temperature. 111 00:08:17,910 --> 00:08:22,800 You also could have done this exact operation instead of using the bracket notation here by using the 112 00:08:22,800 --> 00:08:26,200 dollar sign notation to show you a quick example that. 113 00:08:26,250 --> 00:08:35,110 Let's go ahead and just say the sending temp order E-F. 114 00:08:35,890 --> 00:08:42,480 Tim I will go ahead and put a negative sign up front that is sending and then we'll actually get the 115 00:08:42,480 --> 00:08:47,170 same thing when we call for the sending as the data frame. 116 00:08:47,190 --> 00:08:52,650 All right that's it for the basics of selection and indexing of a data frame in the next lecture. 117 00:08:52,650 --> 00:08:58,200 We're going to be doing a complete overview of the various operations and capabilities of a data frame 118 00:08:58,200 --> 00:08:58,830 . 119 00:08:58,890 --> 00:09:04,200 Remember for the next lecture there's an entire notebook ready for you to reference almost as a cheat 120 00:09:04,200 --> 00:09:06,090 sheet for use in data frames. 121 00:09:06,150 --> 00:09:09,930 The next lecture is actually going to be one of the most fundamental lectures as far as learning how 122 00:09:09,930 --> 00:09:12,390 to use data frames with our. 123 00:09:12,390 --> 00:09:15,200 OK thanks everyone and I'll see you at the next lecture 13109

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.