All language subtitles for 2022_lecture5-720p-en

[MUSIC PLAYING]
SPEAKER 1: All right. This is CS50. And this is already week 5, which means this is actually our last week in C together. In fact, in just a few days' time, what has looked like this, and much more cryptic than this perhaps, is going to be distilled into something much simpler next week, when we transition to a language called Python. And with Python, we'll still have our conditionals, and loops, and functions, and so forth. But a lot of the low-level plumbing that you might have been wrestling with, struggling with, frustrated by, over the past couple of weeks, especially now that we've introduced pointers, and it feels like you probably have to do everything yourself. In Python, and in a lot of higher-level languages, so to speak, more modern, more recent languages, you'll be able to do so much more with just single lines of code. And indeed, we're going to start leveraging libraries all the more, code that other people wrote. Frameworks, which are collections of libraries that other people wrote.
And on top of all that, you will be able to make even better, grander, more impressive projects that actually solve problems of particular interest to you, particularly by way of your own final project.
So last week though, in week 4, recall that we focused on memory. And we've been treating this memory inside of your computer as like a canvas, right? At the end of the day, it's just zeros and ones, or bytes, really. And it's really up to you what you do with those bytes, and how you interconnect them, how you represent information on them. And arrays were like one of the simplest ways we started playing around with that memory. Just contiguous chunks of memory, back to back to back. But let's consider, for a moment, some of the problems that pretty quickly arise with arrays. And then, today, focus on what more generally are called data structures. Using your computer's memory as a much more versatile canvas, to create even two-dimensional structures, to represent information, and, ultimately, to solve more interesting problems.
So here's an array of size 3. Maybe the size of 3 integers. And suppose that this is inside of a program.
And at this point in the story, you've got 3 numbers in it already: 1, 2, and 3. And suppose, whatever the context, you need to now add a fourth number to this array. Like, the number 4. Well, instinctively, where should the number 4 go? If this is your computer's memory and we currently have this array 1, 2, 3, from, what, left to right, where should the number 4 just, perhaps, naively go? Yeah, what do you think?
AUDIENCE: Replace number 1.
SPEAKER 1: Sorry?
AUDIENCE: Replace number 1.
SPEAKER 1: Oh, OK. So you could replace number 1. I don't really like that, though, because I'd like to keep number 1 around. But that's an option. But I'm losing, of course, information. So what else could I do if I want to add the number 4? Over there?
AUDIENCE: On the right side of 3.
SPEAKER 1: Yeah. So, I mean, it feels like if there's some ordering to these, which seems kind of a reasonable inference, that it probably belongs somewhere over here.
But recall last week, as we started poking around a computer's memory, there's other stuff potentially going on. And if I fill that in, ideally, we'd want to just plop the number 4 here, if we're maintaining this kind of order. But recall, in the context of your computer's memory, there might be other stuff there. Some of these garbage values that might be usable, but we don't really know or care what they are, as represented by Oscar here. But there might actually be useful data in use. Like, if your program has not just a few integers in this array, but also a string that says, like, "Hello, world," it could be that your computer has plopped the H-E-L-L-O W-O-R-L-D right after this array. Why? Well, maybe you created the array in one line of code and filled it with 1, 2, 3. Maybe the next line of code used get_string. Or maybe just hard-coded a string in your code for "Hello, world." And so you painted yourself into a corner, so to speak. Now, I think you might claim, well, let's just overwrite the H. But that's problematic for the same reasons. We don't want to do that. So where else could the 4 go?
Or how do we solve this problem if we want to add a number, and there's clearly memory available? Because those garbage values are junk that we don't care about anymore. So we could certainly reuse those. Where could the 4, and perhaps this whole array, go?
OK. So I'm hearing we could move it somewhere. Maybe replace some of those garbage values. And honestly, we have a lot of options. We could use any of these garbage values up here. We could use any of these down here, or even further down. The point is there is plenty of memory available, as indicated by these Oscars, where we could put 4, maybe even 5, 6, or more integers. The catch is that we chose poorly early on. Or we just got unlucky. And 1, 2, 3 ended up back to back with some other data that we care about.
All right, so that's fine. Let's go ahead and assume that we'll abstract away everything else. And we'll plop the new array in this location here. So I'm going to go ahead and copy the 1 over. The 2 over. The 3 over. And then, ultimately, once I'm ready to fill the 4, I can throw away, essentially, the old array at this point.
Because I have it now entirely in duplicate. And I can populate it with the number 4. All right. So problem solved. That is a correct potential solution to this problem. But what's the trade-off? And this is something we're going to start thinking about all the more. What's the downside of having solved this problem in this way?
Yeah. I'm adding a lot of running time. It took me a lot of effort to copy those additional numbers. Now, granted, it's a small array. 3 numbers, who cares? It's going to be over in the blink of an eye. But if we start talking about interesting data sets, web application data sets, mobile app data sets, where you have not just a few, but maybe a few hundred, few thousand, a few million pieces of data, this is probably a suboptimal solution to just, oh, move all your data from one place to another. Because who's to say that we're not going to paint ourselves into a new corner? And it would feel like you're wasting all of this time moving stuff around and, ultimately, just costing yourself a huge amount of time.
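The move-everything approach described above can be sketched in C roughly as follows. This is a minimal sketch rather than code from the lecture: the function name `append_by_copying` and the `copies` counter are hypothetical, added only to make the cost of the copy visible.

```c
#include <stdlib.h>

// Hypothetical helper (not from the lecture): builds a new heap array that
// holds the n old values plus `value` at the end. Returns NULL if malloc fails.
// *copies is set to the number of existing elements that had to be moved.
int *append_by_copying(const int *old, size_t n, int value, size_t *copies)
{
    int *bigger = malloc((n + 1) * sizeof(int));
    if (bigger == NULL)
    {
        return NULL;
    }

    *copies = 0;
    for (size_t i = 0; i < n; i++)
    {
        bigger[i] = old[i]; // one step per existing element
        (*copies)++;
    }

    bigger[n] = value; // the single value we actually wanted to add
    return bigger;
}
```

Adding the number 4 to 1, 2, 3 this way moves three values in order to store one; with a million values already in the array, the same insertion would move a million.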
In fact, if we put this now into the context of our Big O notation from a few weeks back, what might the running time now of Search be for an array? Let's start simple. A throwback a couple of weeks ago. If you're using an array, to recap, what was the running time of a Search algorithm in Big O notation? So, maybe, in the worst case. If you've got n numbers, 3 in this case or 4, but n more generally. Big O of what for Search? Yeah. What do you think?
AUDIENCE: Big O of n.
SPEAKER 1: Big O of n. And what's your intuition for that?
AUDIENCE: [INAUDIBLE].
SPEAKER 1: OK. Yeah. So if we go through each element, for instance, from left to right, then Search is going to take a Big O of n running time. If, though, we're talking about these numbers, specifically, and now I'll explicitly stipulate that, yeah, they're sorted, does that buy us anything? What would the Big O notation be for searching an array in this case, be it of size 3, or 4, or n, more generally?
AUDIENCE: Big O of n.
SPEAKER 1: Big O of, not n, but rather?
AUDIENCE: Log n.
SPEAKER 1: Log n, right. Because we could use, per week 0, binary search on an array like this. We'd have to deal with some rounding, because there's not a perfect number of elements at the moment. But you could use binary search. Go to the middle, roughly. And then go left or right, left or right, until you find the element you care about. So Search remains in Big O of log n when using arrays.
But what about insertion, now? If we start to think about other operations. Like, adding a number to this array, or adding a friend to your contacts app, or Google finding another page on the internet. So insertion happens all the time. What's the running time of Insert, when it comes to inserting into an existing array of size n? How many steps might that take?
Big O of n. It would be, indeed, n. Why? Because in the worst case, where you're out of space, you have to allocate, it would seem, a new array. Maybe taking over some of the previous garbage values.
But the catch is, even though you're only inserting one new number, like the number 4, you have to copy over all the darn existing numbers into the new one. So if your original array is of size n, the copying of that is going to take Big O of n plus 1. But we can throw away the plus 1 because of the math we did in the past. So Insert now becomes Big O of n. And that might not be ideal. Because if you're in the habit of inserting things frequently, that could start to add up, and add up, and add up. And this is why computer programs, and websites, and mobile apps could be slow, if you're not being mindful of these trade-offs.
So what about, just for good measure, Omega notation? And maybe the best case. Well, just to recap here, we could get lucky and Search could just take one step. Because you might just get lucky, and boom, the number you're looking for is right there in the middle, if using binary search. Or even linear search, for that matter. And Insert, too. If there's enough room, and we didn't have to move all of those numbers, 1, 2, and 3, to a new location. You could get lucky.
And we could have, as someone suggested, just put the number 4 right there at the end. And if we don't get lucky, it might take n steps. If we do get lucky, it might just take the one, or constant number, of steps.
In fact, let me go ahead and do this. How about we do something like this? Let me switch over to some code here. Let me start to make a program called list.c. And in list.c, let's start with the old way. So we follow the breadcrumbs we've laid for ourselves as follows. So in this list.c, I'm going to include stdio.h. int main(void), as usual. Then inside of my code here, I'm going to go ahead and give myself the first version of memory. So int list[3] is now implemented, at the moment, in an array. So we're rewinding for now to week 2 style code. And then, let me just initialize this thing. At the first location will be 1. At the next location will be 2. And at the last location will be 3. So the array is zero-indexed, always. I, for just the sake of discussion though, am putting in the numbers 1, 2, 3, like a normal person might. All right. So now let's just print these out.
for int i gets 0, i less than 3, i++. Let's go ahead now and print out using printf, %i\n, list[i]. So very simple program, inspired by what we did in week 2, just to create and then print out the contents of an array. So let's make list. So far, so good. ./list. And voila, we see 1, 2, 3.
Now let's start to practice some of what we're preaching with this new syntax. So let me go in now and get rid of the array version. And let me zoom out a little bit to give ourselves some more space. And now let's begin to create a list of size 3. So if I'm going to do this now, dynamically, so that I'm allocating these things again and again, let me go ahead and do this. Let me give myself a list that's of type int* equal the return value of malloc of 3 times the size of an int. So what this is going to do for me is give me enough memory for that very first picture we drew on the board, which was the array containing 1, 2, and 3. But laying the foundation to be able to resize it, which was ultimately the goal. So my syntax is a little different here. I'm going to use malloc and get memory from the so-called "heap", as we called it last week.
Instead of using the stack by just doing the previous version where I said, int list[3]. That is to say, this line of code from the first version is in some sense identical to this line of code in the second version. But the first line of code puts the memory on the stack, automatically, for me. The second line of code, that I've left here now, is creating an array of size 3, but it's putting it on the heap. And that's important because it was only on the heap, and via this new function last week, malloc, that you can actually ask for more memory, and even give it back. When you just use the first notation, int list[3], you have permanently given yourself an array of size 3. You cannot add to that in code.
So let me go ahead and do this. If list == NULL, something went wrong. The computer's out of memory. So let's just return 1 and quit out of this program. There's nothing to see here. So just a good error check there. Now let me go ahead and initialize this list. So list[0] will be 1 again. list[1] will be 2. And list[2] will be 3. So that's the same kind of syntax as before.
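The two versions of the list described so far can be set side by side. This is a sketch under one assumption: in the lecture both versions are typed directly into main in list.c, whereas here they are wrapped in two functions with made-up names, `print_stack_list` and `make_heap_list`, so that they can sit in one file.

```c
#include <stdio.h>
#include <stdlib.h>

// First version: the array lives on the stack and its size is fixed at 3.
void print_stack_list(void)
{
    int list[3];
    list[0] = 1;
    list[1] = 2;
    list[2] = 3;

    for (int i = 0; i < 3; i++)
    {
        printf("%i\n", list[i]);
    }
}

// Second version: the same three values, but the memory comes from the heap.
// Returns NULL if malloc fails; otherwise the caller must eventually free it.
int *make_heap_list(void)
{
    int *list = malloc(3 * sizeof(int));
    if (list == NULL)
    {
        return NULL; // out of memory, nothing to see here
    }

    // Square brackets work on the malloc'd chunk just as they do on an array
    list[0] = 1;
    list[1] = 2;
    list[2] = 3;
    return list;
}
```

Only the second version leaves room to ask for more memory later, or to hand the memory back with free.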
And notice this equivalence. Recall that there's this relationship between chunks of memory and arrays. And arrays are really just doing pointer arithmetic for you, which is what the square bracket notation is. So if I've asked myself here, in line 5, for enough memory for 3 integers, it is perfectly OK to treat it now like an array using square bracket notation. Because the computer will do the arithmetic for me and find the first location, the second, and the third. If you really want to be cool and hacker-like, well, you could say *list = 1, *(list + 1) = 2, *(list + 2) = 3. That's the same thing using very explicit pointer arithmetic, which we looked at briefly last week. But this is atrocious to look at for most people. It's just not very user friendly. It's longer to type, so most people, even when allocating memory dynamically as I did a second ago, would just use the more familiar notation of an array.
All right. So let's go on. Now suppose time passes and I realize, oh shoot, I really wanted this array to be of size 4 instead of size 3. Now, obviously, I could just rewind and, like, fix the program. But suppose that this is a much larger program.
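The equivalence between square brackets and pointer arithmetic mentioned above can be checked with a short sketch. The function name `make_list_with_pointers` is hypothetical; the three assignments inside it are the explicit form of the same three assignments written with square brackets.

```c
#include <stdlib.h>

// Hypothetical helper: fills a heap chunk of 3 ints using explicit pointer
// arithmetic. *(list + i) refers to the same int as list[i].
int *make_list_with_pointers(void)
{
    int *list = malloc(3 * sizeof(int));
    if (list == NULL)
    {
        return NULL;
    }

    *list = 1;       // same as list[0] = 1
    *(list + 1) = 2; // same as list[1] = 2
    *(list + 2) = 3; // same as list[2] = 3
    return list;
}
```

Reading the values back with list[0], list[1], and list[2] gives 1, 2, and 3, since both notations name the same locations.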
And I've realized, at this point, that I need to be able to dynamically add more things to this array for whatever reason. Well, let me go ahead and do this. Let me just say, all right, list should actually be the result of asking for 4 chunks of memory from malloc. And then, I could do something like this, list[3] = 4. Now this is buggy, potentially, in a couple of ways. But let me ask first, what's really wrong, first, with this code? The goal at hand is to start with the array of size 3 with the 1, 2, 3. And I want to add a number 4 to it. So at the moment, in line 17, I've asked the computer for a chunk of 4 integers. Just like the picture. And then I'm adding the number 4 to it. But I have skipped a few steps and broken this somehow. Yeah.
AUDIENCE: You don't know exactly [INAUDIBLE].
SPEAKER 1: Yeah. I don't necessarily know where this is going to end up in memory. It's probably not going to be immediately adjacent to the previous chunk. And so, yes, even though I'm putting the number 4 there, I haven't copied the 1, the 2, or the 3 over to this chunk of memory.
So, well, let me fix-- well, that's actually, indeed, really the essence of the problem. I am orphaning the original chunk of memory. If you think of the picture that I drew earlier, the line of code up here on line 5 that allocates space for the initial 3 integers, this code is fine. This code is fine. But as soon as I do this, I'm clobbering the value of list. And saying no, don't point at this chunk of memory. Point at this chunk of memory, at which point I've forgotten, if you will, where the original chunk of memory is.
So the right way to do something like this would be a little more involved. Let me go ahead and give myself a temporary variable. And I'll literally call it tmp. T-M-P, like I did last week. So that I can now ask the computer for a completely different chunk of memory of size 4. I'm going to again say if tmp equals NULL, I'm going to say bad things happened here. So let me just return 1. And you know what, just to be tidy, let me free the original list before I quit. Because remember from last week, any time you use malloc, you eventually have to use free. But this chunk of code here is just a safety check.
If there's no more memory, there's nothing to see here. I'm just going to clean up my state and quit. But now, if I have asked for this chunk of memory, now I can do this: for int i gets 0, i is less than 3, i++. What if I do something like this? tmp[i] equals list[i]. That would seem to have the effect of copying all of the memory from one to the other. And then, I think I need to do one last thing: tmp[3] gets the number 4, for instance. Again, I'm hard-coding the numbers for the sake of discussion. After I've done this, what could I now do? I could now set list equal to tmp. And now, I have updated my list properly. So let me go ahead and do this. for int i gets 0, i is less than 4, i++. Let me go ahead and print each of these elements out with %i using list[i]. And then, I'm going to return 0 just to signify that all is successful.
Now, so to recap, we initialize the original array of size 3 and plug in the values 1, 2, 3. Time passes. And then, I realize, wait a minute, I need more space. And so I ask the computer for a second chunk of memory. This one of size 4.
390 00:17:41,800 --> 00:17:44,467 Just as a safety check, I make sure that TMP doesn't equal null. 391 00:17:44,467 --> 00:17:46,008 Because if it does I'm out of memory. 392 00:17:46,008 --> 00:17:47,590 So I should just quit altogether. 393 00:17:47,590 --> 00:17:50,110 But once I'm sure that it's not null, I'm 394 00:17:50,110 --> 00:17:55,450 going to copy all the values from the old list into the new list. 395 00:17:55,450 --> 00:17:58,910 And then, I'm going to add my new number at the end of that list. 396 00:17:58,910 --> 00:18:02,410 And then, now that I'm done playing around with this temporary variable, 397 00:18:02,410 --> 00:18:05,860 I'm going to remember in my list variable what 398 00:18:05,860 --> 00:18:07,900 the address is of this new chunk of memory. 399 00:18:07,900 --> 00:18:10,570 And then, I'm going to print all of those values out. 400 00:18:10,570 --> 00:18:14,350 So at least, aesthetically, when I make this new version of my list, 401 00:18:14,350 --> 00:18:16,660 except for my missing semicolon. 402 00:18:16,660 --> 00:18:17,590 Let me try this again. 403 00:18:17,590 --> 00:18:19,480 When I make list-- oh, OK. 404 00:18:19,480 --> 00:18:20,620 What did I do this time? 405 00:18:20,620 --> 00:18:23,290 Implicitly declaring a library function malloc. 406 00:18:23,290 --> 00:18:27,749 What's my mistake any time you see that kind of error? 407 00:18:27,749 --> 00:18:28,510 AUDIENCE: Library. 408 00:18:28,510 --> 00:18:28,800 SPEAKER 1: Yeah. 409 00:18:28,800 --> 00:18:29,380 A library. 410 00:18:29,380 --> 00:18:34,700 So up here, I forgot to do include stdlib.h, which is where malloc lives. 411 00:18:34,700 --> 00:18:36,490 Let me go ahead and, again, do make list. 412 00:18:36,490 --> 00:18:37,250 There we go. 413 00:18:37,250 --> 00:18:38,950 So I fixed that. ./list. 414 00:18:38,950 --> 00:18:41,829 And I should see 1, 2, 3, 4. 415 00:18:41,829 --> 00:18:45,640 But there's still a bug here.
416 00:18:45,640 --> 00:18:48,310 Does anyone see the-- bug or question? 417 00:18:48,310 --> 00:18:50,100 AUDIENCE: You forgot to free them. 418 00:18:50,100 --> 00:18:50,790 SPEAKER 1: I'm sorry, say again. 419 00:18:50,790 --> 00:18:52,470 AUDIENCE: You forgot to free them. 420 00:18:52,470 --> 00:18:54,570 SPEAKER 1: I forgot to free the original list. 421 00:18:54,570 --> 00:18:58,170 And we could see this, even if not just with our own eyes or intuition. 422 00:18:58,170 --> 00:19:00,847 If I do something like Valgrind of ./list, 423 00:19:00,847 --> 00:19:02,430 remember our tool from this past week. 424 00:19:02,430 --> 00:19:05,310 Let me increase the size of my terminal window, temporarily. 425 00:19:05,310 --> 00:19:07,540 The output is crazy cryptic at first. 426 00:19:07,540 --> 00:19:12,780 But, notice that I have definitely lost some number of bytes here. 427 00:19:12,780 --> 00:19:15,150 And indeed, it's even pointing at the line number 428 00:19:15,150 --> 00:19:16,930 in which some of those bytes were lost. 429 00:19:16,930 --> 00:19:18,930 So let me go ahead and go back to my code. 430 00:19:18,930 --> 00:19:23,610 And indeed, I think what I need to do is, before I clobber the value of list 431 00:19:23,610 --> 00:19:27,150 pointing it at this new chunk of memory instead of the old, 432 00:19:27,150 --> 00:19:29,910 I think I now need to first, proactively, 433 00:19:29,910 --> 00:19:32,460 say free the old list of memory. 434 00:19:32,460 --> 00:19:34,480 And then, change its value. 435 00:19:34,480 --> 00:19:39,250 So if I now do make list and do ./list, the output is still the same. 436 00:19:39,250 --> 00:19:42,450 And, if I cross my fingers and run Valgrind again 437 00:19:42,450 --> 00:19:46,440 after increasing my window size, hopefully here. 438 00:19:46,440 --> 00:19:48,160 Oh, still a bug. 439 00:19:48,160 --> 00:19:49,080 So better. 440 00:19:49,080 --> 00:19:52,020 It seems like less memory is lost.
441 00:19:52,020 --> 00:19:54,450 What have I now forgotten to do? 442 00:19:54,450 --> 00:19:56,430 AUDIENCE: You forgot to free the end. 443 00:19:56,430 --> 00:19:58,740 SPEAKER 1: I forgot to free it at the very end, too. 444 00:19:58,740 --> 00:20:01,560 Because I still have a chunk of memory that I got from malloc. 445 00:20:01,560 --> 00:20:04,200 So let me go to the very bottom of the program now. 446 00:20:04,200 --> 00:20:09,330 And after I'm done senselessly just printing this thing out, 447 00:20:09,330 --> 00:20:12,450 let me free the new list. 448 00:20:12,450 --> 00:20:15,780 And now let me do make list, ./list. 449 00:20:15,780 --> 00:20:17,670 It still works, visually. 450 00:20:17,670 --> 00:20:22,200 Now let's do Valgrind of ./list, Enter. 451 00:20:22,200 --> 00:20:25,530 And now, hopefully, all heap blocks were freed. 452 00:20:25,530 --> 00:20:27,018 No leaks are possible. 453 00:20:27,018 --> 00:20:30,060 So this is perhaps the best output you can see from a tool like Valgrind. 454 00:20:30,060 --> 00:20:32,950 I used the heap, but I freed all the memory as well. 455 00:20:32,950 --> 00:20:34,630 So there were 2 fixes needed there. 456 00:20:34,630 --> 00:20:35,130 All right. 457 00:20:35,130 --> 00:20:38,910 Any questions then on this array-based approach, the first of which 458 00:20:38,910 --> 00:20:41,530 is statically allocating an array, so to speak. 459 00:20:41,530 --> 00:20:43,230 By just hard coding the number 3. 460 00:20:43,230 --> 00:20:47,190 The second version now is dynamically allocating the array, 461 00:20:47,190 --> 00:20:49,380 using not the stack but the heap. 462 00:20:49,380 --> 00:20:52,800 But, it too, suffers from the slowness we described earlier, 463 00:20:52,800 --> 00:20:55,290 of having to copy all those values from one to the other. 464 00:20:55,290 --> 00:20:55,790 OK. 465 00:20:55,790 --> 00:20:57,183 A hand was over here. 466 00:20:57,183 --> 00:20:59,858 AUDIENCE: Why do you not have to free the TMP?
467 00:20:59,858 --> 00:21:00,900 SPEAKER 1: Good question. 468 00:21:00,900 --> 00:21:02,820 Why did I not have to free the TMP? 469 00:21:02,820 --> 00:21:05,130 I essentially did eventually. 470 00:21:05,130 --> 00:21:10,360 Because TMP was pointing at the chunk of 4 integers. 471 00:21:10,360 --> 00:21:15,810 But on line 33 here, I assigned list to be 472 00:21:15,810 --> 00:21:18,580 identical to what TMP was pointing at. 473 00:21:18,580 --> 00:21:23,173 And so, when I finally freed the list, that was the same thing as freeing TMP. 474 00:21:23,173 --> 00:21:26,340 In fact, if I wanted to, I could say free TMP here and it would be the same. 475 00:21:26,340 --> 00:21:28,080 But conceptually, it's wrong. 476 00:21:28,080 --> 00:21:32,130 Because at this point in the story, I should be freeing the actual list, not 477 00:21:32,130 --> 00:21:33,240 that temporary variable. 478 00:21:33,240 --> 00:21:35,340 But they were the same at that point in the story. 479 00:21:35,340 --> 00:21:35,840 Yeah. 480 00:21:35,840 --> 00:21:37,878 AUDIENCE: Is [? the line ?] part of it? 481 00:21:37,878 --> 00:21:38,920 SPEAKER 1: Good question. 482 00:21:38,920 --> 00:21:41,350 And long story short, everything we're doing thus far 483 00:21:41,350 --> 00:21:42,820 is still in the world of arrays. 484 00:21:42,820 --> 00:21:44,710 The only distinction we're making is that 485 00:21:44,710 --> 00:21:51,220 in version 1, when I said int list [3], that was an array of fixed size. 486 00:21:51,220 --> 00:21:55,150 So-called statically allocated on the stack, as per last week. 487 00:21:55,150 --> 00:21:58,900 This version now is still dealing with arrays, but I'm flexing my muscles 488 00:21:58,900 --> 00:22:00,980 and using dynamic memory allocation. 489 00:22:00,980 --> 00:22:03,498 So that I can still use an array per the first pictures 490 00:22:03,498 --> 00:22:04,540 we started talking about. 491 00:22:04,540 --> 00:22:07,070 But I can at least grow the array if I want. 
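The dynamically allocated version arrived at above can be sketched roughly as follows. This is not the exact file from the lecture: the helper name append_by_copy and its parameters are assumptions made for illustration, and unlike the lecture's main, it leaves the old array untouched on failure so the caller decides what to do. The steps themselves (ask for a bigger chunk through a temporary pointer, copy, add the new number, free the old chunk, then remember the new address) follow the walkthrough.

```c
#include <stdlib.h>

// Hypothetical helper (not from the lecture): builds a new array one int
// longer than list, with value at the end. On success the old chunk is
// freed and the new address is returned. On failure it returns NULL and
// the old chunk is still allocated.
int *append_by_copy(int *list, int n, int value)
{
    // Ask for a completely different chunk of memory, one int larger
    int *tmp = malloc((n + 1) * sizeof(int));
    if (tmp == NULL)
    {
        return NULL;
    }

    // Copy numbers from old array into new array
    for (int i = 0; i < n; i++)
    {
        tmp[i] = list[i];
    }

    // Add the new number at the end of the new array
    tmp[n] = value;

    // Free the chunk that starts at the old address, before that address is forgotten
    free(list);

    // The caller remembers this address in its own list variable
    return tmp;
}
```

A caller would store the result in a temporary pointer, check it against NULL, and only then assign it to list, calling free on list once more at the very end of the program.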
492 00:22:07,070 --> 00:22:10,990 So we haven't even now solved this, even better in a sense, with linked lists. 493 00:22:10,990 --> 00:22:12,080 That's going to come next. 494 00:22:12,080 --> 00:22:12,580 Yeah. 495 00:22:12,580 --> 00:22:16,930 AUDIENCE: How are you able to free list and then still make list? 496 00:22:16,930 --> 00:22:19,720 SPEAKER 1: How am I able to free list? 497 00:22:19,720 --> 00:22:24,310 I freed the original address of list. 498 00:22:24,310 --> 00:22:27,220 I, then, changed what list is storing. 499 00:22:27,220 --> 00:22:30,070 I'm moving its arrow to a new chunk of memory. 500 00:22:30,070 --> 00:22:33,550 And that is perfectly reasonable for me to now manipulate 501 00:22:33,550 --> 00:22:37,180 because now list is pointing at the same value as TMP. 502 00:22:37,180 --> 00:22:42,610 And TMP is what was given the return value of malloc, the second time. 503 00:22:42,610 --> 00:22:44,780 So that chunk of memory is valid. 504 00:22:44,780 --> 00:22:48,220 So these are just squares on the board, right. 505 00:22:48,220 --> 00:22:49,970 There's just pointers inside of them. 506 00:22:49,970 --> 00:22:51,887 So what I'm technically saying is, and I'm not 507 00:22:51,887 --> 00:22:54,040 pointing-- I'm not freeing list per se, I am 508 00:22:54,040 --> 00:22:58,660 freeing the chunk of memory that begins at the address currently in list. 509 00:22:58,660 --> 00:23:04,060 Therefore, if a few lines later, I change what the address is in list. 510 00:23:04,060 --> 00:23:08,080 Totally reasonable to then touch that memory, and eventually free it later. 511 00:23:08,080 --> 00:23:10,390 Because you're not freeing the variable per se, 512 00:23:10,390 --> 00:23:12,790 you're freeing the address in the variable. 513 00:23:12,790 --> 00:23:13,630 Good distinction. 514 00:23:13,630 --> 00:23:14,140 All right. 515 00:23:14,140 --> 00:23:19,750 So let me back up here and now make one final edit.
516 00:23:19,750 --> 00:23:24,190 So let's finish this with one final improvement here. 517 00:23:24,190 --> 00:23:27,160 Because it turns out, there's a somewhat better way 518 00:23:27,160 --> 00:23:30,610 to actually resize an array as we've been doing here. 519 00:23:30,610 --> 00:23:35,028 And there's another function in stdlib that's called realloc, for re-allocate. 520 00:23:35,028 --> 00:23:37,570 And I'm just going to go in and make a little bit of a change 521 00:23:37,570 --> 00:23:40,578 here so that I can do the following. 522 00:23:40,578 --> 00:23:42,370 Let me go ahead and first comment this now, 523 00:23:42,370 --> 00:23:45,320 just so we can keep track of what's been going on this whole time. 524 00:23:45,320 --> 00:23:51,970 So dynamically allocate an array of size 3. 525 00:23:51,970 --> 00:23:56,650 Assign 3 numbers to that array. 526 00:23:56,650 --> 00:23:58,330 Time passes. 527 00:23:58,330 --> 00:24:03,640 Allocate new array of size 4. 528 00:24:03,640 --> 00:24:09,460 Copy numbers from old array into new array. 529 00:24:09,460 --> 00:24:14,170 And add fourth number to new array. 530 00:24:14,170 --> 00:24:15,895 Free old array. 531 00:24:18,850 --> 00:24:24,460 Remember, if you will, new array using my same list variable. 532 00:24:24,460 --> 00:24:28,960 And now, print new array. 533 00:24:28,960 --> 00:24:31,270 Free new array. 534 00:24:31,270 --> 00:24:32,260 Hopefully, that helps. 535 00:24:32,260 --> 00:24:35,530 And we'll post this code online after, too, which tells a more explicit story. 536 00:24:35,530 --> 00:24:39,220 So it turns out that we can reduce some of the labor involved with this. 537 00:24:39,220 --> 00:24:41,980 Not so much with the printing here, but with this copying. 538 00:24:41,980 --> 00:24:44,260 Turns out C does have a function called realloc, 539 00:24:44,260 --> 00:24:49,580 that can actually handle the resizing of an array for you, as follows.
540 00:24:49,580 --> 00:24:51,700 I'm going to scroll up to where I previously 541 00:24:51,700 --> 00:24:54,820 allocated a new array of size 4. 542 00:24:54,820 --> 00:25:02,020 And I'm instead going to say this, resize old array to be of size 4. 543 00:25:02,020 --> 00:25:04,477 Now, previously this wasn't necessarily possible. 544 00:25:04,477 --> 00:25:06,310 Because recall that we had painted ourselves 545 00:25:06,310 --> 00:25:08,143 into a corner with the example on the screen 546 00:25:08,143 --> 00:25:10,990 where "Hello, world" happened to be right after the original array. 547 00:25:10,990 --> 00:25:12,410 But let me do this. 548 00:25:12,410 --> 00:25:15,340 Let me use realloc, for re-allocate. 549 00:25:15,340 --> 00:25:18,640 And pass in not just the size of memory we want this time, 550 00:25:18,640 --> 00:25:22,330 but also the address that we want to resize. 551 00:25:22,330 --> 00:25:25,940 Which, again, is this array called list. 552 00:25:25,940 --> 00:25:26,440 All right. 553 00:25:26,440 --> 00:25:29,330 The code thereafter is pretty much the same. 554 00:25:29,330 --> 00:25:33,200 But what I don't need to do is this. 555 00:25:33,200 --> 00:25:36,520 So realloc is a pretty handy function that will do the following. 556 00:25:36,520 --> 00:25:39,670 If at the very beginning of class, when we had 1, 2, 3 on the board. 557 00:25:39,670 --> 00:25:43,010 And someone's instinct was to just plop the 4 right at the end of the list. 558 00:25:43,010 --> 00:25:45,760 If there's available memory, realloc will just do that. 559 00:25:45,760 --> 00:25:50,200 And boom, it will just grow the array for you in the computer's memory. 
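A minimal sketch of that realloc call, assuming a hypothetical helper named append_with_realloc (the lecture makes the change directly inside main). realloc takes the address to resize and the new size in bytes, and it returns the address of the resized chunk, which may or may not be the same address as before. If it fails, it returns NULL and the original chunk is still allocated.

```c
#include <stdlib.h>

// Hypothetical helper (not from the lecture): grows list, which currently
// holds n ints, by one int and stores value in the new last slot.
// Returns the possibly different address of the array, or NULL on failure.
int *append_with_realloc(int *list, int n, int value)
{
    // Resize old array to hold one more int. realloc either grows the
    // chunk in place or moves it, copying the old values and freeing
    // the old chunk itself.
    int *tmp = realloc(list, (n + 1) * sizeof(int));
    if (tmp == NULL)
    {
        // Out of memory: list is still valid and still owned by the caller
        return NULL;
    }

    // Add the new number to the resized array
    tmp[n] = value;
    return tmp;
}
```

Storing the result in a temporary pointer first, rather than assigning straight to list, keeps the original address around in case realloc returns NULL.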
560 00:25:50,200 --> 00:25:54,160 If, though, it realizes, sorry, there's already a string like "Hello, world" 561 00:25:54,160 --> 00:25:57,040 or something else there, realloc will handle 562 00:25:57,040 --> 00:26:00,730 the trouble of moving that whole array from 1 chunk of memory, 563 00:26:00,730 --> 00:26:03,010 originally, to a new chunk of memory. 564 00:26:03,010 --> 00:26:09,400 And then realloc will return to you, the address of that new chunk of memory. 565 00:26:09,400 --> 00:26:13,550 And it will handle the process of freeing the old chunk for you. 566 00:26:13,550 --> 00:26:15,800 So you do not need to do this yourself. 567 00:26:15,800 --> 00:26:19,130 So in fact, let me go ahead and get rid of this as well. 568 00:26:19,130 --> 00:26:24,100 So realloc just condenses, a lot of what we just did, into a single function. 569 00:26:24,100 --> 00:26:28,110 Whereby, realloc handles it for you. 570 00:26:28,110 --> 00:26:28,610 All right. 571 00:26:28,610 --> 00:26:31,670 So that's the final improvement on this array-based approach. 572 00:26:31,670 --> 00:26:34,450 So what now, knowing what your memory is, 573 00:26:34,450 --> 00:26:37,400 what can we now do with it that solves that kind of problem? 574 00:26:37,400 --> 00:26:39,320 Because the world is going to get really slow. 575 00:26:39,320 --> 00:26:42,320 And our apps, and our phones, and our computers are getting really slow, 576 00:26:42,320 --> 00:26:46,550 if we're just constantly wasting time moving things around in memory. 577 00:26:46,550 --> 00:26:48,410 What could we perhaps do instead? 578 00:26:48,410 --> 00:26:50,480 Well there's one new piece of syntax today 579 00:26:50,480 --> 00:26:53,840 that builds on these 3 pieces of syntax from the past. 580 00:26:53,840 --> 00:26:55,700 Recall, that we've looked at struct, which 581 00:26:55,700 --> 00:26:58,820 is a keyword in C, that just lets you invent your own structure. 
582 00:26:58,820 --> 00:27:02,060 Your own variable, if you will, in conjunction with typedef. 583 00:27:02,060 --> 00:27:06,200 Which lets you say a person has a name and a number, or something like that. 584 00:27:06,200 --> 00:27:08,660 Or a candidate has a name and some number of votes. 585 00:27:08,660 --> 00:27:13,040 You can encapsulate multiple pieces of data inside of just one using struct. 586 00:27:13,040 --> 00:27:17,160 What did we use the Dot Notation for now, a couple of times? 587 00:27:17,160 --> 00:27:20,468 What does the Dot operator do in C? 588 00:27:20,468 --> 00:27:21,760 AUDIENCE: Access the structure. 589 00:27:21,760 --> 00:27:22,150 SPEAKER 1: Perfect. 590 00:27:22,150 --> 00:27:24,200 To access the field inside of a structure. 591 00:27:24,200 --> 00:27:26,325 So if you've got a person with a name and a number, 592 00:27:26,325 --> 00:27:29,350 you could say something like person.name or person.number, 593 00:27:29,350 --> 00:27:31,510 if person is the name of one such variable. 594 00:27:31,510 --> 00:27:33,850 Star, of course, we've seen now in a few ways. 595 00:27:33,850 --> 00:27:37,540 Like way back in week 1, we saw it as like, multiplication. 596 00:27:37,540 --> 00:27:40,750 Last week, we began to see it in the context of pointers, 597 00:27:40,750 --> 00:27:42,970 whereby, you use it to declare a pointer. 598 00:27:42,970 --> 00:27:45,560 Like, int* p, or something like that. 599 00:27:45,560 --> 00:27:48,040 But we also saw it in one other context, which 600 00:27:48,040 --> 00:27:51,380 was like the opposite, which was the dereference operator. 601 00:27:51,380 --> 00:27:53,272 Which says if this is an address, that is 602 00:27:53,272 --> 00:27:56,230 if this is a variable like a pointer, and you put a star in front of it 603 00:27:56,230 --> 00:27:59,980 then with no int or no char, no data type in front of it. 604 00:27:59,980 --> 00:28:01,870 That means go to that address. 
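To make that recap concrete, here is a small sketch built on the person example. The field types and the helper names name_of and go_to are assumptions chosen for illustration. The pieces of syntax are the ones just listed: struct with typedef to invent a type, the dot operator to reach a field, and the star both to declare a pointer and to go to the address it holds.

```c
// struct plus typedef: a person has a name and a number
typedef struct
{
    const char *name;
    const char *number;
}
person;

// Hypothetical helper: the star in the parameter declares a pointer,
// the star in the body goes to that address, and the dot then reaches
// a field inside the structure found there
const char *name_of(person *p)
{
    return (*p).name;
}

// Hypothetical helper: the same dereference on a plain int pointer
int go_to(int *p)
{
    return *p;
}
```

The parentheses in (*p).name matter, because the dot operator binds more tightly than the star.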
605 00:28:01,870 --> 00:28:05,300 And it dereferences the pointer and goes to that location. 606 00:28:05,300 --> 00:28:07,720 So it turns out that using these 3 building blocks, 607 00:28:07,720 --> 00:28:10,760 you can actually start to now use your computer's memory almost any way 608 00:28:10,760 --> 00:28:11,260 you want. 609 00:28:11,260 --> 00:28:13,720 And even next week, when we transition to Python, 610 00:28:13,720 --> 00:28:16,360 and you start to get a lot of features for free. 611 00:28:16,360 --> 00:28:18,550 Like a single line of code will just do so much 612 00:28:18,550 --> 00:28:23,170 more in Python than it does in C. It boils down to those basic primitives. 613 00:28:23,170 --> 00:28:25,060 And just so you've seen it already. 614 00:28:25,060 --> 00:28:29,770 It turns out that it's so common in C to use this operator 615 00:28:29,770 --> 00:28:33,790 to go inside of a structure and this operator to go to an address, 616 00:28:33,790 --> 00:28:36,250 that there's shorthand notation for it, a.k.a. 617 00:28:36,250 --> 00:28:37,450 syntactic sugar. 618 00:28:37,450 --> 00:28:39,095 That literally looks like an arrow. 619 00:28:39,095 --> 00:28:41,470 So recall last week, I was in the habit of pointing, even 620 00:28:41,470 --> 00:28:42,670 with the big foam finger. 621 00:28:42,670 --> 00:28:47,020 This arrow notation, a hyphen and an angled bracket, 622 00:28:47,020 --> 00:28:53,950 denotes going to an address and looking at a field inside of it. 623 00:28:53,950 --> 00:28:56,240 But we'll see this in practice in just a bit. 624 00:28:56,240 --> 00:28:59,110 So what might be the solution, now, to this problem 625 00:28:59,110 --> 00:29:02,620 we saw a moment ago whereby, we had painted ourselves into a corner. 626 00:29:02,620 --> 00:29:05,900 And our memory, a few moments ago, looked like this. 
627 00:29:05,900 --> 00:29:10,720 We could just copy the whole existing array to a new location, add the 4, 628 00:29:10,720 --> 00:29:12,010 and go about our business. 629 00:29:12,010 --> 00:29:15,850 What would another, perhaps better solution longer term 630 00:29:15,850 --> 00:29:21,145 be, that doesn't require constantly moving stuff around? 631 00:29:21,145 --> 00:29:23,020 Maybe hang in there for your instincts if you 632 00:29:23,020 --> 00:29:27,200 know the buzz phrase we're looking for from past experience, hang in there. 633 00:29:27,200 --> 00:29:29,800 But if we want to avoid moving the 1, 2, and the 3, 634 00:29:29,800 --> 00:29:32,500 but we still want to be able to add endless amounts of data. 635 00:29:32,500 --> 00:29:33,980 What could we do? 636 00:29:33,980 --> 00:29:34,480 Yeah. 637 00:29:34,480 --> 00:29:37,390 So maybe create some kind of list using pointers that 638 00:29:37,390 --> 00:29:39,370 just point at a new location, right. 639 00:29:39,370 --> 00:29:42,490 In an ideal world, even though this piece of memory 640 00:29:42,490 --> 00:29:45,430 is being used by this h in the string "Hello, world", 641 00:29:45,430 --> 00:29:47,980 maybe we could somehow use a pointer from last week. 642 00:29:47,980 --> 00:29:52,330 Like an arrow, that says after the 3, oh I don't know, go down over here 643 00:29:52,330 --> 00:29:54,040 to this location in memory. 644 00:29:54,040 --> 00:29:58,310 And you just stitch together these integers in memory 645 00:29:58,310 --> 00:30:00,340 so that each one leads to the next. 646 00:30:00,340 --> 00:30:03,700 It's not necessarily the case that it's literally back-to-back. 647 00:30:03,700 --> 00:30:05,950 That would have the downside, it would seem, 648 00:30:05,950 --> 00:30:07,510 of costing us a little bit of space. 649 00:30:07,510 --> 00:30:10,120 Like a pointer, which recall, takes up some amount of space. 650 00:30:10,120 --> 00:30:12,400 Typically 8 bytes or 64 bits. 
651 00:30:12,400 --> 00:30:16,000 But I don't have to copy potentially a huge amount of data just 652 00:30:16,000 --> 00:30:17,440 to add one more number. 653 00:30:17,440 --> 00:30:19,278 And so these things do have a name. 654 00:30:19,278 --> 00:30:21,070 And indeed, these things are what generally 655 00:30:21,070 --> 00:30:24,820 would be called a linked list. 656 00:30:24,820 --> 00:30:27,340 A linked list captures exactly that intuition 657 00:30:27,340 --> 00:30:29,060 of linking together things in memory. 658 00:30:29,060 --> 00:30:30,530 So let's take a look at an example. 659 00:30:30,530 --> 00:30:32,322 Here's a computer's memory in the abstract. 660 00:30:32,322 --> 00:30:35,140 Suppose that I'm trying to create an array. 661 00:30:35,140 --> 00:30:38,200 Let's generalize it as a list, now, of numbers. 662 00:30:38,200 --> 00:30:39,880 An array has a very specific meaning. 663 00:30:39,880 --> 00:30:42,610 It's memory that's contiguous, back, to back, to back. 664 00:30:42,610 --> 00:30:46,240 At the end of the day, I as the programmer, just care about the data-- 665 00:30:46,240 --> 00:30:48,340 1, 2, 3, 4, and so forth. 666 00:30:48,340 --> 00:30:52,300 I don't really care how it's stored. 667 00:30:52,300 --> 00:30:54,610 I don't care how it's stored when I'm writing the code, 668 00:30:54,610 --> 00:30:56,443 I just want it to work at the end of the day. 669 00:30:56,443 --> 00:30:58,570 So suppose that I first insert my number 1. 670 00:30:58,570 --> 00:31:02,110 And, who knows, it ends up, up there at location, 0X123, 671 00:31:02,110 --> 00:31:03,320 for the sake of discussion. 672 00:31:03,320 --> 00:31:03,820 All right. 673 00:31:03,820 --> 00:31:06,070 Maybe there's something already here. 674 00:31:06,070 --> 00:31:08,110 And heck, maybe there's something already here, 675 00:31:08,110 --> 00:31:11,095 but there's plenty of other options for where this thing can go.
676 00:31:11,095 --> 00:31:12,970 And suppose that, for the sake of discussion, 677 00:31:12,970 --> 00:31:14,803 the first available spot for the next number 678 00:31:14,803 --> 00:31:20,612 happens to be over here at location 0X456, for the sake of discussion. 679 00:31:20,612 --> 00:31:22,570 So that's where I'm going to plop the number 2. 680 00:31:22,570 --> 00:31:24,070 And where might the number 3 end up? 681 00:31:24,070 --> 00:31:26,860 Oh I don't know, maybe down over there at 0X789. 682 00:31:26,860 --> 00:31:31,030 The point being, I don't know what is, or really care about, 683 00:31:31,030 --> 00:31:33,190 everything else that's in the computer's memory. 684 00:31:33,190 --> 00:31:37,240 I just care that there are at least 3 locations available where 685 00:31:37,240 --> 00:31:40,300 I can put my 1, my 2, and my 3. 686 00:31:40,300 --> 00:31:44,020 But the catch is, now that we're not using an array, 687 00:31:44,020 --> 00:31:48,370 we can't just naively assume that you just add 1 to an index and boom, 688 00:31:48,370 --> 00:31:49,510 you're at the next number. 689 00:31:49,510 --> 00:31:52,960 Add 2 to an index, and boom you're at the next, next number. 690 00:31:52,960 --> 00:31:57,370 Now you have to leave these little breadcrumbs, or use the arrow notation, 691 00:31:57,370 --> 00:31:59,680 to lead from one to the other. 692 00:31:59,680 --> 00:32:01,870 And sometimes, it might be close, a few bytes away. 693 00:32:01,870 --> 00:32:05,810 Maybe, it's a whole gigabyte away in an even bigger computer's memory. 694 00:32:05,810 --> 00:32:07,540 So how might I do this? 695 00:32:07,540 --> 00:32:12,770 Like where do these pointers go, as you proposed? 696 00:32:12,770 --> 00:32:13,270 All right. 697 00:32:13,270 --> 00:32:15,340 All I have access to here are bytes. 698 00:32:15,340 --> 00:32:17,410 I've already stored the 1, the 2, and the 3. 699 00:32:17,410 --> 00:32:19,780 So what more should I do? 700 00:32:19,780 --> 00:32:20,480 OK, yeah. 
701 00:32:20,480 --> 00:32:23,370 So let me, you put the pointers right next to these numbers. 702 00:32:23,370 --> 00:32:27,410 So let me at least plan ahead, so that when I ask the computer like malloc, 703 00:32:27,410 --> 00:32:30,470 recall from last week, for some memory, I don't just ask it now 704 00:32:30,470 --> 00:32:32,375 for space for just the number. 705 00:32:32,375 --> 00:32:34,250 Let me start getting into the habit of asking 706 00:32:34,250 --> 00:32:39,350 malloc for enough space for the number and a pointer to another such number. 707 00:32:39,350 --> 00:32:42,060 So it's a little more aggressive of me to ask for more memory. 708 00:32:42,060 --> 00:32:43,340 But I'm planning ahead. 709 00:32:43,340 --> 00:32:45,140 And here is an example of a trade off. 710 00:32:45,140 --> 00:32:48,920 Almost any time in CS, when you start using more space, you can save time. 711 00:32:48,920 --> 00:32:53,180 Or if you try to conserve space, you might have to lose time. 712 00:32:53,180 --> 00:32:54,680 It's being that trade off there. 713 00:32:54,680 --> 00:32:56,910 So how might I solve this? 714 00:32:56,910 --> 00:32:58,460 Well let me abstract this away. 715 00:32:58,460 --> 00:33:01,575 And either next to or below, I'm just drawing it vertically, just 716 00:33:01,575 --> 00:33:02,700 for the sake of discussion. 717 00:33:02,700 --> 00:33:04,670 So the arrows are a bit prettier. 718 00:33:04,670 --> 00:33:07,580 I've asked malloc for now twice as much space, 719 00:33:07,580 --> 00:33:09,590 it would seem, than I previously needed. 720 00:33:09,590 --> 00:33:13,535 But I'm going to use this second chunk of memory to refer to the next number. 721 00:33:13,535 --> 00:33:16,160 And I'm going to use this chunk of memory to refer to the next, 722 00:33:16,160 --> 00:33:17,970 essentially, stitching this thing together. 723 00:33:17,970 --> 00:33:20,030 So what should go in this first box? 724 00:33:20,030 --> 00:33:23,600 Well, I claim the number, 0X456. 
725 00:33:23,600 --> 00:33:26,300 And it's written in hex because it represents a memory address. 726 00:33:26,300 --> 00:33:30,320 But this is the equivalent of drawing an arrow from one to the other. 727 00:33:30,320 --> 00:33:34,070 As a little check here, what should go in this second box 728 00:33:34,070 --> 00:33:37,940 if the goal is to stitch these together in order 1, 2, 3? 729 00:33:37,940 --> 00:33:40,112 Feel free to just shout this out. 730 00:33:40,112 --> 00:33:41,570 AUDIENCE: 0X789. 731 00:33:41,570 --> 00:33:42,990 SPEAKER 1: OK, that worked well. 732 00:33:42,990 --> 00:33:43,915 So 0X789, indeed. 733 00:33:43,915 --> 00:33:46,790 And you can't do that with the hands because I can't count that fast. 734 00:33:46,790 --> 00:33:51,030 So 0X789 should go here because that's like a little breadcrumb to the next. 735 00:33:51,030 --> 00:33:54,290 And then, we don't really have terribly many possibilities here. 736 00:33:54,290 --> 00:33:56,960 This has to have a value, right. 737 00:33:56,960 --> 00:34:01,830 Because at the end of the day, it's got to use its 64 bits in some way. 738 00:34:01,830 --> 00:34:05,170 So what value should go here, if this is the end of this list? 739 00:34:05,170 --> 00:34:06,170 AUDIENCE: 0. 740 00:34:06,170 --> 00:34:08,270 SPEAKER 1: So it could be 0X123. 741 00:34:08,270 --> 00:34:12,050 The implication being that it would be a cyclical list. 742 00:34:12,050 --> 00:34:14,570 Which is OK, but potentially problematic. 743 00:34:14,570 --> 00:34:18,620 If any of you have accidentally lost control over your code space 744 00:34:18,620 --> 00:34:21,680 because you had an infinite loop, this would seem a very easy way 745 00:34:21,680 --> 00:34:26,330 to give yourself the accidental probability of an infinite loop. 746 00:34:26,330 --> 00:34:28,916 What might be simpler than that and ward that off? 747 00:34:28,916 --> 00:34:29,590 AUDIENCE: Null. 748 00:34:29,590 --> 00:34:30,505 SPEAKER 1: Say again? 
749 00:34:30,505 --> 00:34:31,130 AUDIENCE: Null. 750 00:34:31,130 --> 00:34:32,840 SPEAKER 1: So just the null character. 751 00:34:32,840 --> 00:34:35,540 Not N-U-L, confusingly, which is at the end of strings. 752 00:34:35,540 --> 00:34:38,550 But N-U-L-L, as we introduced it last week. 753 00:34:38,550 --> 00:34:40,580 Which is the same as 0x0. 754 00:34:40,580 --> 00:34:43,400 So this is just a special value that programmers decades ago 755 00:34:43,400 --> 00:34:47,510 decided that if you store the address 0, that's not a valid address. 756 00:34:47,510 --> 00:34:50,420 There's never going to be anything useful at 0x0. 757 00:34:50,420 --> 00:34:53,600 Therefore, it's a sentinel value, just a special value, 758 00:34:53,600 --> 00:34:54,800 that indicates that's it. 759 00:34:54,800 --> 00:34:56,870 There's nowhere further to go. 760 00:34:56,870 --> 00:35:00,470 It's OK to come back to your suggestion of making a cyclical list. 761 00:35:00,470 --> 00:35:02,390 But we'd better be smart enough to, maybe, 762 00:35:02,390 --> 00:35:06,380 remember where did the list start so that you can detect cycles. 763 00:35:06,380 --> 00:35:08,940 If you start looping around in this structure, otherwise. 764 00:35:08,940 --> 00:35:09,440 All right. 765 00:35:09,440 --> 00:35:11,640 But these addresses, who really cares at the end of the day 766 00:35:11,640 --> 00:35:12,920 if we abstract this away. 767 00:35:12,920 --> 00:35:14,820 It really just now looks like this. 768 00:35:14,820 --> 00:35:17,778 And indeed, this is how most anyone would draw this on a whiteboard 769 00:35:17,778 --> 00:35:19,070 if having a discussion at work. 770 00:35:19,070 --> 00:35:20,862 Talking about what data structure we should 771 00:35:20,862 --> 00:35:22,790 use to solve some problem in the real world. 772 00:35:22,790 --> 00:35:25,040 We don't care generally about the addresses. 773 00:35:25,040 --> 00:35:27,630 We care that in code we can access them. 
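In code, the picture of 1, 2, 3 stitched together might be sketched as below. The type name node follows the term used in this lecture, but the field names number and next and the helpers make_node, length, and free_list are assumptions chosen for illustration. Each node stores a number plus the address of the next node, and the last node stores NULL as its sentinel.

```c
#include <stdlib.h>

typedef struct node
{
    int number;         // the value itself, e.g. 1, 2, or 3
    struct node *next;  // address of the next node, or NULL at the end
}
node;

// Hypothetical helper: allocates one node holding number and pointing
// at next. Returns NULL if malloc is out of memory.
node *make_node(int number, node *next)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    n->number = number;
    n->next = next;
    return n;
}

// Hypothetical helper: follows the breadcrumbs from node to node until
// the NULL sentinel says there is nowhere further to go
int length(node *list)
{
    int count = 0;
    for (node *ptr = list; ptr != NULL; ptr = ptr->next)
    {
        count++;
    }
    return count;
}

// Hypothetical helper: frees every node, saving each next address
// before the node that holds it is freed
void free_list(node *list)
{
    while (list != NULL)
    {
        node *next = list->next;
        free(list);
        list = next;
    }
}
```

Building the list back to front, with the node for 3 first and a next of NULL, then 2 pointing at it, then 1, yields the order 1, 2, 3 without the nodes needing to be back-to-back in memory. In a cyclical list the loop in length would never reach NULL, which is the infinite-loop risk mentioned above.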
774 00:35:27,630 --> 00:35:30,590 But in terms of the concept alone this would be, perhaps, 775 00:35:30,590 --> 00:35:32,239 the right way to think about this. 776 00:35:32,239 --> 00:35:34,197 All right, let me pause here and see if there's 777 00:35:34,197 --> 00:35:38,420 any questions on this idea of creating a linked list in memory by just storing, 778 00:35:38,420 --> 00:35:42,540 not just the numbers like 1, 2, 3, but twice as much data. 779 00:35:42,540 --> 00:35:45,110 So that you have little breadcrumbs in the form of pointers 780 00:35:45,110 --> 00:35:48,510 that can lead you from one to the next. 781 00:35:48,510 --> 00:35:50,674 Any questions on these linked lists? 782 00:35:54,130 --> 00:35:54,730 Any questions? 783 00:35:54,730 --> 00:35:55,230 No? 784 00:35:55,230 --> 00:35:55,940 All right. 785 00:35:55,940 --> 00:35:56,440 Oh, yeah. 786 00:35:56,440 --> 00:35:57,431 Over here. 787 00:35:57,431 --> 00:36:02,025 AUDIENCE: So does this take more memory than an array? 788 00:36:02,025 --> 00:36:04,150 SPEAKER 1: This does take more memory than an array 789 00:36:04,150 --> 00:36:06,699 because I now need space for these pointers. 790 00:36:06,699 --> 00:36:10,670 And to be clear, I technically didn't really draw this to scale. 791 00:36:10,670 --> 00:36:13,600 Thus far, in the class, we've generally thought about integers 792 00:36:13,600 --> 00:36:16,510 like, 1, 2 and 3, as being 4 bytes, or 32 bits. 793 00:36:16,510 --> 00:36:19,540 I made the claim last week that on modern computers, pointers 794 00:36:19,540 --> 00:36:22,570 tend to be 8 bytes or 64 bits. 795 00:36:22,570 --> 00:36:25,280 So, technically, this box should actually be a little bigger. 796 00:36:25,280 --> 00:36:26,980 It was just going to look a little stupid in the picture. 797 00:36:26,980 --> 00:36:28,330 So I abstracted it away. 798 00:36:28,330 --> 00:36:31,330 But, indeed, you're using more space as a result. 799 00:36:31,330 --> 00:36:32,787 AUDIENCE: [INAUDIBLE].
800 00:36:32,787 --> 00:36:34,120 SPEAKER 1: Oh, how does-- sorry. 801 00:36:34,120 --> 00:36:37,970 How does the computer identify useful data from used data? 802 00:36:37,970 --> 00:36:40,780 So, for instance, garbage values or non-garbage values. 803 00:36:40,780 --> 00:36:43,420 For now, think of that as the job of malloc. 804 00:36:43,420 --> 00:36:46,810 So when you ask malloc for memory, as we started to last week, 805 00:36:46,810 --> 00:36:49,990 malloc keeps track of the addresses of the memory 806 00:36:49,990 --> 00:36:52,960 it has handed to you as valid values. 807 00:36:52,960 --> 00:36:55,450 The other type of memory you use, not just from the heap. 808 00:36:55,450 --> 00:36:58,390 Because recall we briefly discussed that malloc uses space 809 00:36:58,390 --> 00:37:01,390 from the heap, which was drawn at the top of the picture, pointing down. 810 00:37:01,390 --> 00:37:05,220 There's also stack memory, which is where all of your local variables go. 811 00:37:05,220 --> 00:37:07,720 And where all of the memory used by individual functions goes. 812 00:37:07,720 --> 00:37:10,053 And that was drawn in the picture as working its way up. 813 00:37:10,053 --> 00:37:12,820 That's just an artist's rendition of direction. 814 00:37:12,820 --> 00:37:16,180 The compiler, essentially, will also help 815 00:37:16,180 --> 00:37:19,868 keep track of which values are valid or not inside of the stack. 816 00:37:19,868 --> 00:37:21,910 Or really the underlying code that you've written 817 00:37:21,910 --> 00:37:23,243 will keep track of that for you. 818 00:37:23,243 --> 00:37:26,210 So it's managed for you at that point. 819 00:37:26,210 --> 00:37:26,710 All right. 820 00:37:26,710 --> 00:37:27,310 Good question. 821 00:37:27,310 --> 00:37:29,040 Sorry it took me a bit to catch on. 822 00:37:29,040 --> 00:37:31,210 So let's now translate this to actual code. 823 00:37:31,210 --> 00:37:34,780 How could we implement this idea of, let's call these things nodes. 
824 00:37:34,780 --> 00:37:36,160 And that's a term of art in CS. 825 00:37:36,160 --> 00:37:40,210 Whenever you have some data structure that encapsulates information, node, 826 00:37:40,210 --> 00:37:42,947 N-O-D-E, is the generic term for that. 827 00:37:42,947 --> 00:37:44,780 So each of these might be said to be a node. 828 00:37:44,780 --> 00:37:45,830 Well, how can we do this? 829 00:37:45,830 --> 00:37:48,622 Well a couple of weeks ago, we saw how we could represent something 830 00:37:48,622 --> 00:37:50,260 like a student or a candidate. 831 00:37:50,260 --> 00:37:54,940 And a student, or rather a person, we said has a name and a number. 832 00:37:54,940 --> 00:37:56,680 And we used a few pieces of syntax here. 833 00:37:56,680 --> 00:37:59,890 One, we use the struct keyword, which gives us a data structure. 834 00:37:59,890 --> 00:38:04,420 We use typedef, which defines the name person to be our new data 835 00:38:04,420 --> 00:38:06,850 type representing that whole structure. 836 00:38:06,850 --> 00:38:08,950 So we probably have the right ingredients here 837 00:38:08,950 --> 00:38:11,500 to build up this thing called a node. 838 00:38:11,500 --> 00:38:14,620 And just to be clear, what should go inside of one of these nodes, 839 00:38:14,620 --> 00:38:15,435 do we think? 840 00:38:15,435 --> 00:38:17,560 It's not going to be a name or a number, obviously. 841 00:38:17,560 --> 00:38:22,250 But what should a node have in terms of those fields, perhaps? 842 00:38:22,250 --> 00:38:22,750 Yeah? 843 00:38:22,750 --> 00:38:23,625 AUDIENCE: [? Data. ?] 844 00:38:23,625 --> 00:38:26,600 SPEAKER 1: So a number like a number and a pointer in some form. 845 00:38:26,600 --> 00:38:28,850 So let's translate this to actual code. 846 00:38:28,850 --> 00:38:33,610 So let's rename person to node to capture this notion here. 847 00:38:33,610 --> 00:38:34,865 And the number is easy. 848 00:38:34,865 --> 00:38:36,740 If it's just going to be an int, that's fine. 
849 00:38:36,740 --> 00:38:38,980 We can just say int number, or int n, or whatever 850 00:38:38,980 --> 00:38:41,380 you want to call that particular field. 851 00:38:41,380 --> 00:38:43,072 The next one is a little non-obvious. 852 00:38:43,072 --> 00:38:45,280 And this is where things get a little weird at first, 853 00:38:45,280 --> 00:38:47,830 but, in retrospect, it should all fit together. 854 00:38:47,830 --> 00:38:53,630 Let me propose that, ideally, we would say something like node* next. 855 00:38:53,630 --> 00:38:55,930 And I could call the word next anything I want. 856 00:38:55,930 --> 00:39:00,110 Next just means what comes after me is the notion I'm using it at. 857 00:39:00,110 --> 00:39:02,500 So a lot of CS people would just use next to represent 858 00:39:02,500 --> 00:39:03,880 the name of this pointer. 859 00:39:03,880 --> 00:39:05,260 But there's a catch here. 860 00:39:05,260 --> 00:39:08,440 C and C compilers are pretty naive, recall. 861 00:39:08,440 --> 00:39:11,660 They only look at code top to bottom, left to right. 862 00:39:11,660 --> 00:39:13,840 And any time they encounter a word they have never 863 00:39:13,840 --> 00:39:15,513 seen before, bad things happen. 864 00:39:15,513 --> 00:39:16,930 Like, you can't compile your code. 865 00:39:16,930 --> 00:39:18,920 You get some cryptic error message or the like. 866 00:39:18,920 --> 00:39:21,910 And that seems to be about to happen here. 867 00:39:21,910 --> 00:39:24,970 Because if the compiler is reading this code from top to bottom, 868 00:39:24,970 --> 00:39:27,340 it's going to say, oh, inside of this struct 869 00:39:27,340 --> 00:39:29,140 should be a variable called next. 870 00:39:29,140 --> 00:39:31,000 Which is of type node*. 871 00:39:31,000 --> 00:39:32,200 What the heck is a node? 872 00:39:32,200 --> 00:39:35,470 Because it literally does not find out until 2 lines 873 00:39:35,470 --> 00:39:37,720 later, after that semicolon. 
874 00:39:37,720 --> 00:39:40,330 So the way to avoid this, which we haven't quite seen before, 875 00:39:40,330 --> 00:39:45,220 is that you can temporarily name this whole thing up here, struct node. 876 00:39:45,220 --> 00:39:50,560 And then, down here inside of the data structure, you say struct node*. 877 00:39:50,560 --> 00:39:52,210 And then, you leave the rest alone. 878 00:39:52,210 --> 00:39:56,620 This workaround is possible because now you're 879 00:39:56,620 --> 00:39:59,740 teaching the compiler, from the first line, that here comes 880 00:39:59,740 --> 00:40:01,960 a data structure called struct node. 881 00:40:01,960 --> 00:40:05,420 Down here, you're shortening the name of this whole thing to just node. 882 00:40:05,420 --> 00:40:05,920 Why? 883 00:40:05,920 --> 00:40:09,003 It's just a little more convenient than having to write struct everywhere. 884 00:40:09,003 --> 00:40:12,760 But you do have to write struct node* inside of the data structure. 885 00:40:12,760 --> 00:40:15,730 But that's OK because it's already come into existence 886 00:40:15,730 --> 00:40:17,892 now, as of that first line of code. 887 00:40:17,892 --> 00:40:19,600 So that's the only fundamental difference 888 00:40:19,600 --> 00:40:22,900 between what we did last week with a person or a candidate. 889 00:40:22,900 --> 00:40:27,890 We just now have to use this struct workaround, syntactically. 890 00:40:27,890 --> 00:40:28,390 All right. 891 00:40:28,390 --> 00:40:29,170 Yeah, question. 892 00:40:29,170 --> 00:40:33,010 AUDIENCE: So [INAUDIBLE] have like right next to the [INAUDIBLE] point 893 00:40:33,010 --> 00:40:33,970 to another [INAUDIBLE]. 894 00:40:33,970 --> 00:40:39,070 SPEAKER 1: Why is the next variable a struct node* pointer and not an int* 895 00:40:39,070 --> 00:40:41,150 pointer, for instance? 896 00:40:41,150 --> 00:40:43,870 So think about the picture we are trying to draw. 
897 00:40:43,870 --> 00:40:47,740 Technically, yes, each of these arrows I deliberately drew 898 00:40:47,740 --> 00:40:49,240 is pointing at the number. 899 00:40:49,240 --> 00:40:50,500 But that's not alone. 900 00:40:50,500 --> 00:40:53,320 They need to point at the whole data structure in memory. 901 00:40:53,320 --> 00:40:55,600 Because the computer, ultimately, and the compiler, 902 00:40:55,600 --> 00:40:59,470 in turn, needs to know that this chunk of memory is not just an int. 903 00:40:59,470 --> 00:41:01,040 It is a whole node. 904 00:41:01,040 --> 00:41:04,370 Inside of a node is a number and also another pointer. 905 00:41:04,370 --> 00:41:06,770 So when you draw these arrows, it would be 906 00:41:06,770 --> 00:41:09,380 incorrect to point at just the number. 907 00:41:09,380 --> 00:41:11,757 Because that throws away information that 908 00:41:11,757 --> 00:41:14,090 would leave the compiler wondering, OK, I'm at a number. 909 00:41:14,090 --> 00:41:15,200 Where the heck is the pointer? 910 00:41:15,200 --> 00:41:17,450 You have to tell it that it's pointing at a whole node 911 00:41:17,450 --> 00:41:20,857 so it knows a few bytes away is that corresponding pointer. 912 00:41:20,857 --> 00:41:21,440 Good question. 913 00:41:21,440 --> 00:41:23,183 Yeah. 914 00:41:23,183 --> 00:41:24,630 AUDIENCE: How do you [INAUDIBLE]. 915 00:41:24,630 --> 00:41:25,963 SPEAKER 1: Really good question. 916 00:41:25,963 --> 00:41:29,250 It would seem that just as copying the array earlier 917 00:41:29,250 --> 00:41:32,460 required twice as much memory, because we copied from old to new. 918 00:41:32,460 --> 00:41:35,130 So, technically, twice as much plus 1 for the new number. 919 00:41:35,130 --> 00:41:38,520 Here, too, it looks like we're using twice as much memory, also. 
920 00:41:38,520 --> 00:41:41,400 And to my comment earlier, it's even more than twice as much memory 921 00:41:41,400 --> 00:41:45,270 because these pointers are 8 bytes, and not just 4 bytes like a typical integer 922 00:41:45,270 --> 00:41:45,870 is. 923 00:41:45,870 --> 00:41:47,280 The differences are these. 924 00:41:47,280 --> 00:41:50,910 In the context of the array, you were using that memory temporarily. 925 00:41:50,910 --> 00:41:52,750 So, yes, you needed twice as much memory. 926 00:41:52,750 --> 00:41:55,600 But then you were quickly freeing the original array. 927 00:41:55,600 --> 00:41:58,890 So you weren't consuming long-term, more memory than you might need. 928 00:41:58,890 --> 00:42:02,290 The difference here, too, is that, as we'll see in a moment, 929 00:42:02,290 --> 00:42:05,670 it turns out it's going to be relatively quick for me, potentially, 930 00:42:05,670 --> 00:42:07,620 to insert new numbers in here. 931 00:42:07,620 --> 00:42:10,620 Because I'm not going to have to do a huge amount of copying. 932 00:42:10,620 --> 00:42:13,800 And even though I might still have to follow all of these arrows, which 933 00:42:13,800 --> 00:42:16,080 is going to take some amount of time, I'm 934 00:42:16,080 --> 00:42:19,470 not going to have to be asking for more memory, freeing more memory. 935 00:42:19,470 --> 00:42:23,190 And certain operations in the computer, anything involving asking for or giving 936 00:42:23,190 --> 00:42:25,000 back memory, tends to be slower. 937 00:42:25,000 --> 00:42:26,858 So we get to avoid that situation as well. 938 00:42:26,858 --> 00:42:28,650 There's going to be some downsides, though. 939 00:42:28,650 --> 00:42:29,700 This is not all upside. 940 00:42:29,700 --> 00:42:33,760 But we'll see in a bit just what some of those trade offs actually are. 941 00:42:33,760 --> 00:42:34,260 All right. 
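The node described above, together with the space comparison between an int and a pointer, can be sketched as follows. This is a minimal illustration, not the lecture's exact file: the helpers `node_overhead` and `print_sizes` are hypothetical names added here, and the sizes printed depend on the machine (4-byte ints and 8-byte pointers are typical, not guaranteed).

```c
#include <stddef.h>
#include <stdio.h>

// A node as described in the lecture: one int plus a pointer to the next node.
// The tag "struct node" lets the struct refer to itself before the
// shorter typedef name "node" exists.
typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper (not from the lecture): extra bytes each node
// spends beyond the int it stores (the pointer plus any padding).
size_t node_overhead(void)
{
    return sizeof(node) - sizeof(int);
}

// Hypothetical helper: print the sizes so they can be compared on your machine.
void print_sizes(void)
{
    printf("int:     %zu bytes\n", sizeof(int));
    printf("pointer: %zu bytes\n", sizeof(node *));
    printf("node:    %zu bytes\n", sizeof(node));
}
```

Calling `print_sizes()` on a given system shows concretely how much more memory a linked list of ints uses than a plain array of ints.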
942 00:42:34,260 --> 00:42:38,740 So from here, if we go back to the structure in code as we left it, 943 00:42:38,740 --> 00:42:41,820 let's start to now build up a linked list with some actual code. 944 00:42:41,820 --> 00:42:46,200 How do you go about, in C, representing a linked list in code? 945 00:42:46,200 --> 00:42:48,780 Well, at the moment, it would actually be as simple as this. 946 00:42:48,780 --> 00:42:51,930 You declare a variable, called list, for instance. 947 00:42:51,930 --> 00:42:54,970 That itself stores the address of a node. 948 00:42:54,970 --> 00:42:56,010 That's what node* means. 949 00:42:56,010 --> 00:42:57,220 The address of a node. 950 00:42:57,220 --> 00:42:59,880 So if you want to store a linked list in memory, 951 00:42:59,880 --> 00:43:02,397 you just create a variable called list, or whatever else. 952 00:43:02,397 --> 00:43:04,230 And you just say that this variable is going 953 00:43:04,230 --> 00:43:08,430 to be pointing at the first node in a list, wherever it happens to end up. 954 00:43:08,430 --> 00:43:12,270 Because malloc is ultimately going to be the tool that we use just to go 955 00:43:12,270 --> 00:43:16,270 get at any one particular node in memory. 956 00:43:16,270 --> 00:43:16,770 All right. 957 00:43:16,770 --> 00:43:18,690 So let's actually do this in pictorial form. 958 00:43:18,690 --> 00:43:21,690 When you write a line of code, like I just did here-- 959 00:43:21,690 --> 00:43:25,680 and I do not initialize it to anything with the assignment operator, 960 00:43:25,680 --> 00:43:26,730 an equal sign. 961 00:43:26,730 --> 00:43:30,720 It does exist in memory as a box, as I'll draw it here, called list. 962 00:43:30,720 --> 00:43:33,430 But I've deliberately drawn Oscar inside of it. 963 00:43:33,430 --> 00:43:33,930 Why? 964 00:43:33,930 --> 00:43:35,630 To connote what exactly? 965 00:43:35,630 --> 00:43:36,630 AUDIENCE: Garbage value. 966 00:43:36,630 --> 00:43:37,963 SPEAKER 1: It's a garbage value. 
967 00:43:37,963 --> 00:43:42,400 I have been allocated the variable in memory, called list. 968 00:43:42,400 --> 00:43:46,470 Which is going to give me 64 bits or 8 bytes somewhere drawn here 969 00:43:46,470 --> 00:43:47,470 with this box. 970 00:43:47,470 --> 00:43:50,220 But if I myself have not used the assignment operator, 971 00:43:50,220 --> 00:43:53,830 it's not going to get magically initialized to any particular address 972 00:43:53,830 --> 00:43:54,330 for me. 973 00:43:54,330 --> 00:43:56,470 It's not going to even give me a node. 974 00:43:56,470 --> 00:44:01,150 This is literally just going to be an address of a future node that exists. 975 00:44:01,150 --> 00:44:02,760 So what would be a solution here? 976 00:44:02,760 --> 00:44:05,760 Suppose that I'm beginning to create my linked list, 977 00:44:05,760 --> 00:44:07,290 but I don't have any nodes yet. 978 00:44:07,290 --> 00:44:11,302 What would be a sensible thing to initialize the list to, perhaps? 979 00:44:11,302 --> 00:44:12,122 AUDIENCE: Null. 980 00:44:12,122 --> 00:44:13,080 SPEAKER 1: Yeah, again. 981 00:44:13,080 --> 00:44:13,838 AUDIENCE: To null. 982 00:44:13,838 --> 00:44:15,130 SPEAKER 1: So just null, right. 983 00:44:15,130 --> 00:44:16,860 When in doubt with pointers, generally it's 984 00:44:16,860 --> 00:44:18,610 a good thing to initialize things to null, 985 00:44:18,610 --> 00:44:20,160 so at least it's not a garbage value. 986 00:44:20,160 --> 00:44:21,420 It's a known value. 987 00:44:21,420 --> 00:44:22,418 Invalid, yes. 988 00:44:22,418 --> 00:44:24,210 But it's a special value you can then check 989 00:44:24,210 --> 00:44:26,140 for with a conditional, or the like. 990 00:44:26,140 --> 00:44:30,120 So this might be a better way to create a linked list, 991 00:44:30,120 --> 00:44:34,120 even before you've inserted any numbers into the thing itself. 992 00:44:34,120 --> 00:44:34,620 All right. 
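As a small sketch of that advice: an empty list is just a pointer holding NULL, a known value that a conditional can test for. The helper names `empty_list` and `is_empty` are hypothetical and are not part of the lecture's code.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: returns the "empty list" value.
node *empty_list(void)
{
    node *list = NULL; // a known value instead of a garbage value
    return list;
}

// Hypothetical helper: a list with no nodes is a pointer equal to NULL.
bool is_empty(const node *list)
{
    return list == NULL;
}
```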
993 00:44:34,620 --> 00:44:37,835 So after that, how can we go about adding something to this linked list? 994 00:44:37,835 --> 00:44:39,210 So now the story looks like this. 995 00:44:39,210 --> 00:44:42,150 Oscar is gone because inside of this box is all zero bits. 996 00:44:42,150 --> 00:44:46,050 Just because it's nice and clean, and this represents an empty linked list. 997 00:44:46,050 --> 00:44:50,590 Well, if I want to add the number 1 to this linked list, what could I do? 998 00:44:50,590 --> 00:44:52,590 Well, perhaps I could start with code like this. 999 00:44:52,590 --> 00:44:54,300 Borrowing inspiration from last week. 1000 00:44:54,300 --> 00:44:58,920 Let's ask malloc for enough space for the size of a node. 1001 00:44:58,920 --> 00:45:03,060 And this gets to your question earlier, like, what is it I'm manipulating here? 1002 00:45:03,060 --> 00:45:06,360 I don't just need space for an int and I don't just need space for a pointer. 1003 00:45:06,360 --> 00:45:07,440 I need space for both. 1004 00:45:07,440 --> 00:45:10,150 And I gave that thing a name, node. 1005 00:45:10,150 --> 00:45:12,930 So size of node figures out and does the arithmetic for me. 1006 00:45:12,930 --> 00:45:15,390 And gives me back the right number of bytes. 1007 00:45:15,390 --> 00:45:18,930 This, then, stores the address of that chunk of memory 1008 00:45:18,930 --> 00:45:20,880 in what I'll temporarily call n. 1009 00:45:20,880 --> 00:45:23,160 Just to represent a generic new node. 1010 00:45:23,160 --> 00:45:24,870 And it's of type node*. 1011 00:45:24,870 --> 00:45:28,080 Because just like last week when I asked malloc for enough space for an int 1012 00:45:28,080 --> 00:45:30,360 and I stored it in an int* pointer. 1013 00:45:30,360 --> 00:45:32,760 This week, if I'm asking for memory for a node, 1014 00:45:32,760 --> 00:45:35,340 I'm storing it in a node* pointer. 
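The allocation step just described might look like the sketch below. The wrapper function `allocate_node` is a hypothetical name used only to make the snippet self-contained; in the lecture the malloc call sits directly in main.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: ask malloc for exactly one node's worth of bytes.
// Returns NULL if no memory is available. The fields of the returned node
// are not initialized, so the caller must set them before reading them.
node *allocate_node(void)
{
    node *n = malloc(sizeof(node)); // sizeof does the size arithmetic
    return n;
}
```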
1015 00:45:35,340 --> 00:45:38,520 So technically, nothing new there except for this new term 1016 00:45:38,520 --> 00:45:41,020 of art in data structures called node. 1017 00:45:41,020 --> 00:45:41,520 All right. 1018 00:45:41,520 --> 00:45:42,870 So what does that do for me? 1019 00:45:42,870 --> 00:45:45,660 It essentially draws a picture like this in memory. 1020 00:45:45,660 --> 00:45:49,690 I still have my list variable from my previous line of code initialized 1021 00:45:49,690 --> 00:45:50,190 to null. 1022 00:45:50,190 --> 00:45:51,648 And that's why I've drawn it blank. 1023 00:45:51,648 --> 00:45:54,060 I also now have a temporary variable called 1024 00:45:54,060 --> 00:45:57,570 n, which I initialize to the return value of malloc. 1025 00:45:57,570 --> 00:45:59,650 Which gave me one of these nodes in memory. 1026 00:45:59,650 --> 00:46:02,130 But I've drawn it having garbage values, too, 1027 00:46:02,130 --> 00:46:03,850 because I don't know what int is there. 1028 00:46:03,850 --> 00:46:05,308 I don't know what pointer is there. 1029 00:46:05,308 --> 00:46:09,600 It's garbage values because malloc does not magically initialize memory for me. 1030 00:46:09,600 --> 00:46:11,250 There is another function for that. 1031 00:46:11,250 --> 00:46:14,100 But malloc alone just says, sure, use this chunk of memory. 1032 00:46:14,100 --> 00:46:15,910 Deal with whatever is there. 1033 00:46:15,910 --> 00:46:18,900 So how can I go about initializing this to known values? 1034 00:46:18,900 --> 00:46:23,440 Well, suppose I want to insert the number 1 and then, leave it at that. 1035 00:46:23,440 --> 00:46:27,212 A list of size 1, I could do something like this. 1036 00:46:27,212 --> 00:46:29,920 And this is where you have to think back to some of these basics. 1037 00:46:29,920 --> 00:46:34,060 My conditional here is asking the question if n does not equal null. 
1038 00:46:34,060 --> 00:46:37,210 So that is, if malloc gave me valid memory, 1039 00:46:37,210 --> 00:46:40,690 and I don't have to quit altogether because my computer's out of memory. 1040 00:46:40,690 --> 00:46:44,590 If n does not equal null, but is equal to a valid address, 1041 00:46:44,590 --> 00:46:46,070 I'm going to go ahead and do this. 1042 00:46:46,070 --> 00:46:48,820 And this is cryptic looking syntax now. 1043 00:46:48,820 --> 00:46:52,150 But does someone want to take a stab at translating this inside line of code 1044 00:46:52,150 --> 00:46:56,380 to English, in some sense? 1045 00:46:56,380 --> 00:47:00,520 How might you explain what that inner line of code is doing? (*n).number 1046 00:47:00,520 --> 00:47:03,130 equals 1. 1047 00:47:03,130 --> 00:47:05,355 Let me go further back. 1048 00:47:05,355 --> 00:47:06,477 Nope? 1049 00:47:06,477 --> 00:47:07,060 OK, over here. 1050 00:47:07,060 --> 00:47:07,772 Yeah. 1051 00:47:07,772 --> 00:47:09,010 AUDIENCE: [INAUDIBLE]. 1052 00:47:09,010 --> 00:47:09,802 SPEAKER 1: Perfect. 1053 00:47:09,802 --> 00:47:12,160 The place that n is pointing to, set it equal to 1. 1054 00:47:12,160 --> 00:47:16,060 Or using the vernacular of going there, go to the address in n 1055 00:47:16,060 --> 00:47:18,480 and set its number field to 1. 1056 00:47:18,480 --> 00:47:20,480 However you want to think about it, that's fine. 1057 00:47:20,480 --> 00:47:22,930 But the * again is the dereference operator here. 1058 00:47:22,930 --> 00:47:24,730 And we're doing the parentheses, which we 1059 00:47:24,730 --> 00:47:28,240 haven't needed to do before because we haven't dealt with pointers and data 1060 00:47:28,240 --> 00:47:30,010 structures together until today. 1061 00:47:30,010 --> 00:47:32,380 This just means go there first. 1062 00:47:32,380 --> 00:47:34,720 And then once you're there, go access number. 1063 00:47:34,720 --> 00:47:36,830 You don't want to do one thing before the other. 
1064 00:47:36,830 --> 00:47:38,890 So this is just enforcing order of operations. 1065 00:47:38,890 --> 00:47:41,300 The parentheses just like in grade school math. 1066 00:47:41,300 --> 00:47:41,800 All right. 1067 00:47:41,800 --> 00:47:43,210 So this line of code is cryptic. 1068 00:47:43,210 --> 00:47:43,982 It's ugly. 1069 00:47:43,982 --> 00:47:45,940 It's not something most people easily remember. 1070 00:47:45,940 --> 00:47:49,750 Thankfully, there's that syntactic sugar that simplifies this line of code 1071 00:47:49,750 --> 00:47:50,857 to just this. 1072 00:47:50,857 --> 00:47:52,690 And this, even though it's new to you today, 1073 00:47:52,690 --> 00:47:54,820 should eventually feel a little more familiar. 1074 00:47:54,820 --> 00:47:58,210 Because this now is shorthand notation for saying, start at n. 1075 00:47:58,210 --> 00:48:00,410 Go there as by following the arrow. 1076 00:48:00,410 --> 00:48:02,530 And when you get there, change the number field. 1077 00:48:02,530 --> 00:48:04,720 In this case, to 1. 1078 00:48:04,720 --> 00:48:07,240 So most people would not write code like this. 1079 00:48:07,240 --> 00:48:08,030 It's just ugly. 1080 00:48:08,030 --> 00:48:09,430 It's a couple extra keystrokes. 1081 00:48:09,430 --> 00:48:13,300 This just looks more like the artist's renditions we've been talking about. 1082 00:48:13,300 --> 00:48:17,530 And how most CS people would think about pointers as really just being arrows 1083 00:48:17,530 --> 00:48:18,710 in some form. 1084 00:48:18,710 --> 00:48:19,210 All right. 1085 00:48:19,210 --> 00:48:20,293 So what have we just done? 1086 00:48:20,293 --> 00:48:24,650 The picture now, after setting number to 1, looks a little something like this. 1087 00:48:24,650 --> 00:48:26,440 So there's still one step missing. 1088 00:48:26,440 --> 00:48:28,720 And that's, of course, to initialize, it would seem, 1089 00:48:28,720 --> 00:48:33,080 the pointer in this new node to something known like null. 
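The two spellings discussed above, explicit dereference with parentheses and the arrow shorthand, do the same thing. A minimal sketch, using two hypothetical helper functions so the forms can be compared side by side:

```c
#include <stddef.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Verbose form: go to the address in n first (the parentheses force that
// order), then access the number field found there.
void set_number_verbose(node *n, int value)
{
    if (n != NULL)
    {
        (*n).number = value;
    }
}

// Shorthand form: the arrow means "follow the pointer, then access the field".
void set_number_arrow(node *n, int value)
{
    if (n != NULL)
    {
        n->number = value;
    }
}
```

Both functions leave the node in the same state; the arrow form is simply the one most C code uses.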
1090 00:48:33,080 --> 00:48:34,735 So I bet we could do this like this. 1091 00:48:34,735 --> 00:48:36,610 With a different line of code, I'm just going 1092 00:48:36,610 --> 00:48:42,880 to say if n does not equal null, then set n's next field to null. 1093 00:48:42,880 --> 00:48:46,540 Or more pedantically, go to n, follow the arrow, 1094 00:48:46,540 --> 00:48:50,440 and then update the next field that you find there to equal null. 1095 00:48:50,440 --> 00:48:52,690 And again, this is just doing some nice bookkeeping. 1096 00:48:52,690 --> 00:48:55,870 Technically speaking, we might not need to set 1097 00:48:55,870 --> 00:48:58,910 this to null if we're going to keep adding more and more numbers to it. 1098 00:48:58,910 --> 00:49:02,110 But I'm doing it step-by-step so that I have a very clean picture. 1099 00:49:02,110 --> 00:49:05,800 And there's no bugs in my code at this point. 1100 00:49:05,800 --> 00:49:07,270 But I'm still not done. 1101 00:49:07,270 --> 00:49:09,730 There's one last thing I'm going to have to do here. 1102 00:49:09,730 --> 00:49:14,950 If the goal, ultimately, was to insert the number 1 into my linked list, 1103 00:49:14,950 --> 00:49:18,860 what's the last step I should, perhaps, do here? 1104 00:49:18,860 --> 00:49:20,050 Just in English is fine. 1105 00:49:20,050 --> 00:49:20,550 Yeah. 1106 00:49:20,550 --> 00:49:23,260 AUDIENCE: Set the pointer value to null. 1107 00:49:23,260 --> 00:49:24,010 SPEAKER 1: Yes. 1108 00:49:24,010 --> 00:49:27,970 I now need to update the actual variable, that represents my linked 1109 00:49:27,970 --> 00:49:31,030 list, to point at this brand new node. 1110 00:49:31,030 --> 00:49:35,317 That is now perfectly initialized as having an integer and a null pointer. 1111 00:49:35,317 --> 00:49:37,400 Yeah, technically, this is already pointing there. 1112 00:49:37,400 --> 00:49:40,090 But I described this deliberately earlier as being temporary. 
1113 00:49:40,090 --> 00:49:44,620 I just needed this to get it back from malloc and clean things up, initially. 1114 00:49:44,620 --> 00:49:47,230 This is the long term variable I care about. 1115 00:49:47,230 --> 00:49:49,480 So I'm going to want to do something simple like this. 1116 00:49:49,480 --> 00:49:51,520 List equals n. 1117 00:49:51,520 --> 00:49:53,863 And this seems a little weird that list equals n. 1118 00:49:53,863 --> 00:49:55,780 But again, think about what's inside this box. 1119 00:49:55,780 --> 00:49:57,988 At the moment this is null because there is no linked 1120 00:49:57,988 --> 00:49:59,530 list at the beginning of our story. 1121 00:49:59,530 --> 00:50:03,910 N is the address of the beginning, and it turns out, end of our linked list. 1122 00:50:03,910 --> 00:50:07,300 So it stands to reason that if you set list equal to n, 1123 00:50:07,300 --> 00:50:10,180 that has the effect of copying this address up here. 1124 00:50:10,180 --> 00:50:13,283 Or really just copying the arrow into that same location 1125 00:50:13,283 --> 00:50:14,950 so that now the picture looks like this. 1126 00:50:14,950 --> 00:50:18,340 And heck, if this was a temporary variable, it will eventually go away. 1127 00:50:18,340 --> 00:50:19,870 And now, this is the picture. 1128 00:50:19,870 --> 00:50:22,030 So an annoying number of steps, certainly, 1129 00:50:22,030 --> 00:50:24,520 to walk through verbally like this. 1130 00:50:24,520 --> 00:50:26,680 But it's just malloc to give yourself a node, 1131 00:50:26,680 --> 00:50:31,930 initialize the 2 fields inside of it, update the linked list, and boom, 1132 00:50:31,930 --> 00:50:32,770 you're on your way. 1133 00:50:32,770 --> 00:50:34,910 I didn't have to copy anything. 1134 00:50:34,910 --> 00:50:38,132 I just had to insert something in this case. 1135 00:50:38,132 --> 00:50:40,840 Let me pause here to see if there's any questions on those steps. 
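Put together, the steps just walked through (malloc a node, initialize its two fields, point the list at it) can be sketched as one function. The function name `make_single` is hypothetical; the lecture performs these steps inline rather than in a helper.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: builds a list containing exactly one node.
// Returns NULL (an empty list) if malloc fails.
node *make_single(int value)
{
    node *list = NULL;              // start with an empty list

    node *n = malloc(sizeof(node)); // temporary pointer to the new node
    if (n != NULL)
    {
        n->number = value;          // initialize the int field
        n->next = NULL;             // nothing comes after this node
        list = n;                   // the list now starts at the new node
    }
    return list;
}
```

A caller would eventually pass the returned pointer to `free` once the list is no longer needed.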
1136 00:50:40,840 --> 00:50:44,790 And we'll see before long it all in context with some larger code. 1137 00:50:44,790 --> 00:50:48,965 AUDIENCE: So if the statements [INAUDIBLE].. 1138 00:50:48,965 --> 00:50:49,590 SPEAKER 1: Yes. 1139 00:50:49,590 --> 00:50:53,010 I drew them separately just for the sake of the voiceover 1140 00:50:53,010 --> 00:50:55,020 of doing each thing very methodically. 1141 00:50:55,020 --> 00:50:57,090 In real code, as we'll transition to now, 1142 00:50:57,090 --> 00:50:59,220 I could have and should have just done it 1143 00:50:59,220 --> 00:51:03,000 all inside of one conditional after checking if n is not equal to null. 1144 00:51:03,000 --> 00:51:05,310 I could set number to a value like 1. 1145 00:51:05,310 --> 00:51:08,415 And I could set the pointer itself to something like null. 1146 00:51:08,415 --> 00:51:09,030 All right. 1147 00:51:09,030 --> 00:51:12,600 Well let's translate, then, this into some similar code 1148 00:51:12,600 --> 00:51:17,340 that allows us to build up a linked list now using code similar in spirit 1149 00:51:17,340 --> 00:51:18,150 to before. 1150 00:51:18,150 --> 00:51:19,900 But now, using this new primitive. 1151 00:51:19,900 --> 00:51:22,140 So I'm going to go back into VS Code here. 1152 00:51:22,140 --> 00:51:25,470 I'm going to go ahead now and delete the entirety of this old version that 1153 00:51:25,470 --> 00:51:27,270 was entirely array-based. 1154 00:51:27,270 --> 00:51:32,470 And now, inside of my main function, I'm going to go ahead and first do this. 1155 00:51:32,470 --> 00:51:36,180 I'm going to first give myself a list of size 0. 1156 00:51:36,180 --> 00:51:38,610 And I'm going to call that node* list. 1157 00:51:38,610 --> 00:51:41,610 And I'm going to initialize that to null, as we proposed earlier. 1158 00:51:41,610 --> 00:51:44,760 But I'm also now going to have to take the additional step of defining 1159 00:51:44,760 --> 00:51:45,970 what this node is. 
1160 00:51:45,970 --> 00:51:49,500 So recall that I might do something like typedef, struct node. 1161 00:51:49,500 --> 00:51:52,320 Inside of this struct node, I'm going to have a number, which 1162 00:51:52,320 --> 00:51:54,010 I'll call number of type int. 1163 00:51:54,010 --> 00:51:56,160 And I'm going to have a structure called node 1164 00:51:56,160 --> 00:51:59,470 with a * that says the next pointer is called next. 1165 00:51:59,470 --> 00:52:03,150 And I'm going to call this whole thing, more succinctly, node, 1166 00:52:03,150 --> 00:52:04,830 instead of struct node. 1167 00:52:04,830 --> 00:52:07,920 Now as an aside, for those of you wondering what the difference really 1168 00:52:07,920 --> 00:52:09,600 is between struct and node. 1169 00:52:09,600 --> 00:52:12,450 Technically, I could do something like this. 1170 00:52:12,450 --> 00:52:15,960 Not use typedef and not use the word node alone. 1171 00:52:15,960 --> 00:52:19,680 This syntax here would actually create for me a new data 1172 00:52:19,680 --> 00:52:22,830 type called, verbosely, struct node. 1173 00:52:22,830 --> 00:52:25,440 And I could use this throughout my code saying struct node. 1174 00:52:25,440 --> 00:52:26,460 Struct node. 1175 00:52:26,460 --> 00:52:27,840 That just gets a little tedious. 1176 00:52:27,840 --> 00:52:30,715 And it would be nicer just to refer to this thing more simplistically 1177 00:52:30,715 --> 00:52:31,750 as a node. 1178 00:52:31,750 --> 00:52:34,230 So what typedef has been doing for us is it, 1179 00:52:34,230 --> 00:52:37,770 again, lets us invent our own word that's even more succinct. 1180 00:52:37,770 --> 00:52:41,040 And this just has the effect now of calling this whole thing 1181 00:52:41,040 --> 00:52:44,760 node without the need, subsequently, to keep saying struct all over the place. 1182 00:52:44,760 --> 00:52:46,170 Just FYI. 1183 00:52:46,170 --> 00:52:46,680 All right. 
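To illustrate the aside about struct versus typedef: the sketch below declares the structure without typedef, then adds the short alias separately. Both names refer to the same type. The helper `number_of` and its `fallback` parameter are hypothetical, added only to show the verbose name in use.

```c
#include <stddef.h>

// Without typedef, the type's full name is "struct node".
struct node
{
    int number;
    struct node *next;
};

// The alias can be added on its own line; "node" and "struct node"
// then name the same type.
typedef struct node node;

// Hypothetical helper written with the verbose name. Its parameter could
// equally be declared as "const node *n".
int number_of(const struct node *n, int fallback)
{
    return (n != NULL) ? n->number : fallback;
}
```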
1184 00:52:46,680 --> 00:52:50,050 So now that this thing exists in main, let's go ahead and do this. 1185 00:52:50,050 --> 00:52:52,770 Let's add a number to list. 1186 00:52:52,770 --> 00:52:55,440 And to do this, I'm going to give myself a temporary variable. 1187 00:52:55,440 --> 00:52:57,340 I'll call it n for consistency. 1188 00:52:57,340 --> 00:53:00,540 I'm going to use malloc to give myself the size of a node, 1189 00:53:00,540 --> 00:53:02,080 just like in our slides. 1190 00:53:02,080 --> 00:53:03,540 And then, I'm going to do a little safety check. 1191 00:53:03,540 --> 00:53:06,470 If n equals equals null, I'm going to do the opposite of the slides. 1192 00:53:06,470 --> 00:53:08,220 I'm just going to quit out of this program 1193 00:53:08,220 --> 00:53:10,960 because there's nothing useful to be done at this point. 1194 00:53:10,960 --> 00:53:13,570 But most likely my computer is not going to run out of memory. 1195 00:53:13,570 --> 00:53:16,750 So I'm going to assume we can keep going with some of the logic here. 1196 00:53:16,750 --> 00:53:21,390 If n does not equal null, that is, it's a valid memory address, 1197 00:53:21,390 --> 00:53:23,370 I'm going to say n->-- 1198 00:53:23,370 --> 00:53:24,930 I'm going to build this up backwards. 1199 00:53:24,930 --> 00:53:26,707 Well let's do. 1200 00:53:26,707 --> 00:53:28,290 That's OK, let's go ahead and do this. 1201 00:53:28,290 --> 00:53:30,600 n->number equals 1. 1202 00:53:30,600 --> 00:53:35,490 And then n->next equals null. 1203 00:53:35,490 --> 00:53:42,420 And now, update list to point to new node, list equals n. 1204 00:53:42,420 --> 00:53:44,580 So at this point in the story, we've essentially 1205 00:53:44,580 --> 00:53:49,330 constructed what was that first picture, which looks like this. 1206 00:53:49,330 --> 00:53:53,880 This is the corresponding code via which we built up this node in memory. 
1207 00:53:53,880 --> 00:53:56,860 Suppose now, we want to add the number 2 to the list. 1208 00:53:56,860 --> 00:53:58,080 So let's do this again. 1209 00:53:58,080 --> 00:54:02,550 Add a number to list. 1210 00:54:02,550 --> 00:54:03,910 How might I do this? 1211 00:54:03,910 --> 00:54:06,330 Well, I don't need to redeclare n because I can use 1212 00:54:06,330 --> 00:54:08,110 the same temporary variable as before. 1213 00:54:08,110 --> 00:54:13,310 So this time, I'm just going to say n equals malloc of the size of a node. 1214 00:54:13,310 --> 00:54:15,060 I'm, again, going to have my safety check. 1215 00:54:15,060 --> 00:54:19,290 So if n equals equals null, then let's just quit out of this altogether. 1216 00:54:19,290 --> 00:54:23,820 But, I have to be a little more careful now. 1217 00:54:23,820 --> 00:54:26,160 Technically speaking, what do I still need 1218 00:54:26,160 --> 00:54:30,540 to do before I quit out of my program to be really proper? 1219 00:54:30,540 --> 00:54:33,880 Free the memory that did succeed a little higher up. 1220 00:54:33,880 --> 00:54:39,280 So I think it suffices to free what is now called list, way at the top. 1221 00:54:39,280 --> 00:54:39,780 All right. 1222 00:54:39,780 --> 00:54:46,260 Now, if all was well, though, let's go ahead and say n arrow number equals 2. 1223 00:54:46,260 --> 00:54:51,840 And now, n arrow next equals null. 1224 00:54:51,840 --> 00:54:54,900 And now, let's go ahead and add it to the list. 1225 00:54:54,900 --> 00:55:02,910 If I go ahead and do list arrow next equals n, 1226 00:55:02,910 --> 00:55:06,660 I think what we've just done is build up the equivalent, now, 1227 00:55:06,660 --> 00:55:09,660 of this in the computer's memory. 1228 00:55:09,660 --> 00:55:12,180 By going to the list's next field, which 1229 00:55:12,180 --> 00:55:16,080 is synonymous with the 1 node's bottom-most box. 1230 00:55:16,080 --> 00:55:19,540 And storing the address of what was n, which a moment ago looked like this.
1231 00:55:19,540 --> 00:55:22,390 And I'm just throwing away, in the picture, the temporary variable. 1232 00:55:22,390 --> 00:55:22,890 All right. 1233 00:55:22,890 --> 00:55:24,880 One last thing to do. 1234 00:55:24,880 --> 00:55:30,087 Let me go down here and say, add a number to list, n equals malloc. 1235 00:55:30,087 --> 00:55:31,170 Let's do it one more time. 1236 00:55:31,170 --> 00:55:32,340 Size of node. 1237 00:55:32,340 --> 00:55:35,280 And clearly, in a real program, we might want to start using a loop. 1238 00:55:35,280 --> 00:55:39,060 And do this dynamically or in a function because it's a lot of repetition now. 1239 00:55:39,060 --> 00:55:42,120 But just to go through the syntax here, this is fine. 1240 00:55:42,120 --> 00:55:45,700 If n equals equals null, out of memory for some reason. 1241 00:55:45,700 --> 00:55:51,650 Let's return 1, but we should free the list itself 1242 00:55:51,650 --> 00:55:55,450 and even the second node, list arrow next. 1243 00:55:55,450 --> 00:55:58,730 But I've deliberately done this poorly. 1244 00:55:58,730 --> 00:55:59,230 All right. 1245 00:55:59,230 --> 00:56:01,240 This is a little more subtle now. 1246 00:56:01,240 --> 00:56:04,570 And let me get rid of the highlighting just so it's a little more visible. 1247 00:56:04,570 --> 00:56:08,890 If n happens to equal equal null, and something really just 1248 00:56:08,890 --> 00:56:15,040 went wrong there, out of memory, why am I freeing 2 addresses now? 1249 00:56:15,040 --> 00:56:17,770 And again, it's not that I'm freeing those variables per se. 1250 00:56:17,770 --> 00:56:21,620 I'm freeing the addresses in those variables. 1251 00:56:21,620 --> 00:56:23,890 But there's also a bug with my code here. 1252 00:56:23,890 --> 00:56:26,290 And it's subtle. 1253 00:56:26,290 --> 00:56:27,580 Let me ask more pointedly. 1254 00:56:27,580 --> 00:56:31,683 This line here, 43, what is that freeing specifically? 1255 00:56:31,683 --> 00:56:32,350 Can I go to you?
1256 00:56:32,350 --> 00:56:34,900 AUDIENCE: You're freeing list 2 times. 1257 00:56:34,900 --> 00:56:36,640 SPEAKER 1: I'm freeing, not so. 1258 00:56:36,640 --> 00:56:37,150 That's OK. 1259 00:56:37,150 --> 00:56:38,740 I'm not freeing list 2 times. 1260 00:56:38,740 --> 00:56:41,530 Technically, I'm freeing list once and list next once. 1261 00:56:41,530 --> 00:56:43,600 But let me just ask the more explicit question. 1262 00:56:43,600 --> 00:56:46,420 What am I freeing with line 43 at the moment? 1263 00:56:46,420 --> 00:56:49,420 Which node? 1264 00:56:49,420 --> 00:56:50,930 I think node number 1. 1265 00:56:50,930 --> 00:56:51,430 Why? 1266 00:56:51,430 --> 00:56:53,440 Because if 1 is at the beginning of the list, 1267 00:56:53,440 --> 00:56:56,530 list contains the address of that number 1 node. 1268 00:56:56,530 --> 00:56:58,280 And so this frees that node. 1269 00:56:58,280 --> 00:57:01,250 This line of code, you might think now intuitively, OK, 1270 00:57:01,250 --> 00:57:03,610 it's probably freeing the node number 2. 1271 00:57:03,610 --> 00:57:04,540 But this is bad. 1272 00:57:04,540 --> 00:57:05,410 And this is subtle. 1273 00:57:05,410 --> 00:57:07,120 Valgrind might help you catch this. 1274 00:57:07,120 --> 00:57:09,520 But by eyeing it, it's not necessarily obvious. 1275 00:57:09,520 --> 00:57:13,990 You should never touch memory that you have already freed. 1276 00:57:13,990 --> 00:57:16,930 And so, the fact that I did in this order, very bad. 1277 00:57:16,930 --> 00:57:19,630 Because I'm telling the operating system, I don't know. 1278 00:57:19,630 --> 00:57:22,150 I don't need the list address anymore. 1279 00:57:22,150 --> 00:57:23,410 Do with it what you want. 1280 00:57:23,410 --> 00:57:25,660 And then, literally one line later, you're saying, wait a minute. 1281 00:57:25,660 --> 00:57:27,730 Let me actually go to that address for a moment 1282 00:57:27,730 --> 00:57:30,400 and look at the next field of that first node. 
1283 00:57:30,400 --> 00:57:31,220 It's too late. 1284 00:57:31,220 --> 00:57:33,710 You've already given up control over the node. 1285 00:57:33,710 --> 00:57:36,730 So it's an easy fix in this case, logically. 1286 00:57:36,730 --> 00:57:39,370 But we should be freeing the second node first 1287 00:57:39,370 --> 00:57:43,060 and then the first one so that we're doing it 1288 00:57:43,060 --> 00:57:45,040 in, essentially, reverse order. 1289 00:57:45,040 --> 00:57:46,957 And again, Valgrind would help you catch that. 1290 00:57:46,957 --> 00:57:49,582 But that's the kind of thing one needs to be careful about when 1291 00:57:49,582 --> 00:57:50,600 touching memory at all. 1292 00:57:50,600 --> 00:57:53,110 You cannot touch memory after you freed it. 1293 00:57:53,110 --> 00:57:54,970 But here is my last step. 1294 00:57:54,970 --> 00:58:00,490 Let me go ahead and update the number field of n to be 3. 1295 00:58:00,490 --> 00:58:03,500 The next node of n to be null. 1296 00:58:03,500 --> 00:58:05,290 And then, just like in the slide earlier, 1297 00:58:05,290 --> 00:58:11,020 I think I can do list next, next equals n. 1298 00:58:11,020 --> 00:58:14,890 And that has the effect now of building up in the computer's memory, 1299 00:58:14,890 --> 00:58:16,990 essentially, this data structure. 1300 00:58:16,990 --> 00:58:17,890 Very manually. 1301 00:58:17,890 --> 00:58:18,820 Very pedantically. 1302 00:58:18,820 --> 00:58:20,860 Like, in a better world, we'd have a loop and some functions 1303 00:58:20,860 --> 00:58:22,420 that are automating this process. 1304 00:58:22,420 --> 00:58:26,680 But, for now, we're doing it just to play around with the syntax. 1305 00:58:26,680 --> 00:58:31,420 So at this point, unfortunately, suppose I want to print the numbers. 1306 00:58:31,420 --> 00:58:36,190 It's no longer as easy as int i equals 0, i less than 3, i++. 1307 00:58:36,190 --> 00:58:43,420 Because you cannot just do something like this. 
1308 00:58:43,420 --> 00:58:48,520 Because pointer arithmetic no longer comes into play 1309 00:58:48,520 --> 00:58:52,750 when it's you, who are stitching together the data structure in memory. 1310 00:58:52,750 --> 00:58:55,450 In all of our past examples with arrays, you've 1311 00:58:55,450 --> 00:58:58,820 been trusting that all of the bytes in the array are back, to back, to back. 1312 00:58:58,820 --> 00:59:01,533 So it's perfectly reasonable for the compiler and the computer 1313 00:59:01,533 --> 00:59:04,450 to just figure out, oh, well if you want [0], that's at the beginning. 1314 00:59:04,450 --> 00:59:06,130 [1], it's one location over. 1315 00:59:06,130 --> 00:59:08,110 [2], it's one location over. 1316 00:59:08,110 --> 00:59:11,030 This is way less obvious now. 1317 00:59:11,030 --> 00:59:14,650 Because even though you might want to go to the first element in the linked 1318 00:59:14,650 --> 00:59:19,270 list, or the second, or the third, you can't just jump to those arithmetically 1319 00:59:19,270 --> 00:59:20,590 by doing a bit of math. 1320 00:59:20,590 --> 00:59:24,040 Instead, you have to follow all of those arrows. 1321 00:59:24,040 --> 00:59:27,340 So with linked lists, you can't use this square bracket notation anymore 1322 00:59:27,340 --> 00:59:30,310 because one node might be here, over here, over here, over here. 1323 00:59:30,310 --> 00:59:33,550 You can't just use some simple offset. 1324 00:59:33,550 --> 00:59:36,340 So I think our code is going to have to be a little fancier. 1325 00:59:36,340 --> 00:59:39,820 And this might look scary at first, but it's just an application 1326 00:59:39,820 --> 00:59:42,160 of some of the basic definitions here. 1327 00:59:42,160 --> 00:59:49,480 Let me do a for-loop that actually uses a node* variable initialized 1328 00:59:49,480 --> 00:59:51,130 to the list itself. 1329 00:59:51,130 --> 00:59:55,780 I'm going to keep doing this, so long as TMP does not equal null. 
1330 00:59:55,780 --> 00:59:58,360 And on each iteration of this loop, I'm going 1331 00:59:58,360 --> 01:00:03,100 to update TMP to be whatever TMP arrow next is. 1332 01:00:03,100 --> 01:00:05,710 And I'll remind you in a moment and explain in more detail. 1333 01:00:05,710 --> 01:00:09,730 But when I print something here with printf, I can still use %i. 1334 01:00:09,730 --> 01:00:12,040 Because it's still a number at the end of the day. 1335 01:00:12,040 --> 01:00:16,640 But what I want to print out is the number in this temporary variable. 1336 01:00:16,640 --> 01:00:19,032 So maybe the ugliest for-loop we've ever seen. 1337 01:00:19,032 --> 01:00:21,490 Because it's mixing, not just the idea of a for-loop, which 1338 01:00:21,490 --> 01:00:23,500 itself was a bit cryptic weeks ago. 1339 01:00:23,500 --> 01:00:26,025 But now, I'm using pointers instead of integers. 1340 01:00:26,025 --> 01:00:28,150 But I'm not violating the definition of a for-loop. 1341 01:00:28,150 --> 01:00:30,940 Recall that a for-loop has 3 main things in parentheses. 1342 01:00:30,940 --> 01:00:32,800 What do you want to initialize first? 1343 01:00:32,800 --> 01:00:35,740 What condition do you want to keep checking again and again? 1344 01:00:35,740 --> 01:00:39,440 And what update do you want to make on every iteration of the loop? 1345 01:00:39,440 --> 01:00:41,860 So with that basic definition in mind, this 1346 01:00:41,860 --> 01:00:44,350 is giving me a temporary variable called TMP 1347 01:00:44,350 --> 01:00:46,520 that is initialized to the beginning of the list. 1348 01:00:46,520 --> 01:00:50,110 So it's like pointing my finger at the number 1 node. 1349 01:00:50,110 --> 01:00:53,530 Then, I'm asking the question, does TMP not equal null? 1350 01:00:53,530 --> 01:00:56,170 Well, hopefully not, because I'm pointing at a valid node 1351 01:00:56,170 --> 01:00:57,710 that is the number 1 node. 1352 01:00:57,710 --> 01:00:59,530 So, of course, it doesn't equal null yet.
1353 01:00:59,530 --> 01:01:02,030 Null won't be until we get to the end of the list. 1354 01:01:02,030 --> 01:01:03,530 So what do I do? 1355 01:01:03,530 --> 01:01:05,260 I start at this TMP variable. 1356 01:01:05,260 --> 01:01:10,270 I follow the arrow and go to the number field therein. 1357 01:01:10,270 --> 01:01:11,350 What do I then do? 1358 01:01:11,350 --> 01:01:15,010 The for-loop says, change TMP to be whatever 1359 01:01:15,010 --> 01:01:19,090 is at TMP, by following the arrow and grabbing the next field. 1360 01:01:19,090 --> 01:01:22,260 That, then, has the result of being checked against this conditional. 1361 01:01:22,260 --> 01:01:24,760 No, of course, it doesn't equal null because the second node 1362 01:01:24,760 --> 01:01:26,050 is the number 2 node. 1363 01:01:26,050 --> 01:01:27,920 Null is still at the very end. 1364 01:01:27,920 --> 01:01:29,710 So I print out the number 2. 1365 01:01:29,710 --> 01:01:33,670 Next step, I update TMP one more time to be whatever is next. 1366 01:01:33,670 --> 01:01:36,230 That, then, does not yet equal null. 1367 01:01:36,230 --> 01:01:38,470 So I go ahead and print out the number 3 node. 1368 01:01:38,470 --> 01:01:44,120 Then one last time, I update TMP to be whatever is in TMP's next field. 1369 01:01:44,120 --> 01:01:47,980 But after 1, 2, 3, that last next field is null. 1370 01:01:47,980 --> 01:01:51,790 And so, I break out of this for-loop altogether. 1371 01:01:51,790 --> 01:01:54,730 So if I do this in pictorial form, all we're 1372 01:01:54,730 --> 01:01:58,300 doing, if I now use my finger to represent the TMP variable. 1373 01:01:58,300 --> 01:02:02,080 I initialize TMP to be whatever list is, so it points here. 1374 01:02:02,080 --> 01:02:04,780 That's obviously not null so I print out whatever 1375 01:02:04,780 --> 01:02:09,100 is at TMP, follow the arrow to number, and I print that out. 1376 01:02:09,100 --> 01:02:11,290 Then I update TMP to point here.
1377 01:02:11,290 --> 01:02:13,077 Then I update TMP to point here. 1378 01:02:13,077 --> 01:02:14,410 Then I update TMP to point here. 1379 01:02:14,410 --> 01:02:15,160 Wait, that's null. 1380 01:02:15,160 --> 01:02:17,480 The for-loop ends. 1381 01:02:17,480 --> 01:02:21,670 So, again, admittedly much more cryptic than our familiar int i equals 0, 1382 01:02:21,670 --> 01:02:22,610 and so forth. 1383 01:02:22,610 --> 01:02:28,855 But it's just a different utilization of the for-loop syntax. 1384 01:02:28,855 --> 01:02:29,355 Yes. 1385 01:02:29,355 --> 01:02:33,140 AUDIENCE: How does it happen that you're always printing out the numbers? 1386 01:02:33,140 --> 01:02:35,018 Because it seems to me that addresses- 1387 01:02:35,018 --> 01:02:36,060 SPEAKER 1: Good question. 1388 01:02:36,060 --> 01:02:39,060 How is it that I'm actually printing numbers and not printing out 1389 01:02:39,060 --> 01:02:40,440 addresses instead? 1390 01:02:40,440 --> 01:02:42,120 The compiler is helping me here. 1391 01:02:42,120 --> 01:02:44,730 Because I taught it, in the very beginning of my program, 1392 01:02:44,730 --> 01:02:45,360 what a node is. 1393 01:02:45,360 --> 01:02:47,730 Which looks like this here. 1394 01:02:47,730 --> 01:02:51,510 The compiler knows that a node has a number field and a next field. 1395 01:02:51,510 --> 01:02:53,430 Down here, in the for-loop, 1396 01:02:53,430 --> 01:02:59,410 because I'm iterating using a node* pointer, and not an int* pointer, 1397 01:02:59,410 --> 01:03:02,160 the compiler knows that any time I'm pointing at something, 1398 01:03:02,160 --> 01:03:03,940 I'm pointing at the whole node. 1399 01:03:03,940 --> 01:03:07,020 Doesn't matter where specifically in the rectangle I'm pointing per se. 1400 01:03:07,020 --> 01:03:09,210 It's, ultimately, pointing at the whole node itself.
1401 01:03:09,210 --> 01:03:13,320 And the fact that I, then, use TMP arrow number means, OK, 1402 01:03:13,320 --> 01:03:14,490 adjust your finger slightly. 1403 01:03:14,490 --> 01:03:18,510 So you're literally pointing at the number field and not the next field. 1404 01:03:18,510 --> 01:03:22,920 So that's sufficient information for the computer to distinguish the two. 1405 01:03:22,920 --> 01:03:23,560 Good question. 1406 01:03:23,560 --> 01:03:26,730 Other questions then on this approach here? 1407 01:03:26,730 --> 01:03:28,042 Yeah, in the back. 1408 01:03:28,042 --> 01:03:29,280 AUDIENCE: How would you-- 1409 01:03:29,280 --> 01:03:33,840 SPEAKER 1: How would I use a for-loop to add elements to a linked list? 1410 01:03:33,840 --> 01:03:38,640 You will do something like this, if I may, in problem set 5. 1411 01:03:38,640 --> 01:03:41,730 We will give you some of the scaffolding for doing this. 1412 01:03:41,730 --> 01:03:44,700 But in this coming week's materials we will guide you to that. 1413 01:03:44,700 --> 01:03:47,293 But let me not spoil it just yet. 1414 01:03:47,293 --> 01:03:48,210 Fair question, though. 1415 01:03:48,210 --> 01:03:48,710 Yeah. 1416 01:03:48,710 --> 01:03:51,077 AUDIENCE: So I had a question about line 49. 1417 01:03:51,077 --> 01:03:51,660 SPEAKER 1: OK. 1418 01:03:51,660 --> 01:03:53,678 AUDIENCE: Is line 49 possible in line 43? 1419 01:03:53,678 --> 01:03:54,720 SPEAKER 1: Good question. 1420 01:03:54,720 --> 01:03:57,900 Is line 49 acceptable, even if we freed it earlier? 1421 01:03:57,900 --> 01:04:00,600 We didn't free it in line 43, in this case, right. 1422 01:04:00,600 --> 01:04:04,800 You can only reach line 49, if n does not equal null. 1423 01:04:04,800 --> 01:04:06,990 And you do not return on line 45. 1424 01:04:06,990 --> 01:04:07,860 So that's safe. 1425 01:04:07,860 --> 01:04:12,180 I was only doing that freeing if I knew on line 45 that I'm out of here 1426 01:04:12,180 --> 01:04:13,620 anyway, at that point.
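The printing loop walked through above can be sketched in C like this. The loop header and the printf call follow the lecture's description; wrapping them in a function named print_list, and having it return a count of nodes visited, are assumptions made here so the sketch stands on its own.

```c
#include <stdio.h>

typedef struct node
{
    int number;
    struct node *next;
}
node;

// Hypothetical helper: print each number in the list, one per line,
// and return how many nodes were visited.
int print_list(node *list)
{
    int count = 0;

    // Initialize tmp to the start of the list, keep going while tmp is
    // not NULL, and on each iteration follow the arrow to the next node.
    for (node *tmp = list; tmp != NULL; tmp = tmp->next)
    {
        // tmp is a node *, so tmp->number picks out the int field,
        // not the address stored in next.
        printf("%i\n", tmp->number);
        count++;
    }
    return count;
}
```

Because tmp is only a copy of an address, the loop never needs to free it; only addresses returned by malloc get freed.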
1427 01:04:13,620 --> 01:04:14,400 Good question. 1428 01:04:14,400 --> 01:04:15,030 And, yeah. 1429 01:04:15,030 --> 01:04:16,405 AUDIENCE: I had a quick question. 1430 01:04:16,405 --> 01:04:19,380 Is TMP [INAUDIBLE]. 1431 01:04:19,380 --> 01:04:22,650 SPEAKER 1: Correct You're asking about TMP, because it's in a for-loop, 1432 01:04:22,650 --> 01:04:24,358 does that mean you don't have to free it? 1433 01:04:24,358 --> 01:04:26,760 You never have to free pointers, per se. 1434 01:04:26,760 --> 01:04:31,560 You should only free addresses that were returned to you by malloc. 1435 01:04:31,560 --> 01:04:33,930 So I haven't finished the program, to be fair. 1436 01:04:33,930 --> 01:04:35,880 But you're not freeing variables. 1437 01:04:35,880 --> 01:04:37,740 You're not freeing like, fields. 1438 01:04:37,740 --> 01:04:40,870 You are freeing specific addresses, whatever they may be. 1439 01:04:40,870 --> 01:04:43,770 So the last thing, and I was stalling on showing this 1440 01:04:43,770 --> 01:04:45,450 because it too is a little cryptic. 1441 01:04:45,450 --> 01:04:48,570 Here is how you can free, now, a whole linked list. 1442 01:04:48,570 --> 01:04:51,242 In the world of arrays, recall, it was so easy. 1443 01:04:51,242 --> 01:04:52,200 You just say free list. 1444 01:04:52,200 --> 01:04:53,920 You return 0 and you're done. 1445 01:04:53,920 --> 01:04:55,140 Not with a linked list. 1446 01:04:55,140 --> 01:04:57,000 Because, again, the computer doesn't know 1447 01:04:57,000 --> 01:04:59,700 what you have stitched together using all of these pointers 1448 01:04:59,700 --> 01:05:01,140 all over the computer's memory. 1449 01:05:01,140 --> 01:05:03,180 You need to follow those arrows. 1450 01:05:03,180 --> 01:05:05,920 So one way to do this would be as follows. 1451 01:05:05,920 --> 01:05:10,920 While the list itself is not null, so while there's a list to be freed. 1452 01:05:10,920 --> 01:05:12,240 What do I want to do? 
1453 01:05:12,240 --> 01:05:14,972 I'm going to give myself a temporary variable called TMP again. 1454 01:05:14,972 --> 01:05:17,430 And it's a different TMP because it's in a different scope. 1455 01:05:17,430 --> 01:05:21,210 It's inside of the while loop instead of the for-loop, a few lines earlier. 1456 01:05:21,210 --> 01:05:26,640 I am going to initialize TMP to be the address of the next node. 1457 01:05:26,640 --> 01:05:29,160 Just so I can get one step ahead of things. 1458 01:05:29,160 --> 01:05:30,450 Why am I doing this? 1459 01:05:30,450 --> 01:05:34,330 Because now, I can boldly free the list itself, 1460 01:05:34,330 --> 01:05:35,970 which does not mean the whole list. 1461 01:05:35,970 --> 01:05:38,670 Again, I'm freeing the address in list, which 1462 01:05:38,670 --> 01:05:41,410 is the address of the number 1 node. 1463 01:05:41,410 --> 01:05:42,390 That's what list is. 1464 01:05:42,390 --> 01:05:44,980 It's just the address of the number 1 node. 1465 01:05:44,980 --> 01:05:47,880 So if I first use TMP to point at the number 1466 01:05:47,880 --> 01:05:53,310 2, slightly in the middle of the picture, then it is safe for me on line 61, 1467 01:05:53,310 --> 01:05:55,290 at the moment, to free list. 1468 01:05:55,290 --> 01:05:57,870 That is the address of the first node. 1469 01:05:57,870 --> 01:06:02,160 Now I'm going to say, all right, once I've freed the first node in the list, 1470 01:06:02,160 --> 01:06:07,080 I can update the list itself to be literally TMP. 1471 01:06:07,080 --> 01:06:09,120 And now, the loop repeats. 1472 01:06:09,120 --> 01:06:10,450 So what's happening here? 1473 01:06:10,450 --> 01:06:16,140 If you think about this picture, TMP is initially pointing at not the list, 1474 01:06:16,140 --> 01:06:17,550 but list arrow next. 1475 01:06:17,550 --> 01:06:20,940 So TMP, represented by my right hand here, is pointing at the number 2. 1476 01:06:20,940 --> 01:06:25,530 Totally safe and reasonable to free now the list itself a.k.a.
1477 01:06:25,530 --> 01:06:27,150 the address of the number 1 node. 1478 01:06:27,150 --> 01:06:29,880 That has the effect of just throwing away the number 1 node, 1479 01:06:29,880 --> 01:06:32,670 telling the computer you can reuse that memory for you. 1480 01:06:32,670 --> 01:06:36,150 The last line of code I wrote updated list to point at the number 1481 01:06:36,150 --> 01:06:40,560 2, at which point my loop proceeded to do the exact same thing again. 1482 01:06:40,560 --> 01:06:43,590 And only once my finger is literally pointing at nowhere, 1483 01:06:43,590 --> 01:06:46,350 the null symbol, will the loop, by nature of a while 1484 01:06:46,350 --> 01:06:48,990 loop as I'll toggle back to, break out. 1485 01:06:48,990 --> 01:06:51,630 And there's nothing more to be freed. 1486 01:06:51,630 --> 01:06:54,690 So again, what you'll see, ultimately, in problem set 5, 1487 01:06:54,690 --> 01:06:58,690 more on that later, is an opportunity to play around with just this syntax. 1488 01:06:58,690 --> 01:06:59,730 But also these ideas. 1489 01:06:59,730 --> 01:07:02,580 But again, even though the syntax is admittedly pretty cryptic, 1490 01:07:02,580 --> 01:07:06,300 we're still using basics like these for-loops or while loops. 1491 01:07:06,300 --> 01:07:09,960 We're just starting to now follow explicit addresses rather 1492 01:07:09,960 --> 01:07:13,740 than letting the computer do all of the arithmetic for us, 1493 01:07:13,740 --> 01:07:15,635 as we previously benefited from. 1494 01:07:15,635 --> 01:07:18,760 At the very end of this thing, I'm going to return 0 as though all is well. 1495 01:07:18,760 --> 01:07:22,240 And I think, then, we're good to go. 1496 01:07:22,240 --> 01:07:22,740 All right. 1497 01:07:22,740 --> 01:07:25,960 Questions on this linked list code now? 1498 01:07:25,960 --> 01:07:28,710 And again, we'll walk through this again in the coming weeks spec. 1499 01:07:28,710 --> 01:07:29,210 Yeah. 
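The freeing loop described above might be written in C as follows. The three statements inside the while loop mirror the lecture; the function name free_list and its return value (the number of nodes freed) are assumptions added for illustration.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
}
node;

// Hypothetical helper: free every node in the list and
// return how many nodes were freed.
int free_list(node *list)
{
    int freed = 0;
    while (list != NULL)
    {
        // Get one step ahead: remember the next node's address first,
        // because list->next must not be read after free(list).
        node *tmp = list->next;
        free(list);
        list = tmp;
        freed++;
    }
    return freed;
}
```

A helper along these lines could also stand in for the manual frees on the out-of-memory path discussed earlier, since it always reads next before freeing a node and so avoids touching memory that has already been freed.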
1500 01:07:29,210 --> 01:07:33,613 AUDIENCE: Can you explain the while loop [INAUDIBLE] starts in other ways? 1501 01:07:33,613 --> 01:07:34,280 SPEAKER 1: Sure. 1502 01:07:34,280 --> 01:07:37,950 Can we explain this while loop here for freeing the list. 1503 01:07:37,950 --> 01:07:40,580 So notice that, first, I'm just asking the obvious question. 1504 01:07:40,580 --> 01:07:41,420 Is the list null? 1505 01:07:41,420 --> 01:07:45,390 Because if it is, there's no work to be done. 1506 01:07:45,390 --> 01:07:49,460 However, while the list is not null, according to line 58, 1507 01:07:49,460 --> 01:07:50,540 what do we want to do? 1508 01:07:50,540 --> 01:07:54,920 I want to create a temporary variable that points at the same thing 1509 01:07:54,920 --> 01:07:57,540 that list arrow next is pointing at. 1510 01:07:57,540 --> 01:07:58,760 So what does that mean? 1511 01:07:58,760 --> 01:08:00,260 Here is list. 1512 01:08:00,260 --> 01:08:03,690 List arrow next is whatever this thing is here. 1513 01:08:03,690 --> 01:08:06,470 So if my right hand represents the temporary variable, 1514 01:08:06,470 --> 01:08:10,470 I'm literally pointing at the same thing as the list is itself. 1515 01:08:10,470 --> 01:08:13,640 The next line of code, recall, was free the list. 1516 01:08:13,640 --> 01:08:16,400 And unlike, in our world of arrays, like half an hour 1517 01:08:16,400 --> 01:08:19,100 ago where that just meant free the whole darn list, 1518 01:08:19,100 --> 01:08:23,690 you now have taken over control over the computer's memory with a linked list, 1519 01:08:23,690 --> 01:08:25,550 in ways that you didn't with the array. 1520 01:08:25,550 --> 01:08:28,850 The computer knew how to free the whole array because you 1521 01:08:28,850 --> 01:08:30,680 malloc the whole thing at once. 1522 01:08:30,680 --> 01:08:34,580 You are now mallocing the linked list one node at a time. 
1523 01:08:34,580 --> 01:08:37,430 And the operating system does not keep track for you 1524 01:08:37,430 --> 01:08:38,810 of where all these nodes are. 1525 01:08:38,810 --> 01:08:42,470 So when you free list, you are literally freeing 1526 01:08:42,470 --> 01:08:46,430 the value of the list variable, which is just this first node here. 1527 01:08:46,430 --> 01:08:49,820 Then my last line of code, which I'll flip back to in a second, updates 1528 01:08:49,820 --> 01:08:54,500 list to now ignore the freed memory and point at 2. 1529 01:08:54,500 --> 01:08:57,080 And the story then repeats. 1530 01:08:57,080 --> 01:09:00,500 So, again, it's just a very pedantic way of using 1531 01:09:00,500 --> 01:09:04,460 this new syntax of star notation, and the arrow notation, and the like, 1532 01:09:04,460 --> 01:09:08,420 to do the equivalent of walking down all of these arrows. 1533 01:09:08,420 --> 01:09:10,640 Following all of these breadcrumbs. 1534 01:09:10,640 --> 01:09:13,940 But it does take admittedly some getting used to. 1535 01:09:13,940 --> 01:09:16,445 Syntax you only have to do for one week. 1536 01:09:16,445 --> 01:09:18,320 But, again, next week in Python we will begin 1537 01:09:18,320 --> 01:09:20,150 to abstract a lot of this complexity away. 1538 01:09:20,150 --> 01:09:22,020 But none of this complexity is going away. 1539 01:09:22,020 --> 01:09:24,770 It's just that someone else, the authors of Python for instance, 1540 01:09:24,770 --> 01:09:26,908 will have automated this stuff for us. 1541 01:09:26,908 --> 01:09:28,700 The goal this week is to understand what it 1542 01:09:28,700 --> 01:09:31,980 is we're going to get for free, so to speak, next week. 1543 01:09:31,980 --> 01:09:32,480 All right. 1544 01:09:32,480 --> 01:09:36,810 Questions on these linked lists? 1545 01:09:36,810 --> 01:09:37,310 All right. 1546 01:09:37,310 --> 01:09:38,450 Just, yeah, in the back.
1547 01:09:38,450 --> 01:09:41,264 AUDIENCE: So are the while loops strictly necessary 1548 01:09:41,264 --> 01:09:42,728 for the freeing [INAUDIBLE]. 1549 01:09:42,728 --> 01:09:43,770 SPEAKER 1: Fair question. 1550 01:09:43,770 --> 01:09:46,353 Let me summarize as, could we have freed this with a for-loop? 1551 01:09:46,353 --> 01:09:47,279 Absolutely. 1552 01:09:47,279 --> 01:09:48,630 It just is a matter of style. 1553 01:09:48,630 --> 01:09:51,670 It's a little more elegant to do it in a while loop, according to me. 1554 01:09:51,670 --> 01:09:53,672 But other people will reasonably disagree. 1555 01:09:53,672 --> 01:09:56,380 Anything you can do with a while loop you can do with a for-loop, 1556 01:09:56,380 --> 01:09:57,390 and vice versa. 1557 01:09:57,390 --> 01:09:59,729 Do-while loops, recall, are a little different 1558 01:09:59,729 --> 01:10:02,372 in that they will always do at least one thing. 1559 01:10:02,372 --> 01:10:04,830 But for-loops and while loops behave the same in this case. 1560 01:10:04,830 --> 01:10:05,953 AUDIENCE: Thank you. 1561 01:10:05,953 --> 01:10:06,620 SPEAKER 1: Sure. 1562 01:10:06,620 --> 01:10:08,000 Other questions? 1563 01:10:08,000 --> 01:10:10,399 All right, well let's just vary things a little bit here. 1564 01:10:10,399 --> 01:10:12,482 Just to see what some of the pitfalls might now be 1565 01:10:12,482 --> 01:10:14,240 without getting into the weeds of code. 1566 01:10:14,240 --> 01:10:18,229 Indeed, we'll try to save some of that for problem set 5's exploration. 1567 01:10:18,229 --> 01:10:22,520 But instead, let's imagine that we want to create a list here of our own. 1568 01:10:22,520 --> 01:10:25,700 I can offer, in exchange for a few volunteers, some foam fingers 1569 01:10:25,700 --> 01:10:27,617 to bring to the next game, perhaps. 1570 01:10:27,617 --> 01:10:29,450 Could we get maybe just one volunteer first? 1571 01:10:29,450 --> 01:10:30,109 Come on up.
1572 01:10:30,109 --> 01:10:33,109 You will be our linked list from the get go. 1573 01:10:33,109 --> 01:10:33,913 What's your name? 1574 01:10:33,913 --> 01:10:34,580 AUDIENCE: Pedro. 1575 01:10:34,580 --> 01:10:36,840 SPEAKER 1: Pedro, come on up. 1576 01:10:36,840 --> 01:10:38,090 All right, thank you to Pedro. 1577 01:10:38,090 --> 01:10:41,180 [AUDIENCE CLAPPING] 1578 01:10:41,180 --> 01:10:43,180 And if you want to just stand roughly over here. 1579 01:10:43,180 --> 01:10:45,729 But you are a null pointer so just point sort of at the ground, 1580 01:10:45,729 --> 01:10:46,930 as though you're pointing at 0. 1581 01:10:46,930 --> 01:10:47,430 All right. 1582 01:10:47,430 --> 01:10:50,027 So Pedro is our linked list of size 0, which pictorially 1583 01:10:50,027 --> 01:10:53,319 might look a little something like this for consistency with our past pictures. 1584 01:10:53,319 --> 01:10:58,000 Now suppose that we want to go ahead and malloc, oh, how about the number 2. 1585 01:10:58,000 --> 01:11:00,200 Can we get a volunteer to be on camera here? 1586 01:11:00,200 --> 01:11:00,700 OK. 1587 01:11:00,700 --> 01:11:01,867 You jumped out of your seat. 1588 01:11:01,867 --> 01:11:04,408 Do you want to come up? 1589 01:11:04,408 --> 01:11:06,200 OK, you really want the foam finger, I say. 1590 01:11:06,200 --> 01:11:06,370 All right. 1591 01:11:06,370 --> 01:11:07,450 Round of applause, sure. 1592 01:11:07,450 --> 01:11:12,690 [AUDIENCE CLAPPING] 1593 01:11:12,690 --> 01:11:13,235 OK. 1594 01:11:13,235 --> 01:11:14,110 And what's your name? 1595 01:11:14,110 --> 01:11:14,970 AUDIENCE: Caleb. 1596 01:11:14,970 --> 01:11:15,430 SPEAKER 1: Say again? 1597 01:11:15,430 --> 01:11:15,760 AUDIENCE: Caleb. 1598 01:11:15,760 --> 01:11:16,030 SPEAKER 1: Halen? 1599 01:11:16,030 --> 01:11:16,762 AUDIENCE: Caleb. 1600 01:11:16,762 --> 01:11:17,470 SPEAKER 1: Caleb. 1601 01:11:17,470 --> 01:11:18,770 Caleb, sorry. 1602 01:11:18,770 --> 01:11:19,270 All right. 
1603 01:11:19,270 --> 01:11:21,790 So here is your number 2 for your number field. 1604 01:11:21,790 --> 01:11:23,020 And here is your pointer. 1605 01:11:23,020 --> 01:11:26,115 And come on, let's say that there was room for Caleb like, right there. 1606 01:11:26,115 --> 01:11:26,740 That's perfect. 1607 01:11:26,740 --> 01:11:29,480 So Caleb got malloced, if you will, over here. 1608 01:11:29,480 --> 01:11:33,805 So now if we want to insert Caleb and the number 2 into this linked list, 1609 01:11:33,805 --> 01:11:34,930 well what do we need to do? 1610 01:11:34,930 --> 01:11:36,340 I already initialized you to 2. 1611 01:11:36,340 --> 01:11:38,320 And pointing as you are to the ground means 1612 01:11:38,320 --> 01:11:40,630 you're initialized to null for your next field. 1613 01:11:40,630 --> 01:11:42,400 Pedro, what should you-- perfect. 1614 01:11:42,400 --> 01:11:43,720 What should Pedro do? 1615 01:11:43,720 --> 01:11:44,620 That's fine, too. 1616 01:11:44,620 --> 01:11:46,195 So Pedro is now pointing at the list. 1617 01:11:46,195 --> 01:11:48,320 So now our list looks a little something like this. 1618 01:11:48,320 --> 01:11:49,540 So far, so good. 1619 01:11:49,540 --> 01:11:50,170 All is well. 1620 01:11:50,170 --> 01:11:52,670 So the first couple of these will be pretty straightforward. 1621 01:11:52,670 --> 01:11:56,180 Let's insert one more, if anyone really wants another foam finger. 1622 01:11:56,180 --> 01:11:57,680 Here, how about right in the middle. 1623 01:11:57,680 --> 01:11:58,870 Come on down. 1624 01:11:58,870 --> 01:12:01,678 And just in anticipation, how about let's malloc someone else. 1625 01:12:01,678 --> 01:12:03,220 OK, your friends are pointing at you. 1626 01:12:03,220 --> 01:12:05,350 Do you want to come down too, preemptively? 1627 01:12:05,350 --> 01:12:07,852 This is a pool of memory, if you will. 1628 01:12:07,852 --> 01:12:08,560 What's your name? 1629 01:12:08,560 --> 01:12:09,130 AUDIENCE: Hannah. 
1630 01:12:09,130 --> 01:12:09,880 SPEAKER 1: Hannah. 1631 01:12:09,880 --> 01:12:10,600 All right, Hannah. 1632 01:12:10,600 --> 01:12:11,440 You are number 4. 1633 01:12:11,440 --> 01:12:13,180 [AUDIENCE CLAPPING] 1634 01:12:13,180 --> 01:12:14,810 And hang there for just a moment. 1635 01:12:14,810 --> 01:12:15,310 All right. 1636 01:12:15,310 --> 01:12:16,870 So we've just malloced Hannah. 1637 01:12:16,870 --> 01:12:20,140 And Hannah, how about Hannah, suppose you ended up over there 1638 01:12:20,140 --> 01:12:21,800 in just some random location. 1639 01:12:21,800 --> 01:12:22,300 All right. 1640 01:12:22,300 --> 01:12:25,960 So what should we now do, if the goal is to keep these things sorted? 1641 01:12:25,960 --> 01:12:26,560 How about? 1642 01:12:26,560 --> 01:12:28,538 So Pedro, do you have to update yourself? 1643 01:12:28,538 --> 01:12:29,080 AUDIENCE: No. 1644 01:12:29,080 --> 01:12:29,410 SPEAKER 1: No. 1645 01:12:29,410 --> 01:12:29,910 All right. 1646 01:12:29,910 --> 01:12:31,300 Caleb, what do you have to do? 1647 01:12:31,300 --> 01:12:31,800 OK. 1648 01:12:31,800 --> 01:12:34,692 And Hannah, what should you be doing? 1649 01:12:34,692 --> 01:12:37,900 I would, it's just for you for now, so point at the ground representing null. 1650 01:12:37,900 --> 01:12:38,400 OK. 1651 01:12:38,400 --> 01:12:41,290 So, again demonstrating the fact that, unlike in past weeks where 1652 01:12:41,290 --> 01:12:43,810 we had our nice, clean array back, to back, to back, 1653 01:12:43,810 --> 01:12:46,380 contiguously, these guys are deliberately all over the stage. 1654 01:12:46,380 --> 01:12:47,380 So let's malloc another. 1655 01:12:47,380 --> 01:12:49,012 How about number 5. 1656 01:12:49,012 --> 01:12:49,720 What's your name? 1657 01:12:49,720 --> 01:12:50,440 AUDIENCE: Jonathan. 1658 01:12:50,440 --> 01:12:50,920 SPEAKER 1: Jonathan. 1659 01:12:50,920 --> 01:12:51,753 All right, Jonathan. 1660 01:12:51,753 --> 01:12:53,440 You are our number 5. 
1661 01:12:53,440 --> 01:12:55,255 And pick your favorite place in memory. 1662 01:12:55,255 --> 01:12:56,200 [AUDIENCE CLAPPING] 1663 01:12:56,200 --> 01:12:56,700 OK. 1664 01:12:58,820 --> 01:12:59,320 All right. 1665 01:12:59,320 --> 01:13:01,548 So Jonathan's now over there. 1666 01:13:01,548 --> 01:13:02,590 And Hannah is over there. 1667 01:13:02,590 --> 01:13:04,447 So 5, we want to point Hannah at number 5. 1668 01:13:04,447 --> 01:13:06,280 So you, of course, are going to point there. 1669 01:13:06,280 --> 01:13:07,655 And where should you be pointing? 1670 01:13:07,655 --> 01:13:09,500 Down to represent null, as well. 1671 01:13:09,500 --> 01:13:10,000 OK. 1672 01:13:10,000 --> 01:13:11,553 So pretty straightforward. 1673 01:13:11,553 --> 01:13:13,220 But now things get a little interesting. 1674 01:13:13,220 --> 01:13:16,000 And here, we'll use a chance to, without the weeds of code, 1675 01:13:16,000 --> 01:13:19,090 point out how order of operations is really going to matter. 1676 01:13:19,090 --> 01:13:23,320 Suppose that I next want to allocate say, the number 1. 1677 01:13:23,320 --> 01:13:25,510 And I want to insert the number 1 into this list. 1678 01:13:25,510 --> 01:13:26,010 Yes. 1679 01:13:26,010 --> 01:13:27,620 This is what the code would look like. 1680 01:13:27,620 --> 01:13:31,180 But if we act this out-- could we get one more volunteer? 1681 01:13:31,180 --> 01:13:32,990 How about on the end there in the sweater. 1682 01:13:32,990 --> 01:13:33,490 Yeah. 1683 01:13:33,490 --> 01:13:34,780 Come on down. 1684 01:13:34,780 --> 01:13:35,950 We have, what's your name? 1685 01:13:35,950 --> 01:13:36,850 AUDIENCE: Lauren. 1686 01:13:36,850 --> 01:13:37,300 SPEAKER 1: Lauren. 1687 01:13:37,300 --> 01:13:37,540 OK. 1688 01:13:37,540 --> 01:13:38,650 Lauren, come on down. 
1689 01:13:38,650 --> 01:13:43,975 [AUDIENCE CLAPPING] 1690 01:13:43,975 --> 01:13:45,850 And how about, Lauren, why don't you go right 1691 01:13:45,850 --> 01:13:47,470 in here in front, if you don't mind. 1692 01:13:47,470 --> 01:13:48,670 Here is your number. 1693 01:13:48,670 --> 01:13:49,780 Here is your pointer. 1694 01:13:49,780 --> 01:13:51,850 So I've initialized Lauren to the number 1. 1695 01:13:51,850 --> 01:13:54,460 And your pointer will be null, pointing at the ground. 1696 01:13:54,460 --> 01:13:57,003 Where do you belong if we're maintaining sorted order? 1697 01:13:57,003 --> 01:13:58,420 Looks like right at the beginning. 1698 01:13:58,420 --> 01:14:00,920 What should happen here? 1699 01:14:00,920 --> 01:14:01,420 OK. 1700 01:14:01,420 --> 01:14:06,100 So Pedro has presumed to point now at Lauren. 1701 01:14:06,100 --> 01:14:10,330 But how do you know where to point? 1702 01:14:10,330 --> 01:14:11,500 AUDIENCE: He's number 2. 1703 01:14:11,500 --> 01:14:13,400 SPEAKER 1: Pedro's undoing what he did a moment ago. 1704 01:14:13,400 --> 01:14:14,380 So this was deliberate. 1705 01:14:14,380 --> 01:14:17,750 And that was perfect that Pedro presumed to point immediately at Lauren. 1706 01:14:17,750 --> 01:14:18,250 Why? 1707 01:14:18,250 --> 01:14:21,950 You literally just orphaned all of these folks, all of these chunks of memory. 1708 01:14:21,950 --> 01:14:22,450 Why? 1709 01:14:22,450 --> 01:14:26,800 Because if Pedro was our only variable pointing at that chunk of memory, 1710 01:14:26,800 --> 01:14:29,800 this is the danger of using pointers, and dynamic memory allocation, 1711 01:14:29,800 --> 01:14:31,180 and building your own data structures. 1712 01:14:31,180 --> 01:14:33,138 The moment you point temporarily, if you could, 1713 01:14:33,138 --> 01:14:36,490 to Lauren, I have no idea where he's pointing to. 1714 01:14:36,490 --> 01:14:41,260 I have no idea how to get back to Caleb, or Hannah, or anyone else on stage. 
1715 01:14:41,260 --> 01:14:42,040 So that was bad. 1716 01:14:42,040 --> 01:14:43,310 So you did undo it. 1717 01:14:43,310 --> 01:14:44,290 So that's good. 1718 01:14:44,290 --> 01:14:46,300 I think we need Lauren to make a decision first. 1719 01:14:46,300 --> 01:14:47,410 Who should you point at? 1720 01:14:47,410 --> 01:14:47,650 AUDIENCE: Caleb. 1721 01:14:47,650 --> 01:14:48,820 SPEAKER 1: So pointing at Caleb. 1722 01:14:48,820 --> 01:14:49,120 Why? 1723 01:14:49,120 --> 01:14:51,703 Because you're pointing at literally who Pedro is pointing at. 1724 01:14:51,703 --> 01:14:53,490 Pedro, now what are you safe to do? 1725 01:14:53,490 --> 01:14:53,990 Good. 1726 01:14:53,990 --> 01:14:55,730 So order of operations there matters. 1727 01:14:55,730 --> 01:14:59,830 And if we had just done this line of code in red here, list equals n. 1728 01:14:59,830 --> 01:15:02,740 That was like Pedro's first instinct, bad things happen. 1729 01:15:02,740 --> 01:15:04,700 And we orphaned the rest of the list. 1730 01:15:04,700 --> 01:15:08,350 But if we think through it logically and do this, as Lauren did for us, instead, 1731 01:15:08,350 --> 01:15:11,840 we've now updated the list to look a little something more like this. 1732 01:15:11,840 --> 01:15:12,910 Let's do one last one. 1733 01:15:12,910 --> 01:15:15,485 We got one more foam finger here for the number 3. 1734 01:15:15,485 --> 01:15:16,360 How about on the end? 1735 01:15:16,360 --> 01:15:16,860 Yeah. 1736 01:15:16,860 --> 01:15:18,190 You want to come down. 1737 01:15:18,190 --> 01:15:18,850 All right. 1738 01:15:18,850 --> 01:15:19,900 One final volunteer. 1739 01:15:19,900 --> 01:15:26,010 [AUDIENCE CLAPPING] 1740 01:15:26,010 --> 01:15:26,510 All right. 1741 01:15:26,510 --> 01:15:27,385 And what's your name? 1742 01:15:27,385 --> 01:15:28,230 AUDIENCE: Miriam. 1743 01:15:28,230 --> 01:15:28,430 SPEAKER 1: I'm sorry? 1744 01:15:28,430 --> 01:15:28,940 AUDIENCE: Miriam. 
1745 01:15:28,940 --> 01:15:29,480 SPEAKER 1: Miriam. 1746 01:15:29,480 --> 01:15:29,750 All right. 1747 01:15:29,750 --> 01:15:30,860 So here is your number 3. 1748 01:15:30,860 --> 01:15:31,735 Here is your pointer. 1749 01:15:31,735 --> 01:15:35,370 If you want to go maybe in the middle of the stage in a random memory location. 1750 01:15:35,370 --> 01:15:39,270 So here, too, the goal is to maintain sorted order. 1751 01:15:39,270 --> 01:15:44,400 So let's ask the audience, who or what number should point at whom first here? 1752 01:15:44,400 --> 01:15:46,910 So we don't screw up and orphan some of the memory. 1753 01:15:46,910 --> 01:15:50,240 And if we do orphan memory, this is what's called, again per last week, 1754 01:15:50,240 --> 01:15:51,110 a memory leak. 1755 01:15:51,110 --> 01:15:53,420 Your Mac, your PC, your phone can start to slow down 1756 01:15:53,420 --> 01:15:56,610 if you keep asking for memory but never give it back or lose track of it. 1757 01:15:56,610 --> 01:15:58,430 So we want to get this right. 1758 01:15:58,430 --> 01:16:00,140 Who should point at whom? 1759 01:16:00,140 --> 01:16:01,370 Or what number? 1760 01:16:01,370 --> 01:16:02,312 Say again. 1761 01:16:02,312 --> 01:16:03,020 AUDIENCE: 3 to 4. 1762 01:16:03,020 --> 01:16:04,700 SPEAKER 1: 3 should point at 4. 1763 01:16:04,700 --> 01:16:08,090 So 3, do you want to point at 4. 1764 01:16:08,090 --> 01:16:09,800 And not, so, OK, good. 1765 01:16:09,800 --> 01:16:14,960 And how did you know, Miriam, whom to point at? 1766 01:16:14,960 --> 01:16:15,998 AUDIENCE: Copying Caleb. 1767 01:16:15,998 --> 01:16:16,790 SPEAKER 1: Perfect. 1768 01:16:16,790 --> 01:16:18,150 OK, so copying Caleb. 1769 01:16:18,150 --> 01:16:18,650 Why? 1770 01:16:18,650 --> 01:16:22,220 Because if you look at where this list is currently constructed, 1771 01:16:22,220 --> 01:16:25,070 and you can cheat on the board here, 2 is pointing to 4. 
1772 01:16:25,070 --> 01:16:28,640 If you point at whoever Caleb, number 2, is pointing at, 1773 01:16:28,640 --> 01:16:31,460 that, indeed, leads you to Hannah for number 4. 1774 01:16:31,460 --> 01:16:35,600 So now what's the next step to stitch this together? 1775 01:16:35,600 --> 01:16:37,220 Our voice in the crowd. 1776 01:16:37,220 --> 01:16:38,150 AUDIENCE: 2 to 3. 1777 01:16:38,150 --> 01:16:39,260 SPEAKER 1: 2 to 3. 1778 01:16:39,260 --> 01:16:40,310 So, 2 to 3. 1779 01:16:40,310 --> 01:16:42,903 So Caleb, I think it's now safe for you to decouple. 1780 01:16:42,903 --> 01:16:44,820 Because someone is already pointing at Hannah. 1781 01:16:44,820 --> 01:16:45,945 We haven't orphaned anyone. 1782 01:16:45,945 --> 01:16:47,840 So now, if we follow the breadcrumbs, we've 1783 01:16:47,840 --> 01:16:52,870 got Pedro leading to 1, to 2, to 3, to 4, to 5. 1784 01:16:52,870 --> 01:16:55,370 We need the numbers back, but you can keep the foam fingers. 1785 01:16:55,370 --> 01:16:57,537 Thank you to our volunteers here. 1786 01:16:57,537 --> 01:16:58,370 AUDIENCE: Thank you. 1787 01:16:58,370 --> 01:16:58,870 Thank you. 1788 01:16:58,870 --> 01:17:00,260 [AUDIENCE CLAPPING] 1789 01:17:00,260 --> 01:17:03,257 SPEAKER 1: You can just put the numbers here. 1790 01:17:03,257 --> 01:17:04,090 AUDIENCE: Thank you. 1791 01:17:04,090 --> 01:17:05,257 SPEAKER 1: Thank you to all. 1792 01:17:05,257 --> 01:17:09,200 So this is only to say that when you start looking at the code this week 1793 01:17:09,200 --> 01:17:11,763 and in the problem set, it's going to be very easy to lose 1794 01:17:11,763 --> 01:17:13,180 sight of the forest for the trees. 1795 01:17:13,180 --> 01:17:15,220 Because the code does get really dense. 1796 01:17:15,220 --> 01:17:20,240 But the ideas, again, really do bubble up to these higher level descriptions. 1797 01:17:20,240 --> 01:17:23,300 And if you think about data structures at this level. 
1798 01:17:23,300 --> 01:17:25,417 If you go off and program after a class like CS50 1799 01:17:25,417 --> 01:17:28,000 and you're whiteboarding something with a friend or a colleague, 1800 01:17:28,000 --> 01:17:31,030 most people think at and talk at this level. 1801 01:17:31,030 --> 01:17:33,550 And they just assume that, yeah, if we went back and looked 1802 01:17:33,550 --> 01:17:36,890 at our textbooks or class notes, we could figure out how to implement this. 1803 01:17:36,890 --> 01:17:38,740 But the important stuff is the conversation. 1804 01:17:38,740 --> 01:17:40,120 And the ideas are up here. 1805 01:17:40,120 --> 01:17:45,080 Even though, via this week, we will get some practice with the actual code. 1806 01:17:45,080 --> 01:17:49,090 So when it comes to analyzing an algorithm like this, 1807 01:17:49,090 --> 01:17:51,160 let's consider the following. 1808 01:17:51,160 --> 01:17:58,480 What might be now the running time of operations like searching and inserting 1809 01:17:58,480 --> 01:18:00,100 into a linked list? 1810 01:18:00,100 --> 01:18:01,810 We talked about arrays earlier. 1811 01:18:01,810 --> 01:18:04,810 And we had some binary search possibilities still, as soon 1812 01:18:04,810 --> 01:18:05,650 as it's an array. 1813 01:18:05,650 --> 01:18:08,830 But as soon as we have a linked list, these arrows, like our volunteers, 1814 01:18:08,830 --> 01:18:10,180 could be anywhere on stage. 1815 01:18:10,180 --> 01:18:11,888 And so you can't just assume that you can 1816 01:18:11,888 --> 01:18:14,680 jump arithmetically to the middle element, to the middle element, 1817 01:18:14,680 --> 01:18:15,500 to the middle one. 1818 01:18:15,500 --> 01:18:19,090 You pretty much have to follow all of these breadcrumbs again and again. 1819 01:18:19,090 --> 01:18:21,880 So how might that inform what we see? 1820 01:18:21,880 --> 01:18:23,595 Well, consider this too. 
1821 01:18:23,595 --> 01:18:26,470 Even though I keep drawing all these pictures with all of the numbers 1822 01:18:26,470 --> 01:18:26,980 exposed. 1823 01:18:26,980 --> 01:18:28,772 And all of us humans in the room can easily 1824 01:18:28,772 --> 01:18:32,360 spot where the 1 is, where the 2 is, where the 3 is, the computer, again, 1825 01:18:32,360 --> 01:18:36,610 just like with our lockers and arrays, can only see one location at a time. 1826 01:18:36,610 --> 01:18:40,510 And the key thing with a linked list is that the only address 1827 01:18:40,510 --> 01:18:44,410 we've fundamentally been remembering is what Pedro represented a moment ago. 1828 01:18:44,410 --> 01:18:47,990 He was the link to all of the other nodes. 1829 01:18:47,990 --> 01:18:49,990 And, in turn, each person led to the next. 1830 01:18:49,990 --> 01:18:54,650 But without Pedro, we would have lost some of, or all of, the linked list. 1831 01:18:54,650 --> 01:18:56,950 So when you start with a linked list, if you 1832 01:18:56,950 --> 01:19:00,730 want to find an element via search, you have to do it linearly. 1833 01:19:00,730 --> 01:19:02,200 Following all of the arrows. 1834 01:19:02,200 --> 01:19:04,210 Following all of the pointers on the stage 1835 01:19:04,210 --> 01:19:06,340 in order to get to the node in question. 1836 01:19:06,340 --> 01:19:09,700 And only once you hit null can you conclude, yep, it was there. 1837 01:19:09,700 --> 01:19:11,500 Or no, it was not. 1838 01:19:11,500 --> 01:19:14,440 So given that if a computer, essentially, 1839 01:19:14,440 --> 01:19:18,970 can only see the number 1, or the number 2, or the number 3, or the number 4, 1840 01:19:18,970 --> 01:19:22,270 or the number 5, one at a time, how might we 1841 01:19:22,270 --> 01:19:25,690 think about the running time of search? 1842 01:19:25,690 --> 01:19:27,610 And it is indeed Big O of n. 1843 01:19:27,610 --> 01:19:28,410 But why is that? 
1844 01:19:28,410 --> 01:19:30,910 Well, in the worst case, the number you might be looking for 1845 01:19:30,910 --> 01:19:32,480 is all the way at the end. 1846 01:19:32,480 --> 01:19:35,710 And so, obviously, you're going to have to search all of the n elements. 1847 01:19:35,710 --> 01:19:37,943 And I drew these things with boxes on top of them. 1848 01:19:37,943 --> 01:19:40,360 Because, again, even though you and I can immediately see, 1849 01:19:40,360 --> 01:19:42,610 where the 5 is for instance, the computer 1850 01:19:42,610 --> 01:19:46,480 can only figure that out by starting at the beginning and going there. 1851 01:19:46,480 --> 01:19:48,400 So there, too, is another trade off. 1852 01:19:48,400 --> 01:19:52,030 It would seem that, overnight, we have lost the ability 1853 01:19:52,030 --> 01:19:57,190 to do a very powerful algorithm from week 0 known as binary search, right. 1854 01:19:57,190 --> 01:19:57,820 It's gone. 1855 01:19:57,820 --> 01:20:01,810 Because there's no way in this picture to jump mathematically 1856 01:20:01,810 --> 01:20:04,375 to the middle node, unless you remember where it is. 1857 01:20:04,375 --> 01:20:06,250 And then, remember where every other node is. 1858 01:20:06,250 --> 01:20:08,042 And at that point, you're back to an array. 1859 01:20:08,042 --> 01:20:12,380 Linked lists, by design, only remember the next node in the list. 1860 01:20:12,380 --> 01:20:12,880 All right. 1861 01:20:12,880 --> 01:20:15,370 How about something like insert? 1862 01:20:15,370 --> 01:20:18,190 In the worst case, perhaps, how many steps 1863 01:20:18,190 --> 01:20:21,340 might it take to insert something into a linked list? 1864 01:20:21,340 --> 01:20:22,998 Someone else. 1865 01:20:22,998 --> 01:20:23,540 Someone else. 1866 01:20:23,540 --> 01:20:24,040 Yeah. 1867 01:20:24,040 --> 01:20:25,060 AUDIENCE: N squared. 1868 01:20:25,060 --> 01:20:25,480 SPEAKER 1: Say again? 1869 01:20:25,480 --> 01:20:26,320 AUDIENCE: N squared. 
1870 01:20:26,320 --> 01:20:26,890 SPEAKER 1: N squared. 1871 01:20:26,890 --> 01:20:28,232 Fortunately, it's not that bad. 1872 01:20:28,232 --> 01:20:29,440 It's not as bad as n squared. 1873 01:20:29,440 --> 01:20:31,720 That typically means doing n things, n times. 1874 01:20:31,720 --> 01:20:36,260 And I think we can stay under that, but not a bad thought. 1875 01:20:36,260 --> 01:20:36,760 Yeah. 1876 01:20:36,760 --> 01:20:37,832 AUDIENCE: Is it n? 1877 01:20:37,832 --> 01:20:39,040 SPEAKER 1: Why would it be n? 1878 01:20:39,040 --> 01:20:42,787 AUDIENCE: Because the [INAUDIBLE]. 1879 01:20:42,787 --> 01:20:43,370 SPEAKER 1: OK. 1880 01:20:43,370 --> 01:20:45,650 So to summarize, you're proposing n. 1881 01:20:45,650 --> 01:20:47,513 Because to find where the thing goes, you 1882 01:20:47,513 --> 01:20:49,430 have to traverse, potentially, the whole list. 1883 01:20:49,430 --> 01:20:52,220 Because if I'm inserting the number 6 or the number 99, 1884 01:20:52,220 --> 01:20:54,770 that numerically belongs at the very end, 1885 01:20:54,770 --> 01:20:57,830 I can only find its location by looking for all of them. 1886 01:20:57,830 --> 01:20:59,368 At this point, though, in the term. 1887 01:20:59,368 --> 01:21:01,160 And really, at this point in the story, you 1888 01:21:01,160 --> 01:21:04,590 should start to question these very simplistic questions, to be honest. 1889 01:21:04,590 --> 01:21:08,360 Because the answer is almost always going to depend, right. 1890 01:21:08,360 --> 01:21:10,980 If I've just got a linked list that looks like this, 1891 01:21:10,980 --> 01:21:14,240 the first question back to someone asking this question 1892 01:21:14,240 --> 01:21:17,300 would be, well does the list need to be sorted, right? 1893 01:21:17,300 --> 01:21:19,692 I've drawn it as sorted and it might imply as much. 1894 01:21:19,692 --> 01:21:21,650 So that's a reasonable assumption to have made. 
1895 01:21:21,650 --> 01:21:24,320 But if I don't care about maintaining sorted order, 1896 01:21:24,320 --> 01:21:28,190 I could actually insert into a linked list in constant time. 1897 01:21:28,190 --> 01:21:28,730 Why? 1898 01:21:28,730 --> 01:21:31,628 I could just keep inserting into the beginning, into the beginning, 1899 01:21:31,628 --> 01:21:32,420 into the beginning. 1900 01:21:32,420 --> 01:21:34,310 And even though the list is getting longer, 1901 01:21:34,310 --> 01:21:38,270 the number of steps required to insert something before the first element 1902 01:21:38,270 --> 01:21:40,220 is not growing at all. 1903 01:21:40,220 --> 01:21:42,740 You just keep inserting. 1904 01:21:42,740 --> 01:21:44,900 If you want to keep it sorted though, yes, it's 1905 01:21:44,900 --> 01:21:46,310 going to be, indeed, Big O of n. 1906 01:21:46,310 --> 01:21:47,840 But again, these kinds of, now, assumptions 1907 01:21:47,840 --> 01:21:49,048 are going to start to matter. 1908 01:21:49,048 --> 01:21:51,740 So let's for the sake of discussion say it's Big O of n, 1909 01:21:51,740 --> 01:21:53,660 if we do want to maintain sorted order. 1910 01:21:53,660 --> 01:21:56,810 But what about in the case of not caring. 1911 01:21:56,810 --> 01:21:58,628 It might indeed be Big O of 1. 1912 01:21:58,628 --> 01:22:01,670 And now these are the kinds of decisions that we'll start to leave to you. 1913 01:22:01,670 --> 01:22:03,200 What about in the best case here? 1914 01:22:03,200 --> 01:22:05,240 If we're thinking about Big Omega notation, 1915 01:22:05,240 --> 01:22:07,632 then, frankly, we could just get lucky in the best case. 1916 01:22:07,632 --> 01:22:10,340 And the element we're looking for happens to be at the beginning. 1917 01:22:10,340 --> 01:22:14,570 Or heck, we just blindly insert to the beginning irrespective of the order 1918 01:22:14,570 --> 01:22:16,500 that we want to keep things in. 1919 01:22:16,500 --> 01:22:17,000 All right. 
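The two kinds of insertion discussed above can be sketched in C roughly as follows. This is a minimal sketch, not the code shown on the lecture slides: the function names `prepend` and `insert_sorted` are made up for illustration, and the node is assumed to hold a `number` and a `next` pointer, as the volunteers did on stage. In both functions the new node is pointed at the rest of the list first, and only then is the existing pointer redirected, which is the order of operations Lauren and Miriam acted out.

```c
#include <stdbool.h>
#include <stdlib.h>

// A linked-list node: a number plus a pointer to the next node
typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: insert at the beginning, ignoring sorted order.
// Takes the same number of steps no matter how long the list is: O(1).
bool prepend(node **list, int number)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }
    n->number = number;

    // First point the new node at the current first node...
    n->next = *list;

    // ...and only then update the list, so nothing is orphaned
    *list = n;
    return true;
}

// Hypothetical helper: insert while keeping ascending order.
// May walk the whole list to find the right spot: O(n).
bool insert_sorted(node **list, int number)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }
    n->number = number;

    // link refers to the pointer that should end up pointing at n:
    // either the list variable itself or some node's next field
    node **link = list;
    while (*link != NULL && (*link)->number < number)
    {
        link = &(*link)->next;
    }

    // The new node copies its predecessor's pointer first...
    n->next = *link;

    // ...then the predecessor is redirected to the new node
    *link = n;
    return true;
}
```

Starting from `node *list = NULL;` and calling `insert_sorted(&list, ...)` with 2, 4, 5, 1, and 3 in that order should leave the list reading 1, 2, 3, 4, 5 when the pointers are followed from the front. The nodes would still need to be freed afterward to avoid the memory leak mentioned earlier.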
1920 01:22:17,000 --> 01:22:22,418 So besides then, how can we improve further on this design? 1921 01:22:22,418 --> 01:22:23,960 We don't need to stop at linked lists. 1922 01:22:23,960 --> 01:22:26,090 Because, honestly, it's not been a clear win. 1923 01:22:26,090 --> 01:22:28,940 Like, linked lists allow us to use more of our memory 1924 01:22:28,940 --> 01:22:32,430 because we don't need massive growing chunks of contiguous memory. 1925 01:22:32,430 --> 01:22:33,300 So that's a win. 1926 01:22:33,300 --> 01:22:37,310 But they still require Big O of n time to find the end of it, 1927 01:22:37,310 --> 01:22:38,630 if we care about order. 1928 01:22:38,630 --> 01:22:41,870 We're using at least twice as much memory for the darn pointer. 1929 01:22:41,870 --> 01:22:44,120 So that seems like a sidestep. 1930 01:22:44,120 --> 01:22:46,100 It's not really a step forward. 1931 01:22:46,100 --> 01:22:47,840 So can we do better? 1932 01:22:47,840 --> 01:22:52,157 Here's where we can now accelerate the story by just stipulating that, hey, 1933 01:22:52,157 --> 01:22:53,990 even if you haven't used this technique yet, 1934 01:22:53,990 --> 01:22:58,130 we would seem to have an ability to stitch together pieces of memory just 1935 01:22:58,130 --> 01:22:59,120 using pointers. 1936 01:22:59,120 --> 01:23:01,520 And anything you could imagine drawing with arrows, 1937 01:23:01,520 --> 01:23:04,140 you can implement, it would seem, in code. 1938 01:23:04,140 --> 01:23:06,620 So what if we leverage a second dimension. 1939 01:23:06,620 --> 01:23:09,137 Instead of just stringing together things laterally, 1940 01:23:09,137 --> 01:23:10,970 left to right, essentially, even though they 1941 01:23:10,970 --> 01:23:12,620 were bouncing around on the screen. 1942 01:23:12,620 --> 01:23:15,770 What if we start to leverage a second dimension here, so to speak. 1943 01:23:15,770 --> 01:23:19,400 And build more interesting structures in the computer's memory. 
1944 01:23:19,400 --> 01:23:22,190 Well it turns out that in a computer's memory, 1945 01:23:22,190 --> 01:23:25,130 we could create a tree, similar to a family tree. 1946 01:23:25,130 --> 01:23:28,880 If you've ever seen or drawn a family tree with grandparents, and parents, 1947 01:23:28,880 --> 01:23:30,170 and siblings, and so forth. 1948 01:23:32,960 --> 01:23:36,170 So inverted branch of a tree that grows, typically 1949 01:23:36,170 --> 01:23:39,050 when it's drawn, downward instead of upward like a typical tree. 1950 01:23:39,050 --> 01:23:41,540 But that's something we could translate into code as well. 1951 01:23:41,540 --> 01:23:45,240 Specifically, let's do something called a binary search tree. 1952 01:23:45,240 --> 01:23:47,120 Which is a type of tree. 1953 01:23:47,120 --> 01:23:49,670 And what I mean by this is the following. 1954 01:23:49,670 --> 01:23:50,480 Notice this. 1955 01:23:50,480 --> 01:23:53,360 This is an example of an array from like week 2, 1956 01:23:53,360 --> 01:23:54,750 when we first talked about those. 1957 01:23:54,750 --> 01:23:56,450 And we had the lockers on stage. 1958 01:23:56,450 --> 01:24:02,480 And recall that what was nice about an array, if 1, it's sorted. 1959 01:24:02,480 --> 01:24:05,540 And 2, all of its numbers are indeed contiguous, 1960 01:24:05,540 --> 01:24:07,530 which is by definition an array. 1961 01:24:07,530 --> 01:24:09,270 We can just do some simple math. 1962 01:24:09,270 --> 01:24:13,980 For instance, if there are 7 elements in this array, and we do 7 divided by 2, 1963 01:24:13,980 --> 01:24:14,480 that's what? 1964 01:24:14,480 --> 01:24:17,330 3 and 1/2, round down through truncation, that's 3. 1965 01:24:17,330 --> 01:24:18,680 0, 1, 2, 3. 1966 01:24:18,680 --> 01:24:21,933 That gives me the middle element, arithmetically, in this thing. 
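The middle-element arithmetic just described, and the binary search over a sorted array that it makes possible, might look something like this in C. This is a sketch under assumptions: the names `middle_index` and `binary_search` are hypothetical, and the array is assumed to already be sorted in ascending order.

```c
#include <stdbool.h>

// Index of the middle element of an array of n elements.
// Integer division truncates, so 7 / 2 is 3 rather than 3.5.
int middle_index(int n)
{
    return n / 2;
}

// Hypothetical helper: binary search over a sorted array.
// Each pass discards half of the remaining elements: O(log n).
bool binary_search(const int numbers[], int n, int target)
{
    int low = 0;
    int high = n - 1;
    while (low <= high)
    {
        // Jump arithmetically to the middle of what remains
        int middle = low + (high - low) / 2;
        if (numbers[middle] == target)
        {
            return true;
        }
        else if (numbers[middle] < target)
        {
            // Target can only be in the right half
            low = middle + 1;
        }
        else
        {
            // Target can only be in the left half
            high = middle - 1;
        }
    }
    return false;
}
```

The jump to `numbers[middle]` is a single step only because an array's elements are contiguous, which is exactly what a linked list gives up.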
1967 01:24:21,933 --> 01:24:24,350 And even though I have to be careful about rounding, using 1968 01:24:24,350 --> 01:24:28,430 simple arithmetic, I can very quickly, with a single line of code or math, 1969 01:24:28,430 --> 01:24:30,890 find for you the middle of the left half, of the left half, 1970 01:24:30,890 --> 01:24:32,182 of the right half, or whatever. 1971 01:24:32,182 --> 01:24:33,480 That's the power of arrays. 1972 01:24:33,480 --> 01:24:35,420 And that's what gave us binary search. 1973 01:24:35,420 --> 01:24:36,940 And how did binary search work? 1974 01:24:36,940 --> 01:24:38,190 Well, we looked at the middle. 1975 01:24:38,190 --> 01:24:39,830 And then, we went left or right. 1976 01:24:39,830 --> 01:24:45,080 And then, we went left or right again, implied by this color scheme here. 1977 01:24:45,080 --> 01:24:50,210 Wouldn't it be nice if we somehow preserved the new upsides 1978 01:24:50,210 --> 01:24:53,038 today of dynamic memory allocation, giving ourselves 1979 01:24:53,038 --> 01:24:55,580 the ability to just add another element, add another element, 1980 01:24:55,580 --> 01:24:56,750 add another element. 1981 01:24:56,750 --> 01:24:59,300 But retain the power of binary search. 1982 01:24:59,300 --> 01:25:04,100 Because log of n was much better than n, certainly for large data sets, right. 1983 01:25:04,100 --> 01:25:06,980 Even the phone book demonstrated as much weeks ago. 1984 01:25:06,980 --> 01:25:11,010 So what if I draw this same picture in 2 dimensions. 1985 01:25:11,010 --> 01:25:14,960 And I preserve the color scheme, just so it's obvious what came where. 1986 01:25:14,960 --> 01:25:18,500 What are these things look like now? 1987 01:25:18,500 --> 01:25:21,050 Maybe, like, things we might now call nodes, right. 1988 01:25:21,050 --> 01:25:25,030 A node is just a generic term for like, storing some data. 1989 01:25:25,030 --> 01:25:28,200 What if the data these nodes are storing are numbers. 
1990 01:25:28,200 --> 01:25:29,730 So still integers. 1991 01:25:29,730 --> 01:25:33,860 But what if we connected these cleverly, like an old family tree. 1992 01:25:33,860 --> 01:25:39,230 Whereby, every node has not one pointer now, but as many as 2. 1993 01:25:39,230 --> 01:25:42,330 Maybe 0, like the leaves at the bottom, which are in green. 1994 01:25:42,330 --> 01:25:45,450 But other nodes on the interior might have as many as 2. 1995 01:25:45,450 --> 01:25:47,250 Like having 2 children, so to speak. 1996 01:25:47,250 --> 01:25:49,420 And indeed, the vernacular here is exactly that. 1997 01:25:49,420 --> 01:25:51,330 This would be called the root of the tree. 1998 01:25:51,330 --> 01:25:54,270 Or this would be a parent, with respect to these children. 1999 01:25:54,270 --> 01:25:56,910 The green ones would be grandchildren, with respect to these. 2000 01:25:56,910 --> 01:26:01,530 The green ones would be siblings with respect to each other. 2001 01:26:01,530 --> 01:26:02,370 And over there, too. 2002 01:26:02,370 --> 01:26:04,662 So all the same jargon you might use in the real world, 2003 01:26:04,662 --> 01:26:07,920 applies in the world of data structures and CS trees. 2004 01:26:07,920 --> 01:26:12,810 But this is interesting because I think we could build this now, this data 2005 01:26:12,810 --> 01:26:15,300 structure in the computer's memory. 2006 01:26:15,300 --> 01:26:15,840 How? 2007 01:26:15,840 --> 01:26:20,040 Well, suppose that we defined a node to be no longer just 2008 01:26:20,040 --> 01:26:22,110 this, a number and a next field. 2009 01:26:22,110 --> 01:26:24,870 What if we give ourselves a bit more room here? 2010 01:26:24,870 --> 01:26:29,730 And give ourselves a pointer called left and another one called right. 2011 01:26:29,730 --> 01:26:32,080 Each of which is a pointer to a struct node. 
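In C, the node just described, with a number plus `left` and `right` pointers that each refer to a struct node, could be declared along these lines. The field names `left` and `right` come from the lecture; the field name `number` is an assumption carried over from the linked-list node.

```c
// A tree node: a number plus up to two children.
// A NULL pointer means there is no child on that side.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;
```

A leaf is then simply a node whose `left` and `right` are both `NULL`.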
2012 01:26:32,080 --> 01:26:36,030 So same idea as before, but now we just make sure we think of these things 2013 01:26:36,030 --> 01:26:39,210 as pointing this way and this way, not just this way. 2014 01:26:39,210 --> 01:26:41,280 Not just a single direction, but 2. 2015 01:26:41,280 --> 01:26:45,180 So you could imagine, in code, building something up like this with a node. 2016 01:26:45,180 --> 01:26:48,570 That creates, in essence, this diagram here. 2017 01:26:48,570 --> 01:26:50,250 But why is this compelling? 2018 01:26:50,250 --> 01:26:52,290 Suppose I want to find the number 3. 2019 01:26:52,290 --> 01:26:54,840 I want to search for the number 3 in this tree. 2020 01:26:54,840 --> 01:26:58,200 It would seem, just like Pedro was the beginning of our linked list, 2021 01:26:58,200 --> 01:27:01,090 in the world of trees, the root, so to speak, 2022 01:27:01,090 --> 01:27:03,090 is the beginning of your data structure. 2023 01:27:03,090 --> 01:27:08,730 You can retain and remember this entire tree just by pointing at the root node, 2024 01:27:08,730 --> 01:27:09,270 ultimately. 2025 01:27:09,270 --> 01:27:12,330 One variable can hang on to this whole tree. 2026 01:27:12,330 --> 01:27:14,520 So how can I find the number 3? 2027 01:27:14,520 --> 01:27:18,660 Well, if I look at the root node and the number I'm looking for is less than. 2028 01:27:18,660 --> 01:27:20,250 Notice, I can go this way. 2029 01:27:20,250 --> 01:27:22,570 Or if it's greater than, I can go this way. 2030 01:27:22,570 --> 01:27:24,750 So I preserve that property of the phone book, 2031 01:27:24,750 --> 01:27:27,000 or just a sorted array in general. 2032 01:27:27,000 --> 01:27:28,320 What's true over here? 2033 01:27:28,320 --> 01:27:31,328 If I'm looking for 3, I can go to the right of the 2 2034 01:27:31,328 --> 01:27:33,120 because that number is going to be greater. 2035 01:27:33,120 --> 01:27:35,680 If I go left, it's going to be smaller instead. 
2036 01:27:35,680 --> 01:27:38,430 And here's actually an example of recursion. 2037 01:27:38,430 --> 01:27:42,090 Recursion in a physical sense much like Mario's pyramid. 2038 01:27:42,090 --> 01:27:44,250 Which was recursively defined. 2039 01:27:44,250 --> 01:27:45,300 Notice this. 2040 01:27:45,300 --> 01:27:47,250 I claim this whole thing is a tree. 2041 01:27:47,250 --> 01:27:50,790 Specifically, a binary search tree, which means every node 2042 01:27:50,790 --> 01:27:53,880 has 2, or maybe 1, or maybe 0 children. 2043 01:27:53,880 --> 01:27:55,110 But no more than 2. 2044 01:27:55,110 --> 01:27:56,730 Hence the bi in binary. 2045 01:27:56,730 --> 01:28:02,160 And it's the case that every left child is smaller than the root. 2046 01:28:02,160 --> 01:28:05,130 And every right child is larger than the root. 2047 01:28:05,130 --> 01:28:08,100 That definition certainly works for 2, 4, and 6. 2048 01:28:08,100 --> 01:28:12,930 But it also works recursively for every sub tree, or branch of this tree. 2049 01:28:12,930 --> 01:28:14,910 Notice, if you think of this as the root, 2050 01:28:14,910 --> 01:28:16,980 it is indeed bigger than this left child. 2051 01:28:16,980 --> 01:28:19,080 And it's smaller than this right child. 2052 01:28:19,080 --> 01:28:21,600 And if you look even at the leaves, so to speak. 2053 01:28:21,600 --> 01:28:23,010 The grandchildren here. 2054 01:28:23,010 --> 01:28:26,687 This root node is bigger than its left child, if it existed. 2055 01:28:26,687 --> 01:28:28,020 So it's a meaningless statement. 2056 01:28:28,020 --> 01:28:30,210 And it's less than its right child. 2057 01:28:30,210 --> 01:28:33,000 Or it's not greater than, certainly, so that's meaningless too. 2058 01:28:33,000 --> 01:28:36,760 So we haven't violated the definition even for these leaves, as well. 
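Because every subtree is itself a binary search tree, the search described above can be sketched recursively in C. This is an illustrative sketch rather than the lecture's own code: the function name `search` and the field name `number` are assumptions, and the tree is assumed to satisfy the smaller-on-the-left, larger-on-the-right property.

```c
#include <stdbool.h>
#include <stddef.h>

// A tree node: a number plus up to two children
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Hypothetical helper: look for number in the tree rooted at tree
bool search(node *tree, int number)
{
    // Base case: an empty tree (or a missing child) contains nothing
    if (tree == NULL)
    {
        return false;
    }
    // Smaller numbers can only be in the left subtree
    else if (number < tree->number)
    {
        return search(tree->left, number);
    }
    // Larger numbers can only be in the right subtree
    else if (number > tree->number)
    {
        return search(tree->right, number);
    }
    // Neither smaller nor larger, so this node holds the number
    else
    {
        return true;
    }
}
```

Each recursive call moves one level down, so the number of calls is bounded by the height of the tree.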
2059 01:28:36,760 --> 01:28:40,230 And so, now, how many steps does it take to find in the worst case 2060 01:28:40,230 --> 01:28:44,580 any number in a binary search tree, it would seem? 2061 01:28:44,580 --> 01:28:46,530 So it seems 2, literally. 2062 01:28:46,530 --> 01:28:48,400 And the height of this thing is actually 3. 2063 01:28:48,400 --> 01:28:51,150 And so long story short, especially, if you're a little less comfy 2064 01:28:51,150 --> 01:28:53,310 with your logarithms from yesteryear. 2065 01:28:53,310 --> 01:28:57,120 Log base 2 is the number of times you can divide something in half, and half, 2066 01:28:57,120 --> 01:28:58,860 and half, until you get down to 1. 2067 01:28:58,860 --> 01:29:01,828 This is like a logarithm in the reverse direction. 2068 01:29:01,828 --> 01:29:03,120 Here's a whole lot of elements. 2069 01:29:03,120 --> 01:29:05,490 And we're halving, we're halving until we get down to 1. 2070 01:29:05,490 --> 01:29:09,643 So the height of this tree, that is to say, is log base 2 of n. 2071 01:29:09,643 --> 01:29:12,810 Which means that even in the worst case, the number you're looking for, maybe 2072 01:29:12,810 --> 01:29:14,685 it's all the way at the bottom in the leaves. 2073 01:29:14,685 --> 01:29:15,330 Doesn't matter. 2074 01:29:15,330 --> 01:29:20,220 It's going to take log base 2 of n steps, or log of n steps, 2075 01:29:20,220 --> 01:29:23,830 to find, maximally, any one of those numbers. 2076 01:29:23,830 --> 01:29:28,620 So, again, binary search is back. 2077 01:29:28,620 --> 01:29:30,635 But we've paid a price, right. 2078 01:29:30,635 --> 01:29:32,010 This isn't a linked list anymore. 2079 01:29:32,010 --> 01:29:33,192 It's a tree. 2080 01:29:33,192 --> 01:29:36,150 But we've gained back binary search, which is pretty compelling, right. 2081 01:29:36,150 --> 01:29:38,775 That's where the whole class began, on making that distinction. 
2082 01:29:38,775 --> 01:29:44,020 But what price have we paid to retain binary search in this new world? 2083 01:29:44,020 --> 01:29:44,520 Yeah. 2084 01:29:47,070 --> 01:29:49,050 It's no longer sorted left to right, but this 2085 01:29:49,050 --> 01:29:52,020 is, I claim, sorted, according to the binary search tree definition. 2086 01:29:52,020 --> 01:29:56,010 Where, again, left child is smaller than root. 2087 01:29:56,010 --> 01:29:58,440 And right child is greater than root. 2088 01:29:58,440 --> 01:30:01,860 So it is sorted, but it's sorted in a 2-dimensional sense, if you will. 2089 01:30:01,860 --> 01:30:02,910 Not just 1. 2090 01:30:02,910 --> 01:30:05,260 But another price paid? 2091 01:30:05,260 --> 01:30:06,670 AUDIENCE: [INAUDIBLE] nodes now. 2092 01:30:06,670 --> 01:30:07,462 SPEAKER 1: Exactly. 2093 01:30:07,462 --> 01:30:11,830 Every node now needs not one number, but 2, 3 pieces of data. 2094 01:30:11,830 --> 01:30:13,630 A number and now 2 pointers. 2095 01:30:13,630 --> 01:30:15,385 So, again, there's that trade-off again. 2096 01:30:15,385 --> 01:30:17,260 Where, well, if you want to save time, you've 2097 01:30:17,260 --> 01:30:20,080 got to give something. If you start giving space, 2098 01:30:20,080 --> 01:30:22,547 and you start using more space, you can speed up time. 2099 01:30:22,547 --> 01:30:23,380 Like, you've got it. 2100 01:30:23,380 --> 01:30:24,640 There's always a price paid. 2101 01:30:24,640 --> 01:30:30,400 And it's very often in space, or time, or complexity, or developer time, 2102 01:30:30,400 --> 01:30:32,030 the number of bugs you have to solve. 2103 01:30:32,030 --> 01:30:34,060 I mean, all of these are finite resources 2104 01:30:34,060 --> 01:30:35,833 that you have to juggle among. 2105 01:30:35,833 --> 01:30:38,500 So if we consider now the code with which we can implement this, 2106 01:30:38,500 --> 01:30:40,120 here might be the node. 
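The node described here, a number plus 2 pointers, might be sketched in C as follows. This is a minimal sketch, not necessarily the exact code shown on the board: the field names `number`, `left`, and `right` follow the way the transcript talks about the node, and `make_node` is a hypothetical helper added only for illustration.

```c
#include <stdlib.h>

// One node of a binary search tree: a number plus two child pointers.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Hypothetical helper (not from the lecture): allocate a node with no children.
// Returns NULL if malloc fails.
node *make_node(int number)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    n->number = number;
    n->left = NULL;
    n->right = NULL;
    return n;
}
```

Compared with a linked-list node, the only change is the second pointer, which is the extra space being traded for faster search.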
2107 01:30:40,120 --> 01:30:43,070 And how might we actually use something like this? 2108 01:30:43,070 --> 01:30:45,520 Well, let's take a look at, maybe, one final program. 2109 01:30:45,520 --> 01:30:49,640 And see here, before we transition to higher level concepts, ultimately. 2110 01:30:49,640 --> 01:30:54,070 Let me go ahead here and let me just open a program I wrote here in advance. 2111 01:30:54,070 --> 01:30:58,210 So let me, in a moment, copy over file called tree.c. 2112 01:30:58,210 --> 01:31:01,068 Which we'll have on the course's websites. 2113 01:31:01,068 --> 01:31:02,860 And I'll walk you through some of the logic 2114 01:31:02,860 --> 01:31:07,790 here that I've written for tree.c. 2115 01:31:07,790 --> 01:31:08,290 All right. 2116 01:31:08,290 --> 01:31:09,800 So what do we have here first? 2117 01:31:09,800 --> 01:31:14,440 So here is an implementation of a binary search tree for numbers. 2118 01:31:14,440 --> 01:31:18,860 And as before, I've played around and I've inserted the numbers manually. 2119 01:31:18,860 --> 01:31:20,290 So what's going on first? 2120 01:31:20,290 --> 01:31:24,130 Here is my definition of a node for a binary search tree, copied and pasted 2121 01:31:24,130 --> 01:31:27,010 from what I proposed on the board a moment ago. 2122 01:31:27,010 --> 01:31:29,710 Here are 2 prototypes for 2 functions, that I'll 2123 01:31:29,710 --> 01:31:31,780 show you in a moment, that allow me to free 2124 01:31:31,780 --> 01:31:35,170 an entire tree, one node at a time. 2125 01:31:35,170 --> 01:31:37,900 And then, also allow me to print the tree in order. 2126 01:31:37,900 --> 01:31:40,300 So even though they're not sorted left to right, 2127 01:31:40,300 --> 01:31:43,450 I bet if I'm clever about what child I print first, 2128 01:31:43,450 --> 01:31:46,670 I can reconstruct the idea of printing this tree properly. 2129 01:31:46,670 --> 01:31:49,150 So how might I implement a binary search tree? 
2130 01:31:49,150 --> 01:31:50,440 Here's my main function. 2131 01:31:50,440 --> 01:31:53,020 Here is how I might represent a tree of size 0. 2132 01:31:53,020 --> 01:31:55,960 It's just a null pointer called tree. 2133 01:31:55,960 --> 01:31:58,060 Here's how I might add a number to that list. 2134 01:31:58,060 --> 01:32:02,080 So here, for instance, is me mallocing space for a node. 2135 01:32:02,080 --> 01:32:04,210 Storing it in a temporary variable called n. 2136 01:32:04,210 --> 01:32:06,070 Here is me just doing a safety check. 2137 01:32:06,070 --> 01:32:07,780 Make sure n does not equal null. 2138 01:32:07,780 --> 01:32:12,130 And then, here is me initializing this node to contain the number 2, first. 2139 01:32:12,130 --> 01:32:14,860 Then, initializing the left child of that node to be null. 2140 01:32:14,860 --> 01:32:17,510 And the right child of that node to be null. 2141 01:32:17,510 --> 01:32:22,670 And then, initializing the tree itself to be equal to that particular node. 2142 01:32:22,670 --> 01:32:25,840 So at this point in the story, there's just one rectangle on the screen 2143 01:32:25,840 --> 01:32:28,740 containing the number 2 with no children. 2144 01:32:28,740 --> 01:32:29,240 All right. 2145 01:32:29,240 --> 01:32:31,630 Let's just add manually to this a little further. 2146 01:32:31,630 --> 01:32:34,780 Let's add another number to the list, by mallocing another node. 2147 01:32:34,780 --> 01:32:38,140 I don't need to declare n as a node* because it already exists at this 2148 01:32:38,140 --> 01:32:38,780 point. 2149 01:32:38,780 --> 01:32:40,720 Here's a little safety check. 2150 01:32:40,720 --> 01:32:45,280 I'm going to not bother with my, let me do this, free memory here. 2151 01:32:45,280 --> 01:32:47,240 Just to be safe. 2152 01:32:47,240 --> 01:32:49,803 Do I want to do this? 
2153 01:32:49,803 --> 01:32:51,970 We want to free memory too, which I've not done here, 2154 01:32:51,970 --> 01:32:53,650 but I'll save that for another time. 2155 01:32:53,650 --> 01:32:55,990 Here, I'm going to initialize the number to 1. 2156 01:32:55,990 --> 01:33:00,100 I'm going to initialize the children of this node to null and null. 2157 01:33:00,100 --> 01:33:01,810 And now, I'm going to do this. 2158 01:33:01,810 --> 01:33:06,280 Initialize the tree's left child to be n. 2159 01:33:06,280 --> 01:33:09,222 So what that's essentially doing here is if this 2160 01:33:09,222 --> 01:33:12,430 is my root node, the single rectangle I described a moment ago that currently 2161 01:33:12,430 --> 01:33:14,530 has no children, neither left nor right. 2162 01:33:14,530 --> 01:33:16,480 Here's my new node with the number 1. 2163 01:33:16,480 --> 01:33:18,620 I want it to become the new left child. 2164 01:33:18,620 --> 01:33:22,150 So that line of code on the screen there, tree left equals n, 2165 01:33:22,150 --> 01:33:26,720 is like stitching these 2 together with a pointer from 2 to the 1. 2166 01:33:26,720 --> 01:33:27,220 All right. 2167 01:33:27,220 --> 01:33:30,100 The next lines of code, you can probably guess, 2168 01:33:30,100 --> 01:33:32,560 are me adding another number to the list. 2169 01:33:32,560 --> 01:33:33,730 Just the number 3. 2170 01:33:33,730 --> 01:33:39,200 So this is a simpler tree with 2, 1, and 3, respectively. 2171 01:33:39,200 --> 01:33:41,710 And this code, let me wave my hands, is almost the same. 2172 01:33:41,710 --> 01:33:45,010 Except for the fact that I'm updating the tree's right child 2173 01:33:45,010 --> 01:33:46,990 to be this new and third node. 2174 01:33:46,990 --> 01:33:50,380 Let's now run the code before looking at those 2 functions. 2175 01:33:50,380 --> 01:33:54,280 Let me do make tree, ./tree. 2176 01:33:54,280 --> 01:33:55,510 And voilà, 1, 2, 3. 
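The manual steps just described (malloc a node, check it, set its number and children, then stitch it in) might look roughly like this. It is a sketch under assumptions: the actual tree.c does this inside main, whereas here it is wrapped in a hypothetical function `build_tree` so it can be reused, and the cleanup on a failed malloc is one reasonable choice rather than necessarily what the lecture's file does.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Hypothetical wrapper: builds the tree with root 2, left child 1, right child 3.
// Returns NULL if any allocation fails, freeing whatever was already allocated.
node *build_tree(void)
{
    // A tree of size 0 is just a null pointer
    node *tree = NULL;

    // Add the number 2 as the root
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    n->number = 2;
    n->left = NULL;
    n->right = NULL;
    tree = n;

    // Add the number 1 as the root's left child
    n = malloc(sizeof(node));
    if (n == NULL)
    {
        free(tree);
        return NULL;
    }
    n->number = 1;
    n->left = NULL;
    n->right = NULL;
    tree->left = n;

    // Add the number 3 as the root's right child
    n = malloc(sizeof(node));
    if (n == NULL)
    {
        free(tree->left);
        free(tree);
        return NULL;
    }
    n->number = 3;
    n->left = NULL;
    n->right = NULL;
    tree->right = n;

    return tree;
}
```

The line `tree->left = n;` is the "tree left equals n" step: it is the pointer from the 2 to the 1.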
2177 01:33:55,510 --> 01:33:58,930 So it sounds like the data structure is sorted, to your concern earlier. 2178 01:33:58,930 --> 01:34:00,700 But how did I actually print this? 2179 01:34:00,700 --> 01:34:02,590 And then, eventually, free the whole thing? 2180 01:34:02,590 --> 01:34:05,980 Well let's look at the definition of first print tree. 2181 01:34:05,980 --> 01:34:08,950 And this is where things get interesting. 2182 01:34:08,950 --> 01:34:12,790 Print tree returns nothing so it's a void function. 2183 01:34:12,790 --> 01:34:18,520 But it takes a pointer to a root element as its sole argument, node* root. 2184 01:34:18,520 --> 01:34:19,690 Here's my safety check. 2185 01:34:19,690 --> 01:34:21,790 If root equals equals null, there's obviously 2186 01:34:21,790 --> 01:34:23,110 nothing to print, just return. 2187 01:34:23,110 --> 01:34:24,970 That goes without saying. 2188 01:34:24,970 --> 01:34:27,010 But here's where things get a little magical. 2189 01:34:27,010 --> 01:34:30,280 Otherwise, print your left child. 2190 01:34:30,280 --> 01:34:33,010 Then print your own number. 2191 01:34:33,010 --> 01:34:36,430 Then, print your right child. 2192 01:34:36,430 --> 01:34:41,700 What is this an example of, even though it's not mentioned by name here? 2193 01:34:41,700 --> 01:34:43,320 What programming technique here? 2194 01:34:43,320 --> 01:34:44,250 AUDIENCE: Recursion. 2195 01:34:44,250 --> 01:34:44,917 SPEAKER 1: Yeah. 2196 01:34:44,917 --> 01:34:48,372 So this is actually perhaps the most compelling use of recursion, yet. 2197 01:34:48,372 --> 01:34:50,580 It wasn't really that compelling with the Mario thing 2198 01:34:50,580 --> 01:34:52,710 because we had such an easy implementation with a for loop 2199 01:34:52,710 --> 01:34:53,550 weeks ago. 2200 01:34:53,550 --> 01:34:58,170 But here is a perfect application of recursion, where your data structure 2201 01:34:58,170 --> 01:34:59,910 itself is recursive, right. 
2202 01:34:59,910 --> 01:35:02,220 If you take any snip of any branch, it all 2203 01:35:02,220 --> 01:35:04,590 still looks like a tree, just a smaller one. 2204 01:35:04,590 --> 01:35:06,430 That lends itself to recursion. 2205 01:35:06,430 --> 01:35:11,010 So here is this leap of faith where I say, print my left tree, or my left sub 2206 01:35:11,010 --> 01:35:13,830 tree, if you will, via my child at the left. 2207 01:35:13,830 --> 01:35:17,130 Then, I'll print my own root node here in the middle. 2208 01:35:17,130 --> 01:35:19,740 Then, go ahead and print my right sub tree. 2209 01:35:19,740 --> 01:35:24,180 And because we have this base case that makes sure that if the root is null, 2210 01:35:24,180 --> 01:35:26,967 there's nothing to do, you're not going to recurse infinitely. 2211 01:35:26,967 --> 01:35:29,550 You're not going to call yourself again, and again, and again, 2212 01:35:29,550 --> 01:35:31,210 infinitely, many times. 2213 01:35:31,210 --> 01:35:35,400 So it works out and prints the 1, the 2, and the 3. 2214 01:35:35,400 --> 01:35:36,840 And notice what we could do, too. 2215 01:35:36,840 --> 01:35:40,260 If you wanted to print the tree in reverse order, you could do that. 2216 01:35:40,260 --> 01:35:43,050 Print your right tree first, the greater element. 2217 01:35:43,050 --> 01:35:43,950 Then, yourself. 2218 01:35:43,950 --> 01:35:45,330 Then, your smaller sub tree. 2219 01:35:45,330 --> 01:35:47,970 And if I do make tree here and ./tree, well now, 2220 01:35:47,970 --> 01:35:50,100 I've reversed the order of the list. 2221 01:35:50,100 --> 01:35:51,190 And that's pretty cool. 2222 01:35:51,190 --> 01:35:52,940 You can do it with a for-loop in an array. 2223 01:35:52,940 --> 01:35:56,370 But you can also do it, even with this 2-dimensional structure. 2224 01:35:56,370 --> 01:36:00,180 Let's lastly look at this free tree function. 2225 01:36:00,180 --> 01:36:02,160 And this one's almost the same. 
2226 01:36:02,160 --> 01:36:05,400 Order doesn't matter in quite the same way, but it does still matter. 2227 01:36:05,400 --> 01:36:07,020 Here's what I did with free tree. 2228 01:36:07,020 --> 01:36:09,978 Well, if the root of the tree is null, there's obviously nothing to do. 2229 01:36:09,978 --> 01:36:10,560 Just return. 2230 01:36:10,560 --> 01:36:15,100 Otherwise, go ahead and free your left child and all of its descendants. 2231 01:36:15,100 --> 01:36:18,090 Then free your right child and all of its descendants. 2232 01:36:18,090 --> 01:36:19,900 And then, free yourself. 2233 01:36:19,900 --> 01:36:25,690 And again, free literally just frees the address in that variable. 2234 01:36:25,690 --> 01:36:27,570 It doesn't free the whole darn thing. 2235 01:36:27,570 --> 01:36:29,850 It just frees literally what's at that address. 2236 01:36:29,850 --> 01:36:33,900 Why was it important that I did line 72 last, though? 2237 01:36:33,900 --> 01:36:36,450 Why did I free the left child and the right child 2238 01:36:36,450 --> 01:36:39,973 before I freed myself, so to speak? 2239 01:36:39,973 --> 01:36:40,890 AUDIENCE: [INAUDIBLE]. 2240 01:36:40,890 --> 01:36:41,682 SPEAKER 1: Exactly. 2241 01:36:41,682 --> 01:36:46,140 If you free yourself first, if I had done incorrectly this line higher up, 2242 01:36:46,140 --> 01:36:50,820 you're not allowed to touch the left child tree or the right child tree. 2243 01:36:50,820 --> 01:36:53,350 Because the memory address is no longer valid at that point. 2244 01:36:53,350 --> 01:36:55,290 You would get some memory error, perhaps. 2245 01:36:55,290 --> 01:36:56,310 The program would crash. 2246 01:36:56,310 --> 01:36:57,990 Valgrind definitely wouldn't like it. 2247 01:36:57,990 --> 01:37:00,060 Bad things would otherwise happen. 2248 01:37:00,060 --> 01:37:01,890 But here, then, is an example of recursion. 2249 01:37:01,890 --> 01:37:06,360 And again, just a recursive use of an actual data structure. 
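The two recursive functions walked through above might be sketched like this. The names `print_tree` and `free_tree` follow the transcript's "print tree" and "free tree"; the exact formatting of the printed output is an assumption.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Prints the tree in sorted order: left subtree, then this node, then right subtree.
void print_tree(node *root)
{
    // Base case: nothing to print
    if (root == NULL)
    {
        return;
    }
    print_tree(root->left);
    printf("%i\n", root->number);
    print_tree(root->right);
}

// Frees every node in the tree, children before parent.
void free_tree(node *root)
{
    // Base case: nothing to free
    if (root == NULL)
    {
        return;
    }
    free_tree(root->left);
    free_tree(root->right);
    // Only now is it safe to free this node, since its pointers are no longer needed
    free(root);
}
```

Swapping the two recursive calls in `print_tree` prints the numbers in reverse order. In `free_tree`, moving `free(root)` above the recursive calls would read `root->left` and `root->right` from memory that has already been freed, which is the bug the question above is getting at.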
2250 01:37:06,360 --> 01:37:09,120 And what's even cooler here is, relatively speaking, 2251 01:37:09,120 --> 01:37:11,640 suppose we wanted to search something like this. 2252 01:37:11,640 --> 01:37:15,720 Binary search actually gets pretty straightforward to implement, too. 2253 01:37:15,720 --> 01:37:16,410 For instance, 2254 01:37:16,410 --> 01:37:20,940 here might be the prototype for a search function for a binary search tree. 2255 01:37:20,940 --> 01:37:25,920 You give me the root of a tree, and you give me a number I'm looking for, 2256 01:37:25,920 --> 01:37:29,880 and I can pretty easily now return true if it's in there or false if it's not. 2257 01:37:29,880 --> 01:37:30,450 How? 2258 01:37:30,450 --> 01:37:32,430 Well, let's first ask a question. 2259 01:37:32,430 --> 01:37:35,395 If tree equals equals null, then you just return false. 2260 01:37:35,395 --> 01:37:38,520 Because if there's no tree, there's no number, so it's obviously not there. 2261 01:37:38,520 --> 01:37:39,860 Return false. 2262 01:37:39,860 --> 01:37:46,560 Else if, the number you're looking for is less than the tree's own number, 2263 01:37:46,560 --> 01:37:48,570 which direction should we go? 2264 01:37:48,570 --> 01:37:49,247 AUDIENCE: Left. 2265 01:37:49,247 --> 01:37:50,080 SPEAKER 1: OK, left. 2266 01:37:50,080 --> 01:37:51,190 How do we express that? 2267 01:37:51,190 --> 01:37:54,300 Well, let's just return the answer to this question. 2268 01:37:54,300 --> 01:37:58,440 Search the left subtree, by way of my left child, 2269 01:37:58,440 --> 01:37:59,970 looking for the same number. 2270 01:37:59,970 --> 01:38:02,250 And you just assume through the beauty of recursion 2271 01:38:02,250 --> 01:38:05,400 that you're kicking the can and let yourself figure it out 2272 01:38:05,400 --> 01:38:06,600 with a smaller problem. 2273 01:38:06,600 --> 01:38:09,060 Just that snipped left tree instead. 
2274 01:38:09,060 --> 01:38:13,320 Else if, the number you're looking for is greater than the tree's own number, 2275 01:38:13,320 --> 01:38:15,160 go to the right, as you might infer. 2276 01:38:15,160 --> 01:38:18,060 So I can just return the answer to this question. 2277 01:38:18,060 --> 01:38:21,150 Search my right subtree for that same number. 2278 01:38:21,150 --> 01:38:23,020 And there's a fourth and final condition. 2279 01:38:23,020 --> 01:38:26,250 What's the fourth scenario we have to consider, explicitly? 2280 01:38:26,250 --> 01:38:26,760 Yeah. 2281 01:38:26,760 --> 01:38:27,780 AUDIENCE: The number. 2282 01:38:27,780 --> 01:38:29,822 SPEAKER 1: If the number, itself, is right there. 2283 01:38:29,822 --> 01:38:33,480 So else if, the number I'm looking for equals the tree's own number, 2284 01:38:33,480 --> 01:38:36,250 then and only then, should you return true. 2285 01:38:36,250 --> 01:38:38,490 And if you're thinking quickly here, there's 2286 01:38:38,490 --> 01:38:42,150 an optimization possible, better design opportunity. 2287 01:38:42,150 --> 01:38:43,650 Think back to even our Scratch days. 2288 01:38:43,650 --> 01:38:45,770 What could we do a little better here? 2289 01:38:45,770 --> 01:38:46,710 You're pointing at it. 2290 01:38:46,710 --> 01:38:47,508 AUDIENCE: Else. 2291 01:38:47,508 --> 01:38:48,300 SPEAKER 1: Exactly. 2292 01:38:48,300 --> 01:38:49,140 An else suffices. 2293 01:38:49,140 --> 01:38:51,682 Because if there's logically only 4 things that could happen, 2294 01:38:51,682 --> 01:38:54,540 you're wasting your time by asking a fourth gratuitous question. 2295 01:38:54,540 --> 01:38:55,860 An else here suffices. 2296 01:38:55,860 --> 01:38:59,500 So here too, more so than the Mario example a few weeks ago, 2297 01:38:59,500 --> 01:39:02,100 there's just this elegance arguably to recursion. 2298 01:39:02,100 --> 01:39:02,850 And that's it. 2299 01:39:02,850 --> 01:39:03,960 This is not pseudocode. 
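The four cases just walked through might be written out as follows. This is a sketch reconstructed from the spoken description, with the final case expressed as a plain else, as suggested from the audience; the parameter names `tree` and `number` are taken from how the transcript refers to them.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Returns true if number is somewhere in the binary search tree, else false.
bool search(node *tree, int number)
{
    // No tree means no number
    if (tree == NULL)
    {
        return false;
    }
    // Smaller numbers can only be in the left subtree
    else if (number < tree->number)
    {
        return search(tree->left, number);
    }
    // Larger numbers can only be in the right subtree
    else if (number > tree->number)
    {
        return search(tree->right, number);
    }
    // Neither smaller nor larger, so this node holds the number
    else
    {
        return true;
    }
}
```

Each recursive call discards one whole subtree, which is where the binary-search behavior comes from when the tree is balanced.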
2300 01:39:03,960 --> 01:39:07,950 This is the code for binary search on a binary search tree. 2301 01:39:07,950 --> 01:39:10,020 And so, recursion tends to work in lockstep 2302 01:39:10,020 --> 01:39:14,700 with these kinds of data structures that have this structure to them 2303 01:39:14,700 --> 01:39:16,180 as we're seeing here. 2304 01:39:16,180 --> 01:39:16,680 All right. 2305 01:39:16,680 --> 01:39:22,360 Any questions, then, on binary search as implemented here with a tree? 2306 01:39:22,360 --> 01:39:23,227 Yeah. 2307 01:39:23,227 --> 01:39:25,175 AUDIENCE: About like third years. 2308 01:39:25,175 --> 01:39:26,149 [INAUDIBLE] 2309 01:39:29,688 --> 01:39:30,730 SPEAKER 1: Good question. 2310 01:39:30,730 --> 01:39:36,690 So when returning a Boolean value, true and false are values that are defined 2311 01:39:36,690 --> 01:39:40,350 in a library called Standard Bool, S-T-D-B-O-O-L dot H. 2312 01:39:40,350 --> 01:39:42,480 With a header file that you can use. 2313 01:39:42,480 --> 01:39:49,258 It is the case that true is, it's not well defined what they are. 2314 01:39:49,258 --> 01:39:50,550 But they would map indeed, yes. 2315 01:39:50,550 --> 01:39:51,960 To 0 and 1, essentially. 2316 01:39:51,960 --> 01:39:54,390 But you should not compare them explicitly to 0 and 1. 2317 01:39:54,390 --> 01:39:57,390 When you're using true and false, you should compare them to each other. 2318 01:39:57,390 --> 01:40:01,375 AUDIENCE: I meant if it's in a code return. 2319 01:40:01,375 --> 01:40:02,250 SPEAKER 1: Oh, sorry. 2320 01:40:02,250 --> 01:40:05,850 So if I am in my own code from earlier, a void function, 2321 01:40:05,850 --> 01:40:08,280 it is totally fine to return. 2322 01:40:08,280 --> 01:40:10,950 You just can't return something explicitly. 2323 01:40:10,950 --> 01:40:12,720 So return just means that's it. 2324 01:40:12,720 --> 01:40:14,280 Quit out of this function. 2325 01:40:14,280 --> 01:40:16,150 You're not actually handing back a value. 
2326 01:40:16,150 --> 01:40:19,770 So it's a way of short circuiting the execution. 2327 01:40:19,770 --> 01:40:22,050 If you don't like that, and some people do frown 2328 01:40:22,050 --> 01:40:26,760 upon having code return from functions prematurely, you could invert the logic 2329 01:40:26,760 --> 01:40:28,050 and do something like this. 2330 01:40:28,050 --> 01:40:31,740 If the root does not equal null, do all of these things. 2331 01:40:31,740 --> 01:40:34,020 And then, indent all three of these lines underneath. 2332 01:40:34,020 --> 01:40:35,490 That's perfectly fine too. 2333 01:40:35,490 --> 01:40:37,290 I happen to write it the other way just so 2334 01:40:37,290 --> 01:40:40,990 that there was explicitly a base case that I could point to on the screen. 2335 01:40:40,990 --> 01:40:43,920 Whereas, now, it's implicitly there for us only. 2336 01:40:43,920 --> 01:40:45,790 But a good observation too. 2337 01:40:45,790 --> 01:40:46,290 All right. 2338 01:40:46,290 --> 01:40:49,960 So let's ask the question as before about running time of this. 2339 01:40:49,960 --> 01:40:51,930 It would look like binary search is back. 2340 01:40:51,930 --> 01:40:57,600 And we can now do things in logarithmic time, but we should be careful. 2341 01:40:57,600 --> 01:40:59,940 Is this a binary search tree? 2342 01:40:59,940 --> 01:41:01,660 Just to be clear. 2343 01:41:01,660 --> 01:41:04,380 And again, a binary search tree is a tree 2344 01:41:04,380 --> 01:41:11,118 where the root is greater than its left child and smaller than its right child. 2345 01:41:11,118 --> 01:41:11,910 That's the essence. 2346 01:41:11,910 --> 01:41:13,380 So you're nodding your head. 2347 01:41:13,380 --> 01:41:15,280 You agree? 2348 01:41:15,280 --> 01:41:16,020 I agree. 2349 01:41:16,020 --> 01:41:18,030 So this is a binary search tree. 2350 01:41:18,030 --> 01:41:20,390 Is this a binary search tree? 2351 01:41:20,390 --> 01:41:21,330 [INTERPOSING VOICES] 2352 01:41:21,330 --> 01:41:21,830 OK. 
2353 01:41:21,830 --> 01:41:22,860 I'm hearing yeses. 2354 01:41:22,860 --> 01:41:25,710 Or I'm hearing just my delay changing the vote it would seem. 2355 01:41:25,710 --> 01:41:28,080 So this is one of those trick questions. 2356 01:41:28,080 --> 01:41:30,480 This is a binary search tree because I've not 2357 01:41:30,480 --> 01:41:33,390 violated the definition of what I gave you, right. 2358 01:41:33,390 --> 01:41:39,480 Is there any example of a left child that is greater than its parent? 2359 01:41:39,480 --> 01:41:42,480 Or is there any example of a right child that's smaller than its parent? 2360 01:41:42,480 --> 01:41:44,897 That's just the opposite way of describing the same thing. 2361 01:41:44,897 --> 01:41:47,070 No, this is a binary search tree. 2362 01:41:47,070 --> 01:41:50,210 Unfortunately, it also looks like, albeit at a different axis, what? 2363 01:41:50,210 --> 01:41:51,210 AUDIENCE: A linked list. 2364 01:41:51,210 --> 01:41:51,900 SPEAKER 1: A linked list. 2365 01:41:51,900 --> 01:41:53,970 But you could imagine this happening, right. 2366 01:41:53,970 --> 01:41:56,640 Suppose that I hadn't been as thoughtful as I was earlier 2367 01:41:56,640 --> 01:41:59,970 by inserting 2, and then 1, and then 3. 2368 01:41:59,970 --> 01:42:02,160 Which nicely balanced everything out. 2369 01:42:02,160 --> 01:42:04,860 Suppose that instead, because of what the user is typing in 2370 01:42:04,860 --> 01:42:07,980 or whatever you contrive in your own code, suppose you insert a 1, 2371 01:42:07,980 --> 01:42:10,260 and then a 2, and then a 3. 2372 01:42:10,260 --> 01:42:12,850 Like, you've created a problem for yourself. 2373 01:42:12,850 --> 01:42:16,290 Because if we follow the same logic as before, going left or going right, 2374 01:42:16,290 --> 01:42:21,030 this is how you might implement a binary search tree accidentally 2375 01:42:21,030 --> 01:42:24,750 if you just blindly keep following that definition. 
2376 01:42:24,750 --> 01:42:27,030 I mean, this would be better designed as what? 2377 01:42:27,030 --> 01:42:29,490 If we rotated the whole thing around. 2378 01:42:29,490 --> 01:42:30,870 And that's totally fine. 2379 01:42:30,870 --> 01:42:33,060 And those kinds of trees actually have names. 2380 01:42:33,060 --> 01:42:35,400 There's trees called AVL trees in computer science. 2381 01:42:35,400 --> 01:42:37,050 There are red-black trees in computer science. 2382 01:42:37,050 --> 01:42:39,300 There are other types of trees that, additionally, 2383 01:42:39,300 --> 01:42:42,510 add some logic that tell you when you got to pivot the thing, 2384 01:42:42,510 --> 01:42:46,238 and rotate it, and snip off the root, and fix things in this way. 2385 01:42:46,238 --> 01:42:48,030 But a binary search tree, in and of itself, 2386 01:42:48,030 --> 01:42:51,670 does not guarantee that it will be balanced, so to speak. 2387 01:42:51,670 --> 01:42:54,240 And so, if you consider the worst case scenario 2388 01:42:54,240 --> 01:42:55,860 of even using a binary search tree. 2389 01:42:55,860 --> 01:42:57,960 If you're not smart about the code you're writing 2390 01:42:57,960 --> 01:43:00,180 and you just blindly follow this definition, 2391 01:43:00,180 --> 01:43:04,290 you might accidentally create a crazy, long and stringy binary search 2392 01:43:04,290 --> 01:43:07,050 tree that essentially looks like a linked list. 2393 01:43:07,050 --> 01:43:09,510 Because you're not even using any of the left children. 2394 01:43:09,510 --> 01:43:12,750 So unfortunately, the literal answer to the question 2395 01:43:12,750 --> 01:43:15,480 here is what's the running time of search? 2396 01:43:15,480 --> 01:43:17,400 Well, hopefully, log n. 2397 01:43:17,400 --> 01:43:19,980 But not if you don't maintain the balance of the tree. 
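To see the stringy case concretely, here is a sketch of an insert that just blindly follows the definition, with no rebalancing. Both `insert` and `height` are hypothetical helpers written for illustration, not functions from tree.c; ignoring duplicates and leaving the tree unchanged on a failed malloc are assumptions too.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Hypothetical helper: inserts number by walking left or right, never rebalancing.
// Returns the (possibly new) root. Duplicates are ignored. If malloc fails,
// the tree is returned unchanged.
node *insert(node *tree, int number)
{
    if (tree == NULL)
    {
        node *n = malloc(sizeof(node));
        if (n == NULL)
        {
            return NULL;
        }
        n->number = number;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (number < tree->number)
    {
        tree->left = insert(tree->left, number);
    }
    else if (number > tree->number)
    {
        tree->right = insert(tree->right, number);
    }
    return tree;
}

// Hypothetical helper: number of nodes on the longest path from root to leaf.
int height(node *tree)
{
    if (tree == NULL)
    {
        return 0;
    }
    int l = height(tree->left);
    int r = height(tree->right);
    return 1 + (l > r ? l : r);
}
```

Inserting 2, then 1, then 3 gives a tree of height 2, while inserting 1, then 2, then 3 gives height 3 with every left pointer null: the same three numbers, but the second tree is a linked list in all but name.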
2398 01:43:19,980 --> 01:43:25,290 Both insert and search could actually devolve into instead of big O of log n, 2399 01:43:25,290 --> 01:43:26,952 literally, big O of n. 2400 01:43:26,952 --> 01:43:29,160 If you don't somehow take into account, and we're not 2401 01:43:29,160 --> 01:43:30,720 going to do the code for that here. 2402 01:43:30,720 --> 01:43:34,140 It's a higher level thing you might explore down the road. 2403 01:43:34,140 --> 01:43:37,930 It can devolve into something that you might not have intended. 2404 01:43:37,930 --> 01:43:40,022 And so, now that we're talking about 2 dimensions, 2405 01:43:40,022 --> 01:43:41,730 it's really the onus is on the programmer 2406 01:43:41,730 --> 01:43:44,490 to consider what kinds of perverse situations might happen. 2407 01:43:44,490 --> 01:43:46,860 Where the thing devolves into a structure 2408 01:43:46,860 --> 01:43:50,350 that you don't actually want it to devolve into. 2409 01:43:50,350 --> 01:43:50,850 All right. 2410 01:43:50,850 --> 01:43:52,360 We've got just a few structures to go. 2411 01:43:52,360 --> 01:43:53,940 Let's go ahead and take one more 5-minute break here. 2412 01:43:53,940 --> 01:43:55,410 When we come back, we'll talk at this level 2413 01:43:55,410 --> 01:43:57,030 about some final applications of this. 2414 01:43:57,030 --> 01:43:58,510 See you in 5. 2415 01:43:58,510 --> 01:44:00,270 All right. 2416 01:44:00,270 --> 01:44:01,860 So we are back. 2417 01:44:01,860 --> 01:44:05,250 And as promised, we'll operate now at this higher level. 2418 01:44:05,250 --> 01:44:08,520 Where if we take for granted that, even though you haven't had an opportunity 2419 01:44:08,520 --> 01:44:11,312 to play with these techniques yet, you have the ability now in code 2420 01:44:11,312 --> 01:44:12,780 to stitch things together. 2421 01:44:12,780 --> 01:44:15,630 Both in one dimension and even 2 dimensions, 2422 01:44:15,630 --> 01:44:17,970 to build things like lists and trees. 
2423 01:44:17,970 --> 01:44:19,980 So if we have these building blocks. 2424 01:44:19,980 --> 01:44:22,680 Things like now arrays, and lists, and trees, 2425 01:44:22,680 --> 01:44:26,790 what if we start to amalgamate them such that we build things out 2426 01:44:26,790 --> 01:44:28,900 of multiple data structures? 2427 01:44:28,900 --> 01:44:32,360 Can we start to get some of the best of both worlds by way of, for instance, 2428 01:44:32,360 --> 01:44:33,710 something called a hash table? 2429 01:44:33,710 --> 01:44:37,540 So a hash table is a Swiss army knife of data structures 2430 01:44:37,540 --> 01:44:39,310 in that it's so commonly used. 2431 01:44:39,310 --> 01:44:44,000 Because it allows you to associate keys with values, so to speak. 2432 01:44:44,000 --> 01:44:49,060 So, for instance, it allows you to associate a username with a password. 2433 01:44:49,060 --> 01:44:51,070 Or a name with a number. 2434 01:44:51,070 --> 01:44:53,920 Or anything where you have to take something as input, 2435 01:44:53,920 --> 01:44:56,300 and get as output a corresponding piece of information. 2436 01:44:56,300 --> 01:44:59,210 A hash table is often a data structure of choice. 2437 01:44:59,210 --> 01:45:00,460 And here's what it looks like. 2438 01:45:00,460 --> 01:45:02,800 It actually looks like an array, at first glance. 2439 01:45:02,800 --> 01:45:05,990 But for discussion's sake, I've drawn this array vertically, 2440 01:45:05,990 --> 01:45:06,920 which is totally fine. 2441 01:45:06,920 --> 01:45:08,660 It's still just an array. 2442 01:45:08,660 --> 01:45:13,720 But it allows you, a hash table, to jump to any of these locations randomly. 2443 01:45:13,720 --> 01:45:14,740 That is instantly. 2444 01:45:14,740 --> 01:45:18,130 So, for instance, there's actually 26 locations in this array. 2445 01:45:18,130 --> 01:45:21,100 Because I want to, for instance, store initially 2446 01:45:21,100 --> 01:45:23,980 names of people, for instance. 
2447 01:45:23,980 --> 01:45:26,653 And wouldn't it be nice if the person's name starts with A, 2448 01:45:26,653 --> 01:45:27,820 I have a go to place for it. 2449 01:45:27,820 --> 01:45:28,780 Maybe the first box. 2450 01:45:28,780 --> 01:45:30,863 And if it starts with Z, I put them at the bottom. 2451 01:45:30,863 --> 01:45:33,070 So that I can jump instantly, arithmetically, 2452 01:45:33,070 --> 01:45:35,470 using a little bit of ASCII or Unicode fanciness, 2453 01:45:35,470 --> 01:45:38,540 exactly to the location that they need to go. 2454 01:45:38,540 --> 01:45:40,690 So, for instance, here's our array, 0-indexed. 2455 01:45:40,690 --> 01:45:42,130 0 through 25. 2456 01:45:42,130 --> 01:45:44,500 If I think of this, though, as A through Z, 2457 01:45:44,500 --> 01:45:46,370 I'm going to think of these 26 locations, 2458 01:45:46,370 --> 01:45:49,630 now in the context of a hash table, as what we'll generally call buckets. 2459 01:45:49,630 --> 01:45:52,010 So buckets into which you can put values. 2460 01:45:52,010 --> 01:45:56,380 So, for instance, suppose that we want to insert a value, one name 2461 01:45:56,380 --> 01:45:57,590 into this data structure. 2462 01:45:57,590 --> 01:45:59,260 And that name is say, Albus. 2463 01:45:59,260 --> 01:46:03,980 So Albus starting with A. Albus might go at the very beginning of this list. 2464 01:46:03,980 --> 01:46:04,480 All right. 2465 01:46:04,480 --> 01:46:06,188 And then, we want to insert another name. 2466 01:46:06,188 --> 01:46:07,630 This one happens to be Zacharias. 2467 01:46:07,630 --> 01:46:10,690 Starting with Z, so it goes all the way at the end of this data 2468 01:46:10,690 --> 01:46:12,490 structure in location 25 a.k.a. 2469 01:46:12,490 --> 01:46:13,390 Z. 2470 01:46:13,390 --> 01:46:17,260 And then, maybe a third name like Hermione, and that goes at location H 2471 01:46:17,260 --> 01:46:19,310 according to that position in the alphabet. 
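The "little bit of ASCII fanciness" that turns a first letter into a bucket from 0 through 25 might be sketched like this. The function name `hash` and the decision to send names that do not start with a letter to bucket 0 are assumptions made for illustration; the lecture has not shown this code at this point.

```c
#include <ctype.h>

// Number of buckets: one per letter, A through Z
#define BUCKETS 26

// Maps a name to a bucket index from 0 to 25 based on its first letter,
// ignoring case. Assumption: names that do not start with a letter
// (including the empty string) go to bucket 0.
unsigned int hash(const char *name)
{
    unsigned char first = (unsigned char) name[0];
    if (!isalpha(first))
    {
        return 0;
    }
    // 'A' - 'A' is 0, 'B' - 'A' is 1, and so on up to 'Z' - 'A', which is 25
    return (unsigned int) (toupper(first) - 'A');
}
```

With this, Albus lands in bucket 0, Hermione in bucket 7, and Zacharias in bucket 25, each computed in constant time without looking at any other name.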
2472 01:46:19,310 --> 01:46:22,060 So this is great because in constant time, 2473 01:46:22,060 --> 01:46:26,020 I can insert and conversely search for any of these names, 2474 01:46:26,020 --> 01:46:27,700 based on the first letter of their name. 2475 01:46:27,700 --> 01:46:30,098 A, or Z, or H, in this case. 2476 01:46:30,098 --> 01:46:32,890 Let's fast forward and assume we put a whole bunch of other names-- 2477 01:46:32,890 --> 01:46:34,900 might look familiar, into this hash table. 2478 01:46:34,900 --> 01:46:39,110 It's great because every name has its own location. 2479 01:46:39,110 --> 01:46:43,480 But if you're thinking of names you don't yet see it on the screen, 2480 01:46:43,480 --> 01:46:45,710 we eventually encounter a problem with this, right. 2481 01:46:45,710 --> 01:46:49,480 When could something go wrong using a hash table like this 2482 01:46:49,480 --> 01:46:52,090 if we wanted to insert even more names? 2483 01:46:52,090 --> 01:46:54,290 What's going to eventually happen? 2484 01:46:54,290 --> 01:46:54,790 Yeah. 2485 01:46:54,790 --> 01:46:56,998 There's already someone with the first letter, right. 2486 01:46:56,998 --> 01:46:59,860 Like I haven't even mentioned Harry, for instance, or Hagrid. 2487 01:46:59,860 --> 01:47:01,750 And yet, Hermione's already using that spot. 2488 01:47:01,750 --> 01:47:04,030 So that invites the question, well, what happens? 2489 01:47:04,030 --> 01:47:07,600 Maybe, if we want to insert Harry next, do we maybe cheat and put him 2490 01:47:07,600 --> 01:47:08,710 at location I? 2491 01:47:08,710 --> 01:47:11,323 But then if there's a location I, where do we put them? 2492 01:47:11,323 --> 01:47:13,990 And it just feels like the situation could very quickly devolve. 2493 01:47:13,990 --> 01:47:16,930 But I've deliberately drawn this data structure, 2494 01:47:16,930 --> 01:47:19,990 that I claim as a hash table, in 2 directions. 2495 01:47:19,990 --> 01:47:22,120 An array vertically, here. 
2496 01:47:22,120 --> 01:47:25,300 But what might this be hinting I'm using horizontally, 2497 01:47:25,300 --> 01:47:28,300 even though I'm drawing the rectangles a little differently from before? 2498 01:47:28,300 --> 01:47:29,092 AUDIENCE: An array. 2499 01:47:29,092 --> 01:47:29,758 SPEAKER 1: Yeah. 2500 01:47:29,758 --> 01:47:31,091 Maybe another array, to be fair. 2501 01:47:31,091 --> 01:47:34,258 But, honestly, arrays are such a pain with the allocating, and reallocating, 2502 01:47:34,258 --> 01:47:34,810 and so forth. 2503 01:47:34,810 --> 01:47:38,600 These look like the beginnings of a linked list, if you will. 2504 01:47:38,600 --> 01:47:42,190 Where the name is where the number used to be, even though I'm drawing it 2505 01:47:42,190 --> 01:47:44,200 horizontally now just for discussion's sake. 2506 01:47:44,200 --> 01:47:47,800 And this seems to be a pointer that isn't pointing anywhere yet. 2507 01:47:47,800 --> 01:47:53,080 But it looks like the array is 26 pointers, some of which are null, 2508 01:47:53,080 --> 01:47:53,920 that is empty. 2509 01:47:53,920 --> 01:47:56,675 Some of which are pointing at the first node in a linked list. 2510 01:47:56,675 --> 01:47:59,050 So that's really what a hash table might be in your mind. 2511 01:47:59,050 --> 01:48:03,828 An amalgam of an array, whose elements are linked lists. 2512 01:48:03,828 --> 01:48:06,370 And in theory, this gives you the best of both worlds, right. 2513 01:48:06,370 --> 01:48:09,430 You get random access with high probability, right. 2514 01:48:09,430 --> 01:48:12,620 You get to jump immediately to the location you want to put someone. 2515 01:48:12,620 --> 01:48:15,430 But, if you run into this perverse situation where there's someone 2516 01:48:15,430 --> 01:48:16,870 already there, OK, fine. 2517 01:48:16,870 --> 01:48:20,350 It starts to devolve into a linked list, but it's at least 26 2518 01:48:20,350 --> 01:48:21,580 smaller length lists. 
2519 01:48:21,580 --> 01:48:24,670 Not one massive linked list, which would be Big O of n. 2520 01:48:24,670 --> 01:48:26,480 And quite slow to solve. 2521 01:48:26,480 --> 01:48:28,630 So if Harry gets inserted and Hagrid. 2522 01:48:28,630 --> 01:48:32,780 Yeah, you have to chain them together, so to speak, in this way. 2523 01:48:32,780 --> 01:48:35,645 But, at least you've not painted yourself into a corner. 2524 01:48:35,645 --> 01:48:38,770 And in fact, if we fast forward and put a whole bunch of familiar names in, 2525 01:48:38,770 --> 01:48:41,120 the data structure starts to look like this. 2526 01:48:41,120 --> 01:48:43,460 So the chains are not terribly long. 2527 01:48:43,460 --> 01:48:46,270 And some of them are actually of size 0 because there's just 2528 01:48:46,270 --> 01:48:49,150 some unpopular letters of the alphabet among these names. 2529 01:48:49,150 --> 01:48:51,100 But it seems better than just putting everyone 2530 01:48:51,100 --> 01:48:53,860 in one big array, or one big linked list. 2531 01:48:53,860 --> 01:48:58,190 We're trying to balance these trade offs a little bit in the middle here. 2532 01:48:58,190 --> 01:49:00,410 Well, how might we represent something like this? 2533 01:49:00,410 --> 01:49:02,140 Here's how we could describe this thing. 2534 01:49:02,140 --> 01:49:05,320 A node in the context of a linked list could be this. 2535 01:49:05,320 --> 01:49:08,860 I have an array called word of type char. 2536 01:49:08,860 --> 01:49:13,060 And it's big enough to fit the longest word plus 1. 2537 01:49:13,060 --> 01:49:14,890 And the plus 1 why, probably? 2538 01:49:14,890 --> 01:49:15,760 AUDIENCE: The null. 2539 01:49:15,760 --> 01:49:16,730 SPEAKER 1: The null character. 2540 01:49:16,730 --> 01:49:19,840 So I'm assuming that longest word is like a constant defined elsewhere 2541 01:49:19,840 --> 01:49:20,470 in the story. 2542 01:49:20,470 --> 01:49:22,735 And it's something big like 40, 100, whatever. 
2543 01:49:22,735 --> 01:49:25,810 Whatever the longest word in the Harry Potter universe 2544 01:49:25,810 --> 01:49:28,440 is or the English dictionary is. 2545 01:49:28,440 --> 01:49:34,050 Longest word plus 1 should be sufficient to store any name in the story here. 2546 01:49:34,050 --> 01:49:36,360 And then, what else does each of these nodes have? 2547 01:49:36,360 --> 01:49:40,060 Well it has a pointer to another node. 2548 01:49:40,060 --> 01:49:42,390 So here's how we might implement the notion of a node 2549 01:49:42,390 --> 01:49:46,710 in the context of storing not integers, but names. 2550 01:49:46,710 --> 01:49:48,360 Instead, like this. 2551 01:49:48,360 --> 01:49:51,360 But how do we decide what the hash table itself is? 2552 01:49:51,360 --> 01:49:55,140 Well, if we now have a definition of a node, we could have a variable in main, 2553 01:49:55,140 --> 01:49:57,510 or even globally, called hash table. 2554 01:49:57,510 --> 01:50:02,910 That itself is an array of node* pointers. 2555 01:50:02,910 --> 01:50:05,310 That is an array of pointers to nodes. 2556 01:50:05,310 --> 01:50:07,290 The beginnings of linked lists. 2557 01:50:07,290 --> 01:50:08,950 Number of buckets is up to me. 2558 01:50:08,950 --> 01:50:11,083 I proposed, verbally, that it be 26. 2559 01:50:11,083 --> 01:50:13,500 But honestly, if you get a lot of collisions, so to speak. 2560 01:50:13,500 --> 01:50:15,623 A lot of H names trying to go to the same place. 2561 01:50:15,623 --> 01:50:17,790 Well, maybe, we need to be smarter and not just look 2562 01:50:17,790 --> 01:50:19,207 at the first letter of their name. 2563 01:50:19,207 --> 01:50:20,800 But, maybe, the first and the second. 2564 01:50:20,800 --> 01:50:24,900 So it's H-A and H-E. But wait, no, then Harry and Hagrid still collide. 
2565 01:50:24,900 --> 01:50:27,840 But we start to at least make the problem a little less 2566 01:50:27,840 --> 01:50:31,500 impactful by tinkering with something like the number of buckets 2567 01:50:31,500 --> 01:50:32,880 in a hash table like this. 2568 01:50:32,880 --> 01:50:37,560 But how do we decide where someone goes in a hash table in this way? 2569 01:50:37,560 --> 01:50:39,900 Well, it's an old school problem of input and output. 2570 01:50:39,900 --> 01:50:43,260 The input to the problem is going to be something like the name. 2571 01:50:43,260 --> 01:50:45,300 And the algorithm in the middle, as of today, 2572 01:50:45,300 --> 01:50:47,730 is going to be something called a hash function. 2573 01:50:47,730 --> 01:50:49,620 A hash function is generally something that 2574 01:50:49,620 --> 01:50:53,370 takes as input, a string, a number, whatever, and produces 2575 01:50:53,370 --> 01:50:55,860 as output a location in our context. 2576 01:50:55,860 --> 01:50:57,750 Like a number 0 through 25. 2577 01:50:57,750 --> 01:50:59,490 Or 0 through 16,000. 2578 01:50:59,490 --> 01:51:02,190 Or whatever the number of buckets you want is, 2579 01:51:02,190 --> 01:51:06,370 it's going to just tell you where to put that input at a specific location. 2580 01:51:06,370 --> 01:51:10,200 So, for instance, Albus, according to the story thus far, gave me back 0 2581 01:51:10,200 --> 01:51:10,710 as output. 2582 01:51:10,710 --> 01:51:12,570 Zacharias gave me 25. 2583 01:51:12,570 --> 01:51:15,300 So the hash function, in the middle of that black box, 2584 01:51:15,300 --> 01:51:17,760 is pretty simplistic in this story. 2585 01:51:17,760 --> 01:51:21,360 It's just looking at the ASCII value, it seems, of the first letter 2586 01:51:21,360 --> 01:51:22,110 in their name. 2587 01:51:22,110 --> 01:51:25,150 And then, subtracting off what capital A is, 65. 2588 01:51:25,150 --> 01:51:29,470 So like doing some math to get back a number between 0 and 25. 
2589 01:51:29,470 --> 01:51:32,610 So that's how we got to this point in the story. 2590 01:51:32,610 --> 01:51:37,440 And how might we, then, resolve the problem further and use 2591 01:51:37,440 --> 01:51:39,060 this notion of hashing more generally? 2592 01:51:39,060 --> 01:51:40,935 Well just for demonstration sake here, here's 2593 01:51:40,935 --> 01:51:43,290 actually some buckets, literally. 2594 01:51:43,290 --> 01:51:46,380 And we've labeled, in advance, these buckets with the suits 2595 01:51:46,380 --> 01:51:47,800 from a deck of cards. 2596 01:51:47,800 --> 01:51:49,770 So we've got some spades. 2597 01:51:49,770 --> 01:51:54,600 And we've got diamonds here. 2598 01:51:54,600 --> 01:51:58,110 And we've got, what else here? 2599 01:51:58,110 --> 01:52:01,890 Clubs and hearts. 2600 01:52:01,890 --> 01:52:04,592 So we have a deck of cards here, for instance, right. 2601 01:52:04,592 --> 01:52:07,050 And this is something you, yourself, might do instinctively 2602 01:52:07,050 --> 01:52:09,420 if you're getting ready to start playing a game of cards. 2603 01:52:09,420 --> 01:52:11,587 You're just cleaning up or you want things in order. 2604 01:52:11,587 --> 01:52:13,963 Like, here is literally a jumbo deck of cards. 2605 01:52:13,963 --> 01:52:16,380 What would be the easiest way for me to sort these things? 2606 01:52:16,380 --> 01:52:19,088 Well we've got a whole bunch of sorting algorithms from the past. 2607 01:52:19,088 --> 01:52:21,630 So I could go through like, here's the 3 of diamonds. 2608 01:52:21,630 --> 01:52:23,880 And I could, here let me throw this up on the screen. 2609 01:52:23,880 --> 01:52:25,570 Just so, if you're far in back. 2610 01:52:25,570 --> 01:52:27,900 So here's diamonds. 2611 01:52:27,900 --> 01:52:28,890 I could put this here. 2612 01:52:28,890 --> 01:52:30,510 3, 4. 2613 01:52:30,510 --> 01:52:32,130 I could do this in order here. 2614 01:52:32,130 --> 01:52:34,540 But a lot of us, honestly, if given a deck of cards. 
2615 01:52:34,540 --> 01:52:37,290 And you just want to clean it up and sort it in order, 2616 01:52:37,290 --> 01:52:38,620 you might do things like this. 2617 01:52:38,620 --> 01:52:42,030 Well here's my input, 3 of diamonds, let's put it in this bucket. 2618 01:52:42,030 --> 01:52:43,770 4 of diamonds, this bucket. 2619 01:52:43,770 --> 01:52:45,640 5 of diamonds, this bucket. 2620 01:52:45,640 --> 01:52:49,500 And if you keep going through the cards, here's seven of hearts, hearts bucket. 2621 01:52:49,500 --> 01:52:51,210 8's bucket. 2622 01:52:51,210 --> 01:52:53,070 Queen of spades over here. 2623 01:52:53,070 --> 01:52:55,020 And it's still going to take you 52 steps. 2624 01:52:55,020 --> 01:52:58,020 But at the end of it, you have hashed all of the cards 2625 01:52:58,020 --> 01:52:59,610 into 4 distinct buckets. 2626 01:52:59,610 --> 01:53:02,490 And now you have problems of size 13, which 2627 01:53:02,490 --> 01:53:06,030 is a little more tenable than doing one massive 52 card problem. 2628 01:53:06,030 --> 01:53:08,070 You can now do 4, 13 size problems. 2629 01:53:08,070 --> 01:53:11,790 And so hashing is something that even you and I might do instinctively. 2630 01:53:11,790 --> 01:53:16,680 Taking as input some card, some name, and producing as output some location. 2631 01:53:16,680 --> 01:53:21,960 A temporary pile in which you want to stage things, so to speak. 2632 01:53:21,960 --> 01:53:24,442 But these collisions are inevitable. 2633 01:53:24,442 --> 01:53:27,150 And honestly, if we kept going through the Harry Potter universe, 2634 01:53:27,150 --> 01:53:29,950 some of these chains would get longer, and longer and longer. 2635 01:53:29,950 --> 01:53:33,330 Which means that instead of getting someone's name quickly, 2636 01:53:33,330 --> 01:53:36,178 by searching for them or inserting them, might 2637 01:53:36,178 --> 01:53:37,720 start taking a decent amount of time. 
2638 01:53:37,720 --> 01:53:40,770 So what could we do instead to resolve situations like this? 2639 01:53:40,770 --> 01:53:44,370 If the problem, fundamentally, is that the first letter is just too darn 2640 01:53:44,370 --> 01:53:47,387 popular, H, we need to take in more input. 2641 01:53:47,387 --> 01:53:49,720 Not just the first letter but maybe the first 2 letters. 2642 01:53:49,720 --> 01:53:52,770 So if we do that, we can go from A through Z 2643 01:53:52,770 --> 01:53:59,200 to something more extreme like maybe H-A, H-B, H-C, H-D, H-F, and so forth. 2644 01:53:59,200 --> 01:54:02,670 So that now Harry and Hermione end up at different locations. 2645 01:54:02,670 --> 01:54:05,590 But, darn it, Hagrid still collides with Harry. 2646 01:54:05,590 --> 01:54:07,380 So it's better than before. 2647 01:54:07,380 --> 01:54:09,550 The chains aren't quite as long. 2648 01:54:09,550 --> 01:54:11,410 But the problem isn't fundamentally gone. 2649 01:54:11,410 --> 01:54:14,640 And in this case here, anyone know how many buckets we just 2650 01:54:14,640 --> 01:54:22,830 increased to, if we now look at not just a through Z but AA through ZZ, roughly? 2651 01:54:22,830 --> 01:54:24,183 AUDIENCE: 26 squared. 2652 01:54:24,183 --> 01:54:24,850 SPEAKER 1: Yeah. 2653 01:54:24,850 --> 01:54:25,440 OK, good. 2654 01:54:25,440 --> 01:54:28,980 So the easy answer to 26 squared are 676. 2655 01:54:28,980 --> 01:54:30,570 So that's a lot more buckets. 2656 01:54:30,570 --> 01:54:33,040 And this is why I only showed a few of them on the screen. 2657 01:54:33,040 --> 01:54:33,930 So that's a lot more. 2658 01:54:33,930 --> 01:54:37,050 And it spreads things out in particular. 2659 01:54:37,050 --> 01:54:38,640 What if we take this one step further? 2660 01:54:38,640 --> 01:54:44,130 Instead of H-A, we do like H-A-A, H-A-B, H-A-C, H-Z-Z, and so forth. 2661 01:54:44,130 --> 01:54:46,080 Well now, we have an even better situation. 
2662 01:54:46,080 --> 01:54:48,480 Because Hermoine has her one spot. 2663 01:54:48,480 --> 01:54:49,770 Harry has his one spot. 2664 01:54:49,770 --> 01:54:51,840 Hagrid has his one spot. 2665 01:54:51,840 --> 01:54:53,880 But there's a trade off here. 2666 01:54:53,880 --> 01:54:57,240 The upside is now, arithmetically, we can find their locations 2667 01:54:57,240 --> 01:54:58,620 in constant time. 2668 01:54:58,620 --> 01:55:00,030 Maybe, technically 3 steps. 2669 01:55:00,030 --> 01:55:03,940 But 3 is constant, no matter how many other names are in here, it would seem. 2670 01:55:03,940 --> 01:55:07,152 But what's the downside here? 2671 01:55:07,152 --> 01:55:07,860 Sorry, say again. 2672 01:55:07,860 --> 01:55:08,490 AUDIENCE: Memory. 2673 01:55:08,490 --> 01:55:09,240 SPEAKER 1: Memory. 2674 01:55:09,240 --> 01:55:10,290 So significantly more. 2675 01:55:10,290 --> 01:55:15,840 We're now up to 17,576 buckets, which itself isn't that big a deal, right. 2676 01:55:15,840 --> 01:55:17,740 Computers have a lot of memory these days. 2677 01:55:17,740 --> 01:55:21,450 But as you can infer, I can't really think 2678 01:55:21,450 --> 01:55:26,160 of someone whose name started with H-E-Q, for instance, in the Harry 2679 01:55:26,160 --> 01:55:26,832 Potter universe. 2680 01:55:26,832 --> 01:55:29,040 And if we keep going, definitely don't know of anyone 2681 01:55:29,040 --> 01:55:32,040 whose name started with Z-Z-Z or A-A-A. There's 2682 01:55:32,040 --> 01:55:37,390 a lot of not useful combinations that have to be there mathematically, 2683 01:55:37,390 --> 01:55:41,040 so that you can do a bit of math and jump to randomly, so to speak, 2684 01:55:41,040 --> 01:55:42,292 the precise location. 2685 01:55:42,292 --> 01:55:43,750 But they're just going to be empty. 2686 01:55:43,750 --> 01:55:47,380 So it's a very sparsely populated array, so to speak. 2687 01:55:47,380 --> 01:55:50,640 So what does that really mean for performance, ultimately? 
2688 01:55:50,640 --> 01:55:53,400 Well let's consider, again, in the context of our Big O notation. 2689 01:55:53,400 --> 01:55:56,790 It turns out that a hash table, technically speaking, 2690 01:55:56,790 --> 01:56:00,870 is still just going to give us Big O of n in the worst case. 2691 01:56:00,870 --> 01:56:01,470 Why? 2692 01:56:01,470 --> 01:56:04,440 If you have some crazy perverse case where everyone in the universe 2693 01:56:04,440 --> 01:56:07,950 has a name that starts with A, or starts with H, or starts with Z, 2694 01:56:07,950 --> 01:56:09,240 you just get really unlucky. 2695 01:56:09,240 --> 01:56:11,117 And your chain is massively long. 2696 01:56:11,117 --> 01:56:13,200 Well then, at that point, it's just a linked list. 2697 01:56:13,200 --> 01:56:14,117 It's not a hash table. 2698 01:56:14,117 --> 01:56:16,380 It's like the perverse situation with the tree, where 2699 01:56:16,380 --> 01:56:22,200 if you insert it without any mind for keeping it balance, it just evolves. 2700 01:56:22,200 --> 01:56:26,400 But there's a difference here between a theoretical performance 2701 01:56:26,400 --> 01:56:28,020 and an actual performance. 2702 01:56:28,020 --> 01:56:31,290 If you look back at the the hash table here, 2703 01:56:31,290 --> 01:56:37,890 this is absolutely, in practice, going to be faster than a single linked list. 2704 01:56:37,890 --> 01:56:40,860 Mathematically, asymptotically, big O notation, sure. 2705 01:56:40,860 --> 01:56:41,700 It's all the same. 2706 01:56:41,700 --> 01:56:42,630 Big O of n. 2707 01:56:42,630 --> 01:56:46,500 But if what we're really caring about is real humans using our software, 2708 01:56:46,500 --> 01:56:48,990 there's something to be said for crafting a data structure. 2709 01:56:48,990 --> 01:56:51,570 That technically, if this data were uniformly distributed, 2710 01:56:51,570 --> 01:56:55,450 is 26 times faster than a linked list alone. 
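The worst case is easier to see in code. Below is a hedged sketch in C of insertion and search with chaining, assuming the same first-letter hash as in the story; the function names table_insert and table_search and the constant values are hypothetical, not taken from the slides. Insertion prepends to a bucket's list in constant time, while search walks only the one chain the hash function selects, so its cost grows with that chain's length, not with the size of the whole table.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define LONGEST_WORD 45      // assumed upper bound on name length
#define NUMBER_OF_BUCKETS 26 // one bucket per letter, A through Z

typedef struct node
{
    char word[LONGEST_WORD + 1];
    struct node *next;
} node;

// Global array of list heads; every bucket starts out NULL.
node *hash_table[NUMBER_OF_BUCKETS];

// First-letter hash: 'A' or 'a' maps to 0, 'Z' or 'z' maps to 25.
// Non-alphabetic input falls back to bucket 0 (an assumption).
static unsigned int hash(const char *word)
{
    if (!isalpha((unsigned char) word[0]))
    {
        return 0;
    }
    return (unsigned int) (toupper((unsigned char) word[0]) - 'A');
}

// Prepends a name to its bucket's chain. Returns false if the name is
// too long or memory cannot be allocated.
bool table_insert(const char *name)
{
    if (strlen(name) > LONGEST_WORD)
    {
        return false;
    }
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }
    strcpy(n->word, name);
    unsigned int bucket = hash(name);
    n->next = hash_table[bucket]; // new node points at the old head
    hash_table[bucket] = n;       // bucket now starts at the new node
    return true;
}

// Walks only the chain in the name's own bucket.
bool table_search(const char *name)
{
    for (node *ptr = hash_table[hash(name)]; ptr != NULL; ptr = ptr->next)
    {
        if (strcmp(ptr->word, name) == 0)
        {
            return true;
        }
    }
    return false;
}
```

If every name started with H, all of them would sit in one chain and table_search would degrade to a linear scan, which is the Big O of n worst case described above. With names spread evenly, each chain holds roughly one twenty-sixth of the data.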
2711 01:56:55,450 --> 01:57:00,720 And so, there's this tension too between systems, types of CS, 2712 01:57:00,720 --> 01:57:01,847 and theoretical CS. 2713 01:57:01,847 --> 01:57:03,930 Where yeah, theoretically, these are all the same. 2714 01:57:03,930 --> 01:57:06,660 But in practice, for making real-world software, 2715 01:57:06,660 --> 01:57:12,390 improving this speed by a factor of 26 in this case, let alone 576 or more, 2716 01:57:12,390 --> 01:57:14,170 might actually make a big difference. 2717 01:57:14,170 --> 01:57:15,670 But there's going to be a trade off. 2718 01:57:15,670 --> 01:57:19,540 And that's typically some other resource like giving up more space. 2719 01:57:19,540 --> 01:57:20,040 All right. 2720 01:57:20,040 --> 01:57:23,100 How about another data structure we could build. 2721 01:57:23,100 --> 01:57:26,010 Let me fast forward to something here called a trie. 2722 01:57:26,010 --> 01:57:28,920 So a trie, a weird name in pronunciation. 2723 01:57:28,920 --> 01:57:31,950 Short for retrieval, pronounced trie typically. 2724 01:57:31,950 --> 01:57:37,680 A trie is a tree that actually gives us constant time lookup, 2725 01:57:37,680 --> 01:57:41,040 even for massive data sets. 2726 01:57:41,040 --> 01:57:42,090 What do I mean by this? 2727 01:57:42,090 --> 01:57:47,230 In the world of a trie, you create a tree out of arrays. 2728 01:57:47,230 --> 01:57:49,560 So we're really getting into the Frankenstein territory 2729 01:57:49,560 --> 01:57:52,320 of just building things up with spare parts of data structures 2730 01:57:52,320 --> 01:57:53,500 that we have here. 2731 01:57:53,500 --> 01:57:56,460 But the root of a trie is, itself, an array. 2732 01:57:56,460 --> 01:57:58,530 For instance, of size 26. 2733 01:57:58,530 --> 01:58:04,800 Where each element in that trie points to another node, 2734 01:58:04,800 --> 01:58:06,510 which is to say another array. 
2735 01:58:06,510 --> 01:58:09,480 And each of those locations in the array represents a letter 2736 01:58:09,480 --> 01:58:10,920 of the alphabet like A through Z. 2737 01:58:10,920 --> 01:58:14,970 So for instance, if you wanted to store the names of the Harry Potter universe, 2738 01:58:14,970 --> 01:58:19,050 not in a hash table, not in a linked list, not in a tree, but in a trie. 2739 01:58:19,050 --> 01:58:23,820 What you would do is hash on every letter in the person's name one 2740 01:58:23,820 --> 01:58:24,640 at a time. 2741 01:58:24,640 --> 01:58:28,050 So a trie is like a multi-tier hash table, in a sense. 2742 01:58:28,050 --> 01:58:29,770 Where you first look at the first letter, 2743 01:58:29,770 --> 01:58:32,478 then the second letter, then the third, and you do the following. 2744 01:58:32,478 --> 01:58:35,940 For instance, each of these locations represents a letter A 2745 01:58:35,940 --> 01:58:39,450 through Z. Suppose I wanted to insert someone's name into this 2746 01:58:39,450 --> 01:58:43,530 that starts with the letter H, like Hagrid for instance. 2747 01:58:43,530 --> 01:58:46,360 Well, I go to the location H. I see it's null, 2748 01:58:46,360 --> 01:58:49,440 which means I need to malloc myself another node or another array. 2749 01:58:49,440 --> 01:58:50,970 And that's depicted here. 2750 01:58:50,970 --> 01:58:54,810 Then, suppose I want to store the second letter in Hagrid's name, 2751 01:58:54,810 --> 01:58:57,432 an A. So I go to that location in the second node. 2752 01:58:57,432 --> 01:58:58,890 And I see, OK, it's currently null. 2753 01:58:58,890 --> 01:58:59,932 There's nothing below it. 2754 01:58:59,932 --> 01:59:02,440 So I allocate another node using malloc or the like. 2755 01:59:02,440 --> 01:59:06,690 And now I have H-A-G. And I continue this with R-I-D. 
2756 01:59:06,690 --> 01:59:10,240 And then, when I get to the bottom of this person's name, 2757 01:59:10,240 --> 01:59:12,840 I just have to indicate here in color, but probably 2758 01:59:12,840 --> 01:59:14,280 with a Boolean value or something. 2759 01:59:14,280 --> 01:59:18,190 Like a true value that says, a name stops here. 2760 01:59:18,190 --> 01:59:23,740 So that it's clear that the person's name is not H-A, or H-A-G, or H-A-G-R, 2761 01:59:23,740 --> 01:59:28,270 or H-A-G-R-I. It's H-A-G-R-I-D. And the D is green, 2762 01:59:28,270 --> 01:59:31,600 just to indicate there's like some other Boolean value that just says, yes. 2763 01:59:31,600 --> 01:59:35,300 This is the node in which the name stops. 2764 01:59:35,300 --> 01:59:40,240 And if I continue this logic, here's how I might insert someone like Harry. 2765 01:59:40,240 --> 01:59:43,420 And here's how I might insert someone like Hermione. 2766 01:59:43,420 --> 01:59:48,010 And what's interesting about the design here is that some of these names 2767 01:59:48,010 --> 01:59:49,930 share a common prefix. 2768 01:59:49,930 --> 01:59:52,990 Which starts to get compelling because you're reusing space. 2769 01:59:52,990 --> 01:59:57,910 You're using the same nodes for names like H-A-G and H-A-R 2770 01:59:57,910 --> 02:00:00,370 because they share H and an A in common. 2771 02:00:00,370 --> 02:00:02,630 And they all share an H in common. 2772 02:00:02,630 --> 02:00:06,340 So you have this data structure now that, itself, is a tree. 2773 02:00:06,340 --> 02:00:10,090 Each node in the tree is, itself, an array. 2774 02:00:10,090 --> 02:00:13,690 And we, therefore, might implement this thing using code like this. 2775 02:00:13,690 --> 02:00:19,195 Every node is containing, I'll do it in reverse order, an array. 2776 02:00:19,195 --> 02:00:21,820 I'll call it children because that's what it really represents. 2777 02:00:21,820 --> 02:00:24,130 Up to 26 children for each of these nodes. 
2778 02:00:24,130 --> 02:00:25,430 Size of the alphabet. 2779 02:00:25,430 --> 02:00:28,360 So I might have used just a constant for number 26, 2780 02:00:28,360 --> 02:00:30,400 to give myself 26 letters of the alphabet. 2781 02:00:30,400 --> 02:00:34,630 And each of those arrays stores that many node stars. 2782 02:00:34,630 --> 02:00:36,550 That many pointers to another node. 2783 02:00:36,550 --> 02:00:38,020 And here's an example of the bool. 2784 02:00:38,020 --> 02:00:40,750 This is what I represented in green on the slide a moment ago. 2785 02:00:40,750 --> 02:00:42,580 I also need another piece of data. 2786 02:00:42,580 --> 02:00:45,520 Just a 0 or 1, a true or false, that says yes. 2787 02:00:45,520 --> 02:00:50,810 A name stops in this node or it's just a path to the rest of the person's name. 2788 02:00:50,810 --> 02:00:55,090 But the upside of this is that the height of this tree 2789 02:00:55,090 --> 02:00:58,090 is only as tall as the person's longest name. 2790 02:00:58,090 --> 02:01:04,930 H-A-G-R-I-D or H-E-R-M-I-O-N-E. And notice that no matter how many other 2791 02:01:04,930 --> 02:01:08,740 people are in this data structure, there's 3 at the moment, 2792 02:01:08,740 --> 02:01:13,150 if there were 3 million, it would still take me how many steps to search 2793 02:01:13,150 --> 02:01:14,500 for Hermione? 2794 02:01:14,500 --> 02:01:19,750 H-E-R-M-I-O-N-E. So, 8 steps total. 2795 02:01:19,750 --> 02:01:24,580 No matter if there's 2 other people, 2 million, 10 million other people. 2796 02:01:24,580 --> 02:01:28,660 Because the path to her name is always on the same path. 2797 02:01:28,660 --> 02:01:33,550 And if you assume that there's a maximum limit on the length of names 2798 02:01:33,550 --> 02:01:34,420 in the human world. 2799 02:01:34,420 --> 02:01:36,510 Maybe it's 40, 100, whatever. 2800 02:01:36,510 --> 02:01:38,260 Whatever the longest name in the world is. 2801 02:01:38,260 --> 02:01:39,160 That's constant. 
2802 02:01:39,160 --> 02:01:41,630 Maybe it's 40, 100, but that's constant. 2803 02:01:41,630 --> 02:01:44,840 Which is to say that with a trie, technically speaking, 2804 02:01:44,840 --> 02:01:49,480 it is the case that your lookup time, in big O notation, 2805 02:01:49,480 --> 02:01:51,520 would be big O of 1. 2806 02:01:51,520 --> 02:01:54,580 It's constant time, because unlike every other data structure 2807 02:01:54,580 --> 02:01:59,440 we've looked at, with a trie, the amount of time it takes you to find one person 2808 02:01:59,440 --> 02:02:02,920 or insert one person is completely independent of how 2809 02:02:02,920 --> 02:02:07,210 many other pieces of data are already in the data structure. 2810 02:02:07,210 --> 02:02:09,970 And this holds true even if one name is a prefix of another. 2811 02:02:09,970 --> 02:02:13,373 I don't think there was a Daniel or Danielle in the Harry Potter universe 2812 02:02:13,373 --> 02:02:14,290 that I could think of. 2813 02:02:14,290 --> 02:02:18,400 But, D-A-N-I-E-L could be one name. 2814 02:02:18,400 --> 02:02:20,988 And, therefore, we have a true there in green. 2815 02:02:20,988 --> 02:02:22,780 And if there's a longer name like Danielle. 2816 02:02:22,780 --> 02:02:24,760 Then, you keep going until you get to the E. 2817 02:02:24,760 --> 02:02:27,550 So you can still have with a trie, one name that's 2818 02:02:27,550 --> 02:02:29,660 a substring of another name. 2819 02:02:29,660 --> 02:02:32,380 So it's not as though we've created a problem there. 2820 02:02:32,380 --> 02:02:34,052 That, too, is still possible. 2821 02:02:34,052 --> 02:02:36,760 But at the end of the day, it only takes a finite number of steps 2822 02:02:36,760 --> 02:02:38,410 to find any of these people. 2823 02:02:38,410 --> 02:02:41,320 And again, that's what's particularly compelling. 2824 02:02:41,320 --> 02:02:43,398 That you effectively have constant time lookup. 2825 02:02:43,398 --> 02:02:44,440 So that's amazing, right. 
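As a rough illustration of the trie described above, here is a sketch in C. The struct mirrors the description (an array of 26 node pointers called children plus a bool), but the names SIZE_OF_ALPHABET, is_word, new_node, trie_insert, and trie_search are assumptions for this example, as is the choice to reject names containing anything other than letters.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

#define SIZE_OF_ALPHABET 26

// Each node is an array of up to 26 children plus a flag (the "green"
// marker on the slide) saying whether a name stops at this node.
typedef struct node
{
    bool is_word;
    struct node *children[SIZE_OF_ALPHABET];
} node;

// Allocates a node with no children and the flag cleared; NULL on failure.
node *new_node(void)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    n->is_word = false;
    for (int i = 0; i < SIZE_OF_ALPHABET; i++)
    {
        n->children[i] = NULL;
    }
    return n;
}

// Inserts a name one letter at a time, allocating nodes only where the
// path does not exist yet. Shared prefixes reuse existing nodes.
bool trie_insert(node *root, const char *name)
{
    node *cursor = root;
    for (int i = 0; name[i] != '\0'; i++)
    {
        if (!isalpha((unsigned char) name[i]))
        {
            return false; // only A through Z is supported in this sketch
        }
        int slot = toupper((unsigned char) name[i]) - 'A';
        if (cursor->children[slot] == NULL)
        {
            cursor->children[slot] = new_node();
            if (cursor->children[slot] == NULL)
            {
                return false;
            }
        }
        cursor = cursor->children[slot];
    }
    cursor->is_word = true; // a name stops here
    return true;
}

// Follows one pointer per letter; the number of steps depends only on the
// length of the name, not on how many names are stored.
bool trie_search(const node *root, const char *name)
{
    const node *cursor = root;
    for (int i = 0; name[i] != '\0'; i++)
    {
        if (!isalpha((unsigned char) name[i]))
        {
            return false;
        }
        cursor = cursor->children[toupper((unsigned char) name[i]) - 'A'];
        if (cursor == NULL)
        {
            return false;
        }
    }
    return cursor->is_word;
}
```

In this sketch, searching for Hag after inserting Hagrid returns false because the flag is only set on the final D, while Daniel and Danielle can coexist because each has its own flag along the same path.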
2826 02:02:44,440 --> 02:02:48,153 We've gone through this whole story for weeks now of like, linear time. 2827 02:02:48,153 --> 02:02:49,570 And then, it went up to n squared. 2828 02:02:49,570 --> 02:02:50,350 And then, log n. 2829 02:02:50,350 --> 02:02:55,430 And now constant time, what's the price paid for a data structure like this? 2830 02:02:55,430 --> 02:02:58,630 This so-called trie? 2831 02:02:58,630 --> 02:02:59,810 What's the downside here? 2832 02:02:59,810 --> 02:03:01,540 There's got to be a catch. 2833 02:03:01,540 --> 02:03:03,970 And in fact, tries are not actually used that often, 2834 02:03:03,970 --> 02:03:07,500 amazing as they might sound on some CS level here. 2835 02:03:07,500 --> 02:03:08,260 AUDIENCE: Memory. 2836 02:03:08,260 --> 02:03:09,520 SPEAKER 1: Memory. 2837 02:03:09,520 --> 02:03:10,735 In what sense? 2838 02:03:10,735 --> 02:03:12,898 AUDIENCE: Much like a [INAUDIBLE]. 2839 02:03:12,898 --> 02:03:13,690 SPEAKER 1: Exactly. 2840 02:03:13,690 --> 02:03:15,610 If you're storing all of these darn arrays 2841 02:03:15,610 --> 02:03:18,870 it's, again, a sparsely populated data structure. 2842 02:03:18,870 --> 02:03:19,870 And you can see it here. 2843 02:03:19,870 --> 02:03:23,800 Granted there's only 3 names, but most of those boxes, most of those pointers, 2844 02:03:23,800 --> 02:03:25,490 are going to remain null. 2845 02:03:25,490 --> 02:03:28,540 So this is an incredibly wide data structure, if you will. 2846 02:03:28,540 --> 02:03:31,040 It uses a huge amount of memory to store the names. 2847 02:03:31,040 --> 02:03:32,860 But again, you've got to pick a lane. 2848 02:03:32,860 --> 02:03:35,980 Either you're going to minimize space or you're going to minimize time. 2849 02:03:35,980 --> 02:03:39,240 It's not really possible to get truly the best of both worlds. 
2850 02:03:39,240 --> 02:03:41,290 You have to decide where the inflection point is 2851 02:03:41,290 --> 02:03:44,110 for the device you're writing software for, how much memory it has, 2852 02:03:44,110 --> 02:03:45,460 how expensive it is. 2853 02:03:45,460 --> 02:03:48,980 And again, taking all of these things into account. 2854 02:03:48,980 --> 02:03:51,400 So lastly, let's do one further abstraction. 2855 02:03:51,400 --> 02:03:54,910 So even higher level to discuss something that are generally 2856 02:03:54,910 --> 02:03:56,962 known as abstract data structures. 2857 02:03:56,962 --> 02:03:58,670 It turns out we could spend like all day, 2858 02:03:58,670 --> 02:04:00,250 all week, talking about different things we 2859 02:04:00,250 --> 02:04:01,700 could build with these data structures. 2860 02:04:01,700 --> 02:04:03,658 But for the most part, now that we have arrays. 2861 02:04:03,658 --> 02:04:06,430 Now that we have linked lists or their cousin's trees, which 2862 02:04:06,430 --> 02:04:07,428 are 2-dimensional. 2863 02:04:07,428 --> 02:04:09,220 And beyond that, there's even graphs, where 2864 02:04:09,220 --> 02:04:12,407 the arrows can go in multiple directions, not just down, so to speak. 2865 02:04:12,407 --> 02:04:14,740 Now that we have this ability to stitch things together, 2866 02:04:14,740 --> 02:04:16,790 we can solve all different types of problems. 2867 02:04:16,790 --> 02:04:20,740 So, for instance, a very common type of data structure 2868 02:04:20,740 --> 02:04:24,730 to use in a program, or even our human world, are things called queues. 2869 02:04:24,730 --> 02:04:28,780 A queue being a data structure like a line outside of a store. 2870 02:04:28,780 --> 02:04:30,850 Where it has what's called a FIFO property. 2871 02:04:30,850 --> 02:04:32,240 First In, First Out. 2872 02:04:32,240 --> 02:04:34,660 Which is great for fairness, at least in the human world. 
2873 02:04:34,660 --> 02:04:38,800 And if you've ever waited outside of Tasty Burger, or Salsa Fresca, 2874 02:04:38,800 --> 02:04:40,990 or some other restaurant nearby, presumably, 2875 02:04:40,990 --> 02:04:43,780 if you're queuing up at the counter, you want 2876 02:04:43,780 --> 02:04:46,270 the store to maintain a FIFO system. 2877 02:04:46,270 --> 02:04:47,530 First in and first out. 2878 02:04:47,530 --> 02:04:51,160 So that whoever's first in line gets their food first and gets out first. 2879 02:04:51,160 --> 02:04:54,710 So a queue is actually a computer science term, too. 2880 02:04:54,710 --> 02:04:57,460 And even if you're still in the habit of printing things on paper, 2881 02:04:57,460 --> 02:04:59,710 there are things you might have heard called printer 2882 02:04:59,710 --> 02:05:02,050 queues, which also do things in order. 2883 02:05:02,050 --> 02:05:04,467 The first person to send their essay to the printer 2884 02:05:04,467 --> 02:05:06,550 should, ideally, be printed before the last person 2885 02:05:06,550 --> 02:05:08,920 to send their essay to the printer. 2886 02:05:08,920 --> 02:05:10,720 Again, in the interest of fairness. 2887 02:05:10,720 --> 02:05:12,370 But how can you implement a queue? 2888 02:05:12,370 --> 02:05:15,250 Well, you typically have to implement 2 fundamental operations, 2889 02:05:15,250 --> 02:05:16,810 enqueue and dequeue. 2890 02:05:16,810 --> 02:05:19,910 So adding something to it and removing something from it. 2891 02:05:19,910 --> 02:05:23,650 And the interesting thing here is that how do you implement a queue? 2892 02:05:23,650 --> 02:05:26,650 Well in the human world, you would just have literally physical space 2893 02:05:26,650 --> 02:05:29,290 for humans to line up from left to right, or right to left. 2894 02:05:29,290 --> 02:05:30,333 Same in a computer.
2895 02:05:30,333 --> 02:05:33,250 Like a printer queue, if you send a whole bunch of jobs to be printed, 2896 02:05:33,250 --> 02:05:35,350 a whole bunch of essays or documents, well, you 2897 02:05:35,350 --> 02:05:37,430 need a chunk of memory like an array. 2898 02:05:37,430 --> 02:05:37,930 All right. 2899 02:05:37,930 --> 02:05:40,150 Well, if you use an array, what's a problem 2900 02:05:40,150 --> 02:05:43,760 that could happen in the world of printing, for instance? 2901 02:05:43,760 --> 02:05:47,020 If you use an array to store all of the documents that need to be printed. 2902 02:05:47,020 --> 02:05:48,178 AUDIENCE: It can be filled. 2903 02:05:48,178 --> 02:05:49,720 SPEAKER 1: It could be filled, right. 2904 02:05:49,720 --> 02:05:53,020 So if the programmer decided, HP or whoever makes the printer decides, 2905 02:05:53,020 --> 02:05:56,680 oh, you can send like a megabyte worth of documents to this printer at once. 2906 02:05:56,680 --> 02:05:58,730 At some point you might get an error message, 2907 02:05:58,730 --> 02:06:00,100 which says, sorry out of memory. 2908 02:06:00,100 --> 02:06:00,995 Wait a few minutes. 2909 02:06:00,995 --> 02:06:03,370 Which is maybe a reasonable solution, but a little annoying. 2910 02:06:03,370 --> 02:06:07,000 Or HP could write code that maybe dynamically resizes the array 2911 02:06:07,000 --> 02:06:07,670 or so forth. 2912 02:06:07,670 --> 02:06:10,240 But at that point, maybe they should just use a linked list. 2913 02:06:10,240 --> 02:06:11,170 And they could. 2914 02:06:11,170 --> 02:06:14,890 So there, too, you could implement the notion of a queue 2915 02:06:14,890 --> 02:06:16,238 using a linked list instead. 2916 02:06:16,238 --> 02:06:18,280 You're going to spend more memory, but you're not 2917 02:06:18,280 --> 02:06:20,650 going to run out of space in your array. 2918 02:06:20,650 --> 02:06:22,493 Which might be more compelling. 2919 02:06:22,493 --> 02:06:24,160 This happens even in the physical world.
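The linked-list version of a queue could be sketched roughly as follows. This is an illustration under stated assumptions, not how any real printer is implemented: the names `job` and `queue` are hypothetical, and an `int` id stands in for an actual document. Keeping a pointer to both ends of the list lets enqueue and dequeue each run in constant time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

// A print job waiting in line; the int id stands in for a real document.
typedef struct job
{
    int id;
    struct job *next;
} job;

// The queue tracks both ends so neither operation has to walk the list.
typedef struct
{
    job *head; // next job to print (front of the line)
    job *tail; // most recently added job (back of the line)
} queue;

// Add a job at the back of the line. Returns false if malloc fails.
bool enqueue(queue *q, int id)
{
    assert(q != NULL);
    job *j = malloc(sizeof(job));
    if (j == NULL)
    {
        return false;
    }
    j->id = id;
    j->next = NULL;
    if (q->tail == NULL)
    {
        q->head = j; // the queue was empty
    }
    else
    {
        q->tail->next = j;
    }
    q->tail = j;
    return true;
}

// Remove the job at the front of the line and store its id in *id.
// Returns false if the queue is empty.
bool dequeue(queue *q, int *id)
{
    assert(q != NULL && id != NULL);
    if (q->head == NULL)
    {
        return false;
    }
    job *j = q->head;
    *id = j->id;
    q->head = j->next;
    if (q->head == NULL)
    {
        q->tail = NULL; // the queue is empty again
    }
    free(j);
    return true;
}
```

Starting from `queue q = {NULL, NULL};`, jobs come back out of `dequeue` in the same order they went into `enqueue`, which is the FIFO property. The only limit on length is how much memory `malloc` can provide.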
2920 02:06:24,160 --> 02:06:27,640 You go to the store and you start having to line up outside and down the road. 2921 02:06:27,640 --> 02:06:31,927 And like, for a really busy store, they run out of space so they make do. 2922 02:06:31,927 --> 02:06:34,510 But in that case, it tends to be more of an array just because 2923 02:06:34,510 --> 02:06:36,965 of the physical notion of humans lining up. 2924 02:06:36,965 --> 02:06:38,590 But there's other data structures, too. 2925 02:06:38,590 --> 02:06:41,715 If you've ever gone to the dining hall and picked up like a Harvard or Yale 2926 02:06:41,715 --> 02:06:46,870 tray, you're typically picking up the last tray that was just cleaned, 2927 02:06:46,870 --> 02:06:48,730 not the first tray that was cleaned. 2928 02:06:48,730 --> 02:06:49,240 Why? 2929 02:06:49,240 --> 02:06:53,170 Because these cafeteria trays stack up on top of each other. 2930 02:06:53,170 --> 02:06:56,410 And indeed a stack is another type of abstract data structure. 2931 02:06:56,410 --> 02:06:58,870 In the physical world, it's literally something physical 2932 02:06:58,870 --> 02:07:01,030 like a stack of trays. 2933 02:07:01,030 --> 02:07:03,940 Which have what we would call a LIFO property. 2934 02:07:03,940 --> 02:07:05,460 Last In, First Out. 2935 02:07:05,460 --> 02:07:07,210 So as these things come out of the washer, 2936 02:07:07,210 --> 02:07:09,520 they're putting the most recent ones on the top. 2937 02:07:09,520 --> 02:07:13,240 And then you, the human, are probably taking the most recently cleaned one. 2938 02:07:13,240 --> 02:07:15,700 Which means in the extreme, no one on campus 2939 02:07:15,700 --> 02:07:19,135 might ever use that very first tray. 2940 02:07:19,135 --> 02:07:21,010 Which is probably fine in the world of trays, 2941 02:07:21,010 --> 02:07:24,970 but would really be bad in the world of Tasty Burger lining up for food if LIFO 2942 02:07:24,970 --> 02:07:26,770 were the property being implemented. 
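A stack with the LIFO property could be sketched with a fixed-size array like this. The names `stack`, `trays`, and `CAPACITY` are hypothetical, the capacity of 8 is arbitrary, and an `int` stands in for each tray. The operations are conventionally called push and pop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CAPACITY 8 // assumption: an arbitrary fixed size for this sketch

typedef struct
{
    int trays[CAPACITY];
    int size; // number of trays currently on the stack
} stack;

// Put a tray on top of the stack. Returns false if the stack is full.
bool push(stack *s, int tray)
{
    assert(s != NULL);
    if (s->size == CAPACITY)
    {
        return false;
    }
    s->trays[s->size] = tray;
    s->size++;
    return true;
}

// Take the top tray off the stack and store it in *tray.
// Returns false if the stack is empty.
bool pop(stack *s, int *tray)
{
    assert(s != NULL && tray != NULL);
    if (s->size == 0)
    {
        return false;
    }
    s->size--;
    *tray = s->trays[s->size];
    return true;
}
```

Because both operations touch only the top of the array, the tray at index 0 stays put until everything above it has been popped, which is the fairness problem described above.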
2943 02:07:26,770 --> 02:07:28,840 But here, too, it could be an array. 2944 02:07:28,840 --> 02:07:29,950 It could be a linked list. 2945 02:07:29,950 --> 02:07:31,533 And you see this, honestly, every day. 2946 02:07:31,533 --> 02:07:33,760 If you're using Gmail and your Gmail inbox. 2947 02:07:33,760 --> 02:07:36,280 That is actually a stack, at least by default, 2948 02:07:36,280 --> 02:07:39,678 where your newest messages, last in, are the first ones 2949 02:07:39,678 --> 02:07:40,720 at the top of the screen. 2950 02:07:40,720 --> 02:07:42,580 That's a LIFO data structure. 2951 02:07:42,580 --> 02:07:44,710 And it means that you see your most recent emails. 2952 02:07:44,710 --> 02:07:47,168 But if you have a busy day, you're getting a lot of emails, 2953 02:07:47,168 --> 02:07:48,430 it might not be a good thing. 2954 02:07:48,430 --> 02:07:50,830 Because now you're ignoring the people who wrote you 2955 02:07:50,830 --> 02:07:53,140 way earlier in the day or the week. 2956 02:07:53,140 --> 02:07:55,600 So LIFO and FIFO are just properties that you 2957 02:07:55,600 --> 02:07:58,360 can achieve with these very specific types of data structures. 2958 02:07:58,360 --> 02:08:00,110 And the parlance in the world of stacks 2959 02:08:00,110 --> 02:08:03,970 is to push something onto a stack or pop something out. 2960 02:08:03,970 --> 02:08:06,160 These are here, for instance, as an example of why 2961 02:08:06,160 --> 02:08:07,450 might you always wear the same color. 2962 02:08:07,450 --> 02:08:09,710 Well, if you're storing all of your clothes in a stack, 2963 02:08:09,710 --> 02:08:11,530 you might not ever get to the different colored 2964 02:08:11,530 --> 02:08:12,970 clothes at the bottom of the list. 2965 02:08:12,970 --> 02:08:17,890 And in fact, to paint this picture, we have a couple of minute video here. 2966 02:08:17,890 --> 02:08:20,890 Just to paint this here, made by a faculty member elsewhere.
2967 02:08:20,890 --> 02:08:23,830 Let's go ahead and dim the lights for just a minute or 2 here. 2968 02:08:23,830 --> 02:08:27,985 So that we can take a look at Jack learning some facts. 2969 02:08:27,985 --> 02:08:28,610 [VIDEO PLAYING] 2970 02:08:28,610 --> 02:08:31,360 SPEAKER 2: Once upon a time, there was a guy named Jack. 2971 02:08:31,360 --> 02:08:34,750 When it came to making friends Jack did not have the knack. 2972 02:08:34,750 --> 02:08:37,720 So Jack went to talk to the most popular guy he knew. 2973 02:08:37,720 --> 02:08:40,390 He went up to Lou and asked, what do I do? 2974 02:08:40,390 --> 02:08:42,850 Lou saw that his friend was really distressed. 2975 02:08:42,850 --> 02:08:45,560 Well, Lou began, just look how you're dressed. 2976 02:08:45,560 --> 02:08:48,130 Don't you have any clothes with a different look? 2977 02:08:48,130 --> 02:08:49,210 Yes, said Jack. 2978 02:08:49,210 --> 02:08:50,530 I sure do. 2979 02:08:50,530 --> 02:08:52,720 Come to my house and I'll show them to you. 2980 02:08:52,720 --> 02:08:54,010 So they went off to Jack's. 2981 02:08:54,010 --> 02:08:57,700 And Jack showed Lou the box, where he kept all his shirts, and his pants, 2982 02:08:57,700 --> 02:08:58,750 and his socks. 2983 02:08:58,750 --> 02:09:01,720 Lou said, I see you have all your clothes in a pile. 2984 02:09:01,720 --> 02:09:04,300 Why don't you wear some others once in a while? 2985 02:09:04,300 --> 02:09:07,450 Jack said, well, when I remove clothes and socks, 2986 02:09:07,450 --> 02:09:10,180 I wash them and put them away in the box. 2987 02:09:10,180 --> 02:09:12,670 Then comes the next morning and up I hop. 2988 02:09:12,670 --> 02:09:15,910 I go to the box and get my clothes off the top. 2989 02:09:15,910 --> 02:09:18,520 Lou quickly realized the problem with Jack. 2990 02:09:18,520 --> 02:09:21,490 He kept clothes, CDs, and books in a stack.
2991 02:09:21,490 --> 02:09:23,920 When he'd reached for something to read or to wear, 2992 02:09:23,920 --> 02:09:26,530 he chose a top book or underwear. 2993 02:09:26,530 --> 02:09:28,920 Then when he was done he would put it right back. 2994 02:09:28,920 --> 02:09:31,500 Back it would go on top of the stack. 2995 02:09:31,500 --> 02:09:33,870 I know the solution, said a triumphant Lou. 2996 02:09:33,870 --> 02:09:36,510 You need to learn to start using a queue. 2997 02:09:36,510 --> 02:09:39,300 Lou took Jack's clothes and hung them in a closet. 2998 02:09:39,300 --> 02:09:42,120 And when he had emptied the box, he just tossed it. 2999 02:09:42,120 --> 02:09:45,990 Then he said, now Jack, at the end of the day, put your clothes on the left 3000 02:09:45,990 --> 02:09:47,470 when you put them away. 3001 02:09:47,470 --> 02:09:50,190 Then tomorrow morning when you see the sunshine, get 3002 02:09:50,190 --> 02:09:52,920 your clothes from the right, from the end of the line. 3003 02:09:52,920 --> 02:09:55,800 Don't you see, said Lou, it will be so nice. 3004 02:09:55,800 --> 02:09:59,130 You'll wear everything once before you wear something twice. 3005 02:09:59,130 --> 02:10:02,070 And with everything in queues in his closet and shelf, 3006 02:10:02,070 --> 02:10:04,680 Jack started to feel quite sure of himself. 3007 02:10:04,680 --> 02:10:07,155 All thanks to Lou and his wonderful queue. 3008 02:10:09,220 --> 02:10:12,220 SPEAKER 1: So just to help you realize that these things are everywhere. 3009 02:10:12,220 --> 02:10:14,830 [AUDIENCE CLAPPING] 3010 02:10:14,830 --> 02:10:16,380 Even in our human world. 3011 02:10:16,380 --> 02:10:18,060 If you've ever lined up at this place. 3012 02:10:18,060 --> 02:10:19,980 Anyone recognize this? 3013 02:10:19,980 --> 02:10:22,800 OK, so sweetgreen, little salad place in the square. 
3014 02:10:22,800 --> 02:10:24,690 This is if you order online or in advance, 3015 02:10:24,690 --> 02:10:27,232 your food ends up according to the first letter in your name. 3016 02:10:27,232 --> 02:10:29,482 Which actually sounds awfully reminiscent of something 3017 02:10:29,482 --> 02:10:30,300 like a hash table. 3018 02:10:30,300 --> 02:10:33,360 And in fact, no matter whether you implement a hash table like we 3019 02:10:33,360 --> 02:10:35,130 did, with an array and linked list. 3020 02:10:35,130 --> 02:10:37,335 Or with 3 shelves like this. 3021 02:10:37,335 --> 02:10:40,320 This is actually an abstract data type called a dictionary. 3022 02:10:40,320 --> 02:10:43,680 And a dictionary, just like in our human world, has keys and values. 3023 02:10:43,680 --> 02:10:45,390 Words and their definitions. 3024 02:10:45,390 --> 02:10:49,890 This just has letters of the alphabet and salads as their value. 3025 02:10:49,890 --> 02:10:52,260 But here, too, there's a real world constraint. 3026 02:10:52,260 --> 02:10:55,740 In what kind of scenario does this system at sweetgreen 3027 02:10:55,740 --> 02:10:58,410 devolve into a problem, for instance? 3028 02:10:58,410 --> 02:11:02,100 Because they, too, are using only finite space, finite storage. 3029 02:11:02,100 --> 02:11:03,090 What could go wrong? 3030 02:11:03,090 --> 02:11:03,360 Yeah. 3031 02:11:03,360 --> 02:11:04,290 AUDIENCE: Run out of space. 3032 02:11:04,290 --> 02:11:04,530 SPEAKER 1: Yeah. 3033 02:11:04,530 --> 02:11:05,910 If they run out of space on the shelf and there's 3034 02:11:05,910 --> 02:11:08,380 a lot of people whose names start with D, or E, or whatever. 3035 02:11:08,380 --> 02:11:09,300 And so, they just pile up. 3036 02:11:09,300 --> 02:11:11,880 And then, maybe, they kind of overflow into the E's or the F's. 
3037 02:11:11,880 --> 02:11:13,800 And they probably don't really care because any human 3038 02:11:13,800 --> 02:11:16,290 is going to come by, and just eyeball it, and figure it out anyway. 3039 02:11:16,290 --> 02:11:18,780 But in the world of a computer, you're the one coding 3040 02:11:18,780 --> 02:11:20,670 and have to be ever so precise. 3041 02:11:20,670 --> 02:11:24,240 We thought we would lastly do one final thing here. 3042 02:11:24,240 --> 02:11:28,045 In advance, we prepared a linked list of sorts in the audience. 3043 02:11:28,045 --> 02:11:29,670 Since this has become a bit of a thing. 3044 02:11:29,670 --> 02:11:32,530 I am starting to represent the beginning of this linked list. 3045 02:11:32,530 --> 02:11:37,110 Insofar as I have a pointer here with seat location G9. 3046 02:11:37,110 --> 02:11:40,500 Whoever is in G9, would you mind standing up? 3047 02:11:40,500 --> 02:11:43,170 And what letter is on your sheet there? 3048 02:11:43,170 --> 02:11:44,100 AUDIENCE: F15. 3049 02:11:44,100 --> 02:11:46,650 SPEAKER 1: OK, so you have S15 and your letter-- 3050 02:11:46,650 --> 02:11:47,305 AUDIENCE: F15. 3051 02:11:47,305 --> 02:11:48,180 SPEAKER 1: Say again? 3052 02:11:48,180 --> 02:11:48,870 AUDIENCE: F. 3053 02:11:48,870 --> 02:11:49,680 SPEAKER 1: F15. 3054 02:11:49,680 --> 02:11:51,990 So I see you're holding a C in your node. 3055 02:11:51,990 --> 02:11:55,500 You are pointing to, if you could physically, F15. 3056 02:11:55,500 --> 02:11:56,880 F15, what do you hold? 3057 02:11:56,880 --> 02:11:57,780 AUDIENCE: S. 3058 02:11:57,780 --> 02:12:00,390 SPEAKER 1: You have an S. And who should you be pointing at? 3059 02:12:00,390 --> 02:12:01,170 AUDIENCE: F5. 3060 02:12:01,170 --> 02:12:01,930 SPEAKER 1: F5. 3061 02:12:01,930 --> 02:12:03,240 Could you stand up, F5. 3062 02:12:03,240 --> 02:12:04,950 You're holding a 5, I see. 3063 02:12:04,950 --> 02:12:06,030 What address? 3064 02:12:06,030 --> 02:12:07,020 AUDIENCE: F12.
3065 02:12:07,020 --> 02:12:08,040 SPEAKER 1: F12. 3066 02:12:08,040 --> 02:12:08,820 Big finale. 3067 02:12:08,820 --> 02:12:13,020 F12, if you'd like to stand up holding a 0 and null, which means that was CS50. 3068 02:12:13,020 --> 02:12:16,540 [AUDIENCE CLAPPING] 3069 02:12:16,540 --> 02:12:17,040 All right. 3070 02:12:17,040 --> 02:12:19,340 We'll see you next time. 3071 02:12:19,340 --> 02:12:54,000 [MUSIC PLAYING]
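The audience demonstration above amounts to following a linked list from its first node until a NULL pointer is reached. A minimal sketch of that traversal in C might look like this. The names `node` and `spell` are hypothetical, and ordinary pointers play the role of the seat locations.

```c
#include <assert.h>
#include <stddef.h>

// One node in the list: a character plus a pointer to the next node.
typedef struct node
{
    char value;
    struct node *next;
} node;

// Follow next pointers from head, copying each value into out.
// out must have room for max chars; at most max - 1 values are copied
// and the result is always NUL-terminated. Returns the number copied.
size_t spell(const node *head, char *out, size_t max)
{
    assert(out != NULL && max > 0);
    size_t n = 0;
    for (const node *p = head; p != NULL && n + 1 < max; p = p->next)
    {
        out[n] = p->value;
        n++;
    }
    out[n] = '\0';
    return n;
}
```

Linking four nodes holding 'C', 'S', '5', and '0', with the last node's `next` set to NULL, and passing the first to `spell` fills the buffer with the same four characters the audience members held up, in the same order.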
