All language subtitles for 2022_lecture6-720p-en

[MUSIC PLAYING]
DAVID J. MALAN: All right, this is CS50, and this is already week 6. And this is the week in which you learn yet another language. But the goal is not just to teach you another language for language's sake, as we transition today and in the coming weeks from C, where we've spent the past several weeks, now to Python. The goal ultimately is to teach you all how to teach yourselves new languages, so that by the end of this course, what's in your mind is not the fact that you learned how to program in C, or learned some weeks back how to program in Scratch, but really that you learned how to program fundamentally, in a paradigm known as procedural programming, as well as with some taste today, and in the weeks to come, of other aspects of programming languages, like object-oriented programming, and more. So recall, though, back in week zero, Hello, world looked a little something like this. And the world was quite simple. All you had to do was drag and drop these puzzle pieces. But there were still functions and conditionals and loops and variables and all of those kinds of primitives.
We then transitioned, of course, to a much more arcane language that looked a little something like this. And even now, some weeks later, you might still be struggling with some of the syntax or getting annoying bugs when you try to compile your code, and it just doesn't work. But there, too, the past few weeks, we've been focusing on functions and loops and variables, conditionals, and really all of those same ideas. And so what we begin to do today is to, one, simplify the language we're using, transitioning from C now to Python, this now being the equivalent program in Python, and look at its relative simplicity, but also transitioning to look at how you can implement these same kinds of features, just using a different language. So we're going to see a lot of code today. And you won't have nearly as much practice with Python as you did with C. But that's because so many of the ideas are still going to be with us. And, really, it's going to be a process of figuring out, all right, I want to do a loop. I know how to do it in C. How do I do this in Python? How do I do the same with conditionals?
How do I declare variables, and the like? And moving forward, not just in CS50, but in life in general, if you continue programming and learn some other language after the class, if in 5-10 years there's a new, more popular language that you pick up, it's just going to be a matter of googling and looking at websites like Stack Overflow and the like, to look at just the basic building blocks of programming languages, because, after these past 6 plus weeks, you already speak programming itself fundamentally. All right, so let's do a few quick comparisons, left and right, of what something might have looked like in Scratch, and what it then looked like in C, but now, as of today, what it's going to look like in Python. Then we'll turn our attention to the command line, ultimately, in order to implement some actual programs. So in Scratch, we had functions like this, say Hello, world, a verb or an action. In C it looked a little something like this, and a bit of a cryptic mess the first week: you had the printf, you had the double quotes. You had the semicolon, the parentheses. So there's a lot more syntax just to do the same thing.
We're not going to get rid of all of that syntax now, but as of today, in Python, that same statement is going to look a little something like this. And just to perhaps call out the obvious, what is different or, now, simpler in Python versus C, even in this simple example here? Yeah.
AUDIENCE: Now print, instead of printf would be, something like that.
DAVID J. MALAN: Good, so it's now print instead of printf. And there's also no semicolon. And there's one other subtlety, over here.
AUDIENCE: No new line.
DAVID J. MALAN: Yeah, so no new line, and that doesn't mean it's not going to be printed. It just turns out that one of the differences we'll see is that, with print, you get the new line for free. It automatically gets outputted by default, being sort of a common case. But you can override it, we'll see, ultimately, too. How about in Scratch? We had multiple functions like this, that not only said something on the screen, but also asked a question, thereby being another function that returned a value, called answer.
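As a rough sketch of the print behavior described a moment ago: the slide itself is not captured in this transcript, so the exact string is an assumption, and the io.StringIO buffer is used here only to make the trailing newline visible.

```python
import io

# Capture output in a buffer so we can inspect exactly what print() writes.
buffer = io.StringIO()

# print() adds a trailing newline by default, so no "\n" is needed.
print("hello, world", file=buffer)

# The `end` keyword argument overrides that default newline.
print("hello, world", end="", file=buffer)

captured = buffer.getvalue()
print(repr(captured))
```

The first call ends its line automatically; the second, with end="", does not.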
In C we saw code that looked a little something like this, whereby that first line declares a variable called answer, sets it equal to the return value of get_string, one of the functions from the CS50 library, and then the same double quotes and parentheses and semicolon. Then we had this format code in C that allowed us, with %s, to actually print out that same value. In Python, this, too, is going to look a little bit simpler. Instead, we're going to have answer equals get_string, quote unquote "What's your name," and then print, with a plus sign and a little bit of new syntax. But let's see if we can't just infer from this example what it is that's going on. Well, first missing on the left is what? To the left of the equal sign, there's no what this time? Feel free to just call it out.
AUDIENCE: Type.
DAVID J. MALAN: So there's no type. There's no type like the word string. Even though that was a type in CS50, for every other variable in C we did use int or string or float or bool or something else. In Python, there are still going to be data types, today onward, but you, the programmer, don't have to bother telling the computer what types you're using.
The computer is going to be smart enough, the language, really, is going to be smart enough, to just figure it out from context. Meanwhile, on the right hand side, get_string is going to be a function we'll use today and this week, which comes from a Python version of the CS50 library. But we'll also start to take off those training wheels, so that you'll see how to do things without any CS50 library moving forward, using a different function instead. As before, no semicolon, but the rest of the syntax is pretty much the same here. This starts, of course, to get a little bit different, though. We're using print instead of printf. But now, even though this looks a little cryptic, perhaps, if you've never programmed before CS50, what might that plus be doing, just based on inference here? What do you think?
AUDIENCE: Adding answer to the string Hello.
DAVID J. MALAN: Yeah, so adding answer to the string Hello, and adding, so to speak, not mathematically, but in the form of joining them together, much like we saw with the join block in Scratch, or concatenation was the term of art there.
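The concatenation just described might be sketched like this. The helper name greet and the sample name are hypothetical, introduced only so the example runs without the CS50 library or any keyboard input.

```python
def greet(answer):
    # "+" joins (concatenates) two strings; the space after the comma
    # is part of the quoted text, so the output reads naturally.
    return "hello, " + answer


# In the lecture the value comes from get_string (CS50 library); a fixed
# stand-in is used here. Note that no type is declared for `answer`.
answer = "David"
print(greet(answer))
```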
This plus sign appends, if you will, whatever's in answer to whatever is quoted here. And I deliberately left a space there, so that grammatically it looks nice, after the comma as well. Now there's another way to do this. And it, too, is going to look cryptic at first glance. But it just gets easier and more convenient over time. You can also change this second line to be this, instead. So what's going on here? This is actually a relatively new feature of Python in the past couple of years, where now what you're seeing is, yes, a string, between these same double quotes, but this is what Python would call a format string, or f-string. And it literally starts with the letter f, which admittedly looks, I think, a little weird. But that just indicates that Python should assume that anything inside of curly braces inside of the string should be interpolated, so to speak, which is a fancy term saying, substitute the value of any variables therein. And it can do some other things as well. So answer is a variable, declared, of course, on this first line. This f-string, then, says to Python, print out Hello comma space, and then the value of answer.
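A minimal sketch of the f-string form, assuming a fixed stand-in value for answer rather than real user input:

```python
answer = "David"  # stand-in for whatever the user would type

# The f prefix makes this a format string: {answer} is replaced
# with the value of the variable when the string is built.
greeting = f"hello, {answer}"
print(greeting)

# The braces can hold expressions too, not only variable names.
total = f"{1 + 2}"
print(total)
```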
If, by contrast, you omitted the curly braces, just take a guess, what would happen? What would the symptom of that bug be, if you accidentally forgot the curly braces, but maybe still had the f there?
AUDIENCE: It would print below it, too.
DAVID J. MALAN: Yeah, it would literally print Hello, comma answer, because it's going to take you literally. So the curly braces just kind of allow you to plug things in. And, again, it looks a little more cryptic, but it's just going to save us time over time. And if any of you programmed in Java in high school, for instance, you saw plus in that context, too, for concatenation. This just kind of makes your code a little tighter, a little more succinct. So it's a convenient feature now in Python. All right, this was an example in Scratch of a variable, setting a variable like counter equal to 0. In C it looked like this, where you specify the type, the name, and then the value, with a semicolon. In Python, it's going to look like this. And I'll state the obvious here. You don't need to mention the type, just like before with string. And you don't need a semicolon. So it's a little simpler.
If you want a variable, just write it and set it equal to some value. But the single equal sign still behaves the same as in C. Suppose we wanted to increment counter by one. In Scratch, we use this puzzle piece here. In C, we could do this, actually, in a few different ways. There was this way, if counter already exists, you just say counter equals counter plus 1. There was the slightly less verbose way, where you could say, oops, sorry. Let me do the first sentence first. In Python, that same thing, as you might guess, is actually going to be almost the same, you just throw away the semicolon. And the mathematics are ultimately the same, copying from right to left, via the assignment operator. Now, recall, in C, that we had this shorthand notation, which did the same thing. In Python, you can similarly do the same thing, just no need for the semicolon. The only step backwards we're taking, if you were a big fan of counter plus plus, that doesn't exist in Python, nor minus minus. You just can't do it. You have to do the plus equals 1 or minus equals 1 to achieve that same result. All right, how about in Python, too?
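The counter updates described above can be sketched as follows. The compile call is just one way to show, without crashing the script, that counter++ is rejected by Python's parser.

```python
counter = 0            # no type and no semicolon

counter = counter + 1  # right-hand side is evaluated, then assigned
counter += 1           # shorthand for the line above
counter -= 1           # decrementing works the same way

# counter++ is not valid Python; compiling it raises SyntaxError.
try:
    compile("counter++", "<sketch>", "exec")
    plus_plus_allowed = True
except SyntaxError:
    plus_plus_allowed = False

print(counter, plus_plus_allowed)
```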
Here in Scratch, recall, was a conditional, asking a silly question like is x less than y, and if so, just say as much. In C, that looked a little something like this, printf and if with the parentheses, the curly braces, the semicolon, and all of that. In Python, this is going to get a little more pleasant to type, too. It's going to be just this. And if someone wants to call out some of the obvious changes here, what has been simplified now in Python for a conditional, it would seem? Yeah, what's missing, or changed?
AUDIENCE: Braces.
DAVID J. MALAN: So no curly braces.
AUDIENCE: Colon is back.
DAVID J. MALAN: I'm sorry?
AUDIENCE: Using the colon instead.
DAVID J. MALAN: And we're using the colon instead. So I got rid of the curly braces in Python. But I'm using a colon instead. And even though this is a single line of code, so long as you indent subsequent lines along with the print, that's going to imply that everything below it should be executed if the if condition is true, until you start to un-indent and start writing a different line of code altogether. So indentation in Python is important.
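To make the role of indentation concrete, here is a small sketch. The function name compare and the message strings are assumptions, since the slide text is not in the transcript.

```python
def compare(x, y):
    messages = []
    # No curly braces: the colon opens the block, and every line
    # indented beneath the if belongs to it.
    if x < y:
        messages.append("x is less than y")
        messages.append("still inside the if")
    # Un-indenting ends the block; this line runs either way.
    messages.append("done")
    return messages


print(compare(1, 2))
print(compare(2, 1))
```

Both indented lines run only when the condition holds, while the un-indented one always runs.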
So this is among the reasons we've emphasized axes like style, just how well styled your code is. And honestly, we've seen, certainly, in office hours, and you've seen in your own code, sort of a tendency sometimes to be a little lax when it comes to indentation, right? If you're one of those folks who likes to indent everything on the left hand side of the window, yeah, it might compile and run. But it's not particularly readable by you or anyone else. Python actually addresses this by just requiring indentation, when logically needed. So Python is going to force you to start indenting properly now, if that's been, perhaps, a tendency otherwise. What else is missing? Well, we have no semicolon here. Of course, it's print instead of printf. But otherwise, those seem to be the primary differences. What about something larger in Scratch? With an if-else block, like this, you can perhaps guess what it's going to look like. In C it looks like this, curly braces, semicolons, and so forth. In Python, it's going to now look like this, almost the same, but indentation is important. The colons are important.
And there's one other difference that's now again visible here, but we didn't call it out a second ago. What else is different in Python versus C for these conditionals? Yeah.
AUDIENCE: You don't have any parentheses around the condition.
DAVID J. MALAN: Perfect. We don't have any parentheses around the condition, the Boolean expression itself. And why not? Well, it's just simpler to type. It's less to type. You can still use parentheses. And, in fact, you might want to or need to, if you want to combine thoughts and do this and that, or this or that. But by default, you no longer need or should have those parentheses. Just say what you mean. Lastly, with conditionals, we had something like this, an if else if else statement. In C, it looked a little something like this. In Python, it's going to get really tighter now. It's just if, and this is the curiosity, elif x greater than y. So it's not else if, it's literally one keyword, elif, and the colons remain now on each of the three lines. But the indentation is important.
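A sketch of the three-way conditional, wrapped in a hypothetical function named describe so that each branch can be exercised; the message strings are assumptions.

```python
def describe(x, y):
    # No parentheses are needed around the Boolean expressions,
    # and each of the three branches ends its header with a colon.
    if x < y:
        return "x is less than y"
    elif x > y:  # one keyword, not "else if"
        return "x is greater than y"
    else:
        return "x is equal to y"


print(describe(1, 2))
print(describe(2, 1))
print(describe(2, 2))
```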
And if we did want to do multiple things, we could just indent below each of these conditionals, as well. All right, let me pause there first, to see if there's any questions on these syntactic differences. Yeah.
AUDIENCE: My thought is maybe like, it's good, though, does it matter if there's this in between thing like that, but and why.
DAVID J. MALAN: In between, between what and what?
AUDIENCE: So like the left-hand side and like the right side spaces?
DAVID J. MALAN: Ah, good question, is Python sensitive to spaces and where they go? Sometimes no, sometimes yes, is the short answer. Stylistically, though, you should be practicing what we're preaching here, whereby you do have spaces to the left and right of binary operators, as they're called. Something like less than or greater than is a binary operator, because there's two operands to the left and to the right of them. And in fact, in Python, more so than the world of C, there's actually formal style conventions. Not only within CS50 have we had a style guide on the course's website, for instance, that just dictates how you should write your code so that it looks like everyone else's.
In the Python community, they take this one step further, and there's an actual standard whereby you don't have to adhere to it, but generally speaking, in the real world, someone would reprimand you, would reject your code, if you're trying to contribute it to another project, if you don't adhere to these standards. So while you could be lax with some of this white space, do make things readable. And that's Python's theme, for the code to be as readable as possible. All right, so let's take a look at a couple of other constructs before transitioning to some actual code. This, of course, in Scratch was a loop, meowing forever. In C, the closest we could get was doing something while true, because true never changes. So it's sort of a simple way of just saying do this forever. In Python, it's pretty much the same thing, but a couple of small differences here. The parentheses are gone. The colon is there. The indentation is there. No semicolon, and there's one other subtle difference. What do you see?
AUDIENCE: True is capitalized?
DAVID J. MALAN: True is capitalized, just because. Both True and False are Boolean values in Python.
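A sketch of the forever loop follows. The break is an addition that is not part of the construct being described; it is there only so that this example terminates when run.

```python
meows = []

# True is capitalized in Python; no parentheses, but a colon and indentation.
while True:
    meows.append("meow")
    # A genuine "forever" loop has no exit; this break exists only
    # so the sketch stops after three iterations.
    if len(meows) == 3:
        break

print(meows)
```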
But you've got to start capitalizing them, just because. All right, how about a loop like this, where you repeat something a finite number of times, like meowing three times? In C, we could do this a few different ways. There's this very mechanical way, where you initialize a variable like i to zero. You then use a while loop and check if i is less than 3, the total number of times you want to meow. Then you print what you want to print. You increment i using this syntax, or the longer, more verbose syntax, with plus equals or whatnot. And then you do it again and again and again. In Python, you can do it functionally the same way, same idea, slightly different syntax. You just don't bother saying what type of variable you want. Python will infer from the fact that there's a 0 right there. You don't need the parentheses. You do need the colon. You do need the indentation. You can't do the i plus plus, but you can do this other technique, as we could have done in C, as well. How else might we do this, though, too? Well,
it turns out in C, we could do something like this, which, again, sort of cryptic at first glance, became perhaps more familiar, where you have initialization, a conditional, and then an update that you do after each iteration. In Python, there isn't really an analog. There is no analog in Python, where you have the parentheses and the multiple semicolons in the same line. Instead, there is a for loop, but it's meant to read a little more like English, for i in 0, 1, and 2. So we'll see in a bit, these square brackets represent an array, now to be called a list in Python. So lists in Python are more like linked lists than they are arrays. More on that soon. So this just means for i in the following list of three values. And on each iteration of this loop, Python automatically, for you, first sets i to zero. Then it sets i to one. Then it sets i to two, so that you effectively do things three times. But this doesn't necessarily scale, as I've drawn it on the board. Suppose you took this at face value as the way you iterate some number of times in Python, using a for loop. At what point does this approach perhaps get bad, or bad design? Let me give folks just a moment to think. Yeah, in back.
365 00:16:36,415 --> 00:16:39,082 AUDIENCE: If you don't know how many times, last time, you know, 366 00:16:39,082 --> 00:16:41,083 you've got the link in there. 367 00:16:41,083 --> 00:16:43,500 DAVID J. MALAN: Sure, if you don't know how many times you 368 00:16:43,500 --> 00:16:47,460 want to loop or iterate, you can't really create a hard-coded list 369 00:16:47,460 --> 00:16:48,750 like that, of 0, 1, 2. 370 00:16:48,750 --> 00:16:50,323 Other thoughts? 371 00:16:50,323 --> 00:16:52,990 AUDIENCE: So you want to say raise a large number of allowances. 372 00:16:52,990 --> 00:16:55,740 DAVID J. MALAN: Yeah, if you're iterating a large number of times, 373 00:16:55,740 --> 00:16:57,640 this list is going to get longer and longer, 374 00:16:57,640 --> 00:16:59,932 and you're just kind of stupidly going to be typing out 375 00:16:59,932 --> 00:17:03,660 like comma 3, comma 4, comma 5, comma dot dot dot, comma 99, comma 100. 376 00:17:03,660 --> 00:17:06,160 I mean, your code would start to look atrocious, eventually. 377 00:17:06,160 --> 00:17:07,510 So there is a better way. 378 00:17:07,510 --> 00:17:10,359 In Python, there is a function, or technically a type, 379 00:17:10,359 --> 00:17:14,530 called range, that essentially magically gives you back a range of values 380 00:17:14,530 --> 00:17:17,599 from 0 on up to, but not through a value. 381 00:17:17,599 --> 00:17:21,609 So the effect of this line of code, for i in the following range, 382 00:17:21,609 --> 00:17:24,484 essentially hands you back a list of three values, 383 00:17:24,484 --> 00:17:26,359 thereby letting you do something three times. 384 00:17:26,359 --> 00:17:29,067 And if you want to do something 99 times instead, you, of course, 385 00:17:29,067 --> 00:17:30,575 just change the 3 to a 99. 386 00:17:30,575 --> 00:17:31,075 Question. 
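The three looping styles described so far can be sketched side by side. This is a minimal illustration rather than the code shown on the lecture slides; the function names meow_while, meow_list, and meow_range are made up for this example, and each collects its output in a list so the results are easy to compare.

```python
def meow_while(n):
    """Mechanical approach: manage the counter yourself, as in the C while loop."""
    sounds = []
    i = 0                      # no type declaration; Python infers int from the 0
    while i < n:               # no parentheses needed, but the colon and indentation are
        sounds.append("meow")
        i += 1                 # Python has no i++, so use += 1
    return sounds


def meow_list():
    """Loop over a hard-coded list of three values."""
    sounds = []
    for i in [0, 1, 2]:        # i is set to 0, then 1, then 2
        sounds.append("meow")
    return sounds


def meow_range(n):
    """Let range supply the values 0 up to, but not including, n."""
    sounds = []
    for i in range(n):
        sounds.append("meow")
    return sounds


for sound in meow_range(3):
    print(sound)
```

Changing the argument from 3 to 99 is the only edit needed to loop 99 times, which is the scaling problem the hard-coded list could not solve.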
387 00:17:31,075 --> 00:17:35,090 AUDIENCE: Is there a way to start the beginning point of that range 388 00:17:35,090 --> 00:17:39,410 at a number or an integer that's higher than zero, or is there never a really 389 00:17:39,410 --> 00:17:40,460 any point to do so? 390 00:17:40,460 --> 00:17:41,540 DAVID J. MALAN: A really good question, can 391 00:17:41,540 --> 00:17:43,440 you start counting at a higher number. 392 00:17:43,440 --> 00:17:46,910 So not 0, which is the implied default, but something larger than that. 393 00:17:46,910 --> 00:17:51,560 Yes, so it turns out the range function takes multiple arguments, not just one 394 00:17:51,560 --> 00:17:54,998 but maybe two or even three, that allows you to customize this behavior. 395 00:17:54,998 --> 00:17:56,540 So you can customize where it begins. 396 00:17:56,540 --> 00:17:57,920 You can customize the increment. 397 00:17:57,920 --> 00:17:59,712 By default, it's one, but if you want to do 398 00:17:59,712 --> 00:18:02,582 every two values, for like evens or odds, you could do that as well, 399 00:18:02,582 --> 00:18:03,540 and a few other things. 400 00:18:03,540 --> 00:18:05,930 And before long, we'll take a look at some Python documentation 401 00:18:05,930 --> 00:18:08,810 that will become your authoritative source for answers like that. 402 00:18:08,810 --> 00:18:10,790 Like, what can this function do. 403 00:18:10,790 --> 00:18:15,020 Other questions on this thus far? 404 00:18:15,020 --> 00:18:19,980 Seeing none, so what else might we compare and contrast here. 405 00:18:19,980 --> 00:18:24,320 Well, in the world of C, recall that we had a whole bunch of built-in data 406 00:18:24,320 --> 00:18:28,310 types, like these here, Bool and char and double and float, and so forth, 407 00:18:28,310 --> 00:18:31,670 string, which happened to come from the CS50 library. 
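The answer to the audience question about starting a range above zero can be sketched as follows. The particular start, stop, and step values here are arbitrary choices for illustration; range is wrapped in list only so the values can be printed and compared.

```python
# One argument: start defaults to 0 and the step defaults to 1.
print(list(range(3)))

# Two arguments: an explicit start; the stop value is still excluded.
print(list(range(1, 4)))

# Three arguments: start, stop, and step, for every second value.
evens = list(range(0, 10, 2))
odds = list(range(1, 10, 2))
print(evens)
print(odds)
```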
408 00:18:31,670 --> 00:18:35,990 But the language C itself certainly understood the idea of strings, 409 00:18:35,990 --> 00:18:40,700 because the backslash 0, the support for %s in printf, that's all native, 410 00:18:40,700 --> 00:18:43,370 built into C, not a CS50 simplification. 411 00:18:43,370 --> 00:18:45,620 All we did, and revealed, as of a couple of weeks 412 00:18:45,620 --> 00:18:48,050 ago, is that string, this data type, is just 413 00:18:48,050 --> 00:18:52,730 a synonym, or a typedef, for char star, which is part of the language natively. 414 00:18:52,730 --> 00:18:55,610 In Python now, this list actually gets a little shorter, at least 415 00:18:55,610 --> 00:18:57,443 for these common primitive data types. 416 00:18:57,443 --> 00:19:00,110 Still going to have bools, we're going to have floats, and ints, 417 00:19:00,110 --> 00:19:02,600 and we're going to have strings, but we're going to call them strs. 418 00:19:02,600 --> 00:19:04,760 And this is not a CS50 thing from the library, 419 00:19:04,760 --> 00:19:08,300 str, S-T-R, is, in fact, a data type in Python, 420 00:19:08,300 --> 00:19:12,260 that's going to do a lot more than strings did for us automatically in C. 421 00:19:12,260 --> 00:19:17,133 Ints and floats, meanwhile, don't need the corresponding longs and doubles, 422 00:19:17,133 --> 00:19:19,550 because, in fact, among the problems Python solves for us, 423 00:19:19,550 --> 00:19:22,340 too, ints can get as big as you want. 424 00:19:22,340 --> 00:19:25,220 Integer overflow is no longer going to be an issue. 425 00:19:25,220 --> 00:19:27,950 Per week 1, the language solves that for us. 426 00:19:27,950 --> 00:19:29,790 Floating point imprecision, unfortunately, 427 00:19:29,790 --> 00:19:31,190 is still a problem that remains.
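A short sketch of those two claims: ints do not overflow, but floats remain imprecise. The use of the standard-library decimal module at the end is this example's own choice of one library that sidesteps the imprecision; the lecture does not name a specific library at this point.

```python
from decimal import Decimal

# The largest value a 32-bit signed int can hold in C.
c_int_max = 2147483647

# In C, adding 1 here would overflow; a Python int simply keeps growing.
bigger = c_int_max + 1
print(bigger)

# The built-in type names mentioned in the lecture.
print(type(bigger).__name__, type(3.14).__name__, type("hi").__name__, type(True).__name__)

# Floating-point imprecision is still present.
print(0.1 + 0.2 == 0.3)

# Decimal stores base-10 digits exactly, so the same sum compares equal.
exact = Decimal("0.1") + Decimal("0.2")
print(exact == Decimal("0.3"))
```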
428 00:19:31,190 --> 00:19:34,730 But there are libraries, code that other people have written, as we briefly 429 00:19:34,730 --> 00:19:37,010 discussed in weeks past, that allow you to do 430 00:19:37,010 --> 00:19:40,250 scientific or financial computing, using libraries that build 431 00:19:40,250 --> 00:19:42,625 on top of these data types, as well. 432 00:19:42,625 --> 00:19:45,500 So there's other data types, too, in Python, which we'll see actually 433 00:19:45,500 --> 00:19:48,710 gives us a whole bunch of more power and capability, 434 00:19:48,710 --> 00:19:51,500 things called ranges, like we just saw, lists, 435 00:19:51,500 --> 00:19:54,080 like I called out verbally, with the square brackets, 436 00:19:54,080 --> 00:19:56,900 things called tuples, for things like x comma y, 437 00:19:56,900 --> 00:20:00,305 or latitude, longitude, dictionaries, or Dicts, 438 00:20:00,305 --> 00:20:03,740 which allow you to store keys and values, much like our hash tables 439 00:20:03,740 --> 00:20:06,973 from last time, and then sets in the mathematical sense, where they filter 440 00:20:06,973 --> 00:20:09,890 out duplicates for you, and you can just put a whole bunch of numbers, 441 00:20:09,890 --> 00:20:13,910 a whole bunch of words or whatnot, and the language, via this data type, 442 00:20:13,910 --> 00:20:16,400 will filter out duplicates for you. 443 00:20:16,400 --> 00:20:19,985 Now there's going to be a few functions we give you this week and beyond, 444 00:20:19,985 --> 00:20:22,610 training wheels that we're then going to very quickly take off, 445 00:20:22,610 --> 00:20:26,060 just because, as we'll see today, they just simplify the process of getting 446 00:20:26,060 --> 00:20:29,205 user input correctly, without accidentally writing buggy code, 447 00:20:29,205 --> 00:20:32,330 just when you're trying to get Hello, World, or something similar, to work. 
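The additional data types just listed can be sketched in a few lines. All of the values below (the coordinates, the animal sounds, the words) are made up for illustration.

```python
# range: a sequence of values from 0 up to, but not including, 3
values = range(3)

# list: written with square brackets
numbers = [0, 1, 2]

# tuple: a fixed group of values, such as x comma y
point = (3, 4)

# dict: keys mapped to values, much like a hash table
sounds = {"cat": "meow", "dog": "woof"}

# set: duplicates are filtered out automatically
unique = set(["cat", "dog", "cat", "bird", "dog"])

print(list(values) == numbers)
print(point[0], point[1])
print(sounds["cat"])
print(len(unique))
```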
448 00:20:32,330 --> 00:20:36,050 And we'll give you functions, not like, not as long as this list in C, 449 00:20:36,050 --> 00:20:38,630 but a subset of these, get_float, get_int, 450 00:20:38,630 --> 00:20:41,660 and get_string, that'll automate the process of getting 451 00:20:41,660 --> 00:20:45,410 user input in a way that's more resilient against potential bugs. 452 00:20:45,410 --> 00:20:47,270 But we'll see what those bugs might be. 453 00:20:47,270 --> 00:20:50,120 And the way we're going to do this is similar in spirit to C. 454 00:20:50,120 --> 00:20:54,380 Instead of doing include cs50.h, like we did in C, 455 00:20:54,380 --> 00:20:57,290 you're going to now start saying import cs50. 456 00:20:57,290 --> 00:21:00,560 Python supports, similar to C, libraries, 457 00:21:00,560 --> 00:21:02,300 but there aren't header files anymore. 458 00:21:02,300 --> 00:21:05,090 You just use the name of the library in Python. 459 00:21:05,090 --> 00:21:08,450 And if you want to import CS50's functions, you just say import cs50. 460 00:21:08,450 --> 00:21:12,470 Or, if you want to be more precise, and not just import the whole thing, which 461 00:21:12,470 --> 00:21:15,860 could be slow, if you've got a really big library with a lot of functionality 462 00:21:15,860 --> 00:21:19,730 in it, you can be more precise and say from cs50 import get_float. 463 00:21:19,730 --> 00:21:23,480 From cs50 import get_int, from cs50 import get_string, 464 00:21:23,480 --> 00:21:26,270 or you can just separate them by commas and import 3 465 00:21:26,270 --> 00:21:30,550 and only 3 things from a particular library, like ours. 466 00:21:30,550 --> 00:21:32,300 But starting today and onward, we're going 467 00:21:32,300 --> 00:21:35,450 to start making much more heavy use of libraries, code 468 00:21:35,450 --> 00:21:38,570 that other people wrote, so that we're no longer reinventing the wheel.
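The two import styles can be sketched as below. Because the cs50 library may not be installed everywhere, the runnable lines use the standard-library math module as a stand-in; the cs50 forms described in the lecture appear only as comments.

```python
# The forms described in the lecture (left as comments, since they need the cs50 package):
#   import cs50
#   from cs50 import get_float, get_int, get_string

# Style 1: import the whole library, then prefix each use with its name.
import math

print(math.sqrt(16))

# Style 2: import only specific names, separated by commas, and use them unprefixed.
from math import floor, sqrt

print(sqrt(16))
print(floor(2.7))
```

With the first style, a cs50 function would be called as cs50.get_int; with the second, simply as get_int.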
469 00:21:38,570 --> 00:21:41,875 We're not making our own linked lists, our own trees, our own dictionaries. 470 00:21:41,875 --> 00:21:44,250 We're going to start standing on the shoulders of others, 471 00:21:44,250 --> 00:21:47,120 so that you can get real work done, so to speak, faster, 472 00:21:47,120 --> 00:21:51,710 by building your software on top of others' code as well. 473 00:21:51,710 --> 00:21:55,110 All right, so that's it for the syntactic tour of the language, 474 00:21:55,110 --> 00:21:56,360 and the sort of core features. 475 00:21:56,360 --> 00:21:58,320 Soon we'll transition to application thereof. 476 00:21:58,320 --> 00:22:04,040 But let me pause here to see if there's any questions on syntax or primitives 477 00:22:04,040 --> 00:22:10,340 or otherwise. 478 00:22:10,340 --> 00:22:12,204 Oh, yes, in back. 479 00:22:12,204 --> 00:22:16,163 AUDIENCE: Why doesn't Python have the increment operators? 480 00:22:16,163 --> 00:22:18,330 DAVID J. MALAN: I'm sorry, say it again, why doesn't 481 00:22:18,330 --> 00:22:19,788 Python have what kind of operators? 482 00:22:19,788 --> 00:22:22,578 AUDIENCE: Why doesn't Python have the increment operator? 483 00:22:22,578 --> 00:22:25,620 DAVID J. MALAN: Sorry, someone coughed when you said something operators. 484 00:22:25,620 --> 00:22:26,948 AUDIENCE: The increment. 485 00:22:26,948 --> 00:22:28,740 DAVID J. MALAN: Oh, the increment operator? 486 00:22:28,740 --> 00:22:30,407 I'd have to check the history, honestly. 487 00:22:30,407 --> 00:22:32,910 Python has tended to be a fairly minimalist language. 488 00:22:32,910 --> 00:22:36,090 And if you can do something one way, the community, arguably, 489 00:22:36,090 --> 00:22:40,145 has tended to not give you multiple ways to do the same thing syntactically. 490 00:22:40,145 --> 00:22:41,520 There's probably a better answer. 491 00:22:41,520 --> 00:22:45,840 And I'll see if I can dig in and post something online, to follow up on that.
492 00:22:45,840 --> 00:22:49,870 All right, so before we transition to now writing some actual code, 493 00:22:49,870 --> 00:22:54,870 let me go ahead and consider exactly how we're going to write code. 494 00:22:54,870 --> 00:22:58,770 In the world of C, recall that it's generally been a two-step process. 495 00:22:58,770 --> 00:23:04,230 We create a file called like hello.c, and then, step one, make hello, step two, 496 00:23:04,230 --> 00:23:05,400 ./hello. 497 00:23:05,400 --> 00:23:08,130 Or, if you think back to week two, when we sort of peeled back 498 00:23:08,130 --> 00:23:11,100 the layer of what hello, of what make was doing, 499 00:23:11,100 --> 00:23:14,310 you could more verbosely type out the name of the actual compiler, 500 00:23:14,310 --> 00:23:17,640 clang in our case, command line arguments like -o hello, 501 00:23:17,640 --> 00:23:19,840 to specify what name you want to create. 502 00:23:19,840 --> 00:23:21,660 And then you can specify the file name. 503 00:23:21,660 --> 00:23:25,050 And then you can specify what libraries you want to link in. 504 00:23:25,050 --> 00:23:26,550 So that was a very verbose approach. 505 00:23:26,550 --> 00:23:28,930 But it was always a two-step approach. 506 00:23:28,930 --> 00:23:31,680 And so, even as you've been doing recent problem sets, 507 00:23:31,680 --> 00:23:35,400 odds are you've realized that, any time you want to make a change to your code, 508 00:23:35,400 --> 00:23:39,660 or make a change to your code and try and test your code again, 509 00:23:39,660 --> 00:23:42,360 you're constantly doing those two steps. 510 00:23:42,360 --> 00:23:45,840 Moving forward in Python, it's going to become simpler, 511 00:23:45,840 --> 00:23:47,610 and it's going to be just this. 512 00:23:47,610 --> 00:23:50,460 The file name is going to change, but that might go without saying. 513 00:23:50,460 --> 00:23:55,260 It's going to be something like hello.py, P-Y, instead of hello.c.
514 00:23:55,260 --> 00:23:57,990 And that's just a convention, using a different file extension. 515 00:23:57,990 --> 00:24:00,780 But there's no compilation step per se. 516 00:24:00,780 --> 00:24:04,170 You jump right to the execution of your code. 517 00:24:04,170 --> 00:24:07,200 And so Python, it turns out, is the name, not only of the language 518 00:24:07,200 --> 00:24:12,150 we're going to start using, it's also the name of a program on a Mac, a PC, 519 00:24:12,150 --> 00:24:16,020 assuming it's been pre-installed, that interprets the language for you. 520 00:24:16,020 --> 00:24:20,100 This is to say that Python is generally described as being interpreted, 521 00:24:20,100 --> 00:24:21,360 not compiled. 522 00:24:21,360 --> 00:24:25,170 And by that, I mean you get to skip, from the programmer's perspective, 523 00:24:25,170 --> 00:24:26,370 that compilation step. 524 00:24:26,370 --> 00:24:30,870 There is no manual step in the world of Python, typically, of writing your code 525 00:24:30,870 --> 00:24:34,530 and then compiling it to zeros and ones, and then running the zeros and ones. 526 00:24:34,530 --> 00:24:36,870 Instead, these kind of two steps get collapsed 527 00:24:36,870 --> 00:24:42,570 into the illusion of one, whereby you, instead, are able to just run the code, 528 00:24:42,570 --> 00:24:46,200 and let the computer figure out how to actually convert it 529 00:24:46,200 --> 00:24:48,240 to something the computer understands. 530 00:24:48,240 --> 00:24:51,850 And the way we do that is via this old process, input and output. 531 00:24:51,850 --> 00:24:53,910 But now, when you have source code, it's going 532 00:24:53,910 --> 00:24:56,850 to be passed into an interpreter, not a compiler. 
533 00:24:56,850 --> 00:24:59,400 And the best analog of this is just to perhaps point out 534 00:24:59,400 --> 00:25:01,950 that, in the human world, if you speak, or don't speak, 535 00:25:01,950 --> 00:25:05,640 multiple human languages, it can be a pretty slow process from going 536 00:25:05,640 --> 00:25:07,270 from one language to another. 537 00:25:07,270 --> 00:25:10,170 For instance, here are step-by-step instructions for finding someone 538 00:25:10,170 --> 00:25:12,540 in a phone book, unfortunately, in Spanish. 539 00:25:12,540 --> 00:25:15,360 Unfortunately, if you don't speak or read Spanish. 540 00:25:15,360 --> 00:25:16,560 You could figure this out. 541 00:25:16,560 --> 00:25:19,380 You could run this algorithm, but you're going to have to do some googling, 542 00:25:19,380 --> 00:25:22,130 or you're going to have to open up literal dictionary from Spanish 543 00:25:22,130 --> 00:25:23,460 to English and convert this. 544 00:25:23,460 --> 00:25:27,060 And the catch with translating any language, human or computer 545 00:25:27,060 --> 00:25:30,850 or otherwise, is that you're going to pay a price, typically some time. 546 00:25:30,850 --> 00:25:33,840 And so converting this in Spanish to this in English 547 00:25:33,840 --> 00:25:36,360 is just going to take you longer than if this were already 548 00:25:36,360 --> 00:25:38,453 in your native language. 549 00:25:38,453 --> 00:25:41,370 And that's going to be one of the subtleties with the world of Python. 550 00:25:41,370 --> 00:25:45,180 Yes, it's a feature that you can just run the code without having 551 00:25:45,180 --> 00:25:47,880 to bother compiling it manually first. 552 00:25:47,880 --> 00:25:49,050 But we might pay a price. 553 00:25:49,050 --> 00:25:50,815 And things might be a little slower. 554 00:25:50,815 --> 00:25:52,440 Now, there's ways to chip away at that. 555 00:25:52,440 --> 00:25:53,815 But we'll see an example thereof. 
556 00:25:53,815 --> 00:25:56,700 In fact, let me transition now to just a couple of examples 557 00:25:56,700 --> 00:26:00,660 that demonstrate how Python is not only easier for many people 558 00:26:00,660 --> 00:26:03,240 to use, perhaps yourselves too, because it throws away 559 00:26:03,240 --> 00:26:06,120 a lot of the annoying syntax, it shortens the number of lines 560 00:26:06,120 --> 00:26:09,810 you have to write, and also it comes with so many darn libraries, 561 00:26:09,810 --> 00:26:14,740 you can just do so much more without having to write the code yourself. 562 00:26:14,740 --> 00:26:17,670 So, as an example of this, let me switch over here 563 00:26:17,670 --> 00:26:24,090 to this image from problem set 4, which is the Weeks Bridge down by the Charles 564 00:26:24,090 --> 00:26:25,290 River here in Cambridge. 565 00:26:25,290 --> 00:26:27,245 And this is the original photo, pretty clear, 566 00:26:27,245 --> 00:26:30,370 and it's even higher res if we looked at the original version of the photo. 567 00:26:30,370 --> 00:26:33,660 But there have been no filters, a la Instagram, applied to this photo. 568 00:26:33,660 --> 00:26:36,750 Recall, for problem set four, you had to implement a few filters. 569 00:26:36,750 --> 00:26:38,460 And among them might have been blur. 570 00:26:38,460 --> 00:26:41,610 And blur was probably among the more challenging of the ones, 571 00:26:41,610 --> 00:26:44,190 because you had to iterate over all of the pixels, 572 00:26:44,190 --> 00:26:47,130 you had to take into account what's above, what's below, to the left, 573 00:26:47,130 --> 00:26:47,490 to the right. 574 00:26:47,490 --> 00:26:49,448 I mean, there was a lot of math and arithmetic. 575 00:26:49,448 --> 00:26:52,620 And if you ultimately got it, it was probably a great sense of satisfaction. 576 00:26:52,620 --> 00:26:54,780 But that was probably several hours later. 
577 00:26:54,780 --> 00:26:57,540 In a language like Python, where there might 578 00:26:57,540 --> 00:27:01,170 be libraries that had been written by others, on whose shoulders 579 00:27:01,170 --> 00:27:03,880 you can stand, we could perhaps do something like this. 580 00:27:03,880 --> 00:27:08,280 Let me go ahead and run a program, or write a program, called Blur.py here. 581 00:27:08,280 --> 00:27:12,130 And in Blur.py, in VS Code, let me just do this. 582 00:27:12,130 --> 00:27:15,370 Let me import from a library, not the CS50 library, 583 00:27:15,370 --> 00:27:19,620 but the Pillow library, so to speak, a keyword called image 584 00:27:19,620 --> 00:27:23,330 and another one called image filter, then let me go ahead 585 00:27:23,330 --> 00:27:26,420 and say, let me open the current version of this image, which 586 00:27:26,420 --> 00:27:27,740 is called Bridge.bmp. 587 00:27:27,740 --> 00:27:30,260 So the before version of the image will be 588 00:27:30,260 --> 00:27:34,550 the result of calling image.open quote unquote "Bridge.bmp," 589 00:27:34,550 --> 00:27:37,040 and then, let me create an after version. 590 00:27:37,040 --> 00:27:38,840 So you'll see before and after. 591 00:27:38,840 --> 00:27:45,010 After equals the before version .filter of image filter. 592 00:27:45,010 --> 00:27:46,760 And there is, if I read the documentation, 593 00:27:46,760 --> 00:27:49,052 I'll see that there's something called a box blur, that 594 00:27:49,052 --> 00:27:52,160 allows you to blur in box format, like one pixel above, 595 00:27:52,160 --> 00:27:53,750 below, left, and right. 596 00:27:53,750 --> 00:27:55,367 So I'll do one pixel there. 597 00:27:55,367 --> 00:27:57,950 And then, after that's done, let me go ahead and save the file 598 00:27:57,950 --> 00:28:01,070 as something like Out.bmp. 599 00:28:01,070 --> 00:28:02,180 That's it. 
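The program just dictated can be sketched as follows, assuming the Pillow library is installed. The lecture's version is the same three calls written at the top level of the file; wrapping them in a function named blur, with source, destination, and radius parameters, is this example's own choice so that it can be reused with other file names.

```python
from PIL import Image, ImageFilter


def blur(source, destination, radius=1):
    """Open an image, apply a box blur of the given radius, and save the result."""
    before = Image.open(source)                          # the "before" image
    after = before.filter(ImageFilter.BoxBlur(radius))   # blur using neighboring pixels
    after.save(destination)                              # write the "after" image


# Usage, assuming a file named bridge.bmp sits in the current directory:
# blur("bridge.bmp", "out.bmp", 1)
```

A larger radius averages over a larger box of neighboring pixels, so the blur becomes more pronounced.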
600 00:28:02,180 --> 00:28:04,910 Assuming this library works as described, 601 00:28:04,910 --> 00:28:08,060 I am opening the file in Python, using line 3. 602 00:28:08,060 --> 00:28:09,680 And this is somewhat new syntax. 603 00:28:09,680 --> 00:28:13,250 In the world of Python, we're going to start making use of the dot operator 604 00:28:13,250 --> 00:28:15,320 more, because in the world of Python, you have 605 00:28:15,320 --> 00:28:19,700 what's called object-oriented programming, or OOP, as a term of art. 606 00:28:19,700 --> 00:28:22,470 And what this means is that you still have functions, 607 00:28:22,470 --> 00:28:24,980 you still have variables, but sometimes those functions 608 00:28:24,980 --> 00:28:28,850 are embedded inside of the variables, or, more specifically, 609 00:28:28,850 --> 00:28:30,710 inside of the data types themselves. 610 00:28:30,710 --> 00:28:34,430 Think back to C. When you wanted to convert something to uppercase, 611 00:28:34,430 --> 00:28:38,582 there was a toupper function that takes as input an argument that's a char. 612 00:28:38,582 --> 00:28:41,540 And you can pass in any char you want, and it will uppercase it for you 613 00:28:41,540 --> 00:28:42,890 and give you back a value. 614 00:28:42,890 --> 00:28:46,160 Well, you know what, if that's such a common paradigm, where 615 00:28:46,160 --> 00:28:49,850 upper-casing chars is a useful thing, what the world of Python does 616 00:28:49,850 --> 00:28:54,470 is it embeds into the string data type, or char if you will, 617 00:28:54,470 --> 00:28:59,240 the ability just to uppercase any char by treating the char, or the string, 618 00:28:59,240 --> 00:29:02,150 as though it's a struct in C. Recall that structs 619 00:29:02,150 --> 00:29:04,400 encapsulate multiple types of values. 620 00:29:04,400 --> 00:29:07,610 In object-oriented programming, in a language like Python, 621 00:29:07,610 --> 00:29:11,510 you can encapsulate not just values, but also functionality.
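A minimal sketch of that idea, using the upper method that Python builds into its str type. The variable names greeting and shouted are made up for this example.

```python
greeting = "hello, world"

# In C, you would pass each char to toupper, one at a time.
# In Python, the function lives inside the string itself and is reached with a dot.
shouted = greeting.upper()

print(shouted)
print(greeting)   # the original string is left unchanged; upper returns a new one
```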
622 00:29:11,510 --> 00:29:13,818 Functions can now be inside of structs. 623 00:29:13,818 --> 00:29:15,860 But we're not going to call them structs anymore. 624 00:29:15,860 --> 00:29:17,270 We're going to call them objects. 625 00:29:17,270 --> 00:29:19,130 But that's just a different vernacular. 626 00:29:19,130 --> 00:29:20,870 So what am I doing here? 627 00:29:20,870 --> 00:29:23,870 Inside of the Image library, there's a function called open, 628 00:29:23,870 --> 00:29:26,630 and it takes an argument, the name of the file, to open. 629 00:29:26,630 --> 00:29:30,260 Once I have a variable called before, that is a struct, or technically 630 00:29:30,260 --> 00:29:33,290 an object, inside of which is now, because it 631 00:29:33,290 --> 00:29:36,140 was returned from this function, a function 632 00:29:36,140 --> 00:29:38,280 called filter, that takes an argument. 633 00:29:38,280 --> 00:29:41,660 The argument here happens to be ImageFilter.BoxBlur(1), 634 00:29:41,660 --> 00:29:42,830 which itself is a function. 635 00:29:42,830 --> 00:29:44,803 But it just returns the filter to use. 636 00:29:44,803 --> 00:29:46,970 And then after.save does what you might think. 637 00:29:46,970 --> 00:29:48,150 It just saves the file. 638 00:29:48,150 --> 00:29:51,470 So instead of using fopen and fwrite, you just say dot save, 639 00:29:51,470 --> 00:29:54,510 and that does all of that messy work for you. 640 00:29:54,510 --> 00:29:57,230 So it's just, what, four lines of code total? 641 00:29:57,230 --> 00:30:00,240 Let me go ahead and go down to my terminal window. 642 00:30:00,240 --> 00:30:03,533 Let me go ahead and show you with ls that, at the moment, 643 00:30:03,533 --> 00:30:05,450 whoops, sorry, let me not bother showing that, 644 00:30:05,450 --> 00:30:07,160 because I have other examples to come. 645 00:30:07,160 --> 00:30:14,310 I'm going to go ahead and do python of Blur.py, nope, sorry, wrong place.
646 00:30:14,310 --> 00:30:15,570 I did need to make a command. 647 00:30:15,570 --> 00:30:16,280 There we go. 648 00:30:16,280 --> 00:30:19,340 OK, let me go ahead and type LS inside of my filter directory, which 649 00:30:19,340 --> 00:30:21,560 is among the sample code online today. 650 00:30:21,560 --> 00:30:24,800 There's only one file called Bridge.bmp, dammit, 651 00:30:24,800 --> 00:30:27,630 I'm trying to get these things ready at the same time. 652 00:30:27,630 --> 00:30:28,730 Let me rewind. 653 00:30:28,730 --> 00:30:32,120 Let me move this code into place. 654 00:30:32,120 --> 00:30:34,710 All right, I've gone ahead and moved this file, Blur.py, 655 00:30:34,710 --> 00:30:37,190 into a folder called filter, inside of which 656 00:30:37,190 --> 00:30:42,080 there's another file called Bridge.bmp, which we can confer with LS. 657 00:30:42,080 --> 00:30:44,390 Let me now go ahead and run Python, which 658 00:30:44,390 --> 00:30:46,700 is my interpreter, and also the name of the language, 659 00:30:46,700 --> 00:30:48,990 and run Python on this file. 660 00:30:48,990 --> 00:30:51,348 So much like running the Spanish algorithm 661 00:30:51,348 --> 00:30:53,390 through Google Translate, or something like that, 662 00:30:53,390 --> 00:30:55,650 as input, to get back the English output, 663 00:30:55,650 --> 00:30:59,540 this is going to translate the Python language to something 664 00:30:59,540 --> 00:31:01,760 this computer, or this cloud-based environment, 665 00:31:01,760 --> 00:31:05,070 understands, and then run the corresponding code, top to bottom, 666 00:31:05,070 --> 00:31:05,707 left to right. 667 00:31:05,707 --> 00:31:07,040 I'm going to go ahead and Enter. 668 00:31:07,040 --> 00:31:08,930 No error message is generally a good thing. 669 00:31:08,930 --> 00:31:11,960 If I type LS you'll now see out.bmp. 670 00:31:11,960 --> 00:31:13,295 Let me go ahead and open that. 
671 00:31:13,295 --> 00:31:15,920 And, you know what, just to make clear what's really happening, 672 00:31:15,920 --> 00:31:17,087 let me blur it even further. 673 00:31:17,087 --> 00:31:20,550 Let's make a box that's not just one pixel around, but 10. 674 00:31:20,550 --> 00:31:21,950 So let's make that change. 675 00:31:21,950 --> 00:31:24,830 And let me just go ahead and rerun it with Python of Blur.py. 676 00:31:24,830 --> 00:31:27,320 I still have Out.bmp. 677 00:31:27,320 --> 00:31:32,100 Let me go ahead and open Out.bmp and show you first the before, 678 00:31:32,100 --> 00:31:33,680 which looks like this. 679 00:31:33,680 --> 00:31:34,550 That's the original. 680 00:31:34,550 --> 00:31:37,820 And now, crossing my fingers, four lines of code later, 681 00:31:37,820 --> 00:31:39,758 the result of blurring it, as well. 682 00:31:39,758 --> 00:31:42,050 So the library is doing all of the same kind of legwork 683 00:31:42,050 --> 00:31:44,120 that you all did for the assignment, but it's 684 00:31:44,120 --> 00:31:48,303 encapsulated it all into a single library, that you can then use instead. 685 00:31:48,303 --> 00:31:50,720 Those of you who might have been feeling more comfortable, 686 00:31:50,720 --> 00:31:52,595 might have done a little something like this. 687 00:31:52,595 --> 00:31:56,900 Let me go ahead and open up one other file, called Edges.py. 688 00:31:56,900 --> 00:32:00,290 And in Edges.py, I'm again going to import from the Pillow library 689 00:32:00,290 --> 00:32:03,010 the image keyword, and the image filter. 
690 00:32:03,010 --> 00:32:05,510 Then I'm going to go ahead and create a before image, that's 691 00:32:05,510 --> 00:32:09,590 a result of calling Image.open of the same thing, Bridge.bmp, 692 00:32:09,590 --> 00:32:16,910 then I'm going to go ahead and run a filter on that, called Image, whoops, 693 00:32:16,910 --> 00:32:21,850 ImageFilter.FIND_EDGES, which is like a constant, if you will, 694 00:32:21,850 --> 00:32:23,708 defined inside of this library for us. 695 00:32:23,708 --> 00:32:25,750 And then I'm going to do after.save quote unquote 696 00:32:25,750 --> 00:32:28,210 "Out.bmp," using the same file name. 697 00:32:28,210 --> 00:32:36,490 I'm now going to run python of Edges.py, after, sorry, user error. 698 00:32:36,490 --> 00:32:38,930 We'll see what syntax error means soon. 699 00:32:38,930 --> 00:32:41,470 Let me go ahead and run the code now, Edges.py. 700 00:32:41,470 --> 00:32:44,830 Let me now open that new file, Out.bmp. 701 00:32:44,830 --> 00:32:49,510 And before we had this, and now, especially what will look familiar 702 00:32:49,510 --> 00:32:52,210 if you did the more comfortable version of pset 4, 703 00:32:52,210 --> 00:32:55,340 we now get this, after just four lines of code. 704 00:32:55,340 --> 00:32:58,120 So again, suggesting the power of using a language that's better 705 00:32:58,120 --> 00:32:59,560 optimized for the tool at hand. 706 00:32:59,560 --> 00:33:02,950 And at the risk of really making folks sad, let's go ahead 707 00:33:02,950 --> 00:33:06,820 and re-implement, if we could, problem set five, real quickly here. 708 00:33:06,820 --> 00:33:11,080 Let me go ahead and open another version of this code, 709 00:33:11,080 --> 00:33:14,307 wherein I have a C version, just from problem 710 00:33:14,307 --> 00:33:16,390 set five, wherein you implemented a spell checker, 711 00:33:16,390 --> 00:33:18,640 loading 100,000 plus words into memory.
712 00:33:18,640 --> 00:33:22,390 And then you kept track of just how much time and memory it took. 713 00:33:22,390 --> 00:33:24,340 And that probably took a while, implementing 714 00:33:24,340 --> 00:33:26,530 all of those functions in Dictionary.c. 715 00:33:26,530 --> 00:33:32,240 Let me instead now go into a new file, called Dictionary.py. 716 00:33:32,240 --> 00:33:35,200 And let me stipulate, for the sake of discussion, 717 00:33:35,200 --> 00:33:37,660 that we already wrote in advance, Speller.py, 718 00:33:37,660 --> 00:33:39,850 which corresponds to Speller.c. 719 00:33:39,850 --> 00:33:41,380 You didn't write either of those. 720 00:33:41,380 --> 00:33:43,600 Recall for problem set five, we gave you Speller.c. 721 00:33:43,600 --> 00:33:45,558 Assume that we're going to give you Speller.py. 722 00:33:45,558 --> 00:33:52,030 So the onus on us right now is only to implement Speller, Dictionary.py. 723 00:33:52,030 --> 00:33:54,940 All right, so I'm going to go ahead and define a few functions. 724 00:33:54,940 --> 00:33:58,000 And we're going to see now the syntax for defining functions in Python. 725 00:33:58,000 --> 00:34:02,230 I want to go ahead and define first, a hash table, which 726 00:34:02,230 --> 00:34:04,840 was the very first thing you defined in Dictionary.c. 727 00:34:04,840 --> 00:34:09,969 I'm going to go ahead, then, and say words gets this, give me a dictionary, 728 00:34:09,969 --> 00:34:11,683 otherwise known as a hash table. 729 00:34:11,683 --> 00:34:13,600 All right, now let me define a function called 730 00:34:13,600 --> 00:34:16,630 check, which was the first function you might have implemented. 731 00:34:16,630 --> 00:34:19,000 Check is going to take a word, and you'll see in Python, 732 00:34:19,000 --> 00:34:20,375 the syntax is a little different. 733 00:34:20,375 --> 00:34:21,880 You don't specify the return type. 734 00:34:21,880 --> 00:34:24,610 You use the word Def instead to define. 
735 00:34:24,610 --> 00:34:28,540 You still specify the name of the function and any arguments thereto. 736 00:34:28,540 --> 00:34:31,210 But you omit any mention of types. 737 00:34:31,210 --> 00:34:33,280 But you do use a colon and indent. 738 00:34:33,280 --> 00:34:37,780 So how do I check if a word is in my dictionary, or in my hash table? 739 00:34:37,780 --> 00:34:41,440 Well, in Python, I can just say, if word in words, 740 00:34:41,440 --> 00:34:46,570 go ahead and return true, else go ahead and return false, done, 741 00:34:46,570 --> 00:34:47,949 with the check function. 742 00:34:47,949 --> 00:34:49,639 All right, now I want to do like load. 743 00:34:49,639 --> 00:34:52,639 That was the heavy lift, where you had to load the big file into memory. 744 00:34:52,639 --> 00:34:54,306 So let me define a function called load. 745 00:34:54,306 --> 00:34:56,650 It takes a string, the name of a file to load. 746 00:34:56,650 --> 00:34:59,980 So I'll call that Dictionary, just like in C, but no data type. 747 00:34:59,980 --> 00:35:04,180 Let me go ahead and open a file by using an open function in Python, 748 00:35:04,180 --> 00:35:06,740 by opening that Dictionary in read mode. 749 00:35:06,740 --> 00:35:10,360 So this is a little similar to fopen, a function in C you might recall. 750 00:35:10,360 --> 00:35:12,880 Then let me iterate over every line in the file. 751 00:35:12,880 --> 00:35:17,800 In Python, this is pretty pleasant, for line in file colon indent. 752 00:35:17,800 --> 00:35:22,510 How, now, do I get at the current word, and then strip off the new line, 753 00:35:22,510 --> 00:35:25,570 because in this file of words, 140,000 words, 754 00:35:25,570 --> 00:35:28,752 there's word backslash n, word backslash n, all right? 
755 00:35:28,752 --> 00:35:31,210 Well, let me go ahead and get a word from the current line, 756 00:35:31,210 --> 00:35:34,840 but strip off, from the right end of the string, the new line, which 757 00:35:34,840 --> 00:35:37,540 the rstrip function in Python does for me. 758 00:35:37,540 --> 00:35:42,370 Then let me go ahead and add to my dictionary, or hash table, that word, 759 00:35:42,370 --> 00:35:43,030 done. 760 00:35:43,030 --> 00:35:45,535 Let me go ahead and close the file for good measure. 761 00:35:45,535 --> 00:35:48,160 And then let me go ahead and return true, because all was well. 762 00:35:48,160 --> 00:35:50,320 That's it for the load function in Python. 763 00:35:50,320 --> 00:35:51,580 How about the size function? 764 00:35:51,580 --> 00:35:54,820 This did not take any arguments, it just returns the size of the hash table 765 00:35:54,820 --> 00:35:55,990 or dictionary in Python. 766 00:35:55,990 --> 00:35:59,980 I can do that by returning the length of the dictionary in question. 767 00:35:59,980 --> 00:36:04,660 And then lastly, gone from the world of Python is malloc and free. 768 00:36:04,660 --> 00:36:06,090 Memory is managed for you. 769 00:36:06,090 --> 00:36:08,950 So no matter what I do, there's nothing to unload. 770 00:36:08,950 --> 00:36:10,820 The computer will do that for me. 771 00:36:10,820 --> 00:36:14,860 So I give you, in these functions, problem set five in Python. 772 00:36:14,860 --> 00:36:17,020 So, I'm sorry, we made you write it in C first. 773 00:36:17,020 --> 00:36:20,620 But the implication now is that, what are you getting for free, 774 00:36:20,620 --> 00:36:21,850 in a language like Python?
775 00:36:21,850 --> 00:36:24,370 Well, encapsulated in this one line of code 776 00:36:24,370 --> 00:36:28,270 is much of what you wrote for problem set five, implementing 777 00:36:28,270 --> 00:36:31,270 your array for all of your letters of the alphabet or more, 778 00:36:31,270 --> 00:36:34,390 all of the linked lists that you implemented to create chains, 779 00:36:34,390 --> 00:36:35,930 to store all of those words. 780 00:36:35,930 --> 00:36:37,060 All of that is happening. 781 00:36:37,060 --> 00:36:40,090 It's just someone else in the world wrote that code for you. 782 00:36:40,090 --> 00:36:43,060 And you can now use it by way of a dictionary. 783 00:36:43,060 --> 00:36:45,550 And actually, I can change this a little bit, 784 00:36:45,550 --> 00:36:48,670 because add is technically not the right function to use here. 785 00:36:48,670 --> 00:36:51,620 I'm actually treating the dictionary as something simpler, a set. 786 00:36:51,620 --> 00:36:55,420 So I'm going to make one tweak, set recall was another data type in Python. 787 00:36:55,420 --> 00:36:57,700 But set just allows it to handle duplicates, 788 00:36:57,700 --> 00:37:00,430 and it allows me to just throw things into it by literally 789 00:37:00,430 --> 00:37:02,320 using a function as simple as add. 790 00:37:02,320 --> 00:37:05,170 And I'm going to make one other tweak here, 791 00:37:05,170 --> 00:37:09,790 because, when I'm checking a word, it's possible it might be given 792 00:37:09,790 --> 00:37:12,520 to me in uppercase or capitalized. 793 00:37:12,520 --> 00:37:15,880 It's not going to necessarily come in in the same lowercase format 794 00:37:15,880 --> 00:37:17,470 that my dictionary did. 795 00:37:17,470 --> 00:37:22,390 I can force every word to lowercase by using word.lower. 796 00:37:22,390 --> 00:37:24,500 And I don't have to do it character for character, 797 00:37:24,500 --> 00:37:29,800 I can do the whole darn string at once, by just saying word.lower. 
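As a rough sketch of the file just described (not necessarily character for character what was typed in lecture), Dictionary.py might look like the following. It assumes Speller.py calls the four functions by the names check, load, size, and unload, and it uses a with block in place of the explicit close mentioned above.

```python
# Sketch of the dictionary file; the set stands in for the hash table
# that problem set five implemented by hand in C.
words = set()


def check(word):
    """Return True if word is in the dictionary, ignoring case."""
    return word.lower() in words


def load(dictionary):
    """Load every word from the file named dictionary into memory."""
    with open(dictionary) as file:
        for line in file:
            # rstrip removes the trailing newline from each line
            words.add(line.rstrip())
    return True


def size():
    """Return the number of words loaded."""
    return len(words)


def unload():
    """Nothing to free: Python manages memory for us."""
    return True
```

Because words is a set, adding the same word twice leaves only one copy, so size counts distinct words.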
798 00:37:29,800 --> 00:37:32,860 All right, let me go ahead and open up a terminal window here. 799 00:37:32,860 --> 00:37:36,118 And let me go into, first, my C version, on the left. 800 00:37:36,118 --> 00:37:39,160 And actually I'm going to go ahead and split my terminal window into two. 801 00:37:39,160 --> 00:37:44,007 And on the right, I'm going to go into a version that I essentially just wrote. 802 00:37:44,007 --> 00:37:46,840 But it's also available online, if you want to play along afterward. 803 00:37:46,840 --> 00:37:50,170 I'm going to go ahead and make speller in C on the left, 804 00:37:50,170 --> 00:37:52,270 and note that it takes a moment to compile. 805 00:37:52,270 --> 00:37:56,530 Then I'm going to be ready to run speller of dictionaries, 806 00:37:56,530 --> 00:37:59,330 let's do like the Sherlock Holmes text, which is pretty big. 807 00:37:59,330 --> 00:38:03,970 And then over here, let me get ready to run Python of speller 808 00:38:03,970 --> 00:38:07,733 on texts/holmes.txt, too. 809 00:38:07,733 --> 00:38:10,150 So the syntax is a little different at the command prompt. 810 00:38:10,150 --> 00:38:12,880 I just, on the left, have to compile the code, with make, 811 00:38:12,880 --> 00:38:14,650 and then run it with ./speller. 812 00:38:14,650 --> 00:38:16,370 On the right, I don't need to compile it. 813 00:38:16,370 --> 00:38:17,860 But I do need to use the interpreter. 814 00:38:17,860 --> 00:38:20,230 So even though the lines are wrapping a little bit here, 815 00:38:20,230 --> 00:38:22,180 let me go ahead and run it on the right. 816 00:38:22,180 --> 00:38:24,305 And I'm going to count how long it takes, verbally, 817 00:38:24,305 --> 00:38:25,570 for demonstration sake. 818 00:38:25,570 --> 00:38:28,720 One Mississippi, two Mississippi, three Mississippi, OK, 819 00:38:28,720 --> 00:38:31,190 so it's like three seconds, give or take.
820 00:38:31,190 --> 00:38:33,520 Now running it in Python, keeping in mind, 821 00:38:33,520 --> 00:38:37,103 I spent way fewer hours implementing a spell checker in Python 822 00:38:37,103 --> 00:38:38,770 than you might have in problem set five. 823 00:38:38,770 --> 00:38:42,007 But what's the trade-off going to be, and what kinds of design decisions 824 00:38:42,007 --> 00:38:43,840 do we all now need to be making consciously? 825 00:38:43,840 --> 00:38:46,300 Here we go, on the right, in Python. 826 00:38:46,300 --> 00:38:50,020 One Mississippi, two Mississippi, three Mississippi, four Mississippi, 827 00:38:50,020 --> 00:38:54,070 five Mississippi, six Mississippi, seven Mississippi, eight Mississippi, 828 00:38:54,070 --> 00:38:57,100 nine Mississippi, 10 Mississippi, 11 Mississippi, 829 00:38:57,100 --> 00:38:59,990 all right, so 10 or 11 seconds. 830 00:38:59,990 --> 00:39:01,980 So which one is better? 831 00:39:01,980 --> 00:39:06,550 Let's go to the group here, which of these programs is the better one? 832 00:39:06,550 --> 00:39:10,780 How might you answer that question, based on demonstration alone? 833 00:39:10,780 --> 00:39:11,530 What do you think? 834 00:39:11,530 --> 00:39:13,738 AUDIENCE: I think Python's better for the programmer, 835 00:39:13,738 --> 00:39:17,847 more comfortable for the programmer, but C is better for the user. 836 00:39:17,847 --> 00:39:19,680 DAVID J. MALAN: OK, so Python, to summarize, 837 00:39:19,680 --> 00:39:23,460 is better for the programmer, because it was way faster to write, 838 00:39:23,460 --> 00:39:26,460 but C is maybe better for the computer, because it's much faster to run. 839 00:39:26,460 --> 00:39:28,127 I think that's a reasonable formulation. 840 00:39:28,127 --> 00:39:29,430 Other opinions? 841 00:39:29,430 --> 00:39:30,588 Yeah. 842 00:39:30,588 --> 00:39:32,880 AUDIENCE: I think it depends on the size of the project 843 00:39:32,880 --> 00:39:33,910 that you're dealing with. 
844 00:39:33,910 --> 00:39:36,285 So if it's going to be something that's relatively quick, 845 00:39:36,285 --> 00:39:38,710 I might not care that it takes 10 seconds to do it. 846 00:39:38,710 --> 00:39:40,910 And it could be way faster to do it with Python. 847 00:39:40,910 --> 00:39:44,070 Whereas with C, if I'm dealing with something like a massive data 848 00:39:44,070 --> 00:39:48,300 set or something huge, then that time is going to really build up on, 849 00:39:48,300 --> 00:39:52,740 it might be worth it to put in the upfront effort and just load it into C, 850 00:39:52,740 --> 00:39:56,260 so the process continually will run faster over a longer period of time. 851 00:39:56,260 --> 00:39:57,430 DAVID J. MALAN: Absolutely, a really good answer. 852 00:39:57,430 --> 00:40:00,300 And let me summarize, is it depends on the workload, if you will. 853 00:40:00,300 --> 00:40:04,050 If you have a very large data set, you might 854 00:40:04,050 --> 00:40:07,128 want to optimize your code to be as fast and performant as it can be, 855 00:40:07,128 --> 00:40:09,420 especially if you're running that code again and again. 856 00:40:09,420 --> 00:40:10,950 Maybe you're a company like Google. 857 00:40:10,950 --> 00:40:13,110 People are searching a huge database all the time. 858 00:40:13,110 --> 00:40:15,750 You really want to squeeze every bit of performance 859 00:40:15,750 --> 00:40:17,222 as you can out of the computer. 860 00:40:17,222 --> 00:40:19,680 You might want to have someone smart take a language like C 861 00:40:19,680 --> 00:40:21,450 and write it at a very low level. 862 00:40:21,450 --> 00:40:22,500 It's going to be painful. 863 00:40:22,500 --> 00:40:23,400 They're going to have bugs. 864 00:40:23,400 --> 00:40:26,150 They're going to have to deal with memory management and the like. 865 00:40:26,150 --> 00:40:29,490 But if and when it works correctly, it's going to be much faster, it would seem. 
866 00:40:29,490 --> 00:40:32,280 By contrast, if you have a data set that's big, 867 00:40:32,280 --> 00:40:35,820 and 140,000 words is not small, but you don't 868 00:40:35,820 --> 00:40:38,940 want to spend like 5 hours, 10 hours, a week of your time, 869 00:40:38,940 --> 00:40:41,063 building a spell checker or a dictionary, 870 00:40:41,063 --> 00:40:43,980 you can instead leverage a different language with different libraries 871 00:40:43,980 --> 00:40:48,690 and build on top of it, in order to prioritize the human time instead. 872 00:40:48,690 --> 00:40:50,841 Other thoughts? 873 00:40:50,841 --> 00:40:52,789 AUDIENCE: Would you, because with Python, 874 00:40:52,789 --> 00:40:56,928 doesn't it also like convert the words, or like 875 00:40:56,928 --> 00:40:58,539 convert the words, for a lesson? 876 00:40:58,539 --> 00:41:00,581 When we convert that into the same version again, 877 00:41:00,581 --> 00:41:04,148 do we just take that into view? 878 00:41:04,148 --> 00:41:06,940 DAVID J. MALAN: That's a perfect segue to exactly the next point we 879 00:41:06,940 --> 00:41:09,340 wanted to make, which was, is there something in between? 880 00:41:09,340 --> 00:41:10,360 And indeed there is. 881 00:41:10,360 --> 00:41:12,970 I'm oversimplifying what this language is actually doing. 882 00:41:12,970 --> 00:41:15,280 It's not as stark a difference as saying, like, hey, 883 00:41:15,280 --> 00:41:18,340 Python is four times slower than C. Like that's not the right takeaway. 884 00:41:18,340 --> 00:41:21,460 There are absolutely ways that engineers can optimize languages, 885 00:41:21,460 --> 00:41:23,230 as they have already done for Python. 886 00:41:23,230 --> 00:41:25,840 And in fact, I've configured my settings in such a way 887 00:41:25,840 --> 00:41:28,777 that I've kind of dramatized just how big the difference is. 888 00:41:28,777 --> 00:41:30,610 It is going to be slower, Python, typically, 889 00:41:30,610 --> 00:41:31,930 than the equivalent C program. 
890 00:41:31,930 --> 00:41:33,940 But it doesn't have to be as big of a gap 891 00:41:33,940 --> 00:41:37,720 as it is here, because, indeed, among the features you can turn on in Python 892 00:41:37,720 --> 00:41:40,120 is to save some intermediate results. 893 00:41:40,120 --> 00:41:43,360 Technically speaking, yes, Python is interpreting 894 00:41:43,360 --> 00:41:46,690 Dictionary.py and these other files, translating them 895 00:41:46,690 --> 00:41:48,203 from one language to another. 896 00:41:48,203 --> 00:41:51,370 But that doesn't mean it has to do that every darn time you run the program. 897 00:41:51,370 --> 00:41:57,020 As you propose, you can save, or cache, C-A-C-H-E, the results of that process. 898 00:41:57,020 --> 00:42:00,440 So that the second time and the third time are actually notably faster. 899 00:42:00,440 --> 00:42:03,430 And, in fact, Python itself, the interpreter, the most popular version 900 00:42:03,430 --> 00:42:05,980 thereof, itself is actually implemented in C. 901 00:42:05,980 --> 00:42:09,290 So you can make sure that your interpreter is as fast as possible. 902 00:42:09,290 --> 00:42:11,350 And what then is maybe the high level takeaway? 903 00:42:11,350 --> 00:42:14,320 Yes, if you are going to try to squeeze every bit of performance 904 00:42:14,320 --> 00:42:17,710 out of your code, and maybe code is constrained. 905 00:42:17,710 --> 00:42:19,150 Maybe you have very small devices. 906 00:42:19,150 --> 00:42:20,770 Maybe it's like a watch nowadays. 907 00:42:20,770 --> 00:42:26,320 Or maybe it's a sensor that's installed in some small format in an appliance, 908 00:42:26,320 --> 00:42:29,710 or in infrastructure, where you don't have much battery life 909 00:42:29,710 --> 00:42:31,630 and you don't have much size, you might want 910 00:42:31,630 --> 00:42:33,710 to minimize just how much work is being done. 
911 00:42:33,710 --> 00:42:36,743 And so the faster the code runs, the better it's going to be, 912 00:42:36,743 --> 00:42:38,410 if it's implemented in something low level. 913 00:42:38,410 --> 00:42:42,310 So C is still very commonly used for certain types of applications. 914 00:42:42,310 --> 00:42:45,580 But, again, if you just want to solve real world problems, 915 00:42:45,580 --> 00:42:49,840 and get real work done, and your time is just as, if not more, valuable 916 00:42:49,840 --> 00:42:52,000 than the device you're running it on, long term, 917 00:42:52,000 --> 00:42:55,358 you know what, Python is among the most popular languages as well. 918 00:42:55,358 --> 00:42:58,150 And frankly, if I were implementing a spell checker moving forward, 919 00:42:58,150 --> 00:42:59,710 I'm probably starting with Python. 920 00:42:59,710 --> 00:43:01,543 And I'm not going to waste time implementing 921 00:43:01,543 --> 00:43:04,930 all of that low-level stuff, because the whole point of using newer, 922 00:43:04,930 --> 00:43:09,460 modern languages is to use abstractions that other people have created for you. 923 00:43:09,460 --> 00:43:12,910 And by abstraction, I mean something like the dict function, 924 00:43:12,910 --> 00:43:15,370 that just gives you a dictionary, or hash table, 925 00:43:15,370 --> 00:43:19,225 or the equivalent version that I used, which in this case was a set. 926 00:43:19,225 --> 00:43:22,720 All right, any questions, then, on Python thus far? 927 00:43:22,720 --> 00:43:25,730 928 00:43:25,730 --> 00:43:26,710 No, all right. 929 00:43:26,710 --> 00:43:27,710 Oh, yeah, in the middle.
930 00:43:27,710 --> 00:43:29,920 AUDIENCE: Could you compile the Python code, 931 00:43:29,920 --> 00:43:34,610 or is there some, I'd imagine that with the audience that can happen, 932 00:43:34,610 --> 00:43:38,180 but it feels like if you can just come up with a Python compiler, 933 00:43:38,180 --> 00:43:40,093 that would give you the best of both worlds. 934 00:43:40,093 --> 00:43:42,260 DAVID J. MALAN: Really good question or observation, 935 00:43:42,260 --> 00:43:43,718 could you just compile Python code? 936 00:43:43,718 --> 00:43:47,180 Yes, absolutely, this idea of compiling code or interpreting code 937 00:43:47,180 --> 00:43:49,490 is not native to the language itself. 938 00:43:49,490 --> 00:43:52,410 It tends to be native to the conventions that we humans use. 939 00:43:52,410 --> 00:43:54,730 So you could actually write an interpreter for C 940 00:43:54,730 --> 00:43:57,980 that would read it top to bottom, left to right, converting it to, on the fly, 941 00:43:57,980 --> 00:44:01,640 something the computer understands, but historically that's not been the case. 942 00:44:01,640 --> 00:44:03,560 C is generally a compiled language. 943 00:44:03,560 --> 00:44:04,670 But it doesn't have to be. 944 00:44:04,670 --> 00:44:08,010 What Python nowadays is actually doing is what you described earlier. 945 00:44:08,010 --> 00:44:10,220 It technically is, sort of unbeknownst to us, 946 00:44:10,220 --> 00:44:13,970 compiling the code, technically not into 0's and 1's, technically 947 00:44:13,970 --> 00:44:17,510 into something called byte code, which is this intermediate step that 948 00:44:17,510 --> 00:44:21,510 just doesn't take as much time as it would to recompile the whole thing. 949 00:44:21,510 --> 00:44:24,377 And this is an area of research for computer scientists working 950 00:44:24,377 --> 00:44:26,960 in programming languages, to improve these kinds of paradigms. 951 00:44:26,960 --> 00:44:27,500 Why? 
952 00:44:27,500 --> 00:44:30,740 Well, honestly, for you and I, the programmer, it's just much easier to, 953 00:44:30,740 --> 00:44:33,800 one, run the code and not worry about the stupid second step 954 00:44:33,800 --> 00:44:35,100 of compiling it all the time. 955 00:44:35,100 --> 00:44:35,600 Why? 956 00:44:35,600 --> 00:44:38,220 It's literally half as many steps for me, the human. 957 00:44:38,220 --> 00:44:40,500 And that's a nice thing to optimize for. 958 00:44:40,500 --> 00:44:44,330 And ultimately, too, you might want all of the fancy features that 959 00:44:44,330 --> 00:44:45,920 come with these other languages. 960 00:44:45,920 --> 00:44:47,960 So you should really just be fine-tuning how 961 00:44:47,960 --> 00:44:51,800 you can enable these features, as opposed to shying away from them here. 962 00:44:51,800 --> 00:44:54,590 And, in fact, the only time I personally ever use C 963 00:44:54,590 --> 00:44:57,950 is from like September to October of every year, during CS50. 964 00:44:57,950 --> 00:45:00,350 Almost every other month do I reach for Python, 965 00:45:00,350 --> 00:45:03,690 or another language called JavaScript, to actually get real work done, 966 00:45:03,690 --> 00:45:07,640 which is not to impugn C. It's just that those other languages tend to be better 967 00:45:07,640 --> 00:45:11,030 fits for the amount of time I have to allocate, and the types of problems 968 00:45:11,030 --> 00:45:11,905 that I want to solve. 969 00:45:11,905 --> 00:45:14,405 All right, let's go ahead and take a five minute break here. 970 00:45:14,405 --> 00:45:17,390 And when we come back, we'll start writing some programs from scratch. 971 00:45:17,390 --> 00:45:18,300 All right.
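The byte code step mentioned in the answer before the break can be observed directly. The following is a minimal sketch using Python's built-in compile and exec functions and the standard dis module; the one-line source string and the variable name total are made up for illustration.

```python
import dis

source = "total = 1 + 2"

# Compile the source text into a code object (byte code) without running it.
code = compile(source, "<example>", "exec")

# Show the byte code instructions; the exact listing varies by Python version.
dis.dis(code)

# Run the already-compiled code object; the source is not parsed again.
namespace = {}
exec(code, namespace)
print(namespace["total"])
```

When a module is imported, the standard interpreter typically saves this compiled form in a __pycache__ directory, which is the kind of caching described earlier.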
972 00:45:18,300 --> 00:45:21,740 So let's go ahead and start writing some code from the beginning 973 00:45:21,740 --> 00:45:24,710 here, whereby we start small with some simple examples, 974 00:45:24,710 --> 00:45:28,042 and then we'll build our way up to more sophisticated examples in Python. 975 00:45:28,042 --> 00:45:29,750 But what we'll do along the way is first, 976 00:45:29,750 --> 00:45:31,865 look side by side at what the C code looked 977 00:45:31,865 --> 00:45:34,640 like way back in week 1 or 2 or 3 and so forth, 978 00:45:34,640 --> 00:45:36,890 and then write the corresponding Python code at right. 979 00:45:36,890 --> 00:45:39,530 And then we'll transition just to focusing on Python itself. 980 00:45:39,530 --> 00:45:42,322 What I've done in advance today is I've downloaded some of the code 981 00:45:42,322 --> 00:45:44,930 from the course's website, my source 6 directory, which 982 00:45:44,930 --> 00:45:47,825 contains all of the pre-written C code from weeks past. 983 00:45:47,825 --> 00:45:49,700 But it'll also have copies of the Python code 984 00:45:49,700 --> 00:45:51,660 we'll write here together and look at. 985 00:45:51,660 --> 00:45:55,445 So first, here is Hello.c back from week 0. 986 00:45:55,445 --> 00:45:57,323 And this was version 0 of it. 987 00:45:57,323 --> 00:45:58,740 I'm going to go ahead and do this. 988 00:45:58,740 --> 00:46:02,240 I'm going to go ahead and split my code window up here. 989 00:46:02,240 --> 00:46:05,042 I'm going to go ahead and create a new file called Hello.py. 990 00:46:05,042 --> 00:46:07,250 And this isn't something you'll typically have to do, 991 00:46:07,250 --> 00:46:08,810 laying your code out side by side. 
992 00:46:08,810 --> 00:46:10,880 But I've just clicked the little icon in VS Code 993 00:46:10,880 --> 00:46:14,330 that looks like two columns, that splits my code editor into two places, 994 00:46:14,330 --> 00:46:17,330 so that we can, in fact, see things, for now, side by side, 995 00:46:17,330 --> 00:46:18,788 with my terminal window down below. 996 00:46:18,788 --> 00:46:21,747 All right, now I'm going to go ahead and write the corresponding Python 997 00:46:21,747 --> 00:46:24,560 program on the right, which, recall, was just print, quote 998 00:46:24,560 --> 00:46:27,170 unquote, "Hello, world," and that's it. 999 00:46:27,170 --> 00:46:29,420 Now down in my terminal window, I'm going 1000 00:46:29,420 --> 00:46:33,080 to go ahead and run Python of Hello.py, Enter, and voila, 1001 00:46:33,080 --> 00:46:34,450 we've got Hello.py working. 1002 00:46:34,450 --> 00:46:36,950 So again, I'm not going to play any further with the C code. 1003 00:46:36,950 --> 00:46:38,930 It's there just to jog your memory left and right. 1004 00:46:38,930 --> 00:46:41,240 So let's now look at a second version of Hello, world 1005 00:46:41,240 --> 00:46:44,452 from that first week, whereby if I go and get Hello1.c, 1006 00:46:44,452 --> 00:46:46,160 I'm going to drag that over to the right. 1007 00:46:46,160 --> 00:46:48,980 Whoops, I'm going to go ahead and drag that over to the left here. 1008 00:46:48,980 --> 00:46:51,950 And now, on the right, let's modify Hello.py 1009 00:46:51,950 --> 00:46:55,700 to look a little more like this second version in C, all right? 1010 00:46:55,700 --> 00:46:59,867 I want to get an answer from the user as a return value, 1011 00:46:59,867 --> 00:47:01,700 but I also want to get some input from them. 1012 00:47:01,700 --> 00:47:05,420 So from CS50, I'm going to import the function called get_string for now.
1013 00:47:05,420 --> 00:47:07,170 We're going to get rid of that eventually, 1014 00:47:07,170 --> 00:47:08,962 but for now, it's a helpful training wheel. 1015 00:47:08,962 --> 00:47:11,180 And then down here, I'm going to say, answer 1016 00:47:11,180 --> 00:47:14,510 equals get_string, quote unquote, "What's your name?", 1017 00:47:14,510 --> 00:47:15,980 question mark, space. 1018 00:47:15,980 --> 00:47:17,453 But no semicolon, no data type. 1019 00:47:17,453 --> 00:47:19,370 And then I'm going to go ahead and print, just 1020 00:47:19,370 --> 00:47:25,118 like the first example on the slide, Hello, comma space plus answer. 1021 00:47:25,118 --> 00:47:26,660 And now let me go ahead and run this. 1022 00:47:26,660 --> 00:47:29,660 Python, of Hello.py, all right, it's asking me what's my name. 1023 00:47:29,660 --> 00:47:30,170 David. 1024 00:47:30,170 --> 00:47:31,370 Hello comma David. 1025 00:47:31,370 --> 00:47:36,507 But it's worth calling attention to the fact that I've also simplified further. 1026 00:47:36,507 --> 00:47:38,840 It's not just that the individual functions are simpler. 1027 00:47:38,840 --> 00:47:42,470 What is also now glaringly omitted from my Python code at right, 1028 00:47:42,470 --> 00:47:44,657 both in this version, and the previous version. 1029 00:47:44,657 --> 00:47:46,115 What did I not bother implementing? 1030 00:47:46,115 --> 00:47:47,267 AUDIENCE: The main code. 1031 00:47:47,267 --> 00:47:49,850 DAVID J. MALAN: Yeah, so I didn't even need to implement main. 1032 00:47:49,850 --> 00:47:53,210 We'll revisit the main function, because having a main function 1033 00:47:53,210 --> 00:47:54,860 actually does solve problems sometimes. 1034 00:47:54,860 --> 00:47:56,090 But it's no longer required. 1035 00:47:56,090 --> 00:47:59,750 In C you have to have that to kick-start the entire process of actually running 1036 00:47:59,750 --> 00:48:00,337 your code.
1037 00:48:00,337 --> 00:48:03,170 And in fact, if you were missing main, as you might have experienced 1038 00:48:03,170 --> 00:48:06,033 if you accidentally compiled Helpers.c instead of the file 1039 00:48:06,033 --> 00:48:08,450 that contained main, you would have seen a compiler error. 1040 00:48:08,450 --> 00:48:09,658 In Python it's not necessary. 1041 00:48:09,658 --> 00:48:12,410 In Python you can just jump right in, start programming, and boom, 1042 00:48:12,410 --> 00:48:13,350 you're good to go. 1043 00:48:13,350 --> 00:48:15,225 Especially if it's a small program like this, 1044 00:48:15,225 --> 00:48:18,210 you don't need the added overhead or complexity of a main function. 1045 00:48:18,210 --> 00:48:19,860 So that's one other difference here. 1046 00:48:19,860 --> 00:48:23,390 All right, there are a few other ways we could say Hello, world. 1047 00:48:23,390 --> 00:48:26,160 Recall that I could use a format string. 1048 00:48:26,160 --> 00:48:30,360 So I could put this whole thing in quotes, I could use this f prefix. 1049 00:48:30,360 --> 00:48:33,250 And then let me go ahead and run Python of Hello.py again. 1050 00:48:33,250 --> 00:48:35,250 You can perhaps see where we're going with this. 1051 00:48:35,250 --> 00:48:37,170 Let me type my name, David, and here we go. 1052 00:48:37,170 --> 00:48:39,570 OK, that's the mistake that someone identified earlier, 1053 00:48:39,570 --> 00:48:41,040 you need the curly braces. 1054 00:48:41,040 --> 00:48:44,940 Otherwise no variables are interpolated, that is substituted, 1055 00:48:44,940 --> 00:48:46,390 with their actual values. 1056 00:48:46,390 --> 00:48:50,160 So if I go back in and add those curly braces to the F string, 1057 00:48:50,160 --> 00:48:54,632 now let me run Python of Hello.py, type in my name, and there we go. 1058 00:48:54,632 --> 00:48:55,590 We're back in business. 1059 00:48:55,590 --> 00:48:56,388 Which one's better? 1060 00:48:56,388 --> 00:48:57,180 I mean, it depends. 
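To make the three variants concrete, here is a small sketch with the name hard-coded instead of read from the user, so that it runs without typing anything. The variable name answer matches the lecture; the rest is illustrative.

```python
answer = "David"

# Concatenation with the plus operator.
print("hello, " + answer)

# An f-string without curly braces: nothing is interpolated.
print(f"hello, answer")

# An f-string with curly braces: the variable's value is substituted.
print(f"hello, {answer}")
```

The first and third lines print the same greeting; the second prints the word answer literally, which is the mistake shown above.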
1061 00:48:57,180 --> 00:49:00,540 But generally speaking, making shorter, more concise code 1062 00:49:00,540 --> 00:49:01,870 tends to be a good thing. 1063 00:49:01,870 --> 00:49:06,450 So stylistically, the F string is probably a reasonable instinct to have. 1064 00:49:06,450 --> 00:49:09,280 All right, well, what more can we do besides this? 1065 00:49:09,280 --> 00:49:12,180 Well, let me go ahead here and let's get rid of the training wheel 1066 00:49:12,180 --> 00:49:13,230 altogether, actually. 1067 00:49:13,230 --> 00:49:15,180 So same C code at left. 1068 00:49:15,180 --> 00:49:18,150 Let me get rid of the CS50 library, which we will ultimately, 1069 00:49:18,150 --> 00:49:19,620 in a couple of weeks, anyway. 1070 00:49:19,620 --> 00:49:22,560 I can't use get_string, but I can use a function 1071 00:49:22,560 --> 00:49:24,730 that comes with Python called input. 1072 00:49:24,730 --> 00:49:28,050 And, in fact, this is actually a one-for-one substitution, pretty much. 1073 00:49:28,050 --> 00:49:31,380 There's really no downside to using input instead of get_string. 1074 00:49:31,380 --> 00:49:33,420 We implement get_string just for consistency 1075 00:49:33,420 --> 00:49:37,800 with what you saw in C. Python of Hello.py, what's your name, David. 1076 00:49:37,800 --> 00:49:39,310 Still actually works the same. 1077 00:49:39,310 --> 00:49:41,227 So gone are the CS50-specific training wheels. 1078 00:49:41,227 --> 00:49:43,227 But we're going to bring them back shortly, just 1079 00:49:43,227 --> 00:49:45,240 to deal with integers or floats or other values, 1080 00:49:45,240 --> 00:49:47,490 too, because it's going to make our lives a little simpler, 1081 00:49:47,490 --> 00:49:48,510 with error checking. 1082 00:49:48,510 --> 00:49:52,350 All right, any questions, before we now pivot to revisiting other examples 1083 00:49:52,350 --> 00:49:56,280 from week 1, but now in Python? 1084 00:49:56,280 --> 00:49:58,110 All right, let me go ahead and open up now.
1085 00:49:58,110 --> 00:50:03,240 Let's say Calculator0.c, which was one of the first examples we did involving 1086 00:50:03,240 --> 00:50:06,870 math and operators like that, as well as functions like get_int, 1087 00:50:06,870 --> 00:50:11,820 let me go ahead and create a new file now called Calculator.py, 1088 00:50:11,820 --> 00:50:15,360 at right, so that I have my C code at left still, 1089 00:50:15,360 --> 00:50:16,950 and my Python code at right. 1090 00:50:16,950 --> 00:50:20,610 All right, let me go dive into a translation of this code into Python. 1091 00:50:20,610 --> 00:50:23,100 I am going to use get_int from the CS50 library. 1092 00:50:23,100 --> 00:50:24,960 So let me import that. 1093 00:50:24,960 --> 00:50:27,340 I'm going to go ahead now and get an int from the user. 1094 00:50:27,340 --> 00:50:31,000 So x equals get_int, and I'll ask them for an x value, 1095 00:50:31,000 --> 00:50:32,430 just like we did weeks ago. 1096 00:50:32,430 --> 00:50:37,800 No need to specify a semicolon, though, or an int for the x. 1097 00:50:37,800 --> 00:50:38,940 It will just figure it out. 1098 00:50:38,940 --> 00:50:42,090 y is going to get another int via y colon, 1099 00:50:42,090 --> 00:50:46,830 and then down here, I'm going to go ahead and say print of x plus y. 1100 00:50:46,830 --> 00:50:48,720 So this is already a bit new. 1101 00:50:48,720 --> 00:50:53,400 Recall, the C version required that I use this format string, as well 1102 00:50:53,400 --> 00:50:54,428 as printf itself. 1103 00:50:54,428 --> 00:50:56,220 Python is just a little more user-friendly. 1104 00:50:56,220 --> 00:50:59,670 If all you want to do is print out a value, like x plus y, just print it. 1105 00:50:59,670 --> 00:51:02,610 Don't futz with any percent signs or format codes. 1106 00:51:02,610 --> 00:51:05,160 It's not printf, it's indeed just print now.
1107 00:51:05,160 --> 00:51:08,610 All right, let me go ahead and run Python of Calculator.py, 1108 00:51:08,610 --> 00:51:13,620 Enter, just do a quick sample, 1 plus 2 indeed equals 3. 1109 00:51:13,620 --> 00:51:16,410 As an aside, suppose I had taken a different approach 1110 00:51:16,410 --> 00:51:19,508 to importing the whole CS50 library, functionally, it's the same. 1111 00:51:19,508 --> 00:51:21,550 You're not going to notice any performance impact here. 1112 00:51:21,550 --> 00:51:22,690 It's a small library. 1113 00:51:22,690 --> 00:51:25,680 But notice what does not work now, whereas it did work 1114 00:51:25,680 --> 00:51:31,110 in C. Python of Calculator.py, Enter, we see our first traceback deliberately 1115 00:51:31,110 --> 00:51:31,690 here. 1116 00:51:31,690 --> 00:51:33,570 So a traceback is just a term of art that 1117 00:51:33,570 --> 00:51:37,210 says, here is a trace back through all of the functions 1118 00:51:37,210 --> 00:51:38,250 that just got executed. 1119 00:51:38,250 --> 00:51:40,170 In the world of C, you might call this a stack 1120 00:51:40,170 --> 00:51:42,937 trace, stack being the operative word. 1121 00:51:42,937 --> 00:51:45,270 Recall that when we talked about the stack and the heap, 1122 00:51:45,270 --> 00:51:48,077 the stack, like a stack of trays, was all of the functions that 1123 00:51:48,077 --> 00:51:49,660 might get called, one after the other. 1124 00:51:49,660 --> 00:51:54,330 We had main, we had swap, then swap went away, and then main finished, recall. 1125 00:51:54,330 --> 00:51:58,020 So here's a trace back of all of the functions or code that got executed. 1126 00:51:58,020 --> 00:52:00,880 There's not really any functions other than my file itself. 1127 00:52:00,880 --> 00:52:02,350 Otherwise there'd be more detail.
1128 00:52:02,350 --> 00:52:05,580 But even though it's a little cryptic, we can perhaps infer from the output 1129 00:52:05,580 --> 00:52:09,960 here, NameError, so something related to the name of something: name 'get_int' 1130 00:52:09,960 --> 00:52:10,950 is not defined. 1131 00:52:10,950 --> 00:52:14,190 And this of course, happens on line 3 over there. 1132 00:52:14,190 --> 00:52:15,520 All right, so why is that? 1133 00:52:15,520 --> 00:52:19,170 Well, Python essentially allows us to namespace 1134 00:52:19,170 --> 00:52:21,750 our functions that come from libraries. 1135 00:52:21,750 --> 00:52:25,290 There was a problem in C. If you were using the CS50 library, 1136 00:52:25,290 --> 00:52:27,180 and thus had access to get_int, get_string, 1137 00:52:27,180 --> 00:52:29,850 and so forth, you could not use another library 1138 00:52:29,850 --> 00:52:31,590 that had the same function names. 1139 00:52:31,590 --> 00:52:33,510 They would collide, and the compiler would not 1140 00:52:33,510 --> 00:52:36,030 know how to link them together correctly. 1141 00:52:36,030 --> 00:52:41,520 In Python, and other languages like JavaScript, and in Java, 1142 00:52:41,520 --> 00:52:45,270 you have support for effectively what would be called namespaces. 1143 00:52:45,270 --> 00:52:50,370 You can isolate variables and function names to their own namespace, 1144 00:52:50,370 --> 00:52:52,590 like their own container in memory. 1145 00:52:52,590 --> 00:52:55,560 And what this means is, if you import all of CS50, 1146 00:52:55,560 --> 00:52:59,730 you have to say that the get_int you want is inside the CS50 library.
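The same rule applies to any module, not just CS50's. The sketch below uses the standard library's math module as a stand-in, since the CS50 library has to be installed separately; the choice of sqrt is arbitrary.

```python
import math

# After "import math", names must be qualified with the module's name.
print(math.sqrt(16))

# The unqualified name is not defined, which raises the same kind of
# NameError described above.
try:
    sqrt(16)
except NameError as error:
    print(error)

# Importing one name specifically makes it usable without the prefix.
from math import sqrt

print(sqrt(16))
```

Two libraries can each define a function with the same name without colliding, because each name lives inside its own module.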
1147 00:52:59,730 --> 00:53:03,180 So just like with the image blurring, and the image edges 1148 00:53:03,180 --> 00:53:08,430 before, where I had to specify image dot and image filter dot, similarly here, 1149 00:53:08,430 --> 00:53:11,970 am I specifying with a dot operator, albeit a little differently, that I 1150 00:53:11,970 --> 00:53:14,410 want CS50.getInt in both places. 1151 00:53:14,410 --> 00:53:18,120 And now if I rerun Python of Calculator.py, 1 and 2, 1152 00:53:18,120 --> 00:53:19,860 now we're back in business. 1153 00:53:19,860 --> 00:53:20,790 Which one is better? 1154 00:53:20,790 --> 00:53:24,790 Generally speaking, it depends on just how many functions 1155 00:53:24,790 --> 00:53:26,040 you're using from the library. 1156 00:53:26,040 --> 00:53:29,040 If you're using a whole bunch of functions, just import the whole thing. 1157 00:53:29,040 --> 00:53:33,333 If you're only using maybe one or two, import them line by line. 1158 00:53:33,333 --> 00:53:35,750 All right, so let's go ahead and make a little tweak here. 1159 00:53:35,750 --> 00:53:38,917 Let's get rid of this library and take this training wheel off, 1160 00:53:38,917 --> 00:53:41,750 too, as quickly as we introduced it, though for problem set six 1161 00:53:41,750 --> 00:53:44,310 you'll be able to use all of these same functions. 1162 00:53:44,310 --> 00:53:48,110 Suppose I get rid of this, and I just use the input function, 1163 00:53:48,110 --> 00:53:51,710 just like I did by replacing getString earlier. 1164 00:53:51,710 --> 00:53:54,710 Let me go ahead now and run this version of the code. 1165 00:53:54,710 --> 00:54:00,964 Python of Calculator.py, OK, how about 1 plus 2 equals 3. 1166 00:54:00,964 --> 00:54:02,660 Huh. 1167 00:54:02,660 --> 00:54:05,330 All right, obviously wrong, incorrect. 1168 00:54:05,330 --> 00:54:09,890 Can anyone explain what just happened, based on instincts? 1169 00:54:09,890 --> 00:54:10,890 What just happened here? 
1170 00:54:10,890 --> 00:54:11,390 Yeah. 1171 00:54:11,390 --> 00:54:12,620 AUDIENCE: You want an answer? 1172 00:54:12,620 --> 00:54:13,745 DAVID J. MALAN: Sure, yeah. 1173 00:54:13,745 --> 00:54:17,930 AUDIENCE: Say you have a number of strings that don't have Ints, 1174 00:54:17,930 --> 00:54:21,320 so you would part with them and say, printing one, two, better. 1175 00:54:21,320 --> 00:54:24,650 DAVID J. MALAN: Exactly, Python is interpreting, or treating, 1176 00:54:24,650 --> 00:54:26,810 both x and y as strings, which is actually 1177 00:54:26,810 --> 00:54:29,120 what the input function returns by default. 1178 00:54:29,120 --> 00:54:32,150 And so plus is now being interpreted as concatenation, as we defined it 1179 00:54:32,150 --> 00:54:32,660 earlier. 1180 00:54:32,660 --> 00:54:35,780 So x plus y isn't x plus y mathematically, 1181 00:54:35,780 --> 00:54:38,480 but in terms of string joining, just like in Scratch. 1182 00:54:38,480 --> 00:54:41,690 So that's why we're getting 12, or really one two, 1183 00:54:41,690 --> 00:54:43,040 which isn't itself a number. 1184 00:54:43,040 --> 00:54:44,180 It, too, is another string. 1185 00:54:44,180 --> 00:54:45,950 So we somehow need to convert things. 1186 00:54:45,950 --> 00:54:49,040 And we didn't have this ability quite as easily in C. 1187 00:54:49,040 --> 00:54:52,670 We did have, like, the atoi function, ASCII to integer, 1188 00:54:52,670 --> 00:54:54,270 which did allow you to do this. 1189 00:54:54,270 --> 00:54:59,390 The analog in Python is actually just to do a cast, a typecast, using Int. 1190 00:54:59,390 --> 00:55:02,750 So just like in C, you can use the keyword Int, 1191 00:55:02,750 --> 00:55:04,500 but you use it a little differently. 1192 00:55:04,500 --> 00:55:09,300 Notice that I'm not doing parenthesis Int close parenthesis before the value. 1193 00:55:09,300 --> 00:55:11,010 I'm using Int as a function. 1194 00:55:11,010 --> 00:55:13,430 So indeed, in Python, Int is a function. 
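[Editor's sketch] The difference between joining strings and adding numbers can be shown directly. The two variables below are hard-coded stand-ins for what input() would return (an assumption, so the example runs without a keyboard prompt); input() always hands back a str.

```python
# Values as input() would return them: always strings.
x = "1"
y = "2"

# With strings, + joins them end to end.
joined = x + y
print(joined)

# Converting each string with int() first gives arithmetic addition.
total = int(x) + int(y)
print(total)
```

Note that int is written in lowercase in real code, and it is called like a function, int(x), rather than placed in parentheses before the value as in a C cast.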
1195 00:55:13,430 --> 00:55:16,610 Float is a function, that you can pass values into, 1196 00:55:16,610 --> 00:55:18,270 to do this kind of conversion. 1197 00:55:18,270 --> 00:55:22,010 So now, if I run Python of Calculator.py, 1 and 2, 1198 00:55:22,010 --> 00:55:25,430 now we're back in business, and getting the answer of 3. 1199 00:55:25,430 --> 00:55:27,240 But there's kind of a catch here. 1200 00:55:27,240 --> 00:55:28,430 There's always going to be a trade-off. 1201 00:55:28,430 --> 00:55:30,560 Like that sounds amazing that it just works in this way. 1202 00:55:30,560 --> 00:55:32,450 We can throw away the CS50 library already. 1203 00:55:32,450 --> 00:55:37,130 But what if the user accidentally types, or maliciously types in, 1204 00:55:37,130 --> 00:55:39,035 like a cat, instead of a number. 1205 00:55:39,035 --> 00:55:40,910 Damn, well, there's one of these trace backs. 1206 00:55:40,910 --> 00:55:42,780 Like, now my program has crashed. 1207 00:55:42,780 --> 00:55:45,342 This is similar in spirit to the kinds of segfaults 1208 00:55:45,342 --> 00:55:46,550 that you might have had in C. 1209 00:55:46,550 --> 00:55:47,840 But they're not segfaults per se. 1210 00:55:47,840 --> 00:55:49,507 It doesn't necessarily relate to memory. 1211 00:55:49,507 --> 00:55:55,290 This time it relates to actual runtime values, not being as expected. 1212 00:55:55,290 --> 00:55:58,250 So this time it's not a name error, it's a value error, 1213 00:55:58,250 --> 00:56:02,580 invalid literal for Int with base 10 quote unquote "cat." 1214 00:56:02,580 --> 00:56:06,800 So, again, it's written for sort of a programmer, more than sort 1215 00:56:06,800 --> 00:56:09,650 of a typical person, because it's pretty arcane, the language here. 1216 00:56:09,650 --> 00:56:10,900 But let's try to interpret it. 1217 00:56:10,900 --> 00:56:14,862 Invalid literal, a literal is just something someone typed for Int, which 1218 00:56:14,862 --> 00:56:16,320 is the function name, with base 10. 
1219 00:56:16,320 --> 00:56:18,170 It's just defaulting to decimal numbers. 1220 00:56:18,170 --> 00:56:20,415 Cat is apparently not a decimal number. 1221 00:56:20,415 --> 00:56:23,040 It doesn't look like it, therefore it can't be treated like it. 1222 00:56:23,040 --> 00:56:24,930 Therefore, there's a value error. 1223 00:56:24,930 --> 00:56:26,750 So what can we do? 1224 00:56:26,750 --> 00:56:30,200 Unfortunately, you would have to somehow catch this error. 1225 00:56:30,200 --> 00:56:32,450 And the only way to do that in Python really 1226 00:56:32,450 --> 00:56:34,970 is by way of another feature that C did not have, 1227 00:56:34,970 --> 00:56:37,400 namely, what are called exceptions. 1228 00:56:37,400 --> 00:56:42,080 An exception is exactly what just happened, name error, value error. 1229 00:56:42,080 --> 00:56:45,590 They are things that can go wrong when your Python code is running, 1230 00:56:45,590 --> 00:56:50,670 that aren't necessarily going to be detected until you run your code. 1231 00:56:50,670 --> 00:56:56,240 So in Python, and in JavaScript, and in Java, and other more modern languages, 1232 00:56:56,240 --> 00:56:59,240 there's this ability to actually try to do something, 1233 00:56:59,240 --> 00:57:01,015 except if something goes wrong. 1234 00:57:01,015 --> 00:57:03,140 And in fact, I'm going to introduce a bit of syntax 1235 00:57:03,140 --> 00:57:05,557 here, even though we won't have to use this much just yet. 1236 00:57:05,557 --> 00:57:09,980 Instead of just blindly converting x to an Int, let me go ahead 1237 00:57:09,980 --> 00:57:11,970 and try to do that. 1238 00:57:11,970 --> 00:57:15,380 And if there's an exception, go ahead and say something 1239 00:57:15,380 --> 00:57:22,280 like print, that is not an Int. 1240 00:57:22,280 --> 00:57:25,538 And then I'm going to do something like exit, right there. 1241 00:57:25,538 --> 00:57:27,080 And let me go ahead and do this here. 
1242 00:57:27,080 --> 00:57:31,370 Let me try to get y, except if there's an exception. 1243 00:57:31,370 --> 00:57:35,997 Then let me go ahead and say, again, that is not an Int exclamation point. 1244 00:57:35,997 --> 00:57:38,330 And then I'm going to exit from there, too. Otherwise I'll 1245 00:57:38,330 --> 00:57:39,860 go ahead and print x plus y. 1246 00:57:39,860 --> 00:57:46,460 If I run Python of Calculator.py now, whoops, oh, 1247 00:57:46,460 --> 00:57:48,680 forgot my close quote, sorry. 1248 00:57:48,680 --> 00:57:54,560 All right, so close quote, Python of Calculator.py, 1 and 2 still work. 1249 00:57:54,560 --> 00:57:57,800 But if I try to type in something wrong like cat, now 1250 00:57:57,800 --> 00:57:59,310 it actually detects the error. 1251 00:57:59,310 --> 00:58:01,850 So what is the CS50 library in Python doing? 1252 00:58:01,850 --> 00:58:05,600 It's actually doing that try and except for you, because suffice it to say, 1253 00:58:05,600 --> 00:58:08,540 otherwise your programs for something simple, like a calculator, 1254 00:58:08,540 --> 00:58:09,900 start to get longer and longer. 1255 00:58:09,900 --> 00:58:13,160 So we factored that kind of logic out to the CS50 getInt 1256 00:58:13,160 --> 00:58:14,690 function and get float function. 1257 00:58:14,690 --> 00:58:18,783 But underneath the hood, they're essentially doing this, try except, 1258 00:58:18,783 --> 00:58:20,450 but they're being a little more precise. 1259 00:58:20,450 --> 00:58:24,450 They're detecting a specific error, and they are doing it in a loop, 1260 00:58:24,450 --> 00:58:27,050 so that these functions will get executed again and again. 1261 00:58:27,050 --> 00:58:30,710 In fact, the best way to do this is to say except if there's a value error, 1262 00:58:30,710 --> 00:58:34,078 then print that error message out to the user. 1263 00:58:34,078 --> 00:58:36,870 And again, let's not get too into the weeds here with this feature. 
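[Editor's sketch] A rough illustration of "try, except ValueError, in a loop" as described above. This is not the CS50 library's actual source; first_int is a hypothetical helper, and it reads from a list of simulated answers rather than calling input(), an assumption made so the example runs non-interactively.

```python
def first_int(responses):
    """Return the first response that converts to an int, or None if none does."""
    for text in responses:
        try:
            return int(text)
        except ValueError:
            # int() raises ValueError for text such as "cat"; report it and keep going.
            print("That is not an int!")
    return None


# Simulated user input: two bad answers, then a good one.
value = first_int(["cat", "", "2"])
print(value)
```

Catching ValueError specifically, rather than every possible exception, means unrelated bugs (a misspelled name, for instance) still surface as tracebacks instead of being silently reported as bad input.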
1264 00:58:36,870 --> 00:58:38,760 We've already put it into the CS50 library. 1265 00:58:38,760 --> 00:58:41,060 But that's why, for instance, we bootstrap things, 1266 00:58:41,060 --> 00:58:44,420 by just using these functions out of the box. 1267 00:58:44,420 --> 00:58:47,610 All right, let's do something more with our calculator here. 1268 00:58:47,610 --> 00:58:49,010 How about this. 1269 00:58:49,010 --> 00:58:51,890 In the world of C, we had another version 1270 00:58:51,890 --> 00:58:56,990 of this code, which actually did some division by way of-- 1271 00:58:56,990 --> 00:59:01,680 which actually did division of numbers, not just the addition herein. 1272 00:59:01,680 --> 00:59:05,990 So let me go ahead and close the C version, and let's focus only on Python 1273 00:59:05,990 --> 00:59:07,942 now, doing some of these same lines of code. 1274 00:59:07,942 --> 00:59:09,650 But I'm going to go ahead and just assume 1275 00:59:09,650 --> 00:59:12,140 that the user is going to cooperate and use proper input. 1276 00:59:12,140 --> 00:59:16,310 So from CS50, import getInt, that will deal with any errors for me. 1277 00:59:16,310 --> 00:59:23,640 X gets getInt, ask the user for an Int x, y equals getInt, 1278 00:59:23,640 --> 00:59:25,170 ask the user for an Int y. 1279 00:59:25,170 --> 00:59:27,010 And then, let's go ahead and do this. 1280 00:59:27,010 --> 00:59:31,110 Let's declare a variable called z, set it equal to x divided by y. 1281 00:59:31,110 --> 00:59:32,850 Then let's go ahead and print z. 1282 00:59:32,850 --> 00:59:37,240 Still no need for a format string, I can just print out the variable's value. 1283 00:59:37,240 --> 00:59:39,240 Let me go ahead and run Python of Calculator.py. 1284 00:59:39,240 --> 00:59:43,650 Let me do 1, 10, and I get 0.1. 1285 00:59:43,650 --> 00:59:49,260 What did I get in C, though, if you think back. 1286 00:59:49,260 --> 00:59:52,076 What would have happened in C? 1287 00:59:52,076 --> 00:59:53,420 AUDIENCE: Zero? 
1288 00:59:53,420 --> 00:59:55,640 DAVID J. MALAN: Yeah, we would have gotten zero in C. 1289 00:59:55,640 --> 00:59:57,998 But why, in C, when you divide one Int by another, 1290 00:59:57,998 --> 00:59:59,915 and those Ints are like 1 and 10 respectively? 1291 00:59:59,915 --> 01:00:01,677 AUDIENCE: It'll give you an integer back. 1292 01:00:01,677 --> 01:00:03,260 DAVID J. MALAN: It will give you what? 1293 01:00:03,260 --> 01:00:04,343 AUDIENCE: An integer back. 1294 01:00:04,343 --> 01:00:07,910 DAVID J. MALAN: It will give you an integer back, and, unfortunately, 0.1, 1295 01:00:07,910 --> 01:00:09,860 the integer part of it is indeed zero. 1296 01:00:09,860 --> 01:00:11,970 So this was an example of truncation. 1297 01:00:11,970 --> 01:00:14,540 So truncation was an issue in C. But it would 1298 01:00:14,540 --> 01:00:17,450 seem as though this is no longer a problem in Python, 1299 01:00:17,450 --> 01:00:21,290 insofar as the division operator actually handles that for us. 1300 01:00:21,290 --> 01:00:24,230 As an aside, if you want the old behavior, because it actually 1301 01:00:24,230 --> 01:00:27,020 is sometimes useful for rounding or flooring values, 1302 01:00:27,020 --> 01:00:29,570 you can actually use two slashes. 1303 01:00:29,570 --> 01:00:31,620 And now you get the C behavior. 1304 01:00:31,620 --> 01:00:33,710 So that now 1 divided by 10 is zero. 1305 01:00:33,710 --> 01:00:36,230 So you don't give up that capability, but at least it 1306 01:00:36,230 --> 01:00:37,610 does a more sensible default. 1307 01:00:37,610 --> 01:00:41,030 Most people, especially new programmers, when dividing one value by another, 1308 01:00:41,030 --> 01:00:44,000 would want to get 0.1, not 0, for reasons 1309 01:00:44,000 --> 01:00:46,100 that indeed we had to explain weeks ago. 1310 01:00:46,100 --> 01:00:49,940 But what about another problem we had with the world of floats before, 1311 01:00:49,940 --> 01:00:52,040 whereby there is imprecision? 
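[Editor's sketch] The two division operators side by side, using the lecture's values of 1 and 10. The negative case at the end is an addition not covered in the lecture: Python's // rounds down (floors), whereas C's integer division truncates toward zero, so the two agree only when the result is not negative.

```python
x = 1
y = 10

true_quotient = x / y     # a single slash always produces a float
floor_quotient = x // y   # a double slash rounds down to a whole number

print(true_quotient)
print(floor_quotient)

# With a negative operand, flooring and truncation differ:
# C would give 0 for -1 / 10, while Python's // gives -1.
negative_floor = -1 // 10
print(negative_floor)
```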
1312 01:00:52,040 --> 01:00:54,980 Let me go ahead and, somewhat cryptically, print out the value of z 1313 01:00:54,980 --> 01:00:55,860 as follows. 1314 01:00:55,860 --> 01:00:58,340 I'm going to format it using an f-string. 1315 01:00:58,340 --> 01:01:02,720 And I'm going to go ahead and format, not just z, because this is essentially 1316 01:01:02,720 --> 01:01:03,450 the same thing. 1317 01:01:03,450 --> 01:01:06,620 Notice this, if I do Python of Calculator.py, 1 and 10, 1318 01:01:06,620 --> 01:01:09,770 I get, by default, just one significant digit. 1319 01:01:09,770 --> 01:01:13,920 But if I use this syntax in Python, which we won't have to use often, 1320 01:01:13,920 --> 01:01:16,550 I can actually do, like I did in C before, 1321 01:01:16,550 --> 01:01:19,650 50 significant digits after the decimal point. 1322 01:01:19,650 --> 01:01:24,020 So now let me rerun Python of Calculator.py 1 and 10, 1323 01:01:24,020 --> 01:01:26,990 and let's see if floating point imprecision is still with us. 1324 01:01:26,990 --> 01:01:28,280 Unfortunately, it is. 1325 01:01:28,280 --> 01:01:30,950 And you can see as much here, the f-string, the format string, 1326 01:01:30,950 --> 01:01:33,990 is just showing us now 50 digits instead of the default one. 1327 01:01:33,990 --> 01:01:36,110 So we've not solved all problems. 1328 01:01:36,110 --> 01:01:38,845 But we have solved at least some. 1329 01:01:38,845 --> 01:01:41,720 All right, before we pivot away from a mere calculator, any questions 1330 01:01:41,720 --> 01:01:45,350 now on syntax or concepts or the like? 1331 01:01:45,350 --> 01:01:46,070 Yeah. 1332 01:01:46,070 --> 01:01:49,320 AUDIENCE: Do you think the double slash you get 1333 01:01:49,320 --> 01:01:51,937 has merit, how do you comment on that? 1334 01:01:51,937 --> 01:01:53,270 DAVID J. MALAN: How do you what? 1335 01:01:53,270 --> 01:01:54,228 Oh, how do you comment. 
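[Editor's sketch] Looking back at the 50-digit formatting shown a moment ago: the sketch below assumes the syntax on screen was the standard format specifier .50f inside an f-string, which is the usual way to request 50 digits after the decimal point.

```python
z = 1 / 10

# Default formatting shows the shortest text that round-trips to the same float.
default_text = f"{z}"

# ".50f" asks for exactly 50 digits after the decimal point,
# which exposes the binary approximation stored for 0.1.
long_text = f"{z:.50f}"

print(default_text)
print(long_text)
```

The digits that appear beyond the first "1" are not zeros, which is the floating-point imprecision the lecture refers to.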
1336 01:01:54,228 --> 01:01:57,410 Really good question, if you're using double slash for division 1337 01:01:57,410 --> 01:01:59,870 with flooring or truncation, like I described, 1338 01:01:59,870 --> 01:02:01,850 how do you do a comment in Python. 1339 01:02:01,850 --> 01:02:03,380 This is a comment. 1340 01:02:03,380 --> 01:02:05,930 And the convention is actually to use a complete sentence, 1341 01:02:05,930 --> 01:02:07,473 like with a capital T here. 1342 01:02:07,473 --> 01:02:09,890 You don't need a period unless there's multiple sentences. 1343 01:02:09,890 --> 01:02:12,840 And technically, it should be above the line of code by convention. 1344 01:02:12,840 --> 01:02:15,120 So you would use a hash symbol instead. 1345 01:02:15,120 --> 01:02:16,080 Good question. 1346 01:02:16,080 --> 01:02:17,420 I haven't seen those yet. 1347 01:02:17,420 --> 01:02:20,750 All right, let's go ahead and make something else here, how about. 1348 01:02:20,750 --> 01:02:23,430 Let me go ahead and open up, for instance, 1349 01:02:23,430 --> 01:02:29,090 an example called Points1.c, which we saw a few weeks back. 1350 01:02:29,090 --> 01:02:33,530 And let me go ahead on the other side and create a file called Points.py. 1351 01:02:33,530 --> 01:02:36,890 This was a program, recall, that asked the user how many points they 1352 01:02:36,890 --> 01:02:39,388 lost on the first assignment. 1353 01:02:39,388 --> 01:02:41,180 And then it went ahead and just printed out 1354 01:02:41,180 --> 01:02:43,790 whether they lost fewer points than me, because I lost two, 1355 01:02:43,790 --> 01:02:47,117 if you recall the photo, more points than me, or the same points as me. 1356 01:02:47,117 --> 01:02:49,700 Let me go ahead and zoom out so we can see a bit more of this. 1357 01:02:49,700 --> 01:02:54,208 And let me now, on the top right here, go about implementing this in Python. 1358 01:02:54,208 --> 01:02:56,750 So I want to first prompt the user for some number of points. 
1359 01:02:56,750 --> 01:03:00,540 So from CS50 let's import getInt, so it handles the error-checking. 1360 01:03:00,540 --> 01:03:03,410 Let's then do points equals getInt, and ask 1361 01:03:03,410 --> 01:03:07,430 the user, how many points did you lose, question mark. 1362 01:03:07,430 --> 01:03:11,990 Then let's go ahead and say, if points less than two, which was my value, 1363 01:03:11,990 --> 01:03:15,800 print, you lost fewer points than me. 1364 01:03:15,800 --> 01:03:23,270 Otherwise, if it's else if points greater than 2, go ahead and print, 1365 01:03:23,270 --> 01:03:27,070 you lost more points than me. 1366 01:03:27,070 --> 01:03:30,800 Else let's go ahead and handle the final scenario, which is you 1367 01:03:30,800 --> 01:03:34,600 lost the same number of points as me. 1368 01:03:34,600 --> 01:03:39,230 Before I run this, does anyone want to point out a mistake I've already made? 1369 01:03:39,230 --> 01:03:39,730 Yeah. 1370 01:03:39,730 --> 01:03:41,390 AUDIENCE: Else if has to be elif. 1371 01:03:41,390 --> 01:03:44,690 DAVID J. MALAN: Yeah, so else if in C is actually now elif in Python. 1372 01:03:44,690 --> 01:03:45,780 It's a single word. 1373 01:03:45,780 --> 01:03:49,790 So let me change this to elif, and now cross my fingers, Python of Points.py, 1374 01:03:49,790 --> 01:03:53,330 suppose you lost three points on some assignment. 1375 01:03:53,330 --> 01:03:55,190 You lost more points than my two. 1376 01:03:55,190 --> 01:03:57,808 If you only lost one point, you lost fewer points than me. 1377 01:03:57,808 --> 01:03:58,850 So the logic is the same. 1378 01:03:58,850 --> 01:04:01,040 But notice the code is much tighter. 1379 01:04:01,040 --> 01:04:04,700 In 10 total lines, we did what was 24 lines, because we've 1380 01:04:04,700 --> 01:04:06,350 thrown away a lot of the syntax. 1381 01:04:06,350 --> 01:04:08,370 The curly braces are no longer necessary. 1382 01:04:08,370 --> 01:04:10,230 The parentheses are gone, the semicolons. 
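[Editor's sketch] The if / elif / else chain described above, wrapped in a function so it can be run without prompting. The function name compare and the constant MINE are illustrative choices, not names from the lecture; the lecture's version reads the number with the CS50 library and prints directly.

```python
MINE = 2  # the number of points the lecturer says he lost


def compare(points):
    """Return the message for a given number of lost points."""
    if points < MINE:
        return "You lost fewer points than me."
    elif points > MINE:  # "elif" is one word in Python, not "else if"
        return "You lost more points than me."
    else:
        return "You lost the same number of points as me."


print(compare(3))
print(compare(1))
print(compare(2))
```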
1383 01:04:10,230 --> 01:04:13,670 So this is why it just tends to be more pleasant pretty quickly, 1384 01:04:13,670 --> 01:04:16,310 using a language like this. 1385 01:04:16,310 --> 01:04:18,770 All right, let's do one other example here. 1386 01:04:18,770 --> 01:04:23,000 In C, recall that we were able to determine the parity of some number, 1387 01:04:23,000 --> 01:04:24,590 if something is even or odd. 1388 01:04:24,590 --> 01:04:29,000 Well, in Python, let me go ahead and create a file called Parity.py, 1389 01:04:29,000 --> 01:04:32,810 and let's look for a moment at the C version at left. 1390 01:04:32,810 --> 01:04:36,680 Here was the code in C that we used to determine the parity of a number. 1391 01:04:36,680 --> 01:04:39,800 And, really, the key takeaway from all these lines 1392 01:04:39,800 --> 01:04:41,290 was just the remainder operator. 1393 01:04:41,290 --> 01:04:42,540 And that one is still with us. 1394 01:04:42,540 --> 01:04:44,998 So this is a simple demonstration, just to make that point, 1395 01:04:44,998 --> 01:04:48,770 if in Python, I want to determine whether a number is even or odd. 1396 01:04:48,770 --> 01:04:53,150 Well, let's go ahead and from CS50, import getInt, then let's go ahead 1397 01:04:53,150 --> 01:04:58,610 and get a number like n from the user, using getInt, and ask them for n. 1398 01:04:58,610 --> 01:05:04,220 And then let's go ahead and say, if n percent sign 2 equals 0, 1399 01:05:04,220 --> 01:05:08,270 then let's go ahead and print quote unquote "Even." 1400 01:05:08,270 --> 01:05:13,753 Else let's go ahead and print out Odd, but before I run this, 1401 01:05:13,753 --> 01:05:16,670 anyone want to instinctively, even though we've not talked about this, 1402 01:05:16,670 --> 01:05:19,010 point out a mistake here? 1403 01:05:19,010 --> 01:05:19,810 What I did wrong? 1404 01:05:19,810 --> 01:05:20,810 AUDIENCE: Double equals. 1405 01:05:20,810 --> 01:05:22,435 DAVID J. MALAN: Yeah, so double equals. 
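[Editor's sketch] The parity check with the comparison written as a double equals sign, as the audience member points out. It is wrapped in a hypothetical function named parity so it runs without user input.

```python
def parity(n):
    """Return "Even" or "Odd" using the remainder operator."""
    # == compares for equality; a single = would be assignment and is a syntax error here.
    if n % 2 == 0:
        return "Even"
    else:
        return "Odd"


print(parity(50))
print(parity(7))
```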
1406 01:05:22,435 --> 01:05:25,850 Again, so even though some of the stuff is changing, some of the same ideas 1407 01:05:25,850 --> 01:05:26,430 are the same. 1408 01:05:26,430 --> 01:05:28,520 So this, too, should be a double equal sign, 1409 01:05:28,520 --> 01:05:30,620 because I'm comparing for equality here. 1410 01:05:30,620 --> 01:05:32,153 And why is this the right math? 1411 01:05:32,153 --> 01:05:34,070 Well, if you divide a number by 2, it's either 1412 01:05:34,070 --> 01:05:36,290 going to have 0 or 1 as a remainder. 1413 01:05:36,290 --> 01:05:39,030 And that's going to determine if it's even or odd for us. 1414 01:05:39,030 --> 01:05:42,200 So let's run Python of Parity.py, type in a number like 50, 1415 01:05:42,200 --> 01:05:44,660 and hopefully we get, indeed, even. 1416 01:05:44,660 --> 01:05:46,910 So again, same idea, but now we're down to eight lines 1417 01:05:46,910 --> 01:05:48,560 of code instead of the 20. 1418 01:05:48,560 --> 01:05:50,810 Well, let's now do something a little more interactive 1419 01:05:50,810 --> 01:05:54,680 and a little representative of tools that actually ask the user questions. 1420 01:05:54,680 --> 01:06:00,320 In C, recall that we had this agreement program, Agree.c. 1421 01:06:00,320 --> 01:06:04,280 And then let's go ahead and implement a corresponding version in Python, 1422 01:06:04,280 --> 01:06:05,870 in a file called Agree.py. 1423 01:06:05,870 --> 01:06:08,570 And let's look at the C version first. 1424 01:06:08,570 --> 01:06:10,700 On the left, we used get char here. 1425 01:06:10,700 --> 01:06:13,190 And then we used the double vertical bars 1426 01:06:13,190 --> 01:06:16,430 to check if C is equal to capital Y or lowercase y. 1427 01:06:16,430 --> 01:06:18,500 And then we did the same thing for n for no. 1428 01:06:18,500 --> 01:06:24,380 And so let's go over here and let's do from CS50, import get-- 1429 01:06:24,380 --> 01:06:26,570 OK, get char is not a thing. 
1430 01:06:26,570 --> 01:06:29,090 And this here is another difference with Python. 1431 01:06:29,090 --> 01:06:32,510 There is no data type for individual characters. 1432 01:06:32,510 --> 01:06:34,640 You have strings, STRs, and, honestly, those 1433 01:06:34,640 --> 01:06:36,620 are fine, because if you have a STR that's 1434 01:06:36,620 --> 01:06:38,960 just one character, for all intents and purposes, 1435 01:06:38,960 --> 01:06:40,710 it is just a single character. 1436 01:06:40,710 --> 01:06:41,960 So it's just a simplification. 1437 01:06:41,960 --> 01:06:43,200 You don't have to think as much. 1438 01:06:43,200 --> 01:06:45,658 You don't have to worry about double quotes, single quotes. 1439 01:06:45,658 --> 01:06:49,350 In fact, in Python, you can use double quotes or single quotes, 1440 01:06:49,350 --> 01:06:50,930 so long as you're consistent. 1441 01:06:50,930 --> 01:06:52,970 So long as you're consistent, the single quotes 1442 01:06:52,970 --> 01:06:55,670 do not mean something different, like they do in C. 1443 01:06:55,670 --> 01:06:58,340 So I'm going to go ahead and use getString here, 1444 01:06:58,340 --> 01:07:01,220 although, strictly speaking, I could just use the input function, 1445 01:07:01,220 --> 01:07:02,480 as we saw before. 1446 01:07:02,480 --> 01:07:07,250 I'm going to get a string from the user that asks them this, getString, 1447 01:07:07,250 --> 01:07:10,557 quote unquote, "Do you agree," like a little checkbox or interactive prompt, 1448 01:07:10,557 --> 01:07:13,640 where you have to say yes or no, you want to agree to the following terms, 1449 01:07:13,640 --> 01:07:14,580 or whatnot. 1450 01:07:14,580 --> 01:07:18,110 And then let's translate the conditionals to Python, now, too. 
1451 01:07:18,110 --> 01:07:25,850 So if S equals equals quote-unquote "Y," or S equals equals lowercase y, 1452 01:07:25,850 --> 01:07:32,180 let's go ahead and print out agreed, just like in C, elif S equals 1453 01:07:32,180 --> 01:07:35,540 equals N or S equals equals little n. 1454 01:07:35,540 --> 01:07:38,058 Let's go ahead, then, and print out not agreed. 1455 01:07:38,058 --> 01:07:40,850 And you can already see, perhaps, one of the differences here, too. 1456 01:07:40,850 --> 01:07:43,700 Python is a little more English-like, in that 1457 01:07:43,700 --> 01:07:47,610 you just literally use the English word or, instead of the two vertical bars. 1458 01:07:47,610 --> 01:07:50,370 But it's ultimately doing the same thing. 1459 01:07:50,370 --> 01:07:53,390 Can we simplify this code a bit, though? 1460 01:07:53,390 --> 01:07:55,340 This would be a little annoying if we wanted 1461 01:07:55,340 --> 01:07:57,800 to add support, not just for big Y and little y, 1462 01:07:57,800 --> 01:08:04,230 but Yes or big Yes or little yes or big Y, lowercase e, capital S, right? 1463 01:08:04,230 --> 01:08:07,130 There's a lot of permutations of Y-E-S or just y, 1464 01:08:07,130 --> 01:08:08,720 that we ideally should tolerate. 1465 01:08:08,720 --> 01:08:11,470 Otherwise, the user is going to have to type exactly what we want, 1466 01:08:11,470 --> 01:08:12,770 which isn't very user-friendly. 1467 01:08:12,770 --> 01:08:15,050 Any intuition for how we could logically, 1468 01:08:15,050 --> 01:08:18,270 even if you don't know how to do it in code, make this better? 1469 01:08:18,270 --> 01:08:18,770 Yeah. 1470 01:08:18,770 --> 01:08:21,535 AUDIENCE: Write way over the list, and then up, 1471 01:08:21,535 --> 01:08:22,910 it's like the things in the list. 1472 01:08:22,910 --> 01:08:27,050 DAVID J. MALAN: Nice, yeah, we saw an example of a list before, just 0, 1, 2. 1473 01:08:27,050 --> 01:08:29,899 Why don't we take that same idea and ask a similar question. 
1474 01:08:29,899 --> 01:08:34,819 If S is in the following list of values, Y or little y, 1475 01:08:34,819 --> 01:08:38,600 or heck, let me add to the list now, yes, or maybe all capital YES. 1476 01:08:38,600 --> 01:08:40,779 And it's going to get a little annoying, admittedly, 1477 01:08:40,779 --> 01:08:43,750 but this is still better than the alternative, with all the or's. 1478 01:08:43,750 --> 01:08:45,640 I could do things like this, and so forth. 1479 01:08:45,640 --> 01:08:47,740 There's a whole bunch more permutations. 1480 01:08:47,740 --> 01:08:50,470 But let's leave this alone, and let me just go into here 1481 01:08:50,470 --> 01:08:57,279 and change this to, if S is in the following list of N or little n or no, 1482 01:08:57,279 --> 01:09:00,460 and I won't do as, let's just not worry about the weird capitalizations 1483 01:09:00,460 --> 01:09:01,600 there, for now. 1484 01:09:01,600 --> 01:09:02,800 Let's go ahead and run this. 1485 01:09:02,800 --> 01:09:05,950 Python of Agree.py, do I agree? 1486 01:09:05,950 --> 01:09:08,740 Y. OK, how about yes? 1487 01:09:08,740 --> 01:09:10,359 All right, how about big Yes. 1488 01:09:10,359 --> 01:09:11,850 OK, that does not seem to work. 1489 01:09:11,850 --> 01:09:14,350 Notice it did not say agreed, and it did not say not agreed. 1490 01:09:14,350 --> 01:09:15,410 It didn't detect it. 1491 01:09:15,410 --> 01:09:17,180 So how can I do this? 1492 01:09:17,180 --> 01:09:20,770 Well, you know what I could do? I don't really 1493 01:09:20,770 --> 01:09:22,240 need the uppercase and lowercase. 1494 01:09:22,240 --> 01:09:24,189 Let me tighten this list up a little bit. 1495 01:09:24,189 --> 01:09:27,640 And why don't I just force S to be lowercase. 1496 01:09:27,640 --> 01:09:31,000 S.lower, recall, whether it's one character or more, 1497 01:09:31,000 --> 01:09:34,180 is a function built into STRs now, strings in Python, 1498 01:09:34,180 --> 01:09:35,950 that forces the whole thing to lowercase. 
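[Editor's sketch] Membership testing with "in" combined with lower(), as just described. The function name check is hypothetical, and it takes the answer as an argument instead of prompting, an assumption made so the example can run unattended.

```python
def check(answer):
    """Classify an answer as agreement, disagreement, or neither."""
    # Lowercasing first means the lists only need lowercase spellings.
    s = answer.lower()
    if s in ["y", "yes"]:
        return "Agreed."
    elif s in ["n", "no"]:
        return "Not agreed."
    # Anything else matches neither list, like "Yes" did before lower() was added.
    return None


print(check("YeS"))
print(check("N"))
print(check("maybe"))
```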
1499 01:09:35,950 --> 01:09:37,450 So now, watch what I can do. 1500 01:09:37,450 --> 01:09:42,700 Python of Agree.py, little y, that works, big Y, that works. 1501 01:09:42,700 --> 01:09:47,840 Big Yes, that works, big Y, little e, big S, that also works. 1502 01:09:47,840 --> 01:09:50,910 So we've now handled, in one fell swoop, a whole bunch more logic. 1503 01:09:50,910 --> 01:09:52,910 And you know what, we can tighten this up a bit. 1504 01:09:52,910 --> 01:09:56,350 Here's an opportunity, in Python, for slightly better design. 1505 01:09:56,350 --> 01:10:00,070 What have I done in here that's a little redundant? 1506 01:10:00,070 --> 01:10:04,180 Does anyone see an opportunity to eliminate a redundancy, 1507 01:10:04,180 --> 01:10:06,820 doing something more times than you need? 1508 01:10:06,820 --> 01:10:08,030 Is it a stretch here? No? 1509 01:10:08,030 --> 01:10:08,530 Yep. 1510 01:10:08,530 --> 01:10:11,163 AUDIENCE: You can do S dot lower, above. 1511 01:10:11,163 --> 01:10:13,330 DAVID J. MALAN: We could move the S dot lower above. 1512 01:10:13,330 --> 01:10:15,310 Notice that I'm using S dot lower twice. 1513 01:10:15,310 --> 01:10:17,870 But it's going to give me the same answer both times. 1514 01:10:17,870 --> 01:10:20,080 So I could do a couple of things here. 1515 01:10:20,080 --> 01:10:24,700 I could, first of all, get rid of this lower, and get rid of this lower, 1516 01:10:24,700 --> 01:10:28,720 and then above this, maybe I could do something like this, S equal-- 1517 01:10:28,720 --> 01:10:31,600 I can't just do this, because that throws the value away. 1518 01:10:31,600 --> 01:10:34,240 It does the math, but it doesn't convert the string itself. 1519 01:10:34,240 --> 01:10:35,840 It's going to return a value. 1520 01:10:35,840 --> 01:10:38,260 So I have to say S equals s.lower. 1521 01:10:38,260 --> 01:10:39,340 I could do that. 1522 01:10:39,340 --> 01:10:41,840 Or, honestly, I can chain these things together. 
1523 01:10:41,840 --> 01:10:46,070 And this is not something we saw in C. If getString returns a string, 1524 01:10:46,070 --> 01:10:49,240 and strings have functions like lower in them, 1525 01:10:49,240 --> 01:10:52,330 you can chain these functions together, like this, and do dot this, 1526 01:10:52,330 --> 01:10:53,788 dot that, dot this other thing. 1527 01:10:53,788 --> 01:10:56,830 And eventually you want to stop, because it's going to become crazy long. 1528 01:10:56,830 --> 01:10:58,810 But this is reasonable, still fits on the screen. 1529 01:10:58,810 --> 01:10:59,560 It's pretty tight. 1530 01:10:59,560 --> 01:11:01,690 It does in one place what I was doing in two. 1531 01:11:01,690 --> 01:11:03,010 So I think that's OK. 1532 01:11:03,010 --> 01:11:05,980 Let me go ahead and do Python of Agree.py one last time. 1533 01:11:05,980 --> 01:11:07,120 Let's try it one last time. 1534 01:11:07,120 --> 01:11:10,360 And it's still working as intended. 1535 01:11:10,360 --> 01:11:12,700 Also if I tried those other inputs as well. 1536 01:11:12,700 --> 01:11:13,435 Yeah, question. 1537 01:11:13,435 --> 01:11:19,290 AUDIENCE: Could you add on like a for uppercase as well, for like upper, 1538 01:11:19,290 --> 01:11:22,700 and then cover all the functions where it's lowercase, for all the functions 1539 01:11:22,700 --> 01:11:25,450 where it's uppercase as well, or could you not just do this again. 1540 01:11:25,450 --> 01:11:29,095 1541 01:11:29,095 --> 01:11:30,470 DAVID J. MALAN: Let me summarize. 1542 01:11:30,470 --> 01:11:33,340 Could we handle uppercase and lowercase together in some form? 1543 01:11:33,340 --> 01:11:35,020 I'm actually doing that already. 1544 01:11:35,020 --> 01:11:36,370 I just have to pick a lane. 
1545 01:11:36,370 --> 01:11:39,307 I have to either be all lowercase in my logic or all uppercase, 1546 01:11:39,307 --> 01:11:41,140 and not worry about what the human types in, 1547 01:11:41,140 --> 01:11:43,240 because no matter what the human types in, I'm 1548 01:11:43,240 --> 01:11:44,950 forcing their input to lowercase. 1549 01:11:44,950 --> 01:11:48,280 And then I am using a lowercase list of values. 1550 01:11:48,280 --> 01:11:49,520 If I want to flip that, fine. 1551 01:11:49,520 --> 01:11:51,040 I just have to be self-consistent. 1552 01:11:51,040 --> 01:11:52,420 But I'm handling that already. 1553 01:11:52,420 --> 01:11:53,223 Yeah. 1554 01:11:53,223 --> 01:11:56,953 AUDIENCE: Are strings no longer an array of characters? 1555 01:11:56,953 --> 01:11:58,870 DAVID J. MALAN: A really good, loaded question: 1556 01:11:58,870 --> 01:12:02,080 are strings no longer an array of characters? 1557 01:12:02,080 --> 01:12:04,120 Conceptually, yes, they still are; underneath the hood, no. 1558 01:12:04,120 --> 01:12:06,190 They're a little more sophisticated than that, 1559 01:12:06,190 --> 01:12:08,590 because with strings, you have a few changes. 1560 01:12:08,590 --> 01:12:10,600 Not only do they have functions built into them, 1561 01:12:10,600 --> 01:12:12,580 because strings are now what we call objects, 1562 01:12:12,580 --> 01:12:14,500 in what's called object-oriented programming. 1563 01:12:14,500 --> 01:12:17,042 And we're going to keep seeing examples of this dot operator. 1564 01:12:17,042 --> 01:12:21,550 They are also immutable, so to speak, I-M-M-U-T-A-B-L-E. 1565 01:12:21,550 --> 01:12:25,180 Immutable means they cannot be changed, which means, unlike in C, 1566 01:12:25,180 --> 01:12:28,750 you can't go into a string and change its individual characters. 1567 01:12:28,750 --> 01:12:31,480 You can make a copy of the string that makes a change, 1568 01:12:31,480 --> 01:12:33,698 but you can't change the original string itself.
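Immutability as described above can be seen in a few lines. This is a small sketch with made-up variable names (`s`, `t`, `u`), not code from the lecture.

```python
s = "hi!"
t = s.upper()      # returns a new string; s itself is untouched
print(s, t)

try:
    s[0] = "H"     # strings do not support item assignment
except TypeError as error:
    print("TypeError:", error)

# To "change" a character, build a new string from pieces of the old one.
u = "H" + s[1:]
print(u)
```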
1569 01:12:33,698 --> 01:12:35,740 This is both a little annoying, maybe, sometimes. 1570 01:12:35,740 --> 01:12:38,365 But it's also pretty protective, because you can't do screw-ups 1571 01:12:38,365 --> 01:12:41,680 like I did weeks ago, when I was trying to copy S and call it T. 1572 01:12:41,680 --> 01:12:43,270 And then one affected the other. 1573 01:12:43,270 --> 01:12:47,080 Python, underneath the hood, is handling all of the memory management 1574 01:12:47,080 --> 01:12:48,550 and the pointers and all of that. 1575 01:12:48,550 --> 01:12:51,040 There are no pointers in Python. 1576 01:12:51,040 --> 01:12:55,840 So If that wasn't clear, all of that pain, if you will, all of that power, 1577 01:12:55,840 --> 01:13:00,280 is now handled by the language itself, not by us, the programmers. 1578 01:13:00,280 --> 01:13:02,440 All right, so let's introduce maybe some loops, 1579 01:13:02,440 --> 01:13:04,390 like we've been in the habit of doing. 1580 01:13:04,390 --> 01:13:08,170 Let me open up Meow.c, which was an example in C, just meowing 1581 01:13:08,170 --> 01:13:09,730 a bunch of times textually. 1582 01:13:09,730 --> 01:13:12,800 Let me create a file called Meow.py here on the right. 1583 01:13:12,800 --> 01:13:15,190 And notice on the left, this was correct code in C, 1584 01:13:15,190 --> 01:13:16,670 but it was kind of poorly designed. 1585 01:13:16,670 --> 01:13:17,170 Why? 1586 01:13:17,170 --> 01:13:19,450 Because it was a missed opportunity for a loop. 1587 01:13:19,450 --> 01:13:22,460 Why say something three times when you can say it just once? 1588 01:13:22,460 --> 01:13:25,990 So in Python, let me do it the poorly designed way first. 1589 01:13:25,990 --> 01:13:27,400 Let me print out meow. 1590 01:13:27,400 --> 01:13:31,210 And, like I generally should not, let me copy, paste it three times, 1591 01:13:31,210 --> 01:13:33,670 run Python of Meow.py, and it works. 1592 01:13:33,670 --> 01:13:35,318 OK, but not good practice. 
1593 01:13:35,318 --> 01:13:37,360 So let me go ahead and improve this a little bit. 1594 01:13:37,360 --> 01:13:38,990 And there's a few ways to do this. 1595 01:13:38,990 --> 01:13:44,050 If I wanted to do this three times, I could instead do something like this. 1596 01:13:44,050 --> 01:13:48,010 For i in range of 3, recall that that was the better version, 1597 01:13:48,010 --> 01:13:51,370 rather than arbitrarily enumerate numbers yourself, let me go ahead 1598 01:13:51,370 --> 01:13:53,490 and print out quote unquote "Meow." 1599 01:13:53,490 --> 01:13:56,077 Now if I run Python of Meow, still seems to work. 1600 01:13:56,077 --> 01:13:57,910 So it's a little tighter, and, my God, like, 1601 01:13:57,910 --> 01:13:59,952 programs can't really get much shorter than this. 1602 01:13:59,952 --> 01:14:04,300 We're down to two lines of code, no main function, no gratuitous syntax. 1603 01:14:04,300 --> 01:14:06,580 Let's now improve the design further, like we 1604 01:14:06,580 --> 01:14:09,550 did in C, by introducing a function called 1605 01:14:09,550 --> 01:14:11,230 meow, that actually does the meowing. 1606 01:14:11,230 --> 01:14:13,000 So this was our first abstraction, recall, 1607 01:14:13,000 --> 01:14:18,100 both in Scratch and in C. Let me focus now entirely on the Python version 1608 01:14:18,100 --> 01:14:18,760 here. 1609 01:14:18,760 --> 01:14:23,485 Let me go ahead and first define a function. 1610 01:14:23,485 --> 01:14:26,890 1611 01:14:26,890 --> 01:14:30,250 Let me first go ahead and do this, for i in range of 3, 1612 01:14:30,250 --> 01:14:33,430 let's assume for the moment that there's a meow function, 1613 01:14:33,430 --> 01:14:34,720 that I'm just going to call. 1614 01:14:34,720 --> 01:14:38,320 Let's now go ahead and define, using the Def key word, which we saw briefly 1615 01:14:38,320 --> 01:14:41,170 with the speller demonstration, a function 1616 01:14:41,170 --> 01:14:42,880 called meow that takes no arguments. 
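The two-line loop described above would look roughly like this; the lowercase "meow" text is an assumption about what was typed on screen.

```python
# range(3) yields 0, 1, 2, so the body runs exactly three times.
for i in range(3):
    print("meow")
```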
1617 01:14:42,880 --> 01:14:45,460 And all it does for now is print meow. 1618 01:14:45,460 --> 01:14:50,620 Let me now go ahead and run Python of Meow.py, Enter. Huh, one 1619 01:14:50,620 --> 01:14:51,950 of those tracebacks. 1620 01:14:51,950 --> 01:14:54,080 So this is another name error. 1621 01:14:54,080 --> 01:14:57,080 And, again, name meow is not defined. 1622 01:14:57,080 --> 01:14:59,080 What's your instinct here, even though we've not 1623 01:14:59,080 --> 01:15:00,760 tripped over this yet in Python? 1624 01:15:00,760 --> 01:15:03,130 Where does your mind go here? 1625 01:15:03,130 --> 01:15:03,670 Yeah. 1626 01:15:03,670 --> 01:15:06,080 AUDIENCE: Does it read top to bottom, left to right? 1627 01:15:06,080 --> 01:15:09,600 I'm guessing we could find a new case. 1628 01:15:09,600 --> 01:15:13,020 DAVID J. MALAN: Perfect. As smart as Python seems to be, 1629 01:15:13,020 --> 01:15:14,770 it still makes certain assumptions. 1630 01:15:14,770 --> 01:15:18,010 And if it hasn't seen a name yet, it just doesn't exist. 1631 01:15:18,010 --> 01:15:21,000 So if you want it to exist, we have to be a little clever here. 1632 01:15:21,000 --> 01:15:24,090 I could just flip it around, like this. 1633 01:15:24,090 --> 01:15:26,470 But this honestly isn't particularly good design. 1634 01:15:26,470 --> 01:15:26,970 Why? 1635 01:15:26,970 --> 01:15:30,390 Because now, if you're the reader of your code, whether you 1636 01:15:30,390 --> 01:15:32,970 wrote it or someone else, you kind of have to go fishing now. 1637 01:15:32,970 --> 01:15:34,560 Like where does this program begin? 1638 01:15:34,560 --> 01:15:38,130 And even though, yes, it's obvious that it begins on line four, logically, 1639 01:15:38,130 --> 01:15:40,710 like, if the file were longer, you're going to be annoyed 1640 01:15:40,710 --> 01:15:43,180 and fishing visually for the right lines of code. 1641 01:15:43,180 --> 01:15:44,397 So let's reintroduce main.
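The name error above comes from calling a function before its definition has been executed. The sketch below reproduces it safely by catching the error; the `try`/`except` wrapper and the `message` variable are additions for illustration and are not part of the lecture's meow.py.

```python
try:
    for i in range(3):
        meow()          # at this point no def for meow has run yet
except NameError as error:
    message = str(error)
    print("NameError:", message)


def meow():
    print("meow")


meow()  # works now, because the def statement above has been executed
```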
1642 01:15:44,397 --> 01:15:46,230 And indeed, this would be a common paradigm. 1643 01:15:46,230 --> 01:15:49,380 When you want to start having abstractions in your own functions, 1644 01:15:49,380 --> 01:15:53,460 just put your own code in main, so that, one, you can leave it up top, and two, 1645 01:15:53,460 --> 01:15:55,650 you can solve the problem we just encountered. 1646 01:15:55,650 --> 01:15:58,860 So let me define a function called main that has that same loop, 1647 01:15:58,860 --> 01:16:00,240 meowing three times. 1648 01:16:00,240 --> 01:16:02,040 But now watch what happens. 1649 01:16:02,040 --> 01:16:07,350 Let me go into my terminal and run Python of Meow.py, Enter. 1650 01:16:07,350 --> 01:16:07,850 Nothing. 1651 01:16:07,850 --> 01:16:10,500 1652 01:16:10,500 --> 01:16:14,050 All right, investigate this. 1653 01:16:14,050 --> 01:16:16,290 What could explain this symptom. 1654 01:16:16,290 --> 01:16:18,020 I have not told you the answer yet. 1655 01:16:18,020 --> 01:16:19,770 So all you have is your instinct, assuming 1656 01:16:19,770 --> 01:16:21,720 you've never touched Python before. 1657 01:16:21,720 --> 01:16:26,800 What might explain this symptom, where nothing is meowing? 1658 01:16:26,800 --> 01:16:27,300 Yeah? 1659 01:16:27,300 --> 01:16:28,970 AUDIENCE: Didn't run the main function. 1660 01:16:28,970 --> 01:16:31,178 DAVID J. MALAN: Yeah, I didn't run the main function. 1661 01:16:31,178 --> 01:16:33,390 So in C, this is functionality you get for free. 1662 01:16:33,390 --> 01:16:34,765 You have to have a main function. 1663 01:16:34,765 --> 01:16:37,580 But, heck, so long as you make it, it will be called for you. 1664 01:16:37,580 --> 01:16:41,390 In Python, this is just a convention, to create a main function, 1665 01:16:41,390 --> 01:16:43,200 borrowing a very common name for it. 1666 01:16:43,200 --> 01:16:46,320 But if you want to call that main function, you have to do it. 
1667 01:16:46,320 --> 01:16:48,110 So this looks a little weird, admittedly, 1668 01:16:48,110 --> 01:16:50,030 that you have to call your own main function now, 1669 01:16:50,030 --> 01:16:51,860 and it has to be at the bottom of the file, 1670 01:16:51,860 --> 01:16:55,040 because only once the interpreter gets to the bottom of the file, 1671 01:16:55,040 --> 01:16:58,460 have all of your functions been defined, higher up. 1672 01:16:58,460 --> 01:16:59,990 But this solves both problems. 1673 01:16:59,990 --> 01:17:02,450 It keeps your code, that's the main part of your code, 1674 01:17:02,450 --> 01:17:03,660 at the very top of the file. 1675 01:17:03,660 --> 01:17:06,980 So it's just obvious to you, and a TF, or any reader in the future, 1676 01:17:06,980 --> 01:17:09,140 where the program logically starts. 1677 01:17:09,140 --> 01:17:13,310 But it also ensures that main is not called until everything else, main 1678 01:17:13,310 --> 01:17:15,660 included, has been defined. 1679 01:17:15,660 --> 01:17:17,648 So this is another perfect example of we're 1680 01:17:17,648 --> 01:17:19,440 learning a new language for the first time. 1681 01:17:19,440 --> 01:17:21,020 You're not going to have heard all of the answers before. 1682 01:17:21,020 --> 01:17:24,830 Just apply some logic, as to, like, all right, what could explain this symptom. 1683 01:17:24,830 --> 01:17:28,190 Start to infer how the language does or doesn't work. 1684 01:17:28,190 --> 01:17:32,450 If I now go and run this, Python of Meow.py, now we're back in business. 1685 01:17:32,450 --> 01:17:35,360 And just so you have seen it, there is a quote 1686 01:17:35,360 --> 01:17:38,840 unquote "better" way of doing this, that solves different problems that we 1687 01:17:38,840 --> 01:17:42,050 are not going to encounter, certainly in these initial days. 
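The layout described above, with main defined at the top and called at the very bottom, can be sketched like this.

```python
def main():
    for i in range(3):
        meow()


def meow():
    print("meow")


# By the time this line runs, both main and meow have been defined.
main()
```

Nothing happens when the `def` statements run; the program only starts doing work at the final `main()` call.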
1688 01:17:42,050 --> 01:17:45,440 Typically, you would see in online tutorials or books, 1689 01:17:45,440 --> 01:17:49,400 something that looks like this, where you actually have a weird conditional 1690 01:17:49,400 --> 01:17:50,810 with multiple underscores. 1691 01:17:50,810 --> 01:17:54,470 That's functionally the same thing, but it solves problems with libraries, 1692 01:17:54,470 --> 01:17:57,840 if we ourselves were implementing a library or something similar in spirit. 1693 01:17:57,840 --> 01:18:00,882 But we're going to keep things simpler and just call main at the bottom, 1694 01:18:00,882 --> 01:18:03,355 because we're not going to encounter that problem just yet. 1695 01:18:03,355 --> 01:18:06,230 All right, let's make one change to this, just to show how it's done. 1696 01:18:06,230 --> 01:18:11,420 In C, the last version of meow also took command-line arguments, sorry, also 1697 01:18:11,420 --> 01:18:13,910 took arguments to the function meow. 1698 01:18:13,910 --> 01:18:16,490 So suppose that I want to factor this out. 1699 01:18:16,490 --> 01:18:19,250 And I want to just call meow as a better abstraction, where I just 1700 01:18:19,250 --> 01:18:21,080 say meow this number of times. 1701 01:18:21,080 --> 01:18:24,290 And I figure out how many times by just, like, putting in the number 3 1702 01:18:24,290 --> 01:18:26,990 or using getInt or something like that, to figure out 1703 01:18:26,990 --> 01:18:28,550 how many times to say meow. 1704 01:18:28,550 --> 01:18:31,820 Well, now, I have to define inside my meow function, an input, 1705 01:18:31,820 --> 01:18:38,330 let's call it n, and then use that, as by doing this, for i in range of n, 1706 01:18:38,330 --> 01:18:41,640 let me go ahead and print out meow that many times.
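Both ideas above fit in one short sketch: the conditional with the underscores, which is the standard `if __name__ == "__main__":` idiom, and a meow function that takes a count. Passing the literal 3 is an assumption; the lecture also mentions reading the number from the user.

```python
def main():
    meow(3)


def meow(n):
    # n needs no declared type, and the function needs no return type.
    for i in range(n):
        print("meow")


# True only when the file is run directly, not when it is imported.
if __name__ == "__main__":
    main()
```

With the guard in place, another file could import `meow` without triggering the three meows.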
1707 01:18:41,640 --> 01:18:43,820 So again, the only thing that's different from C 1708 01:18:43,820 --> 01:18:47,630 is we don't bother specifying return types for any of these functions, 1709 01:18:47,630 --> 01:18:52,230 and we don't bother specifying the type of our arguments or our variables. 1710 01:18:52,230 --> 01:18:54,930 So same ideas, simpler in some sense. 1711 01:18:54,930 --> 01:18:56,660 We're just throwing away keystrokes. 1712 01:18:56,660 --> 01:18:59,450 All right, let me run this one final time, Python of Meow.py, 1713 01:18:59,450 --> 01:19:02,390 and we still have the same program. 1714 01:19:02,390 --> 01:19:04,110 All right, let me pause here. 1715 01:19:04,110 --> 01:19:04,780 Any questions? 1716 01:19:04,780 --> 01:19:06,030 And I know this is going fast. 1717 01:19:06,030 --> 01:19:11,355 But hopefully, the C code is still somewhat familiar. 1718 01:19:11,355 --> 01:19:11,855 Yeah. 1719 01:19:11,855 --> 01:19:17,530 AUDIENCE: Is there any difference between global and local variables? 1720 01:19:17,530 --> 01:19:18,780 DAVID J. MALAN: Good question. 1721 01:19:18,780 --> 01:19:21,238 Is there any difference between global and local variables? 1722 01:19:21,238 --> 01:19:23,850 Short answer, yes, and we would run into that same problem: 1723 01:19:23,850 --> 01:19:25,320 if we declare a variable in one function, 1724 01:19:25,320 --> 01:19:27,445 another function is not going to have access to it. 1725 01:19:27,445 --> 01:19:30,660 We can solve that by defining variables globally. 1726 01:19:30,660 --> 01:19:32,760 But we don't have all of the features we had in C, 1727 01:19:32,760 --> 01:19:35,160 like there's no such thing as a constant in Python. 1728 01:19:35,160 --> 01:19:36,900 The mentality in the Python community is, 1729 01:19:36,900 --> 01:19:39,480 if you don't want some value to change, don't touch it. 1730 01:19:39,480 --> 01:19:40,630 Like just don't screw up.
1731 01:19:40,630 --> 01:19:42,240 So there are trade-offs here, too. 1732 01:19:42,240 --> 01:19:45,000 Some languages are stronger or more defensive than that. 1733 01:19:45,000 --> 01:19:48,990 But that, too, is part of the mindset with this particular language. 1734 01:19:48,990 --> 01:19:49,770 [SIREN] 1735 01:19:49,770 --> 01:19:50,645 DAVID J. MALAN: Yeah. 1736 01:19:50,645 --> 01:19:52,937 AUDIENCE: There is really only one green line, in the-- 1737 01:19:52,937 --> 01:19:54,437 DAVID J. MALAN: Oh, sorry, where's-- 1738 01:19:54,437 --> 01:19:55,080 say it louder. 1739 01:19:55,080 --> 01:19:58,342 AUDIENCE: There has only been one green line printed at a time. 1740 01:19:58,342 --> 01:20:00,050 DAVID J. MALAN: That is an amazing segue. 1741 01:20:00,050 --> 01:20:01,370 Let's come to that in just a moment, because we're 1742 01:20:01,370 --> 01:20:03,620 going to recreate also that Mario example, where 1743 01:20:03,620 --> 01:20:06,925 we had like the question marks for the coins and the vertical bars. 1744 01:20:06,925 --> 01:20:08,550 So let's come back to that in a second. 1745 01:20:08,550 --> 01:20:09,656 And your question? 1746 01:20:09,656 --> 01:20:13,362 AUDIENCE: If strings are immutable, and every time you like make a copy. 1747 01:20:13,362 --> 01:20:15,320 DAVID J. MALAN: Correct, strings are immutable. 1748 01:20:15,320 --> 01:20:19,220 Any time you seem to be modifying one, as with the lower function, 1749 01:20:19,220 --> 01:20:20,480 you're getting back a copy. 1750 01:20:20,480 --> 01:20:22,940 So it's taking a little more memory somewhere. 1751 01:20:22,940 --> 01:20:26,145 But you don't have to deal with it. Python's doing that for you. 1752 01:20:26,145 --> 01:20:28,892 AUDIENCE: So you don't free anything? 1753 01:20:28,892 --> 01:20:30,100 DAVID J. MALAN: Say it again? 1754 01:20:30,100 --> 01:20:31,226 You don't need what? 1755 01:20:31,226 --> 01:20:34,663 AUDIENCE: You don't free like taking leave on stuff.
1756 01:20:34,663 --> 01:20:36,330 DAVID J. MALAN: You don't free anything. 1757 01:20:36,330 --> 01:20:38,870 So if you weren't a big fan, over the past couple of weeks, 1758 01:20:38,870 --> 01:20:42,860 of malloc or free or memory or addresses, or all 1759 01:20:42,860 --> 01:20:44,990 of those low-level implementation details, 1760 01:20:44,990 --> 01:20:47,390 Python is the language for you, because all of that 1761 01:20:47,390 --> 01:20:49,340 is handled for you automatically. 1762 01:20:49,340 --> 01:20:50,780 Java does the same. 1763 01:20:50,780 --> 01:20:51,960 JavaScript does the same. 1764 01:20:51,960 --> 01:20:52,460 Yeah. 1765 01:20:52,460 --> 01:20:58,244 AUDIENCE: Each up for the variable, you put it before the name, use of the body 1766 01:20:58,244 --> 01:20:59,700 before the name, correct? 1767 01:20:59,700 --> 01:21:03,785 Well, if there isn't a main function in Python, how do you define those words? 1768 01:21:03,785 --> 01:21:05,910 DAVID J. MALAN: How do you define a global variable 1769 01:21:05,910 --> 01:21:07,493 if there's no main function in Python? 1770 01:21:07,493 --> 01:21:11,480 Global variables, by definition, always need to be outside of main, as well. 1771 01:21:11,480 --> 01:21:12,480 So that's not a problem. 1772 01:21:12,480 --> 01:21:15,300 If I wanted to have a variable that's outside of, 1773 01:21:15,300 --> 01:21:19,703 and, therefore, global to all of these, like global-- 1774 01:21:19,703 --> 01:21:22,620 actually, don't use the word global, that's a special word in Python-- 1775 01:21:22,620 --> 01:21:27,450 variable equals foo, F-O-O, just as an arbitrary string 1776 01:21:27,450 --> 01:21:31,410 value that a computer scientist would typically use, that is now global. 1777 01:21:31,410 --> 01:21:34,000 There are some caveats, though, as to how you access that. 1778 01:21:34,000 --> 01:21:36,010 But let's come back to that another time. 1779 01:21:36,010 --> 01:21:38,030 But that problem is solvable, too.
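A global defined outside every function, as just described, might look like the sketch below. The helper names `show` and `shadow` are hypothetical, and the shadowing behavior shown is one likely example of the caveats mentioned, not necessarily the one the lecture returns to later.

```python
variable = "foo"  # defined outside every function, so it is global


def main():
    show()


def show():
    # Reading a global from inside a function needs no extra syntax.
    print(variable)


def shadow():
    # Assigning inside a function creates a separate local name;
    # the global above is left unchanged.
    variable = "bar"
    return variable


main()
print(shadow(), variable)
```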
1780 01:21:38,030 --> 01:21:38,530 All right. 1781 01:21:38,530 --> 01:21:39,780 So let's go ahead and do this. 1782 01:21:39,780 --> 01:21:43,050 To come back to the question about the print command, let me go ahead 1783 01:21:43,050 --> 01:21:45,300 and create a file now called Mario.py. 1784 01:21:45,300 --> 01:21:47,700 Won't bother showing the C code anymore. 1785 01:21:47,700 --> 01:21:49,590 We'll focus just on the new language here. 1786 01:21:49,590 --> 01:21:54,540 But recall that, in Mario, we wanted to first do something like this. 1787 01:21:54,540 --> 01:21:57,600 This was a random screen from the side-scroller version 1 1788 01:21:57,600 --> 01:21:58,800 of Super Mario Brothers. 1789 01:21:58,800 --> 01:22:02,820 And we just want to print like three hashes to represent those three blocks. 1790 01:22:02,820 --> 01:22:04,950 Well, in Python, we could do something like this, 1791 01:22:04,950 --> 01:22:11,280 print, oh, sorry, for i in the range of 3, go ahead and print out quote unquote 1792 01:22:11,280 --> 01:22:11,828 "hash." 1793 01:22:11,828 --> 01:22:13,620 And I think this is pretty straightforward. 1794 01:22:13,620 --> 01:22:16,260 Python of Mario.py, we get our three hashes. 1795 01:22:16,260 --> 01:22:18,850 You could imagine parameterizing this now, though, 1796 01:22:18,850 --> 01:22:20,350 and getting actual user input. 1797 01:22:20,350 --> 01:22:21,730 So let's do that. 1798 01:22:21,730 --> 01:22:27,420 Let me go up here and let me go and say from CS50, import getInt, 1799 01:22:27,420 --> 01:22:31,090 and then let's get the input from the user. 1800 01:22:31,090 --> 01:22:33,210 So it actually is a value n, like, all right, 1801 01:22:33,210 --> 01:22:38,190 getInt the height of the column of bricks that you want to do. 1802 01:22:38,190 --> 01:22:42,270 And then, let's go ahead and print out n hashes instead of three. 1803 01:22:42,270 --> 01:22:43,560 So let me run this.
1804 01:22:43,560 --> 01:22:45,385 Let's print out like five hashes. 1805 01:22:45,385 --> 01:22:47,760 OK, one, two, three, four, five, that seems to work, too. 1806 01:22:47,760 --> 01:22:49,677 And it's going to work for any positive value. 1807 01:22:49,677 --> 01:22:53,400 But it's not going to work for, how about negative 1? 1808 01:22:53,400 --> 01:22:54,660 That just doesn't do anything. 1809 01:22:54,660 --> 01:22:55,747 But that seems OK. 1810 01:22:55,747 --> 01:22:58,830 But also recall that it's not going to work if the user types in something 1811 01:22:58,830 --> 01:23:03,990 weird, like, oh, sorry, it is going to work if the user types in something 1812 01:23:03,990 --> 01:23:05,790 weird like cat, why? 1813 01:23:05,790 --> 01:23:08,820 We're using CS50's getInt function, which is 1814 01:23:08,820 --> 01:23:11,710 handling all of those headaches for us. 1815 01:23:11,710 --> 01:23:15,180 But, what if the user indeed types a negative number? 1816 01:23:15,180 --> 01:23:16,110 We're tolerating that. 1817 01:23:16,110 --> 01:23:17,860 So that was the bug I wanted to highlight. 1818 01:23:17,860 --> 01:23:20,250 It would be nice to re-prompt them and re-prompt them. 1819 01:23:20,250 --> 01:23:22,560 And in C, what was the programming construct we 1820 01:23:22,560 --> 01:23:25,020 used when we wanted to ask the user a question. 1821 01:23:25,020 --> 01:23:29,280 And then, if they didn't cooperate, prompt them again, prompt them again. 1822 01:23:29,280 --> 01:23:29,890 What was that? 1823 01:23:29,890 --> 01:23:30,390 Yeah. 1824 01:23:30,390 --> 01:23:30,750 AUDIENCE: Do while loop. 1825 01:23:30,750 --> 01:23:32,100 DAVID J. MALAN: Yeah, do while loop, right? 1826 01:23:32,100 --> 01:23:34,830 That was useful, because it's almost the same as a while loop. 
1827 01:23:34,830 --> 01:23:38,100 But instead of checking a condition, and then doing something, 1828 01:23:38,100 --> 01:23:39,948 you do something and then check a condition, 1829 01:23:39,948 --> 01:23:42,240 which makes sense with user input, because what are you 1830 01:23:42,240 --> 01:23:44,615 even going to check if the user hasn't done anything yet? 1831 01:23:44,615 --> 01:23:46,200 You need that inverted logic. 1832 01:23:46,200 --> 01:23:50,010 Unfortunately in Python, there is no do while loop. 1833 01:23:50,010 --> 01:23:51,300 There is a for loop. 1834 01:23:51,300 --> 01:23:52,740 There is a while loop. 1835 01:23:52,740 --> 01:23:55,590 And frankly, those are enough to recreate this idea. 1836 01:23:55,590 --> 01:23:59,160 And the way to do this in Python, the Pythonic way, which 1837 01:23:59,160 --> 01:24:02,160 is another term of art in the community, is to say this. 1838 01:24:02,160 --> 01:24:06,300 Deliberately induce an infinite loop, while True, with capital T for true. 1839 01:24:06,300 --> 01:24:09,930 And then do what you got to do, like get an Int from a user, 1840 01:24:09,930 --> 01:24:12,060 asking them for the height of this thing. 1841 01:24:12,060 --> 01:24:18,270 And then, if that is what you want, like a number greater than zero, go ahead 1842 01:24:18,270 --> 01:24:20,020 and break out of the loop. 1843 01:24:20,020 --> 01:24:25,440 So this is how, in Python, you could recreate the idea of a do while loop. 1844 01:24:25,440 --> 01:24:27,315 You deliberately induce an infinite loop. 1845 01:24:27,315 --> 01:24:29,190 So something's going to happen at least once. 1846 01:24:29,190 --> 01:24:32,280 Then, if you get the answer you want, you break out of it, 1847 01:24:32,280 --> 01:24:34,330 effectively achieving the same logic. 1848 01:24:34,330 --> 01:24:37,080 So this is the Pythonic way of doing a do while loop. 1849 01:24:37,080 --> 01:24:41,760 Let me go ahead and run Python of Mario.py, type in 3 this time. 
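The `while True` and `break` pattern above stands in for C's do while loop. In this sketch the `answers` iterator is an assumption that simulates a user typing -1, then 0, then 3, so the example runs without a keyboard; in the lecture the value comes from CS50's integer-reading function instead.

```python
answers = iter([-1, 0, 3])   # simulated user input, for illustration only

while True:
    n = next(answers)        # the body always runs at least once
    if n > 0:
        break                # leave the loop once the answer is acceptable

for i in range(n):
    print("#")
```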
1850 01:24:41,760 --> 01:24:44,670 And now I get back just the 3 hashes as well. 1851 01:24:44,670 --> 01:24:50,310 What if, though, I wanted to get rid of, how about ultimately 1852 01:24:50,310 --> 01:24:55,058 that CS50 library function, and also encapsulate this in a function. 1853 01:24:55,058 --> 01:24:57,100 Well, let's go ahead and tweak this a little bit. 1854 01:24:57,100 --> 01:24:59,070 Let me go ahead and remove this temporarily. 1855 01:24:59,070 --> 01:25:01,680 Give myself a main function, so I don't make the same mistake 1856 01:25:01,680 --> 01:25:03,360 as I did initially earlier. 1857 01:25:03,360 --> 01:25:07,110 And let me give myself a function called get height that takes no arguments. 1858 01:25:07,110 --> 01:25:10,620 And inside of that function is going to be that same code. 1859 01:25:10,620 --> 01:25:14,280 But I don't want to break in this case, I want to return n. 1860 01:25:14,280 --> 01:25:17,293 So, recall, that if you return from a function, you're done, 1861 01:25:17,293 --> 01:25:19,210 you're going to exit from right at that point. 1862 01:25:19,210 --> 01:25:20,320 So this would be fine. 1863 01:25:20,320 --> 01:25:22,680 You can just say return n inside of the loop, 1864 01:25:22,680 --> 01:25:25,320 or, if you would prefer to break out, you 1865 01:25:25,320 --> 01:25:26,940 could do something like this instead. 1866 01:25:26,940 --> 01:25:32,700 Break, and then down here, you could return, down here, 1867 01:25:32,700 --> 01:25:34,630 you could return n as well. 1868 01:25:34,630 --> 01:25:37,290 And let me make one point here before we go back up to main. 1869 01:25:37,290 --> 01:25:41,490 This is a little different from C. And this one's subtle. 1870 01:25:41,490 --> 01:25:47,250 What have I done here that in C would have been a bug, but is apparently not, 1871 01:25:47,250 --> 01:25:48,315 I claim, in Python. 1872 01:25:48,315 --> 01:25:50,860 1873 01:25:50,860 --> 01:25:52,220 It's super subtle, this one. 
1874 01:25:52,220 --> 01:25:52,720 Yeah. 1875 01:25:52,720 --> 01:25:55,911 AUDIENCE: So aren't we like defining mostly object, 1876 01:25:55,911 --> 01:25:59,470 like we're using it first, defining an object? 1877 01:25:59,470 --> 01:26:04,275 [INAUDIBLE] 1878 01:26:04,275 --> 01:26:07,150 DAVID J. MALAN: So similar, it's not quite that we're using it first. 1879 01:26:07,150 --> 01:26:10,980 So it's OK not to declare a variable with like the data type. 1880 01:26:10,980 --> 01:26:15,420 We've addressed that before, but on line 9, we're assigning n a value, it seems. 1881 01:26:15,420 --> 01:26:18,600 And then we return n on line 12. 1882 01:26:18,600 --> 01:26:20,190 But notice the indentation. 1883 01:26:20,190 --> 01:26:25,410 In the world of C, if we had declared a variable inside of a loop, on line 9, 1884 01:26:25,410 --> 01:26:28,200 it would have been scoped to that loop, which 1885 01:26:28,200 --> 01:26:31,530 means as soon as you get out of that loop, like further down in the program, 1886 01:26:31,530 --> 01:26:33,340 n would not exist. 1887 01:26:33,340 --> 01:26:36,090 It would be local to the curly braces therein. 1888 01:26:36,090 --> 01:26:39,720 Here, logically, curly braces are gone, but the indentation 1889 01:26:39,720 --> 01:26:44,250 makes clear that n is still inside of this loop, between lines 8 through 11. 1890 01:26:44,250 --> 01:26:47,280 But n is actually still in scope in Python. 1891 01:26:47,280 --> 01:26:50,380 The moment you create a variable in Python, for better or for worse, 1892 01:26:50,380 --> 01:26:53,760 It is available everywhere within that function, even outside 1893 01:26:53,760 --> 01:26:55,690 of the loop in which you defined it. 1894 01:26:55,690 --> 01:26:59,070 So this logic is actually OK in Python. 
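The scoping point above can be checked directly: a name first assigned inside a loop is still usable after the loop, anywhere in the same function. The `read_int` parameter is a hypothetical stand-in for whatever supplies the integer, added so the sketch runs without user input.

```python
def get_height(read_int):
    while True:
        n = read_int()   # n is first assigned inside the loop body
        if n > 0:
            break
    return n             # still in scope: names are scoped to the function


answers = iter([0, -2, 4])
print(get_height(lambda: next(answers)))
```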
1895 01:26:59,070 --> 01:27:02,138 In C, recall, to solve this same problem, 1896 01:27:02,138 --> 01:27:04,680 we would have had to do something a little hackish like this, 1897 01:27:04,680 --> 01:27:09,600 like define n up here on line 8, so that it exists, now, on line 10, 1898 01:27:09,600 --> 01:27:12,000 and so that it exists on line 13. 1899 01:27:12,000 --> 01:27:15,700 That is no longer an issue or need, in Python. 1900 01:27:15,700 --> 01:27:17,700 Once you create a variable, even if it's nested, 1901 01:27:17,700 --> 01:27:19,867 nested, nested inside of some loops or conditionals, 1902 01:27:19,867 --> 01:27:23,520 it still exists within the function itself. 1903 01:27:23,520 --> 01:27:27,870 All right, any questions then on this, before we now run this and then get 1904 01:27:27,870 --> 01:27:31,680 rid of the CS50 library again? 1905 01:27:31,680 --> 01:27:34,300 OK, so let me go ahead and get the height from the user. 1906 01:27:34,300 --> 01:27:36,758 Let's go ahead and create a variable in main called height. 1907 01:27:36,758 --> 01:27:38,460 Let's call this get height function. 1908 01:27:38,460 --> 01:27:43,380 And then let's use that height value, instead of something hardcoded there. 1909 01:27:43,380 --> 01:27:45,000 And let me see if this all works now. 1910 01:27:45,000 --> 01:27:46,410 Python of Mario.py. 1911 01:27:46,410 --> 01:27:49,110 Hopefully, I haven't messed up, but I did. 1912 01:27:49,110 --> 01:27:51,460 But this is an easy fix now. 1913 01:27:51,460 --> 01:27:51,960 Yeah. 1914 01:27:51,960 --> 01:27:53,085 AUDIENCE: Got to call main. 1915 01:27:53,085 --> 01:27:54,543 DAVID J. MALAN: I got to call main. 1916 01:27:54,543 --> 01:27:55,980 So again, I deleted that earlier. 1917 01:27:55,980 --> 01:27:56,920 But let me bring it back. 1918 01:27:56,920 --> 01:27:58,128 So I'm actually calling main. 1919 01:27:58,128 --> 01:28:02,190 Let me rerun Python of Mario.py, there we go, height 3. 
1920 01:28:02,190 --> 01:28:03,880 Now it seems to be working. 1921 01:28:03,880 --> 01:28:05,880 So let's do one last thing with Mario, just 1922 01:28:05,880 --> 01:28:08,980 to tie together that idea now of exceptions from before. 1923 01:28:08,980 --> 01:28:11,070 Again, exceptions are a feature of Python, 1924 01:28:11,070 --> 01:28:13,060 whereby you can try to do something. 1925 01:28:13,060 --> 01:28:16,710 And if there's a problem, you can handle it in any way you see fit. 1926 01:28:16,710 --> 01:28:20,070 Previously, I handled it by just yelling at the user that that's not an Int. 1927 01:28:20,070 --> 01:28:23,460 But let's actually use this to re-implement CS50's own getInt 1928 01:28:23,460 --> 01:28:24,240 function. 1929 01:28:24,240 --> 01:28:27,130 Let me throw away CS50's getInt function. 1930 01:28:27,130 --> 01:28:32,880 And now let me go ahead and replace getInt with input. 1931 01:28:32,880 --> 01:28:35,670 But it's not sufficient to just use input. 1932 01:28:35,670 --> 01:28:39,480 What do I have to add to this line of code on line 8? 1933 01:28:39,480 --> 01:28:40,740 If I want to get back an Int? 1934 01:28:40,740 --> 01:28:41,790 AUDIENCE: The Int function. 1935 01:28:41,790 --> 01:28:43,832 DAVID J. MALAN: Yeah, I have to cast it to an Int 1936 01:28:43,832 --> 01:28:46,500 by calling the Int function around that value, 1937 01:28:46,500 --> 01:28:48,750 or I could do it on a separate line, just to be clear. 1938 01:28:48,750 --> 01:28:52,110 I could also do n equals Int of n. 1939 01:28:52,110 --> 01:28:55,020 That would work too, but it's sort of an unnecessary extra line. 1940 01:28:55,020 --> 01:28:57,990 This is not sufficient, because that does not change the value. 1941 01:28:57,990 --> 01:28:58,935 It creates the value. 1942 01:28:58,935 --> 01:29:00,060 But then it throws it away. 1943 01:29:00,060 --> 01:29:01,192 We need to assign it. 
1944 01:29:01,192 --> 01:29:03,900 So the conventional way to do this would probably be in one line, 1945 01:29:03,900 --> 01:29:05,358 just to keep things nice and tight. 1946 01:29:05,358 --> 01:29:06,780 So that works fine now. 1947 01:29:06,780 --> 01:29:11,470 If I run Python of Mario.py, I can still type in 3, and all is well. 1948 01:29:11,470 --> 01:29:15,720 I can still type in negative 1, because that is an int that I am handling. 1949 01:29:15,720 --> 01:29:18,750 What I'm not yet handling is weird input like cat 1950 01:29:18,750 --> 01:29:21,760 or some string that is not a base 10 number. 1951 01:29:21,760 --> 01:29:23,880 So here, again, is my traceback. 1952 01:29:23,880 --> 01:29:27,000 And notice that here, let me scroll up a little bit, 1953 01:29:27,000 --> 01:29:31,620 here we can actually see more detail in the traceback. 1954 01:29:31,620 --> 01:29:36,900 Notice that, just like in C, or just like in the debugger in VS Code, 1955 01:29:36,900 --> 01:29:38,100 you can see a few things. 1956 01:29:38,100 --> 01:29:41,490 You can see mention of module, that just means your file, main, which 1957 01:29:41,490 --> 01:29:43,013 is my main function, and get_height. 1958 01:29:43,013 --> 01:29:44,430 So notice, it's kind of backwards. 1959 01:29:44,430 --> 01:29:46,720 It's top to bottom instead of bottom up, as we drew it 1960 01:29:46,720 --> 01:29:48,720 on the board the other day, and as we envisioned 1961 01:29:48,720 --> 01:29:50,520 stacks of trays in the cafeteria. 1962 01:29:50,520 --> 01:29:52,680 But this is your stack, of functions that 1963 01:29:52,680 --> 01:29:54,330 have been called, from top to bottom. 1964 01:29:54,330 --> 01:29:57,360 get_height is the most recent, main is the very first, 1965 01:29:57,360 --> 01:29:59,200 ValueError is the problem. 1966 01:29:59,200 --> 01:30:03,740 So let's try to do, let's try to do this literally, except if there's an error. 1967 01:30:03,740 --> 01:30:04,740 So what do I want to do?
1968 01:30:04,740 --> 01:30:09,720 I'm going to go in here, and I'm going to say, try to do the following. 1969 01:30:09,720 --> 01:30:17,070 Whoops, try to do the following, except if there's a value error, ValueError, 1970 01:30:17,070 --> 01:30:20,640 then go ahead and say something, well, like before, print, 1971 01:30:20,640 --> 01:30:23,830 that's not an integer exclamation point. 1972 01:30:23,830 --> 01:30:26,760 But the difference this time is because I'm in a loop, the user 1973 01:30:26,760 --> 01:30:29,200 is going to have a chance to recover from this issue. 1974 01:30:29,200 --> 01:30:32,340 So if I run Mario.py, 3 still works as before. 1975 01:30:32,340 --> 01:30:35,880 If I run Mario.py and type in cat, I detect it now, 1976 01:30:35,880 --> 01:30:39,240 and because I'm still in that loop, and because the program hasn't crashed, 1977 01:30:39,240 --> 01:30:43,050 because I've caught, so to speak, the ValueError, using this line of code 1978 01:30:43,050 --> 01:30:46,950 here, that's the way in Python to detect these kinds of errors, 1979 01:30:46,950 --> 01:30:49,680 that would otherwise end up being on the user's own screen. 1980 01:30:49,680 --> 01:30:51,540 If I type in cat, dog, that doesn't work. 1981 01:30:51,540 --> 01:30:56,820 If I type in, though, 2, I get my two hashes, because that's, indeed, an int. 1982 01:30:56,820 --> 01:30:58,740 Any questions on this? We're not going 1983 01:30:58,740 --> 01:31:00,750 to spend too much time on exceptions, but I just wanted 1984 01:31:00,750 --> 01:31:03,680 to show you what's involved with getting rid of those training wheels. 1985 01:31:03,680 --> 01:31:04,180 Yeah. 1986 01:31:04,180 --> 01:31:05,763 AUDIENCE: Then the hash marks in line. 1987 01:31:05,763 --> 01:31:07,305 DAVID J. MALAN: OK, so let's do this.
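The try/except loop described above might look roughly like the following. This is a sketch, not necessarily the exact code on screen: the `read` parameter is an assumption added here so the function can be driven by scripted answers instead of a live keyboard; by default it is just the built-in `input`.

```python
def get_height(read=input):
    """Prompt until the user supplies a positive integer."""
    while True:
        try:
            n = int(read("Height: "))
            if n > 0:
                break
        except ValueError:
            # int() raises ValueError for text like "cat"; because we are
            # still inside the loop, the user simply gets another chance.
            print("That's not an integer!")
    return n


# Scripted answers stand in for a user typing "cat", then "-1", then "3".
answers = iter(["cat", "-1", "3"])
height = get_height(lambda prompt: next(answers))

for i in range(height):
    print("#")
```

Calling `get_height()` with no argument prompts interactively, as in the lecture.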
1988 01:31:07,305 --> 01:31:09,140 That actually comes to the earlier question 1989 01:31:09,140 --> 01:31:11,060 about printing the hashes on the same line, 1990 01:31:11,060 --> 01:31:13,808 or maybe something like this, where we have the little bricks 1991 01:31:13,808 --> 01:31:15,350 in the sky, or little question marks. 1992 01:31:15,350 --> 01:31:17,725 Let's recreate this idea, because the problem with print, 1993 01:31:17,725 --> 01:31:20,930 as was noted earlier, is you're automatically printing out new lines. 1994 01:31:20,930 --> 01:31:22,460 But what if we don't want that? 1995 01:31:22,460 --> 01:31:24,740 Well, let's change this program entirely. 1996 01:31:24,740 --> 01:31:26,310 Let me throw away all the functions. 1997 01:31:26,310 --> 01:31:29,220 Let's just go to a simpler world, where we're just doing this. 1998 01:31:29,220 --> 01:31:30,912 So let me start fresh in Mario.py. 1999 01:31:30,912 --> 01:31:33,120 I'm not going to bother with exceptions or functions. 2000 01:31:33,120 --> 01:31:39,410 Let's just do a very simple program, to create this idea, for i in range of 4 2001 01:31:39,410 --> 01:31:42,860 this time, because there are four of these things in the sky. 2002 01:31:42,860 --> 01:31:45,230 Let's go ahead and just print out a question mark 2003 01:31:45,230 --> 01:31:47,450 to represent each of those bricks. 2004 01:31:47,450 --> 01:31:51,140 Odds are you know this is not going to end well, because these are unfortunately, 2005 01:31:51,140 --> 01:31:54,450 as you've predicted, on separate lines. 2006 01:31:54,450 --> 01:31:57,380 So it turns out that the print function actually 2007 01:31:57,380 --> 01:32:00,320 takes in multiple arguments, not just the thing you want to print, 2008 01:32:00,320 --> 01:32:03,650 but also some additional arguments, that allow you to specify 2009 01:32:03,650 --> 01:32:06,170 what the default line ending should be.
2010 01:32:06,170 --> 01:32:09,110 But what's interesting about this is that, if you 2011 01:32:09,110 --> 01:32:12,630 want to change the line ending to be something like, 2012 01:32:12,630 --> 01:32:16,790 quote unquote, "that is nothing," instead of backslash n, 2013 01:32:16,790 --> 01:32:19,310 this is not sufficient, because in Python, you 2014 01:32:19,310 --> 01:32:21,770 can have two types of arguments, or parameters. 2015 01:32:21,770 --> 01:32:25,160 Some arguments are positional, which is the fancy way of saying it's 2016 01:32:25,160 --> 01:32:26,690 a comma separated list of arguments. 2017 01:32:26,690 --> 01:32:29,540 And that's what we did all the time in C. Something comma, something 2018 01:32:29,540 --> 01:32:31,665 comma, something, we did it in printf all the time, 2019 01:32:31,665 --> 01:32:33,980 and in other functions that took multiple arguments. 2020 01:32:33,980 --> 01:32:37,880 In Python, you have, not only positional arguments, 2021 01:32:37,880 --> 01:32:41,660 where you just separate them by commas, to give one or two or three or more 2022 01:32:41,660 --> 01:32:42,650 arguments. 2023 01:32:42,650 --> 01:32:46,220 There are also named arguments, which looks weird but is 2024 01:32:46,220 --> 01:32:48,140 helpful for reasons like this. 2025 01:32:48,140 --> 01:32:50,900 If you read the documentation, you will see 2026 01:32:50,900 --> 01:32:54,740 that there is a named argument that Python accepts, called end. 2027 01:32:54,740 --> 01:32:57,680 And if you set that equal to something, that 2028 01:32:57,680 --> 01:33:00,200 will be used as the end of every line, instead 2029 01:33:00,200 --> 01:33:02,750 of the default, which the documentation will also say 2030 01:33:02,750 --> 01:33:04,700 is quote unquote backslash n. 2031 01:33:04,700 --> 01:33:09,000 So this line here has no effect on my logic at the moment. 
2032 01:33:09,000 --> 01:33:13,280 But if I change it to just quote unquote, essentially overriding 2033 01:33:13,280 --> 01:33:18,470 the default new line character, and now run Mario again, now I get all four 2034 01:33:18,470 --> 01:33:19,278 on the same line. 2035 01:33:19,278 --> 01:33:20,570 There's a bit of a bug, though. 2036 01:33:20,570 --> 01:33:23,610 My prompt is not meant to be on the same line. 2037 01:33:23,610 --> 01:33:25,640 So I can fix that by just printing nothing. 2038 01:33:25,640 --> 01:33:28,640 But, really, it's not nothing, because you get the new line for free. 2039 01:33:28,640 --> 01:33:32,930 So let me run Python of Mario.py again, and now we 2040 01:33:32,930 --> 01:33:36,140 have what I intended in the first place, which was a little something that 2041 01:33:36,140 --> 01:33:37,170 looked like this. 2042 01:33:37,170 --> 01:33:40,910 And this is just one example of an argument that has a name. 2043 01:33:40,910 --> 01:33:43,280 But this is a common paradigm in Python, too, 2044 01:33:43,280 --> 01:33:46,250 to not just separate things by commas, but to be very specific, 2045 01:33:46,250 --> 01:33:50,810 because the print function might take 5, 10, even 20 different arguments. 2046 01:33:50,810 --> 01:33:54,628 And my God, if you had to enumerate like 10 or 20 commas, 2047 01:33:54,628 --> 01:33:55,670 you're going to screw up. 2048 01:33:55,670 --> 01:33:57,587 You're going to get things in the wrong order. 2049 01:33:57,587 --> 01:34:00,600 Named arguments allow you to be resilient against that. 2050 01:34:00,600 --> 01:34:02,690 So you only specify arguments by name, and it 2051 01:34:02,690 --> 01:34:06,004 doesn't matter what order they are in. 2052 01:34:06,004 --> 01:34:10,160 All right, any questions, then, on this, and the overriding of new line?
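A short sketch of the named `end` argument in action. The wrapper function `print_bricks` is a hypothetical name introduced only so the behavior is easy to reuse; the lecture version is just the loop and the final `print()`.

```python
def print_bricks(count):
    """Print `count` question marks on a single line."""
    for i in range(count):
        # end="" overrides print's default line ending of "\n"
        print("?", end="")
    # A bare print() outputs only the default newline, which moves the
    # shell prompt back onto its own line.
    print()


print_bricks(4)
```

Because `end` is passed by name, it can follow the positional argument without any placeholder commas for the other optional arguments.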
2053 01:34:10,160 --> 01:34:14,270 And to be clear, you can do something like, very weird, 2054 01:34:14,270 --> 01:34:19,910 but logically expected, like this, by just changing the line ending, too. 2055 01:34:19,910 --> 01:34:21,830 But the right way to solve the Mario problem 2056 01:34:21,830 --> 01:34:25,652 would be just to override it to be nothing like this. 2057 01:34:25,652 --> 01:34:27,110 All right, how about this for cool. 2058 01:34:27,110 --> 01:34:29,000 And this is why a lot of people like Python. 2059 01:34:29,000 --> 01:34:30,440 Suppose you don't really like loops. 2060 01:34:30,440 --> 01:34:31,970 You don't really like three-line programs, 2061 01:34:31,970 --> 01:34:34,637 because that was kind of three times longer than it needs to be. 2062 01:34:34,637 --> 01:34:39,200 What if you just printed out a question mark four times? 2063 01:34:39,200 --> 01:34:43,380 Python, whoops, Python of Mario.py, that also works. 2064 01:34:43,380 --> 01:34:46,550 So it turns out that, just like the plus operator in Python 2065 01:34:46,550 --> 01:34:50,570 can join things together, the multiply operator is not 2066 01:34:50,570 --> 01:34:51,840 arithmetic in this case. 2067 01:34:51,840 --> 01:34:56,070 It actually means, take this and concatenate it four times over. 2068 01:34:56,070 --> 01:34:59,000 So that's a way of just distilling into one line what 2069 01:34:59,000 --> 01:35:02,750 would have otherwise taken multiple lines in C, fewer, but still multiple 2070 01:35:02,750 --> 01:35:07,130 lines in Python, but is really now rather succinct in Python, 2071 01:35:07,130 --> 01:35:08,385 by doing that instead. 2072 01:35:08,385 --> 01:35:11,510 Let's do one last Mario example, which looked a little something like this. 2073 01:35:11,510 --> 01:35:14,090 If this is another part of the Mario interface, 2074 01:35:14,090 --> 01:35:16,800 this is like a grid of like 3 by 3 bricks, for instance. 
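The one-line version relies on `*` repeating a string when one operand is a `str` and the other an `int`. A minimal sketch, with the variable name `bricks` chosen here purely for illustration:

```python
# With a str on one side and an int on the other, * means repetition,
# not arithmetic.
bricks = "?" * 4
print(bricks)        # ????

# It parallels + for strings, which joins rather than adds.
print("?" + "?")     # ??
```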
2075 01:35:16,800 --> 01:35:20,690 So two dimensions now, not just vertical, not just horizontal, but now both. 2076 01:35:20,690 --> 01:35:23,130 Let's print out something like that, using hashes. 2077 01:35:23,130 --> 01:35:26,070 Well, how about, how do I do this. 2078 01:35:26,070 --> 01:35:29,210 So how about for i in range of 3. 2079 01:35:29,210 --> 01:35:34,280 Then I could do for j in range of 3, just because j comes after i 2080 01:35:34,280 --> 01:35:35,810 and that's reasonable for counting. 2081 01:35:35,810 --> 01:35:41,000 I could now print out a hash symbol, well, let's see what this does. 2082 01:35:41,000 --> 01:35:47,660 Python of Mario.py, OK, that's just one crazy long column. 2083 01:35:47,660 --> 01:35:51,240 What do I need to fix and where here, to make this look like this? 2084 01:35:51,240 --> 01:35:55,850 So 3 by 3 bricks, instead of one long column. 2085 01:35:55,850 --> 01:35:56,450 Any instincts? 2086 01:35:56,450 --> 01:36:00,500 AUDIENCE: Why don't we create a line and then we'll skip it. 2087 01:36:00,500 --> 01:36:03,450 DAVID J. MALAN: OK, so after printing 3, we want to skip a line. 2088 01:36:03,450 --> 01:36:05,750 So maybe like print out a blank line here. 2089 01:36:05,750 --> 01:36:06,740 OK, let's try that. 2090 01:36:06,740 --> 01:36:09,920 I like that instinct, right, print 3, new line, print 3, new line. 2091 01:36:09,920 --> 01:36:12,260 Let's go ahead and run Python of Mario.py. 2092 01:36:12,260 --> 01:36:16,580 OK, it's more visible, what I'm doing, but still wrong. 2093 01:36:16,580 --> 01:36:19,110 What can I, what's the remaining fix, though? 2094 01:36:19,110 --> 01:36:19,610 Yeah. 2095 01:36:19,610 --> 01:36:22,790 AUDIENCE: So right behind the two. 2096 01:36:22,790 --> 01:36:25,680 DAVID J. MALAN: Yeah, I'm getting an extra new line here, 2097 01:36:25,680 --> 01:36:27,870 which I don't want while I'm on this row.
2098 01:36:27,870 --> 01:36:31,850 So let me do end equals quote unquote, and now, together, your solutions might 2099 01:36:31,850 --> 01:36:33,950 take us the whole way there. 2100 01:36:33,950 --> 01:36:37,345 Python of Mario.py, voila, now we've got it, in two dimensions. 2101 01:36:37,345 --> 01:36:38,720 And even this, we can tighten up. 2102 01:36:38,720 --> 01:36:41,220 Like, we could just use the little trick we learned. 2103 01:36:41,220 --> 01:36:45,230 So we could just say, print a hash times 3 times, 2104 01:36:45,230 --> 01:36:47,810 and we can get rid of one of those loops altogether. 2105 01:36:47,810 --> 01:36:50,930 All it's doing is, whoops, all it's doing is automating that process. 2106 01:36:50,930 --> 01:36:53,060 But, no, I don't want to do that. 2107 01:36:53,060 --> 01:36:54,832 What do I, how do I fix this here. 2108 01:36:54,832 --> 01:36:56,540 I don't think I want this anymore, right? 2109 01:36:56,540 --> 01:36:58,350 Because that's giving me an extra new line. 2110 01:36:58,350 --> 01:37:01,260 So now this program is really tightened up. 2111 01:37:01,260 --> 01:37:03,050 Same thing, two lines of code. 2112 01:37:03,050 --> 01:37:07,220 But we're now implementing this same two dimensional structure here. 2113 01:37:07,220 --> 01:37:10,440 All right, any questions here on these? 2114 01:37:10,440 --> 01:37:10,940 Yeah. 2115 01:37:10,940 --> 01:37:16,790 AUDIENCE: Is there any practical reason why when we write end, I mean, 2116 01:37:16,790 --> 01:37:19,850 the print function, you don't put any spaces in it. 2117 01:37:19,850 --> 01:37:22,430 DAVID J. MALAN: If I print end, any spaces. 2118 01:37:22,430 --> 01:37:23,300 Say that once more.
2119 01:37:23,300 --> 01:37:25,440 AUDIENCE: Whenever we write end, for example, 2120 01:37:25,440 --> 01:37:28,850 the print function is, you know, in order 2121 01:37:28,850 --> 01:37:33,820 to stop it from going to a new line, it seems like any spaces, 2122 01:37:33,820 --> 01:37:37,800 we did like end equals and then too close. 2123 01:37:37,800 --> 01:37:38,820 There were no spaces. 2124 01:37:38,820 --> 01:37:40,300 Did you do that on purpose? 2125 01:37:40,300 --> 01:37:42,300 DAVID J. MALAN: Oh, 2126 01:37:42,300 --> 01:37:43,200 yes, good question. 2127 01:37:43,200 --> 01:37:44,242 I see what you're saying. 2128 01:37:44,242 --> 01:37:48,030 So in a previous version, let me rewind in time, when we had this, 2129 01:37:48,030 --> 01:37:49,170 I did not put spaces. 2130 01:37:49,170 --> 01:37:51,720 The convention in Python is not to do that. 2131 01:37:51,720 --> 01:37:52,350 Why? 2132 01:37:52,350 --> 01:37:54,263 It just starts to add too much space. 2133 01:37:54,263 --> 01:37:56,430 And this is a little inconsistent, because, earlier, 2134 01:37:56,430 --> 01:37:58,470 when we talked about like pluses or spaces 2135 01:37:58,470 --> 01:38:00,750 around the less than or equal signs, I did say add it. 2136 01:38:00,750 --> 01:38:03,010 Here it's actually clearer and recommended 2137 01:38:03,010 --> 01:38:04,260 to keep them tighter together. 2138 01:38:04,260 --> 01:38:07,560 Otherwise it just becomes harder to read where the gaps are. 2139 01:38:07,560 --> 01:38:08,820 Good observation. 2140 01:38:08,820 --> 01:38:14,357 All right, let's do, how about, another five minute break. 2141 01:38:14,357 --> 01:38:14,940 Let's do that. 2142 01:38:14,940 --> 01:38:17,732 And then we're going to dive into some more sophisticated problems, 2143 01:38:17,732 --> 01:38:21,160 and then ultimately build with some audio and visual examples, as well. 2144 01:38:21,160 --> 01:38:23,130 See you in five.
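For reference, the two versions of the 3 by 3 grid from before the break can be sketched side by side. The function names `print_grid_nested` and `print_grid` are hypothetical wrappers added here so both variants can live in one file; the lecture code is just the loop bodies.

```python
def print_grid_nested(size):
    """Two loops: the inner one prints a row, the outer one ends it."""
    for i in range(size):
        for j in range(size):
            # No spaces around = for a named argument, per Python convention.
            print("#", end="")
        # Move to the next line only after a full row has been printed.
        print()


def print_grid(size):
    """Tightened version: string repetition replaces the inner loop."""
    for i in range(size):
        # print's default newline ends each row, so no extra print() here.
        print("#" * size)


print_grid_nested(3)
print_grid(3)
```

Both calls should produce the same three rows of three hashes.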
2145 01:38:23,130 --> 01:38:28,260 All right, so almost all of the examples we just did 2146 01:38:28,260 --> 01:38:30,540 were recreations of what we did in week 1. 2147 01:38:30,540 --> 01:38:33,120 And recall that week 1 was like our most syntax-heavy week. 2148 01:38:33,120 --> 01:38:36,930 It was when we were first learning how to program in C. But after week 1, 2149 01:38:36,930 --> 01:38:39,900 we began to focus a bit more on ideas, like arrays, 2150 01:38:39,900 --> 01:38:41,640 and other higher-level constructs. 2151 01:38:41,640 --> 01:38:44,880 And we'll do that again here, condensing some of those first early weeks 2152 01:38:44,880 --> 01:38:47,250 into a fewer set of examples in Python. 2153 01:38:47,250 --> 01:38:50,020 And we'll culminate by actually taking Python out for a spin, 2154 01:38:50,020 --> 01:38:52,300 and doing things that would be way harder to do, 2155 01:38:52,300 --> 01:38:56,830 and way more time-consuming to do in C, even more so than the speller example. 2156 01:38:56,830 --> 01:38:59,790 But how do you go about figuring out what functions exist, 2157 01:38:59,790 --> 01:39:02,970 if you didn't hear it in class, you don't see it online, 2158 01:39:02,970 --> 01:39:06,480 but you want to see it officially, you can go to the Python documentation, 2159 01:39:06,480 --> 01:39:08,220 docs.python.org here. 2160 01:39:08,220 --> 01:39:11,340 And I will disclaim that, honestly, the Python documentation is not 2161 01:39:11,340 --> 01:39:12,750 terribly user-friendly. 2162 01:39:12,750 --> 01:39:15,240 Google will often be your friend, so googling something 2163 01:39:15,240 --> 01:39:19,350 you're interested in, to find your way to the appropriate page on Python.org, 2164 01:39:19,350 --> 01:39:22,410 or StackOverflow.com is another popular website. 2165 01:39:22,410 --> 01:39:24,780 As always, though, the line should be googling 2166 01:39:24,780 --> 01:39:27,600 things like, how do I convert a string to lowercase. 
2167 01:39:27,600 --> 01:39:29,070 Like that's reasonable to Google. 2168 01:39:29,070 --> 01:39:33,160 Or how to convert to uppercase or how to implement a function in Python. 2169 01:39:33,160 --> 01:39:37,950 But googling, of course, things like how to implement problem set 6 in CS50, 2170 01:39:37,950 --> 01:39:39,120 of course, crosses the line. 2171 01:39:39,120 --> 01:39:42,078 But moving forward, and really with programming in general, like Google 2172 01:39:42,078 --> 01:39:44,220 and Stack Overflow are your friends, but the line 2173 01:39:44,220 --> 01:39:46,540 is between the reasonable and the unreasonable. 2174 01:39:46,540 --> 01:39:49,890 So let me officially use the Python documentation search, just 2175 01:39:49,890 --> 01:39:52,530 to search for something like the lowercase function. 2176 01:39:52,530 --> 01:39:54,540 Like, I know I can lowercase things in Python. 2177 01:39:54,540 --> 01:39:55,980 I don't quite remember how. 2178 01:39:55,980 --> 01:39:57,870 So let me just search for the word lower. 2179 01:39:57,870 --> 01:40:00,810 You're going to get, often, an overwhelming number of results, 2180 01:40:00,810 --> 01:40:03,678 because Python is a pretty big language, with lots of functionality. 2181 01:40:03,678 --> 01:40:05,970 And you're going to want to look for familiar patterns. 2182 01:40:05,970 --> 01:40:09,060 For whatever reason, str.lower, which is probably 2183 01:40:09,060 --> 01:40:12,420 more popular or more commonly used than these other ones, is third on the list. 2184 01:40:12,420 --> 01:40:15,460 But it's purple, because I clicked it a moment ago, when looking for it. 2185 01:40:15,460 --> 01:40:18,450 So str.lower is probably what I want, because I 2186 01:40:18,450 --> 01:40:21,060 am interested at the moment in lower casing strings. 2187 01:40:21,060 --> 01:40:25,258 When I click on that, this is an example of what Python's documentation tends 2188 01:40:25,258 --> 01:40:25,800 to look like.
2189 01:40:25,800 --> 01:40:27,340 It's in this general format. 2190 01:40:27,340 --> 01:40:29,340 Here's my str.lower function. 2191 01:40:29,340 --> 01:40:31,540 This returns a copy of the string, with all 2192 01:40:31,540 --> 01:40:33,750 of the cased characters converted to lowercase, 2193 01:40:33,750 --> 01:40:35,670 and the lower-casing algorithm, dot dot dot. 2194 01:40:35,670 --> 01:40:37,168 So that doesn't give me much. 2195 01:40:37,168 --> 01:40:38,460 It doesn't give me sample code. 2196 01:40:38,460 --> 01:40:40,210 But it does say what the function does. 2197 01:40:40,210 --> 01:40:43,890 And if we keep looking, you'll see mention of lstrip, which is left strip. 2198 01:40:43,890 --> 01:40:48,120 I used its analog, rstrip, before, right strip, which allows you to remove, 2199 01:40:48,120 --> 01:40:51,000 that is strip, from the end of a string, something like white space, 2200 01:40:51,000 --> 01:40:52,930 like a new line, or even something else. 2201 01:40:52,930 --> 01:40:56,410 And if you scroll through this string web page here, 2202 01:40:56,410 --> 01:40:58,110 and we're halfway down the page already, 2203 01:40:58,110 --> 01:41:00,180 if you see my scroll bar, tiny on the right, 2204 01:41:00,180 --> 01:41:05,250 there's a huge amount of functionality built into string objects, here. 2205 01:41:05,250 --> 01:41:08,460 And this is just testament to just how rich the language itself is. 2206 01:41:08,460 --> 01:41:12,620 But it's also reason to be reassured that the goal, when 2207 01:41:12,620 --> 01:41:14,870 playing around with some new language and learning it, 2208 01:41:14,870 --> 01:41:16,598 is not to learn it exhaustively. 2209 01:41:16,598 --> 01:41:18,390 Just like in English or any human language, 2210 01:41:18,390 --> 01:41:20,640 there's always going to be vocab words you don't know, 2211 01:41:20,640 --> 01:41:23,563 ways of presenting the same information in some language.
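Since the documentation page itself offers no sample code, here is a small sketch of the three string methods mentioned above. The sample string `s` is made up for illustration.

```python
s = "  Hello, World\n"

# str.lower returns a lowercased copy; the original is left alone.
print(repr(s.lower()))            # '  hello, world\n'

# rstrip removes trailing whitespace (including the newline) by default.
print(repr(s.rstrip()))           # '  Hello, World'

# lstrip does the same at the left end.
print(repr(s.lstrip()))           # 'Hello, World\n'

# Either can be told to strip something other than whitespace.
print(repr("cat!!".rstrip("!")))  # 'cat'

# Strings are immutable, so s is unchanged by all of the above.
print(repr(s))                    # '  Hello, World\n'
```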
2212 01:41:23,563 --> 01:41:25,230 That's going to be the case with Python. 2213 01:41:25,230 --> 01:41:28,620 And what we'll do today and this week in problem set 6 is really 2214 01:41:28,620 --> 01:41:30,120 get your footing with this language. 2215 01:41:30,120 --> 01:41:33,300 But you won't know all of Python, just like you won't know all of C. 2216 01:41:33,300 --> 01:41:36,300 And, honestly, you won't know all of any of these languages on your own, 2217 01:41:36,300 --> 01:41:38,800 unless you're, perhaps, using them full time professionally, 2218 01:41:38,800 --> 01:41:42,370 and even then, there's more libraries than one might even retain themselves. 2219 01:41:42,370 --> 01:41:45,420 So let's actually now pivot to a few other ideas, 2220 01:41:45,420 --> 01:41:47,560 that we'll implement in Python, in a moment. 2221 01:41:47,560 --> 01:41:50,010 Let me switch back over to VS Code here. 2222 01:41:50,010 --> 01:41:55,260 And let me whip up, say, a recreation of our scores example from week two, 2223 01:41:55,260 --> 01:41:57,883 where we averaged like three scores together. 2224 01:41:57,883 --> 01:42:00,300 And that was an opportunity in week 2 to play with arrays, 2225 01:42:00,300 --> 01:42:02,430 to realize how constrained arrays are. 2226 01:42:02,430 --> 01:42:03,720 They can't grow or shrink. 2227 01:42:03,720 --> 01:42:05,040 You have to decide in advance. 2228 01:42:05,040 --> 01:42:07,110 But let's see what's different here in Python. 2229 01:42:07,110 --> 01:42:11,580 So let me do Scores.py, and let me give myself an array in Python 2230 01:42:11,580 --> 01:42:15,780 called scores, sorry, let me give myself a variable in Python called scores. 2231 01:42:15,780 --> 01:42:17,940 Set it equal to a list of three scores, which 2232 01:42:17,940 --> 01:42:22,560 are the same ones we've used before, 72, 73, 33, in this context 2233 01:42:22,560 --> 01:42:24,630 meant to be scores, not ASCII values. 
2234 01:42:24,630 --> 01:42:26,520 And then let's just do the average of these. 2235 01:42:26,520 --> 01:42:28,630 So average will be another variable. 2236 01:42:28,630 --> 01:42:32,910 And it turns out I can do, well, how did I sum these before? 2237 01:42:32,910 --> 01:42:36,580 I probably had a for loop to add one, then I knew how long they were. 2238 01:42:36,580 --> 01:42:39,580 Turns out in Python, you can just say sum of scores 2239 01:42:39,580 --> 01:42:41,530 divided by the length of scores. 2240 01:42:41,530 --> 01:42:43,130 That's going to give me my average. 2241 01:42:43,130 --> 01:42:46,210 So sum is a function that takes a list, in this case, as input, 2242 01:42:46,210 --> 01:42:49,000 and it just does the sum for you, with a for loop or whatever 2243 01:42:49,000 --> 01:42:49,930 underneath the hood. 2244 01:42:49,930 --> 01:42:53,480 len gives you the length of the list, how many things are in it. 2245 01:42:53,480 --> 01:42:55,240 So I can dynamically figure that out. 2246 01:42:55,240 --> 01:43:00,340 Now let me go ahead and print out, using print, the word average, and then, 2247 01:43:00,340 --> 01:43:03,628 in curly braces, the actual average, close quote. 2248 01:43:03,628 --> 01:43:05,920 All right, so let's run this code, Python of Scores.py. 2249 01:43:05,920 --> 01:43:11,050 And there is my average, in this case, 59.33333 and so forth, 2250 01:43:11,050 --> 01:43:12,310 based on the math. 2251 01:43:12,310 --> 01:43:14,500 Well, let's actually, now, change this a little bit 2252 01:43:14,500 --> 01:43:17,625 and make it a little more interesting, and actually get input from the user 2253 01:43:17,625 --> 01:43:19,190 rather than hardcoding this. 2254 01:43:19,190 --> 01:43:22,568 Let me go back up here and use from CS50 import get_int, 2255 01:43:22,568 --> 01:43:25,360 because I don't want to deal with all the exceptions and the loops. 2256 01:43:25,360 --> 01:43:27,820 Like, I just want to use someone else's function here.
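The hardcoded version of the scores program described above can be sketched in three lines. The f-string formatting is inferred from the "curly braces" mentioned in the lecture.

```python
scores = [72, 73, 33]

# sum adds up the list; len counts its elements.
average = sum(scores) / len(scores)

# Prints the average, roughly 59.33 for these three scores.
print(f"Average: {average}")
```

Note that `/` always produces a float in Python 3, so no cast is needed to avoid the integer truncation that C would perform.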
2257 01:43:27,820 --> 01:43:31,600 Let me give myself an empty list called scores. 2258 01:43:31,600 --> 01:43:34,480 And this is not something we were able to do in C, right? 2259 01:43:34,480 --> 01:43:36,610 Because in C, if you tried to make an empty array, 2260 01:43:36,610 --> 01:43:39,590 well, that's pretty stupid, because you can't add things to it. 2261 01:43:39,590 --> 01:43:40,910 It's a fixed size. 2262 01:43:40,910 --> 01:43:42,650 So it wouldn't even let you do that. 2263 01:43:42,650 --> 01:43:45,640 But I can just create an empty list in Python, 2264 01:43:45,640 --> 01:43:48,340 because lists, unlike arrays, are really lengthless. 2265 01:43:48,340 --> 01:43:49,750 They'll grow and shrink. 2266 01:43:49,750 --> 01:43:52,870 But you and I are not dealing with all the pointers underneath the hood. 2267 01:43:52,870 --> 01:43:54,770 Python's doing that for us. 2268 01:43:54,770 --> 01:43:58,435 So now, let's go ahead and get a whole bunch of scores from the user. 2269 01:43:58,435 --> 01:43:59,810 How about three of them in total. 2270 01:43:59,810 --> 01:44:05,350 So for i in range of 3, let's go ahead and grab a score from the user, 2271 01:44:05,350 --> 01:44:07,810 using get_int, asking them for score. 2272 01:44:07,810 --> 01:44:14,840 And then let's go ahead and append, to the scores list, that particular score. 2273 01:44:14,840 --> 01:44:17,200 So it turns out that a list, and I could read the Python 2274 01:44:17,200 --> 01:44:21,280 documentation to confirm as much, lists have a function built into them, 2275 01:44:21,280 --> 01:44:25,155 and functions built into objects are generally known as methods, 2276 01:44:25,155 --> 01:44:26,530 if you've heard that term before. 2277 01:44:26,530 --> 01:44:29,320 Same idea, but whereas a function kind of stands on its own, 2278 01:44:29,320 --> 01:44:33,430 a method is a function built into an object, like a list here. 2279 01:44:33,430 --> 01:44:35,917 That's going to achieve the same result.
Strictly speaking, 2280 01:44:35,917 --> 01:44:37,000 I don't need the variable. 2281 01:44:37,000 --> 01:44:40,603 Just like in C, I could tighten this up and do something like this as well. 2282 01:44:40,603 --> 01:44:42,520 But, I don't know, I kind of like it this way. 2283 01:44:42,520 --> 01:44:45,970 It's more clear, to me, at least, what I'm doing here, getting the score 2284 01:44:45,970 --> 01:44:47,838 and then appending it to the list. 2285 01:44:47,838 --> 01:44:49,630 Now the rest of the code can stay the same. 2286 01:44:49,630 --> 01:44:54,700 Python of Scores.py, score will be 72, 73, 33. 2287 01:44:54,700 --> 01:44:55,820 And I get back the math. 2288 01:44:55,820 --> 01:44:58,840 But now the program's a little more dynamic, which is nice. 2289 01:44:58,840 --> 01:45:00,940 But there's other syntax I could use here. 2290 01:45:00,940 --> 01:45:04,330 Just so you've seen it, Python does have some neat syntactic tricks, 2291 01:45:04,330 --> 01:45:06,850 whereby, if you don't want to do scores.append, 2292 01:45:06,850 --> 01:45:11,290 you can actually say scores plus equals this score. 2293 01:45:11,290 --> 01:45:15,730 So you can actually concatenate lists together in Python, too. 2294 01:45:15,730 --> 01:45:18,340 Just as we used plus to join two strings together, 2295 01:45:18,340 --> 01:45:21,400 you can use plus to join two lists together. 2296 01:45:21,400 --> 01:45:24,040 The catch is, you need to put the one score I'm 2297 01:45:24,040 --> 01:45:26,770 adding here in a list of its own, which is kind of silly. 2298 01:45:26,770 --> 01:45:31,330 But it's necessary, so that this thing and this thing are both lists. 2299 01:45:31,330 --> 01:45:33,970 To do this more verbosely, which most programmers wouldn't 2300 01:45:33,970 --> 01:45:36,310 do, but just for clarity, this is the same thing 2301 01:45:36,310 --> 01:45:38,950 as saying scores plus this score.
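The three ways of growing the list can be compared directly. In this sketch the list `entered` is an assumption standing in for three calls to CS50's `get_int("Score: ")`, so the example runs without the CS50 library or a keyboard.

```python
entered = [72, 73, 33]  # stand-ins for values the user would type

# 1. The append method adds one element to the end of the list.
scores = []
for score in entered:
    scores.append(score)

# 2. += concatenates another list, so the single score is wrapped in [ ].
scores_plus_equals = []
for score in entered:
    scores_plus_equals += [score]

# 3. The verbose equivalent: build a new list and reassign the name.
scores_verbose = []
for score in entered:
    scores_verbose = scores_verbose + [score]

print(scores, scores_plus_equals, scores_verbose)
```

Writing `scores += score` without the brackets would raise a `TypeError`, since an `int` is not something a list can be extended with.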
2302 01:45:38,950 --> 01:45:42,910 So now maybe it's a little more clear that scores and brackets score 2303 01:45:42,910 --> 01:45:47,680 plural, sorry, singular, are both lists themselves, being concatenated 2304 01:45:47,680 --> 01:45:48,860 or joined together. 2305 01:45:48,860 --> 01:45:51,740 So two different ways, not sure one is better than the other. 2306 01:45:51,740 --> 01:45:57,640 This way is pretty common, but .append is also quite reasonable as well. 2307 01:45:57,640 --> 01:46:00,340 All right, how about another example from week two. 2308 01:46:00,340 --> 01:46:03,070 This one was called uppercase. 2309 01:46:03,070 --> 01:46:06,320 So let me do this in Uppercase.py, though, this time. 2310 01:46:06,320 --> 01:46:10,180 And let me import from CS50, get_string again. 2311 01:46:10,180 --> 01:46:14,020 And let me go ahead and say, before will be my first variable. 2312 01:46:14,020 --> 01:46:17,500 Let me get a string from the user, asking them for a before string. 2313 01:46:17,500 --> 01:46:22,660 And then let me go ahead and say, after, just to demonstrate some changes, 2314 01:46:22,660 --> 01:46:25,190 upper-casing to this string. 2315 01:46:25,190 --> 01:46:27,850 Let me change my line ending to be that, using our new trick. 2316 01:46:27,850 --> 01:46:31,490 And this is where things get cool in Python, relatively speaking. 2317 01:46:31,490 --> 01:46:35,050 If I want to iterate over all of the characters in a string, 2318 01:46:35,050 --> 01:46:38,140 and print them out in uppercase, one way to do that would be this. 2319 01:46:38,140 --> 01:46:46,032 For c in the before string, go ahead and print out c.uppercase, sorry, c.upper, 2320 01:46:46,032 --> 01:46:49,240 but don't end the line yet, because I want to keep these all on the same line 2321 01:46:49,240 --> 01:46:50,440 until I'm all done. 2322 01:46:50,440 --> 01:46:51,490 So what am I doing?
2323 01:46:51,490 --> 01:46:54,970 Python of Uppercase.py, let me type in Hello in all lowercase. 2324 01:46:54,970 --> 01:46:57,010 I've just upper-cased the whole string. 2325 01:46:57,010 --> 01:46:57,700 How? 2326 01:46:57,700 --> 01:47:00,130 I first get a string, calling it before. 2327 01:47:00,130 --> 01:47:02,680 I then just print out some fluffy text that says after colon, 2328 01:47:02,680 --> 01:47:04,840 and I get rid of the line ending, just so I can kind of line these up. 2329 01:47:04,840 --> 01:47:06,632 Notice I hit the spacebar a couple of times 2330 01:47:06,632 --> 01:47:08,620 just so letters line up to be pretty. 2331 01:47:08,620 --> 01:47:10,780 For c in before, this is new. 2332 01:47:10,780 --> 01:47:14,500 This is powerful in C, sorry, in Python, whereby 2333 01:47:14,500 --> 01:47:17,590 you don't have to do like int i equals 0 and i less than this, 2334 01:47:17,590 --> 01:47:22,310 you could just say, for c in the string in question, for c in before. 2335 01:47:22,310 --> 01:47:25,510 And then here is just upper-casing that specific character, 2336 01:47:25,510 --> 01:47:27,700 and making sure we don't output a new line too soon. 2337 01:47:27,700 --> 01:47:29,920 But this is actually more work than I need to do. 2338 01:47:29,920 --> 01:47:34,000 Based on what we've seen thus far, like from our agreement example, 2339 01:47:34,000 --> 01:47:35,620 can I tighten this up further? 2340 01:47:35,620 --> 01:47:40,340 Can I collapse lines 5 and 6, maybe even 7, all together? 2341 01:47:40,340 --> 01:47:46,550 If the goal of this program is just to uppercase the before string, 2342 01:47:46,550 --> 01:47:49,640 how might I do this? 2343 01:47:49,640 --> 01:47:50,480 Yeah, in back. 2344 01:47:50,480 --> 01:47:52,287 AUDIENCE: Would it be str.upper? 2345 01:47:52,287 --> 01:47:54,620 DAVID J. MALAN: str.upper, yeah, so I could do something 2346 01:47:54,620 --> 01:47:57,500 like this, after gets before.upper.
2347 01:47:57,500 --> 01:47:59,750 So it's not str literally dot upper, str 2348 01:47:59,750 --> 01:48:01,500 just represents the string in question. 2349 01:48:01,500 --> 01:48:04,620 So it would be before.upper, but right idea otherwise. 2350 01:48:04,620 --> 01:48:08,130 And so let me go ahead and just tweak my print statement a little bit. 2351 01:48:08,130 --> 01:48:12,810 Let me just go ahead and print out the after variable here, after creating it. 2352 01:48:12,810 --> 01:48:15,440 So this line is the same, I'm getting a string called before. 2353 01:48:15,440 --> 01:48:18,530 I'm creating another variable called after, and, as you propose, 2354 01:48:18,530 --> 01:48:21,960 I'm calling upper on the whole string, not one character at a time. 2355 01:48:21,960 --> 01:48:22,460 Why? 2356 01:48:22,460 --> 01:48:23,360 Because it's allowed. 2357 01:48:23,360 --> 01:48:27,350 And, again, in Python, there aren't technically characters individually. 2358 01:48:27,350 --> 01:48:28,760 There are only strings, anyway. 2359 01:48:28,760 --> 01:48:30,600 So I might as well do them all at once. 2360 01:48:30,600 --> 01:48:34,220 So if I rerun the code now, Python of Uppercase.py. 2361 01:48:34,220 --> 01:48:39,080 Now I'll type in Hello in all lowercase, and, oh, so close, 2362 01:48:39,080 --> 01:48:42,110 I think I can get rid of this override, because I'm 2363 01:48:42,110 --> 01:48:45,510 printing the whole thing out at once, not character by character. 2364 01:48:45,510 --> 01:48:49,880 So now if I type in Hello before, now I have an even tighter version 2365 01:48:49,880 --> 01:48:52,080 of the program here. 2366 01:48:52,080 --> 01:48:55,910 All right, any questions, then, on lists or on strings, 2367 01:48:55,910 --> 01:49:01,240 and what this kind of function, upper, represents, with its docs? 2368 01:49:01,240 --> 01:49:01,740 No? 2369 01:49:01,740 --> 01:49:04,760 All right, so a couple other building blocks before we start.
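For reference, the two versions of the uppercase program discussed above might look something like this. It is a sketch under one assumption: the input is a hard-coded string standing in for the call to get_string, so that it runs without the CS50 library.

```python
before = "hello"  # stand-in for get_string("Before: ")

# Version 1: uppercase one character at a time, suppressing the newline
# after each print so the letters stay on one line.
print("After:  ", end="")
for c in before:
    print(c.upper(), end="")
print()  # finish the line once the loop is done

# Version 2: call upper on the whole string at once.
after = before.upper()
print(f"After:  {after}")
```

Both versions print the same line; the second simply lets the string method do the iteration.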
2370 01:49:04,760 --> 01:49:05,855 Oh. 2371 01:49:05,855 --> 01:49:06,480 Where was that? 2372 01:49:06,480 --> 01:49:08,010 AUDIENCE: To the right. 2373 01:49:08,010 --> 01:49:10,050 DAVID J. MALAN: To the right, right. 2374 01:49:10,050 --> 01:49:11,040 Yes, thank you. 2375 01:49:11,040 --> 01:49:17,202 AUDIENCE: Could you write, very close to variable string, and then print upper, 2376 01:49:17,202 --> 01:49:19,257 you start creating a variable upper. 2377 01:49:19,257 --> 01:49:21,840 DAVID J. MALAN: Yes, do I have to create this variable, upper? 2378 01:49:21,840 --> 01:49:22,590 No, I don't. 2379 01:49:22,590 --> 01:49:24,870 I could actually tighten this up, and, if you really 2380 01:49:24,870 --> 01:49:28,170 want to see something neat, inside of the curly braces, 2381 01:49:28,170 --> 01:49:31,050 you don't have to just put the names of variables. 2382 01:49:31,050 --> 01:49:33,600 You can put a small amount of logic, so long 2383 01:49:33,600 --> 01:49:36,780 as it doesn't start to look stupid and kind of overwhelmingly complex, such 2384 01:49:36,780 --> 01:49:38,940 that it's sort of bad design at that point. 2385 01:49:38,940 --> 01:49:40,540 I can tighten this up like this. 2386 01:49:40,540 --> 01:49:44,610 And now we're in Python of Uppercase.py, writing Hello again. 2387 01:49:44,610 --> 01:49:45,730 And that, too, works. 2388 01:49:45,730 --> 01:49:47,280 But I would be careful about this. 2389 01:49:47,280 --> 01:49:50,483 You want to resist the temptation of having like a long line of code that's 2390 01:49:50,483 --> 01:49:53,400 inside the curly braces, because it's just going to be harder to read. 2391 01:49:53,400 --> 01:49:55,890 But, absolutely, you could indeed do that, too. 
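The tightened-up version mentioned in this exchange, with a method call placed directly inside the curly braces of an f-string, could be sketched like this. The hard-coded input and the `message` variable are assumptions added so the result can be inspected.

```python
before = "hello"  # stand-in for get_string("Before: ")

# An f-string can hold a small expression, not just a variable name.
message = f"After:  {before.upper()}"
print(message)
```

As the lecture cautions, this works best when the expression inside the braces stays short enough to read at a glance.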
2392 01:49:55,890 --> 01:49:58,950 All right, how about command line arguments, which was one thing 2393 01:49:58,950 --> 01:50:03,030 we introduced in week two also, so that we could actually have the ability 2394 01:50:03,030 --> 01:50:06,750 to take input from the user, whoops. 2395 01:50:06,750 --> 01:50:10,270 So we could actually take input from the user at the command line, 2396 01:50:10,270 --> 01:50:13,210 so as to take literally command line arguments. 2397 01:50:13,210 --> 01:50:16,020 These are a little different, but it follows the same paradigm. 2398 01:50:16,020 --> 01:50:19,860 There's no main by default. And there's no def main, int 2399 01:50:19,860 --> 01:50:26,050 argc, char, or we called it string, argv by default. There's none of this. 2400 01:50:26,050 --> 01:50:30,510 So if you want access to the argument vector, argv, you import it. 2401 01:50:30,510 --> 01:50:35,100 And it turns out, there's another module in Python, or library in Python 2402 01:50:35,100 --> 01:50:39,180 called sys, and you can import from the system this thing called argv. 2403 01:50:39,180 --> 01:50:41,357 So same idea, different place. 2404 01:50:41,357 --> 01:50:42,940 Now I'm going to go ahead and do this. 2405 01:50:42,940 --> 01:50:47,820 Let's write a program that just requires that the user types in a word 2406 01:50:47,820 --> 01:50:50,050 after the program's name, or none at all. 2407 01:50:50,050 --> 01:50:56,670 So if the length of argv equals 2, let's go ahead and print out, how about, 2408 01:50:56,670 --> 01:51:05,088 Hello comma argv bracket 1 close quote, else if they don't type two words 2409 01:51:05,088 --> 01:51:08,130 total at the prompt, let's just say the default, like we did weeks ago, 2410 01:51:08,130 --> 01:51:09,160 Hello, world.
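A rough sketch of the program just described follows. The helper function `greeting` is a hypothetical name introduced here so the logic can be exercised with different argument lists; the lecture's version presumably works on argv directly at the top level of the file.

```python
from sys import argv


def greeting(args):
    """Return the greeting for a given argument list (hypothetical helper)."""
    if len(args) == 2:
        # args[0] is the program's name; args[1] is the word typed after it.
        return f"hello, {args[1]}"
    else:
        return "hello, world"


print(greeting(argv))
```

Run as `python argv.py David`, this would greet David; run with no extra word, it falls back to the default greeting.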
2411 01:51:09,160 --> 01:51:12,180 So the only thing that's new here is we're importing argv from sys, 2412 01:51:12,180 --> 01:51:15,450 and we're using this fancy f-string format, which kind of to your point, 2413 01:51:15,450 --> 01:51:18,510 too, it's putting more complex logic in the curly braces. 2414 01:51:18,510 --> 01:51:19,270 But that's OK. 2415 01:51:19,270 --> 01:51:23,890 In this case, it's a list called argv, and we're getting bracket 1 from it. 2416 01:51:23,890 --> 01:51:27,780 Let's do Python of Argv.py, Enter, Hello, world. 2417 01:51:27,780 --> 01:51:31,480 What if I do Argv.py David at the command line? 2418 01:51:31,480 --> 01:51:32,730 Now I get Hello, David. 2419 01:51:32,730 --> 01:51:34,680 So there's one curiosity here. 2420 01:51:34,680 --> 01:51:39,375 Python is not included in argv, whereas in C, dot 2421 01:51:39,375 --> 01:51:41,940 slash whatever was the first thing. 2422 01:51:41,940 --> 01:51:45,510 The analog in Python is that the name of your Python program 2423 01:51:45,510 --> 01:51:49,800 is the first thing, in bracket 0, which is why David is in bracket 1; 2424 01:51:49,800 --> 01:51:55,740 the word Python does not appear in the argv list, just to be clear. 2425 01:51:55,740 --> 01:51:57,990 But otherwise, the idea of these arguments 2426 01:51:57,990 --> 01:52:00,383 is exactly the same as before. 2427 01:52:00,383 --> 01:52:02,550 And in fact, what you can do, which is kind of cool, 2428 01:52:02,550 --> 01:52:05,730 is, because argv is a list, you can do things like this. 2429 01:52:05,730 --> 01:52:10,890 For arg in argv, go ahead and print out each argument. 2430 01:52:10,890 --> 01:52:12,990 So instead of using a for loop and i and all 2431 01:52:12,990 --> 01:52:17,220 of this, if I do Python of argv Enter, it just writes the program's name. 2432 01:52:17,220 --> 01:52:21,960 If I do Python of argv Foo, it puts Argv.py and Foo.
2433 01:52:21,960 --> 01:52:26,520 If I do, sorry, if I do foo and bar, those words all print out. 2434 01:52:26,520 --> 01:52:28,770 If I do foo bar baz, those print out too. 2435 01:52:28,770 --> 01:52:31,830 And foo and bar and baz are like a mathematician's x and y and z 2436 01:52:31,830 --> 01:52:35,200 for computer scientists, when you just need some placeholder words. 2437 01:52:35,200 --> 01:52:36,420 So this is just nice. 2438 01:52:36,420 --> 01:52:40,020 It reads a little more like English, and a for loop is just much more concise, 2439 01:52:40,020 --> 01:52:43,530 allows you to iterate very quickly when you want something like that. 2440 01:52:43,530 --> 01:52:46,170 Suppose I only wanted the real words that the human typed 2441 01:52:46,170 --> 01:52:47,250 after the program's name. 2442 01:52:47,250 --> 01:52:50,460 Like, suppose I want to ignore Argv.py. 2443 01:52:50,460 --> 01:52:53,640 I mean I could do something hackish like this. 2444 01:52:53,640 --> 01:52:59,105 If arg equals Argv.py, I could just ignore, 2445 01:52:59,105 --> 01:53:00,480 you know, let's invert the logic. 2446 01:53:00,480 --> 01:53:02,530 I could do this, for instance. 2447 01:53:02,530 --> 01:53:05,100 So if the arg does not equal the program name, 2448 01:53:05,100 --> 01:53:07,890 then go ahead and print out the word. 2449 01:53:07,890 --> 01:53:09,840 So I get foo, bar, and baz only. 2450 01:53:09,840 --> 01:53:14,400 Or, this is what's kind of neat about Python, too, let me undo that. 2451 01:53:14,400 --> 01:53:18,400 And let me just take a slice of the array, of the list, instead. 2452 01:53:18,400 --> 01:53:22,810 So it turns out, if argv is a list, I can actually say, 2453 01:53:22,810 --> 01:53:27,060 you know what, go into that list, start at element 1, instead of 0, 2454 01:53:27,060 --> 01:53:29,200 and then go all the way to the end. 2455 01:53:29,200 --> 01:53:31,800 And we have not seen this syntax in C.
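To make the iteration and the "start at element 1" idea concrete, here is a small sketch. It assumes a hard-coded list named `args` that mimics what argv would hold for `python argv.py foo bar baz`, so the example behaves the same no matter how it is run.

```python
# Stand-in for sys.argv when running: python argv.py foo bar baz
args = ["argv.py", "foo", "bar", "baz"]

# Iterate over every argument, program name included.
for arg in args:
    print(arg)

# The "hackish" filter: skip the program's name by comparing strings.
for arg in args:
    if arg != "argv.py":
        print(arg)

# The slice: start at element 1 and go all the way to the end.
words = args[1:]
for arg in words:
    print(arg)
```

The slice produces a new list and leaves `args` itself unchanged.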
But this 2456 01:53:31,800 --> 01:53:34,410 is a way of slicing a list in Python. 2457 01:53:34,410 --> 01:53:35,820 So now watch what happens. 2458 01:53:35,820 --> 01:53:40,860 If I run Python of Argv.py, foo bar baz Enter, 2459 01:53:40,860 --> 01:53:44,730 I get only a subset of the list, starting at position 1, 2460 01:53:44,730 --> 01:53:46,892 going all of the way to the end. 2461 01:53:46,892 --> 01:53:48,600 And you can even do kind of the opposite. 2462 01:53:48,600 --> 01:53:51,330 If, for whatever reason, you want to ignore the last element, 2463 01:53:51,330 --> 01:53:57,030 you can say colon, we could say colon negative 1, 2464 01:53:57,030 --> 01:53:59,560 and use a negative number, which we've not seen before, 2465 01:53:59,560 --> 01:54:02,470 which slices off the end of the list, as well. 2466 01:54:02,470 --> 01:54:06,000 So there's some syntactic tricks that tend to be powerful in Python, too, 2467 01:54:06,000 --> 01:54:10,140 even if at first glance, you might not need them for typical things. 2468 01:54:10,140 --> 01:54:12,798 All right, let's do one other example with exit, 2469 01:54:12,798 --> 01:54:15,090 and then we'll start actually applying some algorithms, 2470 01:54:15,090 --> 01:54:16,215 to make things interesting. 2471 01:54:16,215 --> 01:54:20,470 So in one last program here, let's do Exit.py, just to do one more mechanic, 2472 01:54:20,470 --> 01:54:22,210 before we introduce some algorithms. 2473 01:54:22,210 --> 01:54:24,220 And let's do this. 2474 01:54:24,220 --> 01:54:28,900 Let's, from sys, import argv. 2475 01:54:28,900 --> 01:54:30,490 Let's now do this. 2476 01:54:30,490 --> 01:54:33,200 Let's make sure the user gives me one command line argument. 2477 01:54:33,200 --> 01:54:39,580 So if the length of argv does not equal 2 in total, then let's go ahead 2478 01:54:39,580 --> 01:54:42,790 and print out something like missing command line argument, 2479 01:54:42,790 --> 01:54:44,590 just to explain what the problem is.
2480 01:54:44,590 --> 01:54:47,380 And then let's do this. 2481 01:54:47,380 --> 01:54:48,580 We can exit. 2482 01:54:48,580 --> 01:54:50,710 But I'm going to use a better version of exit here. 2483 01:54:50,710 --> 01:54:52,900 Let me import two functions from sys. 2484 01:54:52,900 --> 01:54:57,040 Turns out the better way to do this is with sys.exit, because I can then exit 2485 01:54:57,040 --> 01:54:59,993 specifically, too, with this exit code. 2486 01:54:59,993 --> 01:55:02,410 Otherwise, down here, I'm going to go ahead and print out, 2487 01:55:02,410 --> 01:55:06,818 something like Hello, comma argv bracket 1, same as before. 2488 01:55:06,818 --> 01:55:08,360 And then I'm going to exit with zero. 2489 01:55:08,360 --> 01:55:10,410 So, again, this was a subtle thing we introduced 2490 01:55:10,410 --> 01:55:12,910 in week two, where you can actually have your programs exit, 2491 01:55:12,910 --> 01:55:15,430 with some number, where 0 signifies success, 2492 01:55:15,430 --> 01:55:17,350 and anything else signifies error. 2493 01:55:17,350 --> 01:55:19,240 This is just the same idea in Python. 2494 01:55:19,240 --> 01:55:23,920 So if I, for instance, just run the program like this, oops, I screwed up. 2495 01:55:23,920 --> 01:55:26,620 I meant to say exit here and exit here. 2496 01:55:26,620 --> 01:55:27,710 Let me do that again. 2497 01:55:27,710 --> 01:55:30,500 If I run this like this, I'm missing a command line argument. 2498 01:55:30,500 --> 01:55:33,200 So let me rerun it with like my name at the prompt. 2499 01:55:33,200 --> 01:55:37,030 So I have exactly two command line arguments, the file name and my name, 2500 01:55:37,030 --> 01:55:38,050 Hello comma David. 2501 01:55:38,050 --> 01:55:40,342 And if I do David Malan, it's not going to work either, 2502 01:55:40,342 --> 01:55:42,160 because now the length of argv does not equal 2.
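The exit-code behavior described above can be sketched as follows. The function name `run` is hypothetical, introduced so that both paths can be tried with explicit argument lists; in an actual script one would presumably apply the same checks to sys.argv directly. The exit status of 1 for the error case follows the lecture's later remark that the program exits with 1.

```python
import sys


def run(args):
    """Greet the user named in args[1], exiting with a status code (hypothetical helper)."""
    if len(args) != 2:
        print("Missing command-line argument")
        sys.exit(1)  # any nonzero status signals an error
    print(f"hello, {args[1]}")
    sys.exit(0)  # zero signals success


# In a real script, the last line would be: run(sys.argv)
```

Under the hood, `sys.exit` raises a `SystemExit` exception, which is how the interpreter unwinds and reports the status code to the shell.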
2503 01:55:42,160 --> 01:55:44,860 But the difference here is that we're exiting with 1, 2504 01:55:44,860 --> 01:55:49,900 so that special programs can detect an error, or 0 in the event of success. 2505 01:55:49,900 --> 01:55:52,180 And now there's one other way to do this, too. 2506 01:55:52,180 --> 01:55:54,460 Suppose that you're importing a lot of functions, 2507 01:55:54,460 --> 01:55:56,943 and you don't really want to make a mess of things 2508 01:55:56,943 --> 01:55:59,110 and just have all of these function names available, 2509 01:55:59,110 --> 01:56:01,630 without it being clear where they came from. 2510 01:56:01,630 --> 01:56:03,460 Let's just import all of sys. 2511 01:56:03,460 --> 01:56:07,180 And let's just change our syntax, kind of like I proposed for CS50, 2512 01:56:07,180 --> 01:56:09,970 where we just prepend to all of these library functions, 2513 01:56:09,970 --> 01:56:13,420 sys dot, just to be super-explicit where they came from, 2514 01:56:13,420 --> 01:56:18,837 and if there's another exit or argv value 2515 01:56:18,837 --> 01:56:21,920 that we want to import from a library, this is one way to avoid collision. 2516 01:56:21,920 --> 01:56:25,150 So if I do it one last time here, missing command line argument. 2517 01:56:25,150 --> 01:56:27,190 But David still actually worked. 2518 01:56:27,190 --> 01:56:30,250 All right, only to demonstrate how we can implement that same idea. 2519 01:56:30,250 --> 01:56:33,130 Let's now do something more powerful, like a search algorithm, 2520 01:56:33,130 --> 01:56:34,032 like binary search. 2521 01:56:34,032 --> 01:56:36,490 I'm going to go ahead and open up a file called Numbers.py, 2522 01:56:36,490 --> 01:56:40,420 and let's just do some searching or linear search, rather, 2523 01:56:40,420 --> 01:56:42,440 on a list of numbers. 2524 01:56:42,440 --> 01:56:44,060 Let's go ahead and do this. 2525 01:56:44,060 --> 01:56:47,050 How about import sys as before.
2526 01:56:47,050 --> 01:56:52,840 Let me give myself a list of numbers, like 4, 6, 8, 2, 7, 5, 0, 2527 01:56:52,840 --> 01:56:54,670 so just a bunch of integers. 2528 01:56:54,670 --> 01:56:56,170 And then let's do this. 2529 01:56:56,170 --> 01:56:59,590 If you recall from week three, we searched for the number 0 2530 01:56:59,590 --> 01:57:01,880 at the end of the lockers on stage. 2531 01:57:01,880 --> 01:57:04,120 So let's just ask that question in Python. 2532 01:57:04,120 --> 01:57:05,860 No need for a loop or anything like that. 2533 01:57:05,860 --> 01:57:09,550 If 0 is in the numbers, go ahead and print out found. 2534 01:57:09,550 --> 01:57:13,420 And then let's just exit successfully, with 0, else, if we get down here, 2535 01:57:13,420 --> 01:57:15,670 let's just say print not found. 2536 01:57:15,670 --> 01:57:19,210 And then we'll sys.exit with 1. 2537 01:57:19,210 --> 01:57:21,820 So this is where Python starts to get powerful again. 2538 01:57:21,820 --> 01:57:23,050 Here's your list. 2539 01:57:23,050 --> 01:57:25,733 Here is your loop, that's doing all of the checking for you. 2540 01:57:25,733 --> 01:57:28,150 Underneath the hood, Python is going to use linear search. 2541 01:57:28,150 --> 01:57:29,817 You don't have to implement it yourself. 2542 01:57:29,817 --> 01:57:32,320 No while loop, no for loop, you just ask a question. 2543 01:57:32,320 --> 01:57:36,230 If 0 is in numbers, then do the following. 2544 01:57:36,230 --> 01:57:38,350 So that's one feature we now get with Python, 2545 01:57:38,350 --> 01:57:40,340 and get to throw away a lot of that code. 2546 01:57:40,340 --> 01:57:41,830 We can do it with strings, too. 2547 01:57:41,830 --> 01:57:44,840 Let me open a file called Names.py instead, 2548 01:57:44,840 --> 01:57:46,990 and do something that was even more involved in C, 2549 01:57:46,990 --> 01:57:50,020 because we needed strcmp and the for loop, and so forth.
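As a point of comparison for the numbers example, the sketch below puts the membership test next to a hand-written linear search. The function `contains` is a hypothetical helper written only for contrast; the calls to sys.exit from the lecture's version are left out so the snippet can run to completion.

```python
numbers = [4, 6, 8, 2, 7, 5, 0]


def contains(values, target):
    """Hand-written linear search, roughly what the C version had to do (hypothetical helper)."""
    for value in values:
        if value == target:
            return True
    return False


# The Python way: just ask the question. For a list, this checks items one by one.
if 0 in numbers:
    print("Found")
else:
    print("Not found")
```

The same `in` test works on a list of strings, where it compares with `==` rather than requiring anything like strcmp.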
2550 01:57:50,020 --> 01:57:52,000 Let me import sys for this file. 2551 01:57:52,000 --> 01:57:54,460 Let's give myself a bunch of names like we did in C. 2552 01:57:54,460 --> 01:58:01,630 And those were Bill and Charlie and Fred and George and Ginny, 2553 01:58:01,630 --> 01:58:05,440 and two more, Percy, and lastly Ron. 2554 01:58:05,440 --> 01:58:07,390 And recall, at the time, we looked for Ron. 2555 01:58:07,390 --> 01:58:09,432 And so we had to iterate through the whole thing, 2556 01:58:09,432 --> 01:58:11,810 doing strcmp and i plus plus and all of that. 2557 01:58:11,810 --> 01:58:18,760 Now just ask the question, if Ron is in names, then let's go ahead 2558 01:58:18,760 --> 01:58:20,440 and, whoops, let me hide that. 2559 01:58:20,440 --> 01:58:22,250 I hit the command too soon. 2560 01:58:22,250 --> 01:58:26,180 Let me go ahead and say print, found, as before. 2561 01:58:26,180 --> 01:58:29,710 sys.exit 0, just to indicate success, and then down here, 2562 01:58:29,710 --> 01:58:32,840 if we get to this point, we can say not found. 2563 01:58:32,840 --> 01:58:36,170 And then we'll just sys.exit 1 instead. 2564 01:58:36,170 --> 01:58:40,960 So, again, this just does linear search for us by default, Python of Names.py, 2565 01:58:40,960 --> 01:58:44,410 we found Ron, because, indeed, he's there, and at the end of the list. 2566 01:58:44,410 --> 01:58:48,190 But we don't need to deal with all of the mechanics of it. 2567 01:58:48,190 --> 01:58:50,530 All right, let's take things one step further. 2568 01:58:50,530 --> 01:58:52,840 In week three, we also implemented the idea 2569 01:58:52,840 --> 01:58:56,980 of a phone book, that actually associated keys with values. 2570 01:58:56,980 --> 01:59:00,010 But remember, the phone book in C, was kind of a hack, right? 2571 01:59:00,010 --> 01:59:03,520 Because we first had two arrays, one with names, one with numbers.
2572 01:59:03,520 --> 01:59:07,330 Then we introduced structs, and so we gave you a person structure. 2573 01:59:07,330 --> 01:59:10,900 And then we had an array of persons. 2574 01:59:10,900 --> 01:59:15,040 You can do this in Python, using objects and things called classes. 2575 01:59:15,040 --> 01:59:17,670 But we can also just use a general purpose dictionary, 2576 01:59:17,670 --> 01:59:21,420 because just like in P set 5, you can associate keys with values, using 2577 01:59:21,420 --> 01:59:23,100 a hash table, using a trie. 2578 01:59:23,100 --> 01:59:26,400 Well, similarly, can Python just do this for us. 2579 01:59:26,400 --> 01:59:29,250 From cs50, let's import get_string. 2580 01:59:29,250 --> 01:59:32,760 And now let's give myself a dictionary of people, 2581 01:59:32,760 --> 01:59:36,540 dict, open paren, closed paren, gives you a dictionary. 2582 01:59:36,540 --> 01:59:39,300 Or you can simplify the syntax, actually, 2583 01:59:39,300 --> 01:59:42,360 and a dictionary again is just keys and values, words and definitions. 2584 01:59:42,360 --> 01:59:45,060 You can also just use curly braces instead. 2585 01:59:45,060 --> 01:59:47,020 That gives me an empty dictionary. 2586 01:59:47,020 --> 01:59:50,400 But if I know what I want to put in it by default, let's put Carter in there, 2587 01:59:50,400 --> 01:59:57,790 with a number of plus 1-617-495-1000, just like last time, and put myself, 2588 01:59:57,790 --> 02:00:03,777 David, with plus 1-949-468-2750. 2589 02:00:03,777 --> 02:00:06,360 And it came to my attention, tragically, after class that day, 2590 02:00:06,360 --> 02:00:08,152 that we had a bug in our little Easter egg. 2591 02:00:08,152 --> 02:00:11,190 If today, you would like to call me or text me, at that number, 2592 02:00:11,190 --> 02:00:14,130 we have fixed the code that underlies that little Easter egg. 2593 02:00:14,130 --> 02:00:15,090 Spoiler ahead.
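The ways of creating a dictionary mentioned above can be sketched like this. The names `empty_a` and `empty_b` are placeholders chosen for illustration; the people dictionary uses the names and numbers read out in the lecture.

```python
# Two equivalent ways to create an empty dictionary.
empty_a = dict()
empty_b = {}

# A dictionary with initial contents: each key (a name) maps to a value (a number).
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}

print(people)
```

The colon separates each key from its value, and commas separate one pair from the next.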
2594 02:00:15,090 --> 02:00:17,040 All right, so this now gives me a variable 2595 02:00:17,040 --> 02:00:21,120 called people, that's associating keys with values. 2596 02:00:21,120 --> 02:00:25,230 There is some new syntax here in Python, not just the curly braces, 2597 02:00:25,230 --> 02:00:28,290 but the colons, and the quotes on the left and the right. 2598 02:00:28,290 --> 02:00:31,380 This is a way, in Python, of associating keys 2599 02:00:31,380 --> 02:00:35,350 with values, words with definitions, anything with anything else. 2600 02:00:35,350 --> 02:00:38,550 And it's going to be a super-common paradigm, including in week seven, 2601 02:00:38,550 --> 02:00:42,450 when we look at CSS and HTML and web programming, keys and values 2602 02:00:42,450 --> 02:00:45,840 are like this omnipresent idea in computer science and programming, 2603 02:00:45,840 --> 02:00:49,300 because it's just a really useful way of associating one thing with another. 2604 02:00:49,300 --> 02:00:52,690 So, at this point in the story, we have a dictionary, a hash table, 2605 02:00:52,690 --> 02:00:56,190 if you will, of people, associating names with phone numbers, 2606 02:00:56,190 --> 02:00:57,675 just like a real world phone book. 2607 02:00:57,675 --> 02:01:01,200 So let's write a program that gets a string from the user and asks them 2608 02:01:01,200 --> 02:01:03,390 whose number they would like to look up. 2609 02:01:03,390 --> 02:01:09,510 Then, let's go ahead and say, if that name is in the people dictionary, 2610 02:01:09,510 --> 02:01:12,090 go ahead and print out that person's number, 2611 02:01:12,090 --> 02:01:14,730 by going into the people dictionary and going 2612 02:01:14,730 --> 02:01:19,480 to that specific name, within there, using an f-string for the whole thing. 2613 02:01:19,480 --> 02:01:21,960 So this is similar in spirit to before. 
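A sketch of the lookup just described is below. The function `lookup` is a hypothetical wrapper so the behavior can be tried with several names without prompting, and the "Not found" branch is an assumption added here; the lecture's version only describes the found case.

```python
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}


def lookup(name):
    """Return a message with the person's number, if known (hypothetical helper)."""
    if name in people:  # membership test on the dictionary's keys
        number = people[name]  # index into the dictionary with a string key
        return f"Number: {number}"
    return "Not found"  # assumed fallback, not part of the lecture's code


print(lookup("David"))
```

Indexing with a key that is not present, such as `people["Ron"]`, would raise a `KeyError`, which is why the `in` check comes first.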
2614 02:01:21,960 --> 02:01:26,130 Linear search and dictionary lookups will just happen automatically for you 2615 02:01:26,130 --> 02:01:29,280 in Python, by just asking the question, if name in people. 2616 02:01:29,280 --> 02:01:31,170 And this line is just going to print out, 2617 02:01:31,170 --> 02:01:35,710 whatever is in the people dictionary, at that name. 2618 02:01:35,710 --> 02:01:40,200 So I'm using square brackets, because here's the interesting thing in Python, 2619 02:01:40,200 --> 02:01:43,320 just like you can index into an array, or a list in Python, 2620 02:01:43,320 --> 02:01:48,150 using numbers, 0, 1, 2, you can very conveniently index 2621 02:01:48,150 --> 02:01:53,080 into a dictionary in Python, using square brackets, as well. 2622 02:01:53,080 --> 02:01:56,070 And just to make clear what's going on here, let me go 2623 02:01:56,070 --> 02:02:00,480 and create a temporary variable, person equals people bracket name. 2624 02:02:00,480 --> 02:02:05,010 And then let's just, or, sorry, let's say, number equals people bracket name. 2625 02:02:05,010 --> 02:02:07,890 And that will just print out the number in question. 2626 02:02:07,890 --> 02:02:11,850 In C, and previously in Python, anything with square brackets like this 2627 02:02:11,850 --> 02:02:16,950 would have meant go to a location in a list or an array, using a number. 2628 02:02:16,950 --> 02:02:20,790 But that can actually be a string, like a word the human has typed. 2629 02:02:20,790 --> 02:02:22,830 And this is what's amazing about dictionaries, 2630 02:02:22,830 --> 02:02:25,890 it's not like a big line, a big linear thing. 2631 02:02:25,890 --> 02:02:28,740 It's this table, that you can look up in one column the name, 2632 02:02:28,740 --> 02:02:31,060 and get back in the other column the number. 2633 02:02:31,060 --> 02:02:33,120 So let's go ahead and run Python of Phonebook.py, 2634 02:02:33,120 --> 02:02:38,100 found, not that, oh, wait.
2635 02:02:38,100 --> 02:02:41,880 That's not what's supposed to happen at all. 2636 02:02:41,880 --> 02:02:43,440 I think I'm in the wrong play. 2637 02:02:43,440 --> 02:02:44,290 Phonebook.py. 2638 02:02:44,290 --> 02:02:47,130 2639 02:02:47,130 --> 02:02:49,260 What's going on? 2640 02:02:49,260 --> 02:02:51,720 Print found. 2641 02:02:51,720 --> 02:02:53,580 I am confused. 2642 02:02:53,580 --> 02:02:55,830 OK, let's run this again. 2643 02:02:55,830 --> 02:02:59,970 Python of Phonebook.py, what the-- 2644 02:02:59,970 --> 02:03:01,050 OK, stand by. 2645 02:03:01,050 --> 02:03:07,026 2646 02:03:07,026 --> 02:03:17,902 [KEYS CLICKING] 2647 02:03:17,902 --> 02:03:19,140 What the heck? 2648 02:03:19,140 --> 02:03:21,255 What am I not understanding here? 2649 02:03:21,255 --> 02:03:24,180 2650 02:03:24,180 --> 02:03:27,348 OK, Roxanne, Carter, do you see what I'm doing wrong? 2651 02:03:27,348 --> 02:03:29,220 AUDIENCE: I don't. 2652 02:03:29,220 --> 02:03:31,484 DAVID J. MALAN: What the-- 2653 02:03:31,484 --> 02:03:33,720 [LAUGHTER] 2654 02:03:33,720 --> 02:03:34,230 Say again? 2655 02:03:34,230 --> 02:03:38,110 SPEAKER 47: When you found the test results, it was doing both commands. 2656 02:03:38,110 --> 02:03:43,390 DAVID J. MALAN: Oh, yeah, found, OK, we're going to do this. 2657 02:03:43,390 --> 02:03:45,622 One sec. 2658 02:03:45,622 --> 02:03:52,270 [KEYS CLICKING] 2659 02:03:52,270 --> 02:03:55,360 Whoa, OK. 2660 02:03:55,360 --> 02:03:57,270 All this is coming out of the video. 2661 02:03:57,270 --> 02:03:58,228 So. 2662 02:03:58,228 --> 02:03:59,164 [LAUGHTER] 2663 02:03:59,164 --> 02:04:01,310 [APPLAUSE] 2664 02:04:01,310 --> 02:04:01,810 Thanks. 2665 02:04:01,810 --> 02:04:05,400 2666 02:04:05,400 --> 02:04:06,283 All right. 2667 02:04:06,283 --> 02:04:08,200 I will try to figure out what was going wrong. 2668 02:04:08,200 --> 02:04:10,800 The best I can tell, it was running the wrong program. 
2669 02:04:10,800 --> 02:04:12,820 I don't quite understand why. 2670 02:04:12,820 --> 02:04:14,170 So we will diagnose this later. 2671 02:04:14,170 --> 02:04:16,962 I just put the file into a temporary directory, for now, to run it. 2672 02:04:16,962 --> 02:04:22,710 So let me go ahead and just run this, Python of Phonebook.py, 2673 02:04:22,710 --> 02:04:24,240 type in, for instance, my name. 2674 02:04:24,240 --> 02:04:26,418 And there's my corresponding number. 2675 02:04:26,418 --> 02:04:27,960 Have no idea what was just happening. 2676 02:04:27,960 --> 02:04:30,060 But I will get to the bottom of it and update you, 2677 02:04:30,060 --> 02:04:31,360 if we can put our finger on it. 2678 02:04:31,360 --> 02:04:34,890 So this was just an example, now, of implementing a phone book. 2679 02:04:34,890 --> 02:04:37,590 Let's now consider what we can do that's a little more 2680 02:04:37,590 --> 02:04:40,410 powerful, in these examples, like a phone book that 2681 02:04:40,410 --> 02:04:42,150 actually keeps this information around. 2682 02:04:42,150 --> 02:04:45,510 Thus far, these simple phone book examples throw the information away. 2683 02:04:45,510 --> 02:04:48,780 But using CSV files, comma separated values, 2684 02:04:48,780 --> 02:04:51,555 maybe we could actually keep around the names and numbers, 2685 02:04:51,555 --> 02:04:53,430 so that, like on your phone, you can actually 2686 02:04:53,430 --> 02:04:55,780 keep your contacts around long-term. 2687 02:04:55,780 --> 02:04:59,060 So I'm going to go ahead now and do a slightly different example. 2688 02:04:59,060 --> 02:05:03,240 And let me just hide this detail, so it's not confusing. 2689 02:05:03,240 --> 02:05:06,630 Whoops, I'm going to change my prompt temporarily. 2690 02:05:06,630 --> 02:05:10,540 So let me go ahead now and refine this example as follows. 
2691 02:05:10,540 --> 02:05:13,830 I'm going to go into Phonebook.py, and I'm 2692 02:05:13,830 --> 02:05:16,290 going to import a whole library called csv. 2693 02:05:16,290 --> 02:05:18,150 And this is a powerful one, because Python 2694 02:05:18,150 --> 02:05:21,870 comes with a library that just handles CSV files for you. 2695 02:05:21,870 --> 02:05:25,600 A CSV file is just a file with comma separated values. 2696 02:05:25,600 --> 02:05:29,580 And, in fact, to demonstrate this, let me check on one thing 2697 02:05:29,580 --> 02:05:32,460 here, just to make this a little more real. 2698 02:05:32,460 --> 02:05:39,010 To demonstrate this, let's go ahead and do this. 2699 02:05:39,010 --> 02:05:41,970 Let me import the csv library. From cs50, 2700 02:05:41,970 --> 02:05:43,830 let me import get_string. 2701 02:05:43,830 --> 02:05:47,550 Let me then open a file, using the open function, 2702 02:05:47,550 --> 02:05:52,410 open a file called Phonebook.csv, in append format, 2703 02:05:52,410 --> 02:05:54,900 in contrast with read format and write format. 2704 02:05:54,900 --> 02:05:58,450 Write just blows it away if it exists, append adds to the bottom of it. 2705 02:05:58,450 --> 02:06:00,930 So I keep this phone book around, just like you might 2706 02:06:00,930 --> 02:06:02,868 keep adding contacts to your phone. 2707 02:06:02,868 --> 02:06:05,410 Now let me go ahead and get a couple of values from the user. 2708 02:06:05,410 --> 02:06:08,820 Let me say get_string and ask the user for a name. 2709 02:06:08,820 --> 02:06:14,160 Then let me get_string again, and ask the user for their number. 2710 02:06:14,160 --> 02:06:16,185 And now, let me go ahead and do this. 2711 02:06:16,185 --> 02:06:18,060 And this is new, and this is Python-specific. 2712 02:06:18,060 --> 02:06:20,820 And you would only know this by following a tutorial, 2713 02:06:20,820 --> 02:06:22,480 or reading the documentation.
2714 02:06:22,480 --> 02:06:24,870 Let me give myself a variable called writer, 2715 02:06:24,870 --> 02:06:29,950 and ask the CSV library for a writer to that file. 2716 02:06:29,950 --> 02:06:33,390 Then, let me go ahead and use that writer variable, 2717 02:06:33,390 --> 02:06:36,720 use a function or a method inside of it, called write row, 2718 02:06:36,720 --> 02:06:41,200 to write out a list containing that person's name and number. 2719 02:06:41,200 --> 02:06:44,310 Notice the square brackets inside the parentheses, 2720 02:06:44,310 --> 02:06:49,350 because I'm just printing a list to that particular row in the file. 2721 02:06:49,350 --> 02:06:51,100 And then I'm just going to close the file. 2722 02:06:51,100 --> 02:06:52,742 So what is the effect of all of this? 2723 02:06:52,742 --> 02:06:55,200 Well, let me go ahead and run this version of Phonebook.py, 2724 02:06:55,200 --> 02:06:56,680 and I'm prompted for a name. 2725 02:06:56,680 --> 02:07:05,130 Let's do Carter's first, plus 1-617-495-1000, and then, 2726 02:07:05,130 --> 02:07:07,770 let's go ahead and LS. 2727 02:07:07,770 --> 02:07:10,960 Notice in my current directory, there's two files now, Phonebook.py, 2728 02:07:10,960 --> 02:07:14,430 which I wrote, and apparently Phonebook.csv. 2729 02:07:14,430 --> 02:07:16,830 CSV just stands for comma separated values. 2730 02:07:16,830 --> 02:07:20,380 And it's like a very simple way of storing data in a spreadsheet, 2731 02:07:20,380 --> 02:07:23,670 if you will, where the comma represents the separation between your columns. 2732 02:07:23,670 --> 02:07:26,370 There's only two columns here, name and number. 2733 02:07:26,370 --> 02:07:29,580 But, because I'm writing to this file in append mode, 2734 02:07:29,580 --> 02:07:33,220 let me run it one more time, Python of Phonebook.py, 2735 02:07:33,220 --> 02:07:41,490 and let me go ahead and do David and plus 1-949-468-2750, Enter. 
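A minimal sketch of the phone book writer described above, using only Python's standard csv module. The function name add_contact and its parameters are assumptions made for illustration; the lecture gets the name and number from CS50's get_string instead of passing them in.

```python
import csv


def add_contact(path, name, number):
    """Append one name/number row to the CSV file at path."""
    # "a" is append mode: existing rows are kept and the new row goes at the bottom.
    # newline="" is what the csv module's documentation recommends for files it writes.
    file = open(path, "a", newline="")
    writer = csv.writer(file)
    # writerow takes a list; each element becomes one comma-separated column.
    writer.writerow([name, number])
    # Closing explicitly, as in this first version of the program.
    file.close()
```

Calling add_contact("phonebook.csv", "Carter", "+1-617-495-1000") twice with different people would leave two rows in the file, since append mode never overwrites what is already there.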
2736 02:07:41,490 --> 02:07:43,350 And notice what happened in the CSV file. 2737 02:07:43,350 --> 02:07:46,380 It automatically updated, because I'm now persisting 2738 02:07:46,380 --> 02:07:49,000 this data to the file in question. 2739 02:07:49,000 --> 02:07:51,360 So if I wanted to now read this file in, I 2740 02:07:51,360 --> 02:07:55,680 could actually go ahead and do linear search on the data, 2741 02:07:55,680 --> 02:07:58,650 using a read function to actually read from the CSV. 2742 02:07:58,650 --> 02:08:01,350 But, for now, we'll just leave it a little simply as write. 2743 02:08:01,350 --> 02:08:03,270 And let me make one refinement here. 2744 02:08:03,270 --> 02:08:07,020 It turns out that, if you're in the habit of re-opening a file, 2745 02:08:07,020 --> 02:08:09,330 you don't have to even close it explicitly. 2746 02:08:09,330 --> 02:08:10,920 You can instead do this. 2747 02:08:10,920 --> 02:08:16,050 You can instead say, with the opening of a file called Phonebook.csv 2748 02:08:16,050 --> 02:08:21,300 in append mode, calling the thing file, go ahead and do all of these lines 2749 02:08:21,300 --> 02:08:22,350 here. 2750 02:08:22,350 --> 02:08:24,377 So the with keyword is a new thing in Python. 2751 02:08:24,377 --> 02:08:27,210 And it's used in a few different ways, but one of the ways it's used 2752 02:08:27,210 --> 02:08:28,335 is to tighten up code here. 2753 02:08:28,335 --> 02:08:30,418 And I'm going to move my variables to the outside, 2754 02:08:30,418 --> 02:08:32,910 because they don't need to be inside of the with statement, 2755 02:08:32,910 --> 02:08:33,868 where the file is open. 2756 02:08:33,868 --> 02:08:36,452 This just has the effect of ensuring that you, the programmer, 2757 02:08:36,452 --> 02:08:38,790 don't screw up, and accidentally don't close your file. 
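The with refinement can be sketched as follows, again with the hypothetical add_contact helper rather than the lecture's exact script. The return value is only there to make the automatic close visible; it is not part of the lecture's code.

```python
import csv


def add_contact(path, name, number):
    """Append one row, letting the with statement close the file."""
    # The name and number already exist before the with block starts;
    # only the writing itself needs the file to be open.
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, number])
    # Leaving the indented block closes the file, even if an error was raised inside it.
    return file.closed
```

There is no call to close anywhere, yet the function reports the file as closed once the block ends.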
2758 02:08:38,790 --> 02:08:40,680 In fact, you might recall, from C, Valgrind 2759 02:08:40,680 --> 02:08:45,237 might have complained at you, if you had a file that, you didn't close a file, 2760 02:08:45,237 --> 02:08:47,820 you might have had a memory leak as a result. The with keyword 2761 02:08:47,820 --> 02:08:51,840 takes care of all of that for you, as well. 2762 02:08:51,840 --> 02:08:54,670 How about let's do, want to do this. 2763 02:08:54,670 --> 02:08:57,960 How about, let's do one other thing. 2764 02:08:57,960 --> 02:08:59,230 Let's do this. 2765 02:08:59,230 --> 02:09:02,280 Let me go ahead and propose, that on your phone or laptop 2766 02:09:02,280 --> 02:09:07,470 here, or online, go to this URL here, where you'll find a Google form. 2767 02:09:07,470 --> 02:09:10,290 And just to show that these CSVs are actually kind of omnipresent, 2768 02:09:10,290 --> 02:09:11,850 and if you've ever like used a Google Form 2769 02:09:11,850 --> 02:09:13,560 or managed a student group, or something where you've 2770 02:09:13,560 --> 02:09:15,750 collected data via Google Forms, you can actually 2771 02:09:15,750 --> 02:09:18,640 export all of that data via CSV files. 2772 02:09:18,640 --> 02:09:21,150 So go ahead to this URL here. 2773 02:09:21,150 --> 02:09:22,950 And those of you watching on demand later, 2774 02:09:22,950 --> 02:09:24,540 will find that the form is no longer working, 2775 02:09:24,540 --> 02:09:26,030 since we're only doing this live. 2776 02:09:26,030 --> 02:09:27,780 But that will lead to a Google Form that's 2777 02:09:27,780 --> 02:09:30,750 going to let everyone input their answer to a question, 2778 02:09:30,750 --> 02:09:33,660 like what house do you want to end up into, 2779 02:09:33,660 --> 02:09:36,630 sort of an approximation of the sorting hat in Harry Potter. 2780 02:09:36,630 --> 02:09:40,680 And via this form, will we then have the ability to export, 2781 02:09:40,680 --> 02:09:43,780 we'll see, a CSV file. 
2782 02:09:43,780 --> 02:09:47,610 So let's give you a moment to do that. 2783 02:09:47,610 --> 02:09:50,460 In just a moment, I'll share my version of the screen, which 2784 02:09:50,460 --> 02:09:54,330 is going to let me actually open the file, the form itself. 2785 02:09:54,330 --> 02:09:59,070 And in just a moment, I'll switch over. 2786 02:09:59,070 --> 02:10:01,020 OK, so this is now my version of the form 2787 02:10:01,020 --> 02:10:04,290 here, where we have 200 plus responses to a simple question of the form, what 2788 02:10:04,290 --> 02:10:08,010 house do you belong in, Gryffindor, Hufflepuff, Ravenclaw, or Slytherin. 2789 02:10:08,010 --> 02:10:12,800 If I go over to responses, I'll see all of the responses in the GUI form here. 2790 02:10:12,800 --> 02:10:15,300 So graphical user interface, and we could flip through this. 2791 02:10:15,300 --> 02:10:20,010 And it looks like, interestingly, 40% of Harvard students 2792 02:10:20,010 --> 02:10:24,223 want to be in Gryffindor, 22% in Slytherin, and everyone else 2793 02:10:24,223 --> 02:10:25,140 in between the others. 2794 02:10:25,140 --> 02:10:27,270 But you might have noticed, if ever using a Google Form, 2795 02:10:27,270 --> 02:10:28,720 this Google Spreadsheets link. 2796 02:10:28,720 --> 02:10:30,010 So I'm going to go ahead and click that. 2797 02:10:30,010 --> 02:10:32,460 And that's going to automatically open, in this case, Google Spreadsheets. 2798 02:10:32,460 --> 02:10:35,290 But you can do the same thing with Office 365 as well. 2799 02:10:35,290 --> 02:10:38,040 And now you see the raw data as a spreadsheet. 2800 02:10:38,040 --> 02:10:42,900 But in Google Spreadsheets, if I go to File and then I go to Download, 2801 02:10:42,900 --> 02:10:46,800 notice I can download this as an Excel file, a PDF, and also 2802 02:10:46,800 --> 02:10:48,910 a CSV, comma separated values. 2803 02:10:48,910 --> 02:10:50,620 So let me go ahead and do that. 
2804 02:10:50,620 --> 02:10:53,920 That gives me a file in my Downloads folder on my computer. 2805 02:10:53,920 --> 02:10:57,970 I'm going to now go back to my code editor here. 2806 02:10:57,970 --> 02:11:00,180 And what I'm going to go ahead and do is upload 2807 02:11:00,180 --> 02:11:04,320 this file, from my Downloads folder to VS Code, 2808 02:11:04,320 --> 02:11:06,610 so that we can actually see it within here. 2809 02:11:06,610 --> 02:11:08,220 And now you can see this open file. 2810 02:11:08,220 --> 02:11:11,220 And I'm going to shorten its name, just so it's a little easier to read. 2811 02:11:11,220 --> 02:11:15,990 I'm going to rename this using the MV command, to just Hogwarts.csv. 2812 02:11:15,990 --> 02:11:19,367 And then we can see, in the file, that there's two columns, timestamp column 2813 02:11:19,367 --> 02:11:21,450 house, where you have a whole bunch of time stamps 2814 02:11:21,450 --> 02:11:24,270 when people filled out the form, with someone very early in class. 2815 02:11:24,270 --> 02:11:25,980 And then everyone else just a moment ago. 2816 02:11:25,980 --> 02:11:29,310 And the second value, after each comma, is the name of the house. 2817 02:11:29,310 --> 02:11:32,040 Well, let me go ahead here and implement a program 2818 02:11:32,040 --> 02:11:36,100 in a file called Hogwarts.py, that processes this data. 2819 02:11:36,100 --> 02:11:38,280 So in Hogwarts.py, let's just write a program 2820 02:11:38,280 --> 02:11:41,440 that now reads a CSV, in this case not a phone book, 2821 02:11:41,440 --> 02:11:43,410 but everyone's sorting hat information. 2822 02:11:43,410 --> 02:11:45,450 And I'm going to go ahead and Import CSV. 2823 02:11:45,450 --> 02:11:48,660 And suppose I want to answer a reasonable question, ignoring 2824 02:11:48,660 --> 02:11:52,470 the fact that Google's GUI or graphical user interface, can do this for me. 2825 02:11:52,470 --> 02:11:55,320 I just want to count up who's going to be in which house. 
2826 02:11:55,320 --> 02:11:59,640 So let me give myself a dictionary called houses, that's initially empty, 2827 02:11:59,640 --> 02:12:00,780 with curly braces. 2828 02:12:00,780 --> 02:12:02,790 And let me pre-create a few keys. 2829 02:12:02,790 --> 02:12:07,500 Let me say Gryffindor is going to be initialized to 0, 2830 02:12:07,500 --> 02:12:11,820 Hufflepuff will be initialized to 0 as well, Ravenclaw 2831 02:12:11,820 --> 02:12:13,200 will be initialized to 0. 2832 02:12:13,200 --> 02:12:16,770 And finally, Slytherin will be initialized to 0. 2833 02:12:16,770 --> 02:12:19,950 So here's another example of a dictionary, or a hash table, 2834 02:12:19,950 --> 02:12:22,140 just being a very general-purpose piece of data. 2835 02:12:22,140 --> 02:12:23,760 You can have keys and values. 2836 02:12:23,760 --> 02:12:25,470 The keys, in this case, are the houses. 2837 02:12:25,470 --> 02:12:28,500 The values are initially zero, but I'm going to use this, 2838 02:12:28,500 --> 02:12:33,600 instead of like four separate variables, to keep track of everyone's answer 2839 02:12:33,600 --> 02:12:34,730 to this form. 2840 02:12:34,730 --> 02:12:35,730 So I'm going to do this. 2841 02:12:35,730 --> 02:12:43,180 With opening Hogwarts.csv, in read mode, not append, I don't want to change it. 2842 02:12:43,180 --> 02:12:46,440 I just want to read it, as file as my variable name. 2843 02:12:46,440 --> 02:12:49,530 Let's go ahead and create a reader this time, 2844 02:12:49,530 --> 02:12:54,710 that is using the reader function in the CSV library, by opening that file. 2845 02:12:54,710 --> 02:12:57,210 I'm going to go ahead and ignore the first line of the file, 2846 02:12:57,210 --> 02:13:00,270 because, recall, that the first line is just timestamp and house. 2847 02:13:00,270 --> 02:13:01,450 I want to get the real data. 
2848 02:13:01,450 --> 02:13:03,540 So this next function is just a little trick 2849 02:13:03,540 --> 02:13:06,730 for ignoring the first line of the file. 2850 02:13:06,730 --> 02:13:07,800 Then let's do this. 2851 02:13:07,800 --> 02:13:12,180 For every other row in the reader, that is line by line, 2852 02:13:12,180 --> 02:13:15,420 get the current person's house, which is in row bracket 1. 2853 02:13:15,420 --> 02:13:18,213 This is what the CSV reader library is doing for us. 2854 02:13:18,213 --> 02:13:20,130 It's handling all of the reading of this file. 2855 02:13:20,130 --> 02:13:23,760 It figures out where the comma is, and, for every row in the file, 2856 02:13:23,760 --> 02:13:26,250 it hands you back a list of size 2. 2857 02:13:26,250 --> 02:13:31,090 In bracket 0 is the time stamp, in bracket 1 is the house name. 2858 02:13:31,090 --> 02:13:34,830 So, in my code, I can say house equals row bracket 1. 2859 02:13:34,830 --> 02:13:36,970 I don't care about the time stamp for this program. 2860 02:13:36,970 --> 02:13:41,070 And then let's go into my dictionary called houses, plural, index 2861 02:13:41,070 --> 02:13:47,370 into it at the house location, by its name, and increment that 0 to 1. 2862 02:13:47,370 --> 02:13:50,280 And now, at the end of this block of code, 2863 02:13:50,280 --> 02:13:53,040 that has the effect of iterating over every line of the file, 2864 02:13:53,040 --> 02:13:55,470 updating my dictionary in four different places, 2865 02:13:55,470 --> 02:13:59,190 based on whether someone typed Gryffindor or Slytherin or anything 2866 02:13:59,190 --> 02:13:59,700 else. 
2867 02:13:59,700 --> 02:14:03,810 And notice that I'm using the name of the house to index into my dictionary, 2868 02:14:03,810 --> 02:14:07,500 to essentially go up to this little cheat sheet and change the 0 to a 1, 2869 02:14:07,500 --> 02:14:10,020 the 1 to a 2, the 2 to a 3, instead of having 2870 02:14:10,020 --> 02:14:12,000 like four separate variables, which would just 2871 02:14:12,000 --> 02:14:14,070 be much more annoying to maintain. 2872 02:14:14,070 --> 02:14:16,290 Down at the bottom, let's just print out the results. 2873 02:14:16,290 --> 02:14:19,620 For each house in those houses, iterating over 2874 02:14:19,620 --> 02:14:21,750 the keys they're in by default in Python, 2875 02:14:21,750 --> 02:14:24,630 let's go ahead and print out an f-string that says, 2876 02:14:24,630 --> 02:14:29,460 the current house has the current count. 2877 02:14:29,460 --> 02:14:35,070 And count will be the result of indexing into houses, for that given house. 2878 02:14:35,070 --> 02:14:36,810 And let me close my quote. 2879 02:14:36,810 --> 02:14:41,940 So let's run this to summarize the data, Hogwarts.py, 140 of you 2880 02:14:41,940 --> 02:14:46,200 answered Gryffindor, 54 Hufflepuff, 72 Ravenclaw, and 80 of you Slytherin. 2881 02:14:46,200 --> 02:14:48,570 And that's just my now way of code, and this is, oh, 2882 02:14:48,570 --> 02:14:52,227 my God, so much easier than C, to actually analyze data in this way. 2883 02:14:52,227 --> 02:14:55,560 And one of the reasons that Python is so popular for data science and analytics, 2884 02:14:55,560 --> 02:14:59,910 more generally, is that it's actually really easy to manipulate data, and run 2885 02:14:59,910 --> 02:15:00,940 analytics like this. 2886 02:15:00,940 --> 02:15:02,370 And let me clean this up slightly. 2887 02:15:02,370 --> 02:15:05,160 It's a little annoying that I just have to know and trust 2888 02:15:05,160 --> 02:15:10,410 that the house name is in bracket 1 and timestamp is in bracket 0. 
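The counting program walked through above might look roughly like this. It is a sketch that assumes the exported file has a header line followed by rows of timestamp and house; the function names count_houses and report are made up for illustration.

```python
import csv


def count_houses(path):
    """Count how many rows of the CSV at path name each house."""
    # One dictionary with four keys instead of four separate variables.
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}
    with open(path, "r", newline="") as file:
        reader = csv.reader(file)
        # Skip the header line (assumed to be the two column names).
        next(reader)
        for row in reader:
            # Each row is a list: row[0] is the timestamp, row[1] is the house.
            house = row[1]
            houses[house] += 1
    return houses


def report(houses):
    """Print one line per house with its count."""
    # Iterating over a dictionary yields its keys by default.
    for house in houses:
        print(f"{house}: {houses[house]}")
```

A house name that is not one of the four pre-created keys would raise a KeyError here, which is acceptable for a form that only offers those four choices.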
2889 02:15:10,410 --> 02:15:11,440 Let's clean this up. 2890 02:15:11,440 --> 02:15:16,530 There's something called a Dictionary Reader in the CSV library 2891 02:15:16,530 --> 02:15:17,880 that I can use instead. 2892 02:15:17,880 --> 02:15:22,470 Capital D, capital R, this means I can throw away this next thing, 2893 02:15:22,470 --> 02:15:24,900 because what a dictionary reader does is it 2894 02:15:24,900 --> 02:15:28,890 still returns to me every row from the file, one after the other, 2895 02:15:28,890 --> 02:15:32,560 but it doesn't just give me a list of size 2 representing each row. 2896 02:15:32,560 --> 02:15:33,960 It gives me a dictionary. 2897 02:15:33,960 --> 02:15:39,000 And it uses, as the keys in that dictionary, timestamp and house, 2898 02:15:39,000 --> 02:15:41,460 for every row in the file, which is just to say 2899 02:15:41,460 --> 02:15:43,950 it makes my code a little more readable, because instead 2900 02:15:43,950 --> 02:15:46,590 of doing this little trickery, bracket 1, 2901 02:15:46,590 --> 02:15:49,500 I can say quote unquote "Bracket House" with a capital H, 2902 02:15:49,500 --> 02:15:52,360 because it's capitalized in the Google Form itself. 2903 02:15:52,360 --> 02:15:54,798 So the code now is just minorly different, 2904 02:15:54,798 --> 02:15:57,840 but it's way more resilient, especially if I'm using Google Spreadsheets, 2905 02:15:57,840 --> 02:16:00,390 and I'm moving the columns around or doing something like that, 2906 02:16:00,390 --> 02:16:01,973 where the numbers might get messed up. 2907 02:16:01,973 --> 02:16:05,260 Now I can run this on Hogwarts.py again, and I get the same answers. 2908 02:16:05,260 --> 02:16:09,960 But I now don't have to worry about where those individual columns are. 2909 02:16:09,960 --> 02:16:14,880 All right, any questions on those capabilities there. 
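The same count with csv.DictReader, sketched under the assumption that the header line names the house column "House" with a capital H, as stated above. The function name count_houses is again hypothetical.

```python
import csv


def count_houses(path):
    """Count houses by column name rather than by column position."""
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}
    with open(path, "r", newline="") as file:
        # DictReader consumes the header line itself and uses it for the keys,
        # so there is no need to skip the first line by hand.
        reader = csv.DictReader(file)
        for row in reader:
            # Each row is a dictionary keyed by the column names.
            houses[row["House"]] += 1
    return houses
```

Because rows are looked up by name, the count stays the same if the columns are reordered in the spreadsheet before export.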
2910 02:16:14,880 --> 02:16:17,400 And that's a teaser of sorts, for some of the manipulation 2911 02:16:17,400 --> 02:16:19,620 we'll do in P set 6. 2912 02:16:19,620 --> 02:16:23,555 All right, so some final examples and flair, to intrigue 2913 02:16:23,555 --> 02:16:24,930 with what you can do with Python. 2914 02:16:24,930 --> 02:16:28,710 I'm going to actually switch over to a terminal window on my own Mac, 2915 02:16:28,710 --> 02:16:31,900 so that I can actually use audio a little more effectively. 2916 02:16:31,900 --> 02:16:33,930 So here's just a terminal window on Mac OS. 2917 02:16:33,930 --> 02:16:37,950 I before class have preinstalled some additional Python libraries, 2918 02:16:37,950 --> 02:16:40,379 that won't really work in VS Code in the cloud, 2919 02:16:40,379 --> 02:16:43,535 because they require audio that the browser won't necessarily support. 2920 02:16:43,535 --> 02:16:45,660 But I'm going to go ahead and write an example here 2921 02:16:45,660 --> 02:16:49,559 that involves writing a speech-based program, that actually does something 2922 02:16:49,559 --> 02:16:50,212 with speech. 2923 02:16:50,212 --> 02:16:52,170 And I'm going to go ahead and import a library, 2924 02:16:52,170 --> 02:16:55,709 that, again, I pre-installed, called Python text to speech, 2925 02:16:55,709 --> 02:16:58,260 and I'm going to go ahead and, per its documentation, 2926 02:16:58,260 --> 02:17:02,879 give myself a speech engine, by using that library's init function, 2927 02:17:02,879 --> 02:17:04,080 for initialize. 2928 02:17:04,080 --> 02:17:06,930 I'm then going to use this engine's say function 2929 02:17:06,930 --> 02:17:09,180 to do something fun, like Hello, world. 2930 02:17:09,180 --> 02:17:12,480 And then I'm going to go ahead and tell this engine to run and wait, 2931 02:17:12,480 --> 02:17:13,855 while it says those words. 2932 02:17:13,855 --> 02:17:15,480 All right, I'm going to save this file.
2933 02:17:15,480 --> 02:17:16,980 I'm not using VS Code at the moment. 2934 02:17:16,980 --> 02:17:20,070 I'm using another popular program that we used in CS50 back in my day, 2935 02:17:20,070 --> 02:17:22,830 called Vim, which is a command line program that's 2936 02:17:22,830 --> 02:17:24,790 just in this black and white window. 2937 02:17:24,790 --> 02:17:28,849 Let me go ahead now and run Python of Speech.py, and-- 2938 02:17:28,849 --> 02:17:30,745 COMPUTER: Hello, world. 2939 02:17:30,745 --> 02:17:33,120 DAVID J. MALAN: All right, so it's a little computerized, 2940 02:17:33,120 --> 02:17:36,113 but it is speech that has been synthesized from this example. 2941 02:17:36,113 --> 02:17:38,280 Let's change it a little bit to be more interesting. 2942 02:17:38,280 --> 02:17:39,488 Let's do something like this. 2943 02:17:39,488 --> 02:17:43,950 Let's ask the user for their name, like what's your name question mark. 2944 02:17:43,950 --> 02:17:47,850 And then, let's use the little F string, and say, not Hello, world, 2945 02:17:47,850 --> 02:17:50,010 but Hello to that person's name. 2946 02:17:50,010 --> 02:17:54,270 Let me save my file, run Python of Speech.py, Enter. 2947 02:17:54,270 --> 02:17:55,260 David. 2948 02:17:55,260 --> 02:17:57,360 COMPUTER: Hello, David. 2949 02:17:57,360 --> 02:17:59,639 DAVID J. MALAN: All right, so we pronounce my name OK, 2950 02:17:59,639 --> 02:18:02,306 might struggle with different names, depending on the phonetics. 2951 02:18:02,306 --> 02:18:03,570 But that one seemed to be OK. 2952 02:18:03,570 --> 02:18:05,850 Let's do something else with Python, using similarly, 2953 02:18:05,850 --> 02:18:07,780 just a few lines of code. 2954 02:18:07,780 --> 02:18:12,540 Let me go into today's examples. 2955 02:18:12,540 --> 02:18:18,330 And I'm going to go into a folder called Detect, whoops, a folder called 2956 02:18:18,330 --> 02:18:19,680 Faces.py. 2957 02:18:19,680 --> 02:18:20,790 Sorry, Faces. 
2958 02:18:20,790 --> 02:18:23,370 And in this folder, that I've written in advance, 2959 02:18:23,370 --> 02:18:25,879 are a few files, Detect.py, Recognize.py, 2960 02:18:25,879 --> 02:18:30,330 and two full of photos, Office.jpeg and Toby.jpeg. 2961 02:18:30,330 --> 02:18:32,799 If you're familiar with the show, here, for instance, 2962 02:18:32,799 --> 02:18:34,809 is the cast photo from The Office here. 2963 02:18:34,809 --> 02:18:36,299 So here's a photo as input. 2964 02:18:36,299 --> 02:18:38,639 Suppose I want to do something very Facebook-style, 2965 02:18:38,639 --> 02:18:40,860 where I want to analyze all of the faces, 2966 02:18:40,860 --> 02:18:42,870 or detect all of the faces in there. 2967 02:18:42,870 --> 02:18:44,940 Well, let me go ahead and show you a program 2968 02:18:44,940 --> 02:18:47,879 I wrote in advance, that's not terribly long. 2969 02:18:47,879 --> 02:18:49,379 Much of it is actually comments. 2970 02:18:49,379 --> 02:18:50,639 But let's see what I'm doing. 2971 02:18:50,639 --> 02:18:54,000 I'm importing the Pillow library, again, to get access to images. 2972 02:18:54,000 --> 02:18:57,480 I'm importing a library called face recognition, which I downloaded 2973 02:18:57,480 --> 02:18:58,590 and installed in advance. 2974 02:18:58,590 --> 02:19:00,129 But it does what it says. 2975 02:19:00,129 --> 02:19:02,959 According to its documentation, you go into that library 2976 02:19:02,959 --> 02:19:04,760 and you call a function called load image 2977 02:19:04,760 --> 02:19:07,370 file, to load something like Office.jpeg, 2978 02:19:07,370 --> 02:19:10,040 and then you can use the line of code like this. 2979 02:19:10,040 --> 02:19:14,120 Call a function called face locations, passing the images input, 2980 02:19:14,120 --> 02:19:17,120 and you get back a list of all of the faces in the image. 
2981 02:19:17,120 --> 02:19:20,750 And then down here, a for loop, that iterates over all of those 2982 02:19:20,750 --> 02:19:22,040 face locations. 2983 02:19:22,040 --> 02:19:24,799 And inside of this loop, I just do a bit of trickery. 2984 02:19:24,799 --> 02:19:29,580 I figure out the top, right, bottom, and left corners of those locations. 2985 02:19:29,580 --> 02:19:31,940 And then, using these lines of code here, 2986 02:19:31,940 --> 02:19:34,834 I'm using that image library, to just draw a box, essentially. 2987 02:19:34,834 --> 02:19:35,959 And the code looks cryptic. 2988 02:19:35,959 --> 02:19:38,150 Honestly, I would have to look this up to write it again. 2989 02:19:38,150 --> 02:19:40,650 But per the documentation, this just draws a nice little box 2990 02:19:40,650 --> 02:19:41,610 around the image. 2991 02:19:41,610 --> 02:19:48,200 So let me go ahead and zoom out here, and run this now on Office.jpeg. 2992 02:19:48,200 --> 02:19:53,390 All right, it's analyzing, analyzing, and you can see in the sidebar here, 2993 02:19:53,390 --> 02:19:54,380 here's the original. 2994 02:19:54,380 --> 02:19:59,180 And here is every face that my, what, 10 lines of Python code 2995 02:19:59,180 --> 02:20:00,740 found, within that file. 2996 02:20:00,740 --> 02:20:01,410 What's a face? 2997 02:20:01,410 --> 02:20:04,190 Presumably the library is looking for something, 2998 02:20:04,190 --> 02:20:07,100 maybe without a mask, that has two eyes, a nose, and a mouth, 2999 02:20:07,100 --> 02:20:09,420 in some kind of arrangement, some kind of pattern. 3000 02:20:09,420 --> 02:20:12,440 So it would seem pretty reliable, at least on these fairly easy-to-read 3001 02:20:12,440 --> 02:20:13,370 faces here. 3002 02:20:13,370 --> 02:20:15,660 What if we want to look for someone specific, 3003 02:20:15,660 --> 02:20:17,180 for instance, someone that's always getting picked on. 3004 02:20:17,180 --> 02:20:18,763 Well, we could do something like this. 
3005 02:20:18,763 --> 02:20:23,060 Recognize.py, which is taking two files as input, that image and the image 3006 02:20:23,060 --> 02:20:24,620 of one person in particular. 3007 02:20:24,620 --> 02:20:26,900 And if you're trying to find Toby in a crowd, 3008 02:20:26,900 --> 02:20:29,570 here I conflated the program, sorry, this is the version that 3009 02:20:29,570 --> 02:20:31,550 draws a box around the given face. 3010 02:20:31,550 --> 02:20:33,680 Here we have Toby as identified. 3011 02:20:33,680 --> 02:20:34,220 Why? 3012 02:20:34,220 --> 02:20:38,450 Because that program, Recognize.py, has a few more lines of code, 3013 02:20:38,450 --> 02:20:42,800 but long story short, it additionally loads as input Toby.jpeg, 3014 02:20:42,800 --> 02:20:45,410 in order to recognize that specific face. 3015 02:20:45,410 --> 02:20:48,350 And that specific face is a completely different photo, 3016 02:20:48,350 --> 02:20:52,970 but it looks similar enough to the person, that it all worked out OK. 3017 02:20:52,970 --> 02:20:55,820 Let's do one other that's a little sensitive to microphones. 3018 02:20:55,820 --> 02:21:00,650 Let me go into, how about my listen folder here, which is available 3019 02:21:00,650 --> 02:21:01,610 online, too. 3020 02:21:01,610 --> 02:21:04,380 And let's just run Python of Listen0.py. 3021 02:21:04,380 --> 02:21:07,430 I'm going to type in like David. 3022 02:21:07,430 --> 02:21:10,520 Oh, sorry, no, I'm going to-- 3023 02:21:10,520 --> 02:21:11,150 Hello, world. 3024 02:21:11,150 --> 02:21:16,045 3025 02:21:16,045 --> 02:21:17,420 Oh, no, that's the wrong version. 3026 02:21:17,420 --> 02:21:19,250 [CHUCKLES] OK, I looked like an idiot. 3027 02:21:19,250 --> 02:21:21,500 OK, hello, there we go. 3028 02:21:21,500 --> 02:21:22,310 Hello to you, too. 3029 02:21:22,310 --> 02:21:26,300 And if I say goodbye, I'm talking to my laptop like an idiot, OK. 3030 02:21:26,300 --> 02:21:28,590 Now it's detecting what I'm saying here. 
3031 02:21:28,590 --> 02:21:32,130 So this first version of the program is just using some relatively simple, if 3032 02:21:32,130 --> 02:21:36,472 elif elif, and it's just asking for input, forcing it to lowercase. 3033 02:21:36,472 --> 02:21:38,430 And that was my mistake with the first example. 3034 02:21:38,430 --> 02:21:41,360 And then, I'm just checking, is Hello in the user's words? 3035 02:21:41,360 --> 02:21:42,818 Is how are you in the user's words? 3036 02:21:42,818 --> 02:21:44,152 Didn't see that, but it's there. 3037 02:21:44,152 --> 02:21:45,470 Is goodbye in the user's words? 3038 02:21:45,470 --> 02:21:49,280 Now let's do a cooler version, using a library, just by looking at the effect. 3039 02:21:49,280 --> 02:21:51,140 Python of Listen1.py. 3040 02:21:51,140 --> 02:21:55,685 Hello, world. 3041 02:21:55,685 --> 02:21:56,720 Huh. 3042 02:21:56,720 --> 02:22:04,170 Let's do version 2 of this, that uses an audio speech-to-text library. 3043 02:22:04,170 --> 02:22:07,160 Hello, world. 3044 02:22:07,160 --> 02:22:09,710 OK, so now it's artificial intelligence. 3045 02:22:09,710 --> 02:22:11,810 Now let's do something a little more interesting. 3046 02:22:11,810 --> 02:22:15,230 The third version of this program that actually analyzes the words that are 3047 02:22:15,230 --> 02:22:16,880 said. 3048 02:22:16,880 --> 02:22:18,800 Hello, world, my name is David. 3049 02:22:18,800 --> 02:22:19,700 How are you? 3050 02:22:19,700 --> 02:22:22,760 3051 02:22:22,760 --> 02:22:26,000 OK, so that time, it not only analyzed what I said, 3052 02:22:26,000 --> 02:22:27,930 but it plucked my name out of it. 3053 02:22:27,930 --> 02:22:30,480 Let's do two final examples. 3054 02:22:30,480 --> 02:22:33,150 This one will generate a QR code. 3055 02:22:33,150 --> 02:22:35,120 Let me go ahead and write a program called 3056 02:22:35,120 --> 02:22:39,030 QR.py, that very simply does this. 3057 02:22:39,030 --> 02:22:40,820 Let me import a library called OS. 
3058 02:22:40,820 --> 02:22:43,230 Let me import a library called QR code. 3059 02:22:43,230 --> 02:22:48,000 Let me grab an image here, that's QRcode.make. 3060 02:22:48,000 --> 02:22:51,440 And let me give you the URL of like a lecture video on YouTube, or something 3061 02:22:51,440 --> 02:22:55,040 like that, with this ID. 3062 02:22:55,040 --> 02:22:59,840 Let me just type this, so I don't get it wrong. 3063 02:22:59,840 --> 02:23:05,300 OK, so if I now use this URL here, of a video on YouTube, making 3064 02:23:05,300 --> 02:23:07,812 sure I haven't made any typos, I'm now going 3065 02:23:07,812 --> 02:23:09,770 to go ahead and do two lines of code in Python. 3066 02:23:09,770 --> 02:23:13,460 I'm going to first save that as a file called QR.png, which is 3067 02:23:13,460 --> 02:23:15,490 a two dimensional barcode, a QR code. 3068 02:23:15,490 --> 02:23:17,240 And, indeed, I'm going to use this format. 3069 02:23:17,240 --> 02:23:23,790 And I'm going to use the OS.system library to open QR.png automatically. 3070 02:23:23,790 --> 02:23:26,090 And if you'd like to take out your phone at this point, 3071 02:23:26,090 --> 02:23:32,270 you can see the result of my barcode, that's just been dynamically generated. 3072 02:23:32,270 --> 02:23:33,785 Hopefully from afar that will scan. 3073 02:23:33,785 --> 02:23:37,355 3074 02:23:37,355 --> 02:23:40,150 [UPROAR] 3075 02:23:40,150 --> 02:23:42,460 And I think that's an appropriate line to end on. 3076 02:23:42,460 --> 02:23:43,860 So that's it for CS50. 3077 02:23:43,860 --> 02:23:46,020 We will see you next time. 3078 02:23:46,020 --> 02:23:47,820 [APPLAUSE] 3079 02:23:47,820 --> 02:23:51,470 [MUSIC PLAYING]
