0
00:00:00,000 --> 00:00:05,988
[MUSIC PLAYING]
1
00:00:05,988 --> 00:01:17,990
2
00:01:17,990 --> 00:01:21,800
DAVID J. MALAN: All right, this is CS50, and this is already week 6.
3
00:01:21,800 --> 00:01:24,458
And this is the week in which you learn yet another language.
4
00:01:24,458 --> 00:01:26,750
But the goal is not just to teach you another language,
5
00:01:26,750 --> 00:01:29,480
for language's sake, as we transition today
6
00:01:29,480 --> 00:01:32,780
and in the coming weeks from C, where we've spent the past several weeks, now
7
00:01:32,780 --> 00:01:33,440
to Python.
8
00:01:33,440 --> 00:01:37,530
The goal ultimately is to teach you all how to teach yourselves new languages,
9
00:01:37,530 --> 00:01:40,020
so that by the end of this course, it's not in your mind,
10
00:01:40,020 --> 00:01:42,710
the fact that you learned how to program in C
11
00:01:42,710 --> 00:01:44,960
or learned some weeks back how to program in Scratch,
12
00:01:44,960 --> 00:01:48,170
but really how you learned how to program fundamentally,
13
00:01:48,170 --> 00:01:50,630
in a paradigm known as procedural programming,
14
00:01:50,630 --> 00:01:53,450
as well as with some taste today, and in the weeks to come,
15
00:01:53,450 --> 00:01:55,310
of other aspects of programming languages,
16
00:01:55,310 --> 00:01:58,010
like object-oriented programming, and more.
17
00:01:58,010 --> 00:02:00,180
So recall, though, back in week zero, Hello, world
18
00:02:00,180 --> 00:02:01,680
looked a little something like this.
19
00:02:01,680 --> 00:02:03,387
And the world was quite simple.
20
00:02:03,387 --> 00:02:05,720
All you had to do was drag and drop these puzzle pieces.
21
00:02:05,720 --> 00:02:08,960
But there were still functions and conditionals and loops and variables
22
00:02:08,960 --> 00:02:11,030
and all of those kinds of primitives.
23
00:02:11,030 --> 00:02:14,300
We then transitioned, of course, to a much more arcane language that
24
00:02:14,300 --> 00:02:15,840
looked a little something like this.
25
00:02:15,840 --> 00:02:17,798
And even now, some weeks later, you might still
26
00:02:17,798 --> 00:02:20,470
be struggling with some of the syntax or getting annoying bugs
27
00:02:20,470 --> 00:02:22,970
when you try to compile your code, and it just doesn't work.
28
00:02:22,970 --> 00:02:24,800
But there, too, the past few weeks, we've
29
00:02:24,800 --> 00:02:28,130
been focusing on functions and loops and variables, conditionals, and really
30
00:02:28,130 --> 00:02:29,550
all of those same ideas.
31
00:02:29,550 --> 00:02:33,710
And so what we begin to do today is to, one, simplify the language
32
00:02:33,710 --> 00:02:38,840
we're using, transitioning from C now to Python, this now being the equivalent
33
00:02:38,840 --> 00:02:42,200
program in Python, and look at its relative simplicity,
34
00:02:42,200 --> 00:02:43,940
but also transitioning to look at how you
35
00:02:43,940 --> 00:02:45,800
can implement these same kinds of features,
36
00:02:45,800 --> 00:02:47,430
just using a different language.
37
00:02:47,430 --> 00:02:49,250
So we're going to see a lot of code today.
38
00:02:49,250 --> 00:02:53,150
And you won't have nearly as much practice with Python as you did with C.
39
00:02:53,150 --> 00:02:56,210
But that's because so many of the ideas are still going to be with us.
40
00:02:56,210 --> 00:02:58,580
And, really, it's going to be a process of figuring out, all right,
41
00:02:58,580 --> 00:02:59,413
I want to do a loop.
42
00:02:59,413 --> 00:03:01,760
I know how to do it in C. How do I do this in Python?
43
00:03:01,760 --> 00:03:02,990
How do I do the same with conditionals?
44
00:03:02,990 --> 00:03:04,710
How do I declare variables, and the like,
45
00:03:04,710 --> 00:03:07,460
and moving forward, not just in CS50, but in life in general,
46
00:03:07,460 --> 00:03:10,760
if you continue programming and learn some other language after the class,
47
00:03:10,760 --> 00:03:14,270
if in 5-10 years, there's a new, more popular language that you pick up,
48
00:03:14,270 --> 00:03:16,520
it's just going to be a matter of googling and looking
49
00:03:16,520 --> 00:03:18,410
at websites like Stack Overflow and the like,
50
00:03:18,410 --> 00:03:21,350
to look at just basic building blocks of programming languages,
51
00:03:21,350 --> 00:03:24,680
because you already speak, after these past 6 plus weeks,
52
00:03:24,680 --> 00:03:27,500
you already speak programming itself fundamentally.
53
00:03:27,500 --> 00:03:31,070
All right, so let's do a few quick comparisons, left and right, of what
54
00:03:31,070 --> 00:03:32,960
something might have looked like in Scratch,
55
00:03:32,960 --> 00:03:34,820
and what it then looked like in C, but now,
56
00:03:34,820 --> 00:03:36,770
as of today, what it's going to look like in Python.
57
00:03:36,770 --> 00:03:38,853
Then we'll turn our attention to the command line,
58
00:03:38,853 --> 00:03:42,510
ultimately, in order to implement some actual programs.
59
00:03:42,510 --> 00:03:45,740
So in Scratch, we had functions like this, say Hello,
60
00:03:45,740 --> 00:03:47,270
world, a verb or an action.
61
00:03:47,270 --> 00:03:49,740
In C it looked a little something like this,
62
00:03:49,740 --> 00:03:53,150
and a bit of a cryptic mess the first week, you had the printf,
63
00:03:53,150 --> 00:03:54,290
you had the double quotes.
64
00:03:54,290 --> 00:03:55,980
You had the semicolon, the parentheses.
65
00:03:55,980 --> 00:03:58,423
So there's a lot more syntax just to do the same thing.
66
00:03:58,423 --> 00:04:01,340
We're not going to get rid of all of that syntax now, but as of today,
67
00:04:01,340 --> 00:04:05,580
in Python, that same statement is going to look a little something like this.
68
00:04:05,580 --> 00:04:07,640
And just to perhaps call out the obvious, what
69
00:04:07,640 --> 00:04:12,050
is different or, now, simpler in Python versus C, even
70
00:04:12,050 --> 00:04:13,640
in this simple example here?
71
00:04:13,640 --> 00:04:14,545
Yeah.
72
00:04:14,545 --> 00:04:17,420
AUDIENCE: Now print, instead of printf would be, something like that.
73
00:04:17,420 --> 00:04:19,837
DAVID J. MALAN: Good, so it's now print instead of printf.
74
00:04:19,837 --> 00:04:21,110
And there's also no semicolon.
75
00:04:21,110 --> 00:04:23,103
And there's one other subtlety, over here.
76
00:04:23,103 --> 00:04:24,020
AUDIENCE: No new line.
77
00:04:24,020 --> 00:04:25,640
DAVID J. MALAN: Yeah, so no new line, and that
78
00:04:25,640 --> 00:04:27,110
doesn't mean it's not going to be printed.
79
00:04:27,110 --> 00:04:29,402
It just turns out that one of the differences we'll see
80
00:04:29,402 --> 00:04:31,640
is that, with print, you get the new line for free.
81
00:04:31,640 --> 00:04:34,950
It automatically gets outputted by default, being sort of a common case.
82
00:04:34,950 --> 00:04:37,190
But you can override it, we'll see, ultimately, too.
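A minimal sketch of the newline behavior just described. Writing into an in-memory buffer is an assumption made here only so the exact characters can be inspected; ordinarily you would call print without the file argument.

```python
import io

# In-memory stand-in for the screen, so the exact output can be inspected.
buffer = io.StringIO()

print("hello, world", file=buffer)      # newline appended automatically
print("hello, ", end="", file=buffer)   # end="" overrides the default newline
print("world", file=buffer)

# repr() makes the newline characters visible as \n
print(repr(buffer.getvalue()))
```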
83
00:04:37,190 --> 00:04:38,300
How about in Scratch?
84
00:04:38,300 --> 00:04:42,082
We had multiple functions like this, that not only said something
85
00:04:42,082 --> 00:04:43,790
on the screen, but also asked a question,
86
00:04:43,790 --> 00:04:47,300
thereby being another function that returned a value, called answer.
87
00:04:47,300 --> 00:04:49,730
In C we saw code that looked a little something
88
00:04:49,730 --> 00:04:53,420
like this, whereby that first line declares a variable called answer,
89
00:04:53,420 --> 00:04:55,790
sets it equal to the return value of get_string,
90
00:04:55,790 --> 00:04:57,740
one of the functions from the CS50 library,
91
00:04:57,740 --> 00:05:00,980
and then the same double quotes and parentheses and semicolon.
92
00:05:00,980 --> 00:05:05,390
Then we had this format code in C that allowed us, with %s,
93
00:05:05,390 --> 00:05:07,760
to actually print out that same value.
94
00:05:07,760 --> 00:05:10,400
In Python, this, too, is going to look a little bit simpler.
95
00:05:10,400 --> 00:05:13,460
Instead, we're going to have answer equals get_string,
96
00:05:13,460 --> 00:05:16,070
quote unquote "What's your name," and then print,
97
00:05:16,070 --> 00:05:18,870
with a plus sign and a little bit of new syntax.
98
00:05:18,870 --> 00:05:21,650
But let's see if we can't just infer from this example what
99
00:05:21,650 --> 00:05:22,860
it is that's going on.
100
00:05:22,860 --> 00:05:25,670
Well, first missing on the left is what?
101
00:05:25,670 --> 00:05:28,620
To the left of the equal sign, there's no what this time?
102
00:05:28,620 --> 00:05:29,870
Feel free to just call it out.
103
00:05:29,870 --> 00:05:30,690
AUDIENCE: Type.
104
00:05:30,690 --> 00:05:31,460
DAVID J. MALAN: So there's no type.
105
00:05:31,460 --> 00:05:33,770
There's no type, like the word string, which
106
00:05:33,770 --> 00:05:38,090
even though that was a type in CS50, every other variable in C
107
00:05:38,090 --> 00:05:41,437
did we use int or string or float, or bool or something else.
108
00:05:41,437 --> 00:05:43,520
In Python, there are still going to be data types,
109
00:05:43,520 --> 00:05:45,980
today onward, but you, the programmer, don't
110
00:05:45,980 --> 00:05:49,042
have to bother telling the computer what types you're using.
111
00:05:49,042 --> 00:05:50,750
The computer is going to be smart enough,
112
00:05:50,750 --> 00:05:53,240
the language, really, is going to be smart enough, to just figure it out
113
00:05:53,240 --> 00:05:54,260
from context.
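As a rough illustration of types being inferred from context, the sketch below assigns two values without declaring types and then asks Python what it decided. The variable names and values are hypothetical.

```python
counter = 0        # no "int" keyword needed
answer = "David"   # no "string" keyword needed

# The values still have types; Python worked them out from the right-hand side.
print(type(counter))
print(type(answer))
```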
114
00:05:54,260 --> 00:05:56,150
Meanwhile, on the right hand side, get_string
115
00:05:56,150 --> 00:05:57,858
is going to be a function we'll use today
116
00:05:57,858 --> 00:06:01,320
and this week, which comes from a Python version of the CS50 library.
117
00:06:01,320 --> 00:06:04,370
But we'll also start to take off those training wheels, so that you'll
118
00:06:04,370 --> 00:06:07,670
see how to do things without any CS50 library moving forward,
119
00:06:07,670 --> 00:06:09,290
using a different function instead.
120
00:06:09,290 --> 00:06:12,920
As before, no semicolon, but the rest of the syntax is pretty much the same
121
00:06:12,920 --> 00:06:13,430
here.
122
00:06:13,430 --> 00:06:16,013
This starts, of course, to get a little bit different, though.
123
00:06:16,013 --> 00:06:17,650
We're using print instead of printf.
124
00:06:17,650 --> 00:06:20,860
But now, even though this looks a little cryptic,
125
00:06:20,860 --> 00:06:23,110
perhaps, if you've never programmed before CS50,
126
00:06:23,110 --> 00:06:27,130
what might that plus be doing, just based on inference here.
127
00:06:27,130 --> 00:06:27,880
What do you think?
128
00:06:27,880 --> 00:06:31,720
AUDIENCE: Adding answer to the string Hello.
129
00:06:31,720 --> 00:06:34,990
DAVID J. MALAN: Yeah, so adding answer to the string Hello,
130
00:06:34,990 --> 00:06:37,030
and adding, so to speak, not mathematically,
131
00:06:37,030 --> 00:06:39,580
but in the form of joining them together, much like we
132
00:06:39,580 --> 00:06:43,040
saw the join block in Scratch, or concatenation was the term of art
133
00:06:43,040 --> 00:06:43,540
there.
134
00:06:43,540 --> 00:06:46,810
This plus sign appends, if you will, whatever's
135
00:06:46,810 --> 00:06:48,625
in answer to whatever is quoted here.
136
00:06:48,625 --> 00:06:51,250
And I deliberately left a space there, so that grammatically it
137
00:06:51,250 --> 00:06:53,422
looks nice, after the comma as well.
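A small sketch of the plus-sign concatenation described above. The name is hard-coded as a stand-in for what the user would type, so that the example runs without any input.

```python
# Hypothetical stand-in for the user's input.
answer = "David"

# + joins (concatenates) the two strings; note the space after the comma.
greeting = "hello, " + answer
print(greeting)
```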
138
00:06:53,422 --> 00:06:54,880
Now there's another way to do this.
139
00:06:54,880 --> 00:06:57,130
And it, too, is going to look cryptic at first glance.
140
00:06:57,130 --> 00:06:59,510
But it just gets easier and more convenient over time.
141
00:06:59,510 --> 00:07:04,580
You can also change this second line to be this, instead.
142
00:07:04,580 --> 00:07:05,770
So what's going on here.
143
00:07:05,770 --> 00:07:08,710
This is actually a relatively new feature of Python in the past couple
144
00:07:08,710 --> 00:07:11,020
of years, where now what you're seeing is, yes,
145
00:07:11,020 --> 00:07:13,580
a string, between these same double quotes,
146
00:07:13,580 --> 00:07:17,075
but this is what Python would call a format string, or f-string.
147
00:07:17,075 --> 00:07:20,200
And it literally starts with the letter f, which admittedly looks, I think,
148
00:07:20,200 --> 00:07:20,980
a little weird.
149
00:07:20,980 --> 00:07:24,700
But that just indicates that Python should
150
00:07:24,700 --> 00:07:29,110
assume that anything inside of curly braces inside of the string
151
00:07:29,110 --> 00:07:32,560
should be interpolated, so to speak, which is a fancy term saying,
152
00:07:32,560 --> 00:07:36,160
substitute the value of any variables therein.
153
00:07:36,160 --> 00:07:38,030
And it can do some other things as well.
154
00:07:38,030 --> 00:07:42,040
So answer is a variable, declared, of course, on this first line.
155
00:07:42,040 --> 00:07:46,300
This f-string, then, says to Python, print out Hello comma space, and then
156
00:07:46,300 --> 00:07:47,950
the value of answer.
157
00:07:47,950 --> 00:07:52,390
If, by contrast, you omitted the curly braces,
158
00:07:52,390 --> 00:07:54,040
just take a guess, what would happen?
159
00:07:54,040 --> 00:07:56,920
What would the symptom of that bug be, if you accidentally
160
00:07:56,920 --> 00:08:00,010
forgot the curly braces, but maybe still had the f there?
161
00:08:00,010 --> 00:08:01,750
AUDIENCE: It would print below it, too.
162
00:08:01,750 --> 00:08:04,300
DAVID J. MALAN: Yeah, it would literally print Hello, comma answer, because it's
163
00:08:04,300 --> 00:08:05,200
going to take you literally.
164
00:08:05,200 --> 00:08:07,690
So the curly braces just kind of allow you to plug things in.
165
00:08:07,690 --> 00:08:09,350
And, again, it looks a little more cryptic,
166
00:08:09,350 --> 00:08:11,267
but it's just going to save us time over time.
167
00:08:11,267 --> 00:08:14,120
And if any of you programmed in Java in high school, for instance,
168
00:08:14,120 --> 00:08:16,630
you saw plus in that context, too, for concatenation.
169
00:08:16,630 --> 00:08:19,755
This just kind of makes your code a little tighter, a little more succinct.
170
00:08:19,755 --> 00:08:21,730
So it's a convenient feature now in Python.
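The following sketch contrasts an f-string with and without the curly braces, to show the bug discussed above. The value of answer is a hypothetical stand-in for user input.

```python
answer = "David"  # hypothetical stand-in for user input

with_braces = f"hello, {answer}"    # {answer} is replaced by the variable's value
without_braces = f"hello, answer"   # no braces, so the word "answer" is taken literally

print(with_braces)
print(without_braces)
```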
171
00:08:21,730 --> 00:08:24,190
All right, this was an example in Scratch of a variable,
172
00:08:24,190 --> 00:08:26,740
setting a variable like counter equal to 0.
173
00:08:26,740 --> 00:08:30,460
In C it looked like this, where you specify the type, the name,
174
00:08:30,460 --> 00:08:32,230
and then the value, with a semicolon.
175
00:08:32,230 --> 00:08:35,096
In Python, it's going to look like this.
176
00:08:35,096 --> 00:08:36,429
And I'll state the obvious here.
177
00:08:36,429 --> 00:08:39,340
You don't need to mention the type, just like before with string.
178
00:08:39,340 --> 00:08:41,030
And you don't need a semicolon.
179
00:08:41,030 --> 00:08:42,130
So it's a little simpler.
180
00:08:42,130 --> 00:08:45,005
If you want a variable, just write it and set it equal to some value.
181
00:08:45,005 --> 00:08:48,070
But the single equal sign still behaves the same as in C.
182
00:08:48,070 --> 00:08:50,440
Suppose we wanted to increment counter by one.
183
00:08:50,440 --> 00:08:52,750
In Scratch, we use this puzzle piece here.
184
00:08:52,750 --> 00:08:55,250
In C, we could do this, actually, in a few different ways.
185
00:08:55,250 --> 00:08:57,400
There was this way, if counter already exists,
186
00:08:57,400 --> 00:08:59,980
you just say counter equals counter plus 1.
187
00:08:59,980 --> 00:09:04,840
There was the slightly less verbose way, where you could say, oops, sorry.
188
00:09:04,840 --> 00:09:06,400
Let me do the first sentence first.
189
00:09:06,400 --> 00:09:08,690
In Python, that same thing, as you might guess,
190
00:09:08,690 --> 00:09:12,160
is actually going to be almost the same, you just throw away the semicolon.
191
00:09:12,160 --> 00:09:15,370
And the mathematics are ultimately the same, copying from right to left,
192
00:09:15,370 --> 00:09:17,290
via the assignment operator.
193
00:09:17,290 --> 00:09:19,570
Now, recall, in C, that we had this shorthand
194
00:09:19,570 --> 00:09:22,000
notation, which did the same thing.
195
00:09:22,000 --> 00:09:26,980
In Python, you can similarly do the same thing, just no need for the semicolon.
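For reference, a short sketch of the two ways of incrementing mentioned so far, written in Python.

```python
counter = 0

counter = counter + 1   # the longer form: the right side is evaluated, then assigned
counter += 1            # the shorthand form, same effect

print(counter)
```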
196
00:09:26,980 --> 00:09:29,290
The only step backwards we're taking, if you
197
00:09:29,290 --> 00:09:33,790
were a big fan of counter plus plus, that doesn't exist in Python,
198
00:09:33,790 --> 00:09:34,625
nor minus minus.
199
00:09:34,625 --> 00:09:35,500
You just can't do it.
200
00:09:35,500 --> 00:09:40,210
You have to do the plus equals 1 or minus equals 1
201
00:09:40,210 --> 00:09:43,720
to achieve that same result. All right, how about in Python, too?
202
00:09:43,720 --> 00:09:46,360
Here in Scratch, recall, was a conditional,
203
00:09:46,360 --> 00:09:49,990
asking a silly question like is x less than y, and if so, just say as much.
204
00:09:49,990 --> 00:09:53,980
In C, that looked a little something like this, printf and if
205
00:09:53,980 --> 00:09:57,310
with the parentheses, the curly braces, the semicolon, and all of that.
206
00:09:57,310 --> 00:10:00,610
In Python, this is going to get a little more pleasant to type, too.
207
00:10:00,610 --> 00:10:03,320
It's going to be just this.
208
00:10:03,320 --> 00:10:06,460
And if someone wants to call out some of the obvious changes here,
209
00:10:06,460 --> 00:10:10,365
what has been simplified now in Python for a conditional, it would seem?
210
00:10:10,365 --> 00:10:11,740
Yeah, what's missing, or changed?
211
00:10:11,740 --> 00:10:12,350
AUDIENCE: Braces.
212
00:10:12,350 --> 00:10:13,405
DAVID J. MALAN: So no curly braces.
213
00:10:13,405 --> 00:10:14,740
AUDIENCE: Colon is back.
214
00:10:14,740 --> 00:10:15,370
DAVID J. MALAN: I'm sorry?
215
00:10:15,370 --> 00:10:16,510
AUDIENCE: Using the colon instead.
216
00:10:16,510 --> 00:10:18,593
DAVID J. MALAN: And we're using the colon instead.
217
00:10:18,593 --> 00:10:20,620
So I got rid of the curly braces in Python.
218
00:10:20,620 --> 00:10:22,193
But I'm using a colon instead.
219
00:10:22,193 --> 00:10:24,110
And even though this is a single line of code,
220
00:10:24,110 --> 00:10:28,450
so long as you indent subsequent lines along with the print,
221
00:10:28,450 --> 00:10:32,830
that's going to imply that everything, if the if condition is true,
222
00:10:32,830 --> 00:10:36,970
should be executed below it, until you start to un-indent and start writing
223
00:10:36,970 --> 00:10:38,470
a different line of code altogether.
224
00:10:38,470 --> 00:10:41,000
So indentation in Python is important.
225
00:10:41,000 --> 00:10:45,100
So this is among the reasons we've emphasized axes like style,
226
00:10:45,100 --> 00:10:46,840
just how well styled your code is.
227
00:10:46,840 --> 00:10:49,360
And honestly, we've seen, certainly, in office hours,
228
00:10:49,360 --> 00:10:52,000
and you've seen in your own code, sort of a tendency sometimes
229
00:10:52,000 --> 00:10:55,030
to be a little lax when it comes to indentation, right?
230
00:10:55,030 --> 00:10:57,670
If you're one of those folks who likes to indent everything
231
00:10:57,670 --> 00:11:01,210
on the left hand side of the window, yeah, it might compile and run.
232
00:11:01,210 --> 00:11:04,870
But it's not particularly readable by you or anyone else.
233
00:11:04,870 --> 00:11:08,590
Python actually addresses this by just requiring indentation,
234
00:11:08,590 --> 00:11:09,790
when logically needed.
235
00:11:09,790 --> 00:11:14,050
So Python is going to force you to start indenting properly now, if that's been,
236
00:11:14,050 --> 00:11:16,680
perhaps, a tendency otherwise.
237
00:11:16,680 --> 00:11:17,620
What else is missing?
238
00:11:17,620 --> 00:11:19,050
Well, we have no semicolon here.
239
00:11:19,050 --> 00:11:21,150
Of course, it's print instead of printf.
240
00:11:21,150 --> 00:11:23,820
But otherwise, those seem to be the primary differences.
241
00:11:23,820 --> 00:11:25,680
What about something larger in Scratch?
242
00:11:25,680 --> 00:11:28,812
With an if-else block, like this, you can perhaps
243
00:11:28,812 --> 00:11:30,270
guess what it's going to look like.
244
00:11:30,270 --> 00:11:33,540
In C it looks like this, curly braces, semicolons, and so forth.
245
00:11:33,540 --> 00:11:37,530
In Python, it's going to now look like this, almost the same,
246
00:11:37,530 --> 00:11:38,820
but indentation is important.
247
00:11:38,820 --> 00:11:39,960
The colons are important.
248
00:11:39,960 --> 00:11:42,810
And there's one other difference that's now again visible here,
249
00:11:42,810 --> 00:11:44,670
but we didn't call it out a second ago.
250
00:11:44,670 --> 00:11:47,760
What else is different in Python versus C for these conditionals?
251
00:11:47,760 --> 00:11:48,471
Yeah.
252
00:11:48,471 --> 00:11:51,120
AUDIENCE: You don't have any parentheses around the condition.
253
00:11:51,120 --> 00:11:51,700
DAVID J. MALAN: Perfect.
254
00:11:51,700 --> 00:11:54,090
We don't have any parentheses around the condition,
255
00:11:54,090 --> 00:11:55,710
the Boolean expression itself.
256
00:11:55,710 --> 00:11:56,567
And why not?
257
00:11:56,567 --> 00:11:57,900
Well, it's just simpler to type.
258
00:11:57,900 --> 00:11:58,950
It's less to type.
259
00:11:58,950 --> 00:12:00,450
You can still use parentheses.
260
00:12:00,450 --> 00:12:02,550
And, in fact, you might want to or need to,
261
00:12:02,550 --> 00:12:07,470
if you want to combine thoughts and do this and that, or this or that.
262
00:12:07,470 --> 00:12:10,920
But by default, you no longer need or should have those parentheses.
263
00:12:10,920 --> 00:12:12,150
Just say what you mean.
264
00:12:12,150 --> 00:12:14,440
Lastly, with conditionals, we had something like this,
265
00:12:14,440 --> 00:12:16,770
an if else if else statement.
266
00:12:16,770 --> 00:12:18,840
In C, it looked a little something like this.
267
00:12:18,840 --> 00:12:20,880
In Python, it's going to get really tighter now.
268
00:12:20,880 --> 00:12:25,830
It's just if, and this is the curiosity, elif x greater than y.
269
00:12:25,830 --> 00:12:31,110
So it's not else if, it's literally one keyword, elif, and the colons
270
00:12:31,110 --> 00:12:33,315
remain now on each of the three lines.
271
00:12:33,315 --> 00:12:34,690
But the indentation is important.
272
00:12:34,690 --> 00:12:36,480
And if we did want to do multiple things,
273
00:12:36,480 --> 00:12:40,238
we could just indent below each of these conditionals, as well.
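A sketch of the if, elif, else form described above. Wrapping it in a function named compare is an assumption made here so that the three branches can be exercised with different values.

```python
def compare(x, y):
    # No parentheses around the condition, a colon at the end, and
    # indentation (rather than curly braces) marks each branch.
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(1, 1))
```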
274
00:12:40,238 --> 00:12:42,030
All right, let me pause there first, to see
275
00:12:42,030 --> 00:12:44,490
if there's any questions on these syntactic differences.
276
00:12:44,490 --> 00:12:45,247
Yeah.
277
00:12:45,247 --> 00:12:47,532
AUDIENCE: My thought is maybe like, it's good,
278
00:12:47,532 --> 00:12:51,160
though, does it matter if there's this in between thing like that, but
279
00:12:51,160 --> 00:12:52,170
and why.
280
00:12:52,170 --> 00:12:55,050
DAVID J. MALAN: In between, between what and what?
281
00:12:55,050 --> 00:12:58,420
AUDIENCE: So like the left-hand side and like the right side spaces?
282
00:12:58,420 --> 00:13:01,830
DAVID J. MALAN: Ah, good question, is Python sensitive
283
00:13:01,830 --> 00:13:03,750
to spaces and where they go?
284
00:13:03,750 --> 00:13:06,390
Sometimes no, sometimes yes, is the short answer.
285
00:13:06,390 --> 00:13:10,080
Stylistically, though, you should be practicing what we're preaching here,
286
00:13:10,080 --> 00:13:14,265
whereby you do have spaces to the left and right of binary operators,
287
00:13:14,265 --> 00:13:16,140
as they're called. Something like less than
288
00:13:16,140 --> 00:13:18,348
or greater than is a binary operator, because there's
289
00:13:18,348 --> 00:13:20,580
two operands to the left and to the right of them.
290
00:13:20,580 --> 00:13:23,640
And in fact, in Python, more so than the world of C,
291
00:13:23,640 --> 00:13:26,340
there's actually formal style conventions.
292
00:13:26,340 --> 00:13:30,687
Not only within CS50 have we had a style guide on the course's website,
293
00:13:30,687 --> 00:13:34,020
for instance, that just dictates how you should write your code so that it looks
294
00:13:34,020 --> 00:13:34,945
like everyone else's.
295
00:13:34,945 --> 00:13:37,320
In the Python community, they take this one step further,
296
00:13:37,320 --> 00:13:41,260
and there's an actual standard whereby you don't have to adhere to it,
297
00:13:41,260 --> 00:13:44,310
but generally speaking, in the real world, someone would reprimand you,
298
00:13:44,310 --> 00:13:47,100
would reject your code, if you're trying to contribute it to another project,
299
00:13:47,100 --> 00:13:48,730
if you don't adhere to these standards.
300
00:13:48,730 --> 00:13:51,690
So while you could be lax with some of this white space,
301
00:13:51,690 --> 00:13:52,860
do make things readable.
302
00:13:52,860 --> 00:13:56,775
And that's Python's theme, for the code to be as readable as possible.
303
00:13:56,775 --> 00:13:59,400
All right, so let's take a look at a couple of other constructs
304
00:13:59,400 --> 00:14:01,360
before transitioning to some actual code.
305
00:14:01,360 --> 00:14:04,110
This, of course, in Scratch was a loop, meowing forever.
306
00:14:04,110 --> 00:14:08,340
In C, the closest we could get was doing something while true, because true
307
00:14:08,340 --> 00:14:09,100
never changes.
308
00:14:09,100 --> 00:14:12,060
So it's sort of a simple way of just saying do this forever.
309
00:14:12,060 --> 00:14:14,940
In Python, it's pretty much the same thing,
310
00:14:14,940 --> 00:14:16,740
but a couple of small differences here.
311
00:14:16,740 --> 00:14:18,600
The parentheses are gone.
312
00:14:18,600 --> 00:14:19,598
The colon is there.
313
00:14:19,598 --> 00:14:20,640
The indentation is there.
314
00:14:20,640 --> 00:14:24,263
No semicolon, and there's one other subtle difference.
315
00:14:24,263 --> 00:14:24,930
What do you see?
316
00:14:24,930 --> 00:14:25,920
AUDIENCE: True is capitalized?
317
00:14:25,920 --> 00:14:28,003
DAVID J. MALAN: True is capitalized, just because.
318
00:14:28,003 --> 00:14:30,570
Both True and False are Boolean values in Python.
319
00:14:30,570 --> 00:14:33,150
But you've got to start capitalizing them, just because.
320
00:14:33,150 --> 00:14:35,040
All right, how about a loop like this, where
321
00:14:35,040 --> 00:14:38,460
you repeat something a finite number of times, like meowing three times.
322
00:14:38,460 --> 00:14:41,050
In C, we could do this a few different ways.
323
00:14:41,050 --> 00:14:44,790
There's this very mechanical way, where you initialize a variable like i
324
00:14:44,790 --> 00:14:45,570
to zero.
325
00:14:45,570 --> 00:14:49,350
You then use a while loop and check if i is less than 3,
326
00:14:49,350 --> 00:14:51,187
the total number of times you want to meow.
327
00:14:51,187 --> 00:14:52,770
Then you print what you want to print.
328
00:14:52,770 --> 00:14:56,370
You increment i using this syntax, or the longer, more verbose syntax,
329
00:14:56,370 --> 00:14:57,880
with plus equals or whatnot.
330
00:14:57,880 --> 00:15:00,210
And then you do it again and again and again.
331
00:15:00,210 --> 00:15:04,170
In Python, you can do it functionally the same way, same idea,
332
00:15:04,170 --> 00:15:05,580
slightly different syntax.
333
00:15:05,580 --> 00:15:08,190
You just don't bother saying what type of variable you want.
334
00:15:08,190 --> 00:15:11,038
Python will infer from the fact that there's a 0 right there.
335
00:15:11,038 --> 00:15:12,330
You don't need the parentheses.
336
00:15:12,330 --> 00:15:13,260
You do need the colon.
337
00:15:13,260 --> 00:15:14,760
You do need the indentation.
338
00:15:14,760 --> 00:15:17,910
You can't do the i plus plus, but you can do this other technique,
339
00:15:17,910 --> 00:15:20,100
as we could have done in C, as well.
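The while loop just described might look like the sketch below. The forever loop would be written the same way with `while True:` as its first line, which is not run here because it never ends.

```python
i = 0              # no type needed; Python infers an integer from the 0

while i < 3:       # no parentheses, but the colon is required
    print("meow")  # the indentation marks the body of the loop
    i += 1         # i++ does not exist in Python
```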
340
00:15:20,100 --> 00:15:22,320
How else might we do this, though, too?
341
00:15:22,320 --> 00:15:24,540
Well, it turns out in C, we could do something
342
00:15:24,540 --> 00:15:28,230
like this, which, again, sort of cryptic at first glance,
343
00:15:28,230 --> 00:15:31,170
became perhaps more familiar, where you have initialization,
344
00:15:31,170 --> 00:15:34,920
a conditional, and then an update that you do after each iteration.
345
00:15:34,920 --> 00:15:37,950
In Python, there isn't really an analog.
346
00:15:37,950 --> 00:15:40,500
There is no analog in Python, where you have
347
00:15:40,500 --> 00:15:43,380
the parentheses and the multiple semicolons in the same line.
348
00:15:43,380 --> 00:15:47,010
Instead, there is a for loop, but it's meant to read a little more
349
00:15:47,010 --> 00:15:50,550
like English, for i in 0, 1, and 2.
350
00:15:50,550 --> 00:15:54,780
So we'll see in a bit, these square brackets represent an array, now
351
00:15:54,780 --> 00:15:57,090
to be called a list in Python.
352
00:15:57,090 --> 00:16:01,290
So lists in Python are more like linked lists than they are arrays.
353
00:16:01,290 --> 00:16:02,380
More on that soon.
354
00:16:02,380 --> 00:16:06,210
So this just means for i in the following list of three values.
355
00:16:06,210 --> 00:16:09,820
And on each iteration of this loop, Python automatically, for you,
356
00:16:09,820 --> 00:16:11,250
it first sets i to zero.
357
00:16:11,250 --> 00:16:12,840
Then it sets i to one.
358
00:16:12,840 --> 00:16:17,880
Then it sets i to two, so that you effectively do things three times.
359
00:16:17,880 --> 00:16:21,450
But this doesn't necessarily scale, as I've drawn it on the board.
360
00:16:21,450 --> 00:16:25,140
Suppose you took this at face value as the way
361
00:16:25,140 --> 00:16:28,980
you iterate some number of times in Python, using a for loop.
362
00:16:28,980 --> 00:16:33,482
At what point does this approach perhaps get bad, or bad design?
363
00:16:33,482 --> 00:16:35,190
Let me give folks just a moment to think.
364
00:16:35,190 --> 00:16:36,415
Yeah, in back.
365
00:16:36,415 --> 00:16:39,082
AUDIENCE: If you don't know how many times, last time, you know,
366
00:16:39,082 --> 00:16:41,083
you've got the link in there.
367
00:16:41,083 --> 00:16:43,500
DAVID J. MALAN: Sure, if you don't know how many times you
368
00:16:43,500 --> 00:16:47,460
want to loop or iterate, you can't really create a hard-coded list
369
00:16:47,460 --> 00:16:48,750
like that, of 0, 1, 2.
370
00:16:48,750 --> 00:16:50,323
Other thoughts?
371
00:16:50,323 --> 00:16:52,990
AUDIENCE: Say you want to iterate a large number of times.
372
00:16:52,990 --> 00:16:55,740
DAVID J. MALAN: Yeah, if you're iterating a large number of times,
373
00:16:55,740 --> 00:16:57,640
this list is going to get longer and longer,
374
00:16:57,640 --> 00:16:59,932
and you're just kind of stupidly going to be typing out
375
00:16:59,932 --> 00:17:03,660
like comma 3, comma 4, comma 5, comma dot dot dot, comma 99, comma 100.
376
00:17:03,660 --> 00:17:06,160
I mean, your code would start to look atrocious, eventually.
377
00:17:06,160 --> 00:17:07,510
So there is a better way.
378
00:17:07,510 --> 00:17:10,359
In Python, there is a function, or technically a type,
379
00:17:10,359 --> 00:17:14,530
called range, that essentially magically gives you back a range of values
380
00:17:14,530 --> 00:17:17,599
from 0 on up to, but not through, a value.
381
00:17:17,599 --> 00:17:21,609
So the effect of this line of code, for i in the following range,
382
00:17:21,609 --> 00:17:24,484
essentially hands you back a list of three values,
383
00:17:24,484 --> 00:17:26,359
thereby letting you do something three times.
384
00:17:26,359 --> 00:17:29,067
And if you want to do something 99 times instead, you, of course,
385
00:17:29,067 --> 00:17:30,575
just change the 3 to a 99.
386
00:17:30,575 --> 00:17:31,075
Question.
387
00:17:31,075 --> 00:17:35,090
AUDIENCE: Is there a way to start the beginning point of that range
388
00:17:35,090 --> 00:17:39,410
at a number or an integer that's higher than zero, or is there never really
389
00:17:39,410 --> 00:17:40,460
any point to do so?
390
00:17:40,460 --> 00:17:41,540
DAVID J. MALAN: A really good question, can
391
00:17:41,540 --> 00:17:43,440
you start counting at a higher number.
392
00:17:43,440 --> 00:17:46,910
So not 0, which is the implied default, but something larger than that.
393
00:17:46,910 --> 00:17:51,560
Yes, so it turns out the range function takes multiple arguments, not just one
394
00:17:51,560 --> 00:17:54,998
but maybe two or even three, that allows you to customize this behavior.
395
00:17:54,998 --> 00:17:56,540
So you can customize where it begins.
396
00:17:56,540 --> 00:17:57,920
You can customize the increment.
397
00:17:57,920 --> 00:17:59,712
By default, it's one, but if you want to do
398
00:17:59,712 --> 00:18:02,582
every two values, for like evens or odds, you could do that as well,
399
00:18:02,582 --> 00:18:03,540
and a few other things.
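A minimal sketch of those range arguments, assuming the standard one-, two-, and three-argument forms. The names evens and odds are illustrative, not from the lecture:

```python
# range(stop): counts from 0 up to, but not through, stop
print(list(range(3)))        # [0, 1, 2]

# range(start, stop): begin somewhere other than 0
print(list(range(1, 4)))     # [1, 2, 3]

# range(start, stop, step): customize the increment
evens = list(range(0, 10, 2))
odds = list(range(1, 10, 2))
print(evens)                 # [0, 2, 4, 6, 8]
print(odds)                  # [1, 3, 5, 7, 9]

# Doing something three times
for i in range(3):
    print("hello")
```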
400
00:18:03,540 --> 00:18:05,930
And before long, we'll take a look at some Python documentation
401
00:18:05,930 --> 00:18:08,810
that will become your authoritative source for answers like that.
402
00:18:08,810 --> 00:18:10,790
Like, what can this function do.
403
00:18:10,790 --> 00:18:15,020
Other questions on this thus far?
404
00:18:15,020 --> 00:18:19,980
Seeing none, so what else might we compare and contrast here.
405
00:18:19,980 --> 00:18:24,320
Well, in the world of C, recall that we had a whole bunch of built-in data
406
00:18:24,320 --> 00:18:28,310
types, like these here, bool and char and double and float, and so forth,
407
00:18:28,310 --> 00:18:31,670
string, which happened to come from the CS50 library.
408
00:18:31,670 --> 00:18:35,990
But the language C itself certainly understood the idea of strings,
409
00:18:35,990 --> 00:18:40,700
because the backslash 0, the support for %s in printf, that's all native,
410
00:18:40,700 --> 00:18:43,370
built into C, not a CS50 simplification.
411
00:18:43,370 --> 00:18:45,620
All we did, and revealed, as of a couple of weeks
412
00:18:45,620 --> 00:18:48,050
ago, is that string, this data type, is just
413
00:18:48,050 --> 00:18:52,730
a synonym, a typedef, for char star, which is part of the language natively.
414
00:18:52,730 --> 00:18:55,610
In Python now, this list actually gets a little shorter, at least
415
00:18:55,610 --> 00:18:57,443
for these common primitive data types.
416
00:18:57,443 --> 00:19:00,110
Still going to have bools, we're going to have floats, and ints,
417
00:19:00,110 --> 00:19:02,600
and we're going to have strings, but we're going to call them strs.
418
00:19:02,600 --> 00:19:04,760
And this is not a CS50 thing from the library,
419
00:19:04,760 --> 00:19:08,300
str, S-T-R, is, in fact, a data type in Python,
420
00:19:08,300 --> 00:19:12,260
that's going to do a lot more than strings did for us automatically in C.
421
00:19:12,260 --> 00:19:17,133
Ints and floats, meanwhile, don't need the corresponding longs and doubles,
422
00:19:17,133 --> 00:19:19,550
because, in fact, among the problems Python solves for us,
423
00:19:19,550 --> 00:19:22,340
too, ints can get as big as you want.
424
00:19:22,340 --> 00:19:25,220
Integer overflow is no longer going to be an issue.
425
00:19:25,220 --> 00:19:27,950
Per week 1, the language solves that for us.
426
00:19:27,950 --> 00:19:29,790
Floating point imprecision, unfortunately,
427
00:19:29,790 --> 00:19:31,190
is still a problem that remains.
428
00:19:31,190 --> 00:19:34,730
But there are libraries, code that other people have written, as we briefly
429
00:19:34,730 --> 00:19:37,010
discussed in weeks past, that allow you to do
430
00:19:37,010 --> 00:19:40,250
scientific or financial computing, using libraries that build
431
00:19:40,250 --> 00:19:42,625
on top of these data types, as well.
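A short sketch of both points, assuming nothing beyond the standard library. The decimal module is just one example of such a library, chosen here for illustration and not named in the lecture:

```python
from decimal import Decimal

# ints grow as large as needed, so there is no integer overflow
big = 2 ** 100
print(big)                # 1267650600228229401496703205376

# floats are still imprecise
print(0.1 + 0.2)          # 0.30000000000000004
print(0.1 + 0.2 == 0.3)   # False

# Decimal stores base-10 digits exactly, which suits financial math
total = Decimal("0.1") + Decimal("0.2")
print(total)              # 0.3
```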
432
00:19:42,625 --> 00:19:45,500
So there's other data types, too, in Python, which we'll see actually
433
00:19:45,500 --> 00:19:48,710
gives us a whole bunch of more power and capability,
434
00:19:48,710 --> 00:19:51,500
things called ranges, like we just saw, lists,
435
00:19:51,500 --> 00:19:54,080
like I called out verbally, with the square brackets,
436
00:19:54,080 --> 00:19:56,900
things called tuples, for things like x comma y,
437
00:19:56,900 --> 00:20:00,305
or latitude, longitude, dictionaries, or Dicts,
438
00:20:00,305 --> 00:20:03,740
which allow you to store keys and values, much like our hash tables
439
00:20:03,740 --> 00:20:06,973
from last time, and then sets in the mathematical sense, where they filter
440
00:20:06,973 --> 00:20:09,890
out duplicates for you, and you can just put a whole bunch of numbers,
441
00:20:09,890 --> 00:20:13,910
a whole bunch of words or whatnot, and the language, via this data type,
442
00:20:13,910 --> 00:20:16,400
will filter out duplicates for you.
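As a rough illustration of those data types, with made-up values and variable names that are not from the lecture:

```python
numbers = [0, 1, 2]                     # list, written with square brackets
point = (42.3770, -71.1167)             # tuple, e.g. latitude, longitude
ages = {"alice": 20, "bob": 21}         # dict, storing keys and values
unique = set(["cat", "dog", "cat"])     # set, duplicates filtered out

print(ages["alice"])    # 20
print(len(unique))      # 2
```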
443
00:20:16,400 --> 00:20:19,985
Now there's going to be a few functions we give you this week and beyond,
444
00:20:19,985 --> 00:20:22,610
training wheels that we're then going to very quickly take off,
445
00:20:22,610 --> 00:20:26,060
just because, as we'll see today, they just simplify the process of getting
446
00:20:26,060 --> 00:20:29,205
user input correctly, without accidentally writing buggy code,
447
00:20:29,205 --> 00:20:32,330
just when you're trying to get Hello, World, or something similar, to work.
448
00:20:32,330 --> 00:20:36,050
And we'll give you functions, not like, not as long as this list in C,
449
00:20:36,050 --> 00:20:38,630
but a subset of these, get_float, get_int,
450
00:20:38,630 --> 00:20:41,660
and get_string, that'll automate the process of getting
451
00:20:41,660 --> 00:20:45,410
user input in a way that's more resilient against potential bugs.
452
00:20:45,410 --> 00:20:47,270
But we'll see what those bugs might be.
453
00:20:47,270 --> 00:20:50,120
And the way we're going to do this is similar in spirit to C.
454
00:20:50,120 --> 00:20:54,380
Instead of doing include cs50.h, like we did in C,
455
00:20:54,380 --> 00:20:57,290
you're going to now start saying import cs50.
456
00:20:57,290 --> 00:21:00,560
Python supports, similar to C, libraries,
457
00:21:00,560 --> 00:21:02,300
but there aren't header files anymore.
458
00:21:02,300 --> 00:21:05,090
You just use the name of the library in Python.
459
00:21:05,090 --> 00:21:08,450
And if you want to import CS50's functions, you just say import cs50.
460
00:21:08,450 --> 00:21:12,470
Or, if you want to be more precise, and not just import the whole thing, which
461
00:21:12,470 --> 00:21:15,860
could be slow, if you've got a really big library with a lot of functionality
462
00:21:15,860 --> 00:21:19,730
in it, you can be more precise and say from cs50 import get_float.
463
00:21:19,730 --> 00:21:23,480
From cs50 import get_int, from cs50 import get_string,
464
00:21:23,480 --> 00:21:26,270
or you can just separate them by commas and import 3
465
00:21:26,270 --> 00:21:30,550
and only 3 things from a particular library, like ours.
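The same import styles can be sketched with the standard library's math module standing in for cs50. That substitution is an assumption made so the example runs without the CS50 library installed:

```python
# Import the whole library, then prefix each name with the library's name
import math
print(math.sqrt(16))            # 4.0

# Or import only specific names, separated by commas
from math import floor, ceil
print(floor(2.7), ceil(2.2))    # 2 3
```

With the CS50 library installed, the second form would presumably read from cs50 import get_float, get_int, get_string.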
466
00:21:30,550 --> 00:21:32,300
But starting today and onward, we're going
467
00:21:32,300 --> 00:21:35,450
to start making much more heavy use of libraries, code
468
00:21:35,450 --> 00:21:38,570
that other people wrote, so that we're no longer reinventing the wheel.
469
00:21:38,570 --> 00:21:41,875
We're not making our own linked lists, our own trees, our own dictionaries.
470
00:21:41,875 --> 00:21:44,250
We're going to start standing on the shoulders of others,
471
00:21:44,250 --> 00:21:47,120
so that you can get real work done, so to speak, faster,
472
00:21:47,120 --> 00:21:51,710
by building your software on top of others' code as well.
473
00:21:51,710 --> 00:21:55,110
All right, so that's it for the syntactic tour of the language,
474
00:21:55,110 --> 00:21:56,360
and the sort of core features.
475
00:21:56,360 --> 00:21:58,320
Soon we'll transition to application thereof.
476
00:21:58,320 --> 00:22:04,040
But let me pause here to see if there's any questions on syntax or primitives
477
00:22:04,040 --> 00:22:10,340
or otherwise.
478
00:22:10,340 --> 00:22:12,204
Oh, yes, in back.
479
00:22:12,204 --> 00:22:16,163
AUDIENCE: Why don't Python have the increment operators.
480
00:22:16,163 --> 00:22:18,330
DAVID J. MALAN: I'm sorry, say it again, why doesn't
481
00:22:18,330 --> 00:22:19,788
Python have what kind of operators?
482
00:22:19,788 --> 00:22:22,578
AUDIENCE: Why doesn't Python have the increment operator?
483
00:22:22,578 --> 00:22:25,620
DAVID J. MALAN: Sorry, someone coughed when you said something operators.
484
00:22:25,620 --> 00:22:26,948
AUDIENCE: The increment.
485
00:22:26,948 --> 00:22:28,740
DAVID J. MALAN: Oh, the increment operator?
486
00:22:28,740 --> 00:22:30,407
I'd have to check the history, honestly.
487
00:22:30,407 --> 00:22:32,910
Python has tended to be a fairly minimalist language.
488
00:22:32,910 --> 00:22:36,090
And if you can do something one way, the community, arguably,
489
00:22:36,090 --> 00:22:40,145
has tended to not give you multiple ways to do the same thing syntactically.
490
00:22:40,145 --> 00:22:41,520
There's probably a better answer.
491
00:22:41,520 --> 00:22:45,840
And I'll see if I can dig in and post something online, to follow up on that.
492
00:22:45,840 --> 00:22:49,870
All right, so before we transition to now writing some actual code,
493
00:22:49,870 --> 00:22:54,870
let me go ahead and consider exactly how we're going to write code.
494
00:22:54,870 --> 00:22:58,770
In the world of C, recall that it's generally been a 2-step process.
495
00:22:58,770 --> 00:23:04,230
We create a file called like hello.c, and then, step one, make hello, step 2,
496
00:23:04,230 --> 00:23:05,400
./hello.
497
00:23:05,400 --> 00:23:08,130
Or, if you think back to week two, when we sort of peeled back
498
00:23:08,130 --> 00:23:11,100
the layer of what make was doing,
499
00:23:11,100 --> 00:23:14,310
you could more verbosely type out the name of the actual compiler,
500
00:23:14,310 --> 00:23:17,640
clang in our case, command-line arguments like -o hello,
501
00:23:17,640 --> 00:23:19,840
to specify what name you want to create.
502
00:23:19,840 --> 00:23:21,660
And then you can specify the file name.
503
00:23:21,660 --> 00:23:25,050
And then you can specify what libraries you want to link in.
504
00:23:25,050 --> 00:23:26,550
So that was a very verbose approach.
505
00:23:26,550 --> 00:23:28,930
But it was always a two-step approach.
506
00:23:28,930 --> 00:23:31,680
And so, even as you've been doing recent problem sets,
507
00:23:31,680 --> 00:23:35,400
odds are you've realized that, any time you want to make a change to your code,
508
00:23:35,400 --> 00:23:39,660
or make a change to your code and try and test your code again,
509
00:23:39,660 --> 00:23:42,360
you're constantly doing those two steps.
510
00:23:42,360 --> 00:23:45,840
Moving forward in Python, it's going to become simpler,
511
00:23:45,840 --> 00:23:47,610
and it's going to be just this.
512
00:23:47,610 --> 00:23:50,460
The file name is going to change, but that might go without saying.
513
00:23:50,460 --> 00:23:55,260
It's going to be something like hello.py, P-Y, instead of hello.c.
514
00:23:55,260 --> 00:23:57,990
And that's just a convention, using a different file extension.
515
00:23:57,990 --> 00:24:00,780
But there's no compilation step per se.
516
00:24:00,780 --> 00:24:04,170
You jump right to the execution of your code.
517
00:24:04,170 --> 00:24:07,200
And so Python, it turns out, is the name, not only of the language
518
00:24:07,200 --> 00:24:12,150
we're going to start using, it's also the name of a program on a Mac, a PC,
519
00:24:12,150 --> 00:24:16,020
assuming it's been pre-installed, that interprets the language for you.
520
00:24:16,020 --> 00:24:20,100
This is to say that Python is generally described as being interpreted,
521
00:24:20,100 --> 00:24:21,360
not compiled.
522
00:24:21,360 --> 00:24:25,170
And by that, I mean you get to skip, from the programmer's perspective,
523
00:24:25,170 --> 00:24:26,370
that compilation step.
524
00:24:26,370 --> 00:24:30,870
There is no manual step in the world of Python, typically, of writing your code
525
00:24:30,870 --> 00:24:34,530
and then compiling it to zeros and ones, and then running the zeros and ones.
526
00:24:34,530 --> 00:24:36,870
Instead, these kind of two steps get collapsed
527
00:24:36,870 --> 00:24:42,570
into the illusion of one, whereby you, instead, are able to just run the code,
528
00:24:42,570 --> 00:24:46,200
and let the computer figure out how to actually convert it
529
00:24:46,200 --> 00:24:48,240
to something the computer understands.
530
00:24:48,240 --> 00:24:51,850
And the way we do that is via this old process, input and output.
531
00:24:51,850 --> 00:24:53,910
But now, when you have source code, it's going
532
00:24:53,910 --> 00:24:56,850
to be passed into an interpreter, not a compiler.
533
00:24:56,850 --> 00:24:59,400
And the best analog of this is just to perhaps point out
534
00:24:59,400 --> 00:25:01,950
that, in the human world, if you speak, or don't speak,
535
00:25:01,950 --> 00:25:05,640
multiple human languages, it can be a pretty slow process going
536
00:25:05,640 --> 00:25:07,270
from one language to another.
537
00:25:07,270 --> 00:25:10,170
For instance, here are step-by-step instructions for finding someone
538
00:25:10,170 --> 00:25:12,540
in a phone book, unfortunately, in Spanish.
539
00:25:12,540 --> 00:25:15,360
Unfortunately, if you don't speak or read Spanish.
540
00:25:15,360 --> 00:25:16,560
You could figure this out.
541
00:25:16,560 --> 00:25:19,380
You could run this algorithm, but you're going to have to do some googling,
542
00:25:19,380 --> 00:25:22,130
or you're going to have to open up a literal dictionary from Spanish
543
00:25:22,130 --> 00:25:23,460
to English and convert this.
544
00:25:23,460 --> 00:25:27,060
And the catch with translating any language, human or computer
545
00:25:27,060 --> 00:25:30,850
or otherwise, is that you're going to pay a price, typically some time.
546
00:25:30,850 --> 00:25:33,840
And so converting this in Spanish to this in English
547
00:25:33,840 --> 00:25:36,360
is just going to take you longer than if this were already
548
00:25:36,360 --> 00:25:38,453
in your native language.
549
00:25:38,453 --> 00:25:41,370
And that's going to be one of the subtleties with the world of Python.
550
00:25:41,370 --> 00:25:45,180
Yes, it's a feature that you can just run the code without having
551
00:25:45,180 --> 00:25:47,880
to bother compiling it manually first.
552
00:25:47,880 --> 00:25:49,050
But we might pay a price.
553
00:25:49,050 --> 00:25:50,815
And things might be a little slower.
554
00:25:50,815 --> 00:25:52,440
Now, there's ways to chip away at that.
555
00:25:52,440 --> 00:25:53,815
But we'll see an example thereof.
556
00:25:53,815 --> 00:25:56,700
In fact, let me transition now to just a couple of examples
557
00:25:56,700 --> 00:26:00,660
that demonstrate how Python is not only easier for many people
558
00:26:00,660 --> 00:26:03,240
to use, perhaps yourselves too, because it throws away
559
00:26:03,240 --> 00:26:06,120
a lot of the annoying syntax, it shortens the number of lines
560
00:26:06,120 --> 00:26:09,810
you have to write, and also it comes with so many darn libraries,
561
00:26:09,810 --> 00:26:14,740
you can just do so much more without having to write the code yourself.
562
00:26:14,740 --> 00:26:17,670
So, as an example of this, let me switch over here
563
00:26:17,670 --> 00:26:24,090
to this image from problem set 4, which is the Weeks Bridge down by the Charles
564
00:26:24,090 --> 00:26:25,290
River here in Cambridge.
565
00:26:25,290 --> 00:26:27,245
And this is the original photo, pretty clear,
566
00:26:27,245 --> 00:26:30,370
and it's even higher res if we looked at the original version of the photo.
567
00:26:30,370 --> 00:26:33,660
But there have been no filters, a la Instagram, applied to this photo.
568
00:26:33,660 --> 00:26:36,750
Recall, for problem set four, you had to implement a few filters.
569
00:26:36,750 --> 00:26:38,460
And among them might have been blur.
570
00:26:38,460 --> 00:26:41,610
And blur was probably among the more challenging of the ones,
571
00:26:41,610 --> 00:26:44,190
because you had to iterate over all of the pixels,
572
00:26:44,190 --> 00:26:47,130
you had to take into account what's above, what's below, to the left,
573
00:26:47,130 --> 00:26:47,490
to the right.
574
00:26:47,490 --> 00:26:49,448
I mean, there was a lot of math and arithmetic.
575
00:26:49,448 --> 00:26:52,620
And if you ultimately got it, it was probably a great sense of satisfaction.
576
00:26:52,620 --> 00:26:54,780
But that was probably several hours later.
577
00:26:54,780 --> 00:26:57,540
In a language like Python, where there might
578
00:26:57,540 --> 00:27:01,170
be libraries that had been written by others, on whose shoulders
579
00:27:01,170 --> 00:27:03,880
you can stand, we could perhaps do something like this.
580
00:27:03,880 --> 00:27:08,280
Let me go ahead and run a program, or write a program, called blur.py here.
581
00:27:08,280 --> 00:27:12,130
And in blur.py, in VS Code, let me just do this.
582
00:27:12,130 --> 00:27:15,370
Let me import from a library, not the CS50 library,
583
00:27:15,370 --> 00:27:19,620
but the Pillow library, so to speak, a keyword called Image
584
00:27:19,620 --> 00:27:23,330
and another one called ImageFilter, then let me go ahead
585
00:27:23,330 --> 00:27:26,420
and say, let me open the current version of this image, which
586
00:27:26,420 --> 00:27:27,740
is called bridge.bmp.
587
00:27:27,740 --> 00:27:30,260
So the before version of the image will be
588
00:27:30,260 --> 00:27:34,550
the result of calling Image.open quote unquote "bridge.bmp,"
589
00:27:34,550 --> 00:27:37,040
and then, let me create an after version.
590
00:27:37,040 --> 00:27:38,840
So you'll see before and after.
591
00:27:38,840 --> 00:27:45,010
after equals the before version .filter of ImageFilter.
592
00:27:45,010 --> 00:27:46,760
And there is, if I read the documentation,
593
00:27:46,760 --> 00:27:49,052
I'll see that there's something called a box blur, that
594
00:27:49,052 --> 00:27:52,160
allows you to blur in box format, like one pixel above,
595
00:27:52,160 --> 00:27:53,750
below, left, and right.
596
00:27:53,750 --> 00:27:55,367
So I'll do one pixel there.
597
00:27:55,367 --> 00:27:57,950
And then, after that's done, let me go ahead and save the file
598
00:27:57,950 --> 00:28:01,070
as something like out.bmp.
599
00:28:01,070 --> 00:28:02,180
That's it.
600
00:28:02,180 --> 00:28:04,910
Assuming this library works as described,
601
00:28:04,910 --> 00:28:08,060
I am opening the file in Python, using line 3.
602
00:28:08,060 --> 00:28:09,680
And this is somewhat new syntax.
603
00:28:09,680 --> 00:28:13,250
In the world of Python, we're going to start making use of the dot operator
604
00:28:13,250 --> 00:28:15,320
more, because in the world of Python, you have
605
00:28:15,320 --> 00:28:19,700
what's called object-oriented programming, or OOP, as a term of art.
606
00:28:19,700 --> 00:28:22,470
And what this means is that you still have functions,
607
00:28:22,470 --> 00:28:24,980
you still have variables, but sometimes those functions
608
00:28:24,980 --> 00:28:28,850
are embedded inside of the variables, or, more specifically,
609
00:28:28,850 --> 00:28:30,710
inside of the data types themselves.
610
00:28:30,710 --> 00:28:34,430
Think back to C. When you wanted to convert something to uppercase,
611
00:28:34,430 --> 00:28:38,582
there was a toupper function that takes as input an argument that's a char.
612
00:28:38,582 --> 00:28:41,540
And you can pass in any char you want, and it will uppercase it for you
613
00:28:41,540 --> 00:28:42,890
and give you back a value.
614
00:28:42,890 --> 00:28:46,160
Well, you know what, if that's such a common paradigm, where
615
00:28:46,160 --> 00:28:49,850
upper-casing chars is a useful thing, what the world of Python does
616
00:28:49,850 --> 00:28:54,470
is it embeds into the string data type, or char if you will,
617
00:28:54,470 --> 00:28:59,240
the ability just to uppercase any char by treating the char, or the string,
618
00:28:59,240 --> 00:29:02,150
as though it's a struct in C. Recall that structs
619
00:29:02,150 --> 00:29:04,400
encapsulate multiple types of values.
620
00:29:04,400 --> 00:29:07,610
In object-oriented programming, in a language like Python,
621
00:29:07,610 --> 00:29:11,510
you can encapsulate not just values, but also functionality.
622
00:29:11,510 --> 00:29:13,818
Functions can now be inside of structs.
623
00:29:13,818 --> 00:29:15,860
But we're not going to call them structs anymore.
624
00:29:15,860 --> 00:29:17,270
We're going to call them objects.
625
00:29:17,270 --> 00:29:19,130
But that's just a different vernacular.
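A small sketch of that idea. The str method upper is built into Python; the Point class is hypothetical, made up here to show values and a function living inside the same object:

```python
# The uppercasing functionality is embedded in the str data type itself
s = "hello"
print(s.upper())        # HELLO


class Point:
    """Hypothetical object: like a struct, but with a function inside."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def shifted(self, dx, dy):
        # A function reached with the dot operator, just like a field
        return Point(self.x + dx, self.y + dy)


p = Point(1, 2).shifted(3, 4)
print(p.x, p.y)         # 4 6
```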
626
00:29:19,130 --> 00:29:20,870
So what am I doing here?
627
00:29:20,870 --> 00:29:23,870
Inside of the image library, there's a function called open,
628
00:29:23,870 --> 00:29:26,630
and it takes an argument, the name of the file, to open.
629
00:29:26,630 --> 00:29:30,260
Once I have a variable called before, that is a struct, or technically
630
00:29:30,260 --> 00:29:33,290
an object, inside of which is now, because it
631
00:29:33,290 --> 00:29:36,140
was returned from this function, a function
632
00:29:36,140 --> 00:29:38,280
called filter, that takes an argument.
633
00:29:38,280 --> 00:29:41,660
The argument here happens to be ImageFilter.BoxBlur(1),
634
00:29:41,660 --> 00:29:42,830
which itself is a function.
635
00:29:42,830 --> 00:29:44,803
But it just returns the filter to use.
636
00:29:44,803 --> 00:29:46,970
And then, after, dot save does what you might think.
637
00:29:46,970 --> 00:29:48,150
It just saves the file.
638
00:29:48,150 --> 00:29:51,470
So instead of using fopen and fwrite, you just say dot save,
639
00:29:51,470 --> 00:29:54,510
and that does all of that messy work for you.
640
00:29:54,510 --> 00:29:57,230
So it's just, what, four lines of code total?
641
00:29:57,230 --> 00:30:00,240
Let me go ahead and go down to my terminal window.
642
00:30:00,240 --> 00:30:03,533
Let me go ahead and show you with ls that, at the moment,
643
00:30:03,533 --> 00:30:05,450
whoops, sorry, let me not bother showing that,
644
00:30:05,450 --> 00:30:07,160
because I have other examples to come.
645
00:30:07,160 --> 00:30:14,310
I'm going to go ahead and do python of blur.py, nope, sorry, wrong place.
646
00:30:14,310 --> 00:30:15,570
I did need to make a command.
647
00:30:15,570 --> 00:30:16,280
There we go.
648
00:30:16,280 --> 00:30:19,340
OK, let me go ahead and type ls inside of my filter directory, which
649
00:30:19,340 --> 00:30:21,560
is among the sample code online today.
650
00:30:21,560 --> 00:30:24,800
There's only one file called bridge.bmp, dammit,
651
00:30:24,800 --> 00:30:27,630
I'm trying to get these things ready at the same time.
652
00:30:27,630 --> 00:30:28,730
Let me rewind.
653
00:30:28,730 --> 00:30:32,120
Let me move this code into place.
654
00:30:32,120 --> 00:30:34,710
All right, I've gone ahead and moved this file, blur.py,
655
00:30:34,710 --> 00:30:37,190
into a folder called filter, inside of which
656
00:30:37,190 --> 00:30:42,080
there's another file called bridge.bmp, which we can confirm with ls.
657
00:30:42,080 --> 00:30:44,390
Let me now go ahead and run Python, which
658
00:30:44,390 --> 00:30:46,700
is my interpreter, and also the name of the language,
659
00:30:46,700 --> 00:30:48,990
and run Python on this file.
660
00:30:48,990 --> 00:30:51,348
So much like running the Spanish algorithm
661
00:30:51,348 --> 00:30:53,390
through Google Translate, or something like that,
662
00:30:53,390 --> 00:30:55,650
as input, to get back the English output,
663
00:30:55,650 --> 00:30:59,540
this is going to translate the Python language to something
664
00:30:59,540 --> 00:31:01,760
this computer, or this cloud-based environment,
665
00:31:01,760 --> 00:31:05,070
understands, and then run the corresponding code, top to bottom,
666
00:31:05,070 --> 00:31:05,707
left to right.
667
00:31:05,707 --> 00:31:07,040
I'm going to go ahead and Enter.
668
00:31:07,040 --> 00:31:08,930
No error message is generally a good thing.
669
00:31:08,930 --> 00:31:11,960
If I type ls, you'll now see out.bmp.
670
00:31:11,960 --> 00:31:13,295
Let me go ahead and open that.
671
00:31:13,295 --> 00:31:15,920
And, you know what, just to make clear what's really happening,
672
00:31:15,920 --> 00:31:17,087
let me blur it even further.
673
00:31:17,087 --> 00:31:20,550
Let's make a box that's not just one pixel around, but 10.
674
00:31:20,550 --> 00:31:21,950
So let's make that change.
675
00:31:21,950 --> 00:31:24,830
And let me just go ahead and rerun it with python of blur.py.
676
00:31:24,830 --> 00:31:27,320
I still have out.bmp.
677
00:31:27,320 --> 00:31:32,100
Let me go ahead and open out.bmp and show you first the before,
678
00:31:32,100 --> 00:31:33,680
which looks like this.
679
00:31:33,680 --> 00:31:34,550
That's the original.
680
00:31:34,550 --> 00:31:37,820
And now, crossing my fingers, four lines of code later,
681
00:31:37,820 --> 00:31:39,758
the result of blurring it, as well.
682
00:31:39,758 --> 00:31:42,050
So the library is doing all of the same kind of legwork
683
00:31:42,050 --> 00:31:44,120
that you all did for the assignment, but it's
684
00:31:44,120 --> 00:31:48,303
encapsulated it all into a single library, that you can then use instead.
685
00:31:48,303 --> 00:31:50,720
Those of you who might have been feeling more comfortable,
686
00:31:50,720 --> 00:31:52,595
might have done a little something like this.
687
00:31:52,595 --> 00:31:56,900
Let me go ahead and open up one other file, called edges.py.
688
00:31:56,900 --> 00:32:00,290
And in edges.py, I'm again going to import from the Pillow library
689
00:32:00,290 --> 00:32:03,010
the Image keyword, and the ImageFilter.
690
00:32:03,010 --> 00:32:05,510
Then I'm going to go ahead and create a before image, that's
691
00:32:05,510 --> 00:32:09,590
a result of calling Image.open of the same thing, bridge.bmp,
692
00:32:09,590 --> 00:32:16,910
then I'm going to go ahead and run a filter on that, called image, whoops,
693
00:32:16,910 --> 00:32:21,850
ImageFilter.FIND_EDGES, which is like a constant, if you will,
694
00:32:21,850 --> 00:32:23,708
defined inside of this library for us.
695
00:32:23,708 --> 00:32:25,750
And then I'm going to do after.save quote unquote
696
00:32:25,750 --> 00:32:28,210
"out.bmp," using the same file name.
697
00:32:28,210 --> 00:32:36,490
I'm now going to run python of edges.py, after, sorry, user error.
698
00:32:36,490 --> 00:32:38,930
We'll see what syntax error means soon.
699
00:32:38,930 --> 00:32:41,470
Let me go ahead and run the code now, edges.py.
700
00:32:41,470 --> 00:32:44,830
Let me now open that new file, out.bmp.
701
00:32:44,830 --> 00:32:49,510
And before we had this, and now, which will look especially familiar
702
00:32:49,510 --> 00:32:52,210
if you did the more comfortable version of problem set 4,
703
00:32:52,210 --> 00:32:55,340
we now get this, after just four lines of code.
704
00:32:55,340 --> 00:32:58,120
So again, suggesting the power of using a language that's better
705
00:32:58,120 --> 00:32:59,560
optimized for the task at hand.
706
00:32:59,560 --> 00:33:02,950
And at the risk of really making folks sad, let's go ahead
707
00:33:02,950 --> 00:33:06,820
and re-implement, if we could, problem set five, real quickly here.
708
00:33:06,820 --> 00:33:11,080
Let me go ahead and open another version of this code,
709
00:33:11,080 --> 00:33:14,307
wherein I have a C version, just from problem
710
00:33:14,307 --> 00:33:16,390
set five, wherein you implemented a spell checker,
711
00:33:16,390 --> 00:33:18,640
loading 100,000 plus words into memory.
712
00:33:18,640 --> 00:33:22,390
And then you kept track of just how much time and memory it took.
713
00:33:22,390 --> 00:33:24,340
And that probably took a while, implementing
714
00:33:24,340 --> 00:33:26,530
all of those functions in dictionary.c.
715
00:33:26,530 --> 00:33:32,240
Let me instead now go into a new file, called dictionary.py.
716
00:33:32,240 --> 00:33:35,200
And let me stipulate, for the sake of discussion,
717
00:33:35,200 --> 00:33:37,660
that we already wrote in advance, speller.py,
718
00:33:37,660 --> 00:33:39,850
which corresponds to speller.c.
719
00:33:39,850 --> 00:33:41,380
You didn't write either of those.
720
00:33:41,380 --> 00:33:43,600
Recall for problem set five, we gave you speller.c.
721
00:33:43,600 --> 00:33:45,558
Assume that we're going to give you speller.py.
722
00:33:45,558 --> 00:33:52,030
So the onus on us right now is only to implement dictionary.py.
723
00:33:52,030 --> 00:33:54,940
All right, so I'm going to go ahead and define a few functions.
724
00:33:54,940 --> 00:33:58,000
And we're going to see now the syntax for defining functions in Python.
725
00:33:58,000 --> 00:34:02,230
I want to go ahead and define first, a hash table, which
726
00:34:02,230 --> 00:34:04,840
was the very first thing you defined in dictionary.c.
727
00:34:04,840 --> 00:34:09,969
I'm going to go ahead, then, and say words gets this, give me a dictionary,
728
00:34:09,969 --> 00:34:11,683
otherwise known as a hash table.
729
00:34:11,683 --> 00:34:13,600
All right, now let me define a function called
730
00:34:13,600 --> 00:34:16,630
check, which was the first function you might have implemented.
731
00:34:16,630 --> 00:34:19,000
Check is going to take a word, and you'll see in Python,
732
00:34:19,000 --> 00:34:20,375
the syntax is a little different.
733
00:34:20,375 --> 00:34:21,880
You don't specify the return type.
734
00:34:21,880 --> 00:34:24,610
You use the word def instead to define.
735
00:34:24,610 --> 00:34:28,540
You still specify the name of the function and any arguments thereto.
736
00:34:28,540 --> 00:34:31,210
But you omit any mention of types.
737
00:34:31,210 --> 00:34:33,280
But you do use a colon and indent.
738
00:34:33,280 --> 00:34:37,780
So how do I check if a word is in my dictionary, or in my hash table?
739
00:34:37,780 --> 00:34:41,440
Well, in Python, I can just say, if word in words,
740
00:34:41,440 --> 00:34:46,570
go ahead and return True, else go ahead and return False, done,
741
00:34:46,570 --> 00:34:47,949
with the check function.
742
00:34:47,949 --> 00:34:49,639
All right, now I want to do like load.
743
00:34:49,639 --> 00:34:52,639
That was the heavy lift, where you had to load the big file into memory.
744
00:34:52,639 --> 00:34:54,306
So let me define a function called load.
745
00:34:54,306 --> 00:34:56,650
It takes a string, the name of a file to load.
746
00:34:56,650 --> 00:34:59,980
So I'll call that dictionary, just like in C, but no data type.
747
00:34:59,980 --> 00:35:04,180
Let me go ahead and open a file by using an open function in Python,
748
00:35:04,180 --> 00:35:06,740
by opening that dictionary in read mode.
749
00:35:06,740 --> 00:35:10,360
So this is a little similar to fopen, a function in C you might recall.
750
00:35:10,360 --> 00:35:12,880
Then let me iterate over every line in the file.
751
00:35:12,880 --> 00:35:17,800
In Python, this is pretty pleasant, for line in file colon indent.
752
00:35:17,800 --> 00:35:22,510
How, now, do I get at the current word, and then strip off the new line,
753
00:35:22,510 --> 00:35:25,570
because in this file of words, 140,000 words,
754
00:35:25,570 --> 00:35:28,752
there's word backslash n, word backslash n, all right?
755
00:35:28,752 --> 00:35:31,210
Well, let me go ahead and get a word from the current line,
756
00:35:31,210 --> 00:35:34,840
but strip off, from the right end of the string, the new line, which
757
00:35:34,840 --> 00:35:37,540
the rstrip function in Python does for me.
758
00:35:37,540 --> 00:35:42,370
Then let me go ahead and add to my dictionary, or hash table, that word,
759
00:35:42,370 --> 00:35:43,030
done.
760
00:35:43,030 --> 00:35:45,535
Let me go ahead and close the file for good measure.
761
00:35:45,535 --> 00:35:48,160
And then let me go ahead and return true, because all was well.
762
00:35:48,160 --> 00:35:50,320
That's it for the load function in Python.
763
00:35:50,320 --> 00:35:51,580
How about the size function?
764
00:35:51,580 --> 00:35:54,820
This did not take any arguments, it just returns the size of the hash table
765
00:35:54,820 --> 00:35:55,990
or dictionary in Python.
766
00:35:55,990 --> 00:35:59,980
I can do that by returning the length of the dictionary in question.
767
00:35:59,980 --> 00:36:04,660
And then lastly, gone from the world of Python is malloc and free.
768
00:36:04,660 --> 00:36:06,090
Memory is managed for you.
769
00:36:06,090 --> 00:36:08,950
So no matter what I do, there's nothing to unload.
770
00:36:08,950 --> 00:36:10,820
The computer will do that for me.
771
00:36:10,820 --> 00:36:14,860
So I give you, in these functions, problem set five in Python.
772
00:36:14,860 --> 00:36:17,020
So, I'm sorry, we made you write it in C first.
773
00:36:17,020 --> 00:36:20,620
But the implication now is that, what are you getting for free,
774
00:36:20,620 --> 00:36:21,850
in a language like Python?
775
00:36:21,850 --> 00:36:24,370
Well, encapsulated in this one line of code
776
00:36:24,370 --> 00:36:28,270
is much of what you wrote for problem set five, implementing
777
00:36:28,270 --> 00:36:31,270
your array for all of your letters of the alphabet or more,
778
00:36:31,270 --> 00:36:34,390
all of the linked lists that you implemented to create chains,
779
00:36:34,390 --> 00:36:35,930
to store all of those words.
780
00:36:35,930 --> 00:36:37,060
All of that is happening.
781
00:36:37,060 --> 00:36:40,090
It's just someone else in the world wrote that code for you.
782
00:36:40,090 --> 00:36:43,060
And you can now use it by way of a dictionary.
783
00:36:43,060 --> 00:36:45,550
And actually, I can change this a little bit,
784
00:36:45,550 --> 00:36:48,670
because add is technically not the right function to use here.
785
00:36:48,670 --> 00:36:51,620
I'm actually treating the dictionary as something simpler, a set.
786
00:36:51,620 --> 00:36:55,420
So I'm going to make one tweak. Set, recall, was another data type in Python.
787
00:36:55,420 --> 00:36:57,700
But set just allows it to handle duplicates,
788
00:36:57,700 --> 00:37:00,430
and it allows me to just throw things into it by literally
789
00:37:00,430 --> 00:37:02,320
using a function as simple as add.
790
00:37:02,320 --> 00:37:05,170
And I'm going to make one other tweak here,
791
00:37:05,170 --> 00:37:09,790
because, when I'm checking a word, it's possible it might be given
792
00:37:09,790 --> 00:37:12,520
to me in uppercase or capitalized.
793
00:37:12,520 --> 00:37:15,880
It's not going to necessarily come in in the same lowercase format
794
00:37:15,880 --> 00:37:17,470
that my dictionary did.
795
00:37:17,470 --> 00:37:22,390
I can force every word to lowercase by using word.lower.
796
00:37:22,390 --> 00:37:24,500
And I don't have to do it character for character,
797
00:37:24,500 --> 00:37:29,800
I can do the whole darn string at once, by just saying word.lower.
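A minimal sketch of the dictionary.py described above, assuming the four function names from problem set five (check, load, size, unload) and a global set named words. It follows the spoken walkthrough step by step; it is not necessarily the exact file distributed with the course.

```python
# Global set of words; a set ignores duplicates and supports add().
words = set()


def check(word):
    """Return True if word is in the dictionary, ignoring case."""
    if word.lower() in words:
        return True
    else:
        return False


def load(dictionary):
    """Load every word from the file named by dictionary into memory."""
    file = open(dictionary, "r")
    for line in file:
        # rstrip() removes the trailing "\n" (and any other trailing whitespace).
        word = line.rstrip()
        words.add(word)
    file.close()
    return True


def size():
    """Return the number of words loaded."""
    return len(words)


def unload():
    """Nothing to free: Python manages memory for us."""
    return True
```

The if/else in check could be shortened to a single return of the membership test, but the longer form mirrors what is said in the lecture.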
798
00:37:29,800 --> 00:37:32,860
All right, let me go ahead and open up a terminal window here.
799
00:37:32,860 --> 00:37:36,118
And let me go into, first, my C version, on the left.
800
00:37:36,118 --> 00:37:39,160
And actually I'm going to go ahead and split my terminal window into two.
801
00:37:39,160 --> 00:37:44,007
And on the right, I'm going to go into a version that I essentially just wrote.
802
00:37:44,007 --> 00:37:46,840
But it's also available online, if you want to play along afterward.
803
00:37:46,840 --> 00:37:50,170
I'm going to go ahead and make speller in C on the left,
804
00:37:50,170 --> 00:37:52,270
and note that it takes a moment to compile.
805
00:37:52,270 --> 00:37:56,530
Then I'm going to be ready to run speller of dictionaries,
806
00:37:56,530 --> 00:37:59,330
let's do like the Sherlock Holmes text, which is pretty big.
807
00:37:59,330 --> 00:38:03,970
And then over here, let me get ready to run Python of speller
808
00:38:03,970 --> 00:38:07,733
on texts/holmes.txt, too.
809
00:38:07,733 --> 00:38:10,150
So the syntax is a little different at the command prompt.
810
00:38:10,150 --> 00:38:12,880
I just, on the left, have to compile the code, with make,
811
00:38:12,880 --> 00:38:14,650
and then run it with ./speller.
812
00:38:14,650 --> 00:38:16,370
On the right, I don't need to compile it.
813
00:38:16,370 --> 00:38:17,860
But I do need to use the interpreter.
814
00:38:17,860 --> 00:38:20,230
So even though the lines are wrapping a little bit here,
815
00:38:20,230 --> 00:38:22,180
let me go ahead and run it on the left.
816
00:38:22,180 --> 00:38:24,305
And I'm going to count how long it takes, verbally,
817
00:38:24,305 --> 00:38:25,570
for demonstration sake.
818
00:38:25,570 --> 00:38:28,720
One Mississippi, two Mississippi, three Mississippi, OK,
819
00:38:28,720 --> 00:38:31,190
so it's like three seconds, give or take.
820
00:38:31,190 --> 00:38:33,520
Now running it in Python, keeping in mind,
821
00:38:33,520 --> 00:38:37,103
I spent way fewer hours implementing a spell checker in Python
822
00:38:37,103 --> 00:38:38,770
than you might have in problem set five.
823
00:38:38,770 --> 00:38:42,007
But what's the trade-off going to be, and what kinds of design decisions
824
00:38:42,007 --> 00:38:43,840
do we all now need to be making consciously?
825
00:38:43,840 --> 00:38:46,300
Here we go, on the right, in Python.
826
00:38:46,300 --> 00:38:50,020
One Mississippi, two Mississippi, three Mississippi, four Mississippi,
827
00:38:50,020 --> 00:38:54,070
five Mississippi, six Mississippi, seven Mississippi, eight Mississippi,
828
00:38:54,070 --> 00:38:57,100
nine Mississippi, 10 Mississippi, 11 Mississippi,
829
00:38:57,100 --> 00:38:59,990
all right, so 10 or 11 seconds.
830
00:38:59,990 --> 00:39:01,980
So which one is better?
831
00:39:01,980 --> 00:39:06,550
Let's go to the group here, which of these programs is the better one?
832
00:39:06,550 --> 00:39:10,780
How might you answer that question, based on demonstration alone?
833
00:39:10,780 --> 00:39:11,530
What do you think?
834
00:39:11,530 --> 00:39:13,738
AUDIENCE: I think Python's better for the programmer,
835
00:39:13,738 --> 00:39:17,847
more comfortable for the programmer, but C is better for the user.
836
00:39:17,847 --> 00:39:19,680
DAVID J. MALAN: OK, so Python, to summarize,
837
00:39:19,680 --> 00:39:23,460
is better for the programmer, because it was way faster to write,
838
00:39:23,460 --> 00:39:26,460
but C is maybe better for the computer, because it's much faster to run.
839
00:39:26,460 --> 00:39:28,127
I think that's a reasonable formulation.
840
00:39:28,127 --> 00:39:29,430
Other opinions?
841
00:39:29,430 --> 00:39:30,588
Yeah.
842
00:39:30,588 --> 00:39:32,880
AUDIENCE: I think it depends on the size of the project
843
00:39:32,880 --> 00:39:33,910
that you're dealing with.
844
00:39:33,910 --> 00:39:36,285
So if it's going to be something that's relatively quick,
845
00:39:36,285 --> 00:39:38,710
I might not care that it takes 10 seconds to do it.
846
00:39:38,710 --> 00:39:40,910
And it could be way faster to do it with Python.
847
00:39:40,910 --> 00:39:44,070
Whereas with C, if I'm dealing with something like a massive data
848
00:39:44,070 --> 00:39:48,300
set or something huge, then that time is going to really build up on,
849
00:39:48,300 --> 00:39:52,740
it might be worth it to put in the upfront effort and just load it into C,
850
00:39:52,740 --> 00:39:56,260
so the process continually will run faster over a longer period of time.
851
00:39:56,260 --> 00:39:57,430
DAVID J. MALAN: Absolutely, a really good answer.
852
00:39:57,430 --> 00:40:00,300
And let me summarize, is it depends on the workload, if you will.
853
00:40:00,300 --> 00:40:04,050
If you have a very large data set, you might
854
00:40:04,050 --> 00:40:07,128
want to optimize your code to be as fast and performant as it can be,
855
00:40:07,128 --> 00:40:09,420
especially if you're running that code again and again.
856
00:40:09,420 --> 00:40:10,950
Maybe you're a company like Google.
857
00:40:10,950 --> 00:40:13,110
People are searching a huge database all the time.
858
00:40:13,110 --> 00:40:15,750
You really want to squeeze every bit of performance
859
00:40:15,750 --> 00:40:17,222
as you can out of the computer.
860
00:40:17,222 --> 00:40:19,680
You might want to have someone smart take a language like C
861
00:40:19,680 --> 00:40:21,450
and write it at a very low level.
862
00:40:21,450 --> 00:40:22,500
It's going to be painful.
863
00:40:22,500 --> 00:40:23,400
They're going to have bugs.
864
00:40:23,400 --> 00:40:26,150
They're going to have to deal with memory management and the like.
865
00:40:26,150 --> 00:40:29,490
But if and when it works correctly, it's going to be much faster, it would seem.
866
00:40:29,490 --> 00:40:32,280
By contrast, if you have a data set that's big,
867
00:40:32,280 --> 00:40:35,820
and 140,000 words is not small, but you don't
868
00:40:35,820 --> 00:40:38,940
want to spend like 5 hours, 10 hours, a week of your time,
869
00:40:38,940 --> 00:40:41,063
building a spell checker or a dictionary,
870
00:40:41,063 --> 00:40:43,980
you can instead leverage a different language with different libraries
871
00:40:43,980 --> 00:40:48,690
and build on top of it, in order to prioritize the human time instead.
872
00:40:48,690 --> 00:40:50,841
Other thoughts?
873
00:40:50,841 --> 00:40:52,789
AUDIENCE: Would you, because with Python,
874
00:40:52,789 --> 00:40:56,928
doesn't it also like convert the words, or like
875
00:40:56,928 --> 00:40:58,539
convert the words, for a lesson?
876
00:40:58,539 --> 00:41:00,581
When we convert that into the same version again,
877
00:41:00,581 --> 00:41:04,148
do we just take that into view?
878
00:41:04,148 --> 00:41:06,940
DAVID J. MALAN: That's a perfect segue to exactly the next point we
879
00:41:06,940 --> 00:41:09,340
wanted to make, which was, is there something in between?
880
00:41:09,340 --> 00:41:10,360
And indeed there is.
881
00:41:10,360 --> 00:41:12,970
I'm oversimplifying what this language is actually doing.
882
00:41:12,970 --> 00:41:15,280
It's not as stark a difference as saying, like, hey,
883
00:41:15,280 --> 00:41:18,340
Python is four times slower than C. Like that's not the right takeaway.
884
00:41:18,340 --> 00:41:21,460
There are absolutely ways that engineers can optimize languages,
885
00:41:21,460 --> 00:41:23,230
as they have already done for Python.
886
00:41:23,230 --> 00:41:25,840
And in fact, I've configured my settings in such a way
887
00:41:25,840 --> 00:41:28,777
that I've kind of dramatized just how big the difference is.
888
00:41:28,777 --> 00:41:30,610
It is going to be slower, Python, typically,
889
00:41:30,610 --> 00:41:31,930
than the equivalent C program.
890
00:41:31,930 --> 00:41:33,940
But it doesn't have to be as big of a gap
891
00:41:33,940 --> 00:41:37,720
as it is here, because, indeed, among the features you can turn on in Python
892
00:41:37,720 --> 00:41:40,120
is to save some intermediate results.
893
00:41:40,120 --> 00:41:43,360
Technically speaking, yes, Python is interpreting
894
00:41:43,360 --> 00:41:46,690
dictionary.py and these other files, translating them
895
00:41:46,690 --> 00:41:48,203
from one language to another.
896
00:41:48,203 --> 00:41:51,370
But that doesn't mean it has to do that every darn time you run the program.
897
00:41:51,370 --> 00:41:57,020
As you propose, you can save, or cache, C-A-C-H-E, the results of that process.
898
00:41:57,020 --> 00:42:00,440
So that the second time and the third time are actually notably faster.
899
00:42:00,440 --> 00:42:03,430
And, in fact, Python itself, the interpreter, the most popular version
900
00:42:03,430 --> 00:42:05,980
thereof, itself is actually implemented in C.
901
00:42:05,980 --> 00:42:09,290
So you can make sure that your interpreter is as fast as possible.
902
00:42:09,290 --> 00:42:11,350
And what then is maybe the high level takeaway?
903
00:42:11,350 --> 00:42:14,320
Yes, if you are going to try to squeeze every bit of performance
904
00:42:14,320 --> 00:42:17,710
out of your code, and maybe code is constrained.
905
00:42:17,710 --> 00:42:19,150
Maybe you have very small devices.
906
00:42:19,150 --> 00:42:20,770
Maybe it's like a watch nowadays.
907
00:42:20,770 --> 00:42:26,320
Or maybe it's a sensor that's installed in some small format in an appliance,
908
00:42:26,320 --> 00:42:29,710
or in infrastructure, where you don't have much battery life
909
00:42:29,710 --> 00:42:31,630
and you don't have much size, you might want
910
00:42:31,630 --> 00:42:33,710
to minimize just how much work is being done.
911
00:42:33,710 --> 00:42:36,743
And so the faster the code runs, the better it's going to be,
912
00:42:36,743 --> 00:42:38,410
if it's implemented in something low level.
913
00:42:38,410 --> 00:42:42,310
So C is still very commonly used for certain types of applications.
914
00:42:42,310 --> 00:42:45,580
But, again, if you just want to solve real world problems,
915
00:42:45,580 --> 00:42:49,840
and get real work done, and your time is just as, if not more, valuable
916
00:42:49,840 --> 00:42:52,000
than the device you're running it on, long term,
917
00:42:52,000 --> 00:42:55,358
you know what, Python is among the most popular languages as well.
918
00:42:55,358 --> 00:42:58,150
And frankly, if I were implementing a spell checker moving forward,
919
00:42:58,150 --> 00:42:59,710
I'm probably starting with Python.
920
00:42:59,710 --> 00:43:01,543
And I'm not going to waste time implementing
921
00:43:01,543 --> 00:43:04,930
all of that low-level stuff, because the whole point of using newer,
922
00:43:04,930 --> 00:43:09,460
modern languages is to use abstractions that other people have created for you.
923
00:43:09,460 --> 00:43:12,910
And by abstraction, I mean something like the dictionary function,
924
00:43:12,910 --> 00:43:15,370
that just gives you a dictionary, or hash table,
925
00:43:15,370 --> 00:43:19,225
or the equivalent version that I used, which in this case was a set.
926
00:43:19,225 --> 00:43:22,720
All right, any questions, then, on Python thus far?
927
00:43:22,720 --> 00:43:25,730
928
00:43:25,730 --> 00:43:26,710
No, all right.
929
00:43:26,710 --> 00:43:27,710
Oh, yeah, in the middle.
930
00:43:27,710 --> 00:43:29,920
AUDIENCE: Could you compile the Python code,
931
00:43:29,920 --> 00:43:34,610
or is there some, I'd imagine that with the audience that can happen,
932
00:43:34,610 --> 00:43:38,180
but it feels like if you can just come up with a Python compiler,
933
00:43:38,180 --> 00:43:40,093
that would give you the best of both worlds.
934
00:43:40,093 --> 00:43:42,260
DAVID J. MALAN: Really good question or observation,
935
00:43:42,260 --> 00:43:43,718
could you just compile Python code?
936
00:43:43,718 --> 00:43:47,180
Yes, absolutely, this idea of compiling code or interpreting code
937
00:43:47,180 --> 00:43:49,490
is not native to the language itself.
938
00:43:49,490 --> 00:43:52,410
It tends to be native to the conventions that we humans use.
939
00:43:52,410 --> 00:43:54,730
So you could actually write an interpreter for C
940
00:43:54,730 --> 00:43:57,980
that would read it top to bottom, left to right, converting it to, on the fly,
941
00:43:57,980 --> 00:44:01,640
something the computer understands, but historically that's not been the case.
942
00:44:01,640 --> 00:44:03,560
C is generally a compiled language.
943
00:44:03,560 --> 00:44:04,670
But it doesn't have to be.
944
00:44:04,670 --> 00:44:08,010
What Python nowadays is actually doing is what you described earlier.
945
00:44:08,010 --> 00:44:10,220
It technically is, sort of unbeknownst to us,
946
00:44:10,220 --> 00:44:13,970
compiling the code, technically not into 0's and 1's, technically
947
00:44:13,970 --> 00:44:17,510
into something called byte code, which is this intermediate step that
948
00:44:17,510 --> 00:44:21,510
just doesn't take as much time as it would to recompile the whole thing.
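A small sketch of the byte code idea, using only the standard library. The function add is a hypothetical example; the point is that the interpreter has already compiled it to byte code, which the dis module can display.

```python
import dis


def add(x, y):
    return x + y


# Every function object carries its compiled byte code as a bytes value.
bytecode = add.__code__.co_code
print(type(bytecode), len(bytecode))

# dis.dis prints a human-readable listing of those byte code instructions.
dis.dis(add)
```

The exact instructions listed vary between Python versions, so no particular output is shown here.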
949
00:44:21,510 --> 00:44:24,377
And this is an area of research for computer scientists working
950
00:44:24,377 --> 00:44:26,960
in programming languages, to improve these kinds of paradigms.
951
00:44:26,960 --> 00:44:27,500
Why?
952
00:44:27,500 --> 00:44:30,740
Well, honestly, for you and I, the programmer, it's just much easier to,
953
00:44:30,740 --> 00:44:33,800
one, run the code and not worry about the stupid second step
954
00:44:33,800 --> 00:44:35,100
of compiling it all the time.
955
00:44:35,100 --> 00:44:35,600
Why?
956
00:44:35,600 --> 00:44:38,220
It's literally half as many steps for me, the human.
957
00:44:38,220 --> 00:44:40,500
And that's a nice thing to optimize for.
958
00:44:40,500 --> 00:44:44,330
And ultimately, too, you might want all of the fancy features that
959
00:44:44,330 --> 00:44:45,920
come with these other languages.
960
00:44:45,920 --> 00:44:47,960
So you should really just be fine-tuning how
961
00:44:47,960 --> 00:44:51,800
you can enable these features, as opposed to shying away from them here.
962
00:44:51,800 --> 00:44:54,590
And, in fact, the only time I personally ever use C
963
00:44:54,590 --> 00:44:57,950
is from like September to October of every year, during CS50.
964
00:44:57,950 --> 00:45:00,350
Almost every other month do I reach for Python,
965
00:45:00,350 --> 00:45:03,690
or another language called JavaScript, to actually get real work done,
966
00:45:03,690 --> 00:45:07,640
which is not to impugn C. It's just that those other languages tend to be better
967
00:45:07,640 --> 00:45:11,030
fits for the amount of time I have to allocate, and the types of problems
968
00:45:11,030 --> 00:45:11,905
that I want to solve.
969
00:45:11,905 --> 00:45:14,405
All right, let's go ahead and take a five minute break here.
970
00:45:14,405 --> 00:45:17,390
And when we come back, we'll start writing some programs from scratch.
971
00:45:17,390 --> 00:45:18,300
All right.
972
00:45:18,300 --> 00:45:21,740
So let's go ahead and start writing some code from the beginning
973
00:45:21,740 --> 00:45:24,710
here, whereby we start small with some simple examples,
974
00:45:24,710 --> 00:45:28,042
and then we'll build our way up to more sophisticated examples in Python.
975
00:45:28,042 --> 00:45:29,750
But what we'll do along the way is first,
976
00:45:29,750 --> 00:45:31,865
look side by side at what the C code looked
977
00:45:31,865 --> 00:45:34,640
like way back in week 1 or 2 or 3 and so forth,
978
00:45:34,640 --> 00:45:36,890
and then write the corresponding Python code at right.
979
00:45:36,890 --> 00:45:39,530
And then we'll transition just to focusing on Python itself.
980
00:45:39,530 --> 00:45:42,322
What I've done in advance today is I've downloaded some of the code
981
00:45:42,322 --> 00:45:44,930
from the course's website, my source 6 directory, which
982
00:45:44,930 --> 00:45:47,825
contains all of the pre-written C code from weeks past.
983
00:45:47,825 --> 00:45:49,700
But it'll also have copies of the Python code
984
00:45:49,700 --> 00:45:51,660
we'll write here together and look at.
985
00:45:51,660 --> 00:45:55,445
So first, here is hello.c back from week 0.
986
00:45:55,445 --> 00:45:57,323
And this was version 0 of it.
987
00:45:57,323 --> 00:45:58,740
I'm going to go ahead and do this.
988
00:45:58,740 --> 00:46:02,240
I'm going to go ahead and split my code window up here.
989
00:46:02,240 --> 00:46:05,042
I'm going to go ahead and create a new file called hello.py.
990
00:46:05,042 --> 00:46:07,250
And this isn't something you'll typically have to do,
991
00:46:07,250 --> 00:46:08,810
laying your code out side by side.
992
00:46:08,810 --> 00:46:10,880
But I've just clicked the little icon in VS Code
993
00:46:10,880 --> 00:46:14,330
that looks like two columns, that splits my code editor into two places,
994
00:46:14,330 --> 00:46:17,330
so that we can, in fact, see things, for now, side by side,
995
00:46:17,330 --> 00:46:18,788
with my terminal window down below.
996
00:46:18,788 --> 00:46:21,747
All right, now I'm going to go ahead and write the corresponding Python
997
00:46:21,747 --> 00:46:24,560
program on the right, which, recall, was just print, quote
998
00:46:24,560 --> 00:46:27,170
unquote, "Hello, world," and that's it.
999
00:46:27,170 --> 00:46:29,420
Now down in my terminal window, I'm going
1000
00:46:29,420 --> 00:46:33,080
to go ahead and run Python of hello.py, Enter, and voila,
1001
00:46:33,080 --> 00:46:34,450
we've got hello.py working.
1002
00:46:34,450 --> 00:46:36,950
So again, I'm not going to play any further with the C code.
1003
00:46:36,950 --> 00:46:38,930
It's there just to jog your memory left and right.
1004
00:46:38,930 --> 00:46:41,240
So let's now look at a second version of Hello, world
1005
00:46:41,240 --> 00:46:44,452
from that first week, whereby if I go and get hello1.c,
1006
00:46:44,452 --> 00:46:46,160
I'm going to drag that over to the right.
1007
00:46:46,160 --> 00:46:48,980
Whoops, I'm going to go ahead and drag that over to the left here.
1008
00:46:48,980 --> 00:46:51,950
And now, on the right, let's modify hello.py
1009
00:46:51,950 --> 00:46:55,700
to look a little more like this second version in C, all right?
1010
00:46:55,700 --> 00:46:59,867
I want to get an answer from the user as a return value,
1011
00:46:59,867 --> 00:47:01,700
but I also want to get some input from them.
1012
00:47:01,700 --> 00:47:05,420
So from CS50, I'm going to import the function called get_string for now.
1013
00:47:05,420 --> 00:47:07,170
We're going to get rid of that eventually,
1014
00:47:07,170 --> 00:47:08,962
but for now, it's a helpful training wheel.
1015
00:47:08,962 --> 00:47:11,180
And then down here, I'm going to say, answer
1016
00:47:11,180 --> 00:47:14,510
equals get_string quote unquote, "What's your name"?
1017
00:47:14,510 --> 00:47:15,980
Question mark, space.
1018
00:47:15,980 --> 00:47:17,453
But no semicolon, no data type.
1019
00:47:17,453 --> 00:47:19,370
And then I'm going to go ahead and print, just
1020
00:47:19,370 --> 00:47:25,118
like the first example on the slide, Hello, comma space plus answer.
1021
00:47:25,118 --> 00:47:26,660
And now let me go ahead and run this.
1022
00:47:26,660 --> 00:47:29,660
Python, of hello.py, all right, it's asking me what's my name.
1023
00:47:29,660 --> 00:47:30,170
David.
1024
00:47:30,170 --> 00:47:31,370
Hello comma David.
1025
00:47:31,370 --> 00:47:36,507
But it's worth calling attention to the fact that I've also simplified further.
1026
00:47:36,507 --> 00:47:38,840
It's not just that the individual functions are simpler.
1027
00:47:38,840 --> 00:47:42,470
What is also now glaringly omitted from my Python code at right,
1028
00:47:42,470 --> 00:47:44,657
both in this version, and the previous version.
1029
00:47:44,657 --> 00:47:46,115
What did I not bother implementing?
1030
00:47:46,115 --> 00:47:47,267
AUDIENCE: The main code.
1031
00:47:47,267 --> 00:47:49,850
DAVID J. MALAN: Yeah, so I didn't even need to implement main.
1032
00:47:49,850 --> 00:47:53,210
We'll revisit the main function, because having a main function
1033
00:47:53,210 --> 00:47:54,860
actually does solve problems sometimes.
1034
00:47:54,860 --> 00:47:56,090
But it's no longer required.
1035
00:47:56,090 --> 00:47:59,750
In C you have to have that to kick-start the entire process of actually running
1036
00:47:59,750 --> 00:48:00,337
your code.
1037
00:48:00,337 --> 00:48:03,170
And in fact, if you were missing main, as you might have experienced
1038
00:48:03,170 --> 00:48:06,033
if you accidentally compiled helpers.c instead of the file
1039
00:48:06,033 --> 00:48:08,450
that contained main, you would have seen a compiler error.
1040
00:48:08,450 --> 00:48:09,658
In Python it's not necessary.
1041
00:48:09,658 --> 00:48:12,410
In Python you can just jump right in, start programming, and boom,
1042
00:48:12,410 --> 00:48:13,350
you're good to go.
1043
00:48:13,350 --> 00:48:15,225
Especially if it's a small program like this,
1044
00:48:15,225 --> 00:48:18,210
you don't need the added overhead or complexity of a main function.
1045
00:48:18,210 --> 00:48:19,860
So that's one other difference here.
1046
00:48:19,860 --> 00:48:23,390
All right, there are a few other ways we could say Hello, world.
1047
00:48:23,390 --> 00:48:26,160
Recall that I could use a format string.
1048
00:48:26,160 --> 00:48:30,360
So I could put this whole thing in quotes, I could use this f prefix.
1049
00:48:30,360 --> 00:48:33,250
And then let me go ahead and run Python of hello.py again.
1050
00:48:33,250 --> 00:48:35,250
You can perhaps see where we're going with this.
1051
00:48:35,250 --> 00:48:37,170
Let me type my name, David, and here we go.
1052
00:48:37,170 --> 00:48:39,570
OK, that's the mistake that someone identified earlier,
1053
00:48:39,570 --> 00:48:41,040
you need the curly braces.
1054
00:48:41,040 --> 00:48:44,940
Otherwise no variables are interpolated, that is substituted,
1055
00:48:44,940 --> 00:48:46,390
with their actual values.
1056
00:48:46,390 --> 00:48:50,160
So if I go back in and add those curly braces to the f-string,
1057
00:48:50,160 --> 00:48:54,632
now let me run Python of hello.py, type in my name, and there we go.
1058
00:48:54,632 --> 00:48:55,590
We're back in business.
1059
00:48:55,590 --> 00:48:56,388
Which one's better?
1060
00:48:56,388 --> 00:48:57,180
I mean, it depends.
1061
00:48:57,180 --> 00:49:00,540
But generally speaking, making shorter, more concise code
1062
00:49:00,540 --> 00:49:01,870
tends to be a good thing.
1063
00:49:01,870 --> 00:49:06,450
So stylistically, the f-string is probably a reasonable instinct to have.
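A short sketch of the three variations just discussed. The variable answer is hard-coded here as a stand-in for whatever the user would type, so the example runs without prompting.

```python
# Stand-in for the value that input("What's your name? ") would return.
answer = "David"

# Version 1: string concatenation with +.
concatenated = "hello, " + answer

# Version 2: f prefix but no curly braces, so nothing is interpolated.
missing_braces = f"hello, answer"

# Version 3: f-string with curly braces, so the variable's value is substituted.
interpolated = f"hello, {answer}"

print(concatenated)
print(missing_braces)
print(interpolated)
```

Versions 1 and 3 produce the same greeting; version 2 prints the literal word answer, which is the mistake shown in the demonstration.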
1064
00:49:06,450 --> 00:49:09,280
All right, well, what more can we do besides this?
1065
00:49:09,280 --> 00:49:12,180
Well, let me go ahead here and let's get rid of the training wheel
1066
00:49:12,180 --> 00:49:13,230
altogether, actually.
1067
00:49:13,230 --> 00:49:15,180
So same C code at left.
1068
00:49:15,180 --> 00:49:18,150
Let me get rid of the CS50 library, which we will ultimately,
1069
00:49:18,150 --> 00:49:19,620
in a couple of weeks, anyway.
1070
00:49:19,620 --> 00:49:22,560
I can't use get_string, but I can use a function
1071
00:49:22,560 --> 00:49:24,730
that comes with Python called input.
1072
00:49:24,730 --> 00:49:28,050
And, in fact, this is actually a one-for-one substitution, pretty much.
1073
00:49:28,050 --> 00:49:31,380
There's really no downside to using input instead of get_string.
1074
00:49:31,380 --> 00:49:33,420
We implement get_string just for consistency
1075
00:49:33,420 --> 00:49:37,800
with what you saw in C. Python of hello.py, what's your name, David.
1076
00:49:37,800 --> 00:49:39,310
Still actually works the same.
1077
00:49:39,310 --> 00:49:41,227
So gone are the CS50 specific training wheels.
1078
00:49:41,227 --> 00:49:43,227
But we're going to bring them back shortly, just
1079
00:49:43,227 --> 00:49:45,240
to deal with integers or floats or other values,
1080
00:49:45,240 --> 00:49:47,490
too, because it's going to make our lives a little simpler,
1081
00:49:47,490 --> 00:49:48,510
with error checking.
1082
00:49:48,510 --> 00:49:52,350
All right, any questions, before we now pivot to revisiting other examples
1083
00:49:52,350 --> 00:49:56,280
from week 1, but now in Python?
1084
00:49:56,280 --> 00:49:58,110
All right, let me go ahead and open up now.
1085
00:49:58,110 --> 00:50:03,240
Let's say calculator0.c, which was one of the first examples we did involving
1086
00:50:03,240 --> 00:50:06,870
math and operators like that, as well as functions like get_int,
1087
00:50:06,870 --> 00:50:11,820
let me go ahead and create a new file now called calculator.py,
1088
00:50:11,820 --> 00:50:15,360
at right, so that I have my C code at left still,
1089
00:50:15,360 --> 00:50:16,950
and my Python code at right.
1090
00:50:16,950 --> 00:50:20,610
All right, let me go dive into a translation of this code into Python.
1091
00:50:20,610 --> 00:50:23,100
I am going to use get_int from the CS50 library.
1092
00:50:23,100 --> 00:50:24,960
So let me import that.
1093
00:50:24,960 --> 00:50:27,340
I'm going to go ahead now and get an int from the user.
1094
00:50:27,340 --> 00:50:31,000
So x equals get_int, and I'll ask them for an x value,
1095
00:50:31,000 --> 00:50:32,430
just like we did weeks ago.
1096
00:50:32,430 --> 00:50:37,800
No need to specify a semicolon, though, or an int for the x.
1097
00:50:37,800 --> 00:50:38,940
It will just figure it out.
1098
00:50:38,940 --> 00:50:42,090
y is going to get another int via y colon,
1099
00:50:42,090 --> 00:50:46,830
and then down here, I'm going to go ahead and say print of x plus y.
1100
00:50:46,830 --> 00:50:48,720
So this is already a bit new.
1101
00:50:48,720 --> 00:50:53,400
Recall, the C version required that I use this format string, as well
1102
00:50:53,400 --> 00:50:54,428
as printf itself.
1103
00:50:54,428 --> 00:50:56,220
Python is just a little more user-friendly.
1104
00:50:56,220 --> 00:50:59,670
If all you want to do is print out a value, like x plus y, just print it.
1105
00:50:59,670 --> 00:51:02,610
Don't futz with any percent signs or format codes.
1106
00:51:02,610 --> 00:51:05,160
It's not printf, it's indeed just print now.
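A minimal sketch of the calculator's final step. The values of x and y are hard-coded as stand-ins for what get_int would return, so the example does not depend on the CS50 library or on user input.

```python
# Stand-ins for the integers the user would type at the "x: " and "y: " prompts.
x = 1
y = 2

# No format string or %i needed: print accepts the integer result directly.
total = x + y
print(total)
```

In C the same line would need printf with a format code; here print converts the integer to text on its own.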
1107
00:51:05,160 --> 00:51:08,610
All right, let me go ahead and run Python of calculator.py,
1108
00:51:08,610 --> 00:51:13,620
Enter, just do a quick sample, 1 plus 2 indeed equals 3.
1109
00:51:13,620 --> 00:51:16,410
As an aside, suppose I had taken a different approach
1110
00:51:16,410 --> 00:51:19,508
by importing the whole CS50 library. Functionally, it's the same.
1111
00:51:19,508 --> 00:51:21,550
You're not going to notice any performance impact here.
1112
00:51:21,550 --> 00:51:22,690
It's a small library.
1113
00:51:22,690 --> 00:51:25,680
But notice what does not work now, whereas it did work
1114
00:51:25,680 --> 00:51:31,110
in C. Python of calculator.py, Enter, we see our first traceback deliberately
1115
00:51:31,110 --> 00:51:31,690
here.
1116
00:51:31,690 --> 00:51:33,570
So a traceback is just a term of art that
1117
00:51:33,570 --> 00:51:37,210
says, here is a trace back through all of the functions
1118
00:51:37,210 --> 00:51:38,250
that just got executed.
1119
00:51:38,250 --> 00:51:40,170
In the world of C, you might call this a stack
1120
00:51:40,170 --> 00:51:42,937
trace, stack being the operative word.
1121
00:51:42,937 --> 00:51:45,270
Recall that when we talked about the stack and the heap,
1122
00:51:45,270 --> 00:51:48,077
the stack, like a stack of trays, was all of the functions that
1123
00:51:48,077 --> 00:51:49,660
might get called, one after the other.
1124
00:51:49,660 --> 00:51:54,330
We had main, we had swap, then swap went away, and then main finished, recall.
1125
00:51:54,330 --> 00:51:58,020
So here's a trace back of all of the functions or code that got executed.
1126
00:51:58,020 --> 00:52:00,880
There's not really any functions other than my file itself.
1127
00:52:00,880 --> 00:52:02,350
Otherwise there'd be more detail.
1128
00:52:02,350 --> 00:52:05,580
But even though it's a little cryptic, we can perhaps infer from the output
1129
00:52:05,580 --> 00:52:09,960
here, NameError, so something related to the name of something, name, getInt
1130
00:52:09,960 --> 00:52:10,950
is not defined.
1131
00:52:10,950 --> 00:52:14,190
And this of course, happens on line 3 over there.
1132
00:52:14,190 --> 00:52:15,520
All right, so why is that?
1133
00:52:15,520 --> 00:52:19,170
Well, Python essentially allows us to namespace
1134
00:52:19,170 --> 00:52:21,750
our functions that come from libraries.
1135
00:52:21,750 --> 00:52:25,290
There was a problem in C. If you were using the CS50 library,
1136
00:52:25,290 --> 00:52:27,180
and thus had access to getInt, getString,
1137
00:52:27,180 --> 00:52:29,850
and so forth, you could not use another library
1138
00:52:29,850 --> 00:52:31,590
that had the same function names.
1139
00:52:31,590 --> 00:52:33,510
They would collide, and the compiler would not
1140
00:52:33,510 --> 00:52:36,030
know how to link them together correctly.
1141
00:52:36,030 --> 00:52:41,520
In Python, and other languages like JavaScript, and in Java,
1142
00:52:41,520 --> 00:52:45,270
you have support for effectively what would be called namespaces.
1143
00:52:45,270 --> 00:52:50,370
You can isolate variables and function names to their own namespace,
1144
00:52:50,370 --> 00:52:52,590
like their own container in memory.
1145
00:52:52,590 --> 00:52:55,560
And what this means is, if you import all of CS50,
1146
00:52:55,560 --> 00:52:59,730
you have to say that the getInt you want is inside the CS50 library.
1147
00:52:59,730 --> 00:53:03,180
So just like with the image blurring, and the image edges
1148
00:53:03,180 --> 00:53:08,430
before, where I had to specify image dot and image filter dot, similarly here,
1149
00:53:08,430 --> 00:53:11,970
am I specifying with a dot operator, albeit a little differently, that I
1150
00:53:11,970 --> 00:53:14,410
want CS50.getInt in both places.
1151
00:53:14,410 --> 00:53:18,120
And now if I rerun Python of Calculator.py, 1 and 2,
1152
00:53:18,120 --> 00:53:19,860
now we're back in business.
1153
00:53:19,860 --> 00:53:20,790
Which one is better?
1154
00:53:20,790 --> 00:53:24,790
Generally speaking, it depends on just how many functions
1155
00:53:24,790 --> 00:53:26,040
you're using from the library.
1156
00:53:26,040 --> 00:53:29,040
If you're using a whole bunch of functions, just import the whole thing.
1157
00:53:29,040 --> 00:53:33,333
If you're only using maybe one or two, import them line by line.
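A minimal sketch of the two import styles and the namespacing they give you. It uses the standard library's math module as a stand-in for the CS50 library so that it runs anywhere; the local sqrt function is a hypothetical name added only to show that nothing collides.

```python
# Style 1: import the whole module; its names stay inside the module's namespace.
import math

# Style 2: import a single name directly into this file's namespace.
from math import floor


def sqrt(x):
    """A hypothetical function of our own that shares a name with math.sqrt."""
    return "my own sqrt"


# The module's version is only reachable with the "math." prefix,
# so it does not collide with the sqrt defined above.
library_result = math.sqrt(16)
own_result = sqrt(16)

# floor was imported by name, so no prefix is needed.
floored = floor(2.7)

print(library_result, own_result, floored)
```

Importing the whole module keeps the origin of each function visible at the call site; importing by name is shorter when only one or two functions are needed.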
1158
00:53:33,333 --> 00:53:35,750
All right, so let's go ahead and make a little tweak here.
1159
00:53:35,750 --> 00:53:38,917
Let's get rid of this library and take this training wheel off,
1160
00:53:38,917 --> 00:53:41,750
too, as quickly as we introduced it, though for problem set six
1161
00:53:41,750 --> 00:53:44,310
you'll be able to use all of these same functions.
1162
00:53:44,310 --> 00:53:48,110
Suppose I get rid of this, and I just use the input function,
1163
00:53:48,110 --> 00:53:51,710
just like I did by replacing getString earlier.
1164
00:53:51,710 --> 00:53:54,710
Let me go ahead now and run this version of the code.
1165
00:53:54,710 --> 00:54:00,964
Python of Calculator.py, OK, how about 1 plus 2 equals 3.
1166
00:54:00,964 --> 00:54:02,660
Huh.
1167
00:54:02,660 --> 00:54:05,330
All right, obviously wrong, incorrect.
1168
00:54:05,330 --> 00:54:09,890
Can anyone explain what just happened, based on instincts?
1169
00:54:09,890 --> 00:54:10,890
What just happened here.
1170
00:54:10,890 --> 00:54:11,390
Yeah.
1171
00:54:11,390 --> 00:54:12,620
AUDIENCE: You want an answer?
1172
00:54:12,620 --> 00:54:13,745
DAVID J. MALAN: Sure, yeah.
1173
00:54:13,745 --> 00:54:17,930
AUDIENCE: Say you have a number of strings that don't have Ints,
1174
00:54:17,930 --> 00:54:21,320
so you would part with them and say, printing one, two, better.
1175
00:54:21,320 --> 00:54:24,650
DAVID J. MALAN: Exactly, Python is interpreting, or treating,
1176
00:54:24,650 --> 00:54:26,810
both x and y as strings, which is actually
1177
00:54:26,810 --> 00:54:29,120
what the input function returns by default.
1178
00:54:29,120 --> 00:54:32,150
And so plus is now being interpreted as concatenation, as we defined it
1179
00:54:32,150 --> 00:54:32,660
earlier.
1180
00:54:32,660 --> 00:54:35,780
So x plus y isn't x plus y mathematically,
1181
00:54:35,780 --> 00:54:38,480
but in terms of string joining, just like in Scratch.
1182
00:54:38,480 --> 00:54:41,690
So that's why we're getting 12, or really one two,
1183
00:54:41,690 --> 00:54:43,040
which isn't itself a number.
1184
00:54:43,040 --> 00:54:44,180
It, too, is another string.
1185
00:54:44,180 --> 00:54:45,950
So we somehow need to convert things.
1186
00:54:45,950 --> 00:54:49,040
And we didn't have this ability quite as easily in C.
1187
00:54:49,040 --> 00:54:52,670
We did have, like, the atoi function, ASCII to integer,
1188
00:54:52,670 --> 00:54:54,270
which did allow you to do this.
1189
00:54:54,270 --> 00:54:59,390
The analog in Python is actually just to do a cast, a typecast, using int.
1190
00:54:59,390 --> 00:55:02,750
So just like in C, you can use the keyword int,
1191
00:55:02,750 --> 00:55:04,500
but you use it a little differently.
1192
00:55:04,500 --> 00:55:09,300
Notice that I'm not doing parenthesis int close parenthesis before the value.
1193
00:55:09,300 --> 00:55:11,010
I'm using int as a function.
1194
00:55:11,010 --> 00:55:13,430
So indeed, in Python, int is a function.
1195
00:55:13,430 --> 00:55:16,610
float is a function, that you can pass values into,
1196
00:55:16,610 --> 00:55:18,270
to do this kind of conversion.
1197
00:55:18,270 --> 00:55:22,010
So now, if I run Python of Calculator.py, 1 and 2,
1198
00:55:22,010 --> 00:55:25,430
now we're back in business, and getting the answer of 3.
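A small sketch of the difference just described, with the strings hard-coded in place of what input would return, so the example runs without a prompt. The variable names are illustrative only.

```python
# input() always returns a str, so these stand in for the user typing 1 and 2.
x = "1"
y = "2"

# With two strings, + means concatenation.
concatenated = x + y

# Converting with int() first makes + mean arithmetic addition.
total = int(x) + int(y)

print(concatenated)
print(total)
```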
1199
00:55:25,430 --> 00:55:27,240
But there's kind of a catch here.
1200
00:55:27,240 --> 00:55:28,430
There's always going to be a trade-off.
1201
00:55:28,430 --> 00:55:30,560
Like that sounds amazing that it just works in this way.
1202
00:55:30,560 --> 00:55:32,450
We can throw away the CS50 library already.
1203
00:55:32,450 --> 00:55:37,130
But what if the user accidentally types, or maliciously types in,
1204
00:55:37,130 --> 00:55:39,035
like a cat, instead of a number.
1205
00:55:39,035 --> 00:55:40,910
Damn, well, there's one of these tracebacks.
1206
00:55:40,910 --> 00:55:42,780
Like, now my program has crashed.
1207
00:55:42,780 --> 00:55:45,342
This is similar in spirit to the kinds of segfaults
1208
00:55:45,342 --> 00:55:46,550
that you might have had in C.
1209
00:55:46,550 --> 00:55:47,840
But they're not segfaults per se.
1210
00:55:47,840 --> 00:55:49,507
It doesn't necessarily relate to memory.
1211
00:55:49,507 --> 00:55:55,290
This time it relates to actual runtime values, not being as expected.
1212
00:55:55,290 --> 00:55:58,250
So this time it's not a NameError, it's a ValueError,
1213
00:55:58,250 --> 00:56:02,580
invalid literal for int() with base 10, quote unquote "cat."
1214
00:56:02,580 --> 00:56:06,800
So, again, it's written for sort of a programmer, more than sort
1215
00:56:06,800 --> 00:56:09,650
of a typical person, because it's pretty arcane, the language here.
1216
00:56:09,650 --> 00:56:10,900
But let's try to interpret it.
1217
00:56:10,900 --> 00:56:14,862
Invalid literal, a literal is just something someone typed, for int, which
1218
00:56:14,862 --> 00:56:16,320
is the function name, with base 10.
1219
00:56:16,320 --> 00:56:18,170
It's just defaulting to decimal numbers.
1220
00:56:18,170 --> 00:56:20,415
Cat is apparently not a decimal number.
1221
00:56:20,415 --> 00:56:23,040
It doesn't look like it, therefore it can't be treated like it.
1222
00:56:23,040 --> 00:56:24,930
Therefore, there's a value error.
1223
00:56:24,930 --> 00:56:26,750
So what can we do?
1224
00:56:26,750 --> 00:56:30,200
Unfortunately, you would have to somehow catch this error.
1225
00:56:30,200 --> 00:56:32,450
And the only way to do that in Python really
1226
00:56:32,450 --> 00:56:34,970
is by way of another feature that C did not have,
1227
00:56:34,970 --> 00:56:37,400
namely, what are called exceptions.
1228
00:56:37,400 --> 00:56:42,080
An exception is exactly what just happened, NameError, ValueError.
1229
00:56:42,080 --> 00:56:45,590
They are things that can go wrong when your Python code is running,
1230
00:56:45,590 --> 00:56:50,670
that aren't necessarily going to be detected until you run your code.
1231
00:56:50,670 --> 00:56:56,240
So in Python, and in JavaScript, and in Java, and other more modern languages,
1232
00:56:56,240 --> 00:56:59,240
there's this ability to actually try to do something,
1233
00:56:59,240 --> 00:57:01,015
except if something goes wrong.
1234
00:57:01,015 --> 00:57:03,140
And in fact, I'm going to introduce a bit of syntax
1235
00:57:03,140 --> 00:57:05,557
here, even though we won't have to use this much just yet.
1236
00:57:05,557 --> 00:57:09,980
Instead of just blindly converting x to an Int, let me go ahead
1237
00:57:09,980 --> 00:57:11,970
and try to do that.
1238
00:57:11,970 --> 00:57:15,380
And if there's an exception, go ahead and say something
1239
00:57:15,380 --> 00:57:22,280
like print, that is not an Int.
1240
00:57:22,280 --> 00:57:25,538
And then I'm going to do something like exit, right there.
1241
00:57:25,538 --> 00:57:27,080
And let me go ahead and do this here.
1242
00:57:27,080 --> 00:57:31,370
Let me try to get y, except if there's an exception.
1243
00:57:31,370 --> 00:57:35,997
Then let me go ahead and say, again, that is not an Int exclamation point.
1244
00:57:35,997 --> 00:57:38,330
And then I'm going to exit from there, too. Otherwise I'll
1245
00:57:38,330 --> 00:57:39,860
go ahead and print x plus y.
1246
00:57:39,860 --> 00:57:46,460
If I run Python of Calculator.py now, whoops, oh,
1247
00:57:46,460 --> 00:57:48,680
forgot my close quote, sorry.
1248
00:57:48,680 --> 00:57:54,560
All right, so close quote, Python of Calculator.py, 1 and 2 still work.
1249
00:57:54,560 --> 00:57:57,800
But if I try to type in something wrong like cat, now
1250
00:57:57,800 --> 00:57:59,310
it actually detects the error.
1251
00:57:59,310 --> 00:58:01,850
So what is the CS50 library in Python doing?
1252
00:58:01,850 --> 00:58:05,600
It's actually doing that try and except for you, because suffice it to say,
1253
00:58:05,600 --> 00:58:08,540
otherwise your programs for something simple, like a calculator,
1254
00:58:08,540 --> 00:58:09,900
start to get longer and longer.
1255
00:58:09,900 --> 00:58:13,160
So we factored that kind of logic out to the CS50 getInt
1256
00:58:13,160 --> 00:58:14,690
function and getFloat function.
1257
00:58:14,690 --> 00:58:18,783
But underneath the hood, they're essentially doing this, try except,
1258
00:58:18,783 --> 00:58:20,450
but they're being a little more precise.
1259
00:58:20,450 --> 00:58:24,450
They're detecting a specific error, and they are doing it in a loop,
1260
00:58:24,450 --> 00:58:27,050
so that these functions will get executed again and again.
1261
00:58:27,050 --> 00:58:30,710
In fact, the best way to do this is to say except if there's a ValueError,
1262
00:58:30,710 --> 00:58:34,078
then print that error message out to the user.
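A hedged sketch of the try/except idea, catching ValueError specifically and retrying in a loop. This is not the CS50 library's actual code: parse_int and first_valid_int are hypothetical names, and a list of attempts stands in for repeated prompts so the example runs without user input.

```python
def parse_int(text):
    """Return text converted to an int, or None if it does not look like one."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for input such as "cat".
        return None


def first_valid_int(attempts):
    """Simulate re-prompting: return the first attempt that parses as an int."""
    for attempt in attempts:
        value = parse_int(attempt)
        if value is not None:
            return value
        print(f"{attempt!r} is not an int!")
    # Every attempt failed.
    return None


result = first_valid_int(["cat", "dog", "3"])
print(result)
```

Catching ValueError by name, rather than every exception, means unrelated problems still surface as tracebacks instead of being silently reported as bad input.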
1263
00:58:34,078 --> 00:58:36,870
And again, let's not get too into the weeds here with this feature.
1264
00:58:36,870 --> 00:58:38,760
We've already put it into the CS50 library.
1265
00:58:38,760 --> 00:58:41,060
But that's why, for instance, we bootstrap things,
1266
00:58:41,060 --> 00:58:44,420
by just using these functions out of the box.
1267
00:58:44,420 --> 00:58:47,610
All right, let's do something more with our calculator here.
1268
00:58:47,610 --> 00:58:49,010
How about this.
1269
00:58:49,010 --> 00:58:51,890
In the world of C, we had another version
1270
00:58:51,890 --> 00:58:56,990
of this code, which actually did some division by way of--
1271
00:58:56,990 --> 00:59:01,680
which actually did division of numbers, not just the addition herein.
1272
00:59:01,680 --> 00:59:05,990
So let me go ahead and close the C version, and let's focus only on Python
1273
00:59:05,990 --> 00:59:07,942
now, doing some of these same lines of code.
1274
00:59:07,942 --> 00:59:09,650
But I'm going to go ahead and just assume
1275
00:59:09,650 --> 00:59:12,140
that the user is going to cooperate and use proper input.
1276
00:59:12,140 --> 00:59:16,310
So from CS50, import getInt, that will deal with any errors for me.
1277
00:59:16,310 --> 00:59:23,640
X gets getInt, ask the user for an Int x, y equals getInt,
1278
00:59:23,640 --> 00:59:25,170
ask the user for an Int y.
1279
00:59:25,170 --> 00:59:27,010
And then, let's go ahead and do this.
1280
00:59:27,010 --> 00:59:31,110
Let's declare a variable called z, set it equal to x divided by y.
1281
00:59:31,110 --> 00:59:32,850
Then let's go ahead and print z.
1282
00:59:32,850 --> 00:59:37,240
Still no need for a format string, I can just print out the variable's value.
1283
00:59:37,240 --> 00:59:39,240
Let me go ahead and run Python of Calculator.py.
1284
00:59:39,240 --> 00:59:43,650
Let me do 1, 10, and I get 0.1.
1285
00:59:43,650 --> 00:59:49,260
What did I get in C, though, if you think back.
1286
00:59:49,260 --> 00:59:52,076
What would have happened in C?
1287
00:59:52,076 --> 00:59:53,420
AUDIENCE: Zero?
1288
00:59:53,420 --> 00:59:55,640
DAVID J. MALAN: Yeah, we would have gotten zero in C.
1289
00:59:55,640 --> 00:59:57,998
But why, in C, when you divide one Int by another,
1290
00:59:57,998 --> 00:59:59,915
and those Ints are like 1 and 10 respectively?
1291
00:59:59,915 --> 01:00:01,677
AUDIENCE: It'll give you an integer back.
1292
01:00:01,677 --> 01:00:03,260
DAVID J. MALAN: It will give you what?
1293
01:00:03,260 --> 01:00:04,343
AUDIENCE: An integer back.
1294
01:00:04,343 --> 01:00:07,910
DAVID J. MALAN: It will give you an integer back, and, unfortunately, 0.1,
1295
01:00:07,910 --> 01:00:09,860
the integer part of it is indeed zero.
1296
01:00:09,860 --> 01:00:11,970
So this was an example of truncation.
1297
01:00:11,970 --> 01:00:14,540
So truncation was an issue in C. But it would
1298
01:00:14,540 --> 01:00:17,450
seem as though this is no longer a problem in Python,
1299
01:00:17,450 --> 01:00:21,290
insofar as the division operator actually handles that for us.
1300
01:00:21,290 --> 01:00:24,230
As an aside, if you want the old behavior, because it actually
1301
01:00:24,230 --> 01:00:27,020
is sometimes useful for rounding or flooring values,
1302
01:00:27,020 --> 01:00:29,570
you can actually use two slashes.
1303
01:00:29,570 --> 01:00:31,620
And now you get the C behavior.
1304
01:00:31,620 --> 01:00:33,710
So that now 1 divided by 10 is zero.
1305
01:00:33,710 --> 01:00:36,230
So you don't give up that capability, but at least it
1306
01:00:36,230 --> 01:00:37,610
does a more sensible default.
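A short sketch comparing the two division operators, with x and y hard-coded in place of user input. One caveat worth labeling: Python's // floors toward negative infinity, so it matches C's truncating division only when the result is not negative; the negative case below is an addition to what the lecture covers.

```python
x, y = 1, 10

# A single slash is true division and yields a float.
true_quotient = x / y

# A double slash is floor division and yields an int for int operands.
floor_quotient = x // y

# For negative results the two languages differ:
# Python floors (rounds down), whereas C truncates toward zero.
negative_floor = -1 // 10
negative_trunc = int(-1 / 10)  # int() truncates toward zero, like C

print(true_quotient, floor_quotient, negative_floor, negative_trunc)
```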
1307
01:00:37,610 --> 01:00:41,030
Most people, especially new programmers, when dividing one value by another,
1308
01:00:41,030 --> 01:00:44,000
would want to get 0.1, not 0, for reasons
1309
01:00:44,000 --> 01:00:46,100
that indeed we had to explain weeks ago.
1310
01:00:46,100 --> 01:00:49,940
But what about another problem we had with the world of floats before,
1311
01:00:49,940 --> 01:00:52,040
whereby there is imprecision?
1312
01:00:52,040 --> 01:00:54,980
Let me go ahead and, somewhat cryptically, print out the value of z
1313
01:00:54,980 --> 01:00:55,860
as follows.
1314
01:00:55,860 --> 01:00:58,340
I'm going to format it using an f-string.
1315
01:00:58,340 --> 01:01:02,720
And I'm going to go ahead and format, not just z, because this is essentially
1316
01:01:02,720 --> 01:01:03,450
the same thing.
1317
01:01:03,450 --> 01:01:06,620
Notice this, if I do Python of Calculator.py, 1 and 10,
1318
01:01:06,620 --> 01:01:09,770
I get, by default, just one significant digit.
1319
01:01:09,770 --> 01:01:13,920
But if I use this syntax in Python, which we won't have to use often,
1320
01:01:13,920 --> 01:01:16,550
I can actually do, like I did in C before,
1321
01:01:16,550 --> 01:01:19,650
50 significant digits after the decimal point.
1322
01:01:19,650 --> 01:01:24,020
So now let me rerun Python of Calculator.py 1 and 10,
1323
01:01:24,020 --> 01:01:26,990
and let's see if floating point imprecision is still with us.
1324
01:01:26,990 --> 01:01:28,280
Unfortunately, it is.
1325
01:01:28,280 --> 01:01:30,950
And you can see as much here, the f-string, the format string,
1326
01:01:30,950 --> 01:01:33,990
is just showing us now 50 digits instead of the default one.
1327
01:01:33,990 --> 01:01:36,110
So we've not solved all problems.
1328
01:01:36,110 --> 01:01:38,845
But we have solved at least some.
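A minimal sketch of the formatting step, assuming the f-string format specifier .50f is the syntax being typed on screen (the transcript does not show it). The variable names are illustrative.

```python
z = 1 / 10

# Default formatting shows the shortest representation that round-trips.
default_text = f"{z}"

# Asking for 50 digits after the decimal point exposes the stored binary approximation.
precise_text = f"{z:.50f}"

print(default_text)
print(precise_text)
```

The second line printed does not end in a run of zeros, which is the floating-point imprecision the lecture points out.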
1329
01:01:38,845 --> 01:01:41,720
All right, before we pivot away from a mere calculator, any questions
1330
01:01:41,720 --> 01:01:45,350
now on syntax or concepts or the like?
1331
01:01:45,350 --> 01:01:46,070
Yeah.
1332
01:01:46,070 --> 01:01:49,320
AUDIENCE: Do you think the double slash you get
1333
01:01:49,320 --> 01:01:51,937
has merit, how do you comment on that?
1334
01:01:51,937 --> 01:01:53,270
DAVID J. MALAN: How do you what?
1335
01:01:53,270 --> 01:01:54,228
Oh, how do you comment.
1336
01:01:54,228 --> 01:01:57,410
Really good question, if you're using double slash for division
1337
01:01:57,410 --> 01:01:59,870
with flooring or truncation, like I described,
1338
01:01:59,870 --> 01:02:01,850
how do you do a comment in Python.
1339
01:02:01,850 --> 01:02:03,380
This is a comment.
1340
01:02:03,380 --> 01:02:05,930
And the convention is actually to use a complete sentence,
1341
01:02:05,930 --> 01:02:07,473
like with a capital T here.
1342
01:02:07,473 --> 01:02:09,890
You don't need a period unless there's multiple sentences.
1343
01:02:09,890 --> 01:02:12,840
And technically, it should be above the line of code by convention.
1344
01:02:12,840 --> 01:02:15,120
So you would use a hash symbol instead.
1345
01:02:15,120 --> 01:02:16,080
Good question.
1346
01:02:16,080 --> 01:02:17,420
We haven't seen those yet.
1347
01:02:17,420 --> 01:02:20,750
All right, let's go ahead and make something else here, how about.
1348
01:02:20,750 --> 01:02:23,430
Let me go ahead and open up, for instance,
1349
01:02:23,430 --> 01:02:29,090
an example called Points1.c, which we saw a few weeks back.
1350
01:02:29,090 --> 01:02:33,530
And let me go ahead on the other side and create a file called Points.py.
1351
01:02:33,530 --> 01:02:36,890
This was a program, recall, that asked the user how many points they
1352
01:02:36,890 --> 01:02:39,388
lost on the first assignment.
1353
01:02:39,388 --> 01:02:41,180
And then it went ahead and just printed out
1354
01:02:41,180 --> 01:02:43,790
whether they lost fewer points than me, because I lost two,
1355
01:02:43,790 --> 01:02:47,117
if you recall the photo, more points than me, or the same points as me.
1356
01:02:47,117 --> 01:02:49,700
Let me go ahead and zoom out so we can see a bit more of this.
1357
01:02:49,700 --> 01:02:54,208
And let me now, on the top right here, go about implementing this in Python.
1358
01:02:54,208 --> 01:02:56,750
So I want to first prompt the user for some number of points.
1359
01:02:56,750 --> 01:03:00,540
So from CS50 let's import getInt, so it handles the error-checking.
1360
01:03:00,540 --> 01:03:03,410
Let's then do points equals getInt, and ask
1361
01:03:03,410 --> 01:03:07,430
the user, how many points did you lose, question mark.
1362
01:03:07,430 --> 01:03:11,990
Then let's go ahead and say, if points less than two, which was my value,
1363
01:03:11,990 --> 01:03:15,800
print, you lost fewer points than me.
1364
01:03:15,800 --> 01:03:23,270
Otherwise, if it's else if points greater than 2, go ahead and print,
1365
01:03:23,270 --> 01:03:27,070
you lost more points than me.
1366
01:03:27,070 --> 01:03:30,800
Else let's go ahead and handle the final scenario, which is you
1367
01:03:30,800 --> 01:03:34,600
lost the same number of points as me.
1368
01:03:34,600 --> 01:03:39,230
Before I run this, does anyone want to point out a mistake I've already made?
1369
01:03:39,230 --> 01:03:39,730
Yeah.
1370
01:03:39,730 --> 01:03:41,390
AUDIENCE: Else if has to be elif.
1371
01:03:41,390 --> 01:03:44,690
DAVID J. MALAN: Yeah, so else if in C is actually now elif in Python.
1372
01:03:44,690 --> 01:03:45,780
It's a single word.
1373
01:03:45,780 --> 01:03:49,790
So let me change this to elif, and now cross my fingers, Python of Points.py,
1374
01:03:49,790 --> 01:03:53,330
suppose you lost three points on some assignment.
1375
01:03:53,330 --> 01:03:55,190
You lost more points than my two.
1376
01:03:55,190 --> 01:03:57,808
If you only lost one point, you lost fewer points than me.
1377
01:03:57,808 --> 01:03:58,850
So the logic is the same.
1378
01:03:58,850 --> 01:04:01,040
But notice the code is much tighter.
1379
01:04:01,040 --> 01:04:04,700
In 10 total lines, we did what took 24 lines, because we've
1380
01:04:04,700 --> 01:04:06,350
thrown away a lot of the syntax.
1381
01:04:06,350 --> 01:04:08,370
The curly braces are no longer necessary.
1382
01:04:08,370 --> 01:04:10,230
The parentheses are gone, the semicolons.
1383
01:04:10,230 --> 01:04:13,670
So this is why it just tends to be more pleasant pretty quickly,
1384
01:04:13,670 --> 01:04:16,310
using a language like this.
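A sketch of the points comparison after the elif fix. To keep it runnable without a prompt, the logic is wrapped in a hypothetical compare_points function that takes the number directly instead of calling getInt; MY_POINTS is likewise an illustrative name.

```python
MY_POINTS = 2  # the lecturer's own value from the example


def compare_points(points):
    """Return the message for a given number of lost points."""
    if points < MY_POINTS:
        return "You lost fewer points than me."
    elif points > MY_POINTS:  # elif, not "else if"
        return "You lost more points than me."
    else:
        return "You lost the same number of points as me."


print(compare_points(3))
print(compare_points(1))
```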
1385
01:04:16,310 --> 01:04:18,770
All right, let's do one other example here.
1386
01:04:18,770 --> 01:04:23,000
In C, recall that we were able to determine the parity of some number,
1387
01:04:23,000 --> 01:04:24,590
if something is even or odd.
1388
01:04:24,590 --> 01:04:29,000
Well, in Python, let me go ahead and create a file called Parity.py,
1389
01:04:29,000 --> 01:04:32,810
and let's look for a moment at the C version at left.
1390
01:04:32,810 --> 01:04:36,680
Here was the code in C that we used to determine the parity of a number.
1391
01:04:36,680 --> 01:04:39,800
And, really, the key takeaway from all these lines
1392
01:04:39,800 --> 01:04:41,290
was just the remainder operator.
1393
01:04:41,290 --> 01:04:42,540
And that one is still with us.
1394
01:04:42,540 --> 01:04:44,998
So this is a simple demonstration, just to make that point,
1395
01:04:44,998 --> 01:04:48,770
if in Python, I want to determine whether a number is even or odd.
1396
01:04:48,770 --> 01:04:53,150
Well, let's go ahead and from CS50, import getInt, then let's go ahead
1397
01:04:53,150 --> 01:04:58,610
and get a number like n from the user, using getInt, and ask them for n.
1398
01:04:58,610 --> 01:05:04,220
And then let's go ahead and say, if n percent sign 2 equals 0,
1399
01:05:04,220 --> 01:05:08,270
then let's go ahead and print quote unquote "Even."
1400
01:05:08,270 --> 01:05:13,753
Else let's go ahead and print out Odd, but before I run this,
1401
01:05:13,753 --> 01:05:16,670
anyone want to instinctively, even though we've not talked about this,
1402
01:05:16,670 --> 01:05:19,010
point out a mistake here?
1403
01:05:19,010 --> 01:05:19,810
What I did wrong?
1404
01:05:19,810 --> 01:05:20,810
AUDIENCE: Double equals.
1405
01:05:20,810 --> 01:05:22,435
DAVID J. MALAN: Yeah, so double equals.
1406
01:05:22,435 --> 01:05:25,850
Again, so even though some of the stuff is changing, some of the same ideas
1407
01:05:25,850 --> 01:05:26,430
are the same.
1408
01:05:26,430 --> 01:05:28,520
So this, too, should be a double equal sign,
1409
01:05:28,520 --> 01:05:30,620
because I'm comparing for equality here.
1410
01:05:30,620 --> 01:05:32,153
And why is this the right math?
1411
01:05:32,153 --> 01:05:34,070
Well, if you divide a number by 2, it's either
1412
01:05:34,070 --> 01:05:36,290
going to have 0 or 1 as a remainder.
1413
01:05:36,290 --> 01:05:39,030
And that's going to determine if it's even or odd for us.
1414
01:05:39,030 --> 01:05:42,200
So let's run Python of Parity.py, type in a number like 50,
1415
01:05:42,200 --> 01:05:44,660
and hopefully we get, indeed, even.
1416
01:05:44,660 --> 01:05:46,910
So again, same idea, but now we're down to eight lines
1417
01:05:46,910 --> 01:05:48,560
of code instead of the 20.
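A sketch of the parity check with the corrected double equals sign. The parity function is a hypothetical wrapper so the example can run without prompting for n.

```python
def parity(n):
    """Return "Even" or "Odd" using the remainder operator."""
    if n % 2 == 0:  # == compares; a single = would be assignment
        return "Even"
    else:
        return "Odd"


print(parity(50))
print(parity(7))
```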
1418
01:05:48,560 --> 01:05:50,810
Well, let's now do something a little more interactive
1419
01:05:50,810 --> 01:05:54,680
and a little representative of tools that actually ask the user questions.
1420
01:05:54,680 --> 01:06:00,320
In C, recall that we had this agreement program, Agree.c.
1421
01:06:00,320 --> 01:06:04,280
And then let's go ahead and implement a corresponding version in Python,
1422
01:06:04,280 --> 01:06:05,870
in a file called Agree.py.
1423
01:06:05,870 --> 01:06:08,570
And let's look at the C version first.
1424
01:06:08,570 --> 01:06:10,700
On the left, we used get char here.
1425
01:06:10,700 --> 01:06:13,190
And then we used the double vertical bars
1426
01:06:13,190 --> 01:06:16,430
to check if c is equal to capital Y or lowercase y.
1427
01:06:16,430 --> 01:06:18,500
And then we did the same thing for n for no.
1428
01:06:18,500 --> 01:06:24,380
And so let's go over here and let's do from CS50, import get--
1429
01:06:24,380 --> 01:06:26,570
OK, get char is not a thing.
1430
01:06:26,570 --> 01:06:29,090
And this here is another difference with Python.
1431
01:06:29,090 --> 01:06:32,510
There is no data type for individual characters.
1432
01:06:32,510 --> 01:06:34,640
You have strings, strs, and, honestly, those
1433
01:06:34,640 --> 01:06:36,620
are fine, because if you have a str that's
1434
01:06:36,620 --> 01:06:38,960
just one character, for all intents and purposes,
1435
01:06:38,960 --> 01:06:40,710
it is just a single character.
1436
01:06:40,710 --> 01:06:41,960
So it's just a simplification.
1437
01:06:41,960 --> 01:06:43,200
You don't have to think as much.
1438
01:06:43,200 --> 01:06:45,658
You don't have to worry about double quotes, single quotes.
1439
01:06:45,658 --> 01:06:49,350
In fact, in Python, you can use double quotes or single quotes,
1440
01:06:49,350 --> 01:06:50,930
so long as you're consistent.
1441
01:06:50,930 --> 01:06:52,970
So long as you're consistent, the single quotes
1442
01:06:52,970 --> 01:06:55,670
do not mean something different, like they do in C.
1443
01:06:55,670 --> 01:06:58,340
So I'm going to go ahead and use getString here,
1444
01:06:58,340 --> 01:07:01,220
although, strictly speaking, I could just use the input function,
1445
01:07:01,220 --> 01:07:02,480
as we saw before.
1446
01:07:02,480 --> 01:07:07,250
I'm going to get a string from the user that asks them this, getString,
1447
01:07:07,250 --> 01:07:10,557
quote unquote, "Do you agree," like a little checkbox or interactive prompt,
1448
01:07:10,557 --> 01:07:13,640
where you have to say yes or no, you want to agree to the following terms,
1449
01:07:13,640 --> 01:07:14,580
or whatnot.
1450
01:07:14,580 --> 01:07:18,110
And then let's translate the conditionals to Python, now, too.
1451
01:07:18,110 --> 01:07:25,850
So if S equals equals quote-unquote "Y," or S equals equals lowercase y,
1452
01:07:25,850 --> 01:07:32,180
let's go ahead and print out agreed, just like in C, elif S equals
1453
01:07:32,180 --> 01:07:35,540
equals N or S equals equals little n.
1454
01:07:35,540 --> 01:07:38,058
Let's go ahead, then, and print out not agreed.
1455
01:07:38,058 --> 01:07:40,850
And you can already see, perhaps, one of the differences here, too.
1456
01:07:40,850 --> 01:07:43,700
Python is a little more English-like, in that
1457
01:07:43,700 --> 01:07:47,610
you just literally use the English word or, instead of the two vertical bars.
1458
01:07:47,610 --> 01:07:50,370
But it's ultimately doing the same thing.
1459
01:07:50,370 --> 01:07:53,390
Can we simplify this code a bit, though.
1460
01:07:53,390 --> 01:07:55,340
This would be a little annoying if we wanted
1461
01:07:55,340 --> 01:07:57,800
to add support, not just for big Y and little y,
1462
01:07:57,800 --> 01:08:04,230
but Yes or big Yes or little yes or big Y, lowercase e, capital S, right?
1463
01:08:04,230 --> 01:08:07,130
There's a lot of permutations of Y-E-S or just y,
1464
01:08:07,130 --> 01:08:08,720
that we ideally should tolerate.
1465
01:08:08,720 --> 01:08:11,470
Otherwise, the user is going to have to type exactly what we want,
1466
01:08:11,470 --> 01:08:12,770
which isn't very user-friendly.
1467
01:08:12,770 --> 01:08:15,050
Any intuition for how we could logically,
1468
01:08:15,050 --> 01:08:18,270
even if you don't know how to do it in code, make this better?
1469
01:08:18,270 --> 01:08:18,770
Yeah.
1470
01:08:18,770 --> 01:08:21,535
AUDIENCE: Write way over the list, and then up,
1471
01:08:21,535 --> 01:08:22,910
it's like the things in the list.
1472
01:08:22,910 --> 01:08:27,050
DAVID J. MALAN: Nice, yeah, we saw an example of a list before, just 0, 1, 2.
1473
01:08:27,050 --> 01:08:29,899
Why don't we take that same idea and ask a similar question.
1474
01:08:29,899 --> 01:08:34,819
If S is in the following list of values, Y or little y,
1475
01:08:34,819 --> 01:08:38,600
or heck, let me add to the list now, yes, or maybe all capital YES.
1476
01:08:38,600 --> 01:08:40,779
And it's going to get a little annoying, admittedly,
1477
01:08:40,779 --> 01:08:43,750
but this is still better than the alternative, with all the or's.
1478
01:08:43,750 --> 01:08:45,640
I could do things like this, and so forth.
1479
01:08:45,640 --> 01:08:47,740
There's a whole bunch more permutations.
1480
01:08:47,740 --> 01:08:50,470
But let's leave this alone, and let me just go into here
1481
01:08:50,470 --> 01:08:57,279
and change this to, if S is in the following list of N or little n or no,
1482
01:08:57,279 --> 01:09:00,460
and I won't do as, let's just not worry about the weird capitalizations
1483
01:09:00,460 --> 01:09:01,600
there, for now.
1484
01:09:01,600 --> 01:09:02,800
Let's go ahead and run this.
1485
01:09:02,800 --> 01:09:05,950
Python of Agree.py, do I agree?
1486
01:09:05,950 --> 01:09:08,740
Y. OK, how about yes?
1487
01:09:08,740 --> 01:09:10,359
All right, how about big Yes.
1488
01:09:10,359 --> 01:09:11,850
OK, that does not seem to work.
1489
01:09:11,850 --> 01:09:14,350
Notice it did not say agreed, and it did not say not agreed.
1490
01:09:14,350 --> 01:09:15,410
It didn't detect it.
1491
01:09:15,410 --> 01:09:17,180
So how can I do this?
1492
01:09:17,180 --> 01:09:20,770
Well, you know what I could do? I don't really
1493
01:09:20,770 --> 01:09:22,240
need the uppercase and lowercase.
1494
01:09:22,240 --> 01:09:24,189
Let me tighten this list up a little bit.
1495
01:09:24,189 --> 01:09:27,640
And why don't I just force S to be lowercase.
1496
01:09:27,640 --> 01:09:31,000
s.lower, recall, whether it's one character or more,
1497
01:09:31,000 --> 01:09:34,180
is a function built into strs now, strings in Python,
1498
01:09:34,180 --> 01:09:35,950
that forces the whole thing to lowercase.
1499
01:09:35,950 --> 01:09:37,450
So now, watch what I can do.
1500
01:09:37,450 --> 01:09:42,700
Python of Agree.py, little y, that works, big Y, that works.
1501
01:09:42,700 --> 01:09:47,840
Big Yes, that works, big Y, little e, big S, that also works.
1502
01:09:47,840 --> 01:09:50,910
So we've now handled, in one fell swoop, a whole bunch more logic.
1503
01:09:50,910 --> 01:09:52,910
And you know what, we can tighten this up a bit.
1504
01:09:52,910 --> 01:09:56,350
Here's an opportunity, in Python, for slightly better design.
1505
01:09:56,350 --> 01:10:00,070
What have I done in here that's a little redundant?
1506
01:10:00,070 --> 01:10:04,180
Does anyone see an opportunity to eliminate a redundancy,
1507
01:10:04,180 --> 01:10:06,820
doing something more times than you need.
1508
01:10:06,820 --> 01:10:08,030
It's a stretch here, no?
1509
01:10:08,030 --> 01:10:08,530
Yep.
1510
01:10:08,530 --> 01:10:11,163
AUDIENCE: You can do S dot lower, above.
1511
01:10:11,163 --> 01:10:13,330
DAVID J. MALAN: We could move the S dot lower above.
1512
01:10:13,330 --> 01:10:15,310
Notice that I'm using S dot lower twice.
1513
01:10:15,310 --> 01:10:17,870
But it's going to give me the same answer both times.
1514
01:10:17,870 --> 01:10:20,080
So I could do a couple of things here.
1515
01:10:20,080 --> 01:10:24,700
I could, first of all, get rid of this lower, and get rid of this lower,
1516
01:10:24,700 --> 01:10:28,720
and then above this, maybe I could do something like this, S equal--
1517
01:10:28,720 --> 01:10:31,600
I can't just do this, because that throws the value away.
1518
01:10:31,600 --> 01:10:34,240
It does the math, but it doesn't convert the string itself.
1519
01:10:34,240 --> 01:10:35,840
It's going to return a value.
1520
01:10:35,840 --> 01:10:38,260
So I have to say S equals s.lower.
1521
01:10:38,260 --> 01:10:39,340
I could do that.
1522
01:10:39,340 --> 01:10:41,840
Or, honestly, I can chain these things together.
1523
01:10:41,840 --> 01:10:46,070
And this is not something we saw in C. If get_string returns a string,
1524
01:10:46,070 --> 01:10:49,240
and strings have functions like lower in them,
1525
01:10:49,240 --> 01:10:52,330
you can chain these functions together, like this, and do dot this,
1526
01:10:52,330 --> 01:10:53,788
dot that, dot this other thing.
1527
01:10:53,788 --> 01:10:56,830
And eventually you want to stop, because it's going to become crazy long.
1528
01:10:56,830 --> 01:10:58,810
But this is reasonable, still fits on the screen.
1529
01:10:58,810 --> 01:10:59,560
It's pretty tight.
1530
01:10:59,560 --> 01:11:01,690
It does in one place what I was doing in two.
1531
01:11:01,690 --> 01:11:03,010
So I think that's OK.
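A minimal sketch of the idea just described: lowercase the input once, then compare against lowercase-only lists. The helper name classify and the exact list contents are assumptions for illustration; the lecture's agree.py does this inline on the result of get_string.

```python
# Hypothetical helper; the lecture's agree.py does the same thing inline.
def classify(answer):
    # Normalize once, instead of calling .lower() in every comparison.
    s = answer.lower()
    if s in ["y", "yes"]:
        return "Agreed."
    elif s in ["n", "no"]:
        return "Not agreed."
    # Anything else is not recognized.
    return None


# Mixed-case inputs all reduce to the same lowercase values.
for sample in ["y", "Y", "YES", "YeS", "No"]:
    print(sample, "->", classify(sample))
```

With the CS50 library, the same normalization can be chained onto the call that reads the input, in the spirit of get_string(...).lower().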
1532
01:11:03,010 --> 01:11:05,980
Let me go ahead and do Python of Agree.py one last time.
1533
01:11:05,980 --> 01:11:07,120
Let's try it one last time.
1534
01:11:07,120 --> 01:11:10,360
And it's still working as intended.
1535
01:11:10,360 --> 01:11:12,700
Also if I tried those other inputs as well.
1536
01:11:12,700 --> 01:11:13,435
Yeah, question.
1537
01:11:13,435 --> 01:11:19,290
AUDIENCE: Could you add on like a for uppercase as well, for like upper,
1538
01:11:19,290 --> 01:11:22,700
and then cover all the functions where it's lowercase, for all the functions
1539
01:11:22,700 --> 01:11:25,450
where it's uppercase as well, or could you not just do this again.
1540
01:11:25,450 --> 01:11:29,095
1541
01:11:29,095 --> 01:11:30,470
DAVID J. MALAN: Let me summarize.
1542
01:11:30,470 --> 01:11:33,340
Could we handle uppercase and lowercase together in some form?
1543
01:11:33,340 --> 01:11:35,020
I'm actually doing that already.
1544
01:11:35,020 --> 01:11:36,370
I just have to pick a lane.
1545
01:11:36,370 --> 01:11:39,307
I have to either be all lowercase in my logic or all uppercase,
1546
01:11:39,307 --> 01:11:41,140
and not worry about what the human types in,
1547
01:11:41,140 --> 01:11:43,240
because no matter what the human types in, I'm
1548
01:11:43,240 --> 01:11:44,950
forcing their input to lowercase.
1549
01:11:44,950 --> 01:11:48,280
And then I am using a lowercase list of values.
1550
01:11:48,280 --> 01:11:49,520
If I want to flip that, fine.
1551
01:11:49,520 --> 01:11:51,040
I just have to be self-consistent.
1552
01:11:51,040 --> 01:11:52,420
But I'm handling that already.
1553
01:11:52,420 --> 01:11:53,223
Yeah.
1554
01:11:53,223 --> 01:11:56,953
AUDIENCE: Are strings no longer an array of characters?
1555
01:11:56,953 --> 01:11:58,870
DAVID J. MALAN: A really good, loaded question:
1556
01:11:58,870 --> 01:12:02,080
are strings no longer an array of characters?
1557
01:12:02,080 --> 01:12:04,120
Conceptually, yes, underneath the hood, no.
1558
01:12:04,120 --> 01:12:06,190
They're a little more sophisticated than that,
1559
01:12:06,190 --> 01:12:08,590
because with strings, you have a few changes.
1560
01:12:08,590 --> 01:12:10,600
Not only do they have functions built into them,
1561
01:12:10,600 --> 01:12:12,580
because strings are now what we call objects,
1562
01:12:12,580 --> 01:12:14,500
in what's called object-oriented programming.
1563
01:12:14,500 --> 01:12:17,042
And we're going to keep seeing examples of this dot operator.
1564
01:12:17,042 --> 01:12:21,550
They are also immutable, so to speak, I-M-M-U-T-A-B-L-E.
1565
01:12:21,550 --> 01:12:25,180
Immutable means they cannot be changed, which means, unlike C,
1566
01:12:25,180 --> 01:12:28,750
you can't go into a string and change its individual characters.
1567
01:12:28,750 --> 01:12:31,480
You can make a copy of the string that makes a change,
1568
01:12:31,480 --> 01:12:33,698
but you can't change the original string itself.
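A short sketch of what immutability looks like in practice. The variable names here are illustrative only; the point is that lower() hands back a new string and that assigning to a single character is refused.

```python
s = "Yes"

# lower() returns a new string; s itself is left alone.
t = s.lower()
print(s)
print(t)

# Trying to change one character in place raises TypeError.
mutated = False
try:
    s[0] = "y"
    mutated = True
except TypeError as error:
    print("TypeError:", error)

print("mutated:", mutated)
```

To keep a changed version, assign the result back to a name, as in s = s.lower().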
1569
01:12:33,698 --> 01:12:35,740
This is both a little annoying, maybe, sometimes.
1570
01:12:35,740 --> 01:12:38,365
But it's also pretty protective, because you can't do screw-ups
1571
01:12:38,365 --> 01:12:41,680
like I did weeks ago, when I was trying to copy S and call it T.
1572
01:12:41,680 --> 01:12:43,270
And then one affected the other.
1573
01:12:43,270 --> 01:12:47,080
Python, underneath the hood, is handling all of the memory management
1574
01:12:47,080 --> 01:12:48,550
and the pointers and all of that.
1575
01:12:48,550 --> 01:12:51,040
There are no pointers in Python.
1576
01:12:51,040 --> 01:12:55,840
So if that wasn't clear, all of that pain, if you will, all of that power,
1577
01:12:55,840 --> 01:13:00,280
is now handled by the language itself, not by us, the programmers.
1578
01:13:00,280 --> 01:13:02,440
All right, so let's introduce maybe some loops,
1579
01:13:02,440 --> 01:13:04,390
like we've been in the habit of doing.
1580
01:13:04,390 --> 01:13:08,170
Let me open up Meow.c, which was an example in C, just meowing
1581
01:13:08,170 --> 01:13:09,730
a bunch of times textually.
1582
01:13:09,730 --> 01:13:12,800
Let me create a file called Meow.py here on the right.
1583
01:13:12,800 --> 01:13:15,190
And notice on the left, this was correct code in C,
1584
01:13:15,190 --> 01:13:16,670
but it was kind of poorly designed.
1585
01:13:16,670 --> 01:13:17,170
Why?
1586
01:13:17,170 --> 01:13:19,450
Because it was a missed opportunity for a loop.
1587
01:13:19,450 --> 01:13:22,460
Why say something three times when you can say it just once?
1588
01:13:22,460 --> 01:13:25,990
So in Python, let me do it the poorly designed way first.
1589
01:13:25,990 --> 01:13:27,400
Let me print out meow.
1590
01:13:27,400 --> 01:13:31,210
And, like I generally should not, let me copy, paste it three times,
1591
01:13:31,210 --> 01:13:33,670
run Python of Meow.py, and it works.
1592
01:13:33,670 --> 01:13:35,318
OK, but not good practice.
1593
01:13:35,318 --> 01:13:37,360
So let me go ahead and improve this a little bit.
1594
01:13:37,360 --> 01:13:38,990
And there's a few ways to do this.
1595
01:13:38,990 --> 01:13:44,050
If I wanted to do this three times, I could instead do something like this.
1596
01:13:44,050 --> 01:13:48,010
For i in range of 3, recall that that was the better version,
1597
01:13:48,010 --> 01:13:51,370
rather than arbitrarily enumerate numbers yourself, let me go ahead
1598
01:13:51,370 --> 01:13:53,490
and print out quote unquote "Meow."
1599
01:13:53,490 --> 01:13:56,077
Now if I run Python of Meow, still seems to work.
1600
01:13:56,077 --> 01:13:57,910
So it's a little tighter, and, my God, like,
1601
01:13:57,910 --> 01:13:59,952
programs can't really get much shorter than this.
1602
01:13:59,952 --> 01:14:04,300
We're down to two lines of code, no main function, no gratuitous syntax.
1603
01:14:04,300 --> 01:14:06,580
Let's now improve the design further, like we
1604
01:14:06,580 --> 01:14:09,550
did in C, by introducing a function called
1605
01:14:09,550 --> 01:14:11,230
meow, that actually does the meowing.
1606
01:14:11,230 --> 01:14:13,000
So this was our first abstraction, recall,
1607
01:14:13,000 --> 01:14:18,100
both in Scratch and in C. Let me focus now entirely on the Python version
1608
01:14:18,100 --> 01:14:18,760
here.
1609
01:14:18,760 --> 01:14:23,485
Let me go ahead and first define a function.
1610
01:14:23,485 --> 01:14:26,890
1611
01:14:26,890 --> 01:14:30,250
Let me first go ahead and do this, for i in range of 3,
1612
01:14:30,250 --> 01:14:33,430
let's assume for the moment that there's a meow function,
1613
01:14:33,430 --> 01:14:34,720
that I'm just going to call.
1614
01:14:34,720 --> 01:14:38,320
Let's now go ahead and define, using the def keyword, which we saw briefly
1615
01:14:38,320 --> 01:14:41,170
with the speller demonstration, a function
1616
01:14:41,170 --> 01:14:42,880
called meow that takes no arguments.
1617
01:14:42,880 --> 01:14:45,460
And all it does for now is print meow.
1618
01:14:45,460 --> 01:14:50,620
Let me now go ahead and run Python of Meow.py Enter, huh, one
1619
01:14:50,620 --> 01:14:51,950
of those tracebacks.
1620
01:14:51,950 --> 01:14:54,080
So this is another name error.
1621
01:14:54,080 --> 01:14:57,080
And, again, name meow is not defined.
1622
01:14:57,080 --> 01:14:59,080
What's your instinct here, even though we've not
1623
01:14:59,080 --> 01:15:00,760
tripped over this yet in Python?
1624
01:15:00,760 --> 01:15:03,130
Where does your mind go here?
1625
01:15:03,130 --> 01:15:03,670
Yeah.
1626
01:15:03,670 --> 01:15:06,080
AUDIENCE: Does it read top to bottom, left to right?
1627
01:15:06,080 --> 01:15:09,600
I'm guessing we could find a new case.
1628
01:15:09,600 --> 01:15:13,020
DAVID J. MALAN: Perfect. As smart as Python seems to be,
1629
01:15:13,020 --> 01:15:14,770
it still makes certain assumptions.
1630
01:15:14,770 --> 01:15:18,010
And if it hasn't seen a keyword yet, it just doesn't exist.
1631
01:15:18,010 --> 01:15:21,000
So if you want it to exist, we have to be a little clever here.
1632
01:15:21,000 --> 01:15:24,090
I could just put it, flip it around, like this.
1633
01:15:24,090 --> 01:15:26,470
But this honestly isn't particularly good design.
1634
01:15:26,470 --> 01:15:26,970
Why?
1635
01:15:26,970 --> 01:15:30,390
Because now, if you, the reader of your code, whether you
1636
01:15:30,390 --> 01:15:32,970
wrote it or someone else, you kind of have to go fishing now.
1637
01:15:32,970 --> 01:15:34,560
Like where does this program begin?
1638
01:15:34,560 --> 01:15:38,130
And even though, yes, it's obvious that it begins on line four, logically,
1639
01:15:38,130 --> 01:15:40,710
like, if the file were longer, you're going to be annoyed
1640
01:15:40,710 --> 01:15:43,180
and fishing visually for the right lines of code.
1641
01:15:43,180 --> 01:15:44,397
So let's reintroduce main.
1642
01:15:44,397 --> 01:15:46,230
And indeed, this would be a common paradigm.
1643
01:15:46,230 --> 01:15:49,380
When you want to start having abstractions in your own functions,
1644
01:15:49,380 --> 01:15:53,460
just put your own code in main, so that, one, you can leave it up top, and two,
1645
01:15:53,460 --> 01:15:55,650
you can solve the problem we just encountered.
1646
01:15:55,650 --> 01:15:58,860
So let me define a function called main that has that same loop,
1647
01:15:58,860 --> 01:16:00,240
meowing three times.
1648
01:16:00,240 --> 01:16:02,040
But now watch what happens.
1649
01:16:02,040 --> 01:16:07,350
Let me go into my terminal and run Python of Meow.py, Enter.
1650
01:16:07,350 --> 01:16:07,850
Nothing.
1651
01:16:07,850 --> 01:16:10,500
1652
01:16:10,500 --> 01:16:14,050
All right, investigate this.
1653
01:16:14,050 --> 01:16:16,290
What could explain this symptom?
1654
01:16:16,290 --> 01:16:18,020
I have not told you the answer yet.
1655
01:16:18,020 --> 01:16:19,770
So all you have is your instinct, assuming
1656
01:16:19,770 --> 01:16:21,720
you've never touched Python before.
1657
01:16:21,720 --> 01:16:26,800
What might explain this symptom, where nothing is meowing?
1658
01:16:26,800 --> 01:16:27,300
Yeah?
1659
01:16:27,300 --> 01:16:28,970
AUDIENCE: Didn't run the main function.
1660
01:16:28,970 --> 01:16:31,178
DAVID J. MALAN: Yeah, I didn't run the main function.
1661
01:16:31,178 --> 01:16:33,390
So in C, this is functionality you get for free.
1662
01:16:33,390 --> 01:16:34,765
You have to have a main function.
1663
01:16:34,765 --> 01:16:37,580
But, heck, so long as you make it, it will be called for you.
1664
01:16:37,580 --> 01:16:41,390
In Python, this is just a convention, to create a main function,
1665
01:16:41,390 --> 01:16:43,200
borrowing a very common name for it.
1666
01:16:43,200 --> 01:16:46,320
But if you want to call that main function, you have to do it.
1667
01:16:46,320 --> 01:16:48,110
So this looks a little weird, admittedly,
1668
01:16:48,110 --> 01:16:50,030
that you have to call your own main function now,
1669
01:16:50,030 --> 01:16:51,860
and it has to be at the bottom of the file,
1670
01:16:51,860 --> 01:16:55,040
because only once the interpreter gets to the bottom of the file,
1671
01:16:55,040 --> 01:16:58,460
have all of your functions been defined, higher up.
1672
01:16:58,460 --> 01:16:59,990
But this solves both problems.
1673
01:16:59,990 --> 01:17:02,450
It keeps your code, that's the main part of your code,
1674
01:17:02,450 --> 01:17:03,660
at the very top of the file.
1675
01:17:03,660 --> 01:17:06,980
So it's just obvious to you, and a TF, or any reader in the future,
1676
01:17:06,980 --> 01:17:09,140
where the program logically starts.
1677
01:17:09,140 --> 01:17:13,310
But it also ensures that main is not called until everything else, main
1678
01:17:13,310 --> 01:17:15,660
included, has been defined.
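A sketch of the layout described here, reconstructed from the narration rather than copied from the screen: main sits at the top, meow is defined after it, and the call to main comes last so that both names exist by the time it runs.

```python
def main():
    # meow is looked up when main runs, not when main is defined.
    for i in range(3):
        meow()


def meow():
    print("meow")


# Without this call, the file only defines two functions and prints nothing.
main()
```

Moving the call to main above the definition of meow would bring back the NameError from earlier.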
1679
01:17:15,660 --> 01:17:17,648
So this is another perfect example of we're
1680
01:17:17,648 --> 01:17:19,440
learning a new language for the first time.
1681
01:17:19,440 --> 01:17:21,020
You're not going to have heard all of the answers before.
1682
01:17:21,020 --> 01:17:24,830
Just apply some logic, as to, like, all right, what could explain this symptom.
1683
01:17:24,830 --> 01:17:28,190
Start to infer how the language does or doesn't work.
1684
01:17:28,190 --> 01:17:32,450
If I now go and run this, Python of Meow.py, now we're back in business.
1685
01:17:32,450 --> 01:17:35,360
And just so you have seen it, there is a quote
1686
01:17:35,360 --> 01:17:38,840
unquote "better" way of doing this, that solves different problems that we
1687
01:17:38,840 --> 01:17:42,050
are not going to encounter, certainly in these initial days.
1688
01:17:42,050 --> 01:17:45,440
Typically, you would see in online tutorials or books,
1689
01:17:45,440 --> 01:17:49,400
something that looks like this, where you actually have a weird conditional
1690
01:17:49,400 --> 01:17:50,810
with multiple underscores.
1691
01:17:50,810 --> 01:17:54,470
That's functionally the same thing, but it solves problems with libraries,
1692
01:17:54,470 --> 01:17:57,840
if we ourselves were implementing a library or something similar in spirit.
1693
01:17:57,840 --> 01:18:00,882
But we're going to keep things simpler and just write main at the bottom,
1694
01:18:00,882 --> 01:18:03,355
because we're not going to encounter that problem just yet.
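For reference, the conditional with multiple underscores that tutorials and books tend to show usually looks like the sketch below. This is the common Python convention, not code from the lecture itself.

```python
def main():
    for i in range(3):
        meow()


def meow():
    print("meow")


# __name__ equals "__main__" only when this file is run directly,
# so main is not called if another file merely imports this one.
if __name__ == "__main__":
    main()
```

When the file is run directly with python, this behaves the same as a bare call to main at the bottom.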
1695
01:18:03,355 --> 01:18:06,230
All right, let's make one change to this, just to show how it's done.
1696
01:18:06,230 --> 01:18:11,420
In C, the last version of meow also took command line argument, sorry, also
1697
01:18:11,420 --> 01:18:13,910
took arguments to the function meow.
1698
01:18:13,910 --> 01:18:16,490
So suppose that I want to factor this out.
1699
01:18:16,490 --> 01:18:19,250
And I want to just call meow as a better abstraction, where I just
1700
01:18:19,250 --> 01:18:21,080
say meow this number of times.
1701
01:18:21,080 --> 01:18:24,290
And I figure out how many times by just, like, putting in number 3
1702
01:18:24,290 --> 01:18:26,990
or using get_int or something like that, to figure out
1703
01:18:26,990 --> 01:18:28,550
how many times to say meow.
1704
01:18:28,550 --> 01:18:31,820
Well, now, I have to define inside my meow function, an input,
1705
01:18:31,820 --> 01:18:38,330
let's call it n, and then use that, as by doing this, for i in range of n,
1706
01:18:38,330 --> 01:18:41,640
let me go ahead and print out meow that many times.
1707
01:18:41,640 --> 01:18:43,820
So again, the only thing that's different from C
1708
01:18:43,820 --> 01:18:47,630
is we don't bother specifying return types for any of these functions,
1709
01:18:47,630 --> 01:18:52,230
and we don't bother specifying the type of our arguments or our variables.
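A sketch of the parameterized version described here, again reconstructed from the narration. Note the absence of a return type on either function and of a type on n.

```python
def main():
    meow(3)


# n has no declared type, and meow has no declared return type.
def meow(n):
    for i in range(n):
        print("meow")


main()
```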
1710
01:18:52,230 --> 01:18:54,930
So same ideas, simpler in some sense.
1711
01:18:54,930 --> 01:18:56,660
We're just throwing away keystrokes.
1712
01:18:56,660 --> 01:18:59,450
All right, let me run this one final time, Python of Meow.py,
1713
01:18:59,450 --> 01:19:02,390
and we still have the same program.
1714
01:19:02,390 --> 01:19:04,110
All right, let me pause here.
1715
01:19:04,110 --> 01:19:04,780
Any questions?
1716
01:19:04,780 --> 01:19:06,030
And I know this is going fast.
1717
01:19:06,030 --> 01:19:11,355
But hopefully, the C code is still somewhat familiar.
1718
01:19:11,355 --> 01:19:11,855
Yeah.
1719
01:19:11,855 --> 01:19:17,530
AUDIENCE: Is there any difference between global and local variables.
1720
01:19:17,530 --> 01:19:18,780
DAVID J. MALAN: Good question.
1721
01:19:18,780 --> 01:19:21,238
Is there any difference between global and local variables?
1722
01:19:21,238 --> 01:19:23,850
Short answer, yes, and we would run into that same problem,
1723
01:19:23,850 --> 01:19:25,320
if we declare a variable in one function,
1724
01:19:25,320 --> 01:19:27,445
another function is not going to have access to it.
1725
01:19:27,445 --> 01:19:30,660
We can solve that by putting variables globally.
1726
01:19:30,660 --> 01:19:32,760
But we don't have all of the features we had in C,
1727
01:19:32,760 --> 01:19:35,160
like there's no such thing as a constant in Python.
1728
01:19:35,160 --> 01:19:36,900
The mentality in the Python community is,
1729
01:19:36,900 --> 01:19:39,480
if you don't want some value to change, don't touch it.
1730
01:19:39,480 --> 01:19:40,630
Like just don't screw up.
1731
01:19:40,630 --> 01:19:42,240
So there's trade-offs here, too.
1732
01:19:42,240 --> 01:19:45,000
Some languages are stronger or more defensive than that.
1733
01:19:45,000 --> 01:19:48,990
But that, too, is part of the mindset with this particular language.
1734
01:19:48,990 --> 01:19:49,770
[SIREN]
1735
01:19:49,770 --> 01:19:50,645
DAVID J. MALAN: Yeah.
1736
01:19:50,645 --> 01:19:52,937
AUDIENCE: There is really only one green line, in the--
1737
01:19:52,937 --> 01:19:54,437
DAVID J. MALAN: Oh, sorry, where's--
1738
01:19:54,437 --> 01:19:55,080
say it louder.
1739
01:19:55,080 --> 01:19:58,342
AUDIENCE: There has only been one green line printed at a time.
1740
01:19:58,342 --> 01:20:00,050
DAVID J. MALAN: That is an amazing segue.
1741
01:20:00,050 --> 01:20:01,370
Let's come to that in just a moment, because we're
1742
01:20:01,370 --> 01:20:03,620
going to recreate also that Mario example, where
1743
01:20:03,620 --> 01:20:06,925
we had like the question marks for the coins and the vertical bars.
1744
01:20:06,925 --> 01:20:08,550
So let's come back to that in a second.
1745
01:20:08,550 --> 01:20:09,656
And your question?
1746
01:20:09,656 --> 01:20:13,362
AUDIENCE: If strings are immutable, and every time you like make a copy.
1747
01:20:13,362 --> 01:20:15,320
DAVID J. MALAN: Correct, strings are immutable.
1748
01:20:15,320 --> 01:20:19,220
Any time you seem to be modifying it, as with the lower function,
1749
01:20:19,220 --> 01:20:20,480
you're getting back a copy.
1750
01:20:20,480 --> 01:20:22,940
So it's taking a little more memory somewhere.
1751
01:20:22,940 --> 01:20:26,145
But you don't have to deal with it; Python's doing that for you.
1752
01:20:26,145 --> 01:20:28,892
AUDIENCE: So you don't free anything.
1753
01:20:28,892 --> 01:20:30,100
DAVID J. MALAN: Say it again?
1754
01:20:30,100 --> 01:20:31,226
You don't need what?
1755
01:20:31,226 --> 01:20:34,663
AUDIENCE: You don't free like taking leave on stuff.
1756
01:20:34,663 --> 01:20:36,330
DAVID J. MALAN: You don't free anything.
1757
01:20:36,330 --> 01:20:38,870
So if you weren't a big fan, over the past couple of weeks,
1758
01:20:38,870 --> 01:20:42,860
of malloc or free or memory or addresses, or all
1759
01:20:42,860 --> 01:20:44,990
of those low level implementation details,
1760
01:20:44,990 --> 01:20:47,390
Python is the language for you, because all of that
1761
01:20:47,390 --> 01:20:49,340
is handled for you automatically.
1762
01:20:49,340 --> 01:20:50,780
Java does the same.
1763
01:20:50,780 --> 01:20:51,960
JavaScript does the same.
1764
01:20:51,960 --> 01:20:52,460
Yeah.
1765
01:20:52,460 --> 01:20:58,244
AUDIENCE: Each up for the variable, you put it before the name, use of the body
1766
01:20:58,244 --> 01:20:59,700
before the name, correct?
1767
01:20:59,700 --> 01:21:03,785
Well, if there isn't a main function in Python, how do you define those words?
1768
01:21:03,785 --> 01:21:05,910
DAVID J. MALAN: How do you define a global variable
1769
01:21:05,910 --> 01:21:07,493
if there's no main function in Python?
1770
01:21:07,493 --> 01:21:11,480
Global variables, by definition, always need to be outside of main, as well.
1771
01:21:11,480 --> 01:21:12,480
So that's not a problem.
1772
01:21:12,480 --> 01:21:15,300
If I wanted to have a variable that's outside of,
1773
01:21:15,300 --> 01:21:19,703
and, therefore, global to all of these, like global--
1774
01:21:19,703 --> 01:21:22,620
actually, don't use the word global, that's a special word in Python--
1775
01:21:22,620 --> 01:21:27,450
variable equals Foo, F-O-O, just as an arbitrary string
1776
01:21:27,450 --> 01:21:31,410
value that a computer scientist would typically use, that is now global.
1777
01:21:31,410 --> 01:21:34,000
There are some caveats, though, as to how you access that.
1778
01:21:34,000 --> 01:21:36,010
But let's come back to that another time.
1779
01:21:36,010 --> 01:21:38,030
But that problem is solvable, too.
1780
01:21:38,030 --> 01:21:38,530
All right.
1781
01:21:38,530 --> 01:21:39,780
So let's go ahead and do this.
1782
01:21:39,780 --> 01:21:43,050
To come back to the question about the print command, let me go ahead
1783
01:21:43,050 --> 01:21:45,300
and create a file now called Mario.py.
1784
01:21:45,300 --> 01:21:47,700
Won't bother showing the C code anymore.
1785
01:21:47,700 --> 01:21:49,590
We'll focus just on the new language here.
1786
01:21:49,590 --> 01:21:54,540
But recall that, in Python, in Mario, we wanted to first do something like this.
1787
01:21:54,540 --> 01:21:57,600
This was a random screen from the side scroller version 1
1788
01:21:57,600 --> 01:21:58,800
of Super Mario Brothers.
1789
01:21:58,800 --> 01:22:02,820
And we just want to print like three hashes to represent those three blocks.
1790
01:22:02,820 --> 01:22:04,950
Well, in Python, we could do something like this,
1791
01:22:04,950 --> 01:22:11,280
print, oh, sorry, for i in the range of 3, go ahead and print out quote unquote
1792
01:22:11,280 --> 01:22:11,828
"hash."
1793
01:22:11,828 --> 01:22:13,620
And I think this is pretty straightforward.
1794
01:22:13,620 --> 01:22:16,260
Python of Mario.py, we get our three hashes.
1795
01:22:16,260 --> 01:22:18,850
You could imagine parameterizing this now, though,
1796
01:22:18,850 --> 01:22:20,350
and getting actual user input.
1797
01:22:20,350 --> 01:22:21,730
So let's do that.
1798
01:22:21,730 --> 01:22:27,420
Let me go up here and let me go and say from cs50 import get_int,
1799
01:22:27,420 --> 01:22:31,090
and then let's get the input from the user.
1800
01:22:31,090 --> 01:22:33,210
So it actually is a value n, like, all right,
1801
01:22:33,210 --> 01:22:38,190
get_int the height of the column of bricks that you want to do.
1802
01:22:38,190 --> 01:22:42,270
And then, let's go ahead and print out n hashes instead of three.
1803
01:22:42,270 --> 01:22:43,560
So let me run this.
1804
01:22:43,560 --> 01:22:45,385
Let's print out like five hashes.
1805
01:22:45,385 --> 01:22:47,760
OK, one, two, three, four, five, that seems to work, too.
1806
01:22:47,760 --> 01:22:49,677
And it's going to work for any positive value.
1807
01:22:49,677 --> 01:22:53,400
But it's not going to work for, how about negative 1?
1808
01:22:53,400 --> 01:22:54,660
That just doesn't do anything.
1809
01:22:54,660 --> 01:22:55,747
But that seems OK.
1810
01:22:55,747 --> 01:22:58,830
But also recall that it's not going to work if the user types in something
1811
01:22:58,830 --> 01:23:03,990
weird, like, oh, sorry, it is going to work if the user types in something
1812
01:23:03,990 --> 01:23:05,790
weird like cat, why?
1813
01:23:05,790 --> 01:23:08,820
We're using CS50's get_int function, which is
1814
01:23:08,820 --> 01:23:11,710
handling all of those headaches for us.
1815
01:23:11,710 --> 01:23:15,180
But, what if the user indeed types a negative number?
1816
01:23:15,180 --> 01:23:16,110
We're tolerating that.
1817
01:23:16,110 --> 01:23:17,860
So that was the bug I wanted to highlight.
1818
01:23:17,860 --> 01:23:20,250
It would be nice to re-prompt them and re-prompt them.
1819
01:23:20,250 --> 01:23:22,560
And in C, what was the programming construct we
1820
01:23:22,560 --> 01:23:25,020
used when we wanted to ask the user a question.
1821
01:23:25,020 --> 01:23:29,280
And then, if they didn't cooperate, prompt them again, prompt them again.
1822
01:23:29,280 --> 01:23:29,890
What was that?
1823
01:23:29,890 --> 01:23:30,390
Yeah.
1824
01:23:30,390 --> 01:23:30,750
AUDIENCE: Do while loop.
1825
01:23:30,750 --> 01:23:32,100
DAVID J. MALAN: Yeah, do while loop, right?
1826
01:23:32,100 --> 01:23:34,830
That was useful, because it's almost the same as a while loop.
1827
01:23:34,830 --> 01:23:38,100
But instead of checking a condition, and then doing something,
1828
01:23:38,100 --> 01:23:39,948
you do something and then check a condition,
1829
01:23:39,948 --> 01:23:42,240
which makes sense with user input, because what are you
1830
01:23:42,240 --> 01:23:44,615
even going to check if the user hasn't done anything yet?
1831
01:23:44,615 --> 01:23:46,200
You need that inverted logic.
1832
01:23:46,200 --> 01:23:50,010
Unfortunately in Python, there is no do while loop.
1833
01:23:50,010 --> 01:23:51,300
There is a for loop.
1834
01:23:51,300 --> 01:23:52,740
There is a while loop.
1835
01:23:52,740 --> 01:23:55,590
And frankly, those are enough to recreate this idea.
1836
01:23:55,590 --> 01:23:59,160
And the way to do this in Python, the Pythonic way, which
1837
01:23:59,160 --> 01:24:02,160
is another term of art in the community, is to say this.
1838
01:24:02,160 --> 01:24:06,300
Deliberately induce an infinite loop, while True, with capital T for true.
1839
01:24:06,300 --> 01:24:09,930
And then do what you got to do, like get an Int from a user,
1840
01:24:09,930 --> 01:24:12,060
asking them for the height of this thing.
1841
01:24:12,060 --> 01:24:18,270
And then, if that is what you want, like a number greater than zero, go ahead
1842
01:24:18,270 --> 01:24:20,020
and break out of the loop.
1843
01:24:20,020 --> 01:24:25,440
So this is how, in Python, you could recreate the idea of a do while loop.
1844
01:24:25,440 --> 01:24:27,315
You deliberately induce an infinite loop.
1845
01:24:27,315 --> 01:24:29,190
So something's going to happen at least once.
1846
01:24:29,190 --> 01:24:32,280
Then, if you get the answer you want, you break out of it,
1847
01:24:32,280 --> 01:24:34,330
effectively achieving the same logic.
1848
01:24:34,330 --> 01:24:37,080
So this is the Pythonic way of doing a do while loop.
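A sketch of this pattern under one stated assumption: instead of CS50's get_int, the loop takes a hypothetical ask function that returns an integer, so the example runs without a keyboard or the library. The name first_positive is likewise made up for illustration.

```python
# ask is a hypothetical stand-in for something like get_int("Height: ").
def first_positive(ask):
    # Deliberately infinite, so the body always runs at least once.
    while True:
        n = ask()
        # Got the answer we want, so leave the loop.
        if n > 0:
            break
    return n


# Simulate a user who types -1, then 0, then 3.
answers = iter([-1, 0, 3])
print(first_positive(lambda: next(answers)))
```

The check comes after the prompt, which is the same ordering a do while loop gives in C.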
1849
01:24:37,080 --> 01:24:41,760
Let me go ahead and run Python of Mario.py, type in 3 this time.
1850
01:24:41,760 --> 01:24:44,670
And now I get back just the 3 hashes as well.
1851
01:24:44,670 --> 01:24:50,310
What if, though, I wanted to get rid of, how about ultimately
1852
01:24:50,310 --> 01:24:55,058
that CS50 library function, and also encapsulate this in a function.
1853
01:24:55,058 --> 01:24:57,100
Well, let's go ahead and tweak this a little bit.
1854
01:24:57,100 --> 01:24:59,070
Let me go ahead and remove this temporarily.
1855
01:24:59,070 --> 01:25:01,680
Give myself a main function, so I don't make the same mistake
1856
01:25:01,680 --> 01:25:03,360
as I did initially earlier.
1857
01:25:03,360 --> 01:25:07,110
And let me give myself a function called get_height that takes no arguments.
1858
01:25:07,110 --> 01:25:10,620
And inside of that function is going to be that same code.
1859
01:25:10,620 --> 01:25:14,280
But I don't want to break in this case, I want to return n.
1860
01:25:14,280 --> 01:25:17,293
So, recall, that if you return from a function, you're done,
1861
01:25:17,293 --> 01:25:19,210
you're going to exit from right at that point.
1862
01:25:19,210 --> 01:25:20,320
So this would be fine.
1863
01:25:20,320 --> 01:25:22,680
You can just say return n inside of the loop,
1864
01:25:22,680 --> 01:25:25,320
or, if you would prefer to break out, you
1865
01:25:25,320 --> 01:25:26,940
could do something like this instead.
1866
01:25:26,940 --> 01:25:32,700
Break, and then down here, you could return, down here,
1867
01:25:32,700 --> 01:25:34,630
you could return n as well.
1868
01:25:34,630 --> 01:25:37,290
And let me make one point here before we go back up to main.
1869
01:25:37,290 --> 01:25:41,490
This is a little different from C. And this one's subtle.
1870
01:25:41,490 --> 01:25:47,250
What have I done here that in C would have been a bug, but is apparently not,
1871
01:25:47,250 --> 01:25:48,315
I claim, in Python.
1872
01:25:48,315 --> 01:25:50,860
1873
01:25:50,860 --> 01:25:52,220
It's super subtle, this one.
1874
01:25:52,220 --> 01:25:52,720
Yeah.
1875
01:25:52,720 --> 01:25:55,911
AUDIENCE: So aren't we like defining mostly object,
1876
01:25:55,911 --> 01:25:59,470
like we're using it first, defining an object?
1877
01:25:59,470 --> 01:26:04,275
[INAUDIBLE]
1878
01:26:04,275 --> 01:26:07,150
DAVID J. MALAN: So similar, it's not quite that we're using it first.
1879
01:26:07,150 --> 01:26:10,980
So it's OK not to declare a variable with like the data type.
1880
01:26:10,980 --> 01:26:15,420
We've addressed that before, but on line 9, we're assigning n a value, it seems.
1881
01:26:15,420 --> 01:26:18,600
And then we return n on line 12.
1882
01:26:18,600 --> 01:26:20,190
But notice the indentation.
1883
01:26:20,190 --> 01:26:25,410
In the world of C, if we had declared a variable inside of a loop, on line 9,
1884
01:26:25,410 --> 01:26:28,200
it would have been scoped to that loop, which
1885
01:26:28,200 --> 01:26:31,530
means as soon as you get out of that loop, like further down in the program,
1886
01:26:31,530 --> 01:26:33,340
n would not exist.
1887
01:26:33,340 --> 01:26:36,090
It would be local to the curly braces therein.
1888
01:26:36,090 --> 01:26:39,720
Here, logically, curly braces are gone, but the indentation
1889
01:26:39,720 --> 01:26:44,250
makes clear that n is still inside of this loop, between lines 8 through 11.
1890
01:26:44,250 --> 01:26:47,280
But n is actually still in scope in Python.
1891
01:26:47,280 --> 01:26:50,380
The moment you create a variable in Python, for better or for worse,
1892
01:26:50,380 --> 01:26:53,760
It is available everywhere within that function, even outside
1893
01:26:53,760 --> 01:26:55,690
of the loop in which you defined it.
1894
01:26:55,690 --> 01:26:59,070
So this logic is actually OK in Python.
1895
01:26:59,070 --> 01:27:02,138
In C, recall, to solve this same problem,
1896
01:27:02,138 --> 01:27:04,680
we would have had to do something a little hackish like this,
1897
01:27:04,680 --> 01:27:09,600
like define n up here on line 8, so that it exists, now, on line 10,
1898
01:27:09,600 --> 01:27:12,000
and so that it exists on line 13.
1899
01:27:12,000 --> 01:27:15,700
That is no longer an issue or need, in Python.
1900
01:27:15,700 --> 01:27:17,700
Once you create a variable, even if it's nested,
1901
01:27:17,700 --> 01:27:19,867
nested, nested inside of some loops or conditionals,
1902
01:27:19,867 --> 01:27:23,520
it still exists within the function itself.
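A minimal sketch of that scoping rule. The list of canned answers is a hypothetical stand-in for the user's input, so the example runs without a keyboard, and it assumes at least one answer is supplied:

```python
def get_height(answers):
    """Return the first positive integer found in `answers`.

    `answers` is a hypothetical stand-in for what a user might type.
    """
    for text in answers:
        n = int(text)  # n is first assigned here, inside the loop
        if n > 0:
            break
    # Outside the loop but still inside the function: n is in scope.
    return n


height = get_height(["-1", "0", "3"])
print(height)
```

In C, the equivalent variable would have to be declared before the loop in order to be returned after it.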
1903
01:27:23,520 --> 01:27:27,870
All right, any questions then on this, before we now run this and then get
1904
01:27:27,870 --> 01:27:31,680
rid of the CS50 library again?
1905
01:27:31,680 --> 01:27:34,300
OK, so let me go ahead and get the height from the user.
1906
01:27:34,300 --> 01:27:36,758
Let's go ahead and create a variable in main called height.
1907
01:27:36,758 --> 01:27:38,460
Let's call this get height function.
1908
01:27:38,460 --> 01:27:43,380
And then let's use that height value, instead of something hardcoded there.
1909
01:27:43,380 --> 01:27:45,000
And let me see if this all works now.
1910
01:27:45,000 --> 01:27:46,410
Python of Mario.py.
1911
01:27:46,410 --> 01:27:49,110
Hopefully, I haven't messed up, but I did.
1912
01:27:49,110 --> 01:27:51,460
But this is an easy fix now.
1913
01:27:51,460 --> 01:27:51,960
Yeah.
1914
01:27:51,960 --> 01:27:53,085
AUDIENCE: Got to call main.
1915
01:27:53,085 --> 01:27:54,543
DAVID J. MALAN: I got to call main.
1916
01:27:54,543 --> 01:27:55,980
So again, I deleted that earlier.
1917
01:27:55,980 --> 01:27:56,920
But let me bring it back.
1918
01:27:56,920 --> 01:27:58,128
So I'm actually calling main.
1919
01:27:58,128 --> 01:28:02,190
Let me rerun Python of Mario.py, there we go, height 3.
1920
01:28:02,190 --> 01:28:03,880
Now it seems to be working.
1921
01:28:03,880 --> 01:28:05,880
So let's do one last thing with Mario, just
1922
01:28:05,880 --> 01:28:08,980
to tie together that idea now of exceptions from before.
1923
01:28:08,980 --> 01:28:11,070
Again, exceptions are a feature of Python,
1924
01:28:11,070 --> 01:28:13,060
whereby you can try to do something.
1925
01:28:13,060 --> 01:28:16,710
And if there's a problem, you can handle it in any way you see fit.
1926
01:28:16,710 --> 01:28:20,070
Previously, I handled it by just yelling at the user that that's not an Int.
1927
01:28:20,070 --> 01:28:23,460
But let's actually use this to re-implement CS50's own get_int
1928
01:28:23,460 --> 01:28:24,240
function.
1929
01:28:24,240 --> 01:28:27,130
Let me throw away CS50's get_int function.
1930
01:28:27,130 --> 01:28:32,880
And now let me go ahead and replace get_int with input.
1931
01:28:32,880 --> 01:28:35,670
But it's not sufficient to just use input.
1932
01:28:35,670 --> 01:28:39,480
What do I have to add to this line of code on line 8?
1933
01:28:39,480 --> 01:28:40,740
If I want to get back an Int?
1934
01:28:40,740 --> 01:28:41,790
AUDIENCE: The int function.
1935
01:28:41,790 --> 01:28:43,832
DAVID J. MALAN: Yeah, I have to cast it to an int
1936
01:28:43,832 --> 01:28:46,500
by calling the int function around that value,
1937
01:28:46,500 --> 01:28:48,750
or I could do it on a separate line, just to be clear.
1938
01:28:48,750 --> 01:28:52,110
I could also do n equals int of n.
1939
01:28:52,110 --> 01:28:55,020
That would work too, but it's sort of an unnecessary extra line.
1940
01:28:55,020 --> 01:28:57,990
This is not sufficient, because that does not change the value.
1941
01:28:57,990 --> 01:28:58,935
It creates the value.
1942
01:28:58,935 --> 01:29:00,060
But then it throws it away.
1943
01:29:00,060 --> 01:29:01,192
We need to assign it.
1944
01:29:01,192 --> 01:29:03,900
So the conventional way to do this would probably be in one line,
1945
01:29:03,900 --> 01:29:05,358
just to keep things nice and tight.
1946
01:29:05,358 --> 01:29:06,780
So that works fine now.
1947
01:29:06,780 --> 01:29:11,470
If I run Python of Mario.py, I can still type in 3, and all is well.
1948
01:29:11,470 --> 01:29:15,720
I can still type in negative 1, because that is an Int that I am handling.
1949
01:29:15,720 --> 01:29:18,750
What I'm not yet handling is weird input like cat
1950
01:29:18,750 --> 01:29:21,760
or some string that is not a base 10 number.
1951
01:29:21,760 --> 01:29:23,880
So here, again, is my traceback.
1952
01:29:23,880 --> 01:29:27,000
And notice that here, let me scroll up a little bit,
1953
01:29:27,000 --> 01:29:31,620
here we can actually see more detail in the traceback.
1954
01:29:31,620 --> 01:29:36,900
Notice that, just like in C, or just like in the debugger in VS Code,
1955
01:29:36,900 --> 01:29:38,100
you can see a few things.
1956
01:29:38,100 --> 01:29:41,490
You can see mention of module, that just means your file, main, which
1957
01:29:41,490 --> 01:29:43,013
is my main function, and get height.
1958
01:29:43,013 --> 01:29:44,430
So notice, it's kind of backwards.
1959
01:29:44,430 --> 01:29:46,720
It's top to bottom instead of bottom up, as we drew it
1960
01:29:46,720 --> 01:29:48,720
on the board the other day, and as we envisioned
1961
01:29:48,720 --> 01:29:50,520
stacks of trays in the cafeteria.
1962
01:29:50,520 --> 01:29:52,680
But this is your stack, of functions that
1963
01:29:52,680 --> 01:29:54,330
have been called, from top to bottom.
1964
01:29:54,330 --> 01:29:57,360
Get height is the most recent, main is the very first,
1965
01:29:57,360 --> 01:29:59,200
value error is the problem.
1966
01:29:59,200 --> 01:30:03,740
So let's try to do, let's try to do this literally, except if there's an error.
1967
01:30:03,740 --> 01:30:04,740
So what do I want to do?
1968
01:30:04,740 --> 01:30:09,720
I'm going to go in here, and I'm going to say, try to do the following.
1969
01:30:09,720 --> 01:30:17,070
Whoops, try to do the following, except if there's a value error, value error,
1970
01:30:17,070 --> 01:30:20,640
then go ahead and say something, well, like before, print,
1971
01:30:20,640 --> 01:30:23,830
that's not an integer exclamation point.
1972
01:30:23,830 --> 01:30:26,760
But the difference this time is because I'm in a loop, the user
1973
01:30:26,760 --> 01:30:29,200
is going to have a chance to recover from this issue.
1974
01:30:29,200 --> 01:30:32,340
So if I run Mario.py, 3 still works as before.
1975
01:30:32,340 --> 01:30:35,880
If I run Mario.py and type in cat, I detect it now,
1976
01:30:35,880 --> 01:30:39,240
and because I'm still in that loop, and because the program hasn't crashed,
1977
01:30:39,240 --> 01:30:43,050
because I've caught, so to speak, the value error, using this line of code
1978
01:30:43,050 --> 01:30:46,950
here, that's the way in Python to detect these kinds of errors,
1979
01:30:46,950 --> 01:30:49,680
that would otherwise end up being on the user's own screen.
1980
01:30:49,680 --> 01:30:51,540
If I type in cat, dog, that doesn't work.
1981
01:30:51,540 --> 01:30:56,820
If I type in, though, 2, I get my two hashes, because that's, indeed, an Int.
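A sketch of the loop just described, under one assumption: the `read` parameter is a hypothetical addition (it defaults to the built-in `input`) so that canned replies can be fed in instead of typing them.

```python
def get_height(read=input):
    """Keep prompting until the reply is a positive integer."""
    while True:
        try:
            n = int(read("Height: "))  # raises ValueError for text like "cat"
            if n > 0:
                break
        except ValueError:
            # Caught here, so the program keeps looping instead of crashing.
            print("That's not an integer!")
    return n


# Simulate a user typing cat, then -1, then 2.
replies = iter(["cat", "-1", "2"])
height = get_height(lambda prompt: next(replies))
for i in range(height):
    print("#")
```

Calling `get_height()` with no argument would prompt at the keyboard as in the lecture.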
1982
01:30:56,820 --> 01:30:58,740
Any questions on this? We're not going
1983
01:30:58,740 --> 01:31:00,750
to spend too much time on exceptions, but just wanted
1984
01:31:00,750 --> 01:31:03,680
to show you what's involved with getting rid of those training wheels.
1985
01:31:03,680 --> 01:31:04,180
Yeah.
1986
01:31:04,180 --> 01:31:05,763
AUDIENCE: Then the hash marks in line.
1987
01:31:05,763 --> 01:31:07,305
DAVID J. MALAN: OK, so let's do this.
1988
01:31:07,305 --> 01:31:09,140
That actually comes to the earlier question
1989
01:31:09,140 --> 01:31:11,060
about printing the hashes on the same line,
1990
01:31:11,060 --> 01:31:13,808
or maybe something like this, where we have the little bricks
1991
01:31:13,808 --> 01:31:15,350
in the sky, or little question marks.
1992
01:31:15,350 --> 01:31:17,725
Let's recreate this idea, because the problem with print,
1993
01:31:17,725 --> 01:31:20,930
as was noted earlier, is you're automatically printing out new lines.
1994
01:31:20,930 --> 01:31:22,460
But what if we don't want that.
1995
01:31:22,460 --> 01:31:24,740
Well, let's change this program entirely.
1996
01:31:24,740 --> 01:31:26,310
Let me throw away all the functions.
1997
01:31:26,310 --> 01:31:29,220
Let's just go to a simpler world, where we're just doing this.
1998
01:31:29,220 --> 01:31:30,912
So let me start fresh in Mario.py.
1999
01:31:30,912 --> 01:31:33,120
I'm not going to bother with exceptions or functions.
2000
01:31:33,120 --> 01:31:39,410
Let's just do a very simple program, to create this idea, for i in range of 4
2001
01:31:39,410 --> 01:31:42,860
this time, because there are four of these things in the sky.
2002
01:31:42,860 --> 01:31:45,230
Let's go ahead and just print out a question mark
2003
01:31:45,230 --> 01:31:47,450
to represent each of those bricks.
2004
01:31:47,450 --> 01:31:51,140
Odds are you know this is not going to end well, because these are unfortunately,
2005
01:31:51,140 --> 01:31:54,450
as you've predicted, on separate lines.
2006
01:31:54,450 --> 01:31:57,380
So it turns out that the print function actually
2007
01:31:57,380 --> 01:32:00,320
takes in multiple arguments, not just the thing you want to print,
2008
01:32:00,320 --> 01:32:03,650
but also some additional arguments, that allow you to specify
2009
01:32:03,650 --> 01:32:06,170
what the default line ending should be.
2010
01:32:06,170 --> 01:32:09,110
But what's interesting about this is that, if you
2011
01:32:09,110 --> 01:32:12,630
want to change the line ending to be something like,
2012
01:32:12,630 --> 01:32:16,790
quote unquote, "that is nothing," instead of backslash n,
2013
01:32:16,790 --> 01:32:19,310
this is not sufficient, because in Python, you
2014
01:32:19,310 --> 01:32:21,770
can have two types of arguments, or parameters.
2015
01:32:21,770 --> 01:32:25,160
Some arguments are positional, which is the fancy way of saying it's
2016
01:32:25,160 --> 01:32:26,690
a comma separated list of arguments.
2017
01:32:26,690 --> 01:32:29,540
And that's what we did all the time in C. Something comma, something
2018
01:32:29,540 --> 01:32:31,665
comma, something, we did it in printf all the time,
2019
01:32:31,665 --> 01:32:33,980
and in other functions that took multiple arguments.
2020
01:32:33,980 --> 01:32:37,880
In Python, you have, not only positional arguments,
2021
01:32:37,880 --> 01:32:41,660
where you just separate them by commas, to give one or two or three or more
2022
01:32:41,660 --> 01:32:42,650
arguments.
2023
01:32:42,650 --> 01:32:46,220
There are also named arguments, which looks weird but is
2024
01:32:46,220 --> 01:32:48,140
helpful for reasons like this.
2025
01:32:48,140 --> 01:32:50,900
If you read the documentation, you will see
2026
01:32:50,900 --> 01:32:54,740
that there is a named argument that Python accepts, called end.
2027
01:32:54,740 --> 01:32:57,680
And if you set that equal to something, that
2028
01:32:57,680 --> 01:33:00,200
will be used as the end of every line, instead
2029
01:33:00,200 --> 01:33:02,750
of the default, which the documentation will also say
2030
01:33:02,750 --> 01:33:04,700
is quote unquote backslash n.
2031
01:33:04,700 --> 01:33:09,000
So this line here has no effect on my logic at the moment.
2032
01:33:09,000 --> 01:33:13,280
But if I change it to just quote unquote, essentially overriding
2033
01:33:13,280 --> 01:33:18,470
the default new line character, and now run Mario again, now I get all four
2034
01:33:18,470 --> 01:33:19,278
on the same line.
2035
01:33:19,278 --> 01:33:20,570
There's a bit of a bug, though.
2036
01:33:20,570 --> 01:33:23,610
My prompt is not meant to be on the same line.
2037
01:33:23,610 --> 01:33:25,640
So I can fix that by just printing nothing.
2038
01:33:25,640 --> 01:33:28,640
But, really, it's not nothing, because you get the new line for free.
2039
01:33:28,640 --> 01:33:32,930
So let me run Python of Mario.py again, and now we
2040
01:33:32,930 --> 01:33:36,140
have what I intended in the first place, which was a little something that
2041
01:33:36,140 --> 01:33:37,170
looked like this.
2042
01:33:37,170 --> 01:33:40,910
And this is just one example of an argument that has a name.
2043
01:33:40,910 --> 01:33:43,280
But this is a common paradigm in Python, too,
2044
01:33:43,280 --> 01:33:46,250
to not just separate things by commas, but to be very specific,
2045
01:33:46,250 --> 01:33:50,810
because the print function might take 5, 10, even 20 different arguments.
2046
01:33:50,810 --> 01:33:54,628
And my God, if you had to enumerate like 10 or 20 commas,
2047
01:33:54,628 --> 01:33:55,670
you're going to screw up.
2048
01:33:55,670 --> 01:33:57,587
You're going to get things in the wrong order.
2049
01:33:57,587 --> 01:34:00,600
Named arguments allow you to be resilient against that.
2050
01:34:00,600 --> 01:34:02,690
So you only specify arguments by name, and it
2051
01:34:02,690 --> 01:34:06,004
doesn't matter what order they are in.
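A short sketch of named arguments with print. The in-memory buffer passed through `file` (another of print's named arguments) is used here only so the combined output can be inspected afterward; it is not part of the lecture's program.

```python
import io

out = io.StringIO()  # in-memory text buffer standing in for the terminal

# Four question marks on one line: end="" overrides the default "\n".
for i in range(4):
    print("?", end="", file=out)
print(file=out)  # print nothing, but still get the newline for free

# Named arguments can come in any order after the positional ones.
print("a", "b", sep="-", end="!\n", file=out)
print("a", "b", end="!\n", sep="-", file=out)

print(out.getvalue(), end="")
```

Both of the last two calls write the same text, which is the resilience to ordering mentioned above.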
2052
01:34:06,004 --> 01:34:10,160
All right, any questions, then, on this, and the overriding of new line.
2053
01:34:10,160 --> 01:34:14,270
And to be clear, you can do something like, very weird,
2054
01:34:14,270 --> 01:34:19,910
but logically expected, like this, by just changing the line ending, too.
2055
01:34:19,910 --> 01:34:21,830
But the right way to solve the Mario problem
2056
01:34:21,830 --> 01:34:25,652
would be just to override it to be nothing like this.
2057
01:34:25,652 --> 01:34:27,110
All right, how about this for cool.
2058
01:34:27,110 --> 01:34:29,000
And this is why a lot of people like Python.
2059
01:34:29,000 --> 01:34:30,440
Suppose you don't really like loops.
2060
01:34:30,440 --> 01:34:31,970
You don't really like three-line programs,
2061
01:34:31,970 --> 01:34:34,637
because that was kind of three times longer than it needs to be.
2062
01:34:34,637 --> 01:34:39,200
What if you just printed out a question mark four times?
2063
01:34:39,200 --> 01:34:43,380
Python, whoops, Python of Mario.py, that also works.
2064
01:34:43,380 --> 01:34:46,550
So it turns out that, just like the plus operator in Python
2065
01:34:46,550 --> 01:34:50,570
can join things together, the multiply operator is not
2066
01:34:50,570 --> 01:34:51,840
arithmetic in this case.
2067
01:34:51,840 --> 01:34:56,070
It actually means, take this and concatenate it four times over.
2068
01:34:56,070 --> 01:34:59,000
So that's a way of just distilling into one line what
2069
01:34:59,000 --> 01:35:02,750
would have otherwise taken multiple lines in C, fewer, but still multiple
2070
01:35:02,750 --> 01:35:07,130
lines in Python, but is really now rather succinct in Python,
2071
01:35:07,130 --> 01:35:08,385
by doing that instead.
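As a quick sketch of that one-liner (the variable name `bricks` is only illustrative):

```python
bricks = "?" * 4   # string repetition, not arithmetic
print(bricks)

joined = "?" + "?"  # the plus operator joins strings together
print(joined)
```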
2072
01:35:08,385 --> 01:35:11,510
Let's do one last Mario example, which looked a little something like this.
2073
01:35:11,510 --> 01:35:14,090
If this is another part of the Mario interface,
2074
01:35:14,090 --> 01:35:16,800
this is like a grid of like 3 by 3 bricks, for instance.
2075
01:35:16,800 --> 01:35:20,690
So two dimensions now, not just vertical, not just horizontal, but now both.
2076
01:35:20,690 --> 01:35:23,130
Let's print out something like that, using hashes.
2077
01:35:23,130 --> 01:35:26,070
Well, how about, how do I do this.
2078
01:35:26,070 --> 01:35:29,210
So how about for i in range of 3.
2079
01:35:29,210 --> 01:35:34,280
Then I could do for j in range of 3, just because j comes after i
2080
01:35:34,280 --> 01:35:35,810
and that's reasonable for counting.
2081
01:35:35,810 --> 01:35:41,000
I could now print out a hash symbol, well, let's see what this does.
2082
01:35:41,000 --> 01:35:47,660
Python of Mario.py, OK, that's just one crazy long column.
2083
01:35:47,660 --> 01:35:51,240
What do I need to fix and where here, to make this look like this?
2084
01:35:51,240 --> 01:35:55,850
So 3 by 3 bricks, instead of one long column.
2085
01:35:55,850 --> 01:35:56,450
Any instincts?
2086
01:35:56,450 --> 01:36:00,500
AUDIENCE: Why don't we create a line and then we'll skip it.
2087
01:36:00,500 --> 01:36:03,450
DAVID J. MALAN: OK, so after printing 3, we want to skip a line.
2088
01:36:03,450 --> 01:36:05,750
So maybe like print out a blank line here.
2089
01:36:05,750 --> 01:36:06,740
OK, let's try that.
2090
01:36:06,740 --> 01:36:09,920
I like that instinct, right, print 3, new line, print 3, new line.
2091
01:36:09,920 --> 01:36:12,260
Let's go ahead and run Python of Mario.py.
2092
01:36:12,260 --> 01:36:16,580
OK, it's more visible, what I'm doing, but still wrong.
2093
01:36:16,580 --> 01:36:19,110
What can I, what's the remaining fix, though?
2094
01:36:19,110 --> 01:36:19,610
Yeah.
2095
01:36:19,610 --> 01:36:22,790
AUDIENCE: So right behind the two.
2096
01:36:22,790 --> 01:36:25,680
DAVID J. MALAN: Yeah, I'm getting an extra new line here,
2097
01:36:25,680 --> 01:36:27,870
which I don't want while I'm on this row.
2098
01:36:27,870 --> 01:36:31,850
So let me do end equals quote unquote, and now, together, your solutions might
2099
01:36:31,850 --> 01:36:33,950
take us the whole way there.
2100
01:36:33,950 --> 01:36:37,345
Python of Mario.py, voila, now we've got it, in two dimensions.
2101
01:36:37,345 --> 01:36:38,720
And even this, we can tighten up.
2102
01:36:38,720 --> 01:36:41,220
Like, we could just use the little trick we learned.
2103
01:36:41,220 --> 01:36:45,230
So we could just say, print a hash times 3 times,
2104
01:36:45,230 --> 01:36:47,810
and we can get rid of one of those loops altogether.
2105
01:36:47,810 --> 01:36:50,930
All it's doing is, whoops, all it's doing is automating that process.
2106
01:36:50,930 --> 01:36:53,060
But, no, I don't want to do that.
2107
01:36:53,060 --> 01:36:54,832
What do I, how do I fix this here.
2108
01:36:54,832 --> 01:36:56,540
I don't think I want this anymore, right?
2109
01:36:56,540 --> 01:36:58,350
Because that's giving me an extra new line.
2110
01:36:58,350 --> 01:37:01,260
So now this program is really tightened up.
2111
01:37:01,260 --> 01:37:03,050
Same thing, two lines of code.
2112
01:37:03,050 --> 01:37:07,220
But we're now implementing this same two dimensional structure here.
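Both versions of the grid, side by side, as a sketch. Wrapping them in functions named `nested_grid` and `tight_grid` is an assumption made here for comparison; the lecture writes the loops at the top level of the file.

```python
def nested_grid(size):
    """Print a size-by-size grid using two loops."""
    for i in range(size):
        for j in range(size):
            print("#", end="")  # stay on the same row
        print()                 # end the row with a newline


def tight_grid(size):
    """Print the same grid, letting string repetition build each row."""
    for i in range(size):
        print("#" * size)


nested_grid(3)
tight_grid(3)
```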
2113
01:37:07,220 --> 01:37:10,440
All right, any questions here on these?
2114
01:37:10,440 --> 01:37:10,940
Yeah.
2115
01:37:10,940 --> 01:37:16,790
AUDIENCE: Is there any practical reason why when we write end, end is, I mean,
2116
01:37:16,790 --> 01:37:19,850
the print function, you don't put any spaces in it.
2117
01:37:19,850 --> 01:37:22,430
DAVID J. MALAN: If I print end, any spaces.
2118
01:37:22,430 --> 01:37:23,300
Say that once more.
2119
01:37:23,300 --> 01:37:25,440
AUDIENCE: Whenever we write end, for example,
2120
01:37:25,440 --> 01:37:28,850
the print function is, you know, in order
2121
01:37:28,850 --> 01:37:33,820
to stop it from going to a new line, it seems like any spaces,
2122
01:37:33,820 --> 01:37:37,800
we did like end equals and then two quotes.
2123
01:37:37,800 --> 01:37:38,820
There were no spaces.
2124
01:37:38,820 --> 01:37:40,300
Did you do that on purpose?
2125
01:37:40,300 --> 01:37:42,300
DAVID J. MALAN: Oh.
2126
01:37:42,300 --> 01:37:43,200
Yes, good question.
2127
01:37:43,200 --> 01:37:44,242
I see what you're saying.
2128
01:37:44,242 --> 01:37:48,030
So in a previous version, let me rewind in time, when we had this,
2129
01:37:48,030 --> 01:37:49,170
I did not put spaces.
2130
01:37:49,170 --> 01:37:51,720
The convention in Python is not to do that.
2131
01:37:51,720 --> 01:37:52,350
Why?
2132
01:37:52,350 --> 01:37:54,263
It just starts to add too much space.
2133
01:37:54,263 --> 01:37:56,430
And this is a little inconsistent, because, earlier,
2134
01:37:56,430 --> 01:37:58,470
when we talked about like pluses or spaces
2135
01:37:58,470 --> 01:38:00,750
around the less than or equal signs, I did say add it.
2136
01:38:00,750 --> 01:38:03,010
Here it's actually clearer and recommended
2137
01:38:03,010 --> 01:38:04,260
to keep them tighter together.
2138
01:38:04,260 --> 01:38:07,560
Otherwise it just becomes harder to read where the gaps are.
2139
01:38:07,560 --> 01:38:08,820
Good observation.
2140
01:38:08,820 --> 01:38:14,357
All right, let's do, how about, another five minute break.
2141
01:38:14,357 --> 01:38:14,940
Let's do that.
2142
01:38:14,940 --> 01:38:17,732
And then we're going to dive into some more sophisticated problems,
2143
01:38:17,732 --> 01:38:21,160
and then ultimately build with some audio and visual examples, as well.
2144
01:38:21,160 --> 01:38:23,130
See you in five.
2145
01:38:23,130 --> 01:38:28,260
All right, so almost all of the examples we just did
2146
01:38:28,260 --> 01:38:30,540
were recreations of what we did in week 1.
2147
01:38:30,540 --> 01:38:33,120
And recall that week 1 was like our most syntax-heavy week.
2148
01:38:33,120 --> 01:38:36,930
It was when we were first learning how to program in C. But after week 1,
2149
01:38:36,930 --> 01:38:39,900
we began to focus a bit more on ideas, like arrays,
2150
01:38:39,900 --> 01:38:41,640
and other higher-level constructs.
2151
01:38:41,640 --> 01:38:44,880
And we'll do that again here, condensing some of those first early weeks
2152
01:38:44,880 --> 01:38:47,250
into a fewer set of examples in Python.
2153
01:38:47,250 --> 01:38:50,020
And we'll culminate by actually taking Python out for a spin,
2154
01:38:50,020 --> 01:38:52,300
and doing things that would be way harder to do,
2155
01:38:52,300 --> 01:38:56,830
and way more time-consuming to do in C, even more so than the speller example.
2156
01:38:56,830 --> 01:38:59,790
But how do you go about figuring out what functions exist,
2157
01:38:59,790 --> 01:39:02,970
if you didn't hear it in class, you don't see it online,
2158
01:39:02,970 --> 01:39:06,480
but you want to see it officially, you can go to the Python documentation,
2159
01:39:06,480 --> 01:39:08,220
docs.python.org here.
2160
01:39:08,220 --> 01:39:11,340
And I will disclaim that, honestly, the Python documentation is not
2161
01:39:11,340 --> 01:39:12,750
terribly user-friendly.
2162
01:39:12,750 --> 01:39:15,240
Google will often be your friend, so googling something
2163
01:39:15,240 --> 01:39:19,350
you're interested in, to find your way to the appropriate page on Python.org,
2164
01:39:19,350 --> 01:39:22,410
or StackOverflow.com is another popular website.
2165
01:39:22,410 --> 01:39:24,780
As always, though, the line should be googling
2166
01:39:24,780 --> 01:39:27,600
things like, how do I convert a string to lowercase.
2167
01:39:27,600 --> 01:39:29,070
Like that's reasonable to Google.
2168
01:39:29,070 --> 01:39:33,160
Or how to convert to uppercase or how to implement a function in Python.
2169
01:39:33,160 --> 01:39:37,950
But googling, of course, things like how to implement problem set 6 in CS50,
2170
01:39:37,950 --> 01:39:39,120
of course, crosses the line.
2171
01:39:39,120 --> 01:39:42,078
But moving forward, and really with programming in general, like Google
2172
01:39:42,078 --> 01:39:44,220
and Stack Overflow are your friends, but the line
2173
01:39:44,220 --> 01:39:46,540
is between the reasonable and the unreasonable.
2174
01:39:46,540 --> 01:39:49,890
So let me officially use the Python documentation search, just
2175
01:39:49,890 --> 01:39:52,530
to search for something like the lowercase function.
2176
01:39:52,530 --> 01:39:54,540
Like, I know I can lowercase things in Python.
2177
01:39:54,540 --> 01:39:55,980
I don't quite remember how.
2178
01:39:55,980 --> 01:39:57,870
So let me just search for the word lower.
2179
01:39:57,870 --> 01:40:00,810
You're going to get, often, an overwhelming number of results,
2180
01:40:00,810 --> 01:40:03,678
because Python is a pretty big language, with lots of functionality.
2181
01:40:03,678 --> 01:40:05,970
And you're going to want to look for familiar patterns.
2182
01:40:05,970 --> 01:40:09,060
For whatever reason, str.lower, which is probably
2183
01:40:09,060 --> 01:40:12,420
more popular or more commonly used than these other ones, is third on the list.
2184
01:40:12,420 --> 01:40:15,460
But it's purple, because I clicked it a moment ago, when looking for it.
2185
01:40:15,460 --> 01:40:18,450
So str.lower is probably what I want, because I
2186
01:40:18,450 --> 01:40:21,060
am interested at the moment in lower casing strings.
2187
01:40:21,060 --> 01:40:25,258
When I click on that, this is an example of what Python's documentation tends
2188
01:40:25,258 --> 01:40:25,800
to look like.
2189
01:40:25,800 --> 01:40:27,340
It's in this general format.
2190
01:40:27,340 --> 01:40:29,340
Here's my str.lower function.
2191
01:40:29,340 --> 01:40:31,540
This returns a copy of the string, with all
2192
01:40:31,540 --> 01:40:33,750
of the cased characters converted to lowercase,
2193
01:40:33,750 --> 01:40:35,670
and the lower-casing algorithm, dot dot dot.
2194
01:40:35,670 --> 01:40:37,168
So that doesn't give me much.
2195
01:40:37,168 --> 01:40:38,460
It doesn't give me sample code.
2196
01:40:38,460 --> 01:40:40,210
But it does say what the function does.
2197
01:40:40,210 --> 01:40:43,890
And if we keep looking, you'll see mention of lstrip, which is left strip.
2198
01:40:43,890 --> 01:40:48,120
I used its analog, rstrip before, right strip, which allows you to remove,
2199
01:40:48,120 --> 01:40:51,000
that is strip, from the end of a string, something like white space,
2200
01:40:51,000 --> 01:40:52,930
like a new line, or even something else.
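A small sketch of those three string methods. The sample strings are made up for illustration:

```python
text = "  Hello, World\n"

lowered = text.lower()        # copy with cased characters lowercased
right = text.rstrip()         # copy with trailing whitespace removed
left = text.lstrip()          # copy with leading whitespace removed
trimmed = "path///".rstrip("/")  # strip something other than whitespace

print(repr(lowered))
print(repr(right))
print(repr(left))
print(repr(trimmed))
print(repr(text))  # the original string is unchanged
```

Each method returns a new string rather than modifying the one it is called on.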
2201
01:40:52,930 --> 01:40:56,410
And if you scroll through string, this web page here.
2202
01:40:56,410 --> 01:40:58,110
And we're halfway down the page already.
2203
01:40:58,110 --> 01:41:00,180
If you see my scroll bar, tiny on the right,
2204
01:41:00,180 --> 01:41:05,250
there's a huge amount of functionality built into string objects, here.
2205
01:41:05,250 --> 01:41:08,460
And this is just testament to just how rich the language itself is.
2206
01:41:08,460 --> 01:41:12,620
But it's also reason to be reassured that the goal, when
2207
01:41:12,620 --> 01:41:14,870
playing around with some new language and learning it,
2208
01:41:14,870 --> 01:41:16,598
is not to learn it exhaustively.
2209
01:41:16,598 --> 01:41:18,390
Just like in English or any human language,
2210
01:41:18,390 --> 01:41:20,640
there's always going to be vocab words you don't know,
2211
01:41:20,640 --> 01:41:23,563
ways of presenting the same information in some language.
2212
01:41:23,563 --> 01:41:25,230
That's going to be the case with Python.
2213
01:41:25,230 --> 01:41:28,620
And what we'll do today and this week in problem set 6 is really
2214
01:41:28,620 --> 01:41:30,120
get your footing with this language.
2215
01:41:30,120 --> 01:41:33,300
But you won't know all of Python, just like you won't know all of C.
2216
01:41:33,300 --> 01:41:36,300
And, honestly, you won't know all of any of these languages on your own,
2217
01:41:36,300 --> 01:41:38,800
unless you're, perhaps, using them full time professionally,
2218
01:41:38,800 --> 01:41:42,370
and even then, there's more libraries than one might even retain themselves.
2219
01:41:42,370 --> 01:41:45,420
So let's actually now pivot to a few other ideas,
2220
01:41:45,420 --> 01:41:47,560
that we'll implement in Python, in a moment.
2221
01:41:47,560 --> 01:41:50,010
Let me switch back over to VS Code here.
2222
01:41:50,010 --> 01:41:55,260
And let me whip up, say, a recreation of our scores example from week two,
2223
01:41:55,260 --> 01:41:57,883
where we averaged like three scores together.
2224
01:41:57,883 --> 01:42:00,300
And that was an opportunity in week 2 to play with arrays,
2225
01:42:00,300 --> 01:42:02,430
to realize how constrained arrays are.
2226
01:42:02,430 --> 01:42:03,720
They can't grow or shrink.
2227
01:42:03,720 --> 01:42:05,040
You have to decide in advance.
2228
01:42:05,040 --> 01:42:07,110
But let's see what's different here in Python.
2229
01:42:07,110 --> 01:42:11,580
So let me do Scores.py, and let me give myself an array in Python
2230
01:42:11,580 --> 01:42:15,780
called scores, sorry, let me give myself a variable in Python called scores.
2231
01:42:15,780 --> 01:42:17,940
Set it equal to a list of three scores, which
2232
01:42:17,940 --> 01:42:22,560
are the same ones we've used before, 72, 73, 33, in this context
2233
01:42:22,560 --> 01:42:24,630
meant to be scores, not ASCII values.
2234
01:42:24,630 --> 01:42:26,520
And then let's just do the average of these.
2235
01:42:26,520 --> 01:42:28,630
So average will be another variable.
2236
01:42:28,630 --> 01:42:32,910
And it turns out I can do, well, how did I sum these before?
2237
01:42:32,910 --> 01:42:36,580
I probably had a for loop to add one, then I knew how long they were.
2238
01:42:36,580 --> 01:42:39,580
Turns out in Python, you can just say sum of scores
2239
01:42:39,580 --> 01:42:41,530
divided by the length of scores.
2240
01:42:41,530 --> 01:42:43,130
That's going to give me my average.
2241
01:42:43,130 --> 01:42:46,210
So sum is a function that takes a list, in this case, as input,
2242
01:42:46,210 --> 01:42:49,000
and it just does the sum for you, with a for loop or whatever
2243
01:42:49,000 --> 01:42:49,930
underneath the hood.
2244
01:42:49,930 --> 01:42:53,480
Len gives you the length of the list, how many things are in it.
2245
01:42:53,480 --> 01:42:55,240
So I can dynamically figure that out.
2246
01:42:55,240 --> 01:43:00,340
Now let me go ahead and print out, using print, the word average, and then,
2247
01:43:00,340 --> 01:43:03,628
in curly braces, the actual average, close quote.
2248
01:43:03,628 --> 01:43:05,920
All right, so let's run this code, Python of Scores.py.
2249
01:43:05,920 --> 01:43:11,050
And there is my average, in this case, 59.33333 and so forth,
2250
01:43:11,050 --> 01:43:12,310
based on the math.
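The program as described, sketched out in full:

```python
scores = [72, 73, 33]

# sum adds up the list; len reports how many items it holds.
average = sum(scores) / len(scores)

print(f"Average: {average}")
```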
2251
01:43:12,310 --> 01:43:14,500
Well, let's actually, now, change this a little bit
2252
01:43:14,500 --> 01:43:17,625
and make it a little more interesting, and actually get input from the user
2253
01:43:17,625 --> 01:43:19,190
rather than hard coding this.
2254
01:43:19,190 --> 01:43:22,568
Let me go back up here and use from cs50 import get_int,
2255
01:43:22,568 --> 01:43:25,360
because I don't want to deal with all the exceptions and the loops.
2256
01:43:25,360 --> 01:43:27,820
Like, I just want to use someone else's function here.
2257
01:43:27,820 --> 01:43:31,600
Let me give myself an empty list called scores.
2258
01:43:31,600 --> 01:43:34,480
And this is not something we were able to do in C, right?
2259
01:43:34,480 --> 01:43:36,610
Because in C, if you tried to make an empty array,
2260
01:43:36,610 --> 01:43:39,590
well, that's pretty stupid, because you can't add things to it.
2261
01:43:39,590 --> 01:43:40,910
It's a fixed size.
2262
01:43:40,910 --> 01:43:42,650
So it wouldn't even let you do that.
2263
01:43:42,650 --> 01:43:45,640
But I can just create an empty list in Python,
2264
01:43:45,640 --> 01:43:48,340
because lists, unlike arrays, are really lengthless.
2265
01:43:48,340 --> 01:43:49,750
They'll grow and shrink.
2266
01:43:49,750 --> 01:43:52,870
But you and I are not dealing with all the pointers underneath the hood.
2267
01:43:52,870 --> 01:43:54,770
Python's doing that for us.
2268
01:43:54,770 --> 01:43:58,435
So now, let's go ahead and get a whole bunch of scores from the user.
2269
01:43:58,435 --> 01:43:59,810
How about three of them in total.
2270
01:43:59,810 --> 01:44:05,350
So for i in range of 3, let's go ahead and grab a score from the user,
2271
01:44:05,350 --> 01:44:07,810
using get_int, asking them for a score.
2272
01:44:07,810 --> 01:44:14,840
And then let's go ahead and append, to the scores list, that particular score.
2273
01:44:14,840 --> 01:44:17,200
So it turns out that a list, and I could read the Python
2274
01:44:17,200 --> 01:44:21,280
documentation to confirm as much, lists have a function built into them,
2275
01:44:21,280 --> 01:44:25,155
and functions built into objects are generally known as methods,
2276
01:44:25,155 --> 01:44:26,530
if you've heard that term before.
2277
01:44:26,530 --> 01:44:29,320
Same idea, but whereas a function kind of stands on its own,
2278
01:44:29,320 --> 01:44:33,430
a method is a function built into an object, like a list here.
2279
01:44:33,430 --> 01:44:35,917
That's going to achieve the same result. Strictly speaking,
2280
01:44:35,917 --> 01:44:37,000
I don't need the variable.
2281
01:44:37,000 --> 01:44:40,603
Just like in C, I could tighten this up and do something like this as well.
2282
01:44:40,603 --> 01:44:42,520
But, I don't know, I kind of like it this way.
2283
01:44:42,520 --> 01:44:45,970
It's more clear, to me, at least, that what I'm doing here, getting the score
2284
01:44:45,970 --> 01:44:47,838
and then appending it to the list.
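A rough sketch of the steps just described, not the exact file from lecture. The CS50 library's get_int is replaced by a hypothetical list of already-typed values so the snippet runs without prompting, and the averaging step is assumed from the earlier version of the program.

```python
# Hypothetical stand-in for three get_int("Score: ") calls.
typed_scores = [72, 73, 33]

scores = []  # an empty list; it grows as items are appended
for i in range(3):
    score = typed_scores[i]
    scores.append(score)  # append is a method built into every list

average = sum(scores) / len(scores)
print(f"Average: {average}")
```

Keeping the separate score variable costs one line but makes the two steps, getting the value and then appending it, easy to read.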
2285
01:44:47,838 --> 01:44:49,630
Now the rest of the code can stay the same.
2286
01:44:49,630 --> 01:44:54,700
Python of Scores.py, score will be 72, 73, 33.
2287
01:44:54,700 --> 01:44:55,820
And I get back the math.
2288
01:44:55,820 --> 01:44:58,840
But now the program's a little more dynamic, which is nice.
2289
01:44:58,840 --> 01:45:00,940
But there's other syntax I could use here.
2290
01:45:00,940 --> 01:45:04,330
Just so you've seen it, Python does have some neat syntactic tricks,
2291
01:45:04,330 --> 01:45:06,850
whereby, if you don't want to do scores.append,
2292
01:45:06,850 --> 01:45:11,290
you can actually say scores plus equals this score.
2293
01:45:11,290 --> 01:45:15,730
So you can actually concatenate lists together in Python, too.
2294
01:45:15,730 --> 01:45:18,340
Just as we used plus to join two strings together,
2295
01:45:18,340 --> 01:45:21,400
you can use plus to join two lists together.
2296
01:45:21,400 --> 01:45:24,040
The catch is, you need to put the one score I'm
2297
01:45:24,040 --> 01:45:26,770
adding here in a list of its own, which is kind of silly.
2298
01:45:26,770 --> 01:45:31,330
But it's necessary, so that this thing and this thing are both lists.
2299
01:45:31,330 --> 01:45:33,970
To do this more verbosely, which most programmers wouldn't
2300
01:45:33,970 --> 01:45:36,310
do, but just for clarity, this is the same thing
2301
01:45:36,310 --> 01:45:38,950
as saying scores plus this score.
2302
01:45:38,950 --> 01:45:42,910
So now maybe it's a little more clear that scores and brackets score
2303
01:45:42,910 --> 01:45:47,680
plural, sorry, singular, are both lists themselves, being concatenated
2304
01:45:47,680 --> 01:45:48,860
or joined together.
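The two spellings described here can be sketched side by side. The variable names a and b are illustrative only; in both cases the single score has to be wrapped in brackets so that both operands are lists.

```python
score = 33

a = [72, 73]
a += [score]     # shorthand form

b = [72, 73]
b = b + [score]  # more verbose form, same resulting contents

print(a)
print(b)
```

Writing a += score without the brackets would raise a TypeError, because an int is not something a list can be extended with.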
2305
01:45:48,860 --> 01:45:51,740
So two different ways, not sure one is better than the other.
2306
01:45:51,740 --> 01:45:57,640
This way is pretty common, but .append is also quite reasonable as well.
2307
01:45:57,640 --> 01:46:00,340
All right, how about another example from week two.
2308
01:46:00,340 --> 01:46:03,070
This one was called uppercase.
2309
01:46:03,070 --> 01:46:06,320
So let me do this in Uppercase.py, though, this time.
2310
01:46:06,320 --> 01:46:10,180
And let me import, from CS50, get_string again.
2311
01:46:10,180 --> 01:46:14,020
And let me go ahead and say, before will be my first variable.
2312
01:46:14,020 --> 01:46:17,500
Let me get a string from the user, asking them for a before string.
2313
01:46:17,500 --> 01:46:22,660
And then let me go ahead and say, after, just to demonstrate some changes,
2314
01:46:22,660 --> 01:46:25,190
upper-casing to this string.
2315
01:46:25,190 --> 01:46:27,850
Let me change my line ending to be that, using our new trick.
2316
01:46:27,850 --> 01:46:31,490
And this is where things get cool in Python, relatively speaking.
2317
01:46:31,490 --> 01:46:35,050
If I want to iterate over all of the characters in a string,
2318
01:46:35,050 --> 01:46:38,140
and print them out in uppercase, one way to do that would be this.
2319
01:46:38,140 --> 01:46:46,032
For c in the before string, go ahead and print out c.uppercase, sorry, c.upper,
2320
01:46:46,032 --> 01:46:49,240
but don't end the line yet, because I want to keep these all on the same line
2321
01:46:49,240 --> 01:46:50,440
until I'm all done.
2322
01:46:50,440 --> 01:46:51,490
So what am I doing?
2323
01:46:51,490 --> 01:46:54,970
Python of Uppercase.py, let me type in Hello in all lowercase.
2324
01:46:54,970 --> 01:46:57,010
I've just upper-cased the whole string.
2325
01:46:57,010 --> 01:46:57,700
How?
2326
01:46:57,700 --> 01:47:00,130
I first get string, calling it before.
2327
01:47:00,130 --> 01:47:02,680
I then just print out some fluffy text that says after colon,
2328
01:47:02,680 --> 01:47:04,840
and I get rid of the line ending, just so I can kind of line these up.
2329
01:47:04,840 --> 01:47:06,632
Notice I hit the spacebar a couple of times
2330
01:47:06,632 --> 01:47:08,620
just so letters line up to be pretty.
2331
01:47:08,620 --> 01:47:10,780
For c in before, this is new.
2332
01:47:10,780 --> 01:47:14,500
This is powerful in C, sorry, in Python, whereby
2333
01:47:14,500 --> 01:47:17,590
you don't have to do like int i equals 0 and i less than this,
2334
01:47:17,590 --> 01:47:22,310
you could just say, for c in the string in question, for c in before.
2335
01:47:22,310 --> 01:47:25,510
And then here is just upper-casing that specific character,
2336
01:47:25,510 --> 01:47:27,700
and making sure we don't output a new line too soon.
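A minimal sketch of the character-by-character version. The string "hello" is a hypothetical stand-in for what get_string would return, so the snippet runs without the CS50 library.

```python
before = "hello"  # stand-in for get_string("Before: ")

print("After:  ", end="")     # end="" suppresses the usual newline
for c in before:              # c takes on each character in turn
    print(c.upper(), end="")  # keep every character on the same line
print()                       # finish the line once the loop is done
```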
2337
01:47:27,700 --> 01:47:29,920
But this is actually more work than I need to do.
2338
01:47:29,920 --> 01:47:34,000
Based on what we've seen thus far, like from our agreement example,
2339
01:47:34,000 --> 01:47:35,620
can I tighten this up further?
2340
01:47:35,620 --> 01:47:40,340
Can I collapse lines 5 and 6, maybe even 7, all together?
2341
01:47:40,340 --> 01:47:46,550
If the goal of this program is just to uppercase the before string,
2342
01:47:46,550 --> 01:47:49,640
how might I do this?
2343
01:47:49,640 --> 01:47:50,480
Yeah, in back.
2344
01:47:50,480 --> 01:47:52,287
AUDIENCE: Would it be str.upper?
2345
01:47:52,287 --> 01:47:54,620
DAVID J. MALAN: Str.upper, yeah, so I could do something
2346
01:47:54,620 --> 01:47:57,500
like this, after gets before.upper.
2347
01:47:57,500 --> 01:47:59,750
So it's not str literally dot upper, str
2348
01:47:59,750 --> 01:48:01,500
just represents the string in question.
2349
01:48:01,500 --> 01:48:04,620
So it would be before.upper, but right idea otherwise.
2350
01:48:04,620 --> 01:48:08,130
And so let me go ahead and just tweak my print statement a little bit.
2351
01:48:08,130 --> 01:48:12,810
Let me just go ahead and print out the after variable here, after creating it.
2352
01:48:12,810 --> 01:48:15,440
So this line is the same, I'm getting a string called before.
2353
01:48:15,440 --> 01:48:18,530
I'm creating another variable called after, and, as you propose,
2354
01:48:18,530 --> 01:48:21,960
I'm calling upper on the whole string, not one character at a time.
2355
01:48:21,960 --> 01:48:22,460
Why?
2356
01:48:22,460 --> 01:48:23,360
Because it's allowed.
2357
01:48:23,360 --> 01:48:27,350
And, again, in Python, there aren't technically characters individually.
2358
01:48:27,350 --> 01:48:28,760
There's only strings, anyway.
2359
01:48:28,760 --> 01:48:30,600
So I might as well do them all at once.
2360
01:48:30,600 --> 01:48:34,220
So if I rerun the code now, Python of Uppercase.py.
2361
01:48:34,220 --> 01:48:39,080
Now I'll type in Hello in all lowercase, and, oh, so close,
2362
01:48:39,080 --> 01:48:42,110
I think I can get rid of this override, because I'm
2363
01:48:42,110 --> 01:48:45,510
printing the whole thing out at once, not character by character.
2364
01:48:45,510 --> 01:48:49,880
So now if I type in Hello before, now I have an even tighter version
2365
01:48:49,880 --> 01:48:52,080
of the program here.
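The tighter version, sketched with a hard-coded string in place of get_string (an assumption made so the snippet runs on its own):

```python
before = "hello"        # stand-in for get_string("Before: ")
after = before.upper()  # upper applies to the whole string at once
print(f"After:  {after}")
```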
2366
01:48:52,080 --> 01:48:55,910
All right, any questions, then, on lists or on strings,
2367
01:48:55,910 --> 01:49:01,240
and what this kind of function, upper, represents, with its docs.
2368
01:49:01,240 --> 01:49:01,740
No?
2369
01:49:01,740 --> 01:49:04,760
All right, so a couple other building blocks before we start.
2370
01:49:04,760 --> 01:49:05,855
Oh.
2371
01:49:05,855 --> 01:49:06,480
Where was that?
2372
01:49:06,480 --> 01:49:08,010
AUDIENCE: To the right.
2373
01:49:08,010 --> 01:49:10,050
DAVID J. MALAN: To the right, right.
2374
01:49:10,050 --> 01:49:11,040
Yes, thank you.
2375
01:49:11,040 --> 01:49:17,202
AUDIENCE: Could you write, very close to variable string, and then print upper,
2376
01:49:17,202 --> 01:49:19,257
you start creating a variable upper.
2377
01:49:19,257 --> 01:49:21,840
DAVID J. MALAN: Yes, do I have to create this variable, upper?
2378
01:49:21,840 --> 01:49:22,590
No, I don't.
2379
01:49:22,590 --> 01:49:24,870
I could actually tighten this up, and, if you really
2380
01:49:24,870 --> 01:49:28,170
want to see something neat, inside of the curly braces,
2381
01:49:28,170 --> 01:49:31,050
you don't have to just put the names of variables.
2382
01:49:31,050 --> 01:49:33,600
You can put a small amount of logic, so long
2383
01:49:33,600 --> 01:49:36,780
as it doesn't start to look stupid and kind of overwhelmingly complex, such
2384
01:49:36,780 --> 01:49:38,940
that it's sort of bad design at that point.
2385
01:49:38,940 --> 01:49:40,540
I can tighten this up like this.
2386
01:49:40,540 --> 01:49:44,610
And now we're in Python of Uppercase.py, writing Hello again.
2387
01:49:44,610 --> 01:49:45,730
And that, too, works.
2388
01:49:45,730 --> 01:49:47,280
But I would be careful about this.
2389
01:49:47,280 --> 01:49:50,483
You want to resist the temptation of having like a long line of code that's
2390
01:49:50,483 --> 01:49:53,400
inside the curly braces, because it's just going to be harder to read.
2391
01:49:53,400 --> 01:49:55,890
But, absolutely, you could indeed do that, too.
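For illustration, the call can sit directly inside the curly braces of the f-string. The variable name line is hypothetical and only there so the result can be inspected.

```python
before = "hello"  # stand-in for get_string("Before: ")

# A small expression inside the braces is evaluated before formatting.
line = f"After:  {before.upper()}"
print(line)
```

Anything much longer than a single method call inside the braces is usually clearer as a named variable.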
2392
01:49:55,890 --> 01:49:58,950
All right, how about command line arguments, which was one thing
2393
01:49:58,950 --> 01:50:03,030
we introduced in week two also, so that we could actually have the ability
2394
01:50:03,030 --> 01:50:06,750
to take input from the user, whoops.
2395
01:50:06,750 --> 01:50:10,270
So we could actually take input from the user at the command line,
2396
01:50:10,270 --> 01:50:13,210
so as to take literally command line arguments.
2397
01:50:13,210 --> 01:50:16,020
These are a little different, but it follows the same paradigm.
2398
01:50:16,020 --> 01:50:19,860
There's no main by default. And there's no def main, int
2399
01:50:19,860 --> 01:50:26,050
argc, char, or we called it string, argv by default. There's none of this.
2400
01:50:26,050 --> 01:50:30,510
So if you want access to the argument vector, argv, you import it.
2401
01:50:30,510 --> 01:50:35,100
And it turns out, there's another module in Python, or library in Python
2402
01:50:35,100 --> 01:50:39,180
called sys, and you can import from sys this thing called argv.
2403
01:50:39,180 --> 01:50:41,357
So same idea, different place.
2404
01:50:41,357 --> 01:50:42,940
Now I'm going to go ahead and do this.
2405
01:50:42,940 --> 01:50:47,820
Let's write a program that just requires that the user types in two, a word,
2406
01:50:47,820 --> 01:50:50,050
after the program's name, or none at all.
2407
01:50:50,050 --> 01:50:56,670
So if the length of argv equals 2, let's go ahead and print out, how about,
2408
01:50:56,670 --> 01:51:05,088
Hello comma argv bracket 1 close quote, else if they don't type two words
2409
01:51:05,088 --> 01:51:08,130
total at the prompt, let's just say the default's, like we did weeks ago,
2410
01:51:08,130 --> 01:51:09,160
Hello, world.
2411
01:51:09,160 --> 01:51:12,180
So the only thing that's new here is we're importing argv from sys,
2412
01:51:12,180 --> 01:51:15,450
and we're using this fancy f-string format, which kind of to your point,
2413
01:51:15,450 --> 01:51:18,510
too, it's putting more complex logic in the curly braces.
2414
01:51:18,510 --> 01:51:19,270
But that's OK.
2415
01:51:19,270 --> 01:51:23,890
In this case, it's a list called argv, and we're getting bracket 1 from it.
2416
01:51:23,890 --> 01:51:27,780
Let's do Python of Argv.py, Enter, Hello, world.
2417
01:51:27,780 --> 01:51:31,480
What if I do Argv.py David at the command line.
2418
01:51:31,480 --> 01:51:32,730
Now I get Hello, David.
2419
01:51:32,730 --> 01:51:34,680
So there's one curiosity here.
2420
01:51:34,680 --> 01:51:39,375
Python is not included in argv, whereas in C, dot
2421
01:51:39,375 --> 01:51:41,940
slash whatever was the first thing.
2422
01:51:41,940 --> 01:51:45,510
The analog in Python is that the name of your Python program
2423
01:51:45,510 --> 01:51:49,800
is the first thing, in bracket 0, which is why David is in bracket 1,
2424
01:51:49,800 --> 01:51:55,740
the word Python does not appear in the argv list, just to be clear.
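A sketch of the same logic. The lecture reads sys.argv directly; the greeting function here is a hypothetical wrapper that takes the list as a parameter, so the behavior can be shown without a real command line.

```python
def greeting(argv):
    """Return the greeting for a given argument list (hypothetical helper)."""
    if len(argv) == 2:
        return f"hello, {argv[1]}"
    return "hello, world"


# The interpreter's name is not in the list; the script name is argv[0].
print(greeting(["argv.py"]))
print(greeting(["argv.py", "David"]))
```

In a real script, the call would be greeting(sys.argv) after from sys import argv or import sys.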
2425
01:51:55,740 --> 01:51:57,990
But otherwise, the idea of these arguments
2426
01:51:57,990 --> 01:52:00,383
is exactly the same as before.
2427
01:52:00,383 --> 01:52:02,550
And in fact, what you can do, which is kind of cool,
2428
01:52:02,550 --> 01:52:05,730
is, because argv is a list, you can do things like this.
2429
01:52:05,730 --> 01:52:10,890
For arg in argv, go ahead and print out each argument.
2430
01:52:10,890 --> 01:52:12,990
So instead of using a for loop and i and all
2431
01:52:12,990 --> 01:52:17,220
of this, if I do Python of argv Enter, it just writes the program's name.
2432
01:52:17,220 --> 01:52:21,960
If I do Python of argv Foo, it puts Argv.py and Foo.
2433
01:52:21,960 --> 01:52:26,520
If I do, sorry, if I do Foo and bar, those words all print out.
2434
01:52:26,520 --> 01:52:28,770
If I do foo bar baz, those print out too.
2435
01:52:28,770 --> 01:52:31,830
And Foo and bar or baz are like a mathematician's x and y and z
2436
01:52:31,830 --> 01:52:35,200
for computer scientists, when you just need some placeholder words.
2437
01:52:35,200 --> 01:52:36,420
So this is just nice.
2438
01:52:36,420 --> 01:52:40,020
It reads a little more like English, and a for loop is just much more concise,
2439
01:52:40,020 --> 01:52:43,530
allows you to iterate very quickly when you want something like that.
2440
01:52:43,530 --> 01:52:46,170
Suppose I only wanted the real words that the human typed
2441
01:52:46,170 --> 01:52:47,250
after the program's name.
2442
01:52:47,250 --> 01:52:50,460
Like, suppose I want to ignore Argv.py.
2443
01:52:50,460 --> 01:52:53,640
I mean I could do something hackish like this.
2444
01:52:53,640 --> 01:52:59,105
If arg equals Argv.py, I could just ignore,
2445
01:52:59,105 --> 01:53:00,480
you know, let's invert the logic.
2446
01:53:00,480 --> 01:53:02,530
I could do this, for instance.
2447
01:53:02,530 --> 01:53:05,100
So if the arg does not equal the program name,
2448
01:53:05,100 --> 01:53:07,890
then go ahead and print out the word.
2449
01:53:07,890 --> 01:53:09,840
So I get foo, bar, and baz only.
2450
01:53:09,840 --> 01:53:14,400
Or, this is what's kind of neat about Python, too, let me undo that.
2451
01:53:14,400 --> 01:53:18,400
And let me just take a slice of the array of the list instead.
2452
01:53:18,400 --> 01:53:22,810
So it turns out, if argv is a list, I can actually say,
2453
01:53:22,810 --> 01:53:27,060
you know what, go into that list, start at element 1, instead of 0,
2454
01:53:27,060 --> 01:53:29,200
and then go all the way to the end.
2455
01:53:29,200 --> 01:53:31,800
And we have not seen this syntax in C. But this
2456
01:53:31,800 --> 01:53:34,410
is a way of slicing a list in Python.
2457
01:53:34,410 --> 01:53:35,820
So now watch what happens.
2458
01:53:35,820 --> 01:53:40,860
If I run Python of Argv.py, Foo bar baz Enter,
2459
01:53:40,860 --> 01:53:44,730
I get only a subset of the list, starting at position 1,
2460
01:53:44,730 --> 01:53:46,892
going all of the way to the end.
2461
01:53:46,892 --> 01:53:48,600
And you can even do kind of the opposite.
2462
01:53:48,600 --> 01:53:51,330
If, for whatever reason, you want to ignore the last element,
2463
01:53:51,330 --> 01:53:57,030
you can say colon, we could say colon negative 1,
2464
01:53:57,030 --> 01:53:59,560
and use a negative number, which we've not seen before,
2465
01:53:59,560 --> 01:54:02,470
which slices off the end of the list, as well.
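Both slices can be sketched on a hard-coded list standing in for sys.argv (an assumption, so the output does not depend on how the snippet is launched):

```python
argv = ["argv.py", "foo", "bar", "baz"]  # stand-in for sys.argv

for arg in argv[1:]:  # start at element 1 and go to the end
    print(arg)

print(argv[:-1])      # everything except the last element
```

A slice produces a new list, so argv itself is left unchanged.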
2466
01:54:02,470 --> 01:54:06,000
So there's some syntactic tricks that tend to be powerful in Python, too,
2467
01:54:06,000 --> 01:54:10,140
even if at first glance, you might not need them for typical things.
2468
01:54:10,140 --> 01:54:12,798
All right, let's do one other example with exit,
2469
01:54:12,798 --> 01:54:15,090
and then we'll start actually applying some algorithms,
2470
01:54:15,090 --> 01:54:16,215
to make things interesting.
2471
01:54:16,215 --> 01:54:20,470
So in one last program here, let's do Exit.py, just to do one more mechanic,
2472
01:54:20,470 --> 01:54:22,210
before we introduce some algorithms.
2473
01:54:22,210 --> 01:54:24,220
And let's do this.
2474
01:54:24,220 --> 01:54:28,900
Let's, from sys, import argv.
2475
01:54:28,900 --> 01:54:30,490
Let's now do this.
2476
01:54:30,490 --> 01:54:33,200
Let's make sure the user gives me one command line argument.
2477
01:54:33,200 --> 01:54:39,580
So if the length of argv does not equal 2 in total, then let's go ahead
2478
01:54:39,580 --> 01:54:42,790
and print out something like missing command line argument,
2479
01:54:42,790 --> 01:54:44,590
just to explain what the problem is.
2480
01:54:44,590 --> 01:54:47,380
And then let's do this.
2481
01:54:47,380 --> 01:54:48,580
We can exit.
2482
01:54:48,580 --> 01:54:50,710
But I'm going to use a better version of exit here.
2483
01:54:50,710 --> 01:54:52,900
Let me import two functions from sys.
2484
01:54:52,900 --> 01:54:57,040
Turns out the better way to do this is with sys.exit, because I can then exit
2485
01:54:57,040 --> 01:54:59,993
specifically, too, with this exit code.
2486
01:54:59,993 --> 01:55:02,410
Otherwise, down here, I'm going to go ahead and print out,
2487
01:55:02,410 --> 01:55:06,818
something like Hello, comma argv bracket 1, same as before.
2488
01:55:06,818 --> 01:55:08,360
And then I'm going to exit with zero.
2489
01:55:08,360 --> 01:55:10,410
So, again, this was a subtle thing we introduced
2490
01:55:10,410 --> 01:55:12,910
in week two, where you can actually have your programs exit,
2491
01:55:12,910 --> 01:55:15,430
with some number, where 0 signifies success,
2492
01:55:15,430 --> 01:55:17,350
and anything else signifies error.
2493
01:55:17,350 --> 01:55:19,240
This is just the same idea in Python.
2494
01:55:19,240 --> 01:55:23,920
So if I, for instance, just run the program like this, oops, I screwed up.
2495
01:55:23,920 --> 01:55:26,620
I meant to say exit here and exit here.
2496
01:55:26,620 --> 01:55:27,710
Let me do that again.
2497
01:55:27,710 --> 01:55:30,500
If I run this like this, I'm missing a command line argument.
2498
01:55:30,500 --> 01:55:33,200
So let me rerun it with like my name at the prompt.
2499
01:55:33,200 --> 01:55:37,030
So I have exactly two command line arguments, the file name and my name,
2500
01:55:37,030 --> 01:55:38,050
Hello comma David.
2501
01:55:38,050 --> 01:55:40,342
And if I do David Malan, it's not going to work either,
2502
01:55:40,342 --> 01:55:42,160
because now the length of argv does not equal 2.
2503
01:55:42,160 --> 01:55:44,860
But the difference here is that we're exiting with 1,
2504
01:55:44,860 --> 01:55:49,900
so that special programs can detect an error, or 0 in the event of success.
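A sketch of the exit-code idea, under some assumptions. The function run mirrors the program described, but takes the argument list as a parameter, and exit_code is a hypothetical helper that catches the SystemExit exception that sys.exit raises, purely so the code can be observed here without ending the interpreter.

```python
import sys


def run(argv):
    """Mirror of the program described, with argv passed in explicitly."""
    if len(argv) != 2:
        print("Missing command-line argument")
        sys.exit(1)  # nonzero signals an error
    print(f"hello, {argv[1]}")
    sys.exit(0)      # zero signals success


def exit_code(argv):
    """Hypothetical helper: report the code run() would exit with."""
    try:
        run(argv)
    except SystemExit as e:
        return e.code


print(exit_code(["exit.py"]))
print(exit_code(["exit.py", "David"]))
```

At a shell prompt, the exit code of the last command is typically available afterward as $?.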
2505
01:55:49,900 --> 01:55:52,180
And now there's one other way to do this, too.
2506
01:55:52,180 --> 01:55:54,460
Suppose that you're importing a lot of functions,
2507
01:55:54,460 --> 01:55:56,943
and you don't really want to make a mess of things
2508
01:55:56,943 --> 01:55:59,110
and just have all of these function names available,
2509
01:55:59,110 --> 01:56:01,630
without it being clear where they came from.
2510
01:56:01,630 --> 01:56:03,460
Let's just import all of sys.
2511
01:56:03,460 --> 01:56:07,180
And let's just change our syntax, kind of like I proposed for CS50,
2512
01:56:07,180 --> 01:56:09,970
where we just prepend to all of these library functions,
2513
01:56:09,970 --> 01:56:13,420
sys, just to be super-explicit where they came from,
2514
01:56:13,420 --> 01:56:18,837
and if there's another exit or argv value
2515
01:56:18,837 --> 01:56:21,920
that we want to import from a library, this is one way to avoid collision.
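To illustrate the collision point: with import sys, the library's function is always reached as sys.exit, so a function of our own that happens to share the name does not overwrite it. The local exit function below is entirely hypothetical.

```python
import sys


def exit(message):
    """A hypothetical function of our own that happens to be named exit."""
    return f"leaving: {message}"


# Our exit and the library's sys.exit are two distinct functions.
print(exit("no collision"))
print(sys.exit is not exit)
```

Had the file used from sys import exit instead, the def above would have replaced the imported name.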
2516
01:56:21,920 --> 01:56:25,150
So if I do it one last time here, missing command line argument.
2517
01:56:25,150 --> 01:56:27,190
But David still actually worked.
2518
01:56:27,190 --> 01:56:30,250
All right, only to demonstrate how we can implement that same idea.
2519
01:56:30,250 --> 01:56:33,130
Let's now do something more powerful, like a search algorithm,
2520
01:56:33,130 --> 01:56:34,032
like binary search.
2521
01:56:34,032 --> 01:56:36,490
I'm going to go ahead and open up a file called Numbers.py,
2522
01:56:36,490 --> 01:56:40,420
and let's just do some searching or linear search, rather,
2523
01:56:40,420 --> 01:56:42,440
on a list of numbers.
2524
01:56:42,440 --> 01:56:44,060
Let's go ahead and do this.
2525
01:56:44,060 --> 01:56:47,050
How about import sys as before.
2526
01:56:47,050 --> 01:56:52,840
Let me give myself a list of numbers, like 4, 6, 8, 2, 7, 5, 0,
2527
01:56:52,840 --> 01:56:54,670
so just a bunch of integers.
2528
01:56:54,670 --> 01:56:56,170
And then let's do this.
2529
01:56:56,170 --> 01:56:59,590
If you recall from week three, we searched for the number 0
2530
01:56:59,590 --> 01:57:01,880
at the end of the lockers on stage.
2531
01:57:01,880 --> 01:57:04,120
So let's just ask that question in Python.
2532
01:57:04,120 --> 01:57:05,860
No need for a loop or anything like that.
2533
01:57:05,860 --> 01:57:09,550
If 0 is in the numbers, go ahead and print out found.
2534
01:57:09,550 --> 01:57:13,420
And then let's just exit successfully, with 0, else, if we get down here,
2535
01:57:13,420 --> 01:57:15,670
let's just say print not found.
2536
01:57:15,670 --> 01:57:19,210
And then we'll sys.exit with 1.
2537
01:57:19,210 --> 01:57:21,820
So this is where Python starts to get powerful again.
2538
01:57:21,820 --> 01:57:23,050
Here's your list.
2539
01:57:23,050 --> 01:57:25,733
Here is your loop, that's doing all of the checking for you.
2540
01:57:25,733 --> 01:57:28,150
Underneath the hood, Python is going to use linear search.
2541
01:57:28,150 --> 01:57:29,817
You don't have to implement it yourself.
2542
01:57:29,817 --> 01:57:32,320
No while loop, no for loop, you just ask a question.
2543
01:57:32,320 --> 01:57:36,230
If 0 is in numbers, then do the following.
2544
01:57:36,230 --> 01:57:38,350
So that's one feature we now get with Python,
2545
01:57:38,350 --> 01:57:40,340
and get to throw away a lot of that code.
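A minimal sketch of the membership test. The sys.exit calls from the lecture version are left out, and the result variable is an assumption added so the outcome can be inspected.

```python
numbers = [4, 6, 8, 2, 7, 5, 0]

# "in" walks the list for us; no explicit loop is needed.
if 0 in numbers:
    result = "Found"
else:
    result = "Not found"

print(result)
```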
2546
01:57:40,340 --> 01:57:41,830
We can do it with strings, too.
2547
01:57:41,830 --> 01:57:44,840
Let me open a file called Names.py instead,
2548
01:57:44,840 --> 01:57:46,990
and do something that was even more involved in C,
2549
01:57:46,990 --> 01:57:50,020
because we needed strcmp and the for loop, and so forth.
2550
01:57:50,020 --> 01:57:52,000
Let me import sys for this file.
2551
01:57:52,000 --> 01:57:54,460
Let's give myself a bunch of names like we did in C.
2552
01:57:54,460 --> 01:58:01,630
And those were Bill and Charlie and Fred and George and Ginny,
2553
01:58:01,630 --> 01:58:05,440
and two more, Percy, and lastly Ron.
2554
01:58:05,440 --> 01:58:07,390
And recall, at the time, we looked for Ron.
2555
01:58:07,390 --> 01:58:09,432
And so we had to iterate through the whole thing,
2556
01:58:09,432 --> 01:58:11,810
doing strcmp and i plus plus and all of that.
2557
01:58:11,810 --> 01:58:18,760
Now just ask the question, if Ron is in names, then let's go ahead
2558
01:58:18,760 --> 01:58:20,440
and, whoops, let me hide that.
2559
01:58:20,440 --> 01:58:22,250
I hit the command too soon.
2560
01:58:22,250 --> 01:58:26,180
Let me go ahead and say print, found, as before.
2561
01:58:26,180 --> 01:58:29,710
sys.exit 0, just to indicate success, and then down here,
2562
01:58:29,710 --> 01:58:32,840
if we get to this point, we can say not found.
2563
01:58:32,840 --> 01:58:36,170
And then we'll just sys.exit 1 instead.
2564
01:58:36,170 --> 01:58:40,960
So, again, this just does linear search for us by default, Python of Names.py,
2565
01:58:40,960 --> 01:58:44,410
we found Ron, because, indeed, he's there, and at the end of the list.
2566
01:58:44,410 --> 01:58:48,190
But we don't need to deal with all of the mechanics of it.
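For comparison, here is roughly the loop that the in keyword saves us from writing. The linear_search function is a hypothetical hand-written equivalent, not something from the lecture; strings are compared with ==, so nothing like strcmp is needed.

```python
names = ["Bill", "Charlie", "Fred", "George", "Ginny", "Percy", "Ron"]


def linear_search(items, target):
    """Hand-written equivalent of `target in items` for a list."""
    for item in items:
        if item == target:
            return True
    return False


print("Ron" in names)
print(linear_search(names, "Ron"))
```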
2567
01:58:48,190 --> 01:58:50,530
All right, let's take things one step further.
2568
01:58:50,530 --> 01:58:52,840
In week three, we also implemented the idea
2569
01:58:52,840 --> 01:58:56,980
of a phone book, that actually associated keys with values.
2570
01:58:56,980 --> 01:59:00,010
But remember, the phone book in C, was kind of a hack, right?
2571
01:59:00,010 --> 01:59:03,520
Because we first had two arrays, one with names, one with numbers.
2572
01:59:03,520 --> 01:59:07,330
Then we introduced structs, and so we gave you a person structure.
2573
01:59:07,330 --> 01:59:10,900
And then we had an array of persons.
2574
01:59:10,900 --> 01:59:15,040
You can do this in Python, using objects and things called classes.
2575
01:59:15,040 --> 01:59:17,670
But we can also just use a general purpose dictionary,
2576
01:59:17,670 --> 01:59:21,420
because just like in P set 5, you can associate keys with values, using
2577
01:59:21,420 --> 01:59:23,100
a hash table, using a trie.
2578
01:59:23,100 --> 01:59:26,400
Well, similarly, can Python just do this for us.
2579
01:59:26,400 --> 01:59:29,250
From CS50, let's import get_string.
2580
01:59:29,250 --> 01:59:32,760
And now let's give myself a dictionary of people,
2581
01:59:32,760 --> 01:59:36,540
dict, open paren, closed paren, gives you a dictionary.
2582
01:59:36,540 --> 01:59:39,300
Or you can simplify the syntax, actually,
2583
01:59:39,300 --> 01:59:42,360
and a dictionary again is just keys and values, words and definitions.
2584
01:59:42,360 --> 01:59:45,060
You can also just use curly braces instead.
2585
01:59:45,060 --> 01:59:47,020
That gives me an empty dictionary.
2586
01:59:47,020 --> 01:59:50,400
But if I know what I want to put in it by default, let's put Carter in there,
2587
01:59:50,400 --> 01:59:57,790
with a number of plus 1-617-495-1000, just like last time, and put myself,
2588
01:59:57,790 --> 02:00:03,777
David, with plus 1-949-468-2750.
2589
02:00:03,777 --> 02:00:06,360
And it came to my attention, tragically, after class that day,
2590
02:00:06,360 --> 02:00:08,152
that we had a bug in our little Easter egg.
2591
02:00:08,152 --> 02:00:11,190
If today, you would like to call me or text me, at that number,
2592
02:00:11,190 --> 02:00:14,130
we have fixed the code that underlies that little Easter egg.
2593
02:00:14,130 --> 02:00:15,090
Spoiler ahead.
2594
02:00:15,090 --> 02:00:17,040
All right, so this now gives me a variable
2595
02:00:17,040 --> 02:00:21,120
called people, that's associating keys with values.
2596
02:00:21,120 --> 02:00:25,230
There is some new syntax here in Python, not just the curly braces,
2597
02:00:25,230 --> 02:00:28,290
but the colons, and the quotes on the left and the right.
2598
02:00:28,290 --> 02:00:31,380
This is a way, in Python, of associating keys
2599
02:00:31,380 --> 02:00:35,350
with values, words with definitions, anything with anything else.
2600
02:00:35,350 --> 02:00:38,550
And it's going to be a super-common paradigm, including in week seven,
2601
02:00:38,550 --> 02:00:42,450
when we look at CSS and HTML and web programming, keys and values
2602
02:00:42,450 --> 02:00:45,840
are like this omnipresent idea in computer science and programming,
2603
02:00:45,840 --> 02:00:49,300
because it's just a really useful way of associating one thing with another.
2604
02:00:49,300 --> 02:00:52,690
So, at this point in the story, we have a dictionary, a hash table,
2605
02:00:52,690 --> 02:00:56,190
if you will, of people, associating names with phone numbers,
2606
02:00:56,190 --> 02:00:57,675
just like a real world phone book.
2607
02:00:57,675 --> 02:01:01,200
So let's write a program that gets a string from the user and asks them
2608
02:01:01,200 --> 02:01:03,390
whose number they would like to look up.
2609
02:01:03,390 --> 02:01:09,510
Then, let's go ahead and say, if that name is in the people dictionary,
2610
02:01:09,510 --> 02:01:12,090
go ahead and print out that person's number,
2611
02:01:12,090 --> 02:01:14,730
by going into the people dictionary and going
2612
02:01:14,730 --> 02:01:19,480
to that specific name, within there, using an f-string for the whole thing.
2613
02:01:19,480 --> 02:01:21,960
So this is similar in spirit to before.
2614
02:01:21,960 --> 02:01:26,130
Linear search and dictionary lookups will just happen automatically for you
2615
02:01:26,130 --> 02:01:29,280
in Python, by just asking the question, if name in people.
2616
02:01:29,280 --> 02:01:31,170
And this line is just going to print out,
2617
02:01:31,170 --> 02:01:35,710
whoever is in the people dictionary, at that name.
2618
02:01:35,710 --> 02:01:40,200
So I'm using square brackets, because here's the interesting thing in Python,
2619
02:01:40,200 --> 02:01:43,320
just like you can index into an array, or a list in Python,
2620
02:01:43,320 --> 02:01:48,150
using numbers, 0, 1, 2, you can very conveniently index
2621
02:01:48,150 --> 02:01:53,080
into a dictionary in Python, using square brackets, as well.
2622
02:01:53,080 --> 02:01:56,070
And just to make clear what's going on here, let me go
2623
02:01:56,070 --> 02:02:00,480
and create a temporary variable, person equals people bracket name.
2624
02:02:00,480 --> 02:02:05,010
And then let's just, or, sorry, let's say, number equals people bracket name.
2625
02:02:05,010 --> 02:02:07,890
And that will just print out the number in question.
2626
02:02:07,890 --> 02:02:11,850
In C, and previously in Python, anything with square brackets like this
2627
02:02:11,850 --> 02:02:16,950
would have been go to a location in a list or an array, using a number.
2628
02:02:16,950 --> 02:02:20,790
But that can actually be a string, like a word the human has typed.
2629
02:02:20,790 --> 02:02:22,830
And this is what's amazing about dictionaries,
2630
02:02:22,830 --> 02:02:25,890
it's not like a big line, a big linear thing.
2631
02:02:25,890 --> 02:02:28,740
It's this table, that you can look up in one column the name,
2632
02:02:28,740 --> 02:02:31,060
and get back in the other column the number.
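A sketch of the phone book as described. The name is hard-coded as a stand-in for get_string, which is an assumption made so the snippet runs without the CS50 library.

```python
# Keys (names) on the left of each colon, values (numbers) on the right.
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}

name = "David"  # stand-in for get_string("Name: ")

if name in people:         # checks the keys of the dictionary
    number = people[name]  # index with a string instead of an integer
    print(f"Number: {number}")
```

Indexing with a name that is not a key raises a KeyError, which is why the in check comes first.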
2633
02:02:31,060 --> 02:02:33,120
So let's go ahead and run Python of Phonebook.py,
2634
02:02:33,120 --> 02:02:38,100
found, not that, oh, wait.
2635
02:02:38,100 --> 02:02:41,880
That's not what's supposed to happen at all.
2636
02:02:41,880 --> 02:02:43,440
I think I'm in the wrong place.
2637
02:02:43,440 --> 02:02:44,290
Phonebook.py.
2638
02:02:44,290 --> 02:02:47,130
2639
02:02:47,130 --> 02:02:49,260
What's going on?
2640
02:02:49,260 --> 02:02:51,720
Print found.
2641
02:02:51,720 --> 02:02:53,580
I am confused.
2642
02:02:53,580 --> 02:02:55,830
OK, let's run this again.
2643
02:02:55,830 --> 02:02:59,970
Python of Phonebook.py, what the--
2644
02:02:59,970 --> 02:03:01,050
OK, stand by.
2645
02:03:01,050 --> 02:03:07,026
2646
02:03:07,026 --> 02:03:17,902
[KEYS CLICKING]
2647
02:03:17,902 --> 02:03:19,140
What the heck?
2648
02:03:19,140 --> 02:03:21,255
What am I not understanding here?
2649
02:03:21,255 --> 02:03:24,180
2650
02:03:24,180 --> 02:03:27,348
OK, Roxanne, Carter, do you see what I'm doing wrong?
2651
02:03:27,348 --> 02:03:29,220
AUDIENCE: I don't.
2652
02:03:29,220 --> 02:03:31,484
DAVID J. MALAN: What the--
2653
02:03:31,484 --> 02:03:33,720
[LAUGHTER]
2654
02:03:33,720 --> 02:03:34,230
Say again?
2655
02:03:34,230 --> 02:03:38,110
SPEAKER 47: When you found the test results, it was doing both commands.
2656
02:03:38,110 --> 02:03:43,390
DAVID J. MALAN: Oh, yeah, found, OK, we're going to do this.
2657
02:03:43,390 --> 02:03:45,622
One sec.
2658
02:03:45,622 --> 02:03:52,270
[KEYS CLICKING]
2659
02:03:52,270 --> 02:03:55,360
Whoa, OK.
2660
02:03:55,360 --> 02:03:57,270
All this is coming out of the video.
2661
02:03:57,270 --> 02:03:58,228
So.
2662
02:03:58,228 --> 02:03:59,164
[LAUGHTER]
2663
02:03:59,164 --> 02:04:01,310
[APPLAUSE]
2664
02:04:01,310 --> 02:04:01,810
Thanks.
2665
02:04:01,810 --> 02:04:05,400
2666
02:04:05,400 --> 02:04:06,283
All right.
2667
02:04:06,283 --> 02:04:08,200
I will try to figure out what was going wrong.
2668
02:04:08,200 --> 02:04:10,800
The best I can tell, it was running the wrong program.
2669
02:04:10,800 --> 02:04:12,820
I don't quite understand why.
2670
02:04:12,820 --> 02:04:14,170
So we will diagnose this later.
2671
02:04:14,170 --> 02:04:16,962
I just put the file into a temporary directory, for now, to run it.
2672
02:04:16,962 --> 02:04:22,710
So let me go ahead and just run this, Python of Phonebook.py,
2673
02:04:22,710 --> 02:04:24,240
type in, for instance, my name.
2674
02:04:24,240 --> 02:04:26,418
And there's my corresponding number.
2675
02:04:26,418 --> 02:04:27,960
Have no idea what was just happening.
2676
02:04:27,960 --> 02:04:30,060
But I will get to the bottom of it and update you,
2677
02:04:30,060 --> 02:04:31,360
if we can put our finger on it.
2678
02:04:31,360 --> 02:04:34,890
So this was just an example, now, of implementing a phone book.
2679
02:04:34,890 --> 02:04:37,590
Let's now consider what we can do that's a little more
2680
02:04:37,590 --> 02:04:40,410
powerful, in these examples, like a phone book that
2681
02:04:40,410 --> 02:04:42,150
actually keeps this information around.
2682
02:04:42,150 --> 02:04:45,510
Thus far, these simple phone book examples throw the information away.
2683
02:04:45,510 --> 02:04:48,780
But using CSV files, comma separated values,
2684
02:04:48,780 --> 02:04:51,555
maybe we could actually keep around the names and numbers,
2685
02:04:51,555 --> 02:04:53,430
so that, like on your phone, you can actually
2686
02:04:53,430 --> 02:04:55,780
keep your contacts around long-term.
2687
02:04:55,780 --> 02:04:59,060
So I'm going to go ahead now and do a slightly different example.
2688
02:04:59,060 --> 02:05:03,240
And let me just hide this detail, so it's not confusing.
2689
02:05:03,240 --> 02:05:06,630
Whoops, I'm going to change my prompt temporarily.
2690
02:05:06,630 --> 02:05:10,540
So let me go ahead now and refine this example as follows.
2691
02:05:10,540 --> 02:05:13,830
I'm going to go into Phonebook.py, and I'm
2692
02:05:13,830 --> 02:05:16,290
going to import a whole library called CSV.
2693
02:05:16,290 --> 02:05:18,150
And this is a powerful one, because Python
2694
02:05:18,150 --> 02:05:21,870
comes with a library that just handles CSV files for you.
2695
02:05:21,870 --> 02:05:25,600
A CSV file is just a file with comma separated values.
2696
02:05:25,600 --> 02:05:29,580
And, in fact, to demonstrate this, let me check on one thing
2697
02:05:29,580 --> 02:05:32,460
here, just to make this a little more real.
2698
02:05:32,460 --> 02:05:39,010
To demonstrate this, let's go ahead and do this.
2699
02:05:39,010 --> 02:05:41,970
Let me import the CSV library. From CS50,
2700
02:05:41,970 --> 02:05:43,830
let me import get_string.
2701
02:05:43,830 --> 02:05:47,550
Let me then open a file, using the open function,
2702
02:05:47,550 --> 02:05:52,410
open a file called Phonebook.csv, in append format,
2703
02:05:52,410 --> 02:05:54,900
in contrast with read format and write format.
2704
02:05:54,900 --> 02:05:58,450
Write just blows it away if it exists, append adds to the bottom of it.
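To make the difference between the modes concrete, here is a minimal sketch that writes, appends, and then writes again to the same file. The file name `demo.txt` and the temporary folder are assumptions made for the example; the mode strings `"w"`, `"a"`, and `"r"` are the standard ones accepted by Python's built-in `open`.

```python
import os
import shutil
import tempfile

# Work in a throwaway folder so the sketch leaves nothing behind.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")  # hypothetical file name

file = open(path, "w")   # "w": create the file, or empty it if it exists
file.write("first\n")
file.close()

file = open(path, "a")   # "a": keep what is there and add to the end
file.write("second\n")
file.close()

file = open(path, "r")   # "r": read only
after_append = file.read()
file.close()

file = open(path, "w")   # "w" again: the two earlier lines are discarded
file.write("third\n")
file.close()

file = open(path, "r")
after_write = file.read()
file.close()

shutil.rmtree(folder)    # remove the throwaway folder

print(repr(after_append))  # 'first\nsecond\n'
print(repr(after_write))   # 'third\n'
```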
2705
02:05:58,450 --> 02:06:00,930
So I keep this phone book around, just like you might
2706
02:06:00,930 --> 02:06:02,868
keep adding contacts to your phone.
2707
02:06:02,868 --> 02:06:05,410
Now let me go ahead and get a couple of values from the user.
2708
02:06:05,410 --> 02:06:08,820
Let me say get_string and ask the user for a name.
2709
02:06:08,820 --> 02:06:14,160
Then let me call get_string again, and ask the user for their number.
2710
02:06:14,160 --> 02:06:16,185
And now, let me go ahead and do this.
2711
02:06:16,185 --> 02:06:18,060
And this is new, and this is Python-specific.
2712
02:06:18,060 --> 02:06:20,820
And you would only know this by following a tutorial,
2713
02:06:20,820 --> 02:06:22,480
or reading the documentation.
2714
02:06:22,480 --> 02:06:24,870
Let me give myself a variable called writer,
2715
02:06:24,870 --> 02:06:29,950
and ask the CSV library for a writer to that file.
2716
02:06:29,950 --> 02:06:33,390
Then, let me go ahead and use that writer variable,
2717
02:06:33,390 --> 02:06:36,720
use a function or a method inside of it, called writerow,
2718
02:06:36,720 --> 02:06:41,200
to write out a list containing that person's name and number.
2719
02:06:41,200 --> 02:06:44,310
Notice the square brackets inside the parentheses,
2720
02:06:44,310 --> 02:06:49,350
because I'm just printing a list to that particular row in the file.
2721
02:06:49,350 --> 02:06:51,100
And then I'm just going to close the file.
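The steps just described (open in append mode, ask the csv library for a writer, write one row as a list, close) might look roughly like the sketch below. It is not the lecture's exact file: the name and number are passed in as parameters rather than prompted for with get_string, `add_contact` is a hypothetical helper name, and the file lives in a temporary folder instead of the current directory.

```python
import csv
import os
import shutil
import tempfile


def add_contact(path, name, number):
    """Append one name,number row to the CSV file at path (hypothetical helper)."""
    # newline="" is what the csv documentation recommends when opening files.
    file = open(path, "a", newline="")
    writer = csv.writer(file)
    writer.writerow([name, number])  # a list: one element per column
    file.close()


# Demo in a temporary folder; the two contacts are the ones used in the lecture.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "phonebook.csv")
add_contact(path, "Carter", "+1-617-495-1000")
add_contact(path, "David", "+1-949-468-2750")

# Read the rows back to confirm that the second call appended, not overwrote.
file = open(path, "r", newline="")
rows = list(csv.reader(file))
file.close()
shutil.rmtree(folder)

print(rows)
```

Because the file is opened with "a", each call adds a row below whatever is already there.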
2722
02:06:51,100 --> 02:06:52,742
So what is the effect of all of this?
2723
02:06:52,742 --> 02:06:55,200
Well, let me go ahead and run this version of Phonebook.py,
2724
02:06:55,200 --> 02:06:56,680
and I'm prompted for a name.
2725
02:06:56,680 --> 02:07:05,130
Let's do Carter's first, plus 1-617-495-1000, and then,
2726
02:07:05,130 --> 02:07:07,770
let's go ahead and LS.
2727
02:07:07,770 --> 02:07:10,960
Notice in my current directory, there's two files now, Phonebook.py,
2728
02:07:10,960 --> 02:07:14,430
which I wrote, and apparently Phonebook.csv.
2729
02:07:14,430 --> 02:07:16,830
CSV just stands for comma separated values.
2730
02:07:16,830 --> 02:07:20,380
And it's like a very simple way of storing data in a spreadsheet,
2731
02:07:20,380 --> 02:07:23,670
if you will, where the comma represents the separation between your columns.
2732
02:07:23,670 --> 02:07:26,370
There's only two columns here, name and number.
2733
02:07:26,370 --> 02:07:29,580
But, because I'm writing to this file in append mode,
2734
02:07:29,580 --> 02:07:33,220
let me run it one more time, Python of Phonebook.py,
2735
02:07:33,220 --> 02:07:41,490
and let me go ahead and do David and plus 1-949-468-2750, Enter.
2736
02:07:41,490 --> 02:07:43,350
And notice what happened in the CSV file.
2737
02:07:43,350 --> 02:07:46,380
It automatically updated, because I'm now persisting
2738
02:07:46,380 --> 02:07:49,000
this data to the file in question.
2739
02:07:49,000 --> 02:07:51,360
So if I wanted to now read this file in, I
2740
02:07:51,360 --> 02:07:55,680
could actually go ahead and do linear search on the data,
2741
02:07:55,680 --> 02:07:58,650
using a read function to actually read from the CSV.
2742
02:07:58,650 --> 02:08:01,350
But, for now, we'll just leave it simply as write.
2743
02:08:01,350 --> 02:08:03,270
And let me make one refinement here.
2744
02:08:03,270 --> 02:08:07,020
It turns out that, if you're in the habit of opening a file,
2745
02:08:07,020 --> 02:08:09,330
you don't have to even close it explicitly.
2746
02:08:09,330 --> 02:08:10,920
You can instead do this.
2747
02:08:10,920 --> 02:08:16,050
You can instead say, with the opening of a file called Phonebook.csv
2748
02:08:16,050 --> 02:08:21,300
in append mode, calling the thing file, go ahead and do all of these lines
2749
02:08:21,300 --> 02:08:22,350
here.
2750
02:08:22,350 --> 02:08:24,377
So the with keyword is a new thing in Python.
2751
02:08:24,377 --> 02:08:27,210
And it's used in a few different ways, but one of the ways it's used
2752
02:08:27,210 --> 02:08:28,335
is to tighten up code here.
2753
02:08:28,335 --> 02:08:30,418
And I'm going to move my variables to the outside,
2754
02:08:30,418 --> 02:08:32,910
because they don't need to be inside of the with statement,
2755
02:08:32,910 --> 02:08:33,868
where the file is open.
2756
02:08:33,868 --> 02:08:36,452
This just has the effect of ensuring that you, the programmer,
2757
02:08:36,452 --> 02:08:38,790
don't screw up and accidentally forget to close your file.
2758
02:08:38,790 --> 02:08:40,680
In fact, you might recall, from C, Valgrind
2759
02:08:40,680 --> 02:08:45,237
might have complained at you if you had a file that you didn't close;
2760
02:08:45,237 --> 02:08:47,820
you might have had a memory leak as a result. The with keyword
2761
02:08:47,820 --> 02:08:51,840
takes care of all of that for you, as well.
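A minimal sketch of that refinement, under the same assumptions as before (values passed in as parameters, a hypothetical `add_contact` helper, and a temporary folder): the with statement closes the file when its indented block ends, so no explicit close call is needed.

```python
import csv
import os
import shutil
import tempfile


def add_contact(path, name, number):
    """Append one row using a with statement; return whether the file got closed."""
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, number])
    # Leaving the indented block closes the file, even if writerow had raised.
    return file.closed


folder = tempfile.mkdtemp()
path = os.path.join(folder, "phonebook.csv")
closed_after_write = add_contact(path, "David", "+1-949-468-2750")

with open(path, "r", newline="") as file:
    rows = list(csv.reader(file))
shutil.rmtree(folder)

print(closed_after_write)  # True
print(rows)                # [['David', '+1-949-468-2750']]
```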
2762
02:08:51,840 --> 02:08:54,670
How about let's do, want to do this.
2763
02:08:54,670 --> 02:08:57,960
How about, let's do one other thing.
2764
02:08:57,960 --> 02:08:59,230
Let's do this.
2765
02:08:59,230 --> 02:09:02,280
Let me go ahead and propose, that on your phone or laptop
2766
02:09:02,280 --> 02:09:07,470
here, or online, go to this URL here, where you'll find a Google form.
2767
02:09:07,470 --> 02:09:10,290
And just to show that these CSVs are actually kind of omnipresent,
2768
02:09:10,290 --> 02:09:11,850
and if you've ever like used a Google Form
2769
02:09:11,850 --> 02:09:13,560
or managed a student group, or something where you've
2770
02:09:13,560 --> 02:09:15,750
collected data via Google Forms, you can actually
2771
02:09:15,750 --> 02:09:18,640
export all of that data via CSV files.
2772
02:09:18,640 --> 02:09:21,150
So go ahead to this URL here.
2773
02:09:21,150 --> 02:09:22,950
And those of you watching on demand later,
2774
02:09:22,950 --> 02:09:24,540
will find that the form is no longer working,
2775
02:09:24,540 --> 02:09:26,030
since we're only doing this live.
2776
02:09:26,030 --> 02:09:27,780
But that will lead to a Google Form that's
2777
02:09:27,780 --> 02:09:30,750
going to let everyone input their answer to a question,
2778
02:09:30,750 --> 02:09:33,660
like what house do you want to end up in,
2779
02:09:33,660 --> 02:09:36,630
sort of an approximation of the sorting hat in Harry Potter.
2780
02:09:36,630 --> 02:09:40,680
And via this form, we will then have the ability to export,
2781
02:09:40,680 --> 02:09:43,780
we'll see, a CSV file.
2782
02:09:43,780 --> 02:09:47,610
So let's give you a moment to do that.
2783
02:09:47,610 --> 02:09:50,460
In just a moment, I'll share my version of the screen, which
2784
02:09:50,460 --> 02:09:54,330
is going to let me actually open the file, the form itself.
2785
02:09:54,330 --> 02:09:59,070
And in just a moment, I'll switch over.
2786
02:09:59,070 --> 02:10:01,020
OK, so this is now my version of the form
2787
02:10:01,020 --> 02:10:04,290
here, where we have 200 plus responses to a simple question of the form, what
2788
02:10:04,290 --> 02:10:08,010
house do you belong in, Gryffindor, Hufflepuff, Ravenclaw, or Slytherin.
2789
02:10:08,010 --> 02:10:12,800
If I go over to responses, I'll see all of the responses in the GUI form here.
2790
02:10:12,800 --> 02:10:15,300
So graphical user interface, and we could flip through this.
2791
02:10:15,300 --> 02:10:20,010
And it looks like, interestingly, 40% of Harvard students
2792
02:10:20,010 --> 02:10:24,223
want to be in Gryffindor, 22% in Slytherin, and everyone else
2793
02:10:24,223 --> 02:10:25,140
in between the others.
2794
02:10:25,140 --> 02:10:27,270
But you might have noticed, if ever using a Google Form,
2795
02:10:27,270 --> 02:10:28,720
this Google Spreadsheets link.
2796
02:10:28,720 --> 02:10:30,010
So I'm going to go ahead and click that.
2797
02:10:30,010 --> 02:10:32,460
And that's going to automatically open, in this case, Google Spreadsheets.
2798
02:10:32,460 --> 02:10:35,290
But you can do the same thing with Office 365 as well.
2799
02:10:35,290 --> 02:10:38,040
And now you see the raw data as a spreadsheet.
2800
02:10:38,040 --> 02:10:42,900
But in Google Spreadsheets, if I go to File and then I go to Download,
2801
02:10:42,900 --> 02:10:46,800
notice I can download this as an Excel file, a PDF, and also
2802
02:10:46,800 --> 02:10:48,910
a CSV, comma separated values.
2803
02:10:48,910 --> 02:10:50,620
So let me go ahead and do that.
2804
02:10:50,620 --> 02:10:53,920
That gives me a file in my Downloads folder on my computer.
2805
02:10:53,920 --> 02:10:57,970
I'm going to now go back to my code editor here.
2806
02:10:57,970 --> 02:11:00,180
And what I'm going to go ahead and do is upload
2807
02:11:00,180 --> 02:11:04,320
this file, from my Downloads folder to VS Code,
2808
02:11:04,320 --> 02:11:06,610
so that we can actually see it within here.
2809
02:11:06,610 --> 02:11:08,220
And now you can see this open file.
2810
02:11:08,220 --> 02:11:11,220
And I'm going to shorten its name, just so it's a little easier to read.
2811
02:11:11,220 --> 02:11:15,990
I'm going to rename this using the MV command, to just Hogwarts.csv.
2812
02:11:15,990 --> 02:11:19,367
And then we can see, in the file, that there's two columns, timestamp comma
2813
02:11:19,367 --> 02:11:21,450
house, where you have a whole bunch of time stamps
2814
02:11:21,450 --> 02:11:24,270
when people filled out the form, with someone very early in class.
2815
02:11:24,270 --> 02:11:25,980
And then everyone else just a moment ago.
2816
02:11:25,980 --> 02:11:29,310
And the second value, after each comma, is the name of the house.
2817
02:11:29,310 --> 02:11:32,040
Well, let me go ahead here and implement a program
2818
02:11:32,040 --> 02:11:36,100
in a file called Hogwarts.py, that processes this data.
2819
02:11:36,100 --> 02:11:38,280
So in Hogwarts.py, let's just write a program
2820
02:11:38,280 --> 02:11:41,440
that now reads a CSV, in this case not a phone book,
2821
02:11:41,440 --> 02:11:43,410
but everyone's sorting hat information.
2822
02:11:43,410 --> 02:11:45,450
And I'm going to go ahead and Import CSV.
2823
02:11:45,450 --> 02:11:48,660
And suppose I want to answer a reasonable question, ignoring
2824
02:11:48,660 --> 02:11:52,470
the fact that Google's GUI or graphical user interface, can do this for me.
2825
02:11:52,470 --> 02:11:55,320
I just want to count up who's going to be in which house.
2826
02:11:55,320 --> 02:11:59,640
So let me give myself a dictionary called houses, that's initially empty,
2827
02:11:59,640 --> 02:12:00,780
with curly braces.
2828
02:12:00,780 --> 02:12:02,790
And let me pre-create a few keys.
2829
02:12:02,790 --> 02:12:07,500
Let me say Gryffindor is going to be initialized to 0,
2830
02:12:07,500 --> 02:12:11,820
Hufflepuff will be initialized to 0 as well, Ravenclaw
2831
02:12:11,820 --> 02:12:13,200
will be initialized to 0.
2832
02:12:13,200 --> 02:12:16,770
And finally, Slytherin will be initialized to 0.
2833
02:12:16,770 --> 02:12:19,950
So here's another example of a dictionary, or a hash table,
2834
02:12:19,950 --> 02:12:22,140
just being a very general-purpose piece of data.
2835
02:12:22,140 --> 02:12:23,760
You can have keys and values.
2836
02:12:23,760 --> 02:12:25,470
The keys, in this case, are the houses.
2837
02:12:25,470 --> 02:12:28,500
The values are initially zero, but I'm going to use this,
2838
02:12:28,500 --> 02:12:33,600
instead of like four separate variables, to keep track of everyone's answer
2839
02:12:33,600 --> 02:12:34,730
to this form.
2840
02:12:34,730 --> 02:12:35,730
So I'm going to do this.
2841
02:12:35,730 --> 02:12:43,180
With opening Hogwarts.csv, in read mode, not append, I don't want to change it.
2842
02:12:43,180 --> 02:12:46,440
I just want to read it, as file as my variable name.
2843
02:12:46,440 --> 02:12:49,530
Let's go ahead and create a reader this time,
2844
02:12:49,530 --> 02:12:54,710
that is using the reader function in the CSV library, by passing in that file.
2845
02:12:54,710 --> 02:12:57,210
I'm going to go ahead and ignore the first line of the file,
2846
02:12:57,210 --> 02:13:00,270
because, recall, that the first line is just timestamp and house.
2847
02:13:00,270 --> 02:13:01,450
I want to get the real data.
2848
02:13:01,450 --> 02:13:03,540
So this next function is just a little trick
2849
02:13:03,540 --> 02:13:06,730
for ignoring the first line of the file.
2850
02:13:06,730 --> 02:13:07,800
Then let's do this.
2851
02:13:07,800 --> 02:13:12,180
For every other row in the reader, that is line by line,
2852
02:13:12,180 --> 02:13:15,420
get the current person's house, which is in row bracket 1.
2853
02:13:15,420 --> 02:13:18,213
This is what the CSV reader library is doing for us.
2854
02:13:18,213 --> 02:13:20,130
It's handling all of the reading of this file.
2855
02:13:20,130 --> 02:13:23,760
It figures out where the comma is, and, for every row in the file,
2856
02:13:23,760 --> 02:13:26,250
it hands you back a list of size 2.
2857
02:13:26,250 --> 02:13:31,090
In bracket 0 is the time stamp, in bracket 1 is the house name.
2858
02:13:31,090 --> 02:13:34,830
So, in my code, I can say house equals row bracket 1.
2859
02:13:34,830 --> 02:13:36,970
I don't care about the time stamp for this program.
2860
02:13:36,970 --> 02:13:41,070
And then let's go into my dictionary called houses, plural, index
2861
02:13:41,070 --> 02:13:47,370
into it at the house location, by its name, and increment that 0 to 1.
2862
02:13:47,370 --> 02:13:50,280
And now, at the end of this block of code,
2863
02:13:50,280 --> 02:13:53,040
that has the effect of iterating over every line of the file,
2864
02:13:53,040 --> 02:13:55,470
updating my dictionary in four different places,
2865
02:13:55,470 --> 02:13:59,190
based on whether someone typed Gryffindor or Slytherin or anything
2866
02:13:59,190 --> 02:13:59,700
else.
2867
02:13:59,700 --> 02:14:03,810
And notice that I'm using the name of the house to index into my dictionary,
2868
02:14:03,810 --> 02:14:07,500
to essentially go up to this little cheat sheet and change the 0 to a 1,
2869
02:14:07,500 --> 02:14:10,020
the 1 to a 2, the 2 to a 3, instead of having
2870
02:14:10,020 --> 02:14:12,000
like four separate variables, which would just
2871
02:14:12,000 --> 02:14:14,070
be much more annoying to maintain.
2872
02:14:14,070 --> 02:14:16,290
Down at the bottom, let's just print out the results.
2873
02:14:16,290 --> 02:14:19,620
For each house in those houses, iterating over
2874
02:14:19,620 --> 02:14:21,750
the keys therein, by default in Python,
2875
02:14:21,750 --> 02:14:24,630
let's go ahead and print out an f-string that says,
2876
02:14:24,630 --> 02:14:29,460
the current house has the current count.
2877
02:14:29,460 --> 02:14:35,070
And count will be the result of indexing into houses, for that given house.
2878
02:14:35,070 --> 02:14:36,810
And let me close my quote.
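Put together, the program described above might look something like the following sketch. The sample rows (including the timestamps and the exact header spelling "Timestamp,House") are made up so the example is self-contained; they are not the class's actual responses, and `count_houses` is a hypothetical function name.

```python
import csv
import os
import shutil
import tempfile


def count_houses(path):
    """Count how many rows name each house, reading the CSV with csv.reader."""
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}
    with open(path, "r", newline="") as file:
        reader = csv.reader(file)
        next(reader)              # skip the header row
        for row in reader:
            house = row[1]        # row[0] is the timestamp, row[1] is the house
            houses[house] += 1
    return houses


# Made-up sample data standing in for the exported form responses.
sample = (
    "Timestamp,House\n"
    "10/1/2021 13:00:00,Gryffindor\n"
    "10/1/2021 13:00:05,Slytherin\n"
    "10/1/2021 13:00:09,Gryffindor\n"
    "10/1/2021 13:00:12,Ravenclaw\n"
)
folder = tempfile.mkdtemp()
path = os.path.join(folder, "hogwarts.csv")
with open(path, "w", newline="") as file:
    file.write(sample)

counts = count_houses(path)
shutil.rmtree(folder)

# Iterating over a dict visits its keys.
for house in counts:
    count = counts[house]
    print(f"{house}: {count}")
```

Pre-creating the four keys at 0 is what makes `houses[house] += 1` safe; a house name that is not one of those keys would raise a KeyError in this sketch.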
2879
02:14:36,810 --> 02:14:41,940
So let's run this to summarize the data, Hogwarts.py, 140 of you
2880
02:14:41,940 --> 02:14:46,200
answered Gryffindor, 54 Hufflepuff, 72 Ravenclaw, and 80 of you Slytherin.
2881
02:14:46,200 --> 02:14:48,570
And that's with just, now, a few lines of code, and this is, oh,
2882
02:14:48,570 --> 02:14:52,227
my God, so much easier than C, to actually analyze data in this way.
2883
02:14:52,227 --> 02:14:55,560
And one of the reasons that Python is so popular for data science and analytics,
2884
02:14:55,560 --> 02:14:59,910
more generally, is that it's actually really easy to manipulate data, and run
2885
02:14:59,910 --> 02:15:00,940
analytics like this.
2886
02:15:00,940 --> 02:15:02,370
And let me clean this up slightly.
2887
02:15:02,370 --> 02:15:05,160
It's a little annoying that I just have to know and trust
2888
02:15:05,160 --> 02:15:10,410
that the house name is in bracket 1 and timestamp is in bracket 0.
2889
02:15:10,410 --> 02:15:11,440
Let's clean this up.
2890
02:15:11,440 --> 02:15:16,530
There's something called a Dictionary Reader in the CSV library
2891
02:15:16,530 --> 02:15:17,880
that I can use instead.
2892
02:15:17,880 --> 02:15:22,470
Capital D, capital R, this means I can throw away this next thing,
2893
02:15:22,470 --> 02:15:24,900
because what a dictionary reader does is it
2894
02:15:24,900 --> 02:15:28,890
still returns to me every row from the file, one after the other,
2895
02:15:28,890 --> 02:15:32,560
but it doesn't just give me a list of size 2 representing each row.
2896
02:15:32,560 --> 02:15:33,960
It gives me a dictionary.
2897
02:15:33,960 --> 02:15:39,000
And it uses, as the keys in that dictionary, timestamp and house,
2898
02:15:39,000 --> 02:15:41,460
for every row in the file, which is just to say
2899
02:15:41,460 --> 02:15:43,950
it makes my code a little more readable, because instead
2900
02:15:43,950 --> 02:15:46,590
of doing this little trickery, bracket 1,
2901
02:15:46,590 --> 02:15:49,500
I can say bracket, quote unquote, "House", with a capital H,
2902
02:15:49,500 --> 02:15:52,360
because it's capitalized in the Google Form itself.
2903
02:15:52,360 --> 02:15:54,798
So the code now is just minorly different,
2904
02:15:54,798 --> 02:15:57,840
but it's way more resilient, especially if I'm using Google Spreadsheets,
2905
02:15:57,840 --> 02:16:00,390
and I'm moving the columns around or doing something like that,
2906
02:16:00,390 --> 02:16:01,973
where the numbers might get messed up.
2907
02:16:01,973 --> 02:16:05,260
Now I can run Hogwarts.py again, and I get the same answers.
2908
02:16:05,260 --> 02:16:09,960
But I now don't have to worry about where those individual columns are.
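Here is a rough sketch of the dictionary reader version. To illustrate the point about columns moving around, the made-up sample data below deliberately puts the House column first; the header names and rows are assumptions for the example, and `count_houses` is again a hypothetical function name.

```python
import csv
import os
import shutil
import tempfile


def count_houses(path):
    """Count houses by column name, so column order no longer matters."""
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}
    with open(path, "r", newline="") as file:
        reader = csv.DictReader(file)   # uses the first row as the dict keys
        for row in reader:              # no next(reader) needed for the header
            house = row["House"]
            houses[house] += 1
    return houses


# Made-up sample data with the columns swapped relative to the original export.
sample = (
    "House,Timestamp\n"
    "Slytherin,10/1/2021 13:00:00\n"
    "Hufflepuff,10/1/2021 13:00:04\n"
    "Slytherin,10/1/2021 13:00:07\n"
)
folder = tempfile.mkdtemp()
path = os.path.join(folder, "hogwarts.csv")
with open(path, "w", newline="") as file:
    file.write(sample)

counts = count_houses(path)
shutil.rmtree(folder)

for house in counts:
    print(f"{house}: {counts[house]}")
```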
2909
02:16:09,960 --> 02:16:14,880
All right, any questions on those capabilities there?
2910
02:16:14,880 --> 02:16:17,400
And that's a teaser of sorts, for some of the manipulation
2911
02:16:17,400 --> 02:16:19,620
we'll do in P set 6.
2912
02:16:19,620 --> 02:16:23,555
All right, so some final examples and flair, to intrigue
2913
02:16:23,555 --> 02:16:24,930
with what you can do with Python.
2914
02:16:24,930 --> 02:16:28,710
I'm going to actually switch over to a terminal window on my own Mac,
2915
02:16:28,710 --> 02:16:31,900
so that I can actually use audio a little more effectively.
2916
02:16:31,900 --> 02:16:33,930
So here's just a terminal window on Mac OS.
2917
02:16:33,930 --> 02:16:37,950
I before class have preinstalled some additional Python libraries,
2918
02:16:37,950 --> 02:16:40,379
that won't really work in VS Code in the cloud,
2919
02:16:40,379 --> 02:16:43,535
because they require audio that the browser won't necessarily support.
2920
02:16:43,535 --> 02:16:45,660
But I'm going to go ahead and write an example here
2921
02:16:45,660 --> 02:16:49,559
that involves writing a speech-based program, that actually does something
2922
02:16:49,559 --> 02:16:50,212
with speech.
2923
02:16:50,212 --> 02:16:52,170
And I'm going to go ahead and import a library,
2924
02:16:52,170 --> 02:16:55,709
that, again, I pre-installed, called pyttsx3, Python text to speech,
2925
02:16:55,709 --> 02:16:58,260
and I'm going to go ahead and, per its documentation,
2926
02:16:58,260 --> 02:17:02,879
give myself a speech engine, by using that library's init function,
2927
02:17:02,879 --> 02:17:04,080
for initialize.
2928
02:17:04,080 --> 02:17:06,930
I'm then going to use this engine's say function
2929
02:17:06,930 --> 02:17:09,180
to do something fun, like Hello, world.
2930
02:17:09,180 --> 02:17:12,480
And then I'm going to go ahead and tell this engine to run and wait,
2931
02:17:12,480 --> 02:17:13,855
while it says those words.
2932
02:17:13,855 --> 02:17:15,480
All right, I'm going to save this file.
2933
02:17:15,480 --> 02:17:16,980
I'm not using VS Code at the moment.
2934
02:17:16,980 --> 02:17:20,070
I'm using another popular program that we used in CS50 back in my day,
2935
02:17:20,070 --> 02:17:22,830
called Vim, which is a command line program that's
2936
02:17:22,830 --> 02:17:24,790
just in this black and white window.
2937
02:17:24,790 --> 02:17:28,849
Let me go ahead now and run Python of Speech.py, and--
2938
02:17:28,849 --> 02:17:30,745
COMPUTER: Hello, world.
2939
02:17:30,745 --> 02:17:33,120
DAVID J. MALAN: All right, so it's a little computerized,
2940
02:17:33,120 --> 02:17:36,113
but it is speech that has been synthesized from this example.
2941
02:17:36,113 --> 02:17:38,280
Let's change it a little bit to be more interesting.
2942
02:17:38,280 --> 02:17:39,488
Let's do something like this.
2943
02:17:39,488 --> 02:17:43,950
Let's ask the user for their name, like what's your name question mark.
2944
02:17:43,950 --> 02:17:47,850
And then, let's use the little F string, and say, not Hello, world,
2945
02:17:47,850 --> 02:17:50,010
but Hello to that person's name.
2946
02:17:50,010 --> 02:17:54,270
Let me save my file, run Python of Speech.py, Enter.
2947
02:17:54,270 --> 02:17:55,260
David.
2948
02:17:55,260 --> 02:17:57,360
COMPUTER: Hello, David.
2949
02:17:57,360 --> 02:17:59,639
DAVID J. MALAN: All right, so it pronounced my name OK,
2950
02:17:59,639 --> 02:18:02,306
might struggle with different names, depending on the phonetics.
2951
02:18:02,306 --> 02:18:03,570
But that one seemed to be OK.
2952
02:18:03,570 --> 02:18:05,850
Let's do something else with Python, using similarly,
2953
02:18:05,850 --> 02:18:07,780
just a few lines of code.
2954
02:18:07,780 --> 02:18:12,540
Let me go into today's examples.
2955
02:18:12,540 --> 02:18:18,330
And I'm going to go into a folder called Detect, whoops, a folder called
2956
02:18:18,330 --> 02:18:19,680
Faces.py.
2957
02:18:19,680 --> 02:18:20,790
Sorry, Faces.
2958
02:18:20,790 --> 02:18:23,370
And in this folder, that I've written in advance,
2959
02:18:23,370 --> 02:18:25,879
are a few files, Detect.py, Recognize.py,
2960
02:18:25,879 --> 02:18:30,330
and two photos, Office.jpeg and Toby.jpeg.
2961
02:18:30,330 --> 02:18:32,799
If you're familiar with the show, here, for instance,
2962
02:18:32,799 --> 02:18:34,809
is the cast photo from The Office here.
2963
02:18:34,809 --> 02:18:36,299
So here's a photo as input.
2964
02:18:36,299 --> 02:18:38,639
Suppose I want to do something very Facebook-style,
2965
02:18:38,639 --> 02:18:40,860
where I want to analyze all of the faces,
2966
02:18:40,860 --> 02:18:42,870
or detect all of the faces in there.
2967
02:18:42,870 --> 02:18:44,940
Well, let me go ahead and show you a program
2968
02:18:44,940 --> 02:18:47,879
I wrote in advance, that's not terribly long.
2969
02:18:47,879 --> 02:18:49,379
Much of it is actually comments.
2970
02:18:49,379 --> 02:18:50,639
But let's see what I'm doing.
2971
02:18:50,639 --> 02:18:54,000
I'm importing the Pillow library, again, to get access to images.
2972
02:18:54,000 --> 02:18:57,480
I'm importing a library called face recognition, which I downloaded
2973
02:18:57,480 --> 02:18:58,590
and installed in advance.
2974
02:18:58,590 --> 02:19:00,129
But it does what it says.
2975
02:19:00,129 --> 02:19:02,959
According to its documentation, you go into that library
2976
02:19:02,959 --> 02:19:04,760
and you call a function called load image
2977
02:19:04,760 --> 02:19:07,370
file, to load something like Office.jpeg,
2978
02:19:07,370 --> 02:19:10,040
and then you can use the line of code like this.
2979
02:19:10,040 --> 02:19:14,120
Call a function called face locations, passing the image as input,
2980
02:19:14,120 --> 02:19:17,120
and you get back a list of all of the faces in the image.
2981
02:19:17,120 --> 02:19:20,750
And then down here, a for loop, that iterates over all of those
2982
02:19:20,750 --> 02:19:22,040
face locations.
2983
02:19:22,040 --> 02:19:24,799
And inside of this loop, I just do a bit of trickery.
2984
02:19:24,799 --> 02:19:29,580
I figure out the top, right, bottom, and left corners of those locations.
2985
02:19:29,580 --> 02:19:31,940
And then, using these lines of code here,
2986
02:19:31,940 --> 02:19:34,834
I'm using that image library, to just draw a box, essentially.
2987
02:19:34,834 --> 02:19:35,959
And the code looks cryptic.
2988
02:19:35,959 --> 02:19:38,150
Honestly, I would have to look this up to write it again.
2989
02:19:38,150 --> 02:19:40,650
But per the documentation, this just draws a nice little box
2990
02:19:40,650 --> 02:19:41,610
around the image.
2991
02:19:41,610 --> 02:19:48,200
So let me go ahead and zoom out here, and run this now on Office.jpeg.
2992
02:19:48,200 --> 02:19:53,390
All right, it's analyzing, analyzing, and you can see in the sidebar here,
2993
02:19:53,390 --> 02:19:54,380
here's the original.
2994
02:19:54,380 --> 02:19:59,180
And here is every face that my, what, 10 lines of Python code
2995
02:19:59,180 --> 02:20:00,740
found, within that file.
2996
02:20:00,740 --> 02:20:01,410
What's a face?
2997
02:20:01,410 --> 02:20:04,190
Presumably the library is looking for something,
2998
02:20:04,190 --> 02:20:07,100
maybe without a mask, that has two eyes, a nose, and a mouth,
2999
02:20:07,100 --> 02:20:09,420
in some kind of arrangement, some kind of pattern.
3000
02:20:09,420 --> 02:20:12,440
So it would seem pretty reliable, at least on these fairly easy-to-read
3001
02:20:12,440 --> 02:20:13,370
faces here.
3002
02:20:13,370 --> 02:20:15,660
What if we want to look for someone specific,
3003
02:20:15,660 --> 02:20:17,180
for instance, someone that's always getting picked on.
3004
02:20:17,180 --> 02:20:18,763
Well, we could do something like this.
3005
02:20:18,763 --> 02:20:23,060
Recognize.py, which is taking two files as input, that image and the image
3006
02:20:23,060 --> 02:20:24,620
of one person in particular.
3007
02:20:24,620 --> 02:20:26,900
And if you're trying to find Toby in a crowd,
3008
02:20:26,900 --> 02:20:29,570
here I conflated the program, sorry, this is the version that
3009
02:20:29,570 --> 02:20:31,550
draws a box around the given face.
3010
02:20:31,550 --> 02:20:33,680
Here we have Toby as identified.
3011
02:20:33,680 --> 02:20:34,220
Why?
3012
02:20:34,220 --> 02:20:38,450
Because that program, Recognize.py, has a few more lines of code,
3013
02:20:38,450 --> 02:20:42,800
but long story short, it additionally loads as input Toby.jpeg,
3014
02:20:42,800 --> 02:20:45,410
in order to recognize that specific face.
3015
02:20:45,410 --> 02:20:48,350
And that specific face is a completely different photo,
3016
02:20:48,350 --> 02:20:52,970
but it looks similar enough to the person, that it all worked out OK.
3017
02:20:52,970 --> 02:20:55,820
Let's do one other that's a little sensitive to microphones.
3018
02:20:55,820 --> 02:21:00,650
Let me go into, how about my listen folder here, which is available
3019
02:21:00,650 --> 02:21:01,610
online, too.
3020
02:21:01,610 --> 02:21:04,380
And let's just run Python of Listen0.py.
3021
02:21:04,380 --> 02:21:07,430
I'm going to type in like David.
3022
02:21:07,430 --> 02:21:10,520
Oh, sorry, no, I'm going to--
3023
02:21:10,520 --> 02:21:11,150
Hello, world.
3024
02:21:11,150 --> 02:21:16,045
3025
02:21:16,045 --> 02:21:17,420
Oh, no, that's the wrong version.
3026
02:21:17,420 --> 02:21:19,250
[CHUCKLES] OK, I looked like an idiot.
3027
02:21:19,250 --> 02:21:21,500
OK, hello, there we go.
3028
02:21:21,500 --> 02:21:22,310
Hello to you, too.
3029
02:21:22,310 --> 02:21:26,300
And if I say goodbye, I'm talking to my laptop like an idiot, OK.
3030
02:21:26,300 --> 02:21:28,590
Now it's detecting what I'm saying here.
3031
02:21:28,590 --> 02:21:32,130
So this first version of the program is just using some relatively simple, if
3032
02:21:32,130 --> 02:21:36,472
elif elif, and it's just asking for input, forcing it to lowercase.
3033
02:21:36,472 --> 02:21:38,430
And that was my mistake with the first example.
3034
02:21:38,430 --> 02:21:41,360
And then, I'm just checking, is Hello in the user's words?
3035
02:21:41,360 --> 02:21:42,818
Is how are you in the user's words?
3036
02:21:42,818 --> 02:21:44,152
Didn't see that, but it's there.
3037
02:21:44,152 --> 02:21:45,470
Is goodbye in the user's words?
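The checks just described can be sketched as a small function. The reply strings and the name `respond` are assumptions made for illustration, and the text is passed in as an argument rather than read from the keyboard so the sketch runs on its own.

```python
def respond(words):
    """Pick a reply based on which phrase appears in the user's words."""
    words = words.lower()              # force to lowercase before comparing
    if "hello" in words:               # substring test, not an exact match
        return "Hello to you too!"
    elif "how are you" in words:
        return "I am well, thanks!"
    elif "goodbye" in words:
        return "Goodbye to you too!"
    else:
        return "Huh?"


print(respond("Hello, world"))   # Hello to you too!
print(respond("HOW ARE YOU?"))   # I am well, thanks!
print(respond("ok, goodbye"))    # Goodbye to you too!
print(respond("something else")) # Huh?
```

Lowercasing first is what lets "Hello" and "HELLO" match the same check.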
3038
02:21:45,470 --> 02:21:49,280
Now let's do a cooler version, using a library, just by looking at the effect.
3039
02:21:49,280 --> 02:21:51,140
Python of Listen1.py.
3040
02:21:51,140 --> 02:21:55,685
Hello, world.
3041
02:21:55,685 --> 02:21:56,720
Huh.
3042
02:21:56,720 --> 02:22:04,170
Let's do version 2 of this, that uses an audio speech-to-text library.
3043
02:22:04,170 --> 02:22:07,160
Hello, world.
3044
02:22:07,160 --> 02:22:09,710
OK, so now it's artificial intelligence.
3045
02:22:09,710 --> 02:22:11,810
Now let's do something a little more interesting.
3046
02:22:11,810 --> 02:22:15,230
The third version of this program that actually analyzes the words that are
3047
02:22:15,230 --> 02:22:16,880
said.
3048
02:22:16,880 --> 02:22:18,800
Hello, world, my name is David.
3049
02:22:18,800 --> 02:22:19,700
How are you?
3050
02:22:19,700 --> 02:22:22,760
3051
02:22:22,760 --> 02:22:26,000
OK, so that time, it not only analyzed what I said,
3052
02:22:26,000 --> 02:22:27,930
but it plucked my name out of it.
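One way a program might pluck a name out of a sentence is with a regular expression. This is only a sketch under that assumption: the transcript does not say how the third version does it, and the function name extract_name and the pattern are hypothetical.

```python
import re


def extract_name(words):
    """Return the word following "my name is", or None if it is absent."""
    # (\w+) captures the first run of letters after the phrase;
    # IGNORECASE lets "My name is" match as well.
    match = re.search(r"my name is (\w+)", words, re.IGNORECASE)
    return match.group(1) if match else None
```

With the sentence spoken above, extract_name("Hello, world, my name is David. How are you?") returns "David".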
3053
02:22:27,930 --> 02:22:30,480
Let's do two final examples.
3054
02:22:30,480 --> 02:22:33,150
This one will generate a QR code.
3055
02:22:33,150 --> 02:22:35,120
Let me go ahead and write a program called
3056
02:22:35,120 --> 02:22:39,030
qr.py, that very simply does this.
3057
02:22:39,030 --> 02:22:40,820
Let me import a library called os.
3058
02:22:40,820 --> 02:22:43,230
Let me import a library called qrcode.
3059
02:22:43,230 --> 02:22:48,000
Let me grab an image here, that's qrcode.make.
3060
02:22:48,000 --> 02:22:51,440
And let me give you the URL of like a lecture video on YouTube, or something
3061
02:22:51,440 --> 02:22:55,040
like that, with this ID.
3062
02:22:55,040 --> 02:22:59,840
Let me just type this, so I don't get it wrong.
3063
02:22:59,840 --> 02:23:05,300
OK, so if I now use this URL here, of a video on YouTube, making
3064
02:23:05,300 --> 02:23:07,812
sure I haven't made any typos, I'm now going
3065
02:23:07,812 --> 02:23:09,770
to go ahead and do two lines of code in Python.
3066
02:23:09,770 --> 02:23:13,460
I'm going to first save that as a file called qr.png, which is
3067
02:23:13,460 --> 02:23:15,490
a two-dimensional barcode, a QR code.
3068
02:23:15,490 --> 02:23:17,240
And, indeed, I'm going to use this format.
3069
02:23:17,240 --> 02:23:23,790
And I'm going to use the os.system function to open qr.png automatically.
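The steps just described can be sketched roughly as follows, assuming the third-party qrcode package is installed with Pillow support (pip install "qrcode[pil]"). The helper names make_qr and viewer_command are hypothetical, and the per-platform opener commands are an assumption added here; the lecture only says that os.system is used to open the file.

```python
import sys


def viewer_command(path, platform=sys.platform):
    """Build a shell command that opens path in the default viewer."""
    if platform == "darwin":
        return f'open "{path}"'        # macOS
    if platform.startswith("win"):
        return f'start "" "{path}"'    # Windows
    return f'xdg-open "{path}"'        # most Linux desktops


def make_qr(url, path="qr.png"):
    """Generate a QR code for url and save it as a PNG at path."""
    # Imported here so viewer_command works without the package installed.
    import qrcode

    img = qrcode.make(url)
    img.save(path, "PNG")
    return path
```

Calling os.system(viewer_command(make_qr(url))) after import os would then save the image and open it. The url must be supplied by the caller, since the video ID used in the lecture does not appear in the transcript.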
3070
02:23:23,790 --> 02:23:26,090
And if you'd like to take out your phone at this point,
3071
02:23:26,090 --> 02:23:32,270
you can see the result of my barcode, that's just been dynamically generated.
3072
02:23:32,270 --> 02:23:33,785
Hopefully from afar that will scan.
3073
02:23:33,785 --> 02:23:37,355
3074
02:23:37,355 --> 02:23:40,150
[UPROAR]
3075
02:23:40,150 --> 02:23:42,460
And I think that's an appropriate line to end on.
3076
02:23:42,460 --> 02:23:43,860
So that's it for CS50.
3077
02:23:43,860 --> 02:23:46,020
We will see you next time.
3078
02:23:46,020 --> 02:23:47,820
[APPLAUSE]
3079
02:23:47,820 --> 02:23:51,470
[MUSIC PLAYING]
3080
02:23:51,470 --> 02:24:25,000