All language subtitles for 02 - Measuring Performance - Introduction.en

af Afrikaans
sq Albanian
am Amharic
ar Arabic Download
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bn Bengali
bs Bosnian
bg Bulgarian
ca Catalan
ceb Cebuano
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
tl Filipino
fi Finnish
fr French
fy Frisian
gl Galician
ka Georgian
de German
el Greek
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
km Khmer
ko Korean
ku Kurdish (Kurmanji)
ky Kyrgyz
lo Lao
la Latin
lv Latvian
lt Lithuanian
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mn Mongolian
my Myanmar (Burmese)
ne Nepali
no Norwegian
ps Pashto
fa Persian
pl Polish
pt Portuguese
pa Punjabi
ro Romanian
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
st Sesotho
sn Shona
sd Sindhi
si Sinhala
sk Slovak
sl Slovenian
so Somali
es Spanish Download
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
te Telugu
th Thai
tr Turkish
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
or Odia (Oriya)
rw Kinyarwanda
tk Turkmen
tt Tatar
ug Uyghur
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 00:00:09,000 --> 00:00:14,000 focus on how to measure and compare performance of different data structures and algorithms. 2 00:00:14,000 --> 00:00:21,000 This specific module is a bit special, in that it presents a toolbox that we will use intensively throughout 3 00:00:21,000 --> 00:00:28,000 the rest of the course. Some of the concepts presented may seem odd or abstract at first, and they might 4 00:00:28,000 --> 00:00:34,000 take some time to get used to, but hold on, once you are through this module and we have established a common 5 00:00:34,000 --> 00:00:40,000 language for performance complexity comparison, the remaining modules will be much more concrete. 6 00:00:40,000 --> 00:00:46,000 We just need this toolbox first. So, without further ado, let's consider 7 00:00:46,000 --> 00:00:54,000 a few different ways of measure performance. So, one strategy is to use a stopwatch to simply measure the 8 00:00:54,000 --> 00:01:00,000 time it takes for a given action to complete. That was actually what we did in the introduction, 9 00:01:00,000 --> 00:01:05,000 comparing two implementations on an iPhone and a laptop respectively. 10 00:01:05,000 --> 00:01:12,000 However, that approach comes with some drawbacks, because a lot of things are in play when measuring one 11 00:01:12,000 --> 00:01:18,000 thing as the hardware the executes the implementation, where more capable hardware may result in a faster 12 00:01:18,000 --> 00:01:27,000 execution time, even though the algorithm implemented in itself is just the same or maybe even slightly worse 13 00:01:27,000 --> 00:01:33,000 than an algorithm running on some less capable hardware. Other things can influence the measure time, too, 14 00:01:33,000 --> 00:01:42,000 such as the compiler optimizations, and possible simultaneously running programs that may eat resources from our machine. 15 00:01:42,000 --> 00:01:50,000 Measuring execution time with a stopwatch has another drawback, namely the fact that it is only a single measuring point. 16 00:01:50,000 --> 00:01:57,000 Specifically, that single measurement does not say anything about how the execution time may behave for 17 00:01:57,000 --> 00:02:04,000 smaller or larger inputs; for example, how the execution time may grow for our log reader when more log 18 00:02:04,000 --> 00:02:10,000 lines are supposed to be analyzed. We can achieve a better understanding on the nature of the performance 19 00:02:10,000 --> 00:02:17,000 behavior for varying input sizes by considering the number of instructions executed by the machine when 20 00:02:17,000 --> 00:02:23,000 running a given program. If we're shown that each instruction takes a certain fraction of a second to 21 00:02:23,000 --> 00:02:31,000 execute, then the total time usage is just a matter of multiplying this per instruction time with the total 22 00:02:31,000 --> 00:02:39,000 number of instructions. For example, consider the function from the intro that just reads the provided log 23 00:02:39,000 --> 00:02:46,000 file and counts the number of log lines. We could assume that the two first lines in the function uses one 24 00:02:46,000 --> 00:02:53,000 instruction each, summarizing to two instructions. And for the sake of simplicity, that the contents of the 25 00:02:53,000 --> 00:03:04,000 for loop uses four instructions. If the log file contains 100,000 lines, we thus end up with an expression 26 00:03:04,000 --> 00:03:10,000 for the total number of instructions executed that looks like this. 27 00:03:10,000 --> 00:03:19,000 The last + 1 comes from the return statement. If we, instead of assuming 100,000 log lines are using the 28 00:03:19,000 --> 00:03:28,000 letter N to symbolize any number of log lines, then we get a more general expression that looks like this. 29 00:03:28,000 --> 00:03:34,000 Now we can get an idea of the complexity of an algorithm for a given input size simply by inserting an 30 00:03:34,000 --> 00:03:42,000 appropriate number of N in the formal layer. An interesting question we could now ask is how this expression 31 00:03:42,000 --> 00:03:51,000 grows when N increases, and this leads us to the third approach of measuring performance, namely, to look at 32 00:03:51,000 --> 00:03:58,000 the nature of the curve that the instruction count formally from before represents. 33 00:03:58,000 --> 00:04:04,000 And remember that the number of instructions directly relates to the execution time. 34 00:04:04,000 --> 00:04:13,000 Consider the following examples of such curves. The formula from before represented a straight line like this. 35 00:04:13,000 --> 00:04:21,000 A straight line means that the execution time is somewhat reliable, so that an increment in the input size, 36 00:04:21,000 --> 00:04:32,000 or N as we named it, of 500, say, from 1000 to 1500, results in the same execution time growth than an 37 00:04:32,000 --> 00:04:41,000 increment of the input size from, say, 10,000 to 10,500. Here is another example of a line where the 38 00:04:41,000 --> 00:04:47,000 execution time increases slightly faster. Often when the curve looks like this we are happy, 39 00:04:47,000 --> 00:04:53,000 because then our algorithm does not get relatively more cumbersome for a larger input. 40 00:04:53,000 --> 00:05:01,000 We also say that the execution time grows linearly with the input size, and linearity is good. 41 00:05:01,000 --> 00:05:10,000 An example of an even better looking curve is this one. And as you can see, increasing the input size with 42 00:05:10,000 --> 00:05:19,000 a certain amount has a higher relative impact on the execution time for smaller inputs than for larger inputs. 43 00:05:19,000 --> 00:05:28,000 That is, increasing the input size with, say, 500 again, will result in almost no additional execution time 44 00:05:28,000 --> 00:05:33,000 when the input is sufficiently large. Of course, this is a quite ideal performance behavior for an 45 00:05:33,000 --> 00:05:41,000 algorithm, and as we shall see later on, there are ways of searching for data that has exactly this property. 46 00:05:41,000 --> 00:05:49,000 And this basically means that the extra time we need to find the needle in the haystack only increases a tiny 47 00:05:49,000 --> 00:05:57,000 bit, even if the haystack grows significantly in size. On the other end of the scale is a curve like this. 48 00:05:57,000 --> 00:06:02,000 As you can see, this curve has opposite properties than a formal one. 49 00:06:02,000 --> 00:06:09,000 Even though the execution time only grows slowly for small input sizes, the execution time will explode in 50 00:06:09,000 --> 00:06:17,000 size whenever the input gets large enough, and notice that this explosion can happen even if the execution 51 00:06:17,000 --> 00:06:23,000 time is very small for small inputs. Naturally this is the worst imaginable behavior for an algorithm, 52 00:06:23,000 --> 00:06:30,000 because this effectively sets a probably rather low upper limit for how large inputs we can handle. 53 00:06:30,000 --> 00:06:38,000 So, considering the curves that represent the instruction count can give us a pretty good understanding of 54 00:06:38,000 --> 00:06:48,000 how the execution time grows when the input size grows. With that mindset in hand, we can start considering 55 00:06:48,000 --> 00:06:55,000 other performance characteristics. For example, we could consider a best-case scenario for an algorithm, 56 00:06:55,000 --> 00:07:02,000 that is, a curve that represents the most optimistic simulation for an execution, a guarantee that states 57 00:07:02,000 --> 00:07:09,000 what the shortest possible execution time can ever be. We can also consider the worst-case scenario, 58 00:07:09,000 --> 00:07:14,000 that is, an upper limit for how bad an algorithm can perform for a given input size. 59 00:07:14,000 --> 00:07:20,000 When talking about algorithms, this curve is very interesting, simply because there is often a desire to 60 00:07:20,000 --> 00:07:28,000 solve problems as fast as possible, not only when the input is nice, but also in the worst possible situations. 61 00:07:28,000 --> 00:07:34,000 And then, it is useful to know how bad the algorithm will possibly behave. 62 00:07:34,000 --> 00:07:42,000 As a last example, consider a performance behavior that is nice, except for certain spikes, as shown in this curve. 63 00:07:42,000 --> 00:07:49,000 If we are running the corresponding algorithm numerous times, multiple times, it can be interesting to 64 00:07:49,000 --> 00:07:55,000 consider the amortized performance, that is, what is the real cost when summarizing the execution times 65 00:07:55,000 --> 00:08:01,000 across multiple executions. Do the spikes contribute significantly to the execution time in the total 66 00:08:01,000 --> 00:08:09,500 picture, or can they, so to speak, be balanced out? So, that was four ways to consider performance measuring. 67 00:08:09,500 --> 00:08:18,000 In the next clip, we will zoom in on how to come from program source code to its related execution time curve. 9116

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.