All language subtitles for 12. Merge Sort (Theory)

af Afrikaans
ak Akan
sq Albanian
am Amharic
ar Arabic Download
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bem Bemba
bn Bengali
bh Bihari
bs Bosnian
br Breton
bg Bulgarian
km Cambodian
ca Catalan
ceb Cebuano
chr Cherokee
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
ee Ewe
fo Faroese
tl Filipino
fi Finnish
fr French
fy Frisian
gaa Ga
gl Galician
ka Georgian
de German
el Greek
gn Guarani
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ia Interlingua
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
rw Kinyarwanda
rn Kirundi
kg Kongo
ko Korean
kri Krio (Sierra Leone)
ku Kurdish
ckb Kurdish (Soranî)
ky Kyrgyz
lo Laothian
la Latin
lv Latvian
ln Lingala
lt Lithuanian
loz Lozi
lg Luganda
ach Luo
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mfe Mauritian Creole
mo Moldavian
mn Mongolian
my Myanmar (Burmese)
sr-ME Montenegrin
ne Nepali
pcm Nigerian Pidgin
nso Northern Sotho
no Norwegian
nn Norwegian (Nynorsk)
oc Occitan
or Oriya
om Oromo
ps Pashto
fa Persian
pl Polish
pt-BR Portuguese (Brazil)
pt Portuguese (Portugal)
pa Punjabi
qu Quechua
ro Romanian
rm Romansh
nyn Runyakitara
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
sh Serbo-Croatian
st Sesotho
tn Setswana
crs Seychellois Creole
sn Shona
sd Sindhi
si Sinhalese
sk Slovak
sl Slovenian
so Somali
es Spanish
es-419 Spanish (Latin American)
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
tt Tatar
te Telugu
th Thai
ti Tigrinya
to Tonga
lua Tshiluba
tum Tumbuka
tr Turkish
tk Turkmen
tw Twi
ug Uighur
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
wo Wolof
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 1 00:00:00,125 --> 00:00:02,594 (futuristic electronic music) 2 2 00:00:02,594 --> 00:00:05,220 (keyboard keys clack) 3 3 00:00:05,220 --> 00:00:06,370 All right, in this video, 4 4 00:00:06,370 --> 00:00:08,950 we're going to look at merge sort. 5 5 00:00:08,950 --> 00:00:12,340 Now, merge sort is a divide and conquer algorithm 6 6 00:00:12,340 --> 00:00:16,420 because it involves splitting the array you wanna sort 7 7 00:00:16,420 --> 00:00:18,620 into a bunch of smaller arrays, 8 8 00:00:18,620 --> 00:00:21,850 and it's usually implemented using recursion. 9 9 00:00:21,850 --> 00:00:23,900 You can write algorithm using loops, 10 10 00:00:23,900 --> 00:00:26,150 but it's usually written recursively. 11 11 00:00:26,150 --> 00:00:29,720 Merge sort involves two major phases. 12 12 00:00:29,720 --> 00:00:32,710 The first phase is the splitting phase, 13 13 00:00:32,710 --> 00:00:35,450 and second phase is the merging phase. 14 14 00:00:35,450 --> 00:00:39,610 We do the sorting during the merging phase. 15 15 00:00:39,610 --> 00:00:42,260 The splitting phase is a preparation phase 16 16 00:00:42,260 --> 00:00:45,430 to make sorting faster during the merging phase. 17 17 00:00:45,430 --> 00:00:49,440 Now, I wanna make it clear that the splitting is logical. 18 18 00:00:49,440 --> 00:00:52,920 We don't create new arrays when we do the splitting. 19 19 00:00:52,920 --> 00:00:55,440 We use indices to keep track of 20 20 00:00:55,440 --> 00:00:57,500 where the array has been split. 21 21 00:00:57,500 --> 00:00:59,160 So when during the splitting phase, 22 22 00:00:59,160 --> 00:01:02,210 we're not actually creating new array instances. 23 23 00:01:02,210 --> 00:01:03,850 So let's look at the splitting phase. 24 24 00:01:03,850 --> 00:01:07,170 In the splitting phase, we start with the unsorted array, 25 25 00:01:07,170 --> 00:01:09,930 and we divide the array into two arrays, 26 26 00:01:09,930 --> 00:01:13,230 and remember, both of the arrays will be unsorted, 27 27 00:01:13,230 --> 00:01:15,930 and we call the first array, the left array 28 28 00:01:15,930 --> 00:01:17,700 and second array, the right array, 29 29 00:01:17,700 --> 00:01:20,780 and we generally just divide the array down the middle. 30 30 00:01:20,780 --> 00:01:22,570 If you have an odd number of elements, 31 31 00:01:22,570 --> 00:01:24,180 it will depend on the implementation. 32 32 00:01:24,180 --> 00:01:26,900 Some implementation will put the extra element 33 33 00:01:26,900 --> 00:01:28,320 into the left array, 34 34 00:01:28,320 --> 00:01:30,980 and some implementations will put the extra element 35 35 00:01:30,980 --> 00:01:32,320 into the right array, 36 36 00:01:32,320 --> 00:01:35,520 but the important point is you're dividing the array 37 37 00:01:35,520 --> 00:01:38,260 or splitting the array into two arrays, 38 38 00:01:38,260 --> 00:01:40,320 the left array and the right array, 39 39 00:01:40,320 --> 00:01:41,700 and then, once you've done that, 40 40 00:01:41,700 --> 00:01:43,740 you keep splitting down even further. 41 41 00:01:43,740 --> 00:01:46,170 So now you go to the left array, 42 42 00:01:46,170 --> 00:01:49,200 and you split that into a left and a right array, 43 43 00:01:49,200 --> 00:01:51,660 and then, you split that left array 44 44 00:01:51,660 --> 00:01:54,610 into a left and a right array, and you keep going, 45 45 00:01:54,610 --> 00:01:57,710 splitting all the arrays and the sub-arrays 46 46 00:01:57,710 --> 00:01:59,940 until you split the original array 47 47 00:01:59,940 --> 00:02:03,330 into a bunch of one-element arrays, 48 48 00:02:03,330 --> 00:02:07,070 and remember from our discussion of insertion sort, 49 49 00:02:07,070 --> 00:02:10,470 a one-element array is sorted by default 50 50 00:02:10,470 --> 00:02:12,500 because there's only one element in it. 51 51 00:02:12,500 --> 00:02:15,510 So in the splitting phase, you're taking the array, 52 52 00:02:15,510 --> 00:02:16,900 you're dividing it in half, 53 53 00:02:16,900 --> 00:02:20,080 and then, you're dividing the two sub-arrays in half, 54 54 00:02:20,080 --> 00:02:22,820 and you're dividing those four sub-arrays in half, 55 55 00:02:22,820 --> 00:02:23,710 et cetera, 56 56 00:02:23,710 --> 00:02:27,780 until you get down to a bunch of one-element arrays, 57 57 00:02:27,780 --> 00:02:29,810 and that is the splitting phase. 58 58 00:02:29,810 --> 00:02:31,090 Once you've done that, 59 59 00:02:31,090 --> 00:02:32,910 you're going to enter the merging phase, 60 60 00:02:32,910 --> 00:02:34,410 and in the merging phase, 61 61 00:02:34,410 --> 00:02:38,510 you're going to merge every left/right pair 62 62 00:02:38,510 --> 00:02:40,540 into a sorted array. 63 63 00:02:40,540 --> 00:02:43,380 So let's say we have an array of four elements 64 64 00:02:43,380 --> 00:02:44,730 just to explain this. 65 65 00:02:44,730 --> 00:02:48,390 So we're gonna split that into two sub-arrays 66 66 00:02:48,390 --> 00:02:50,540 of length two each, 67 67 00:02:50,540 --> 00:02:53,030 and then, we're gonna split those two sub-arrays 68 68 00:02:53,030 --> 00:02:55,670 into two arrays of one element. 69 69 00:02:55,670 --> 00:02:57,620 So from that four-element array, 70 70 00:02:57,620 --> 00:03:01,260 we're gonna end up with four one-element sub-arrays. 71 71 00:03:01,260 --> 00:03:03,080 So we split the original array 72 72 00:03:03,080 --> 00:03:04,900 into a left and right sub-array, 73 73 00:03:04,900 --> 00:03:08,630 and then, we split the left sub-array into left and right 74 74 00:03:08,630 --> 00:03:10,640 and the right sub-array into left and right, 75 75 00:03:10,640 --> 00:03:13,790 and then what we do is we take the two one-element arrays 76 76 00:03:13,790 --> 00:03:15,380 from the left sub-array, 77 77 00:03:15,380 --> 00:03:19,460 and we merge them back into a two-element array. 78 78 00:03:19,460 --> 00:03:22,540 The merged two-element array will be sorted, 79 79 00:03:22,540 --> 00:03:24,280 so when we do the merge, 80 80 00:03:24,280 --> 00:03:26,140 that's the point when we do the sort, 81 81 00:03:26,140 --> 00:03:27,850 and then, we'll take the two elements 82 82 00:03:27,850 --> 00:03:29,050 from the right sub-array, 83 83 00:03:29,050 --> 00:03:32,500 and we'll merge them into a two-element sub-array. 84 84 00:03:32,500 --> 00:03:35,590 So now we have, we're back to two arrays, 85 85 00:03:35,590 --> 00:03:36,760 a left and a right array, 86 86 00:03:36,760 --> 00:03:40,060 except this time, the left and the right array are sorted, 87 87 00:03:40,060 --> 00:03:42,560 and then, we'll merge the left and right array, 88 88 00:03:42,560 --> 00:03:44,220 which are now two elements each, 89 89 00:03:44,220 --> 00:03:47,070 back into a four-element array, 90 90 00:03:47,070 --> 00:03:50,460 and at that point, when we do the merge, we sort, 91 91 00:03:50,460 --> 00:03:53,310 so we get a four-element array that's sorted, 92 92 00:03:53,310 --> 00:03:56,190 and we have sorted our array. 93 93 00:03:56,190 --> 00:03:59,880 So basically, we take the bunch of one-element arrays, 94 94 00:03:59,880 --> 00:04:02,240 and we keep merging them into two-element, 95 95 00:04:02,240 --> 00:04:04,550 four-element, eight-element, et cetera, 96 96 00:04:04,550 --> 00:04:06,810 and at each merge, we're making sure 97 97 00:04:06,810 --> 00:04:09,030 that the resulting arrays are sorted 98 98 00:04:09,030 --> 00:04:12,020 until we've merged all the arrays back into one array, 99 99 00:04:12,020 --> 00:04:14,100 and at that point, that array will be sorted 100 100 00:04:14,100 --> 00:04:16,540 because when we're merging, we're sorting. 101 101 00:04:16,540 --> 00:04:20,230 So every resulting array from the merge will be sorted. 102 102 00:04:20,230 --> 00:04:23,430 Now the merging phase does not happen in place. 103 103 00:04:23,430 --> 00:04:25,520 It uses temporary arrays. 104 104 00:04:25,520 --> 00:04:27,970 So the splitting phase is in place, 105 105 00:04:27,970 --> 00:04:30,940 but we need temporary arrays to do the merge. 106 106 00:04:30,940 --> 00:04:33,670 So let's look at our usual array here. 107 107 00:04:33,670 --> 00:04:35,580 Our start index will be zero. 108 108 00:04:35,580 --> 00:04:39,140 Our end will be seven, which is equal to array.length, 109 109 00:04:39,140 --> 00:04:41,300 and we're going to get our midpoint 110 110 00:04:41,300 --> 00:04:44,210 by dividing start plus end by two, 111 111 00:04:44,210 --> 00:04:46,250 so our midpoint will be three, 112 112 00:04:46,250 --> 00:04:49,010 and in the implementation I'm going to show you, 113 113 00:04:49,010 --> 00:04:52,410 the extra element will go into the right array. 114 114 00:04:52,410 --> 00:04:54,480 So the first time we split this array, 115 115 00:04:54,480 --> 00:04:58,230 elements zero, one, and two will go into the left array, 116 116 00:04:58,230 --> 00:05:00,670 and elements three, four, five, and six 117 117 00:05:00,670 --> 00:05:02,720 will go into the right array, 118 118 00:05:02,720 --> 00:05:05,830 and so, now, let's split the left array. 119 119 00:05:05,830 --> 00:05:07,880 So we're not gonna do anything with the right array 120 120 00:05:07,880 --> 00:05:09,300 that's in black now, 121 121 00:05:09,300 --> 00:05:10,930 and we're gonna look at the left array, 122 122 00:05:10,930 --> 00:05:13,820 and for the left array, the start index is zero, 123 123 00:05:13,820 --> 00:05:15,240 and the end index is three. 124 124 00:05:15,240 --> 00:05:17,700 The end index is always one greater 125 125 00:05:17,700 --> 00:05:20,830 than the index of the last element in the array. 126 126 00:05:20,830 --> 00:05:23,010 So once again, we're gonna take the midpoint 127 127 00:05:23,010 --> 00:05:25,580 by adding start to end and dividing by two, 128 128 00:05:25,580 --> 00:05:28,240 so we'll have three over two which is one, 129 129 00:05:28,240 --> 00:05:30,920 and that means that the first element 130 130 00:05:30,920 --> 00:05:33,130 is going to go into the left array, 131 131 00:05:33,130 --> 00:05:37,150 and elements one to two are going to into the right array, 132 132 00:05:37,150 --> 00:05:39,280 and so now, we have this situation. 133 133 00:05:39,280 --> 00:05:41,830 Now we have finished splitting 20. 134 134 00:05:41,830 --> 00:05:43,460 We can't split that any further. 135 135 00:05:43,460 --> 00:05:46,090 It's already a one-element array, 136 136 00:05:46,090 --> 00:05:48,040 so we only have to worry at this point 137 137 00:05:48,040 --> 00:05:49,770 about splitting the right array. 138 138 00:05:49,770 --> 00:05:51,240 So we need to split the array 139 139 00:05:51,240 --> 00:05:54,230 that's occupying positions one and two, 140 140 00:05:54,230 --> 00:05:56,490 and so the start index is one. 141 141 00:05:56,490 --> 00:05:58,130 The end index is three. 142 142 00:05:58,130 --> 00:06:01,470 The midpoint will be one plus three over two which is two, 143 143 00:06:01,470 --> 00:06:03,190 and so, we're gonna put the first element 144 144 00:06:03,190 --> 00:06:04,740 into the left array 145 145 00:06:04,740 --> 00:06:07,110 and the second element into the right array, 146 146 00:06:07,110 --> 00:06:10,880 and we have now completed splitting the left array 147 147 00:06:10,880 --> 00:06:12,670 into one-element arrays. 148 148 00:06:12,670 --> 00:06:16,590 Now, one and two are both coloured in a green shade 149 149 00:06:16,590 --> 00:06:19,350 because they're actually sibling arrays. 150 150 00:06:19,350 --> 00:06:20,630 We go up for a minute. 151 151 00:06:20,630 --> 00:06:22,660 The left and right arrays are the sibling arrays. 152 152 00:06:22,660 --> 00:06:25,030 So these are the two arrays we're going to merge. 153 153 00:06:25,030 --> 00:06:26,300 When we split the left array, 154 154 00:06:26,300 --> 00:06:29,500 we ended up with a one-element and two-element array, 155 155 00:06:29,500 --> 00:06:31,640 and those arrays are sibling arrays, 156 156 00:06:31,640 --> 00:06:32,770 so when we do the merge, 157 157 00:06:32,770 --> 00:06:35,820 we'll be merging the array that has 20 158 158 00:06:35,820 --> 00:06:38,360 with the array that has 35 and minus 15. 159 159 00:06:38,360 --> 00:06:41,640 We were able to split the right array one more time, 160 160 00:06:41,640 --> 00:06:45,040 and so, the first merge we'll do on the left side 161 161 00:06:45,040 --> 00:06:48,380 will be merging 35 with minus 15. 162 162 00:06:48,380 --> 00:06:51,460 So we're always merging the last split. 163 163 00:06:51,460 --> 00:06:54,440 Okay, so let's look at the right array now. 164 164 00:06:54,440 --> 00:06:56,630 So the start index is three. 165 165 00:06:56,630 --> 00:06:57,730 The end is seven. 166 166 00:06:57,730 --> 00:06:59,530 The midpoint will be five, 167 167 00:06:59,530 --> 00:07:02,260 so elements three and four will go into the left array, 168 168 00:07:02,260 --> 00:07:05,100 and elements five to six will go into the right array. 169 169 00:07:05,100 --> 00:07:06,900 Now let's work on the left array, 170 170 00:07:06,900 --> 00:07:08,820 the one that has seven and 55. 171 171 00:07:08,820 --> 00:07:10,560 The midpoint will be four, 172 172 00:07:10,560 --> 00:07:13,490 so we're gonna put element three to three into left 173 173 00:07:13,490 --> 00:07:15,610 and four to four into right, 174 174 00:07:15,610 --> 00:07:19,090 and we've now finished splitting the left array 175 175 00:07:19,090 --> 00:07:20,360 on the right side. 176 176 00:07:20,360 --> 00:07:23,780 So now we're gonna split the right array on the right side. 177 177 00:07:23,780 --> 00:07:26,210 So the start index is five. 178 178 00:07:26,210 --> 00:07:27,800 The end index is seven. 179 179 00:07:27,800 --> 00:07:29,540 The midpoint is six. 180 180 00:07:29,540 --> 00:07:32,510 So we're gonna put element five into the left array 181 181 00:07:32,510 --> 00:07:34,630 and element six into the right array, 182 182 00:07:34,630 --> 00:07:38,040 and we have now completed splitting the right array 183 183 00:07:38,040 --> 00:07:39,670 into one-element arrays, 184 184 00:07:39,670 --> 00:07:42,900 and once again, seven and 55 are both shaded 185 185 00:07:42,900 --> 00:07:45,320 gold, yellow, brown, whatever you wanna, 186 186 00:07:45,320 --> 00:07:46,710 however you wanna look at that. 187 187 00:07:46,710 --> 00:07:49,290 They're shaded the same because they're sibling arrays. 188 188 00:07:49,290 --> 00:07:51,150 They're gonna get merged together first, 189 189 00:07:51,150 --> 00:07:54,490 and the same applies to one and minus 22. 190 190 00:07:54,490 --> 00:07:57,380 So now we need to merge these arrays back together, 191 191 00:07:57,380 --> 00:08:01,900 and remember, when we do the merge, we actually sort. 192 192 00:08:01,900 --> 00:08:05,720 So every time we merge, the resulting array is sorted. 193 193 00:08:05,720 --> 00:08:08,940 So as I mentioned, 35 and minus 15 194 194 00:08:08,940 --> 00:08:11,080 are sibling left/right arrays, 195 195 00:08:11,080 --> 00:08:12,520 so they're gonna get merged. 196 196 00:08:12,520 --> 00:08:15,260 Seven and 55 are gonna get merged. 197 197 00:08:15,260 --> 00:08:17,890 One and minus 22 are gonna get merged. 198 198 00:08:17,890 --> 00:08:19,740 20 won't get merged right now 199 199 00:08:19,740 --> 00:08:22,450 because it doesn't have a sibling at the moment. 200 200 00:08:22,450 --> 00:08:24,860 So we started with our original array, 201 201 00:08:24,860 --> 00:08:28,860 and then, we split that into a left array and a right array, 202 202 00:08:28,860 --> 00:08:32,180 and then, that left array was split into left and right, 203 203 00:08:32,180 --> 00:08:35,820 and that right array was split into left and right, 204 204 00:08:35,820 --> 00:08:37,500 and then here, the right array 205 205 00:08:37,500 --> 00:08:39,810 was split into two two-element arrays, 206 206 00:08:39,810 --> 00:08:42,830 and then, they were each split into two one-element arrays, 207 207 00:08:42,830 --> 00:08:45,410 and that's what we just went through in the slides. 208 208 00:08:45,410 --> 00:08:48,540 Now because of the recursive nature of the implementation, 209 209 00:08:48,540 --> 00:08:50,340 it's important to note that 210 210 00:08:50,340 --> 00:08:54,580 we're going to handle the entire left side of the array 211 211 00:08:54,580 --> 00:08:57,630 before we start working with the right side of the array 212 212 00:08:57,630 --> 00:08:59,590 because you'll see from the implementation 213 213 00:08:59,590 --> 00:09:03,070 that we call a method to partition the left side, 214 214 00:09:03,070 --> 00:09:05,630 and then, we call a method to partition the right side, 215 215 00:09:05,630 --> 00:09:08,460 but that method will call itself recursively, 216 216 00:09:08,460 --> 00:09:11,680 and so, we'll call the method to partition the left side, 217 217 00:09:11,680 --> 00:09:13,920 and then, it will call methods to partition 218 218 00:09:13,920 --> 00:09:15,197 itself into left and right, 219 219 00:09:15,197 --> 00:09:16,560 and this one will call 220 220 00:09:16,560 --> 00:09:18,930 a method to partition itself into left and right, 221 221 00:09:18,930 --> 00:09:21,200 and so, once we start down this path, 222 222 00:09:21,200 --> 00:09:23,800 we go down the recursive rabbit hole, 223 223 00:09:23,800 --> 00:09:26,140 and so, by the time the method called 224 224 00:09:26,140 --> 00:09:28,500 to partition the left returns, 225 225 00:09:28,500 --> 00:09:30,560 we'll have done all this work, 226 226 00:09:30,560 --> 00:09:32,540 and the same goes for the right side. 227 227 00:09:32,540 --> 00:09:34,910 So let's move on to the merging step now, 228 228 00:09:34,910 --> 00:09:36,990 and you'll see that when we do the merging step, 229 229 00:09:36,990 --> 00:09:40,420 we merge backwards, so we merge bottom up, 230 230 00:09:40,420 --> 00:09:42,100 and we're gonna merge sibling arrays, 231 231 00:09:42,100 --> 00:09:44,830 so we'll start by merging 35 with minus 15 232 232 00:09:44,830 --> 00:09:48,800 and then seven with 55 an then one with minus 22, 233 233 00:09:48,800 --> 00:09:51,810 and only after we've merged 35 and minus 15 234 234 00:09:51,810 --> 00:09:54,590 back into a two-element sorted array 235 235 00:09:54,590 --> 00:09:58,010 will we then merge 20 with the result, et cetera. 236 236 00:09:58,010 --> 00:10:00,670 So we're gonna merge all these one-element arrays. 237 237 00:10:00,670 --> 00:10:03,070 We always merge sibling left and right, 238 238 00:10:03,070 --> 00:10:05,950 and each merged array will be sorted, 239 239 00:10:05,950 --> 00:10:08,070 and as I said, 20 doesn't have a sibling, 240 240 00:10:08,070 --> 00:10:11,030 so it's not gonna get merged on the first round. 241 241 00:10:11,030 --> 00:10:12,920 So the way that the merging works 242 242 00:10:12,920 --> 00:10:15,220 is we merge sibling left and right arrays, 243 243 00:10:15,220 --> 00:10:18,070 and what we do is we create a temporary array 244 244 00:10:18,070 --> 00:10:20,320 large enough to hold all the elements 245 245 00:10:20,320 --> 00:10:21,520 in the arrays we're merging. 246 246 00:10:21,520 --> 00:10:23,060 So on the first round, 247 247 00:10:23,060 --> 00:10:25,490 our temporary arrays will be of length two 248 248 00:10:25,490 --> 00:10:29,530 because we're going to be merging two one-element arrays, 249 249 00:10:29,530 --> 00:10:31,430 and what we do is we set i 250 250 00:10:31,430 --> 00:10:34,100 to the first index of the left array, 251 251 00:10:34,100 --> 00:10:36,750 and j to the first index of the right array, 252 252 00:10:36,750 --> 00:10:38,290 and when I say left and right array, 253 253 00:10:38,290 --> 00:10:40,080 I mean the two arrays we're merging, 254 254 00:10:40,080 --> 00:10:44,770 and then, we compare the value at the i position in left 255 255 00:10:44,770 --> 00:10:47,910 to the value at the j position in the right array, 256 256 00:10:47,910 --> 00:10:50,720 and if the value in the left array is smaller, 257 257 00:10:50,720 --> 00:10:52,920 we copy that to the temporary array, 258 258 00:10:52,920 --> 00:10:55,230 and we increment i by one. 259 259 00:10:55,230 --> 00:10:57,590 If the value on the right array is smaller, 260 260 00:10:57,590 --> 00:10:59,900 then we copy that to the temporary array 261 261 00:10:59,900 --> 00:11:01,260 and increment j by one. 262 262 00:11:01,260 --> 00:11:02,730 So essentially, what we're doing 263 263 00:11:02,730 --> 00:11:06,210 is we're stepping through the left and right arrays, 264 264 00:11:06,210 --> 00:11:09,350 and we're taking the smallest value 265 265 00:11:09,350 --> 00:11:10,680 between the left and the right 266 266 00:11:10,680 --> 00:11:12,897 and copying it into the temporary array, 267 267 00:11:12,897 --> 00:11:14,630 and if we keep doing that, 268 268 00:11:14,630 --> 00:11:17,520 that temporary array will contain the values 269 269 00:11:17,520 --> 00:11:18,660 in sorted order. 270 270 00:11:18,660 --> 00:11:21,340 So we're gonna repeat that until all the elements 271 271 00:11:21,340 --> 00:11:23,320 in both arrays have been processed, 272 272 00:11:23,320 --> 00:11:25,090 and, as I said, at that point, 273 273 00:11:25,090 --> 00:11:27,550 the temporary array will contain the merged values 274 274 00:11:27,550 --> 00:11:28,700 in sorted order, 275 275 00:11:28,700 --> 00:11:30,710 and then, we have one final step to do. 276 276 00:11:30,710 --> 00:11:32,580 Remember, we've been copying these values 277 277 00:11:32,580 --> 00:11:34,090 into a temporary array, 278 278 00:11:34,090 --> 00:11:37,710 so we then have to copy the sorted values 279 279 00:11:37,710 --> 00:11:40,300 back into the original input array, 280 280 00:11:40,300 --> 00:11:43,770 the one that we're sorting, at the correct positions, 281 281 00:11:43,770 --> 00:11:47,720 and so, if the left array is at positions x to y 282 282 00:11:47,720 --> 00:11:49,070 in the original array, 283 283 00:11:49,070 --> 00:11:52,810 and the right array is at positions y plus one to z, 284 284 00:11:52,810 --> 00:11:55,920 then after the copy, positions x to z 285 285 00:11:55,920 --> 00:11:58,300 will be sorted in the original array. 286 286 00:11:58,300 --> 00:12:01,480 So we're gonna overwrite what's there in the original array 287 287 00:12:01,480 --> 00:12:03,170 with the sorted values. 288 288 00:12:03,170 --> 00:12:06,140 So we're gonna start by merging the two siblings 289 289 00:12:06,140 --> 00:12:08,530 on the left, 35 and minus 15. 290 290 00:12:08,530 --> 00:12:09,990 So what we'll do is we'll create 291 291 00:12:09,990 --> 00:12:12,170 a temporary two-element array. 292 292 00:12:12,170 --> 00:12:14,010 i will be initialised to one 293 293 00:12:14,010 --> 00:12:17,700 because that's the first index in the left array, 294 294 00:12:17,700 --> 00:12:19,530 and j will be initialised to two 295 295 00:12:19,530 --> 00:12:22,350 because that's the first index in the right array, 296 296 00:12:22,350 --> 00:12:25,740 and then we compare array i to array j. 297 297 00:12:25,740 --> 00:12:28,260 Minus 15 is smaller than 35. 298 298 00:12:28,260 --> 00:12:30,730 There's a typo there, that should be 35, 299 299 00:12:30,730 --> 00:12:34,560 and so, we copy minus 15 to the temporary array. 300 300 00:12:34,560 --> 00:12:37,700 Then, we're gonna copy 35 to the temporary array, 301 301 00:12:37,700 --> 00:12:39,570 and at this point, the temporary array 302 302 00:12:39,570 --> 00:12:41,970 will be minus 15 and 35, 303 303 00:12:41,970 --> 00:12:44,960 and we've now seen all of the elements 304 304 00:12:44,960 --> 00:12:47,780 in the left array and right array that we're merging. 305 305 00:12:47,780 --> 00:12:50,460 So at this point, we wanna copy the temporary array 306 306 00:12:50,460 --> 00:12:53,980 back into positions one and two in the original array, 307 307 00:12:53,980 --> 00:12:58,320 and so, at this point, minus 15 and 35 are sorted. 308 308 00:12:58,320 --> 00:13:01,810 Now, they're not in sorted position in the array, 309 309 00:13:01,810 --> 00:13:04,320 but the merged array is sorted, 310 310 00:13:04,320 --> 00:13:07,480 and the two sibling arrays have now been merged. 311 311 00:13:07,480 --> 00:13:11,280 So now, we're back to two arrays on the left-hand side. 312 312 00:13:11,280 --> 00:13:16,280 We're back to 20, and we're back to minus 15 and 35. 313 313 00:13:16,280 --> 00:13:18,950 So at this point, the left array contains 20, 314 314 00:13:18,950 --> 00:13:22,080 and the right array contains minus 15 and 35. 315 315 00:13:22,080 --> 00:13:24,760 So now we're gonna merge those two arrays 316 316 00:13:24,760 --> 00:13:29,010 because 20 and minus 15 and 35 317 317 00:13:29,010 --> 00:13:30,680 are sibling arrays now. 318 318 00:13:30,680 --> 00:13:32,810 We originally got those two arrays 319 319 00:13:32,810 --> 00:13:35,560 when we split the left array. 320 320 00:13:35,560 --> 00:13:38,570 So we create a temporary array of length three. 321 321 00:13:38,570 --> 00:13:40,640 i will be initialised to zero, 322 322 00:13:40,640 --> 00:13:42,520 and j will be initialised to one, 323 323 00:13:42,520 --> 00:13:45,740 and then, we compare array zero to array one. 324 324 00:13:45,740 --> 00:13:47,980 Minus 15 is less than 20, 325 325 00:13:47,980 --> 00:13:50,780 so it's gonna be copied to the temporary array, 326 326 00:13:50,780 --> 00:13:53,990 and because we copied the value from the right array 327 327 00:13:53,990 --> 00:13:57,330 into the temporary array, we increment j by one 328 328 00:13:57,330 --> 00:13:59,580 because we've processed minus 15. 329 329 00:13:59,580 --> 00:14:02,660 i is still zero because we haven't processed 20 yet. 330 330 00:14:02,660 --> 00:14:06,200 So now we compare 20 to 35. 331 331 00:14:06,200 --> 00:14:09,810 20 is smaller than 35, so it gets copied next, 332 332 00:14:09,810 --> 00:14:13,230 and now, only 35 remains, and so, it's copied last, 333 333 00:14:13,230 --> 00:14:18,130 and so, the temporary array consists of minus 15, 20 and 35, 334 334 00:14:18,130 --> 00:14:21,960 and now, we copy that back into positions zero to two 335 335 00:14:21,960 --> 00:14:23,350 in the original array, 336 336 00:14:23,350 --> 00:14:26,010 and so, at this point, we have completed 337 337 00:14:26,010 --> 00:14:27,790 merging the left sub-array, 338 338 00:14:27,790 --> 00:14:30,400 and, as you can see, it's in sorted order. 339 339 00:14:30,400 --> 00:14:33,380 So now, we're gonna repeat this process with the right 340 340 00:14:33,380 --> 00:14:34,430 part of the array. 341 341 00:14:34,430 --> 00:14:36,040 We have two sets of siblings, 342 342 00:14:36,040 --> 00:14:39,670 seven and 55 and one and minus 22, 343 343 00:14:39,670 --> 00:14:42,930 so we're gonna start by merging seven and 55. 344 344 00:14:42,930 --> 00:14:45,920 So we're gonna create a temporary array of length two. 345 345 00:14:45,920 --> 00:14:47,700 i will be initialised to three, 346 346 00:14:47,700 --> 00:14:49,800 and j will be initialised to four. 347 347 00:14:49,800 --> 00:14:52,540 We're gonna compare array i to array j. 348 348 00:14:52,540 --> 00:14:54,680 Seven is smaller than 55, 349 349 00:14:54,680 --> 00:14:57,670 so it gets copied into the temporary array first, 350 350 00:14:57,670 --> 00:14:59,700 and then, 55 will get copied, 351 351 00:14:59,700 --> 00:15:01,410 and then we copy the temp array 352 352 00:15:01,410 --> 00:15:03,400 back to positions three and four, 353 353 00:15:03,400 --> 00:15:06,570 and so, we've now merged the left array 354 354 00:15:06,570 --> 00:15:09,308 of the right part of the original array, 355 355 00:15:09,308 --> 00:15:12,160 and seven and 55 are in sorted order. 356 356 00:15:12,160 --> 00:15:15,740 Now we repeat this process to merge one and minus 22. 357 357 00:15:15,740 --> 00:15:19,541 The temporary array is gonna end up being minus 22 and one. 358 358 00:15:19,541 --> 00:15:22,950 We're gonna copy that back into positions five and six, 359 359 00:15:22,950 --> 00:15:25,470 and we're back to minus 22, 360 360 00:15:25,470 --> 00:15:29,240 and the merged array is in sorted order. 361 361 00:15:29,240 --> 00:15:30,880 It's minus 22 and one. 362 362 00:15:30,880 --> 00:15:33,090 So now we have two arrays of length two, 363 363 00:15:33,090 --> 00:15:34,870 and we need to merge those. 364 364 00:15:34,870 --> 00:15:37,800 So we're gonna create a temporary array of length four. 365 365 00:15:37,800 --> 00:15:40,950 i will be initialised to three and j to five. 366 366 00:15:40,950 --> 00:15:43,460 Minus 22 is less than seven, 367 367 00:15:43,460 --> 00:15:47,010 so we're gonna copy minus 22 to the temporary array, 368 368 00:15:47,010 --> 00:15:49,480 and j will be incremented to six. 369 369 00:15:49,480 --> 00:15:52,370 So then, we're gonna compare seven to one. 370 370 00:15:52,370 --> 00:15:53,720 One is less than seven, 371 371 00:15:53,720 --> 00:15:56,250 so one is gonna be copied to the temporary array, 372 372 00:15:56,250 --> 00:15:58,400 and j will be incremented to seven. 373 373 00:15:58,400 --> 00:16:01,550 Now, only elements in the left array are left, 374 374 00:16:01,550 --> 00:16:03,280 and we know that they're sorted 375 375 00:16:03,280 --> 00:16:05,800 because all the merged arrays are sorted, 376 376 00:16:05,800 --> 00:16:07,470 and so, all we have to do at this point 377 377 00:16:07,470 --> 00:16:10,240 is just copy the two elements from the left array 378 378 00:16:10,240 --> 00:16:12,410 into the temporary array, and when we've done that, 379 379 00:16:12,410 --> 00:16:17,290 the temporary array will be minus 22, one, seven, and 55, 380 380 00:16:17,290 --> 00:16:18,630 and we just have to copy that 381 381 00:16:18,630 --> 00:16:20,790 back into positions three to six, 382 382 00:16:20,790 --> 00:16:24,530 and so, now we're back down to just two arrays, 383 383 00:16:24,530 --> 00:16:26,020 a left array and a right array 384 384 00:16:26,020 --> 00:16:27,690 that are both in sorted order, 385 385 00:16:27,690 --> 00:16:31,890 and our final step is to merge these left and right arrays, 386 386 00:16:31,890 --> 00:16:34,230 and so, we're gonna create a temporary array 387 387 00:16:34,230 --> 00:16:36,820 that has enough room to hold seven elements. 388 388 00:16:36,820 --> 00:16:40,300 We're gonna initialise i to zero and j to three. 389 389 00:16:40,300 --> 00:16:42,780 Minus 22 is less than minus 15, 390 390 00:16:42,780 --> 00:16:45,860 so we're gonna copy minus 22 to the temporary array, 391 391 00:16:45,860 --> 00:16:47,830 and we're gonna increment j to four. 392 392 00:16:47,830 --> 00:16:50,070 Minus 15 is less than one, 393 393 00:16:50,070 --> 00:16:53,250 so we're gonna copy minus 15 to the temporary array, 394 394 00:16:53,250 --> 00:16:55,470 and i is gonna get incremented to one. 395 395 00:16:55,470 --> 00:16:59,160 One is less than 20, so we copy one to the temporary array, 396 396 00:16:59,160 --> 00:17:01,050 and we increment j to five. 397 397 00:17:01,050 --> 00:17:02,840 Seven is less than 20, 398 398 00:17:02,840 --> 00:17:04,520 so we copy seven to the temporary array, 399 399 00:17:04,520 --> 00:17:06,790 and we increment j to six. 400 400 00:17:06,790 --> 00:17:11,360 20 is less than 55, so we copy 20 to the temporary array, 401 401 00:17:11,360 --> 00:17:13,330 and we increment i to two, 402 402 00:17:13,330 --> 00:17:16,440 and finally, 35 is less than 55, 403 403 00:17:16,440 --> 00:17:18,880 so we copy 35 to the temporary array, 404 404 00:17:18,880 --> 00:17:20,290 and we increment i to three, 405 405 00:17:20,290 --> 00:17:22,620 and at this point, only 55 is left, 406 406 00:17:22,620 --> 00:17:25,030 so we copy it to the temporary array, 407 407 00:17:25,030 --> 00:17:27,690 and now we copy the temporary array 408 408 00:17:27,690 --> 00:17:29,610 back into positions zero to six 409 409 00:17:29,610 --> 00:17:31,530 because we've looked at all the elements, 410 410 00:17:31,530 --> 00:17:34,800 and when we do that, the array is sorted. 411 411 00:17:34,800 --> 00:17:36,220 It is quite a bit of work, 412 412 00:17:36,220 --> 00:17:39,250 but because we're dividing and conquering, 413 413 00:17:39,250 --> 00:17:42,070 we're splitting the array into two at each phase, 414 414 00:17:42,070 --> 00:17:44,160 it actually performs better 415 415 00:17:44,160 --> 00:17:46,220 than the algorithms we've seen so far. 416 416 00:17:46,220 --> 00:17:48,620 So just to remind you of what we had 417 417 00:17:48,620 --> 00:17:50,310 when we started the merging phase, 418 418 00:17:50,310 --> 00:17:52,210 we had done all of the splitting, 419 419 00:17:52,210 --> 00:17:53,950 and then, in the merging phase, 420 420 00:17:53,950 --> 00:17:55,720 we went from the bottom back up. 421 421 00:17:55,720 --> 00:17:58,070 So we merged 35 and minus 15 422 422 00:17:58,070 --> 00:18:02,490 into a two-element sorted array of minus 15 and 35, 423 423 00:18:02,490 --> 00:18:04,460 and we did the same thing with these two elements, 424 424 00:18:04,460 --> 00:18:07,290 and the same thing with one and minus 22, 425 425 00:18:07,290 --> 00:18:10,520 and then, at that point, these guys are now sibling arrays, 426 426 00:18:10,520 --> 00:18:14,340 and so, we merged the left array with the right array, 427 427 00:18:14,340 --> 00:18:16,980 and we got a sorted three-element array, 428 428 00:18:16,980 --> 00:18:19,580 minus 15, 20, and 35, 429 429 00:18:19,580 --> 00:18:23,930 and on this side, we had two two-element sibling array, 430 430 00:18:23,930 --> 00:18:24,890 left and right. 431 431 00:18:24,890 --> 00:18:27,350 We merged them into a sorted array, 432 432 00:18:27,350 --> 00:18:31,510 and then, finally, we merged the original 433 433 00:18:31,510 --> 00:18:34,260 left and right arrays we got from splitting the whole array 434 434 00:18:34,260 --> 00:18:36,290 except this time, they're sorted 435 435 00:18:36,290 --> 00:18:39,340 because we've undergone the merge step, 436 436 00:18:39,340 --> 00:18:42,730 and so, we merged the sorted left array 437 437 00:18:42,730 --> 00:18:44,540 with the sorted right array, 438 438 00:18:44,540 --> 00:18:48,210 and our result from the merge is always a sorted array, 439 439 00:18:48,210 --> 00:18:51,670 and so, we got our original array sorted. 440 440 00:18:51,670 --> 00:18:54,910 So let's take a look at how this algorithm performs. 441 441 00:18:54,910 --> 00:18:56,840 Well, first of all, as I said previously, 442 442 00:18:56,840 --> 00:18:59,250 it's not an in-place algorithm. 443 443 00:18:59,250 --> 00:19:01,220 The splitting phase is in-place, 444 444 00:19:01,220 --> 00:19:03,730 but when we do the merging phase, 445 445 00:19:03,730 --> 00:19:06,100 we use a temporary array to merge 446 446 00:19:06,100 --> 00:19:08,160 each pair of sibling arrays. 447 447 00:19:08,160 --> 00:19:12,460 Now this has a time complexity of O to the n log n, 448 448 00:19:12,460 --> 00:19:15,640 and the log is to the base two, and why is that? 449 449 00:19:15,640 --> 00:19:19,020 Well, we're repeatedly dividing the array in half 450 450 00:19:19,020 --> 00:19:20,550 during the splitting phase, 451 451 00:19:20,550 --> 00:19:23,550 and so, this is a logarithmic algorithm. 452 452 00:19:23,550 --> 00:19:26,020 It's a stable sort algorithm 453 453 00:19:26,020 --> 00:19:28,320 because when we do the merging, 454 454 00:19:28,320 --> 00:19:33,050 we check whether the element in the right array 455 455 00:19:33,050 --> 00:19:36,060 is greater than the element in the left array, 456 456 00:19:36,060 --> 00:19:38,610 and if it isn't, if it's equal to it, 457 457 00:19:38,610 --> 00:19:40,266 then the element in the left array will be the one 458 458 00:19:40,266 --> 00:19:43,400 that's copied into the temporary array first, 459 459 00:19:43,400 --> 00:19:44,600 and because of that, 460 460 00:19:44,600 --> 00:19:47,840 the relative ordering of duplicate items will be preserved. 461 461 00:19:47,840 --> 00:19:49,580 Now as I mentioned on the last slide, 462 462 00:19:49,580 --> 00:19:51,330 the implementation I'm going to show you 463 463 00:19:51,330 --> 00:19:53,450 has a couple of optimizations, 464 464 00:19:53,450 --> 00:19:54,670 and I'll explain those 465 465 00:19:54,670 --> 00:19:56,560 when we go through the implementation, 466 466 00:19:56,560 --> 00:19:57,820 but it's still merge sort, 467 467 00:19:57,820 --> 00:20:00,730 and it's still a logarithmic algorithm. 468 468 00:20:00,730 --> 00:20:03,420 Now these days, memory is cheap and plentiful, 469 469 00:20:03,420 --> 00:20:06,200 so the fact that it's not an in-place algorithm 470 470 00:20:06,200 --> 00:20:08,320 shouldn't dissuade you from using it, 471 471 00:20:08,320 --> 00:20:11,410 but if memory is an issue, then that's a consideration 472 472 00:20:11,410 --> 00:20:15,380 because there's a lot of temporary array instances created, 473 473 00:20:15,380 --> 00:20:17,500 and of course, the amount of memory you're going to need 474 474 00:20:17,500 --> 00:20:19,030 is going to grow 475 475 00:20:19,030 --> 00:20:21,590 as the number of items you're sorting grows. 476 476 00:20:21,590 --> 00:20:24,710 As you'll see, the next sort algorithm we look at 477 477 00:20:24,710 --> 00:20:28,090 also uses a divide and conquer philosophy 478 478 00:20:28,090 --> 00:20:30,740 and is also a logarithmic algorithm, 479 479 00:20:30,740 --> 00:20:32,980 but it's an in-place algorithm. 480 480 00:20:32,980 --> 00:20:35,770 So we'll see that coming up in a couple of videos. 481 481 00:20:35,770 --> 00:20:37,820 All right, so that's how merge sort works. 482 482 00:20:37,820 --> 00:20:38,930 Let's implement it. 483 483 00:20:38,930 --> 00:20:40,480 I'll see you in the next video. 42463

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.