All language subtitles for 18. Radix Sort (Theory)

af Afrikaans
ak Akan
sq Albanian
am Amharic
ar Arabic Download
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bem Bemba
bn Bengali
bh Bihari
bs Bosnian
br Breton
bg Bulgarian
km Cambodian
ca Catalan
ceb Cebuano
chr Cherokee
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
ee Ewe
fo Faroese
tl Filipino
fi Finnish
fr French
fy Frisian
gaa Ga
gl Galician
ka Georgian
de German
el Greek
gn Guarani
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ia Interlingua
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
rw Kinyarwanda
rn Kirundi
kg Kongo
ko Korean
kri Krio (Sierra Leone)
ku Kurdish
ckb Kurdish (Soranî)
ky Kyrgyz
lo Laothian
la Latin
lv Latvian
ln Lingala
lt Lithuanian
loz Lozi
lg Luganda
ach Luo
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mfe Mauritian Creole
mo Moldavian
mn Mongolian
my Myanmar (Burmese)
sr-ME Montenegrin
ne Nepali
pcm Nigerian Pidgin
nso Northern Sotho
no Norwegian
nn Norwegian (Nynorsk)
oc Occitan
or Oriya
om Oromo
ps Pashto
fa Persian
pl Polish
pt-BR Portuguese (Brazil)
pt Portuguese (Portugal)
pa Punjabi
qu Quechua
ro Romanian
rm Romansh
nyn Runyakitara
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
sh Serbo-Croatian
st Sesotho
tn Setswana
crs Seychellois Creole
sn Shona
sd Sindhi
si Sinhalese
sk Slovak
sl Slovenian
so Somali
es Spanish
es-419 Spanish (Latin American)
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
tt Tatar
te Telugu
th Thai
ti Tigrinya
to Tonga
lua Tshiluba
tum Tumbuka
tr Turkish
tk Turkmen
tw Twi
ug Uighur
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
wo Wolof
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 1 00:00:00,160 --> 00:00:05,160 (lively music) (keyboard clacking) 2 2 00:00:05,230 --> 00:00:06,960 In this video, we're going to take a look 3 3 00:00:06,960 --> 00:00:10,480 at the Radix Sort and this is another sort algorithm 4 4 00:00:10,480 --> 00:00:13,520 that makes assumptions about the data it's sorting 5 5 00:00:13,520 --> 00:00:15,980 and in this case, the assumptions that it makes 6 6 00:00:15,980 --> 00:00:19,860 is that the data has the same radix and width. 7 7 00:00:19,860 --> 00:00:22,240 Now, what do I mean by that? 8 8 00:00:22,240 --> 00:00:25,360 The radix is the number of unique digits 9 9 00:00:25,360 --> 00:00:28,490 or values in the case of characters 10 10 00:00:28,490 --> 00:00:31,510 that a numbering system or an alphabet has. 11 11 00:00:31,510 --> 00:00:35,300 For example, the radix for the decimal system is 10 12 12 00:00:35,300 --> 00:00:37,180 because there are 10 possible digits 13 13 00:00:37,180 --> 00:00:38,660 in the decimal system. 14 14 00:00:38,660 --> 00:00:39,900 Zero to nine. 15 15 00:00:39,900 --> 00:00:42,580 For binary numbers, the radix is two 16 16 00:00:42,580 --> 00:00:47,230 because we use two digits in the binary system, zero and one 17 17 00:00:47,230 --> 00:00:50,450 and for the English alphabet, the radix is 26 18 18 00:00:50,450 --> 00:00:53,400 because there are 26 letters in the alphabet. 19 19 00:00:53,400 --> 00:00:56,240 So, that's what we mean by radix. 20 20 00:00:56,240 --> 00:00:59,110 Now, by width I mean the number of digits 21 21 00:00:59,110 --> 00:01:01,897 or letters, for example, the number 1235 22 22 00:01:03,660 --> 00:01:05,540 has a width of four. 23 23 00:01:05,540 --> 00:01:08,610 The string hello has a width of five. 24 24 00:01:08,610 --> 00:01:10,930 The number 10 has a width of two. 25 25 00:01:10,930 --> 00:01:15,840 So, 1,235 or 1235 has four digits, 26 26 00:01:15,840 --> 00:01:17,323 so it has a width of four, 27 27 00:01:17,323 --> 00:01:21,750 the string hello has five letters, so it has a width of five 28 28 00:01:21,750 --> 00:01:24,270 and the number 10 has two digits 29 29 00:01:24,270 --> 00:01:26,200 and so, it has a width of two. 30 30 00:01:26,200 --> 00:01:28,310 And so, the radix sort assumes 31 31 00:01:28,310 --> 00:01:32,740 that all the values have the same radix and the same width 32 32 00:01:32,740 --> 00:01:36,690 and so, that means that we can use the radix sort 33 33 00:01:36,690 --> 00:01:39,420 to sort integers and strings. 34 34 00:01:39,420 --> 00:01:41,220 The decimal point isn't a digit 35 35 00:01:41,220 --> 00:01:43,620 and so, floating point numbers can't be sorted 36 36 00:01:43,620 --> 00:01:45,100 using radix sort. 37 37 00:01:45,100 --> 00:01:46,170 So, how does it work? 38 38 00:01:46,170 --> 00:01:49,190 Well, we sort based on each individual digit 39 39 00:01:49,190 --> 00:01:51,530 or letter position and you'll see what I mean 40 40 00:01:51,530 --> 00:01:55,500 when we go through the algorithm on the slides. 41 41 00:01:55,500 --> 00:01:58,430 So, we start at the rightmost position 42 42 00:01:58,430 --> 00:02:01,210 and we sort based on that position, 43 43 00:02:01,210 --> 00:02:03,610 the digit or the letter at that position 44 44 00:02:03,610 --> 00:02:07,810 and then we move to the rightmost minus one digit 45 45 00:02:07,810 --> 00:02:09,780 or letter and we sort based on that 46 46 00:02:09,780 --> 00:02:10,890 and we keep doing that 47 47 00:02:10,890 --> 00:02:13,700 until we've done all of the digits or letters 48 48 00:02:13,700 --> 00:02:14,533 in the values. 49 49 00:02:14,533 --> 00:02:16,350 Now, the important point here 50 50 00:02:16,350 --> 00:02:19,531 and this is key, this is an absolute important point, 51 51 00:02:19,531 --> 00:02:22,820 is we have to use a stable sort algorithm 52 52 00:02:22,820 --> 00:02:27,070 at each stage, so when we sort the values based 53 53 00:02:27,070 --> 00:02:28,670 on the rightmost position, 54 54 00:02:28,670 --> 00:02:31,100 we have to use a stable sort algorithm. 55 55 00:02:31,100 --> 00:02:33,190 And then when we sort the values based 56 56 00:02:33,190 --> 00:02:35,710 on the rightmost minus one position, 57 57 00:02:35,710 --> 00:02:37,630 we have to use a stable sort algorithm 58 58 00:02:37,630 --> 00:02:39,307 and once again, you'll understand why 59 59 00:02:39,307 --> 00:02:42,430 when we take a look at going through the algorithm. 60 60 00:02:42,430 --> 00:02:43,870 So, let's do that now. 61 61 00:02:43,870 --> 00:02:46,238 So, we have a different array here now 62 62 00:02:46,238 --> 00:02:47,890 at the top of the screen. 63 63 00:02:47,890 --> 00:02:50,070 It's got six integers in it 64 64 00:02:50,070 --> 00:02:52,930 and you'll notice that they're all decimal numbers 65 65 00:02:52,930 --> 00:02:55,540 and they're all of width four. 66 66 00:02:55,540 --> 00:02:57,620 They all have four digits 67 67 00:02:57,620 --> 00:03:00,883 and so, we can sort this array using the radix sot. 68 68 00:03:02,250 --> 00:03:03,770 First we're going to sort this array 69 69 00:03:03,770 --> 00:03:05,770 based on the one's position 70 70 00:03:05,770 --> 00:03:08,020 and so, let's take a look at the first value, 71 71 00:03:08,020 --> 00:03:13,020 4725, the number five is in the one's position 72 72 00:03:14,180 --> 00:03:15,970 and for the second value, 73 73 00:03:15,970 --> 00:03:20,970 4586, the number six is in the one's position 74 74 00:03:21,410 --> 00:03:22,690 and so, what we're doing here 75 75 00:03:22,690 --> 00:03:25,440 is we're sorting by the digit 76 76 00:03:25,440 --> 00:03:27,370 that has the least weight 77 77 00:03:27,370 --> 00:03:30,210 because the one's position has the least weight 78 78 00:03:30,210 --> 00:03:32,180 in a decimal number 79 79 00:03:32,180 --> 00:03:34,310 and so, once we've done that, 80 80 00:03:34,310 --> 00:03:36,800 the bottom array will be the result. 81 81 00:03:36,800 --> 00:03:38,780 1330 will come first 82 82 00:03:38,780 --> 00:03:41,960 because it has zero in the one's position, 83 83 00:03:41,960 --> 00:03:44,370 that'll be followed by 8792 84 84 00:03:44,370 --> 00:03:47,270 because there's a two in the one's position. 85 85 00:03:47,270 --> 00:03:49,287 Then we have 1594, 86 86 00:03:49,287 --> 00:03:54,287 4725, 4586 and 5729 87 87 00:03:54,507 --> 00:03:56,570 and if you look at their one's position 88 88 00:03:56,570 --> 00:03:58,470 or the rightmost digit, 89 89 00:03:58,470 --> 00:04:01,170 you'll see that the values have been sorted 90 90 00:04:01,170 --> 00:04:03,520 according to what's in that position 91 91 00:04:03,520 --> 00:04:07,280 and now we move to the second position 92 92 00:04:07,280 --> 00:04:10,980 from the right or the second least significant digit 93 93 00:04:10,980 --> 00:04:13,170 and we sort based on that. 94 94 00:04:13,170 --> 00:04:15,440 And so, at the top is the array we just had 95 95 00:04:15,440 --> 00:04:16,840 on the other slide 96 96 00:04:16,840 --> 00:04:18,450 and now we're gonna sort based 97 97 00:04:18,450 --> 00:04:22,310 on the second to last rightmost position. 98 98 00:04:22,310 --> 00:04:25,460 So, for 1330 that will be three, 99 99 00:04:25,460 --> 00:04:28,590 for 8792 it will be nine, 100 100 00:04:28,590 --> 00:04:31,500 for 1594 it will be nine, 101 101 00:04:31,500 --> 00:04:34,630 for 4725 it will be two et cetera 102 102 00:04:34,630 --> 00:04:36,880 and so, down at the bottom, 103 103 00:04:36,880 --> 00:04:39,150 we've now sorted based on that digit, 104 104 00:04:39,150 --> 00:04:42,540 so we have 4725 'cause it has a two, 105 105 00:04:42,540 --> 00:04:45,330 5729 which also have a two, 106 106 00:04:45,330 --> 00:04:47,378 1330 'cause it has a three, 107 107 00:04:47,378 --> 00:04:50,800 4586 because it has an eight 108 108 00:04:50,800 --> 00:04:54,970 and then 8792 and 1594 both have nines. 109 109 00:04:54,970 --> 00:04:56,810 Now, as I mentioned, 110 110 00:04:56,810 --> 00:05:00,800 critically important, this is a stable sort 111 111 00:05:00,800 --> 00:05:01,930 and so, you'll notice 112 112 00:05:01,930 --> 00:05:04,890 that we have a couple of pairs of duplicates here, 113 113 00:05:04,890 --> 00:05:08,580 4725 and 5729 114 114 00:05:08,580 --> 00:05:12,370 both have twos in their rightmost digit minus one position 115 115 00:05:12,370 --> 00:05:13,800 and you'll notice at the top 116 116 00:05:13,800 --> 00:05:18,301 that 4725 came before 5729 117 117 00:05:18,301 --> 00:05:23,301 and in the bottom, 4725 still comes before 5729 118 118 00:05:24,447 --> 00:05:26,360 and that is critical 119 119 00:05:26,360 --> 00:05:29,800 and that's because we have to use the stable sort algorithm 120 120 00:05:29,800 --> 00:05:32,310 and the same thing goes for 8792 121 121 00:05:32,310 --> 00:05:35,000 and 1594, they both have nines 122 122 00:05:35,000 --> 00:05:37,590 in the second to right position 123 123 00:05:37,590 --> 00:05:40,590 and in the top right, 8792 124 124 00:05:40,590 --> 00:05:43,210 came before 1594 125 125 00:05:43,210 --> 00:05:44,590 and it still does 126 126 00:05:44,590 --> 00:05:46,310 and so, that's critical. 127 127 00:05:46,310 --> 00:05:48,930 We have to use a stable sort algorithm. 128 128 00:05:48,930 --> 00:05:52,630 So, now we're gonna sort based on the hundreds position. 129 129 00:05:52,630 --> 00:05:55,317 So, we sorted here based on the tens position, 130 130 00:05:55,317 --> 00:05:58,320 and now we're gonna sort based on the hundreds position 131 131 00:05:58,320 --> 00:06:00,520 and so, when we do that, 132 132 00:06:00,520 --> 00:06:04,810 1330 comes first 'cause it's got three in that position, 133 133 00:06:04,810 --> 00:06:07,400 followed by 4586, 134 134 00:06:07,400 --> 00:06:10,020 1594, 4725, 5729 135 135 00:06:12,510 --> 00:06:15,140 and 8792 and once again, 136 136 00:06:15,140 --> 00:06:17,771 we have to use a stable sort algorithm 137 137 00:06:17,771 --> 00:06:19,560 and so, you'll notice here 138 138 00:06:19,560 --> 00:06:23,440 that we have two fives in the hundreds position 139 139 00:06:23,440 --> 00:06:27,460 and 4586 still comes before 1594 140 140 00:06:27,460 --> 00:06:30,150 as it did in the top array 141 141 00:06:30,150 --> 00:06:33,820 and we have three values with seven in the hundreds position 142 142 00:06:33,820 --> 00:06:37,330 and the relative positions of those three values 143 143 00:06:37,330 --> 00:06:38,920 have been preserved. 144 144 00:06:38,920 --> 00:06:43,570 So, 4725 still comes before 5729 145 145 00:06:43,570 --> 00:06:46,480 and that still comes before 8792 146 146 00:06:46,480 --> 00:06:48,960 and then the final step is we sort based 147 147 00:06:48,960 --> 00:06:50,670 on the thousands position. 148 148 00:06:50,670 --> 00:06:55,670 And so, 1330 has a one, 1594 has a one, 149 149 00:06:56,090 --> 00:07:00,140 4586 and 4725 both have fours 150 150 00:07:00,140 --> 00:07:03,590 and then there's 5729 and 8792 151 151 00:07:03,590 --> 00:07:06,400 and we have sorted our array. 152 152 00:07:06,400 --> 00:07:08,850 Now once again, this had to be a stable sort 153 153 00:07:08,850 --> 00:07:13,096 and so, 1330 had to come before 1594, 154 154 00:07:13,096 --> 00:07:16,640 4586 had to come before 4725. 155 155 00:07:16,640 --> 00:07:18,940 Now maybe you can understand at this point 156 156 00:07:18,940 --> 00:07:21,060 why the sort has to be stable. 157 157 00:07:21,060 --> 00:07:24,540 If it wasn't stable, this radix sort wouldn't work 158 158 00:07:24,540 --> 00:07:27,320 because at each stage, for example, 159 159 00:07:27,320 --> 00:07:31,840 in this stage, if 1594 could cross 1330, 160 160 00:07:32,700 --> 00:07:34,860 we wouldn't get the correct sorted order. 161 161 00:07:34,860 --> 00:07:39,180 If 4725 could end up before 4586, 162 162 00:07:39,180 --> 00:07:41,530 we would not get the correct sorted order 163 163 00:07:41,530 --> 00:07:42,790 and it's kind of interesting 164 164 00:07:42,790 --> 00:07:45,310 because when we first start doing a sort, 165 165 00:07:45,310 --> 00:07:47,580 it just means, you start to wonder, 166 166 00:07:47,580 --> 00:07:48,930 how is this ever going to work? 167 167 00:07:48,930 --> 00:07:51,010 Because when we sort on the one's position, 168 168 00:07:51,010 --> 00:07:53,400 we got, we go back there, 169 169 00:07:53,400 --> 00:07:54,530 we got a certain array 170 170 00:07:54,530 --> 00:07:56,600 but then when we sorted on the 10 position, 171 171 00:07:56,600 --> 00:07:58,970 all the elements are changing positions again 172 172 00:07:58,970 --> 00:08:01,090 and you're wondering how is this ever going to end up 173 173 00:08:01,090 --> 00:08:02,310 in sorted order? 174 174 00:08:02,310 --> 00:08:04,610 But the key is we're sorting 175 175 00:08:04,610 --> 00:08:06,660 from the least significant digit, 176 176 00:08:06,660 --> 00:08:08,010 the one with the least weight 177 177 00:08:08,010 --> 00:08:09,460 when it comes to the value 178 178 00:08:09,460 --> 00:08:11,080 to the most significant digit 179 179 00:08:11,080 --> 00:08:14,500 and at every phase, it's a stable sort. 180 180 00:08:14,500 --> 00:08:15,951 And so, because of that, 181 181 00:08:15,951 --> 00:08:19,120 as we move from right to left, 182 182 00:08:19,120 --> 00:08:22,410 as we go from the least significant digit 183 183 00:08:22,410 --> 00:08:24,290 to the most significant digit, 184 184 00:08:24,290 --> 00:08:27,970 you'll see that a sorted order is starting to emerge 185 185 00:08:27,970 --> 00:08:29,580 and then on the final step, 186 186 00:08:29,580 --> 00:08:31,610 because it's a stable sort, 187 187 00:08:31,610 --> 00:08:35,030 all of the previous sorts that we've done still count, 188 188 00:08:35,030 --> 00:08:37,870 they don't get undone at each stage, 189 189 00:08:37,870 --> 00:08:40,680 it kind of looks like we're not sorting at each stage 190 190 00:08:40,680 --> 00:08:43,840 but we are, the data is gradually becoming more 191 191 00:08:43,840 --> 00:08:45,230 and more sorted 192 192 00:08:45,230 --> 00:08:47,810 and the fact that we're using a stable sort algorithm means 193 193 00:08:47,810 --> 00:08:49,640 that on that final sort, 194 194 00:08:49,640 --> 00:08:52,470 when we're sorting on the most significant digit, 195 195 00:08:52,470 --> 00:08:56,520 values that have the same most significant digit 196 196 00:08:56,520 --> 00:08:58,930 will not switch relative positions 197 197 00:08:58,930 --> 00:09:02,075 and so, the values that have a higher hundreds, 198 198 00:09:02,075 --> 00:09:04,930 a higher digit in the hundreds will remain 199 199 00:09:04,930 --> 00:09:08,130 after values that have a lower digit in the hundreds 200 200 00:09:08,130 --> 00:09:11,600 and the same will go when we do the sort with the tens. 201 201 00:09:11,600 --> 00:09:13,900 When we do the sort based on the tens, 202 202 00:09:13,900 --> 00:09:16,120 values that have a duplicate tens value, 203 203 00:09:16,120 --> 00:09:18,900 well, the value with the higher digit in the one's position 204 204 00:09:18,900 --> 00:09:21,250 will come after the value with the lower digit 205 205 00:09:21,250 --> 00:09:22,290 in the one's position 206 206 00:09:22,290 --> 00:09:23,220 and so, you can see 207 207 00:09:23,220 --> 00:09:25,560 that because we're using a stable sort algorithm 208 208 00:09:25,560 --> 00:09:26,880 at each phase 209 209 00:09:26,880 --> 00:09:29,800 and we're going from the least significant digit 210 210 00:09:29,800 --> 00:09:31,170 to the most significant digit, 211 211 00:09:31,170 --> 00:09:33,530 or letter depending on what we're sorting, 212 212 00:09:33,530 --> 00:09:37,840 the results of previous sorting phases are preserved 213 213 00:09:37,840 --> 00:09:40,820 and that's why radix sort works. 214 214 00:09:40,820 --> 00:09:42,990 So, counting sort is often used 215 215 00:09:42,990 --> 00:09:44,990 as the sort algorithm for radix sort. 216 216 00:09:44,990 --> 00:09:46,500 In the last video I said 217 217 00:09:46,500 --> 00:09:49,600 that the implementation I showed you was not stable, 218 218 00:09:49,600 --> 00:09:51,620 so when we implement radix sort, 219 219 00:09:51,620 --> 00:09:54,380 you'll have a chance to see the stable version 220 220 00:09:54,380 --> 00:09:55,610 of counting sort. 221 221 00:09:55,610 --> 00:09:58,440 Now, once again, because radix sort makes assumptions 222 222 00:09:58,440 --> 00:10:01,200 about the data, we can achieve linear time. 223 223 00:10:01,200 --> 00:10:04,160 Even so, this often runs slower 224 224 00:10:04,160 --> 00:10:06,360 than O to the nlog of n 225 225 00:10:06,360 --> 00:10:08,490 because of the overhead involved 226 226 00:10:08,490 --> 00:10:09,590 and by overhead, 227 227 00:10:09,590 --> 00:10:14,571 I mean that you have to isolate each individual digit 228 228 00:10:14,571 --> 00:10:17,550 or letter at each phase 229 229 00:10:17,550 --> 00:10:19,020 and so, there's overhead involved 230 230 00:10:19,020 --> 00:10:21,470 just to figure out what value you're supposed to be sorting 231 231 00:10:21,470 --> 00:10:23,010 at each phase and because of that, 232 232 00:10:23,010 --> 00:10:24,780 even though it can achieve linear time, 233 233 00:10:24,780 --> 00:10:27,010 it often runs a little bit slower than that. 234 234 00:10:27,010 --> 00:10:28,820 Now whether or not it's in place 235 235 00:10:28,820 --> 00:10:30,980 will depend on which sort algorithm you use 236 236 00:10:30,980 --> 00:10:33,830 because as we've seen, some sort algorithms are in place 237 237 00:10:33,830 --> 00:10:35,100 and some aren't. 238 238 00:10:35,100 --> 00:10:37,580 So, radix sort can be in place 239 239 00:10:37,580 --> 00:10:40,210 or it might not be place. 240 240 00:10:40,210 --> 00:10:43,520 It's gonna depend on what sort algorithm you choose 241 241 00:10:43,520 --> 00:10:45,340 to do the sorting at each phase. 242 242 00:10:45,340 --> 00:10:47,900 It's a stable algorithm as we've seen 243 243 00:10:47,900 --> 00:10:50,250 because the sort algorithm we use at each phase 244 244 00:10:50,250 --> 00:10:53,640 has to be stable, radix sort turns out to be stable 245 245 00:10:53,640 --> 00:10:57,000 and in fact, that's key to making it work 246 246 00:10:57,000 --> 00:10:59,300 is that the sorts in earlier phases 247 247 00:10:59,300 --> 00:11:02,610 aren't being undone as we sort based 248 248 00:11:02,610 --> 00:11:05,790 on the more significant positions. 249 249 00:11:05,790 --> 00:11:07,230 So, that's it for the theory. 250 250 00:11:07,230 --> 00:11:09,320 Let's move onto implementation. 251 251 00:11:09,320 --> 00:11:10,870 I'll see you in the next video. 21085

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.