subtitlecat.com

All language subtitles for 25 - Huffman Codes English

Afrikaans

Akan

Albanian

Amharic

Arabic Download

Armenian

Azerbaijani

Basque

Belarusian

Bemba

Bengali

Bihari

Bosnian

Breton

Bulgarian

Cambodian

Catalan

Cebuano

Cherokee

Chichewa

Chinese (Simplified)

Chinese (Traditional)

Corsican

Croatian

Czech

Danish

Dutch

English

Esperanto

Estonian

Ewe

Faroese

Filipino

Finnish

French

Frisian

Galician

Georgian

German

Greek

Guarani

Gujarati

Haitian Creole

Hausa

Hawaiian

Hebrew

Hindi

Hmong

Hungarian

Icelandic

Igbo

Indonesian

Interlingua

Irish

Italian

Japanese

Javanese

Kannada

Kazakh

Kinyarwanda

Kirundi

Kongo

Korean

Krio (Sierra Leone)

Kurdish

Kurdish (Soranî)

Kyrgyz

Laothian

Latin

Latvian

Lingala

Lithuanian

Lozi

Luganda

Luo

Luxembourgish

Macedonian

Malagasy

Malay

Malayalam

Maltese

Maori

Marathi

Mauritian Creole

Moldavian

Mongolian

Myanmar (Burmese)

Montenegrin

Nepali

Nigerian Pidgin

Northern Sotho

Norwegian

Norwegian (Nynorsk)

Occitan

Oriya

Oromo

Pashto

Persian

Polish

Portuguese (Brazil)

Portuguese (Portugal)

Punjabi

Quechua

Romanian

Romansh

Runyakitara

Russian

Samoan

Scots Gaelic

Serbian

Serbo-Croatian

Sesotho

Setswana

Seychellois Creole

Shona

Sindhi

Sinhalese

Slovak

Slovenian

Somali

Spanish

Spanish (Latin American)

Sundanese

Swahili

Swedish

Tajik

Tamil

Tatar

Telugu

Thai

Tigrinya

Tonga

Tshiluba

Tumbuka

Turkish

Turkmen

Twi

Uighur

Ukrainian

Urdu

Uzbek

Vietnamese

Welsh

Wolof

Xhosa

Yiddish

Yoruba

Zulu

Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 00:00:08,700 --> 00:00:12,070 ‫High in this video we're going to talk about Huffman coding. 2 00:00:12,540 --> 00:00:22,020 ‫Including is a great way to compress data and to be able to take a large amount of data and to shrink 3 00:00:22,020 --> 00:00:30,120 ‫it down to bite size chunks and so to give me a good example of how this works right here on the right 4 00:00:30,120 --> 00:00:30,560 ‫hand side. 5 00:00:30,570 --> 00:00:35,290 ‫We have the letters A B C D and D. 6 00:00:35,430 --> 00:00:44,880 ‫If we had a document that only had a range of these five letters and we want it to compress the data 7 00:00:45,810 --> 00:00:48,030 ‫Huffman coding would be a great way of doing that. 8 00:00:48,030 --> 00:00:57,150 ‫And the reason why is because if we want to store this data in the regular text format that usually 9 00:00:57,330 --> 00:01:00,930 ‫would store data you would have to do something like this. 10 00:01:00,930 --> 00:01:14,100 ‫You'd need to set up your system and each one would have to be represented by a byte and a byte is a 11 00:01:14,100 --> 00:01:20,730 ‫total of 8 bits. 12 00:01:20,930 --> 00:01:31,140 ‫And so you're looking at one two three four five six seven and eight. 13 00:01:31,220 --> 00:01:38,570 ‫And so you're looking at having to take up 8 bits for every letter but there's actually a lot better 14 00:01:38,570 --> 00:01:39,730 ‫way of doing this. 15 00:01:39,770 --> 00:01:44,220 ‫And that's what huffman encoding does and this only works with specific types of data. 16 00:01:44,240 --> 00:01:50,850 ‫So if you needed the full set and the full range of data this isn't something that would work for you. 17 00:01:50,900 --> 00:01:54,170 ‫But what we can do is actually take it. 18 00:01:54,170 --> 00:01:59,140 ‫So we have a range set of rules similar to something like this. 19 00:01:59,240 --> 00:02:10,190 ‫We can actually take it and instead of requiring a total of eight to how we can do this just by using 20 00:02:11,710 --> 00:02:19,120 ‫three bets and we're going to walk through how to do it in a tree form first and then I'm going to show 21 00:02:19,120 --> 00:02:22,840 ‫you how the actual code itself works. 22 00:02:22,840 --> 00:02:31,310 ‫So the very first thing we're going to do is on the right hand side the letter A. 23 00:02:31,440 --> 00:02:38,350 ‫Shows that we have 24 counts of the letter and B has 12 C 10 etc. etc.. 24 00:02:38,470 --> 00:02:43,270 ‫And so if you're wondering when this could actually be important. 25 00:02:43,630 --> 00:02:50,770 ‫There are certain studies in computer science dealing with things like bioinformatics that this is absolutely 26 00:02:50,770 --> 00:02:56,470 ‫critical and because you're dealing with gigantic amounts of data and you need to be able to compress 27 00:02:56,740 --> 00:02:59,660 ‫and Huffman coding is one of the best ways of doing that. 28 00:03:00,070 --> 00:03:05,350 ‫So the type of tree that we're going to generate is going to be a little bit different than some of 29 00:03:05,350 --> 00:03:07,860 ‫the others we've seen specifically. 30 00:03:07,900 --> 00:03:12,720 ‫It's going to be different and like binary trees because it's going to have values associated with them. 31 00:03:12,850 --> 00:03:17,420 ‫And so we're going to actually start off with the two lowest values. 32 00:03:17,560 --> 00:03:27,070 ‫So we're going to give a B and then we're going to say that there is a count of it and we'll draw a 33 00:03:27,070 --> 00:03:31,430 ‫box around this and we know that D is the same thing. 34 00:03:31,480 --> 00:03:42,340 ‫So we will say there's eight of those and then we're going to push that up to the top and say that we 35 00:03:42,340 --> 00:03:48,070 ‫know that 8 plus 8 equals 16 to have a total of 16. 36 00:03:48,070 --> 00:03:56,310 ‫Now we know by looking up at our little chart up here that B and C both have 12 and 10 respectively. 37 00:03:56,410 --> 00:04:01,030 ‫And so we're going to use those values as well starting with the lower value first. 38 00:04:01,030 --> 00:04:03,580 ‫So we're going to start off with C 39 00:04:14,900 --> 00:04:20,770 ‫we're going to start off with C being equal to its value 10 40 00:04:23,720 --> 00:04:30,940 ‫and then B 12. 41 00:04:31,400 --> 00:04:33,640 ‫And you notice that we are leaving out a. 42 00:04:33,650 --> 00:04:37,800 ‫And that's an important part and I'll show you what that means here in a second. 43 00:04:37,970 --> 00:04:42,220 ‫So we have a B and C and that you total those up those equal 22 44 00:04:45,940 --> 00:04:50,360 ‫K and now all we have to do is connect these. 45 00:04:51,050 --> 00:04:55,760 ‫So we have 16 plus 22 which gives the total of 38. 46 00:04:56,720 --> 00:04:59,920 ‫And now all we have to do is connect. 47 00:05:00,110 --> 00:05:01,250 ‫So we have a. 48 00:05:01,460 --> 00:05:02,960 ‫Which equals 24 49 00:05:08,800 --> 00:05:10,740 ‫and then we're going to total that up. 50 00:05:10,740 --> 00:05:20,530 ‫So the way Huffman codes the root node doesn't actually containing data just contains a sum of the total 51 00:05:20,530 --> 00:05:21,840 ‫data we're looking for. 52 00:05:21,910 --> 00:05:24,160 ‫And so we know our total. 53 00:05:24,190 --> 00:05:28,960 ‫If you add up all of these right here the total is 62. 54 00:05:29,080 --> 00:05:34,850 ‫So that's how you know your root node is always going to represent the total amount. 55 00:05:35,080 --> 00:05:44,650 ‫So if you're wondering how this is better than just having a just regular text file or something like 56 00:05:44,650 --> 00:05:52,900 ‫that which usually to use like ASCII kind of format ASCII like we talked about takes bits or one byte 57 00:05:53,080 --> 00:06:00,430 ‫for each one of these items and so that can actually take up quite a bit of space where as what we can 58 00:06:00,430 --> 00:06:04,210 ‫do here is we can actually assign values. 59 00:06:04,300 --> 00:06:13,540 ‫And so we're going to give this a value of zero and we're going to give this a value of 1 and then we're 60 00:06:13,540 --> 00:06:16,840 ‫going to use this just like the regular binary tree. 61 00:06:16,890 --> 00:06:22,180 ‫Right is 1 and 0 is on the left hand side. 62 00:06:22,510 --> 00:06:25,890 ‫We'll do that with each one type. 63 00:06:25,900 --> 00:06:36,970 ‫Now the way this will work is a is just going to be connected from the root so the root A is going to 64 00:06:37,780 --> 00:06:40,390 ‫have the value of 0. 65 00:06:40,710 --> 00:06:51,590 ‫B If you look at the way this works and you just follow binary tree you'll have the room going here. 66 00:06:51,700 --> 00:06:54,350 ‫Here here. 67 00:06:54,360 --> 00:07:08,400 ‫So you're going to be looking at B being represented by the binary value of 1 0 0 and an see if you 68 00:07:08,400 --> 00:07:13,580 ‫follow it down in the tree where it going down 1 0 1 69 00:07:17,010 --> 00:07:31,200 ‫D is going to be 1 1 0 and e is going to be 1 1 1 and all we're doing is we're assigning the edges on 70 00:07:31,200 --> 00:07:37,140 ‫the path we're saying it a 1 or a 0 depending on if it's on the right hand or the left hand side and 71 00:07:37,140 --> 00:07:45,920 ‫then that is what we're going to store in our in our compressed files. 72 00:07:45,920 --> 00:07:58,170 ‫So instead of having to give a total of eight values here we are given a 1 instead of 8 values here 73 00:07:58,180 --> 00:07:59,140 ‫it's three. 74 00:07:59,280 --> 00:08:04,830 ‫And the reason it's 3 it's just one two three here instead of eight. 75 00:08:05,070 --> 00:08:09,330 ‫It's three three instead of a three instead of eight. 76 00:08:09,330 --> 00:08:15,840 ‫So all we're really doing is we're shrinking down the number of bits that are required. 77 00:08:15,990 --> 00:08:18,930 ‫And as you can see right here tab. 78 00:08:18,960 --> 00:08:23,270 ‫ABC D and E it with a normal system. 79 00:08:23,370 --> 00:08:27,520 ‫This would be eight times eight times five. 80 00:08:27,630 --> 00:08:35,310 ‫So we would need a total of 40 as compared to the new system that uses the Huffman coding which has 81 00:08:35,400 --> 00:08:40,550 ‫a total of one plus four sets of three. 82 00:08:40,560 --> 00:08:42,770 ‫So we're looking at 13. 83 00:08:42,840 --> 00:08:52,650 ‫So by being able to Huffman coding we're saving well over half of the amount of space and that's one 84 00:08:52,650 --> 00:08:58,740 ‫of the big reasons why Huffman coding is so powerful because by simply doing something by shrinking 85 00:08:58,740 --> 00:09:08,190 ‫down and number of bits that are necessary on a per energy or on a per letter basis we're able to have 86 00:09:08,250 --> 00:09:09,780 ‫a much smaller file. 87 00:09:09,780 --> 00:09:16,920 ‫And whether you're dealing with DNA sequences or something where you can work with the data set like 88 00:09:16,920 --> 00:09:17,360 ‫this. 89 00:09:17,390 --> 00:09:19,350 ‫This can be a great way of doing it. 90 00:09:19,350 --> 00:09:23,610 ‫So please let me know if you have any questions whatsoever and I'll see you in the next video. 9652