We humans have a keen eye for a visual treat and love good eye candy. That being said, statisticians were having a hard time getting people to listen to their very important and relevant data. Frankly speaking, even though tables are easier to consume than raw data, they still do not taste good. Do they? So what do we do?
A lot of the data that we deal with in real life is comparative, as in comparing two things. It can be about which student is tallest, or which candy is cheapest, and so on. This is where graphical representation comes to our rescue. A graphical representation of data allows us to understand the data much more easily and intuitively than a table.
Our aim here is to throw some light on three major types of graphical representation of data: the bar graph, the histogram and the frequency polygon.
You already know a few things about graphs from your earlier classes. Let's build on that. Let's represent the table of heights in the form of a bar graph. Let's bring up the table of ungrouped data here. To draw the chart, I'll start off by drawing a flat horizontal line called the X axis, where you represent the different height values of all the students.
Now, as I go through this data, I add a dot above the X axis for each student. So first, I put a dot at 195. The next is at 175. The next is at 170, and I keep continuing. The fifth student is at 185, and so is the sixth student, so I add a second dot above the first one.
This can continue till I exhaust the complete data. So if you observe, the Y axis represents the number of students, or the frequency.
Because this is called a bar graph and not a dot graph, instead of using dots like you just did, you can draw rectangular bars and extend each one up to its corresponding data point. For example, there is just one student with a height of 130 cm, so I extend the bar till it reaches the level of 1 on the Y axis. The next data value is 135, which has a frequency of 2, so I extend the bar till it reaches a value of 2 on the Y axis. Moving on, we have 155, which appears 7 times, so the bar for this value extends till it reaches 7 on the Y axis. 162 goes up to 6. 168 goes up to 4. Finally, 195 has a frequency of 1.
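The tallying just described can be sketched in a few lines of Python. This is only an illustration: the `heights` list is rebuilt from the handful of frequencies quoted in the narration, not from the full table shown in the video, and the text bars stand in for the drawn ones.

```python
from collections import Counter

# Heights (cm) rebuilt only from the frequencies quoted in the narration;
# the video's full table is not in the transcript, so this list is partial.
heights = [130] + [135] * 2 + [155] * 7 + [162] * 6 + [168] * 4 + [195]

# Frequency = number of times each height appears in the data set.
frequency = Counter(heights)

# One row per distinct height; the length of the bar is the frequency.
for height in sorted(frequency):
    print(f"{height} cm | {'#' * frequency[height]} ({frequency[height]})")
```

Each `#` plays the role of one dot stacked above the X axis, so the row for 155 is seven characters long.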
Remember that the thickness of all these bars is actually your choice. But for the sake of clarity, you tend to keep all of them the same thickness. From this you can easily tell the number of students that have the same height by looking at the height of each of these bars. This becomes all the more useful when we ask questions about a particular height and how many students share it.
We apply the same logic to grouped data as well. Here we have the heights of 60 students grouped into classes of width 10 each. So we draw the axes again. This time we have the classes on the X axis and the corresponding frequencies on the Y axis. The class of 130-140 has a frequency of 9, so the bar extends from the X axis to reach a level of 9 on the Y axis.
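Grouping heights into classes of width 10 can be sketched as below. The helper name `group_into_classes`, the sample heights, and the convention that a class includes its lower limit but excludes its upper limit are all assumptions made for this illustration; the 60 heights from the video are not in the transcript.

```python
def group_into_classes(heights, start, width, count):
    """Count how many heights fall in each class [lower, upper)."""
    table = {}
    for i in range(count):
        lower = start + i * width
        upper = lower + width
        # Lower limit included, upper limit excluded (an assumed convention).
        table[(lower, upper)] = sum(1 for h in heights if lower <= h < upper)
    return table


# Hypothetical sample, not the 60 heights used in the video.
sample_heights = [130, 135, 135, 148, 155, 155, 162, 168, 171, 195]

grouped = group_into_classes(sample_heights, start=130, width=10, count=7)
for (lower, upper), freq in grouped.items():
    print(f"{lower}-{upper}: {'#' * freq} ({freq})")
```

The classes go on the X axis and the counts on the Y axis, exactly as with the ungrouped bars.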
Similarly for the rest of the frequencies. Making data visual makes it much easier to understand, doesn't it? A similar representation is possible using a histogram as well. Let's dive in.
A histogram is just like this bar graph, but I will have to make a few changes here. Just like in a bar graph, we represent the height on the horizontal axis, but using a suitable scale. Scale becomes very important here, because the area that each bar covers in a histogram is very, very important. We can choose the scale so that 1 cm on the graph is equivalent to 10 cm of height. So each class occupies a width of 1 cm on this graph. Also, since the first class interval does not start from zero but from a fixed non-zero value, we show it on the graph by marking a kink, like you see here, to indicate a break in the axis. Next.
Unlike the bar graph, there are no gaps between the rectangles of the graph. So I will have to knock off all of these gaps and keep only the lower class limits on the graph. Technically, it is one solid figure. What you see now is called a histogram.
One important thing that you will need to keep in mind about a histogram is that the area of the graph plays a very crucial role. In fact, the area of each bar is directly proportional to the frequency of that class. Also, the sum of the areas of all the bars is equal to the total frequency of all the classes in the table, when one class width is taken as the unit of width.
Till now we have dealt with classes of equal sizes. What if I have different class sizes on the same histogram?
That is, what if I have to put all students with heights less than 150 in one bracket and anyone with a height of more than 170 in another bracket? Absolutely arbitrary. That means the class intervals 130-140 and 140-150 get clubbed into one class interval of 130-150, along with their respective frequencies. Likewise, the class intervals 170-180, 180-190 and 190-200 all get clubbed into a class interval of 170-200. And in between, we have 150-160 and 160-170, which remain as is.
So now you have a new table with classes of different widths. The first class width is 20, that is, 150 minus 130. It is followed by two classes of width 10 each, for example 160 minus 150. And the last one has a width of 30, which is 200 minus 170. If I were to draw a bar graph here, this is how it would look.
For a histogram, on the other hand, I mentioned that the areas of the bars are crucial for accurate representation. We need to pay attention to both the width and the height of these bars. Here the classes come in three different widths. Remember, I told you that the area of each bar of a histogram has to be proportional to the frequency. So how do we do this?
We need to bring all the frequencies in line with the minimum class width. The minimum class width here is 10. The lengths of the rectangles are to be modified in proportion to this class size. For instance, when the class size is 20, as in the first case, the length of the rectangle will be 16 times 10 divided by 20, which is 8. This is simple cross multiplication. This way the bar, which is two minimum class widths wide, still represents a total frequency of 16 in this range.
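The adjustment for unequal class widths, applied to every class of the regrouped table, can be sketched as follows. The function name `adjusted_height` is made up for this example. The frequencies 16 (for 130-150) and 8 (for 170-200) are the ones quoted in the narration; the two middle frequencies are placeholders chosen only so that the total comes to 60.

```python
def adjusted_height(frequency, class_width, min_width):
    """Bar height for a histogram with unequal class widths."""
    return frequency * min_width / class_width


# (lower, upper, frequency). Only the first and last frequencies come from
# the narration; the two middle ones are placeholders that make the total 60.
classes = [(130, 150, 16), (150, 160, 20), (160, 170, 16), (170, 200, 8)]

# The narrowest class sets the reference width (10 here).
min_width = min(upper - lower for lower, upper, _ in classes)

for lower, upper, freq in classes:
    height = adjusted_height(freq, upper - lower, min_width)
    print(f"{lower}-{upper}: frequency {freq}, bar height {height:.2f}")
```

Multiplying each adjusted height by the class width measured in units of the minimum width (2, 1, 1 and 3 here) gives back the original frequency, which is the sense in which the area of each bar stays proportional to its frequency.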
For the next two groups, the class widths are the same as the minimum class width, hence you don't need to change anything. The last one, however, goes through the same treatment as the first one. In this instance, the class size is 30 and the frequency is 8. So when the class size becomes 10, the length of this rectangle will be 8 times 10 divided by 30, that is, about 2.67. The bar heights of this histogram can now be said to be proportional to the number of students per 10 cm interval.
Even though a bar graph and a histogram look alike, you might have noticed already that there are a few differences. In fact, if I bring them together, unless you are a statistician, chances are that you will get confused. The exercise that we will do now will help you sort out this confusion.
Suppose I ask you to collect data about the language preferences of the students and add it to our original table. Now I will be able to draw a bar graph out of it. Now let's try to make a histogram out of the language data that we have collected. Is that even possible? Hmm. No, it is not.
In fact, the data that you collect can be split into qualitative and quantitative data. If you're looking at the colors of the cars on the road, then the color of a car is data of the qualitative kind, because it describes a quality of that particular item. Or if I ask you the flavor of ice cream that you like, that again is qualitative data. On the other hand, data such as heights, weights, roll numbers, etc. are represented by numbers. Here, a student is 160 cm tall.
160 is quantitative data, since it is numerical. From the examples that we have solved before, you can see that we can represent both qualitative and quantitative data on a bar graph, whereas we can represent only quantitative data on a histogram. So the next time you need to make a graph, be sure to analyze what kind of data you are trying to represent.
Now let's start making a difference table out here, and let's populate the differences as we go. Let's bring back the graph of heights from the bar graph section. We see that on the X axis each data point is represented individually. For example, a student's height of 130 cm is represented individually as 130 cm on the X axis. Data represented individually like this is called discrete data.
We also know that we can construct a bar graph using grouped data as well. Grouped data here refers to data where a data point is represented not individually but as part of a continuous range of values. In the case of discrete data, we can have gaps between the values of data points on the X axis.
So the data that you collect can be classified in one more way: as continuous or discrete data. When you're talking about discrete data, there can be gaps in the data that you collect. For example, 130 cm and 135 cm as heights of students have a gap of 5 between them. And when you're talking about continuous data, there cannot be gaps like these. So when it comes to a bar graph, you can represent both continuous and discrete data. But in a histogram you can represent only continuous data.
This is another reason why the bars of a bar graph are separated by gaps, since they are discrete values, whereas in a histogram all the bars are clubbed together. We also cannot reorder the data in the case of a histogram, due to the continuity of the data. Let's add these two points to our comparison chart as well.
Using continuous data means that the classes have to be placed on the graph in the order in which they appear. On the other hand, having discrete data in the bar graph allows you to arrange the variables in any way you want to. When I'm drawing a bar graph, the order in which I show the elements on the X axis is not a problem at all. I can first show 130-140, then 150-160, and then 140-150. But when it comes to a histogram, I cannot reorder the data.
This is obvious, because we are dealing with a continuous variable. This is one more for the comparison chart. As you've already seen, the spaces between the bars are not present in the histogram. It essentially looks like one big block. Also, the widths of the bars need not be the same when it comes to a histogram. Remember too that the area of each bar plays a huge role in a histogram, and hence the bar heights must be scaled to one common class width throughout. But in the case of a bar graph, the width of the bars is immaterial to the interpretation of the graph.
There is yet another visual way of representing quantitative data and its frequency. It's called the frequency polygon. Let's consider the histogram that we initially constructed with equal class intervals.
Let me mark this point, which is the midpoint of the class interval of 130-140. I will call this point the class mark. So class mark is a mathematical way of saying midpoint of the class interval, which we obtain by adding the upper and lower limits of a class and dividing the sum by 2. If we consider the class interval of 150-160, its class mark is (150 + 160) / 2, which is 155. Next I will highlight the class marks for all the other class intervals as well.
For a frequency polygon, all I have to do is connect all of these dots, that is, connect all of these class marks. Well, I said frequency polygon. But what is a polygon? A polygon is a multi-sided shape. But above all, it is a closed shape. So how do we get that?
We add a class interval before the first one in the data and do the same at the other end of the histogram as well. Since the first class interval is 130-140, we add another, 120-130. This class interval will of course have a frequency of zero, since it is not represented in the table. We just have to mark the class mark for this group, that is, (120 + 130) / 2, which is 125. And then we are done: we can now connect the line to the X axis. Doing the same at the other end, we add 200-210 to the frequency polygon graph. Marking its class mark as 205 and closing the figure at both ends gives us the frequency polygon. Instead of drawing the entire bar of a histogram, you just mark the frequency level on the Y axis at each class mark.
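The construction above can be sketched as a small function that returns the vertices of the frequency polygon. The name `frequency_polygon_points` is hypothetical. Only the frequency of 9 for 130-140 is quoted directly in the narration; the other frequencies below are placeholders chosen to agree with the totals mentioned (16 below 150, 8 above 170, 60 overall).

```python
def frequency_polygon_points(classes):
    """Return (class mark, frequency) vertices, closed on the X axis.

    `classes` is a list of (lower, upper, frequency) with equal widths.
    """
    width = classes[0][1] - classes[0][0]
    first_lower = classes[0][0]
    last_upper = classes[-1][1]
    # Extra zero-frequency classes before the first and after the last one.
    padded = ([(first_lower - width, first_lower, 0)]
              + list(classes)
              + [(last_upper, last_upper + width, 0)])
    # Class mark = (lower limit + upper limit) / 2.
    return [((lower + upper) / 2, freq) for lower, upper, freq in padded]


# Placeholder frequencies, except 130-140 which is 9 in the narration.
classes = [(130, 140, 9), (140, 150, 7), (150, 160, 20), (160, 170, 16),
           (170, 180, 4), (180, 190, 3), (190, 200, 1)]

points = frequency_polygon_points(classes)
for class_mark, freq in points:
    print(class_mark, freq)
```

The first and last vertices sit on the X axis at the class marks 125 and 205, which is what closes the shape.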
538 00:16:35,371 --> 00:16:36,826 Just like a histogram, 539 00:16:36,850 --> 00:16:39,342 a frequency polygon's total area 540 00:16:39,366 --> 00:16:41,043 is directly proportional 541 00:16:41,067 --> 00:16:42,834 to the total frequency 542 00:16:42,858 --> 00:16:44,115 of the table. 543 00:16:44,139 --> 00:16:45,468 For the sake of convenience, 544 00:16:45,492 --> 00:16:46,892 let's bring back the histogram 545 00:16:46,916 --> 00:16:47,539 that we drew 546 00:16:47,563 --> 00:16:48,816 in the previous sections. 547 00:16:48,840 --> 00:16:50,202 Let's take the graph 548 00:16:50,226 --> 00:16:51,611 where the frequency polygon 549 00:16:51,634 --> 00:16:53,808 is drawn over the histogram. 550 00:16:53,832 --> 00:16:56,197 So if I join the class marks 551 00:16:56,222 --> 00:16:56,981 you can see 552 00:16:57,005 --> 00:16:58,393 that chunks of area 553 00:16:58,417 --> 00:17:00,656 are being left out of the calculation. 554 00:17:00,870 --> 00:17:02,032 There are also a few 555 00:17:02,057 --> 00:17:03,150 empty areas 556 00:17:03,174 --> 00:17:05,328 inside the frequency polygon. 557 00:17:05,673 --> 00:17:06,584 To prove to you 558 00:17:06,608 --> 00:17:07,527 that the area of the 559 00:17:07,551 --> 00:17:08,664 frequency polygon 560 00:17:08,688 --> 00:17:10,054 and that of the histogram 561 00:17:10,079 --> 00:17:11,001 are the same, 562 00:17:11,025 --> 00:17:12,494 I will cut the part 563 00:17:12,517 --> 00:17:14,192 which is outside the line. 564 00:17:14,216 --> 00:17:16,253 Flip it over and see 565 00:17:16,277 --> 00:17:18,152 that it fits exactly 566 00:17:18,176 --> 00:17:20,311 into the empty area here. 567 00:17:20,336 --> 00:17:21,945 The same can be done 568 00:17:21,969 --> 00:17:23,484 for all the bars. 
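The cut-and-flip argument can also be checked numerically. The sketch below assumes equal class widths of 10 and uses hypothetical frequencies (the original table is not shown here); `histogram_area` and `polygon_area` are names invented for this illustration. The polygon area is computed with the trapezoid rule between consecutive class marks.

```python
def histogram_area(intervals, frequencies):
    """Sum of bar areas: class width times frequency."""
    return sum((hi - lo) * f for (lo, hi), f in zip(intervals, frequencies))


def polygon_area(points):
    """Area between the polygon and the X axis (trapezoid rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Assumed classes 130-140, ..., 190-200 with hypothetical frequencies.
intervals = [(lo, lo + 10) for lo in range(130, 200, 10)]
frequencies = [3, 5, 9, 12, 8, 4, 2]

# Class marks 125, 135, ..., 205; the two end points have frequency zero.
marks = [125 + 10 * i for i in range(len(frequencies) + 2)]
heights = [0] + frequencies + [0]
points = list(zip(marks, heights))

print(histogram_area(intervals, frequencies), polygon_area(points))
```

For equal-width classes the two numbers agree, because each triangle cut off a bar by the line is matched by an equal triangle gained on the neighbouring side.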
569 00:17:23,826 --> 00:17:25,933 So eventually, we see that 570 00:17:25,957 --> 00:17:27,350 all the triangles 571 00:17:27,375 --> 00:17:29,248 cut off by the line we drew 572 00:17:29,271 --> 00:17:30,669 are included within this 573 00:17:30,693 --> 00:17:32,808 frequency polygon. And hence, 574 00:17:32,832 --> 00:17:34,304 we can visually say 575 00:17:34,328 --> 00:17:35,472 that the total area 576 00:17:35,496 --> 00:17:36,802 of the frequency polygon 577 00:17:36,825 --> 00:17:37,648 is equal to the 578 00:17:37,673 --> 00:17:39,480 total area of the histogram 579 00:17:39,504 --> 00:17:41,133 made by the same data. 580 00:17:41,526 --> 00:17:43,038 Also the area 581 00:17:43,062 --> 00:17:45,496 is proportional to the frequency. 582 00:17:55,723 --> 00:17:56,927 So till now, 583 00:17:56,951 --> 00:17:57,875 we have learnt about 584 00:17:57,899 --> 00:17:59,411 raw data and how 585 00:17:59,435 --> 00:18:00,922 unless it has context, 586 00:18:00,946 --> 00:18:03,563 it is useless. Raw data 587 00:18:03,587 --> 00:18:05,168 can also be made useful 588 00:18:05,192 --> 00:18:06,430 by processing it. 589 00:18:06,454 --> 00:18:07,691 We process data 590 00:18:07,715 --> 00:18:09,278 by means of statistics, 591 00:18:09,302 --> 00:18:10,669 using methods such as 592 00:18:10,693 --> 00:18:13,209 creating a frequency distribution table. 593 00:18:13,233 --> 00:18:14,657 Frequency is the 594 00:18:14,680 --> 00:18:15,654 number of times 595 00:18:15,679 --> 00:18:16,892 a particular data value 596 00:18:16,916 --> 00:18:18,695 appears in a data set. 597 00:18:18,719 --> 00:18:20,278 When the heights 598 00:18:20,303 --> 00:18:21,806 are considered individually, 599 00:18:21,830 --> 00:18:24,318 we call it an ungrouped data set. 600 00:18:24,341 --> 00:18:25,653 It was too much data 601 00:18:25,678 --> 00:18:26,460 to deal with. 
602 00:18:26,484 --> 00:18:27,458 So we then 603 00:18:27,482 --> 00:18:29,151 clubbed the heights to create 604 00:18:29,174 --> 00:18:31,217 grouped data and a 605 00:18:31,241 --> 00:18:33,645 grouped frequency distribution table. 606 00:18:33,669 --> 00:18:35,143 We then decided 607 00:18:35,167 --> 00:18:37,067 that numbers are altogether 608 00:18:37,091 --> 00:18:38,005 too boring, 609 00:18:38,029 --> 00:18:39,041 and came up with 610 00:18:39,065 --> 00:18:42,210 graphical methods of representing data. 611 00:18:42,234 --> 00:18:44,360 These include bar graphs, 612 00:18:44,384 --> 00:18:47,291 histograms and frequency polygons. 613 00:18:47,685 --> 00:18:49,444 A bar graph is an excellent 614 00:18:49,468 --> 00:18:50,626 comparative tool 615 00:18:50,650 --> 00:18:52,036 and is used mostly 616 00:18:52,060 --> 00:18:53,914 in non-numerical contexts, 617 00:18:53,938 --> 00:18:56,101 such as comparing 2 items. 618 00:18:56,125 --> 00:18:57,857 The bars in a bar graph 619 00:18:57,880 --> 00:18:59,992 typically are of the same width 620 00:19:00,016 --> 00:19:01,407 but bear no relevance 621 00:19:01,432 --> 00:19:03,209 to the area that they occupy. 622 00:19:03,232 --> 00:19:04,738 In contrast to it, 623 00:19:04,763 --> 00:19:05,882 in a histogram 624 00:19:05,906 --> 00:19:07,542 the dimensions of the bars 625 00:19:07,566 --> 00:19:08,968 are very crucial. 626 00:19:09,196 --> 00:19:10,705 The area of the bar 627 00:19:10,729 --> 00:19:12,214 is directly proportional 628 00:19:12,238 --> 00:19:13,510 to its frequency. 629 00:19:13,534 --> 00:19:15,259 Consequently, 630 00:19:15,283 --> 00:19:16,920 the width of the class intervals 631 00:19:16,943 --> 00:19:18,560 must be taken into account 632 00:19:18,585 --> 00:19:20,067 whenever you're attempting 633 00:19:20,090 --> 00:19:22,169 to answer relevant questions. 
634 00:19:22,193 --> 00:19:23,288 Whenever the width of the 635 00:19:23,312 --> 00:19:25,113 class intervals is non-uniform, 636 00:19:25,137 --> 00:19:27,176 use the minimum class interval 637 00:19:27,200 --> 00:19:28,767 as a standard 638 00:19:28,791 --> 00:19:30,549 and use cross multiplication 639 00:19:30,573 --> 00:19:32,439 to get an accurate representation 640 00:19:32,463 --> 00:19:34,041 of data on the graph. 641 00:19:34,358 --> 00:19:36,632 When it comes to frequency polygons, 642 00:19:36,656 --> 00:19:37,753 the only thing that 643 00:19:37,777 --> 00:19:38,831 you need to do differently 644 00:19:38,855 --> 00:19:39,700 from a histogram 645 00:19:39,724 --> 00:19:41,004 is to mark the 646 00:19:41,028 --> 00:19:42,700 class mark on the graph. 647 00:19:43,024 --> 00:19:45,287 The class mark is the midpoint 648 00:19:45,312 --> 00:19:47,105 of each class interval. 649 00:19:47,129 --> 00:19:49,259 Instead of an entire bar, 650 00:19:49,282 --> 00:19:51,136 you only make one mark. 651 00:19:51,160 --> 00:19:52,272 Then you connect 652 00:19:52,296 --> 00:19:53,522 all of these dots 653 00:19:53,547 --> 00:19:55,007 and get a line. 654 00:19:55,031 --> 00:19:56,738 To make a frequency polygon 655 00:19:56,761 --> 00:19:57,369 out of this, 656 00:19:57,393 --> 00:19:57,998 you need to 657 00:19:58,023 --> 00:19:59,067 close the figure. 658 00:19:59,090 --> 00:20:01,039 To do this, add a class 659 00:20:01,063 --> 00:20:01,999 before the first 660 00:20:02,023 --> 00:20:03,781 and after the last classes 661 00:20:03,805 --> 00:20:04,896 with the same width 662 00:20:04,920 --> 00:20:06,361 as the width of the first 663 00:20:06,385 --> 00:20:08,570 and the last classes respectively.
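The rule for non-uniform class widths in the recap (take the minimum class width as the standard and scale by cross multiplication) might be sketched as follows. The function name `adjusted_heights` and the sample classes and frequencies are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def adjusted_heights(intervals, frequencies):
    """Scale each frequency to the minimum class width.

    height = frequency * (minimum width / class width), so that
    bar area (height * width) stays proportional to frequency.
    """
    widths = [hi - lo for lo, hi in intervals]
    min_width = min(widths)
    return [f * min_width / w for f, w in zip(frequencies, widths)]


# Hypothetical classes of widths 10, 20 and 30.
intervals = [(0, 10), (10, 30), (30, 60)]
frequencies = [4, 10, 9]

heights = adjusted_heights(intervals, frequencies)
print(heights)

# Each bar's area is now the same multiple (the minimum width) of its frequency.
areas = [h * (hi - lo) for h, (lo, hi) in zip(heights, intervals)]
print(areas)
```

Here the class of width 20 with frequency 10 is drawn at height 5, and the class of width 30 with frequency 9 at height 3, so every bar's area is 10 times its frequency.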
