All language subtitles for 03_important-ethical-considerations-for-data-professionals.en

One of the most important responsibilities for those of us in data-centered careers involves how we protect our organizations and how we manage and protect data. This has a lot to do with communication exchanges between a company and its customers. As you've been learning, almost all communication generates data, whether it's a shopping receipt, confirmation of an order, or even earning customer loyalty points.

Businesses have a big responsibility to their customers, especially when it comes to maintaining and protecting user privacy. Any data gathered from individuals or consumers is referred to as personally identifiable information, or PII. PII permits the identity of an individual to be inferred by either direct or indirect means. This includes things like biometric records, usernames, and Social Security or national identification numbers. Because this information is often associated with medical, financial, and employment records, PII is sensitive and must be managed with great care. After all, when someone's personal data is improperly handled, they become vulnerable to identity theft, fraud, and other issues.

Recently, there have been great efforts to take a wider view of data collection practices and protect individuals. Industries are trending towards aggregate information. This is data from a significant number of users from which personal information has been eliminated. Aggregating the data and removing PII protects people and gives them more control over their own data. Similarly, as more industries become interconnected, the amount of data available to them increases. Just as with aggregate information, the more data collected, the more likely it is that it will be representative of a wider population rather than a single user.
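To make the idea of aggregate information concrete, here is a minimal sketch in Python. It assumes a small list of purchase records; the field names (`username`, `national_id`, `region`, `order_total`), the sample values, and the `min_group_size` threshold are all hypothetical and chosen only for illustration. The summary keeps per-region totals, never copies the PII fields, and leaves out any group too small to count as "a significant number of users."

```python
from collections import defaultdict

# Hypothetical raw records: every field name and value here is made up.
raw_records = [
    {"username": "user_a", "national_id": "000-00-0001", "region": "North", "order_total": 40.0},
    {"username": "user_b", "national_id": "000-00-0002", "region": "North", "order_total": 60.0},
    {"username": "user_c", "national_id": "000-00-0003", "region": "South", "order_total": 25.0},
]


def aggregate_by_region(records, min_group_size=2):
    """Summarize orders per region without carrying over any PII fields.

    Groups with fewer than `min_group_size` records are dropped, since a
    very small group could still point back to one person.
    """
    buckets = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for record in records:
        # Only the non-identifying fields (region, order_total) are read.
        bucket = buckets[record["region"]]
        bucket["orders"] += 1
        bucket["revenue"] += record["order_total"]

    return {
        region: {
            "orders": bucket["orders"],
            "avg_order_total": bucket["revenue"] / bucket["orders"],
        }
        for region, bucket in buckets.items()
        if bucket["orders"] >= min_group_size
    }


summary = aggregate_by_region(raw_records)
print(summary)
```

With these made-up records, only the North region appears in the summary, because the South group contains a single record. The right threshold for a real dataset is a judgment call that depends on the data and on any applicable privacy rules.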
A key thing to keep in mind is that data gathering is a task managed by humans, and that process can be informed by different backgrounds, experiences, beliefs, and worldviews. These and other types of biases can affect the way that data is communicated and how the results are shared, which in turn can have an impact on business decisions. Effective data professionals know that, whether collecting, analyzing, interpreting, or communicating sensitive data, bias should always be considered. So be very careful when interpreting data where there is a clear source of bias, and be on the lookout for subtle biases as well.

In addition to thinking through bias in the data, data professionals should also try to emphasize that there can be a multitude of possible interpretations for every data insight. The main trick is to avoid jumping to conclusions until you've really done your homework.

One method of addressing bias is to make sure that the data that you're working with has the same characteristics as the greater population that you're interested in. In data analytics, this is called a sample. A good sample is a segment of a population that is representative of that entire population.

Here's an example. A clothing company is analyzing sales in their highest growth market. They want to determine what color shirts will be most popular in the upcoming season. One person notes that red and blue shirts accounted for 80 percent of their sales in this market over the past three months. This is a big number, so they suggest ordering lots of red and blue shirts. But another person points out that the local sports team's colors are red and blue, and this team had recently won a championship. It's very likely that sales of red and blue shirts will have spiked as consumers purchased tees to support the local team. Plus, they note that although this market is high-growth, it only represents 40 percent of the retailer's total sales. With all this information in mind, decision-makers at this retailer instead choose to evaluate color popularity over a full year and across all markets. This will provide a much more complete picture.

We'll investigate more about bias later in this program, and as you progress, you'll discover many more strategies for ensuring that you're aware of bias and proactively working to counter it in all of your data work.
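As a rough illustration of why the shirt example calls for a wider sample, the arithmetic can be sketched in Python. The 80 percent, 40 percent, and three-month figures come from the example above; the function and variable names are hypothetical, and the bounds assume nothing at all is known about color preferences in the other markets.

```python
def known_share_of_total(share_in_market_pct, market_share_of_total_pct):
    """Percent of company-wide sales accounted for by a within-market share."""
    return share_in_market_pct * market_share_of_total_pct / 100


red_blue_in_market_pct = 80   # red and blue share of sales in the high-growth market
market_of_total_pct = 40      # that market's share of the retailer's total sales

# Red and blue sales actually observed, as a percent of all sales.
observed_pct = known_share_of_total(red_blue_in_market_pct, market_of_total_pct)

# Sales in all other markets were not examined at all.
unobserved_pct = 100 - market_of_total_pct

# If no red or blue shirts sold elsewhere, the company-wide share equals the
# observed share; if only red and blue sold elsewhere, it reaches the upper bound.
lower_bound_pct = observed_pct
upper_bound_pct = observed_pct + unobserved_pct

# The sample also covers only three months out of a full year.
months_covered_pct = 3 / 12 * 100

print(f"Observed red/blue share of total sales: {observed_pct}%")
print(f"Possible company-wide range: {lower_bound_pct}% to {upper_bound_pct}%")
print(f"Portion of the year covered: {months_covered_pct}%")
```

The wide gap between the two bounds, together with the short time window, shows how little a single market over three months says about the whole customer base, even before the championship effect is taken into account.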
