subtitlecat.com

All language subtitles for 05 - The Machine Learning Workflow.en

[Autogenerated] What are the steps involved in building and training a machine learning model? Let's take a quick look, in this clip, at the machine learning workflow.

Here is what the basic machine learning workflow looks like. Don't be intimidated. There are lots of processes involved here. We'll walk through each of these in detail.

Once you've understood what you want to build, you first need to look at the raw data that you have available. What data do you have to work with? Is it sufficient to train your machine learning model? If not, you won't be able to proceed further. You might need to go back and look for new data sources.

Once you know that you have the data that you need, you can move on to load and store the data and get it ready for machine learning. Make sure that it's located in a database or a data warehouse where you can access it. You need to set up pipelines to extract the data from where you have it stored. And once you have the data with you, you need to clean and prepare the data.
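
The extraction step mentioned above could be sketched roughly as follows. This is only a minimal illustration, not something shown in the clip: the `houses` table, its columns, and the `extract_rows` helper are all hypothetical, and an in-memory SQLite database stands in for whatever database or data warehouse you actually use.

```python
import sqlite3


def extract_rows(conn, table):
    """Pull every row of `table` out of the database as a list of dicts."""
    conn.row_factory = sqlite3.Row
    # Table names cannot be bound as SQL parameters, so only pass trusted names.
    cursor = conn.execute(f"SELECT * FROM {table}")
    return [dict(row) for row in cursor]


# Hypothetical in-memory database standing in for a real data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE houses (area REAL, city TEXT, price REAL)")
conn.executemany(
    "INSERT INTO houses VALUES (?, ?, ?)",
    [(70.0, "Oslo", 300.0), (None, "Bergen", 250.0), (120.0, "Oslo", 520.0)],
)

rows = extract_rows(conn, "houses")
print(len(rows), "rows extracted; first row:", rows[0])
```

Note that the second row comes back with a missing `area`, which is exactly the kind of problem the cleaning step has to deal with.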
This is the data preprocessing stage. Data in the real world cannot be used directly to train your machine learning models. It needs to be preprocessed. You need to get rid of missing values and take care of outliers. If you have non-numeric representations of data, they have to be encoded into numeric form.

These three processing steps that we just discussed, where you go from raw data to clean data that you can feed into a machine learning model, can together be thought of as the process of selecting and extracting features that exist in your data. When you are a student of machine learning, your attention is mostly focused on understanding the machine learning algorithm and how it works. But these three steps are critical and time-consuming. In the real world, they take up an inordinate amount of time. It's quite possible that you spend more time on these steps than on building a machine learning model.

Once you have your data ready, the next step is to choose the right algorithm for your use case. Do you want a decision tree? Do you want to use support vector machines?
Do you want to use naive Bayes or k-nearest neighbors? The choice of algorithm is up to you and dependent on your use case.

Once you've chosen an algorithm, you'll then train your model on the data that you have. This is what is referred to as fitting a model. The training process tries to find the best possible model parameters so that you can use your model for prediction.

Once you have a model, you need to validate and evaluate the model to see whether it's a good one. There are many validation techniques available. You'll choose a validation method and apply it to evaluate your model. You'll examine the fit of your model and then update the model if needed. Examining the fit of your model is also referred to as scoring the model. You have different metrics that you can use for different kinds of models: you have R-squared for regression models, and accuracy, precision and recall for classifiers.

Once you've evaluated and scored your model, you'll check to see whether you're satisfied with the result. If you're not satisfied, you might want to update the model.
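
To make the scoring metrics named above concrete, here is a small sketch that computes them by hand. The sample labels and values are made up for illustration, and the function names are hypothetical; in practice you would more likely rely on a library's metric functions than write your own.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def r_squared(y_true, y_pred):
    """Coefficient of determination for a regression model."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up classifier output: true labels versus predicted labels.
labels = [1, 0, 1, 1, 0, 1]
predicted = [1, 0, 0, 1, 1, 1]
print("accuracy:", accuracy(labels, predicted))
print("precision, recall:", precision_recall(labels, predicted))

# Made-up regression output: actual values versus model estimates.
actual = [1.0, 2.0, 3.0, 4.0]
estimated = [1.5, 2.0, 3.0, 3.5]
print("R-squared:", r_squared(actual, estimated))
```

Here the classifier gets 4 of 6 labels right, with precision and recall both at 0.75, and the regression estimates score an R-squared of about 0.9.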
Maybe you'll choose a different algorithm. Maybe you'll use more data for training. Maybe you'll train for longer. This is an iterative process that continues till you're satisfied with the model that you have. Update the model, choose a validation method, examine the model, evaluate it, see whether you're satisfied, and repeat till you're done.

Once you're satisfied, your model is ready to be deployed in production and to be used for predictions. You'll use your model for predictions, and these prediction instances are new data points that you can then store in your database to improve your model. In the real world, prediction instances often become part of the training data for your model.

This is the basic machine learning workflow, and in this course we'll focus our attention on feature selection and extraction, the initial few steps.
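
As a rough picture of those initial steps, the cleaning described earlier in the clip (getting rid of missing values, taking care of outliers, and encoding non-numeric data into numeric form) might look something like the sketch below. The records, the clipping bounds, and the `clean_and_encode` helper are all assumptions made for illustration. Clipping and one-hot encoding are just one possible choice for each step, not the method the course prescribes.

```python
def clean_and_encode(rows, numeric_key, category_key, low, high):
    """Turn raw records into numeric feature vectors."""
    # 1. Get rid of missing values: keep only fully populated rows.
    complete = [r for r in rows if all(v is not None for v in r.values())]

    # 2. Collect the categories so each one gets its own one-hot column.
    categories = sorted({r[category_key] for r in complete})

    features = []
    for r in complete:
        # 3. Take care of outliers by clipping into the range [low, high].
        value = min(max(r[numeric_key], low), high)
        # 4. Encode the non-numeric field in numeric (one-hot) form.
        one_hot = [1.0 if r[category_key] == c else 0.0 for c in categories]
        features.append([value] + one_hot)
    return categories, features


# Made-up raw records with one missing value and one extreme outlier.
raw = [
    {"area": 70.0, "city": "Oslo"},
    {"area": None, "city": "Bergen"},
    {"area": 9000.0, "city": "Bergen"},
    {"area": 120.0, "city": "Oslo"},
]

categories, features = clean_and_encode(raw, "area", "city", 20.0, 500.0)
print(categories)
for row in features:
    print(row)
```

The row with the missing area is dropped, the 9000.0 outlier is clipped to 500.0, and each city becomes a pair of 0/1 columns that a model can consume.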