004 Dataset Splitting

1
00:00:00,530 --> 00:00:03,290
In this video, we will explain dataset splitting.

2
00:00:03,590 --> 00:00:07,430
The dataset must be split into two parts: a training set and a test set.

3
00:00:09,760 --> 00:00:14,590
Because this data is used for training, the training set has the most data. Following the completion

4
00:00:14,590 --> 00:00:19,030
of the training process, the test set is used to evaluate the performance of the trained model.

5
00:00:19,030 --> 00:00:22,270
The data is split to prevent overfitting and to evaluate the model.

6
00:00:25,650 --> 00:00:30,810
Overfitting occurs when a model performs well on a training set, but performs poorly on data that the

7
00:00:30,810 --> 00:00:32,490
model has never seen before.

8
00:00:33,700 --> 00:00:42,160
Commonly, the size distribution of training and testing sets is 67% training and 33% testing, 75% training

9
00:00:42,160 --> 00:00:43,870
and 25% testing,

10
00:00:44,890 --> 00:00:47,380
or 90% training and 10% testing.

11
00:00:47,650 --> 00:00:51,380
The YOLO model has hyperparameters whose optimal values must be found.

12
00:00:51,400 --> 00:00:54,460
These hyperparameters must be tuned during the training process.

13
00:00:54,460 --> 00:00:59,650
Data is required to test the tuned hyperparameters. However, because the data used is not from the

14
00:00:59,650 --> 00:01:03,910
training set, an additional component, namely the validation set, is required.

15
00:01:04,810 --> 00:01:08,710
Typically, the validation set is 10% to 20% of the training set.

16
00:01:10,080 --> 00:01:13,620
We have provided a Python program for performing dataset splitting.

17
00:01:13,650 --> 00:01:17,280
The program will be explained in the section on training custom objects.

18
00:01:19,190 --> 00:01:19,910
See you then.
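The split described in the video can be sketched roughly as follows. This is not the course's own program, which is covered in a later section; it is a minimal illustration using only the Python standard library. The function name split_dataset, its parameters, the 75/25 split, the 10% validation share, and the image file names are all assumptions chosen to match the ratios mentioned above.

```python
import random


def split_dataset(items, test_fraction=0.25, val_fraction=0.1, seed=42):
    """Shuffle items and split them into train, validation, and test lists.

    Hypothetical helper: the test set is taken from the whole dataset,
    then the validation set is taken as a fraction of what remains
    (the training portion), as described in the transcript.
    """
    if not 0 <= test_fraction < 1 or not 0 <= val_fraction < 1:
        raise ValueError("fractions must be in the range [0, 1)")

    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    test = shuffled[:n_test]
    remaining = shuffled[n_test:]

    # The validation share is computed on the training portion only.
    n_val = int(len(remaining) * val_fraction)
    val = remaining[:n_val]
    train = remaining[n_val:]
    return train, val, test


if __name__ == "__main__":
    # Made-up file names standing in for a real image dataset.
    names = [f"image_{i:03d}.jpg" for i in range(100)]
    train, val, test = split_dataset(names)
    print(len(train), len(val), len(test))
```

Shuffling before slicing matters because datasets are often stored in a sorted order (for example by class or capture date), and an unshuffled split could leave the test set unrepresentative.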
