1
00:00:00,530 --> 00:00:03,290
In this video, we will explain dataset splitting.
2
00:00:03,590 --> 00:00:07,430
The dataset must be split into two parts: a training set and a test set.
3
00:00:09,760 --> 00:00:14,590
Because it is used for training, the training set has the most data. Following the completion
4
00:00:14,590 --> 00:00:19,030
of the training process, the test set is used to evaluate the performance of the trained model.
5
00:00:19,030 --> 00:00:22,270
The data is split to prevent overfitting and to evaluate the model.
6
00:00:25,650 --> 00:00:30,810
Overfitting occurs when a model performs well on a training set, but performs poorly on data that the
7
00:00:30,810 --> 00:00:32,490
model has never seen before.
8
00:00:33,700 --> 00:00:42,160
Commonly, the dataset is split into 67% training and 33% testing, 75% training
9
00:00:42,160 --> 00:00:43,870
and 25% testing,
10
00:00:44,890 --> 00:00:47,380
or 90% training and 10% testing.
11
00:00:47,650 --> 00:00:51,380
The YOLO model has hyperparameters whose optimal values must be found.
12
00:00:51,400 --> 00:00:54,460
These hyperparameters must be tuned during the training process.
13
00:00:54,460 --> 00:00:59,650
Data is required to test the tuned hyperparameters. However, because the data used is not from the
14
00:00:59,650 --> 00:01:03,910
training set, an additional component, namely the validation set, is required.
15
00:01:04,810 --> 00:01:08,710
Typically, the validation set is 10% to 20% of the training set.
16
00:01:10,080 --> 00:01:13,620
We have provided a Python program for performing dataset splitting.
17
00:01:13,650 --> 00:01:17,280
The program will be explained in the section on training custom objects.
18
00:01:19,190 --> 00:01:19,910
See you then.
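The splitting described in this video can be sketched in a few lines of Python. This is only a minimal illustration, not the program provided with the course. The function name split_dataset, the image file names, and the 25% test and 20% validation fractions are assumptions chosen to match the ratios mentioned above. The validation set is taken from the training portion, as described in the video.

```python
import random


def split_dataset(items, test_fraction=0.25, val_fraction=0.2, seed=42):
    """Shuffle items and split them into train, validation and test lists.

    test_fraction is taken from the whole dataset. val_fraction is taken
    from what remains after the test set is removed (the training portion).
    """
    if not 0 <= test_fraction < 1 or not 0 <= val_fraction < 1:
        raise ValueError("fractions must be in the range [0, 1)")

    # Copy before shuffling so the caller's list is left untouched.
    shuffled = list(items)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_fraction)
    test = shuffled[:n_test]
    remaining = shuffled[n_test:]

    n_val = round(len(remaining) * val_fraction)
    val = remaining[:n_val]
    train = remaining[n_val:]
    return train, val, test


# Hypothetical image file names, used only for illustration.
image_names = [f"image_{i:03d}.jpg" for i in range(100)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))  # 60 15 25
```

With 100 images, 25 go to the test set, 15 of the remaining 75 go to the validation set, and 60 are left for training. Shuffling before slicing keeps the three sets from following the original file order.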