1
00:00:05,890 --> 00:00:07,360
Welcome back everyone.
2
00:00:07,360 --> 00:00:13,150
In this lecture we're going to begin learning about the very basics of the syntax for the Keras API for
3
00:00:13,150 --> 00:00:14,010
TensorFlow.
4
00:00:14,020 --> 00:00:16,130
Let's head over to a notebook and get started.
5
00:00:16,600 --> 00:00:16,860
OK.
6
00:00:16,870 --> 00:00:18,250
Here I am at the notebook.
7
00:00:18,310 --> 00:00:22,270
I'm going to begin with a couple of imports. We'll import pandas as
8
00:00:22,320 --> 00:00:26,380
pd. We'll also import numpy as np.
9
00:00:26,380 --> 00:00:31,960
And since we're going to be doing just a little bit of visualization, I will import seaborn as
10
00:00:31,990 --> 00:00:33,030
sns.
11
00:00:33,130 --> 00:00:39,580
Now, to actually focus on the syntax of Keras, for this particular lecture we're going to be using a very
12
00:00:39,580 --> 00:00:45,940
simple and actually fake dataset, and in the next lecture we'll be using a realistic dataset and we'll
13
00:00:45,940 --> 00:00:48,190
focus a lot more on feature engineering.
14
00:00:48,250 --> 00:00:55,630
Let's go ahead and read in this file. We'll use pd.read_csv, and then underneath our data folder
15
00:00:55,770 --> 00:01:02,830
there is a file called fake_reg.csv (reg for regression), and then we'll check out the
16
00:01:02,830 --> 00:01:06,640
head of this DataFrame. So as a DataFrame it's very, very simple.
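A minimal sketch of this step. The lecture's fake_reg.csv is not available here, so the block builds a stand-in DataFrame instead; the column names (price, feature1, feature2), the 1000-row size, and the made-up linear relationship are all assumptions for illustration.

```python
import numpy as np
import pandas as pd

# With the real file you would call pd.read_csv on its path, for example:
# df = pd.read_csv("fake_reg.csv")  # the exact folder path is not shown here

# Hypothetical stand-in data: column names and coefficients are assumptions.
rng = np.random.default_rng(42)
n_rows = 1000
feature1 = rng.normal(loc=1000.0, scale=1.0, size=n_rows)
feature2 = rng.normal(loc=1000.0, scale=1.0, size=n_rows)
noise = rng.normal(loc=0.0, scale=0.5, size=n_rows)
price = 2.0 * feature1 + 3.0 * feature2 + noise  # made-up relationship

df = pd.DataFrame({"price": price, "feature1": feature1, "feature2": feature2})
print(df.head())  # first five rows, like checking the head in the notebook
print(df.shape)   # (1000, 3)
```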
17
00:01:06,670 --> 00:01:10,710
It simply has a price and then two corresponding features.
18
00:01:10,750 --> 00:01:17,470
So we're going to treat this as a regression problem where, based off feature 1 and feature 2, we'll attempt
19
00:01:17,470 --> 00:01:23,080
to predict the price. So we can imagine that maybe these are measurements of some rare gemstones, where
20
00:01:23,320 --> 00:01:28,740
the gemstone has feature 1 and feature 2, and we're trying to predict the price.
21
00:01:28,750 --> 00:01:33,400
So here we have historical information which means this is a supervised learning problem.
22
00:01:34,960 --> 00:01:40,690
Our main goal is to build a model so that when we pick a new gemstone from the ground we can measure its
23
00:01:40,690 --> 00:01:46,810
features, feature 1 and feature 2, and predict what price we should be selling it for at the market, due
24
00:01:46,810 --> 00:01:51,510
to the fact that we have historical information on the price sold based off these two features.
25
00:01:51,520 --> 00:01:55,560
So, a very simple dataset. I want to quickly show you how we could explore this dataset.
26
00:01:55,810 --> 00:02:04,880
We'll create a pair plot of the DataFrame, run that, and then we'll be able to see the features versus
27
00:02:04,880 --> 00:02:11,030
the price, and you'll notice that feature 2 especially seems to have a very high correlation with
28
00:02:11,030 --> 00:02:12,320
the actual price.
29
00:02:12,320 --> 00:02:15,810
So this is kind of another indicator that this is fake data.
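The pair plot in the lecture is drawn with seaborn. As a rough numeric companion, the correlation of each feature with the price can be checked with pandas alone. The data below is a hypothetical stand-in in which feature2 is deliberately given the larger weight; the column names and coefficients are assumptions, not the lecture's file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; feature2 is made to dominate the price.
rng = np.random.default_rng(0)
feature1 = rng.normal(size=500)
feature2 = rng.normal(size=500)
price = 1.0 * feature1 + 5.0 * feature2 + rng.normal(scale=0.1, size=500)
df = pd.DataFrame({"price": price, "feature1": feature1, "feature2": feature2})

# Pearson correlation of every feature column with the price column.
corr_with_price = df.corr()["price"].drop("price")
print(corr_with_price)

# In a notebook with seaborn installed, the visual equivalent would be:
# import seaborn as sns
# sns.pairplot(df)
```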
30
00:02:15,830 --> 00:02:22,160
So in a realistic dataset which we're going to do in the very next lecture we would take a lot of time
31
00:02:22,310 --> 00:02:28,040
exploring this data doing what's known as exploratory data analysis performing a lot of visualizations
32
00:02:28,370 --> 00:02:34,160
as well as possibly performing some feature engineering trying to extract other features from features
33
00:02:34,160 --> 00:02:35,690
that we can't quite use.
34
00:02:35,690 --> 00:02:42,150
However, let's focus really on the main workflow for using Keras and TensorFlow for deep learning.
35
00:02:42,320 --> 00:02:48,560
So step number one is to read in your data, and then once you have done feature engineering, or once
36
00:02:48,560 --> 00:02:56,230
you've explored your data, the next step is to create a train/test split, and we can do this from
37
00:02:56,270 --> 00:03:05,670
scikit-learn's model_selection, which has a train_test_split function that is really easy to use.
38
00:03:05,740 --> 00:03:11,500
So what we're going to do is use this train_test_split function to split our data into a training
39
00:03:11,500 --> 00:03:12,610
set and a test set.
40
00:03:12,640 --> 00:03:17,400
So we'll train on the training set and then evaluate our model's performance on the test set.
41
00:03:17,410 --> 00:03:21,930
So first what we want to do is we want to grab the features that we're going to use.
42
00:03:21,970 --> 00:03:31,120
So in this case that will be feature 1 and feature 2, and because of the way TensorFlow works we have
43
00:03:31,120 --> 00:03:37,720
to actually pass in NumPy arrays instead of pandas DataFrames or pandas Series, so I can simply add
44
00:03:38,610 --> 00:03:46,200
.values to the end of a Series or DataFrame and it'll return it back as a NumPy array.
45
00:03:46,240 --> 00:03:51,490
So what we're going to do is grab the features and set that as X, and by convention we use capital
46
00:03:51,490 --> 00:03:58,340
X because the feature matrix is typically two-dimensional, and the capital indicates that. And then the
47
00:03:58,340 --> 00:04:06,200
label that we're going to predict, our y, is the price column, and same thing here, we'll grab .values, and
48
00:04:06,200 --> 00:04:12,710
again by convention, since the price is essentially a one-dimensional vector, we have lowercase y for
49
00:04:12,710 --> 00:04:13,670
that.
50
00:04:13,760 --> 00:04:18,470
So that's why we have uppercase X and lowercase y. It essentially stems from the way you would write
51
00:04:18,470 --> 00:04:20,030
this down mathematically on paper.
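A small sketch of grabbing X and y as NumPy arrays with .values. The three-row DataFrame and its column names (price, feature1, feature2) are assumptions standing in for the lecture's data.

```python
import numpy as np
import pandas as pd

# Tiny hypothetical DataFrame with the same three-column layout.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "feature1": [1.0, 2.0, 3.0],
    "feature2": [4.0, 5.0, 6.0],
})

# Capital X: two-dimensional feature matrix (rows x features).
X = df[["feature1", "feature2"]].values
# Lowercase y: one-dimensional label vector.
y = df["price"].values

print(isinstance(X, np.ndarray), X.shape)  # True (3, 2)
print(isinstance(y, np.ndarray), y.shape)  # True (3,)
```

Newer pandas code often uses df.to_numpy() for the same purpose; .values is what the lecture uses.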
52
00:04:20,720 --> 00:04:23,470
So now we have our actual NumPy arrays.
53
00:04:23,480 --> 00:04:27,980
So if we take a look at X, it's just a NumPy array of the same information we had in that DataFrame
54
00:04:27,980 --> 00:04:29,380
for feature 1 and feature 2.
55
00:04:29,390 --> 00:04:33,960
It's just NumPy instead of pandas. It's time for our train/test split.
56
00:04:33,980 --> 00:04:38,240
So the way I like to do this is to simply call train_test_split.
57
00:04:38,450 --> 00:04:44,240
And after you've imported it you should be able to press Shift+Tab to see the documentation string. Expand
58
00:04:44,240 --> 00:04:50,270
on it, go ahead and scroll down, and eventually at the bottom you'll see something called Examples, and
59
00:04:50,300 --> 00:04:51,750
to save myself a little bit of time,
60
00:04:51,800 --> 00:04:57,230
I like to just copy and paste this line from the example which is essentially showing you how you would
61
00:04:57,230 --> 00:04:58,830
actually use this.
62
00:04:58,880 --> 00:05:04,400
So I'm going to paste that in and then put this all on one line and essentially explain what's going
63
00:05:04,400 --> 00:05:05,360
on here.
64
00:05:05,360 --> 00:05:12,020
Recall that when we do a train/test split, we split both our features into X_train and X_test, as well
65
00:05:12,020 --> 00:05:14,670
as our labels into y_train and y_test.
66
00:05:14,840 --> 00:05:19,220
Make sure you review the machine learning section of the course in case you have any questions on what
67
00:05:19,220 --> 00:05:23,200
these four parameters or variables actually represent.
68
00:05:24,510 --> 00:05:29,550
Then for train_test_split you pass in all your features as X, your labels as y.
69
00:05:29,550 --> 00:05:32,760
And then you choose a percentage as your test size.
70
00:05:32,760 --> 00:05:36,570
So typically you use maybe around 30 percent of your data.
71
00:05:36,570 --> 00:05:42,660
So if I set it to 0.3, then 30 percent of my total data will be used for the test
72
00:05:42,660 --> 00:05:47,790
set, and you can always make this smaller if you have really large datasets. And then there's the random
73
00:05:47,790 --> 00:05:48,690
state.
74
00:05:48,690 --> 00:05:54,960
So train_test_split is going to perform this split randomly, so it's going to grab random rows and
75
00:05:54,960 --> 00:05:57,770
then split them into the training set and the test set.
76
00:05:57,870 --> 00:06:04,830
If you want the results of the split to be the same every time, then you would set the random
77
00:06:04,830 --> 00:06:06,840
state to a specific number.
78
00:06:06,840 --> 00:06:09,870
The number itself is just an arbitrary choice.
79
00:06:09,940 --> 00:06:12,120
You just have to make sure to choose the same one each time.
80
00:06:12,120 --> 00:06:17,070
So go ahead and choose random_state equal to 42, and that way you will get the same random
81
00:06:17,070 --> 00:06:18,600
split that I do.
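A short sketch of what a fixed random_state buys you: calling train_test_split twice with the same seed on the same data gives identical splits. The ten-row arrays are hypothetical stand-ins for the lecture's X and y.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Small hypothetical arrays standing in for the lecture's X and y.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Two independent calls with the same seed.
split_a = train_test_split(X, y, test_size=0.3, random_state=42)
split_b = train_test_split(X, y, test_size=0.3, random_state=42)

# Compare X_train, X_test, y_train, y_test pairwise across the two calls.
same = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print(same)  # True
```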
82
00:06:18,750 --> 00:06:20,810
So we'll go ahead and run this.
83
00:06:21,270 --> 00:06:26,760
And now we've split up our data and we can actually check this by checking the shape of this.
84
00:06:26,760 --> 00:06:37,620
So notice X_train.shape is now 700 by two features, and X_test.shape is 300
85
00:06:37,620 --> 00:06:38,570
by two features.
86
00:06:38,580 --> 00:06:43,670
So here's 70 percent of our data as the train set and 30 percent as the test set.
87
00:06:43,710 --> 00:06:47,900
Since our total size of the original data was 1000 rows.
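The split and the shape check can be sketched as follows. The arrays are hypothetical placeholders with 1000 rows and two feature columns, matching the sizes mentioned in the lecture; the values themselves are not the lecture's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 1000 rows, two features, one label per row.
X = np.arange(2000, dtype=float).reshape(1000, 2)
y = np.arange(1000, dtype=float)

# 30 percent of the rows go to the test set; the seed makes it repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(X_train.shape)  # (700, 2)
print(X_test.shape)   # (300, 2)
print(y_train.shape, y_test.shape)  # (700,) (300,)
```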
88
00:06:48,240 --> 00:06:55,740
OK, now typically the next step is to actually normalize or scale your data, because we're working with
89
00:06:55,920 --> 00:06:59,270
weights and biases inside of a neural network.
90
00:06:59,340 --> 00:07:05,550
If we have really large values in our feature set that could cause errors with the weights and later
91
00:07:05,550 --> 00:07:09,340
on we'll talk about vanishing and exploding gradients that could be an issue.
92
00:07:09,480 --> 00:07:16,320
But one way to try to avoid any issues when training your network is to normalize and scale your feature
93
00:07:16,320 --> 00:07:24,870
data. So scikit-learn actually allows us to do this quite simply by saying from sklearn.preprocessing
94
00:07:24,870 --> 00:07:30,990
import, and there's actually lots of different ways you can normalize or scale your data. One
95
00:07:30,990 --> 00:07:35,060
simple way is to use what's known as min-max scaling.
96
00:07:35,370 --> 00:07:41,700
So we'll go ahead and import MinMaxScaler, and if you call help on MinMaxScaler it will actually
97
00:07:41,700 --> 00:07:44,120
describe what this is doing.
98
00:07:44,130 --> 00:07:49,650
So essentially it's going to transform your data based off the min
99
00:07:49,650 --> 00:07:51,570
and the max values of each feature.
100
00:07:51,570 --> 00:07:55,320
So you can see here the actual formulation that it's running for us.
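As a sketch of that formula: with the default feature range of 0 to 1, min-max scaling maps each feature column x to (x - min) / (max - min), using only that column's minimum and maximum. The block below applies this by hand to a small hypothetical array and compares the result with MinMaxScaler.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Small hypothetical feature matrix: three rows, two feature columns.
X = np.array([[1.0, 10.0],
              [2.0, 30.0],
              [4.0, 20.0]])

# Manual min-max scaling, computed column by column (axis=0).
col_min = X.min(axis=0)
col_max = X.max(axis=0)
X_manual = (X - col_min) / (col_max - col_min)

# The same transformation through scikit-learn.
X_sklearn = MinMaxScaler().fit_transform(X)

print(np.allclose(X_manual, X_sklearn))  # True
```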
101
00:07:55,320 --> 00:08:01,230
So all we're gonna do is show you how you can scale your data which is very typical in your workflow
102
00:08:01,500 --> 00:08:03,030
for dealing with neural networks.
103
00:08:03,030 --> 00:08:07,110
Now you don't have to actually scale the label and if you take a look at our provided notebook we have
104
00:08:07,110 --> 00:08:10,260
a link explaining why we don't need to scale the label.
105
00:08:10,260 --> 00:08:14,970
We really only need to scale the features since that's essentially what's being passed through the actual
106
00:08:15,000 --> 00:08:15,930
network.
107
00:08:15,930 --> 00:08:23,900
The final label is just a comparison done at the end. So to use a scaler with scikit-learn, all we do
108
00:08:24,140 --> 00:08:25,640
is we first create an instance of it.
109
00:08:26,930 --> 00:08:34,310
So we choose some variable name, typically scaler, and then we create an instance of our MinMaxScaler,
110
00:08:34,700 --> 00:08:36,240
open and close parentheses.
111
00:08:36,260 --> 00:08:44,090
So now we have this instance of the scaler, and what I need to do is actually fit the scaler
112
00:08:44,450 --> 00:08:52,280
onto my training data, so I will say fit on X_train, and what it does is it simply calculates the parameters
113
00:08:52,280 --> 00:08:55,990
it needs to perform the actual scaling later on.
114
00:08:56,000 --> 00:09:02,030
So if we recall from calling help on MinMaxScaler, the MinMaxScaler is dependent on the
115
00:09:02,030 --> 00:09:08,390
minimum value and the maximum value within that particular dataset.
116
00:09:08,420 --> 00:09:11,650
So what it does is it essentially calculates
117
00:09:11,660 --> 00:09:13,300
the min and max.
118
00:09:13,370 --> 00:09:18,350
So that's what it does when we run fit on our training set.
119
00:09:18,410 --> 00:09:23,600
And the reason we only run it on the training set is because we want to prevent what's known as data
120
00:09:23,600 --> 00:09:25,670
leakage from the test set.
121
00:09:25,730 --> 00:09:30,050
You don't want to assume that we have prior information of the test set.
122
00:09:30,050 --> 00:09:36,250
So we only fit our scaler to the training set to not try to cheat and look into the test set.
123
00:09:37,410 --> 00:09:48,830
Then what we need to do is transform our training data, so we'll say X_train is now equal to
124
00:09:48,830 --> 00:09:52,820
scaler.transform on X_train. That actually performs the transformation.
125
00:09:52,910 --> 00:09:57,710
So there's essentially two steps here: we fit, which calculates what's needed for the transformation
126
00:09:57,710 --> 00:10:03,220
to occur and then we actually perform the transformation and we'll do the same for the test set.
127
00:10:04,910 --> 00:10:10,110
So we'll say scaler.transform on X_test.
128
00:10:10,340 --> 00:10:15,470
And now if we take a look at these values for X_train, you'll notice they have been scaled.
129
00:10:15,470 --> 00:10:24,290
So if we take a look at what the max value on X_train is, it's now 1, and then the minimum value is now
130
00:10:24,290 --> 00:10:24,670
0.
131
00:10:24,680 --> 00:10:28,160
So everything's been scaled to now be between 0 and 1.
132
00:10:28,160 --> 00:10:34,160
And again we're only fitting on the train set to not ascertain information from the test set because
133
00:10:34,160 --> 00:10:36,100
that's essentially cheating.
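A minimal sketch of the fit-then-transform pattern described here, using small hypothetical arrays in place of the lecture's X_train and X_test. The scaler learns its minimum and maximum from the training rows only and then applies the same mapping to both sets.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-ins for the lecture's training and test features.
X_train = np.array([[1.0, 100.0],
                    [2.0, 200.0],
                    [3.0, 300.0]])
X_test = np.array([[1.5, 150.0],
                   [4.0, 50.0]])

scaler = MinMaxScaler()
scaler.fit(X_train)  # learn per-feature min and max from training data only

X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)  # reuses the training min and max

print(X_train_scaled.min(), X_train_scaled.max())  # training data spans 0 to 1
print(X_test_scaled)  # test values are mapped with the training parameters
```

One consequence of fitting on the training set only is that scaled test values are not guaranteed to stay between 0 and 1: a test row outside the training range, like the second row in this example, lands outside that interval.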
134
00:10:36,130 --> 00:10:37,590
Okay.
135
00:10:37,810 --> 00:10:43,330
So now that we've scaled the data it's time to actually show you how to create your neural network.
136
00:10:43,330 --> 00:10:47,410
So in part 2 of this lecture we'll begin creating our neural network.
137
00:10:47,410 --> 00:10:47,980
I'll see you there.