1
00:00:01,110 --> 00:00:04,770
Hello and welcome back to the course on artificial intelligence.
2
00:00:04,770 --> 00:00:09,420
Today we're going to discuss the plan of attack for this section, where we're talking about Q-learning.
3
00:00:09,450 --> 00:00:15,000
And we've got quite a few tutorials, so I think it is a good idea for us to quickly go through them to
4
00:00:15,000 --> 00:00:20,580
understand what to expect in the upcoming videos.
5
00:00:20,580 --> 00:00:21,650
So here we go.
6
00:00:22,140 --> 00:00:22,560
All right.
7
00:00:22,560 --> 00:00:25,230
What we will learn in this section.
8
00:00:25,230 --> 00:00:31,650
First things first, we will talk about what reinforcement learning actually is and what the philosophy
9
00:00:31,690 --> 00:00:37,890
behind reinforcement learning is, and how reinforcement learning can actually be seen in real life and
10
00:00:37,890 --> 00:00:44,540
how it relates to things that we observe in real life or actually things that we do ourselves.
11
00:00:44,790 --> 00:00:51,630
Then we'll talk about the Bellman equation, a very fundamental concept underpinning everything or a lot
12
00:00:51,630 --> 00:00:56,580
of things that are happening in reinforcement learning, especially in the space of Q-learning
13
00:00:56,940 --> 00:01:01,700
and what we're going to be discussing in this section of the course and in the following sections.
14
00:01:01,800 --> 00:01:09,280
Then we'll talk about the plan, the plan that an artificial intelligence comes up with
15
00:01:09,300 --> 00:01:15,990
in order to navigate inside environments. We'll see how that comes together. Very quick but
16
00:01:15,990 --> 00:01:17,270
quite interesting.
17
00:01:17,720 --> 00:01:22,890
Then we'll talk about Markov decision processes, a new concept. We're going to introduce a very
18
00:01:22,890 --> 00:01:31,620
new concept which will slowly add a layer of sophistication to our Bellman equation, to our whole
19
00:01:31,800 --> 00:01:37,070
reinforcement learning, to our Q-learning concepts. And that's the way this section is structured:
20
00:01:37,290 --> 00:01:43,080
we introduce the Bellman equation in a very simplistic form and then slowly, throughout the tutorials, we
21
00:01:43,260 --> 00:01:48,550
add layers of sophistication to it in order to get to the final version.
22
00:01:48,690 --> 00:01:53,880
That is our designated destination in terms of Q-learning, but we'll get there slowly.
23
00:01:54,000 --> 00:01:58,830
In order for us to have enough time to process all that information and let it settle in.
24
00:01:58,890 --> 00:02:05,400
And Markov decision processes are an extra layer of sophistication on top of what we've discussed, or what
25
00:02:05,400 --> 00:02:11,220
we will have already discussed by then. Then we'll talk about policies versus plans.
26
00:02:11,220 --> 00:02:13,830
Another interesting tutorial. They're all interesting.
27
00:02:13,830 --> 00:02:19,590
Just another quick tutorial on how policies are different from plans and what the differences are,
28
00:02:19,590 --> 00:02:25,980
and these are terms that you will probably hear or read in the literature if you're going to be delving
29
00:02:25,980 --> 00:02:29,980
into it to get additional information on reinforcement learning.
30
00:02:29,980 --> 00:02:34,590
Then we'll talk about adding a living penalty to our environments.
31
00:02:34,770 --> 00:02:41,850
And that's kind of another way of adding complexity into the environments that our agents are
32
00:02:41,850 --> 00:02:43,340
going to be operating in.
33
00:02:43,370 --> 00:02:48,780
Then we'll talk about the intuition behind Q-learning. So up until that tutorial we're going to be
34
00:02:48,780 --> 00:02:50,690
talking about values of states.
35
00:02:50,790 --> 00:02:57,300
And then finally we're going to switch to talking about values of actions, or Q-values, and then we're
36
00:02:57,300 --> 00:02:59,880
going to introduce the temporal difference.
37
00:02:59,910 --> 00:03:06,690
This is a tutorial where everything that we've learned is going to come together to explain how exactly
38
00:03:06,690 --> 00:03:13,930
do agents, or does artificial intelligence, learn? How does it update its values through all
39
00:03:14,090 --> 00:03:16,420
the iterative process that it is going through?
40
00:03:16,830 --> 00:03:23,100
And then finally we're going to look at a visualization of learning, so we're going to take everything
41
00:03:23,100 --> 00:03:29,550
we learn and we're going to look at it happen in front of our eyes and watch an artificial intelligence
42
00:03:29,730 --> 00:03:35,870
actually perform Q-learning and do all the things that we're going to discuss on an intuitive level.
43
00:03:35,880 --> 00:03:42,600
It is going to actually do them in practice, and that will help us even further grasp the knowledge that we're
44
00:03:42,810 --> 00:03:44,530
going to be covering in the section.
45
00:03:44,550 --> 00:03:47,460
So hopefully you're very excited about these upcoming tutorials.
46
00:03:47,460 --> 00:03:48,800
I definitely am.
47
00:03:48,810 --> 00:03:55,380
And there are some very interesting slides coming up and, more importantly, the concepts themselves are very
48
00:03:55,380 --> 00:03:59,540
very interesting and I'm sure you're going to enjoy them quite a lot.
49
00:03:59,760 --> 00:04:01,410
And I look forward to seeing you next time.
50
00:04:01,410 --> 00:04:03,080
Until then, enjoy AI.