WEBVTT
00:19.369 --> 00:26.369
This is a lecture on indirect adaptive control
of a robot manipulator. This is the fourth
00:27.079 --> 00:34.079
lecture in this module on neural control.
In the previous two classes, we covered the
00:37.050 --> 00:44.050
background that is necessary to understand
this lecture. First we talked about network
00:44.579 --> 00:51.579
inversion. Given a network that has learnt
a dynamic model, can I use that network to
00:56.260 --> 01:03.260
compute the input for a given target output?
Next, we talked about the problem of obtaining
01:10.050 --> 01:17.050
a model of a robot manipulator. The reason
is that a robot manipulator is an open-loop unstable
01:17.850 --> 01:24.850
system; data generation can only be done using
a PD controller.
01:29.090 --> 01:36.090
Given that background, these are the topics
we will be covering: system identification
01:37.819 --> 01:44.819
using a feed-forward network, which we have
already discussed; we will briefly review neural
01:46.039 --> 01:52.679
network training, data insufficiency, and
query-based learning,
01:52.679 --> 01:59.289
these are all from previous classes. Today,
we will be talking about two different indirect
01:59.289 --> 02:05.619
adaptive control schemes. One is based on
forward-inverse modeling and the second on the network
02:05.619 --> 02:12.619
inversion approach; finally, we will show
how these methods can be validated through
02:18.379 --> 02:20.959
simulation.
02:20.959 --> 02:27.959
The robot manipulator: we have been discussing
this model for the last two classes. This is the
02:31.700 --> 02:38.700
vector equation of motion of an N-link rigid
manipulator; this is the state-space model.
02:39.680 --> 02:46.680
We have 2N states, comprising the N-dimensional
angular position and angular velocity vectors. The control
02:50.870 --> 02:57.870
vector is the actuated joint torque, which
is N-dimensional, and it is desired that the robot
02:58.219 --> 03:02.019
manipulator should follow the desired trajectory.
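The equation of motion and state-space model referred to here are on the slides and not quoted in the transcript; the standard N-link rigid-manipulator form, in my own notation, is:

```latex
% Vector equation of motion of an N-link rigid manipulator
M(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + G(q) = \tau ,
\qquad q,\;\dot{q},\;\tau \in \mathbb{R}^{N}

% State-space model with 2N states: angular positions and velocities
x = \begin{bmatrix} q \\ \dot{q} \end{bmatrix} \in \mathbb{R}^{2N},
\qquad
\dot{x} = \begin{bmatrix}
  \dot{q} \\
  M(q)^{-1}\bigl(\tau - C(q,\dot{q})\,\dot{q} - G(q)\bigr)
\end{bmatrix}
```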
03:02.019 --> 03:08.260
We have already talked about a generic
system of this particular form, where this
03:08.260 --> 03:15.260
is our output state. You can easily see that for the
robot manipulator, n is 2N, and for u the dimension
03:25.840 --> 03:32.840
is p. This is an n-state system, lowercase n; for the
robot manipulator it will be 2 times capital
03:36.060 --> 03:43.060
N.
We collect data online from the robot manipulator
by moving the manipulator links
03:58.700 --> 04:05.700
around the desired trajectory. We get various
data from the robot manipulator and use those
04:06.560 --> 04:13.560
data to model the robot manipulator in this
particular form, which is x(k plus 1) = f(x(k),
04:15.209 --> 04:22.209
u(k)). You can easily see that this radial
basis function network has n plus p
04:26.190 --> 04:33.190
inputs, x1 to xn and u1 to up, and the outputs are
x1 to xn at sampling instant k plus 1; the inputs
04:36.819 --> 04:43.819
are at sampling instant k. These are the
values taken from the actual plant, these are
04:46.250 --> 04:53.190
the control action actuated and you get the
output of the radial basis function network.
04:53.190 --> 05:00.190
If it is trained properly, then it can mimic
the robot's dynamic behavior. In the last class we also
05:04.479 --> 05:11.430
talked about dimensionally insufficient
data, which arises because the robot manipulator is
05:11.430 --> 05:13.780
an open-loop unstable system.
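The identification scheme just described, x(k plus 1) = f(x(k), u(k)) learned by a radial basis function network with n plus p inputs and n outputs, can be sketched as follows. The toy plant, Gaussian basis, center placement, and least-squares training are my illustrative assumptions, not the lecture's exact setup:

```python
import numpy as np

# Sketch of the RBFN one-step dynamics model x(k+1) = f(x(k), u(k)).
# Gaussian basis functions and least-squares output weights are
# illustrative assumptions; the lecture trains its RBFN differently.

rng = np.random.default_rng(0)

def toy_plant(x, u):
    # Hypothetical stable 2-state discrete plant standing in for the robot.
    A = np.array([[0.9, 0.1], [-0.1, 0.8]])
    B = np.array([[0.0], [0.5]])
    return A @ x + (B @ u).ravel()

# Collect (x(k), u(k)) -> x(k+1) training pairs.
X, Y = [], []
x = np.zeros(2)
for _ in range(500):
    u = rng.uniform(-1, 1, size=1)
    x_next = toy_plant(x, u)
    X.append(np.concatenate([x, u]))   # n + p = 3 inputs
    Y.append(x_next)                   # n = 2 outputs
    x = x_next
X, Y = np.array(X), np.array(Y)

# RBF layer: centers chosen from the data, fixed width.
centers = X[rng.choice(len(X), size=50, replace=False)]
width = 1.0

def rbf_features(Z):
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width**2))

# Output weights by linear least squares.
Phi = rbf_features(X)
W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)

def emulator(x, u):
    z = np.concatenate([x, u])[None, :]
    return (rbf_features(z) @ W).ravel()

pred = emulator(np.array([0.2, -0.1]), np.array([0.3]))
true = toy_plant(np.array([0.2, -0.1]), np.array([0.3]))
err = np.linalg.norm(pred - true)
```

If trained on well-spread data, the network's one-step prediction stays close to the plant's response, which is exactly the "mimic the dynamic behavior" property described above.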
05:13.780 --> 05:20.780
We said that if this is my plant, this is
my robot manipulator, then I use a PD controller
05:34.759 --> 05:41.759
here, I add some dither signal, here is your
command signal, and these are your sensors.
If we actuate the joints of the robot
manipulator, we
06:09.470 --> 06:16.470
observe the link angular position vector
and angular velocity vector. They are
06:17.280 --> 06:24.280
fed back to the command signal, and the command
signal is normally x-desired, the desired
06:25.360 --> 06:32.360
trajectory that the link should follow; you have
this PD controller. To the PD output we add this
06:34.689 --> 06:41.689
dither signal just to make sure that the
data being collected spread
06:44.659 --> 06:51.659
over more than n dimensions, rather n plus p,
actually 2n plus p dimensions. Actually,
07:00.120 --> 07:07.120
given the state-space model, we talked about
n plus p: n is for x, p is for u. The input-output
07:11.069 --> 07:18.069
dimension is 2n plus p because your number of inputs
is n plus p and the number of outputs is n, and that is when you
07:24.280 --> 07:31.280
have a network. If I have a network that has
modeled this robot manipulator, then you have
07:33.740 --> 07:40.740
n plus p inputs where n is 2 capital N
and p is capital N. So that is 3N
07:48.520 --> 07:55.520
inputs; the number of outputs is 2 capital
N. So, the total input-output dimension
07:59.210 --> 08:06.210
is 5N; the objective is to collect data in
5N dimensions. That is a very
08:12.169 --> 08:19.169
difficult job because we do not have a method
by which we can give an independent input signal
08:23.569 --> 08:30.569
to the plant or to the robot manipulator.
08:30.889 --> 08:37.889
The input actuation cannot be made state-independent
because we are using a PD controller. If I
08:39.979 --> 08:46.979
do not give a dither signal, I am simply collecting
data in n dimensions. This n is actually 2N,
08:51.350 --> 08:58.350
and if I add a dither signal, the maximum dimensionality
over which the data will span is 2N plus N,
09:00.060 --> 09:07.060
that is, 3N. This is the problem of dimensionally
insufficient data. What is a neural network?
09:18.170 --> 09:24.510
A neural network means we are simply fitting
the data from input to output; it is
09:24.510 --> 09:31.510
simply a data mapping. How well are the data generated?
We discussed this point in the last class,
09:33.100 --> 09:37.860
and that is why we proposed a query-based
learning algorithm.
09:37.860 --> 09:44.730
In the query-based learning algorithm, what we
proposed is that after we generated the
09:44.730 --> 09:51.160
data here, we generated data like this and
then we provided those data to the radial
09:51.160 --> 09:58.160
basis function network to emulate the actual
robot manipulator. So the first-iteration
10:00.389 --> 10:07.389
robot emulator, or neural emulator, is placed
in parallel with the robot manipulator.
10:07.889 --> 10:14.889
What this query-based learning algorithm does
is this: the neural emulator will generate
10:23.570 --> 10:30.570
new learning trajectories. Along each learning
trajectory, a query is posed to the neural emulator
10:33.550 --> 10:40.550
through network inversion: what would be
the input for the desired target
10:42.930 --> 10:49.930
point defined by the learning trajectory? Through
network inversion we compute what
10:53.620 --> 10:59.910
tau should be, we actuate the actual physical
plant with it, and if we see that the response of the physical
10:59.910 --> 11:06.910
plant at the next sampling instant is far
away from the value that is set
11:09.079 --> 11:16.079
according to the learning trajectory, then
we accept that as a new example because, in
11:18.699 --> 11:25.699
that zone the neural emulator does not have
proper information. The basic motivation here
11:26.540 --> 11:33.540
is this: suppose I have a neural emulator that
has modeled the robot manipulator. Kindly listen
11:34.649 --> 11:39.970
very attentively to this particular statement.
What I am trying to say is that if this neural
11:39.970 --> 11:46.970
emulator, this model of the robot manipulator,
has actually been trained properly to capture
11:47.320 --> 11:54.320
the dynamic behavior of the robot manipulator,
it is not sufficient to simply ask this question:
11:55.480 --> 12:02.480
given a sequence of desired inputs, is
the neural emulator able to follow the
12:04.070 --> 12:11.070
behavior of the robot manipulator? If so, it is
a proper emulator; but now we are going a step
12:12.290 --> 12:19.290
beyond. Here, what we are saying is that the first
step is that the neural emulator should follow
12:28.449 --> 12:35.449
the behavior of the robot manipulator. What that
means is, given a sequence of control actuations
12:43.570 --> 12:48.880
to both the robot manipulator and the
neural emulator, the response of the neural emulator
12:48.880 --> 12:54.910
and the robot manipulator should match; this is
standard practice. But we are introducing
12:54.910 --> 13:01.910
a second constraint to verify that the neural
emulator is actually emulating the robot manipulator
13:04.899 --> 13:05.970
properly.
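The query-based learning loop described above can be sketched like this. The scalar plant, the imperfect emulator, the gradient-based inversion, and the acceptance threshold are all toy stand-ins of my own, meant only to show the logic of accepting a new example wherever the emulator's inverted input fails on the real plant:

```python
import numpy as np

# Sketch of query-based learning: along a learning trajectory, invert
# the emulator to get a control input for each target, apply it to the
# real plant, and accept the result as a new training example wherever
# the plant's response is far from the target. The plant, the imperfect
# emulator, and the thresholds are toy stand-ins, not the lecture's.

def plant(x, u):
    return 0.8 * x + 0.5 * np.tanh(u)       # "real" nonlinear dynamics

def emulator(x, u):
    return 0.8 * x + 0.4 * u                # imperfect learned model

def invert_emulator(x, x_target, steps=200, lr=0.5):
    # Network inversion: gradient descent on the input u so that the
    # emulator's prediction reaches x_target.
    u = 0.0
    for _ in range(steps):
        err = emulator(x, u) - x_target
        u -= lr * 2 * err * 0.4             # d(emulator)/du = 0.4
    return u

learning_traj = np.linspace(0.0, 3.0, 20)   # target points x_d(k+1)
new_examples = []
x = 0.0
for x_target in learning_traj:
    u = invert_emulator(x, x_target)
    x_next = plant(x, u)                    # actuate the real plant
    if abs(x_next - x_target) > 0.05:       # emulator lacks data here
        new_examples.append((x, u, x_next)) # accept as a new example
    x = x_next

n_queries = len(new_examples)
```

Here the emulator's linear-in-u model diverges from the plant's saturating response at larger inputs, so the later targets generate accepted queries, exactly the zones where the emulator "does not have proper information."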
13:05.970 --> 13:12.759
The second point is that we ask this question
of the neural emulator: given a target point,
13:12.759 --> 13:19.759
what should be the control input? We give
that control input to the actual robot manipulator
13:21.199 --> 13:28.199
and we see if the manipulator goes to the
target point. If that is the case, then we
13:28.329 --> 13:35.329
would say that the neural emulator is properly
mimicking the robot manipulator. In this
13:44.160 --> 13:51.160
case, using network inversion, the neural emulator
predicts the control input for a given target
14:19.370 --> 14:26.370
trajectory; a control input sequence, I would
say. This is the second condition that we are imposing
14:34.360 --> 14:41.360
so that a neural-network model of the robot
manipulator can be robust. Not
14:44.850 --> 14:50.639
only should it be able to predict the target
point, it should also predict the target input
14:50.639 --> 14:57.639
because, given tau, this robot manipulator
produces q and q dot. So one way is,
14:58.329 --> 15:05.190
if this is the neural emulator, give the
same q and q dot to the neural emulator and ask
15:05.190 --> 15:10.880
it to predict what tau is. The other
way is to give the tau to the neural emulator
15:10.880 --> 15:17.589
and have it predict q and q dot; and we must do both.
If the neural emulator satisfies both conditions,
15:17.589 --> 15:23.319
then we say this is an exact replica of the
robot manipulator. Of course, we cannot say an
15:23.319 --> 15:30.319
exact replica, but in some sense it has much
more robust behavior than with this
15:32.040 --> 15:37.269
condition alone. We have already discussed
these things.
15:37.269 --> 15:44.269
The control objective: given the desired state
trajectory vector, which is the output activation
15:44.420 --> 15:50.810
of the radial basis function network model,
x d(k plus 1), and the actual system
15:50.810 --> 15:57.810
state vector x(k), design a control law so
that the neural model g is Lyapunov stable. What
15:58.839 --> 16:05.839
we are trying to say is that this is our
robot manipulator, and this is our neural emulator;
16:15.440 --> 16:22.440
a neural emulator that mimics the robot manipulator's
behavior. Now, I utilize this neural emulator
16:25.769 --> 16:32.769
and design a controller here. This controller
will actuate a control signal tau to the robot
16:36.180 --> 16:43.180
manipulator, and the robot follows a certain trajectory
in the state space of angular position and
16:46.759 --> 16:53.759
angular velocity. In the literature this is
called indirect adaptive control; this
17:00.350 --> 17:07.350
is indirect adaptive control. What you are
seeing here is the meaning of Lyapunov
17:08.230 --> 17:15.230
stable: I represent this neural emulator (NE) by
a function g because the normal stability
17:25.730 --> 17:32.730
study is that I can put this NE in place of
the robot manipulator. The controller
17:37.600 --> 17:44.600
and neural emulator in the feedback loop
should be Lyapunov stable; that is the
17:50.960 --> 17:52.580
control objective.
17:52.580 --> 17:57.710
Here is the first type of indirect adaptive
control scheme that we will be discussing
17:57.710 --> 18:03.880
today. This is called forward-inverse modeling.
In this forward-inverse modeling what you
18:03.880 --> 18:09.730
are seeing here is that, this is the robot
manipulator, this is again indirect adaptive
18:09.730 --> 18:16.470
control you see. This is also indirect adaptive
control but, the principle is forward-inverse
18:16.470 --> 18:19.880
modeling. I will just explain forward-inverse
modeling.
18:19.880 --> 18:26.880
This is our robot manipulator, which has a
neural emulator that we have already discussed
18:27.919 --> 18:34.919
at length. This neural emulator has the capacity
to closely mimic the behavior of the robot manipulator,
18:36.470 --> 18:43.470
not only in terms of predicting the desired target,
but also in predicting the desired input given the target.
18:45.070 --> 18:52.070
Both ways, this is an ideal neural emulator.
This is our desired trajectory q d; here is
19:05.370 --> 19:12.370
my neural controller. I have a two-stage controller
here; the neural controller provides the feed-forward
19:13.480 --> 19:20.480
torque and the PD controller provides a feedback
torque, and together they are applied to the robot
19:22.390 --> 19:29.390
manipulator. The question now is: given
q d, whatever the output of the neural controller
19:48.270 --> 19:55.270
and the PD controller is, that same output is
given to the neural emulator, which gives an
20:00.590 --> 20:07.590
output q hat; we compare it to the
desired trajectory and get an error. I use
20:15.610 --> 20:22.610
a weight update algorithm to update the weights
of the neural controller. The meaning of forward-inverse
20:28.130 --> 20:35.130
modeling is that the neural emulator represents the
forward dynamics of the robot manipulator.
20:38.779 --> 20:45.779
Now what we can do is extract from this neural
emulator the quantity known as del x upon
20:47.620 --> 20:54.460
del u, which I will show you just now. This
is very important information that we get
20:54.460 --> 21:01.460
from the neural emulator. We use that information,
which is necessary in our weight update rule,
21:02.240 --> 21:09.240
to compute W dot. We then compute and
update the weights of the neural controller
21:16.590 --> 21:22.390
based on this W dot. And what is this neural
network controller? It is again another
21:22.390 --> 21:28.039
radial basis function network. We have two
radial basis function networks: one radial
21:28.039 --> 21:33.580
basis function network represents the neural emulator;
the other radial basis function network is the
21:33.580 --> 21:40.279
neural controller. What we do not know in
this radial basis function network are the weights. We fix
21:40.279 --> 21:47.279
the centers in place in a specific domain
because we know what the
21:49.529 --> 21:56.529
various possible inputs to this controller will be,
that is, q d, the desired trajectories. Actually,
22:00.990 --> 22:07.370
this neural controller has 3N inputs:
N inputs from angular position, N inputs from
22:07.370 --> 22:14.320
angular velocity, and N inputs from
angular acceleration. Then the neural controller
22:14.320 --> 22:21.320
predicts the feed-forward torque necessary
to make the robot manipulator exactly
22:22.860 --> 22:29.860
follow the trajectory. Now, we will show you
how we design a weight update rule for this
22:32.570 --> 22:39.570
forward-inverse modeling such that the
total feedback control system here is
22:41.900 --> 22:48.900
Lyapunov stable. See how we do it here:
consider the Lyapunov function to be half
22:49.110 --> 22:55.039
x tilde transpose x tilde, where x tilde is
x d minus x hat.
22:55.039 --> 23:02.039
This x hat is the response of the radial basis
function network, which here is q hat. The
23:05.880 --> 23:12.159
time derivative of the Lyapunov function can
be derived as follows: V dot, the rate derivative,
23:12.159 --> 23:19.159
is x tilde transpose into x
tilde dot, and then we can write x tilde
transpose times del x
23:29.870 --> 23:36.870
upon del u because, you see, x tilde is x d minus
x hat, so x tilde dot is minus d
23:48.770 --> 23:55.770
x hat by dt, and d x hat by dt can be written
as del x upon del u, into del u upon del W,
24:01.850 --> 24:08.850
into dW by dt. That is, we are expanding:
x is a function of u, the control input,
24:15.140 --> 24:22.070
u is a function of W, and W is a function of
time. If you go back here, this is the
24:22.070 --> 24:29.070
meaning of forward-inverse modeling: x hat
here is a function of u
24:31.779 --> 24:38.779
here. That is why del x upon del u is computed
from the neural emulator. Again, the neural
24:43.380 --> 24:50.380
controller's output is u; the output here is
u. So del x upon del u, into del
24:59.070 --> 25:06.070
u upon del W, which again can be computed from
this neural controller, into W dot: that is what
25:13.220 --> 25:20.220
gives d x hat upon dt. Let me explain again:
what we are trying to compute is d x hat upon dt.
25:28.230 --> 25:35.230
To compute d x hat upon dt, the rate of
change of the output of the neural
25:37.409 --> 25:44.409
emulator: it can be written as del x upon del
u into… because this neural emulator has
25:51.950 --> 25:58.950
a functional relation between q and u; again,
del x upon del u, here it is, into del u upon del
26:00.770 --> 26:07.770
W. The neural controller is characterized
by a weight vector W, and u, the output of the neural
26:09.779 --> 26:16.779
controller, is a function of this W. This is not…
because the PD controller gains are fixed,
26:17.870 --> 26:24.870
they really do not affect q hat; what
affects it is the weights, because these are
26:28.049 --> 26:35.049
what change in the neural controller. So, del x upon
del u, again into del u upon del W; u is the output
26:37.350 --> 26:44.350
of the neural controller, the neural controller
is characterized by the parameter W, and the
26:45.200 --> 26:50.820
input is q d, which are all known. Naturally,
del u upon del W can be easily computed
26:50.820 --> 26:57.820
from the neural controller, into dW by dt.
This dW by dt is the weight update law; now we have
27:02.570 --> 27:09.570
to find out what this dW by dt should be such
that the whole system is Lyapunov stable.
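Written out, the chain-rule expansion just described is the following; the final choice of W dot is one consistent possibility of my own (the slide's exact law is not quoted in the transcript), using a pseudoinverse so that V dot equals minus x tilde norm square follows:

```latex
V = \tfrac{1}{2}\,\tilde{x}^{\top}\tilde{x},
\qquad \tilde{x} = x_d - \hat{x}

\dot{V} = \tilde{x}^{\top}\dot{\tilde{x}}
        = -\,\tilde{x}^{\top}\,\frac{d\hat{x}}{dt}
        = -\,\tilde{x}^{\top}
          \underbrace{\frac{\partial \hat{x}}{\partial u}}_{J,\ \text{from emulator}}
          \underbrace{\frac{\partial u}{\partial W}}_{\text{from controller}}\,
          \dot{W}

% One choice that yields \dot{V} = -\|\tilde{x}\|^2, assuming the
% pseudoinverse acts as an exact right inverse:
\dot{W} = \Bigl(J\,\tfrac{\partial u}{\partial W}\Bigr)^{+}\,\tilde{x}
\quad\Longrightarrow\quad
\dot{V} = -\,\tilde{x}^{\top}\tilde{x} = -\,\|\tilde{x}\|^{2}
```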
27:12.169 --> 27:18.669
The meaning is this: Lyapunov stable means that
if I consider this to be a Lyapunov function,
27:18.669 --> 27:25.669
the rate derivative of this function, V dot,
has to be negative definite. If I can prove
27:25.760 --> 27:32.760
that this is negative definite, then this
is a stable controller. Here
27:33.440 --> 27:40.440
I have defined J to be this quantity; this
is the Jacobian. The objective is to select a
27:41.850 --> 27:48.289
W dot such that the derivative of
the Lyapunov function remains negative semi-definite.
27:48.289 --> 27:53.890
We will now present two such weight update
laws that converge in the sense of Lyapunov
27:53.890 --> 28:00.890
functions. We first select
W dot, and this is my W dot, the weight update
28:02.169 --> 28:08.380
law. If I select this weight update law and
put this weight update law in this particular
28:08.380 --> 28:15.380
substitute it here, then what I get is V dot equals minus
x tilde norm square, meaning this quantity,
28:21.580 --> 28:25.029
this particular norm square, is either positive or zero.
28:25.029 --> 28:32.029
If x tilde is not 0, then the norm square is positive
and hence V dot is negative; and if x tilde is 0,
28:33.090 --> 28:39.279
meaning all the terms in x tilde, because it is a
vector, are 0, then only at
28:39.279 --> 28:45.279
that time is V dot 0, which is desired: x tilde
should finally be 0. Then
28:45.279 --> 28:52.279
the system is stable. Thus, the update law,
the previous law that we saw,
28:56.590 --> 29:03.590
stabilizes the control system; the
complete control system including the plant
29:12.409 --> 29:19.409
dynamics is stable according to this weight
update law. However, the weight update law
29:21.570 --> 29:27.590
does not ensure the boundedness of the weights
because, you see, we are updating the
29:27.590 --> 29:32.690
weights with nothing constraining
their magnitude. That is what we
29:32.690 --> 29:37.640
will address now: the update law is
modified by adding the gradient of a cost
29:37.640 --> 29:43.340
function H as follows: the earlier W dot term
was this one, and we have added another term,
29:43.340 --> 29:50.340
where we have taken the gradient of a function
H. This H we normally select, as you
29:54.490 --> 30:01.490
can check here, to be half W tilde transpose
into W tilde. Taking H to be this particular
30:03.750 --> 30:10.750
function, you compute del H upon del W,
where this particular function that
30:12.970 --> 30:19.970
you are looking at is a function of W defined
by this quantity. If you take that and add this
30:27.100 --> 30:34.100
term to the W dot, and again substitute this
new weight update law into the Lyapunov function,
30:40.950 --> 30:47.950
then the derivative of the Lyapunov function
again becomes minus x
30:49.850 --> 30:56.850
tilde norm square, implying again that the system
is stable. Again, this control law stabilizes
31:06.149 --> 31:13.149
the control system in the sense of Lyapunov.
The system is Lyapunov stable; finally, this
31:18.980 --> 31:25.980
update law ensures convergence of the tracking
error vector x tilde to 0. A new function is chosen
31:27.409 --> 31:34.409
such that x tilde transpose J times this function
is 0, which ensures the time derivative of the
31:35.080 --> 31:42.080
Lyapunov function remains negative semi-definite.
Simultaneously, by selecting H to be half
31:42.500 --> 31:49.120
W tilde transpose W tilde, we are intuitively
providing damping to any weight that is
31:49.120 --> 31:52.679
increasing; this ensures the boundedness of
the weight vector W.
31:52.679 --> 31:59.679
Finally, to summarize: two weight update rules
are derived which guarantee stability for
32:00.549 --> 32:07.549
forward-inverse-modeling-based indirect adaptive
control. Weight update rule 1 was this W dot;
32:12.940 --> 32:19.940
if we go back, what was the problem? Originally
the problem was that we place a neural controller
32:23.350 --> 32:30.350
whose weights are unknown; how do we update
these weights such that the overall control
32:32.070 --> 32:39.070
system is stable and perfect tracking
is achieved at the plant level? That is what
32:40.880 --> 32:47.880
we did first: we found the weight
update law for the weights of the neural controller.
32:48.519 --> 32:55.519
Similarly, we found the second weight
update law. The first does not take
32:58.639 --> 33:05.070
the boundedness of the weights into account;
the second does.
33:05.070 --> 33:11.919
Now we will go to the second approach, which
uses network inversion; we have already
33:11.919 --> 33:18.750
talked about network inversion. The concept
here is very simple: I have a robot manipulator and
33:18.750 --> 33:25.750
a neural emulator. Can I use this
neural emulator as a whole to construct my
33:27.870 --> 33:34.870
control law here that would stabilize the
entire control system here? That is the question.
33:36.139 --> 33:42.500
What I am trying to say is that if
I have a controller sitting here in conjunction
33:42.500 --> 33:49.500
with the neural emulator, can I say: given
a desired trajectory here, the next desired state,
33:53.429 --> 34:00.429
the present and past states, and the past input,
or you can also simply say q d here. Given
34:00.450 --> 34:06.440
a desired trajectory, can I say that I have
a control law in conjunction with the neural
34:06.440 --> 34:13.440
emulator in such a way that, finally, my neural
emulator outputs always follow the desired
34:21.020 --> 34:26.909
trajectory and the system is Lyapunov stable?
34:26.909 --> 34:32.660
This is network inversion: a control law
using network inversion. What we are trying
34:32.660 --> 34:38.290
to do is we are constructing a Lyapunov function
which is half x tilde, transpose x tilde plus
34:38.290 --> 34:45.290
u tilde transpose u tilde. We have introduced
a new element or new term in the Lyapunov
34:48.180 --> 34:55.180
function, where u tilde is u hat minus u.
What is this u hat? If you go back you see
34:57.280 --> 35:03.380
that, when I invert the network I get what
should be the control input that will take
35:03.380 --> 35:10.380
my neural emulator to the target vector. You
already know that there may be a situation
35:14.440 --> 35:21.380
where my predicted control input may not take
the robot manipulator to its actual target.
35:21.380 --> 35:28.380
This is because of the problem that we enumerated
in the beginning: a proper neural emulator
35:28.590 --> 35:35.590
is one which not only predicts the target but
also predicts the desired input. If that training
35:37.050 --> 35:43.030
is not complete, then obviously the neural
emulator will sometimes fail to predict the
35:43.030 --> 35:50.030
control input. Now, we have to find some method
by which this prediction can be made properly.
35:51.540 --> 35:58.540
With the introduction of this new term, we
take the time derivative of the Lyapunov function,
36:00.390 --> 36:07.390
and then we see that if I differentiate the first
term, I get x tilde transpose x tilde dot, and x
36:07.690 --> 36:13.670
tilde dot can be written as minus del x upon
del u (that is why the minus sign comes in) into
36:13.670 --> 36:20.670
du by dt; that is the first
term. The second term is u tilde
36:23.700 --> 36:30.700
transpose into u dot, where we can write
the whole as minus x tilde transpose, into J plus
36:37.760 --> 36:44.760
D, into u dot, where J is del x upon del u and
D is 1 upon x tilde norm square, times x tilde u
36:49.020 --> 36:56.020
tilde transpose. Control law using network
inversion: first of all, we present a theorem.
36:57.510 --> 37:04.510
If an arbitrary initial input activation u
naught is updated by this formula, the identity
37:05.600 --> 37:12.600
u(t) equals u naught plus the integral from 0 to t of
u dot dt, where u dot is given by this expression, then x
37:14.050 --> 37:20.770
tilde converges to 0 under the condition that
u dot exists along the convergence trajectory.
37:20.770 --> 37:27.770
Substituting u dot into the previous V dot, you
get V dot equals minus x tilde norm square; that
37:35.880 --> 37:42.880
is, the overall system will be Lyapunov stable.
The iterative input activation update rule
37:44.640 --> 37:51.640
will be, since this is a continuous update,
the iterative version: u(t) is u(t minus 1) plus mu times
37:53.060 --> 38:00.060
u dot(t minus 1). This is actually iterative;
since t here was given in terms of time,
38:00.250 --> 38:07.250
maybe we should write these t's as k, where mu
is a small constant representing the update
38:10.040 --> 38:17.040
rate, and u dot is computed from this expression
at sampling instant k minus 1. We have proposed
38:20.550 --> 38:27.550
two different indirect adaptive control
schemes, and we found the control law and
38:31.200 --> 38:37.830
weight update law. In the first case, we found
the weight update law for the
38:37.830 --> 38:44.800
controller; in the second case, we found the
control law. Now we will show
38:44.800 --> 38:48.340
how effective these controllers are.
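A minimal sketch of the network-inversion control iteration u(k) = u(k minus 1) plus mu times u dot(k minus 1): here the emulator is a toy differentiable one-step model of my own, the Jacobian is found by finite differences, and u dot is taken simply as J transpose times x tilde, a simplified form of the lecture's update that omits the u tilde coupling term, so treat this as illustrative only.

```python
import numpy as np

# Sketch of the network-inversion control iteration
# u(k) = u(k-1) + mu * u_dot(k-1). The emulator is a toy differentiable
# one-step model, the Jacobian J = del x / del u is estimated by finite
# differences, and u_dot is taken simply as J * x_tilde, a simplified
# form of the lecture's update (the u-tilde coupling term is omitted).

def emulator(x, u):
    # Hypothetical smooth one-step model x(k+1) = f(x(k), u(k)).
    return 0.7 * x + 0.3 * np.tanh(u)

def jacobian_du(x, u, eps=1e-6):
    # Finite-difference estimate of del x / del u (diagonal here, since
    # each input only drives its own state in this toy model).
    return (emulator(x, u + eps) - emulator(x, u - eps)) / (2 * eps)

def invert_for_input(x, x_d, mu=1.0, iters=500):
    u = np.zeros_like(x_d)
    for _ in range(iters):
        x_tilde = x_d - emulator(x, u)   # tracking error
        J = jacobian_du(x, u)            # sensitivity of x to u
        u = u + mu * J * x_tilde         # u(k) = u(k-1) + mu * u_dot
    return u

x = np.array([0.1, -0.2])    # current state of two "joints"
x_d = np.array([0.3, 0.1])   # desired next state
u = invert_for_input(x, x_d)
residual = np.linalg.norm(x_d - emulator(x, u))
```

The iteration drives x tilde toward 0 whenever u dot exists along the path, mirroring the condition stated in the theorem above.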
38:48.340 --> 38:55.340
We selected a high-speed robot manipulator
whose dynamics are given this way, where in
38:59.160 --> 39:06.160
this particular dynamic model C21 stands for
cos(q2 minus q1) and S21 stands for sin(q2
39:09.350 --> 39:16.350
minus q1) for this two-link manipulator,
39:24.360 --> 39:31.360
and the parameters a1, a3, a4 and a2
are 0.15, 0.04, 0.03 and 0.025 kg m squared,
39:37.750 --> 39:40.180
respectively.
39:40.180 --> 39:46.610
We have selected the same robot manipulator
that we have been discussing in the last two
39:46.610 --> 39:52.040
classes. The online data generation scheme
for training the radial basis function network
39:52.040 --> 39:57.570
is the same. A PD controller is used to generate
the training data to find a neural model of
39:57.570 --> 40:03.060
the robot arm. Data are collected as the robot
arm is made to track various random "pick
40:03.060 --> 40:09.160
and place" trajectories and sinusoid trajectories.
While tracking random trajectories at each
40:09.160 --> 40:15.210
sampling instant, various dither signals in
the form of white noise, impulses, step functions,
40:15.210 --> 40:20.750
ramp, and parabolic functions are added
to the PD controller output to increase the spread of
40:20.750 --> 40:27.100
the training data in the input-output space.
This is how we shift the data from the n-manifold
40:27.100 --> 40:34.100
to the n plus p manifold. 3000 examples are collected,
with the sampling interval kept at 10
40:36.900 --> 40:38.420
milliseconds.
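The data-generation scheme just described can be sketched as follows. The one-joint plant model, the PD gains, and the dither magnitude are my illustrative assumptions; only the 10 ms sampling interval, the 3000-example count, the PD-plus-dither structure, and the sinusoid tracking come from the lecture:

```python
import numpy as np

# Sketch of the online data-generation scheme: a PD controller tracks a
# sinusoid trajectory on a toy one-joint plant, a dither signal is added
# to the PD output to spread the data beyond the n-dimensional manifold,
# and (x(k), u(k), x(k+1)) samples are collected at a 10 ms interval.

rng = np.random.default_rng(1)
dt = 0.01                         # 10 ms sampling interval
Kp, Kd = 25.0, 5.0                # PD gains (assumed values)

q, qd = 0.0, 0.0                  # joint angle and velocity
data = []
for k in range(3000):             # 3000 examples, as in the lecture
    t = k * dt
    q_des = np.sin(t)             # desired sinusoid trajectory
    qd_des = np.cos(t)
    # PD feedback torque plus a white-noise dither term.
    tau = Kp * (q_des - q) + Kd * (qd_des - qd)
    tau += rng.normal(scale=0.5)  # dither spreads the data in u
    # Toy one-joint dynamics: unit inertia with viscous friction.
    qdd = tau - 0.1 * qd
    q_next = q + qd * dt
    qd_next = qd + qdd * dt
    data.append((q, qd, tau, q_next, qd_next))
    q, qd = q_next, qd_next

data = np.array(data)
```

Each row pairs the state and actuated torque at instant k with the state at k plus 1, which is exactly the training format the radial basis function network expects.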
40:38.420 --> 40:45.420
Then the radial basis function network has
6 inputs, that is, two terms for joint torque inputs,
40:50.740 --> 40:57.420
two terms for angular position, and two terms
for angular velocity, all at sampling instant k.
40:57.420 --> 41:02.560
When you give these to the network, the network
will predict the actual value
41:02.560 --> 41:09.560
of the angular position and angular velocity
at k plus 1. The model incorporates 100 radial
41:10.100 --> 41:15.590
centers. Training of RBFN is carried out using
3000 examples for 10 numbers of passes that
41:15.590 --> 41:21.010
means 30,000 iterations. After training is
over the RMS error for a test data set is
41:21.010 --> 41:28.010
found to be 0.003. The validation test of
neural model thus learned is done by finding
41:32.780 --> 41:39.780
input through recall process for a given test
data. It is not simply because, when you have
41:39.850 --> 41:46.850
the RMS error 0.003 then, you can easily predict
the output. The radial basis function network
41:49.970 --> 41:56.500
will effectively predict what should be the
output given an input, but what is the case.
41:56.500 --> 42:03.500
Given an output, can the same network predict
the input? These results we showed in the
42:04.120 --> 42:05.470
last class.
42:05.470 --> 42:11.520
This is a for a sinusoid trajectory when robot
manipulator is tracking a sinusoid trajectory,
42:11.520 --> 42:17.450
we asked a question to the neural emulator
corresponding to the robot manipulator. What
42:17.450 --> 42:24.450
is the input given the target output? That
input is already given to the robot manipulator;
42:24.960 --> 42:31.960
so what this result is all about is I can
tell you again. This is my robot manipulator
42:32.550 --> 42:39.550
and this is my neural emulator, this is my
u I give to the robot manipulator and I get
42:40.990 --> 42:47.990
q, I give that same q to the neural emulator
and predict, what should be that u. If this
42:53.080 --> 43:00.080
u hat matches this u, then this neural emulator
is a good emulator of the robot manipulator.
43:01.860 --> 43:06.570
I hope you understand very clearly; again
I repeat what is the meaning of this validation
43:06.570 --> 43:13.570
through inversion. Validation through inversion
means, of course we have trained this neural
43:14.190 --> 43:20.120
emulator by giving the training, set u and
q to this. Obviously, if I give u this will
43:20.120 --> 43:27.120
predict what q is but, given q can I predict
the u? This is the question we obtained. Both
43:27.370 --> 43:34.370
ways the neural emulator should do a perfect
job and then it is a robust identifier or
43:36.000 --> 43:40.660
robust model, of the actual robot manipulator.
What you are seeing is what we have already
43:40.660 --> 43:46.610
discussed in the last class: before the
query-based learning that we discussed today,
43:46.610 --> 43:52.430
and also yesterday, before query-based learning
the input prediction was very bad. You see
43:52.430 --> 43:58.240
that there is a lot of variation from the
solid line; the solid line is the actual control
43:58.240 --> 44:05.240
input given to the robot arm for a
desired sinusoid trajectory. But after query-
44:07.920 --> 44:14.920
based learning, for joint 1 you see that
the predicted one, the broken line, and the
44:15.330 --> 44:21.940
solid one, they are almost very close. Where
the dotted one this is before query based
44:21.940 --> 44:28.300
learning. After query based learning, the
model has become very robust: not only is the
44:28.300 --> 44:35.300
neural emulator predicting the target given the
input, it is also predicting what should be
44:35.300 --> 44:41.420
the input given the target. The same thing
is also valid for joint 2. You see that
44:41.420 --> 44:45.890
before query based learning, again, these
dotted lines were not good. They
44:45.890 --> 44:52.890
were not similar to the actual solid line,
that is, the control signal sequence given
44:53.850 --> 44:59.850
to the actual robot manipulator. But after
query based learning, you see that this broken
44:59.850 --> 45:06.850
line almost follows the solid line. This is
the advantage; this is the neural emulator
45:07.840 --> 45:13.810
we have now utilized for testing our controller.
I will not discuss these things because we
45:13.810 --> 45:20.680
have already covered them in the last class. We
will now show the simulation results for the
45:20.680 --> 45:27.680
two control algorithms that we derived today,
to reinforce your own understanding.
45:27.840 --> 45:33.000
We first propose forward-inverse
modeling. In forward-inverse modeling, this
45:33.000 --> 45:39.700
is my robot manipulator and this is my neural
controller. The neural controller is supposed
45:39.700 --> 45:46.700
to actuate a feed-forward torque
such that q tracks q d, that is, the output of
45:53.480 --> 46:00.480
the manipulator tracks q d. The neural controller
is a radial basis function network whose W
46:00.570 --> 46:07.570
is not known. Now, how do I update this W in
such a manner that my
q follows q d and this is Lyapunov stable?
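A minimal sketch of such a radial basis function controller follows; the dimensions, Gaussian centers, and widths are illustrative assumptions, not the lecture's values. Only W is adapted, which is what makes the update law below tractable.

```python
import numpy as np

# Sketch of the RBF controller described above: tau = W^T phi(z), where phi
# is a vector of Gaussian basis functions of the controller input z (e.g.
# desired joint positions and velocities). All sizes here are assumptions.
class RBFController:
    def __init__(self, n_in, n_basis, n_out, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = rng.uniform(-1.0, 1.0, size=(n_basis, n_in))
        self.sigma = sigma
        # W is the only adapted parameter set; basis parameters stay fixed.
        self.W = np.zeros((n_basis, n_out))

    def phi(self, z):
        # Gaussian activation of each basis center.
        d2 = np.sum((self.centers - z) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def torque(self, z):
        # Output is linear in W, so d(tau)/dW is just phi(z) -- the term the
        # weight update law reads from the controller.
        return self.phi(z) @ self.W

ctrl = RBFController(n_in=4, n_basis=16, n_out=2)
z = np.array([0.1, -0.2, 0.0, 0.3])   # e.g. [qd1, qd2, qd_dot1, qd_dot2]
tau = ctrl.torque(z)                   # zero torque until W is adapted
```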
46:18.200 --> 46:25.200
For that, as you saw today in this class,
we derived a weight update law, W dot,
46:26.460 --> 46:33.460
and this W dot is computed using two terms:
the first term, ∂x/∂u, is computed
46:35.120 --> 46:42.120
from the neural emulator, and the other
term, ∂u/∂W, is computed from
46:43.110 --> 46:50.110
the neural controller itself. Using these two,
we computed W dot, and this W dot in
46:53.580 --> 46:59.740
the first case was the norm of x tilde squared,
divided by the norm of J transpose x tilde squared, times
46:59.740 --> 47:06.740
J transpose x tilde. In the second rule
we added a gradient term of the weights,
47:09.820 --> 47:16.820
where H is half W transpose W. When we took
this value for H and differentiated it,
47:21.780 --> 47:28.780
we found that this weight update rule is also
Lyapunov stable, or gives us Lyapunov stability.
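In symbols, the two update laws described above can be reconstructed as follows, with J = (∂x/∂u)(∂u/∂W) and x tilde the tracking error; the gain η and the sign of the added gradient term are assumptions, not stated lecture values.

```latex
% Rule 1 (reconstructed from the spoken description):
\dot{W} \;=\; \frac{\|\tilde{x}\|^{2}}{\|J^{\top}\tilde{x}\|^{2}}\, J^{\top}\tilde{x},
\qquad J \;=\; \frac{\partial x}{\partial u}\,\frac{\partial u}{\partial W}

% Rule 2 adds the gradient of H = \tfrac{1}{2} W^{\top} W, i.e. \nabla_W H = W
% (\eta and the minus sign are assumptions):
\dot{W} \;=\; \frac{\|\tilde{x}\|^{2}}{\|J^{\top}\tilde{x}\|^{2}}\, J^{\top}\tilde{x}
\;-\; \eta\, W
```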
47:30.730 --> 47:37.730
Using these two rules, we did a simulation;
this is the forward-inverse modeling simulation.
47:41.570 --> 47:48.570
What you are seeing is that we provide the
robot manipulator the same trajectory
47:54.780 --> 48:01.780
50 times. We start here with
this neural controller. The weights are all
48:06.660 --> 48:09.120
randomly initialized.
48:09.120 --> 48:16.120
We did not know them; now, with these initial random
weights, we used various algorithms. The first is
48:19.500 --> 48:26.500
gradient descent: what we did is that
we simply
updated the weights according to gradient
48:32.520 --> 48:39.520
descent, and you see, over 50 trials, the
error in joint 1 position; this is trial
48:42.930 --> 48:49.930
1, this is the RMS error, and slowly the RMS
error reduced to here. But when we
48:54.200 --> 49:00.630
use gradient descent with the weight updated
thrice per sampling interval, you see
49:00.630 --> 49:07.630
that the error decreased further. Then the third
one is the adaptive tuning. We have
49:15.190 --> 49:21.700
two update rules; this is the first type of update
rule, and you can easily see it is the upper
49:21.700 --> 49:28.700
one. Then you have the same adaptive tuning
algorithm but with the weight updated thrice per sampling
49:33.630 --> 49:40.630
interval, and you see that it is further
improved. But you see the amazing aspect
49:44.450 --> 49:51.450
of the adaptive tuning of type 2.
When you do type 2, you see that the error
49:52.400 --> 49:59.400
is actually almost 0 and independent of trial;
it does not require any trials, that is, instantaneously
50:02.970 --> 50:09.970
the error goes to 0.000001. This is
a very fantastic feature of this controller;
50:12.360 --> 50:19.110
the controller that we talked about is this type
2. You see, for the type 2 controller here also
50:19.110 --> 50:26.110
the error lying on the x-axis means 0.000001.
Here also, for joint 1 position, the
50:28.790 --> 50:35.430
error is very small, and here also the error
is very small; whereas the other schemes
50:35.430 --> 50:40.800
have relatively large errors.
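The trial loop being compared above can be sketched as follows. The plant, controller, and update rule here are toy placeholders (assumptions); only the loop structure reflects the experiment described: 50 presentations of the same trajectory, with the weight update optionally repeated within each sampling interval.

```python
import numpy as np

# Skeleton of the trial loop: the same trajectory is presented for 50 trials,
# and within each sampling interval the weight update may be applied once or
# several times ("thrice per sampling interval").
def run_trials(step_plant, control, update_W, W, q_d,
               n_trials=50, updates_per_sample=1):
    rms_per_trial = []
    for _ in range(n_trials):
        q = np.zeros_like(q_d[0])             # restart trajectory each trial
        sq_err = 0.0
        for k in range(len(q_d)):
            tau = control(W, q_d[k])          # feed-forward torque from W
            q = step_plant(q, tau)            # advance plant one sample
            for _ in range(updates_per_sample):
                W = update_W(W, q_d[k] - q)   # repeated update per interval
            sq_err += np.sum((q_d[k] - q) ** 2)
        rms_per_trial.append(np.sqrt(sq_err / len(q_d)))
    return W, rms_per_trial

# Toy placeholders just to exercise the loop: scalar "plant" q+ = 0.5 q + tau.
W0 = np.zeros(1)
traj = [np.array([1.0])] * 20
W_f, rms = run_trials(
    step_plant=lambda q, tau: 0.5 * q + tau,
    control=lambda W, qd: W * qd,
    update_W=lambda W, e: W + 0.1 * e,
    W=W0, q_d=traj, n_trials=50, updates_per_sample=3)
```

With the toy plant, the per-trial RMS error falls over the 50 trials, mirroring the gradient-descent curves discussed above.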
50:40.800 --> 50:47.800
We compared these schemes; now we will take an example here.
This is the sinusoid trajectory that is being
50:51.270 --> 50:58.270
tracked by this robot manipulator; this is the
joint 1 angle. You can easily see that in the beginning
50:58.750 --> 51:05.750
there is some error, and that error died down
as time progressed. You can easily see
51:05.840 --> 51:12.840
there are two plots: this is the trajectory
tracking, and this is the tracking
51:13.800 --> 51:20.800
error at joint 1. The solid line is using
the new update rule, whereas the dotted one
51:27.480 --> 51:30.580
is using gradient descent.
51:30.580 --> 51:37.580
Similarly, here on joint 2 the tracking is
perfect; you cannot
51:39.510 --> 51:46.510
even distinguish the two trajectories, the actual
trajectory and the desired trajectory;
51:46.940 --> 51:53.940
they superimpose so perfectly that they appear
to be the same trajectory. Tracking is perfect.
51:55.890 --> 52:01.600
If we look at the tracking error at joint 2 on a
very fine scale, then you see that for gradient
52:01.600 --> 52:08.600
descent the error is quite visible, whereas
it is almost 0 for the adaptive tuning that
52:09.170 --> 52:16.170
we have done using Lyapunov stability theory.
That is a simulation result of the first part;
52:17.120 --> 52:22.500
the second part is network inversion. In
network inversion, what we did is this: we
52:22.500 --> 52:27.940
already have a neural network, so we asked why
we should put another neural network there.
52:27.940 --> 52:34.940
Instead, we have a neural network that has already
identified the model of the robot manipulator.
52:35.680 --> 52:42.600
We utilize that neural network to compute
our control law such that the overall
52:42.600 --> 52:49.600
system is Lyapunov stable. What should be
the control input, the u dot here or the u here?
52:50.950 --> 52:57.950
We say this is my u, or tau. That will be a
function of the u hat that is given by
53:00.740 --> 53:07.740
the network inversion algorithm, and J is the
Jacobian that is computed from the neural
53:09.010 --> 53:16.010
emulator, that is, ∂x/∂u.
53:19.330 --> 53:26.330
We derived this algorithm; my control algorithm
is u(t), and u, which is the same as tau, is u(t minus 1)
53:29.680 --> 53:36.680
plus mu times u dot, where u dot is this expression,
which can be computed very easily because
53:37.250 --> 53:41.670
these are simply norm squares, and J is the Jacobian
matrix computed from the neural emulator.
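The incremental control update can be sketched as below. The lecture's exact expression for u dot (including the term D) is not reproduced here; as a stand-in, the sketch uses the same normalized-gradient form as the earlier inversion rule, with the Jacobian taken from a toy stand-in emulator. All of this is an assumption for illustration.

```python
import numpy as np

# Sketch of the network-inversion control update u(t) = u(t-1) + mu * u_dot.
# u_dot here is assumed to take the normalized-gradient form
# (||x_err||^2 / ||J^T x_err||^2) * J^T x_err, with J = dx/du from the emulator.
def emulator(u):
    # Toy stand-in for the trained forward model x = f(u).
    return np.tanh(u)

def emulator_jacobian(u):
    # dx/du of the stand-in emulator (diagonal for this element-wise map).
    return np.diag(1.0 / np.cosh(u) ** 2)

def inversion_control_step(u_prev, x_desired, mu=0.5):
    x_err = x_desired - emulator(u_prev)   # tracking error through the model
    J = emulator_jacobian(u_prev)
    g = J.T @ x_err
    denom = np.dot(g, g)
    if denom < 1e-12:                      # avoid division by zero near optimum
        return u_prev
    u_dot = (np.dot(x_err, x_err) / denom) * g
    return u_prev + mu * u_dot             # u(t) = u(t-1) + mu * u_dot

# Drive the emulator output toward a desired state by iterating the update.
x_d = np.array([0.6, -0.4])
u = np.zeros(2)
for _ in range(100):
    u = inversion_control_step(u, x_d)
```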
53:41.670 --> 53:48.670
D is an expression that we already gave
in this class. When you implement this,
53:55.030 --> 54:02.030
what you are seeing here is that we have
the two joints of the robot manipulator: this is joint 1
54:06.850 --> 54:13.850
and this is joint 2. Plot A is the tracking error
in joint 1 and joint 2 using the control law for
54:22.680 --> 54:29.680
a sinusoid trajectory, after query based learning
and before query based learning, and the controller
54:29.810 --> 54:35.330
response; this is the controller response.
54:35.330 --> 54:41.080
The controller response for joint 1 and joint
2 corresponds to the figure; this is the controller
54:41.080 --> 54:48.080
response, this is the tracking error, and this
is u; this curve gives you u. You see that
54:56.980 --> 55:03.390
the dotted line, what you are seeing here,
is the control action before query based
55:03.390 --> 55:09.980
learning: we implemented this
network inversion control before query based
55:09.980 --> 55:16.430
learning, and you see that the controller
is fluctuating. But as soon as you did the query
55:16.430 --> 55:22.640
based learning, the controller is smooth,
the solid line. Correspondingly, when the
55:22.640 --> 55:28.950
controller was not smooth the error was very
large, but when the controller becomes smooth
55:28.950 --> 55:34.750
the error is almost negligible in joint
1. Similar is the case with joint 2: you see
55:34.750 --> 55:41.750
that the controller is fluctuating here, fluctuating
very much before query based learning,
55:43.080 --> 55:49.510
and that means, when an improperly trained
neural network is used, we have a large
55:49.510 --> 55:55.610
error. But when the query based learning
was completed, we again implemented the controller.
55:55.610 --> 56:02.610
Do you see that the torque is very smooth
here and the error is very small?
56:05.000 --> 56:12.000
A similar thing happens here in this case: the
previous one was the sinusoid trajectory, and
56:13.740 --> 56:19.810
this is an exponential trajectory. For the exponential
trajectory you see that this fluctuation
56:19.810 --> 56:26.810
is before query based learning, and the fluctuation
died down after implementing the controller
56:30.560 --> 56:37.560
after the query based learning; that is
the solid line, which is quite smooth. Corresponding
56:39.280 --> 56:46.280
to the solid line, this solid curve here shows
the error in joint 1 tracking is almost negligible,
56:47.150 --> 56:54.150
not there; whereas without query based learning
the error always exists. Similarly, for
56:55.000 --> 57:01.110
joint 2, with query based learning, the
solid line shows almost no error, and error is
57:01.110 --> 57:07.910
there before query based learning; you
can see this fluctuation here, indicating
57:07.910 --> 57:14.910
that the control actuation is not smooth.
57:16.470 --> 57:23.470
In summary, we discussed indirect adaptive control for
a robot manipulator today. For this
57:24.460 --> 57:31.460
indirect adaptive control we proposed two different
architectures: one is indirect
57:31.490 --> 57:36.700
adaptive control using the forward-inverse modeling
approach; the other is indirect adaptive control
57:36.700 --> 57:42.970
using network inversion. We say these are
indirect adaptive control because we are
57:42.970 --> 57:49.970
utilizing the neural emulator, the forward
model, that is, the forward dynamic model:
57:50.350 --> 57:57.150
we trained a neural network that captures
the forward dynamics of the robot manipulator
57:57.150 --> 58:04.150
and utilized that neural network to tune, or
update, our controller. Both
58:06.530 --> 58:11.990
control schemes are shown to be Lyapunov
stable, and simulation results are provided to
58:11.990 --> 58:18.510
validate the efficacy of the proposed schemes.
We saw how our tracking is almost perfect
58:18.510 --> 58:25.510
when we have a neural emulator that is perfectly
trained using query based learning and then
58:28.780 --> 58:33.640
implement this controller; the result is
excellent.
58:33.640 --> 58:40.640
Those who want to pursue further work in this
area (the last three classes form a kind of
58:41.340 --> 58:48.340
self-contained mini-course) can follow these
references that we have given here for further
58:51.000 --> 58:58.000
work. The first is our own paper, published
in 2003 in the journal Computers and Electrical Engineering;
58:58.620 --> 59:05.620
the second is again our paper, published
in IEEE Transactions on Neural Networks in 1996,
59:07.900 --> 59:14.900
volume 7, number 6. The third
paper, again ours, is in IEE Proceedings
59:17.260 --> 59:24.260
Control Theory and Applications, 1995, volume
142, number 6. This is another paper,
59:27.580 --> 59:32.630
called Inverting Feed-forward Neural Networks
Using Linear and Nonlinear Programming, by
59:32.630 --> 59:39.630
Bao-Liang Lu, H. Kita, and Y. Nishikawa, in
IEEE Transactions on Neural Networks, volume
59:42.010 --> 59:49.010
10, issue 6. Another one is on query based
learning for aerospace applications, which is
59:50.180 --> 59:57.180
in IEEE Transactions on Neural Networks,
volume 14, issue 6, November 2003. That
59:59.550 --> 1:00:06.550
should give you a good exposure on how
to design indirect adaptive control schemes
1:00:07.430 --> 1:00:14.430
for robot manipulators in particular and in
general for nonlinear systems. Thank you.