WEBVTT
Kind: captions
Language: en
00:00:00.599 --> 00:00:05.319
So, one more thing: the conditional expectation.
00:00:05.319 --> 00:00:12.800
So, since I said X given Y is a random variable,
I can go on to find the expectation
00:00:12.800 --> 00:00:20.500
of X given Y. So this is called the conditional
expectation. That means, X given Y,
00:00:20.500 --> 00:00:24.370
still it is a random variable, but it has a
conditional distribution; therefore, finding
00:00:24.370 --> 00:00:28.040
out its expectation, that is called
the conditional expectation.
00:00:28.040 --> 00:00:35.780
Suppose I treat both the random variables as
continuous; then the conditional expectation
00:00:35.780 --> 00:00:43.629
is nothing but minus infinity to infinity of
x times f X given Y of x given y, integrated
00:00:43.629 --> 00:00:48.940
with respect to x. That means, by treating
X and Y as continuous random variables, I
00:00:48.940 --> 00:00:55.350
can define the conditional expectation
as this, provided this expectation exists. That
00:00:55.350 --> 00:00:59.149
means, in the absolute sense, this integration
converges.
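The definition just spoken, written out for the continuous case, is:

```latex
E[X \mid Y = y] \;=\; \int_{-\infty}^{\infty} x \, f_{X \mid Y}(x \mid y) \, dx
```

provided the integral converges absolutely.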
00:00:59.149 --> 00:01:04.089
Then, without the absolute value, whatever value
we get, that is going to be the conditional
00:01:04.089 --> 00:01:10.799
expectation of the random variable. And if
you note that, since Y can also take any
00:01:10.799 --> 00:01:19.460
value, this is a function of y.
Not only is this a function of y, the conditional
00:01:19.460 --> 00:01:28.899
expectation is also a random variable. That
means X given Y is a random variable, the
00:01:28.899 --> 00:01:35.530
expectation of X given Y is a function of
y, and Y is a random variable that takes different
00:01:35.530 --> 00:01:36.530
values y.
00:01:36.530 --> 00:01:41.760
Therefore, the expectation of X given Y is also
a random variable. That means you are able
00:01:41.760 --> 00:01:49.990
to find out the expectation of the expectation
of X given Y. If you compute that, it is going
00:01:49.990 --> 00:01:56.299
to be the expectation of X. This is a very important
property, in which you relate two different
00:01:56.299 --> 00:02:00.920
random variables in the conditional sense,
and if you try to find out the expectation
00:02:00.920 --> 00:02:03.450
of that, that is going to be the original
expectation.
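This law of total expectation, E of (E of X given Y) equals E of X, can be checked numerically. A minimal sketch on a small discrete joint pmf; the weights 1 / (x + y) below are hypothetical, chosen only so that X and Y are dependent:

```python
# Law of total expectation, E[E[X | Y]] = E[X], checked on a small discrete
# joint pmf. The weights 1 / (x + y) are hypothetical, chosen only so the
# pmf is not a product (X and Y are dependent here).
support = (1, 2, 3)
weights = {(x, y): 1 / (x + y) for x in support for y in support}
Z = sum(weights.values())
pmf = {k: w / Z for k, w in weights.items()}          # normalized joint pmf

# Marginal of Y, then E[X | Y = y] from the conditional pmf.
pY = {y: sum(pmf[(x, y)] for x in support) for y in support}
cond_exp = {y: sum(x * pmf[(x, y)] for x in support) / pY[y] for y in support}

# E[E[X | Y]] = sum over y of E[X | Y = y] * P(Y = y) agrees with E[X].
lhs = sum(cond_exp[y] * pY[y] for y in support)
rhs = sum(x * p for (x, y), p in pmf.items())         # E[X] from the joint
assert abs(lhs - rhs) < 1e-12
```

The identity holds for any joint pmf, which is exactly what makes it useful when the conditional expectation is easier to compute than the direct one.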
00:02:03.450 --> 00:02:10.440
That means the usage of this concept: instead
of finding out the expectation of one random
00:02:10.440 --> 00:02:17.280
variable directly, if it is easier to find the conditional
expectation, then you find the expectation
00:02:17.280 --> 00:02:23.330
of the conditional expectation, and that is the same as
the original expectation. Suppose you have
00:02:23.330 --> 00:02:33.330
two independent random
variables; then you know that there is no
00:02:33.330 --> 00:02:36.300
dependency between the random variables X and
Y.
00:02:36.300 --> 00:02:45.010
Therefore, the expectation of X given Y is
the same as the expectation of X. So this can
00:02:45.010 --> 00:02:49.850
be validated here also, because this expectation
of X given Y is going to be the expectation of
00:02:49.850 --> 00:02:57.560
X, which is a constant, and the expectation of a constant
is that same constant,
00:02:57.560 --> 00:03:04.560
so that can be cross-checked. So here I have
given the expectation of X given Y in integral
00:03:04.560 --> 00:03:06.880
form, if both the random variables are continuous.
00:03:06.880 --> 00:03:13.880
If instead both are discrete, then accordingly
you have to use first the joint probability mass function, then the conditional
00:03:13.880 --> 00:03:22.180
probability mass function, to get that conditional
expectation. And this conditional expectation
00:03:22.180 --> 00:03:28.240
is very important for giving one important
property, called the Martingale property, in the
00:03:28.240 --> 00:03:34.380
stochastic process, in which you are going
to discuss not only two random variables;
00:03:34.380 --> 00:03:39.280
you are going to have n random
variables.
00:03:39.280 --> 00:03:45.150
And you can try to find out what is the conditional
expectation of one random variable, given
00:03:45.150 --> 00:03:56.710
that the other random variables have already
taken some values. So there we are going to find out
00:03:56.710 --> 00:04:02.450
what is the conditional expectation of one of
the n random variables, given that the remaining
00:04:02.450 --> 00:04:08.930
n minus 1 random variables have already taken some
values. So here I have shown, with only
00:04:08.930 --> 00:04:12.310
two random variables, how to compute the conditional
expectations.
00:04:12.310 --> 00:04:18.340
But as such, you are going to find out the
conditional expectation of n random variables
00:04:18.340 --> 00:04:28.470
with n minus 1 random variables having already taken
some values.
00:04:28.470 --> 00:04:38.570
So before I go to another concept, let me
just give a few examples. I have already
00:04:38.570 --> 00:04:44.150
given, when both the random variables are of
discrete type, an example of a joint
00:04:44.150 --> 00:04:53.240
probability mass function as 1 divided by
2 power x plus y, where x takes the values 1, 2
00:04:53.240 --> 00:04:57.340
and so on, and y takes the values 1, 2 and so
on. So this is the joint probability mass
00:04:57.340 --> 00:04:59.370
function example.
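A quick numerical cross-check of this joint pmf, p(x, y) = 1 / 2^(x + y) for x, y = 1, 2, and so on; the infinite support is truncated at a hypothetical cutoff N for the computation:

```python
# Joint pmf from the lecture: p(x, y) = 1 / 2**(x + y) for x, y = 1, 2, ...
# N is a hypothetical truncation point; the neglected tail is about 2**-N.
N = 60
pmf = {(x, y): 1 / 2 ** (x + y) for x in range(1, N) for y in range(1, N)}

total = sum(pmf.values())
# Marginal of X: summing the geometric series over y gives p_X(x) = 1 / 2**x,
# so the joint factors as (1 / 2**x) * (1 / 2**y): X and Y are independent.
pX = {x: sum(1 / 2 ** (x + y) for y in range(1, N)) for x in range(1, N)}

assert abs(total - 1) < 1e-12        # pmf sums to 1
assert abs(pX[1] - 0.5) < 1e-12      # marginal P(X = 1) = 1/2
```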
00:04:59.370 --> 00:05:05.650
Suppose you have random variables of the continuous
type; then I can give one simple example of
00:05:05.650 --> 00:05:12.260
the joint probability density function of
a two-dimensional continuous random variable
00:05:12.260 --> 00:05:21.250
as lambda times mu, e power minus lambda x minus mu y,
00:05:21.250 --> 00:05:27.850
where x can take values greater than 0,
y can take values greater than 0, and lambda
00:05:27.850 --> 00:05:31.190
is strictly greater than 0, as is mu.
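Written out, the density just described is:

```latex
f_{X,Y}(x, y) \;=\; \lambda \mu \, e^{-\lambda x - \mu y},
\qquad x > 0,\; y > 0,\; \lambda > 0,\; \mu > 0.
```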
00:05:31.190 --> 00:05:36.370
So this is going to be the joint probability
density function of a two dimensional continuous
00:05:36.370 --> 00:05:41.390
random variable. You can cross-check that this
is a valid joint density because it is going to
00:05:41.390 --> 00:05:47.139
always take values greater than or equal to 0
for all x and y, and if you make a double
00:05:47.139 --> 00:05:55.400
integration over 0 to infinity in x
and y, then that is going to be 1. And you
00:05:55.400 --> 00:05:58.400
can verify the other one.
00:05:58.400 --> 00:06:02.889
If you find out the marginal distribution
of the random variable X, you will find that the
00:06:02.889 --> 00:06:07.050
marginal density of this random variable
is going to be lambda times e power minus
00:06:07.050 --> 00:06:12.370
lambda x, and similarly if you find out the
marginal distribution of Y, you
00:06:12.370 --> 00:06:19.110
will get mu times e power minus mu y, and if you cross-check
that the product is going to be the joint
00:06:19.110 --> 00:06:23.389
probability density function, then you can
conclude that both random variables are independent
00:06:23.389 --> 00:06:24.639
random variables.
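A small numerical sketch of these checks, that the density integrates to 1 and that the joint factors into the two exponential marginals; the parameter values lambda = 2 and mu = 3 are hypothetical:

```python
import math

# Joint density from the lecture: f(x, y) = lam * mu * exp(-lam*x - mu*y),
# x, y > 0. The parameter values lam = 2.0 and mu = 3.0 are hypothetical.
lam, mu = 2.0, 3.0

def f_joint(x, y):
    return lam * mu * math.exp(-lam * x - mu * y)

def f_X(x):      # marginal of X: exponential with rate lam
    return lam * math.exp(-lam * x)

def f_Y(y):      # marginal of Y: exponential with rate mu
    return mu * math.exp(-mu * y)

# Independence: the joint equals the product of the marginals everywhere.
for x, y in [(0.1, 0.2), (1.0, 0.5), (2.5, 3.0)]:
    assert abs(f_joint(x, y) - f_X(x) * f_Y(y)) < 1e-12

# Crude midpoint-rule integration of f over (0, H)^2; should be close to 1,
# since the tail mass beyond H = 10 is negligible for these rates.
steps, H = 500, 10.0
h = H / steps
total = sum(f_joint((i + 0.5) * h, (j + 0.5) * h) * h * h
            for i in range(steps) for j in range(steps))
assert abs(total - 1) < 1e-3
```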
00:06:24.639 --> 00:06:31.550
Similarly, for the discrete example, you can find out the marginal
distribution of the random variable X, and similarly the
00:06:31.550 --> 00:06:38.520
marginal distribution of Y; if you cross-check
the similar property of independence, and
00:06:38.520 --> 00:06:43.490
that is satisfied, then you can conclude
that here the random variables X and Y are both
00:06:43.490 --> 00:06:49.920
discrete as well as independent random
variables. So the advantage with independent
00:06:49.920 --> 00:06:57.110
random variables: from the joint you
can always find out the marginals.
00:06:57.110 --> 00:07:02.889
But if you have the marginals, you cannot find
out the joint unless they are
00:07:02.889 --> 00:07:08.330
independent random variables. Therefore,
independent random variables make it easier
00:07:08.330 --> 00:07:14.970
to find out the joint distribution from the
provided marginal distributions. And here is
00:07:14.970 --> 00:07:25.900
one simple example: the bivariate normal
distribution.
00:07:25.900 --> 00:07:31.889
In which both the random variables X and
Y are normally distributed; therefore, the
00:07:31.889 --> 00:07:35.580
joint distribution together is going to be
of the following form.
00:07:35.580 --> 00:07:46.400
Let me write the joint probability density
function of the two-dimensional normal
00:07:46.400 --> 00:07:57.610
random variable as 1 divided by 2 pi sigma
1 sigma 2, multiplied by square root of 1
00:07:57.610 --> 00:08:10.750
minus rho square, into e power minus half times
1 over 1 minus rho square, multiplied by x minus mu
00:08:10.750 --> 00:08:26.850
1 by sigma 1 whole square, minus 2 times rho
into x minus mu 1 by sigma 1, multiplied
00:08:26.850 --> 00:08:42.169
by y minus mu 2 by sigma 2 plus y minus mu
2 by sigma 2 whole square.
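The density just dictated, written out in standard notation, is:

```latex
f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1-\rho^2}}
\exp\!\left\{ -\frac{1}{2(1-\rho^2)}
\left[ \left(\frac{x-\mu_1}{\sigma_1}\right)^{\!2}
- 2\rho \,\frac{x-\mu_1}{\sigma_1}\,\frac{y-\mu_2}{\sigma_2}
+ \left(\frac{y-\mu_2}{\sigma_2}\right)^{\!2} \right] \right\}
```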
00:08:42.169 --> 00:08:47.569
So here, if you find the marginal distribution
of the random variable X and the marginal distribution
00:08:47.569 --> 00:08:55.570
of Y, you can conclude X is going to be normally
distributed with mean mu 1 and variance
00:08:55.570 --> 00:09:01.949
sigma 1 square, and similarly you can come
to the conclusion that Y is also normally distributed
00:09:01.949 --> 00:09:08.470
with mean mu 2 and variance sigma
2 square. That means, if you make the plot
00:09:08.470 --> 00:09:13.029
of the joint probability density function,
it will be of this shape.
00:09:13.029 --> 00:09:18.000
One axis is X and one is Y, and this is
going to be the joint probability density
00:09:18.000 --> 00:09:26.130
function for fixed values of mu 1, mu 2,
sigma 1, and sigma 2, and this is going to be
00:09:26.130 --> 00:09:31.149
the joint probability density function. Here,
rho is nothing but the correlation coefficient.
00:09:31.149 --> 00:09:37.290
That means, the way the random variables
X and Y are correlated comes into the
00:09:37.290 --> 00:09:43.339
picture when you give the joint probability
density function of this random variable.
00:09:43.339 --> 00:09:48.550
And they are not independent random variables
unless rho is zero.
00:09:48.550 --> 00:09:54.949
So if rho is zero, then it
gets simplified, and you can verify that
00:09:54.949 --> 00:10:01.790
the joint probability density function will
be the product of two probability density
00:10:01.790 --> 00:10:07.579
functions, and each one is going to be a probability
density of a normal distribution with mean
00:10:07.579 --> 00:10:12.160
mu 1 and variance sigma 1 square, and mu
2 and variance sigma 2 square, respectively.
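The rho = 0 factorization can be verified pointwise; a minimal sketch with hypothetical parameters mu1 = 0, mu2 = 1, sigma1 = 1, sigma2 = 2:

```python
import math

# Bivariate normal density as dictated in the lecture; the parameter values
# mu1 = 0, mu2 = 1, s1 = 1, s2 = 2 are hypothetical.
mu1, mu2, s1, s2 = 0.0, 1.0, 1.0, 2.0

def biv_normal(x, y, rho):
    u = (x - mu1) / s1
    v = (y - mu2) / s2
    q = u * u - 2 * rho * u * v + v * v
    norm = 2 * math.pi * s1 * s2 * math.sqrt(1 - rho ** 2)
    return math.exp(-q / (2 * (1 - rho ** 2))) / norm

def normal_pdf(t, mu, s):
    return math.exp(-((t - mu) / s) ** 2 / 2) / (s * math.sqrt(2 * math.pi))

# With rho = 0 the joint density factors into the two normal marginals.
for x, y in [(0.0, 0.0), (1.5, -0.5), (-2.0, 3.0)]:
    assert abs(biv_normal(x, y, 0.0)
               - normal_pdf(x, mu1, s1) * normal_pdf(y, mu2, s2)) < 1e-12
```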
00:10:12.160 --> 00:10:18.700
So this bivariate normal distribution is a very
important one when you discuss the multi
00:10:18.700 --> 00:10:24.750
variate normal distribution. Here we can
only give the joint probability density
00:10:24.750 --> 00:10:31.670
function of the bivariate; for the multivariate,
you can visualize how the joint probability
00:10:31.670 --> 00:10:38.860
density function will look like, and the
way the other factors will come into the
00:10:38.860 --> 00:10:41.410
picture.
00:10:41.410 --> 00:10:53.779
So, other than covariance and the correlation
coefficient, we need another object, called the covariance
00:10:53.779 --> 00:10:57.619
matrix also.
00:10:57.619 --> 00:11:02.689
Because in the stochastic process we are
going to consider n-dimensional random variables
00:11:02.689 --> 00:11:07.220
as well as sequences of random variables,
so you should know how to define the covariance
00:11:07.220 --> 00:11:13.579
matrix of an n-dimensional random variable. That
means, suppose you have n random variables
00:11:13.579 --> 00:11:26.490
X1 to Xn; then you can define the covariance
matrix as follows: you label the rows X1 to Xn and
00:11:26.490 --> 00:11:30.829
the columns also X1 to Xn; now you can
fill it up.
00:11:30.829 --> 00:11:38.410
This is going to be an n x n matrix in which
each entry is going to be a covariance;
00:11:38.410 --> 00:11:45.970
that means the (i, j) entry of the matrix is
nothing but the covariance of the
00:11:45.970 --> 00:11:52.970
random variable Xi with Xj. You know, from
the way I have given the definition of the covariance
00:11:52.970 --> 00:12:00.589
of Xi and Xj, that if i and j are the same, then it
is nothing but E of Xi square minus E of Xi whole
00:12:00.589 --> 00:12:06.089
square. Therefore, it is nothing but the
variance of that random variable.
00:12:06.089 --> 00:12:10.540
Therefore, this is going to be the variance of
X1, and this is going to be the variance of
00:12:10.540 --> 00:12:17.809
X2. Therefore, all the diagonal elements are
going to be the variances of Xi. Whereas, other
00:12:17.809 --> 00:12:22.339
than the diagonal elements, we can fill up:
this is going to be the covariance of X1
00:12:22.339 --> 00:12:30.209
with X2, and so on; the last element of the row will
be the covariance of X1 with Xn. Similarly, the second
00:12:30.209 --> 00:12:34.709
row, first column will be the covariance of X2
with X1.
00:12:34.709 --> 00:12:41.730
You can use the other property: the covariance
of Xi, Xj is the same as the covariance of Xj with
00:12:41.730 --> 00:12:47.309
Xi. Because you are trying to find out
the expectation of X times Y minus the expectation
00:12:47.309 --> 00:12:55.000
of X times the expectation of Y. Therefore,
the covariance of X2 with X1 is the same as that of X1 with
00:12:55.000 --> 00:12:59.679
X2. So whatever the value
you are going to get, it is going to be a
00:12:59.679 --> 00:13:04.550
symmetric matrix, and all the diagonal elements
are going to be the variances.
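A minimal sketch of building such a covariance matrix from data, using the definition Cov(Xi, Xj) = E[Xi Xj] - E[Xi] E[Xj]; the small data set below is hypothetical:

```python
# Covariance matrix of n = 3 variables from a tiny hypothetical data set,
# using Cov(Xi, Xj) = E[Xi Xj] - E[Xi] E[Xj] (population form).
data = [            # each row is one observation of (X1, X2, X3)
    [1.0, 2.0, 0.5],
    [2.0, 1.0, 1.5],
    [3.0, 4.0, 2.5],
    [4.0, 3.0, 3.5],
]
m, n = len(data), len(data[0])
mean = [sum(row[i] for row in data) / m for i in range(n)]

cov = [[sum(row[i] * row[j] for row in data) / m - mean[i] * mean[j]
        for j in range(n)] for i in range(n)]

for i in range(n):
    # Diagonal entries are the variances Var(Xi).
    var_i = sum((row[i] - mean[i]) ** 2 for row in data) / m
    assert abs(cov[i][i] - var_i) < 1e-12
    for j in range(n):
        assert abs(cov[i][j] - cov[j][i]) < 1e-12   # the matrix is symmetric
```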
00:13:04.550 --> 00:13:09.869
So, the way I have given the two-dimensional
normal distribution, that is the bivariate normal;
00:13:09.869 --> 00:13:16.209
suppose you have an n-dimensional random vector
in which each random variable has a normal
00:13:16.209 --> 00:13:24.129
distribution; then you need the covariance
matrix for that; only then are you able to
00:13:24.129 --> 00:13:28.899
write the joint probability density
function of the n-dimensional random variable.