WEBVTT
Kind: captions
Language: en
00:00:22.330 --> 00:00:34.140
This is Stochastic Processes, Module 8: Renewal
Processes, Lecture 2: Generalized Renewal
00:00:34.140 --> 00:00:38.070
Processes and Renewal Limit Theorems.
00:00:38.070 --> 00:00:45.940
In Lecture 1, we discussed the definition of a
renewal process and its properties.
00:00:45.940 --> 00:00:55.880
Following the definition of the renewal process, we
discussed the renewal function, and then
00:00:55.880 --> 00:01:03.010
we discussed the renewal equation, and
we also saw a few examples in Lecture
00:01:03.010 --> 00:01:04.120
1.
00:01:04.120 --> 00:01:11.450
In Lecture 2, we plan to discuss
the renewal reward process.
00:01:11.450 --> 00:01:20.690
As a special case, we will discuss
Markov reward models.
00:01:20.690 --> 00:01:28.690
Then we will discuss two further generalized
renewal processes, namely the alternating renewal
00:01:28.690 --> 00:01:31.800
process and the delayed renewal process.
00:01:31.800 --> 00:01:38.750
With this, we complete the generalized
renewal processes.
00:01:38.750 --> 00:01:47.240
In the second half, we will discuss the
central limit theorem for renewal processes, long-run properties,
00:01:47.240 --> 00:01:52.300
and three important renewal limit theorems.
00:01:52.300 --> 00:01:58.460
What is a renewal reward process?
00:01:58.460 --> 00:02:09.560
Let X1 be the time to the first renewal, and
let Xn be the time between the (n-1)th renewal
00:02:09.560 --> 00:02:12.989
and the nth renewal.
00:02:12.989 --> 00:02:19.890
This is the same definition we used for
the renewal process; also, assume that the
00:02:19.890 --> 00:02:28.299
Xi's are i.i.d. random variables with
CDF F(x).
00:02:28.299 --> 00:02:36.140
Since the Xi's are inter-arrival times, the
mean will be positive.
00:02:36.140 --> 00:02:41.600
That is, the mean exists and is positive.
00:02:41.600 --> 00:02:53.400
Let Rn be the reward earned at the time of the
nth renewal.
00:02:53.400 --> 00:03:02.940
For each renewal, we attach a reward
at the time of the renewal.
00:03:02.940 --> 00:03:08.750
In general, Rn may depend on Xn.
00:03:08.750 --> 00:03:14.270
Now we define a new random variable
R(t), a function of t, which is the
00:03:14.270 --> 00:03:31.819
summation of the Rn's, where n runs from
1 to N(t) and N(t) is a renewal process.
00:03:31.819 --> 00:03:36.680
The collection of R(t) for t greater than
or equal to zero is called a Renewal
00:03:36.680 --> 00:03:37.680
Reward Process.
00:03:37.680 --> 00:03:44.050
So R(t) is simply the total reward earned by
time t.
00:03:44.050 --> 00:03:53.160
So, starting from the renewal process, we attach
a reward to each renewal, and by defining
00:03:53.160 --> 00:04:01.430
R(t) as the summation, from n equal to
1 to N(t), of Rn, we obtain the Renewal
00:04:01.430 --> 00:04:03.620
Reward Process.
00:04:03.620 --> 00:04:11.879
Note that, unlike the Xn's, each Rn may take
negative values as well as positive
00:04:11.879 --> 00:04:13.629
values.
00:04:13.629 --> 00:04:22.069
The Xi's are the inter-arrival
times of the renewals, whereas the rewards may
00:04:22.069 --> 00:04:28.630
take negative values as well as positive values.
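The definition above lends itself to a quick simulation. The following is an editorial sketch, not part of the lecture; the function name, the Exp(2) inter-arrival times, and the reward rule Rn = 3*Xn - 1 (which may be negative, as noted above) are all illustrative choices:

```python
import random

def simulate_renewal_reward(t_end, draw_x, draw_r, seed=0):
    """Simulate one path of a renewal reward process up to time t_end.

    draw_x(rng) samples an inter-arrival time X_n; draw_r(rng, x) samples
    the reward R_n, which is allowed to depend on X_n and may be negative.
    Returns (N(t_end), R(t_end)).
    """
    rng = random.Random(seed)
    time_now, n, total_reward = 0.0, 0, 0.0
    while True:
        x = draw_x(rng)
        if time_now + x > t_end:          # next renewal would fall past t_end
            break
        time_now += x
        n += 1
        total_reward += draw_r(rng, x)    # reward attached at the renewal epoch
    return n, total_reward

# Illustrative choices: Exp(rate=2) inter-arrival times, reward R_n = 3*X_n - 1.
n, r = simulate_renewal_reward(
    10_000.0,
    draw_x=lambda rng: rng.expovariate(2.0),
    draw_r=lambda rng, x: 3.0 * x - 1.0,
)
# The long-run average reward R(t)/t should be near E[R]/E[X] = (3*0.5 - 1)/0.5 = 1.
print(r / 10_000.0)
```

For these choices E[X] = 0.5 and E[R] = 3*0.5 - 1 = 0.5, so R(t)/t settles near 1 for large t, consistent with the long-run ratio E[R]/E[X] used later in this lecture.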
00:04:28.630 --> 00:04:35.490
See the sample path of a reward renewal process.
00:04:35.490 --> 00:04:38.449
So the x-axis is the time.
00:04:38.449 --> 00:04:45.820
The y-axis is N(t).
00:04:45.820 --> 00:04:54.270
And these are the time points at which
the renewals take place.
00:04:54.270 --> 00:04:57.310
The first renewal takes place, then the second,
and so on.
00:04:57.310 --> 00:05:09.650
So these are the inter-arrival times, and we attach
a reward Rn to each Xn; these rewards
00:05:09.650 --> 00:05:22.550
may be negative or positive. The collection
of rewards, combined as R(t) equal to the summation
00:05:22.550 --> 00:05:29.160
from n equal to 1 to N(t) of Rn, gives the renewal
reward process.
00:05:29.160 --> 00:05:35.159
It is difficult to draw the sample
path of R(t) itself.
00:05:35.159 --> 00:05:40.499
So here we show the sample path of
the renewal process with a reward attached
00:05:40.499 --> 00:05:50.520
to each Xi.
00:05:50.520 --> 00:05:56.470
Now we move on to a simple example
of a renewal reward process.
00:05:56.470 --> 00:06:02.069
Consider an age replacement model.
00:06:02.069 --> 00:06:13.849
In this model, a component is used continuously
and is replaced as needed.
00:06:13.849 --> 00:06:21.919
Let X be the lifetime of the component, which
is random with the distribution function F.
00:06:21.919 --> 00:06:29.819
The component is replaced by a new one upon
failure or at a fixed time period capital
00:06:29.819 --> 00:06:33.169
T, whichever comes first.
00:06:33.169 --> 00:06:44.009
This policy is called an age replacement
because the component is replaced by a new
00:06:44.009 --> 00:06:51.830
one when its age reaches T, even if it has
not yet failed.
00:06:51.830 --> 00:07:05.610
The cost of a new component is c1, and the
additional cost incurred by a failure is
00:07:05.610 --> 00:07:07.050
c2.
00:07:07.050 --> 00:07:12.300
Our interest is to find the long-run average
cost.
00:07:12.300 --> 00:07:20.369
Let N(t) be the number of replacements of
components by time t.
00:07:20.369 --> 00:07:23.710
N(t) is a renewal process.
00:07:23.710 --> 00:07:29.409
Let R(t) be the amount of cost incurred by
time t.
00:07:29.409 --> 00:07:33.259
So this is a renewal reward process.
00:07:33.259 --> 00:07:45.020
So the long-run average cost can be expressed
as the ratio of the expectation of R to the expectation
00:07:45.020 --> 00:07:48.740
of X; we will justify this later.
00:07:48.740 --> 00:07:55.210
Now we use the result that the long-run average cost
is the expectation of R divided by the expectation
00:07:55.210 --> 00:08:00.129
of X.
The expectation of R is the expected
00:08:00.129 --> 00:08:07.860
cost during a cycle, and the expectation of
X is the expected cycle
00:08:07.860 --> 00:08:09.099
length.
00:08:09.099 --> 00:08:15.990
Now one can find the expectation of the cycle
length. The cycle length is a random variable,
00:08:15.990 --> 00:08:27.199
namely the minimum of X and T.
00:08:27.199 --> 00:08:34.250
Here the component is replaced either upon
a failure or at age T. The time between
00:08:34.250 --> 00:08:38.950
replacements is called a cycle.
00:08:38.950 --> 00:08:48.649
So the expected cycle length is the integral from
0 to T of (1 - F(x)) dx.
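The identity E[min(X, T)] = integral from 0 to T of (1 - F(x)) dx can be checked numerically. A minimal editorial sketch, assuming an Exp(1) lifetime and T = 0.5 (the lecture keeps F generic):

```python
import math, random

# For an Exp(lam) lifetime, 1 - F(x) = exp(-lam*x), so the integral from
# 0 to T of (1 - F(x)) dx is (1 - exp(-lam*T)) / lam.
lam, T = 1.0, 0.5
analytic = (1.0 - math.exp(-lam * T)) / lam

# Monte Carlo estimate of E[min(X, T)] over many simulated lifetimes.
rng = random.Random(0)
n = 500_000
monte_carlo = sum(min(rng.expovariate(lam), T) for _ in range(n)) / n

print(analytic)       # 1 - exp(-0.5) = 0.3934...
print(monte_carlo)    # close to the analytic value
```

The two numbers agree to a few decimal places, illustrating that the integral of the survival function over [0, T] is exactly the expected cycle length.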
00:08:48.649 --> 00:08:56.940
Given that, consider the reward, or cost, during a
cycle, R. R will be c1 if X is greater
00:08:56.940 --> 00:09:07.300
than T, that is, if the component survives beyond
T. If it fails before T, there is an additional cost c2.
00:09:07.300 --> 00:09:17.680
Therefore, the cost will be c1 + c2 if failure
occurs before the fixed time T.
00:09:17.680 --> 00:09:27.930
Hence, the expected
cost is: c1 with the probability that X is
00:09:27.930 --> 00:09:36.470
greater than T, plus c1 + c2 with
the probability that X is less than or equal to
00:09:36.470 --> 00:09:42.120
T. That works out to c1 + c2 times F(T).
00:09:42.120 --> 00:09:50.130
Hence, the long-run average cost is
the expectation of R divided by the expectation
00:09:50.130 --> 00:09:51.430
of X.
00:09:51.430 --> 00:10:01.020
That is, the expected cost during
a cycle divided by the expected cycle length.
00:10:01.020 --> 00:10:02.020
Substituting the values,
00:10:02.020 --> 00:10:06.910
you get the long-run average cost: (c1 + c2 F(T)) divided by the integral from 0 to T of (1 - F(x)) dx.
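Putting the pieces together, the cost-rate formula can be evaluated and cross-checked by simulation. An editorial sketch, again assuming an Exp(lam) lifetime; the function names and the particular values c1 = 1, c2 = 4, T = 0.5, lam = 1 are hypothetical:

```python
import math, random

def long_run_cost_rate(c1, c2, T, lam):
    """Analytic long-run average cost (c1 + c2*F(T)) / E[min(X, T)] for an
    Exp(lam) lifetime, where F(T) = 1 - exp(-lam*T) and
    E[min(X, T)] = (1 - exp(-lam*T)) / lam."""
    F_T = 1.0 - math.exp(-lam * T)
    return (c1 + c2 * F_T) / (F_T / lam)

def simulated_cost_rate(c1, c2, T, lam, cycles=200_000, seed=1):
    """Monte Carlo check: run many replacement cycles and divide
    total cost by total elapsed time."""
    rng = random.Random(seed)
    total_cost = total_time = 0.0
    for _ in range(cycles):
        x = rng.expovariate(lam)       # lifetime of this component
        if x < T:                      # failed before the planned replacement
            total_cost += c1 + c2
            total_time += x
        else:                          # replaced preventively at age T
            total_cost += c1
            total_time += T
    return total_cost / total_time

print(long_run_cost_rate(1.0, 4.0, 0.5, 1.0))
print(simulated_cost_rate(1.0, 4.0, 0.5, 1.0))
```

The simulated rate converges to the analytic one, which is the renewal reward argument of this example in action: cost per cycle over cycle length.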
00:10:06.910 --> 00:10:22.280
Now we move to a special case of
a renewal reward process: the Markov
00:10:22.280 --> 00:10:25.590
Reward Model.
00:10:25.590 --> 00:10:33.240
A Markov Reward Model is a labeled continuous-time
Markov chain augmented with state reward
00:10:33.240 --> 00:10:38.460
and impulse reward structures.
00:10:38.460 --> 00:10:48.120
The state reward structure is a function r
that assigns to each state s a reward rs such
00:10:48.120 --> 00:11:03.130
that if t time units are spent in state
s, a reward of rs times t is acquired.
00:11:03.130 --> 00:11:09.740
The rewards that are defined in the state
reward structure can be interpreted in various
00:11:09.740 --> 00:11:10.990
ways.
00:11:10.990 --> 00:11:21.140
They can be regarded as the gain or benefit
acquired by staying in some state, or they
00:11:21.140 --> 00:11:29.180
can be regarded as the cost incurred by
staying in some state. This type of Markov
00:11:29.180 --> 00:11:37.020
reward model is called a rate-based Markov
reward model.
00:11:37.020 --> 00:11:42.420
The impulse reward structure, on the other
hand, is a function, call it iota, that
00:11:42.420 --> 00:11:50.750
assigns to each transition from s to s', where
s and s' belong to the state space
00:11:50.750 --> 00:12:01.820
S and the transition rate from s
to s' is positive, a reward iota(s, s'), such that
00:12:01.820 --> 00:12:15.150
whenever the transition from s to
s' occurs, a reward of iota(s, s') is acquired.
00:12:15.150 --> 00:12:20.390
Similar to the state reward structure, the
impulse reward structure can be interpreted
00:12:20.390 --> 00:12:23.260
in various ways.
00:12:23.260 --> 00:12:31.760
An impulse reward can be considered as the
cost of taking a transition or the gain that
00:12:31.760 --> 00:12:40.050
is acquired by taking the transition.
00:12:40.050 --> 00:12:46.780
Markov reward models are commonly used for
the performance, dependability and performability
00:12:46.780 --> 00:12:50.750
analysis of computer and communication systems.
00:12:50.750 --> 00:12:58.670
In general, the reward rate is assigned on
the basis of desired measures.
00:12:58.670 --> 00:13:08.480
Let Z(t) = r_X(t) be the instantaneous
reward rate of the Markov reward model at
00:13:08.480 --> 00:13:09.980
time t.
00:13:09.980 --> 00:13:17.730
Then the expected instantaneous reward rate
at time t is given by: the expectation of Z(t)
00:13:17.730 --> 00:13:23.680
is the summation over i of ri Pi(t).
00:13:23.680 --> 00:13:30.470
The expected reward rate in steady state is:
the expectation of Z is the summation over i
00:13:30.470 --> 00:13:36.250
of ri πi, where πi is the steady-state probability of state i.
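To make E[Z] = sum of ri πi concrete, here is a minimal editorial sketch (not the lecturer's example): a two-state up/down CTMC whose stationary distribution is known in closed form; the failure rate lam and repair rate mu are hypothetical.

```python
def expected_steady_state_reward(rates, rewards):
    """E[Z] = sum_i r_i * pi_i for a two-state up/down CTMC with
    failure rate lam (up -> down) and repair rate mu (down -> up).
    Its stationary distribution is pi_up = mu/(lam+mu),
    pi_down = lam/(lam+mu)."""
    lam, mu = rates
    pi = [mu / (lam + mu), lam / (lam + mu)]   # [pi_up, pi_down]
    return sum(r * p for r, p in zip(rewards, pi))

# Availability: reward 1 in the up state, 0 in the down state.
lam, mu = 0.1, 1.0
avail = expected_steady_state_reward((lam, mu), [1.0, 0.0])
print(avail)   # mu/(lam+mu) = 1.0/1.1 ≈ 0.9090...
```

With rewards (1, 0) the expected steady-state reward rate is exactly the steady-state availability, which is the assignment described next in the lecture.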
00:13:36.250 --> 00:13:42.240
Suppose, in an availability model,
our interest is to find the availability
00:13:42.240 --> 00:13:50.910
of the system. Then one can assign reward
1 to the up states and reward 0 to the
00:13:50.910 --> 00:13:52.870
down states.
00:13:52.870 --> 00:14:02.620
Then the expected reward rate will be the
availability of the system, obtained by summing
00:14:02.620 --> 00:14:12.370
the ri's weighted by the probabilities of being in those
states.
00:14:12.370 --> 00:14:18.660
Different types of measures, namely steady-state
measures, transient measures, cumulative measures,
00:14:18.660 --> 00:14:24.400
and performability measures, are supported
by Markov reward models.
00:14:24.400 --> 00:14:32.320
For instance, steady-state measures are computed
from the steady-state behavior of the Markov
00:14:32.320 --> 00:14:35.230
chain.
00:14:35.230 --> 00:14:40.750
So, for the availability model, if you know
the steady-state behavior of the Markov chain,
00:14:40.750 --> 00:14:47.740
such as the steady-state probability of being
in each state, then by assigning rewards of
00:14:47.740 --> 00:14:56.860
1 to the up states and 0 to the down states,
one can get the steady-state availability
00:14:56.860 --> 00:14:58.240
of the system.
00:14:58.240 --> 00:15:05.710
Similarly, by reversing the rewards, you can
get the steady-state unavailability of the
00:15:05.710 --> 00:15:07.650
system as well.
00:15:07.650 --> 00:15:17.380
Under the assumption that the steady-state
distribution of X(t) exists, one can find the steady-state
00:15:17.380 --> 00:15:25.220
measures by multiplying the ri's with the πi's,
for i belonging to the state space
00:15:25.220 --> 00:15:28.780
S.
Similarly, the cumulative measures, denoted
00:15:28.780 --> 00:15:38.450
by Y(t) express the overall gain that is received
from a system over some finite time interval.
00:15:38.450 --> 00:15:45.050
When the transient measures are integrated over
the time interval 0 to t, Y(t) can be
00:15:45.050 --> 00:15:50.490
given as the integral from 0 to t of r_X(s) ds.
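The cumulative measure Y(t) can also be illustrated by simulation. An editorial sketch, not from the lecture, using the same hypothetical two-state up/down chain (rates lam and mu are illustrative) and accumulating the reward rate r_X(s) over each sojourn:

```python
import random

def cumulative_reward(t_end, lam, mu, rewards, seed=2):
    """Simulate a two-state up/down CTMC (failure rate lam, repair rate mu)
    and accumulate Y(t_end) = integral over [0, t_end] of r_X(s) ds:
    reward rate rewards[state] is earned per unit of time in that state."""
    rng = random.Random(seed)
    state, t, y = 0, 0.0, 0.0            # state 0 = up, 1 = down
    while t < t_end:
        hold = rng.expovariate(lam if state == 0 else mu)
        stay = min(hold, t_end - t)      # truncate the last sojourn at t_end
        y += rewards[state] * stay
        t += stay
        state = 1 - state
    return y

# With rewards (1, 0), Y(t)/t is the fraction of time spent up; for large t
# it approaches the steady-state availability mu/(lam+mu).
y = cumulative_reward(50_000.0, lam=0.1, mu=1.0, rewards=(1.0, 0.0))
print(y / 50_000.0)
```

This ties the cumulative measure back to the steady-state one: Y(t)/t converges to the expected steady-state reward rate sum of ri πi.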