Which of the Following Is a Potential 95% Prediction Interval for the Adult Height of the Baby

March 11, 2022

Prediction Interval

Thus, the prediction interval is of more interest to the consumer of the next bottle of milk, whereas the confidence interval is of more interest to the dairy.

From: Biostatistics (Second Edition), 2007

Regression

Sheldon M. Ross, in Introduction to Probability and Statistics for Engineers and Scientists (Sixth Edition), 2021

9.4.4 Prediction interval of a future response

It is frequently the case that it is more important to estimate the actual value of a future response rather than its mean value. For instance, if an experiment is to be performed at temperature level $x_{0}$, then we would probably be more interested in predicting $Y (x_{0})$, the yield from this experiment, than we would be in estimating the expected yield, $E [Y (x_{0})] = α + β x_{0}$. (On the other hand, if a series of experiments were to be performed at input level $x_{0}$, then we would probably want to estimate $α + β x_{0}$, the mean yield.)

Suppose first that we are interested in a single value (as opposed to an interval) to use as a predictor of $Y (x_{0})$, the response at level $x_{0}$. Now, it is clear that the best predictor of $Y (x_{0})$ is its mean value $α + β x_{0}$. [Actually, this is not so immediately obvious, since one could argue that the best predictor of a random variable is (1) its mean, which minimizes the expected square of the difference between the predictor and the actual value; or (2) its median, which minimizes the expected absolute difference between the predictor and the actual value; or (3) its mode, which is the most likely value to occur. However, as the mean, median, and mode of a normal random variable are all equal, and the response is, by assumption, normally distributed, there is no doubt in this situation.] Since α and β are not known, it seems reasonable to use their estimators A and B and thus use $A + B x_{0}$ as the predictor of a new response at input level $x_{0}$.

Let us now suppose that rather than being concerned with determining a single value to predict a response, we are interested in finding a prediction interval that, with a given degree of confidence, will contain the response. To obtain such an interval, let Y denote the future response whose input level is $x_{0}$ and consider the probability distribution of the response minus its predicted value, that is, the distribution of $Y - A - B x_{0}$. Now,

$Y \sim N (α + β x_{0}, σ^{2})$

and, as was shown in Section 9.4.3,

$A + B x_{0} \sim N (α + β x_{0}, σ^{2} [\frac{1}{n} + \frac{(x_{0} - \overline{x})^{2}}{S_{xx}}])$

Hence, because Y is independent of the earlier data values $Y_{1}, Y_{2}, \dots, Y_{n}$ that were used to determine A and B, it follows that Y is independent of $A + B x_{0}$ and so

$Y - A - B x_{0} \sim N (0, σ^{2} [1 + \frac{1}{n} + \frac{(x_{0} - \overline{x})^{2}}{S_{xx}}])$

or, equivalently,

(9.4.6) $\frac{Y - A - B x_{0}}{σ \sqrt{\frac{n + 1}{n} + \frac{(x_{0} - \overline{x})^{2}}{S_{xx}}}} \sim N (0, 1)$

Now, using once again the result that $SS_{R}$ is independent of A and B (and also of Y) and

$\frac{SS_{R}}{σ^{2}} \sim χ_{n - 2}^{2}$

we obtain, by the usual argument, upon replacing $σ^{2}$ in Equation (9.4.6) by its estimator $SS_{R} / (n - 2)$, that

$\frac{Y - A - B x_{0}}{\sqrt{\frac{n + 1}{n} + \frac{(x_{0} - \overline{x})^{2}}{S_{xx}}} \sqrt{\frac{SS_{R}}{n - 2}}} \sim t_{n - 2}$

and so, for any value $a, 0 < a < 1$,

$P \left\{ - t_{a / 2, n - 2} < \frac{Y - A - B x_{0}}{\sqrt{\frac{n + 1}{n} + \frac{(x_{0} - \overline{x})^{2}}{S_{xx}}} \sqrt{\frac{SS_{R}}{n - 2}}} < t_{a / 2, n - 2} \right\} = 1 - a$

That is, we have just established the following.

Prediction interval for a response at the input level $x_{0}$. Based on the response values $Y_{i}$ corresponding to the input values $x_{i}, i = 1, 2, \dots, n$: With $100 (1 - a)$ percent confidence, the response Y at the input level $x_{0}$ will be contained in the interval

$A + B x_{0} \pm t_{a / 2, n - 2} \sqrt{[\frac{n + 1}{n} + \frac{(x_{0} - \overline{x})^{2}}{S_{xx}}] \frac{SS_{R}}{n - 2}}$

Example 9.4.f

In Example 9.4.c, suppose we want an interval that we can "be 95 percent certain" will contain the height of a given male whose father is 68 inches tall. A simple computation now yields the prediction interval

$Y (68) \in 67.568 \pm 1.050$

or, with 95 percent confidence, the person's height will be between 66.518 and 68.618. ■

Remarks (a) There is often some confusion about the difference between a confidence and a prediction interval. A confidence interval is an interval that does contain, with a given degree of confidence, a fixed parameter of interest. A prediction interval, on the other hand, is an interval that will contain, again with a given degree of confidence, a random variable of interest.

Inferences About	Use the Distributional Result
β	$\sqrt{\frac{(n - 2) S_{xx}}{SS_{R}}} (B - β) \sim t_{n - 2}$
α	$\sqrt{\frac{n (n - 2) S_{xx}}{\sum_{i} x_{i}^{2} \, SS_{R}}} (A - α) \sim t_{n - 2}$

Inferences About	Use the Distributional Result
α + β x₀	$\frac{A + B x_{0} - α - β x_{0}}{\sqrt{(\frac{1}{n} + \frac{(x_{0} - \overline{x})^{2}}{S_{xx}}) (\frac{SS_{R}}{n - 2})}} \sim t_{n - 2}$
Y(x₀)	$\frac{Y (x_{0}) - A - B x_{0}}{\sqrt{(1 + \frac{1}{n} + \frac{(x_{0} - \overline{x})^{2}}{S_{xx}}) (\frac{SS_{R}}{n - 2})}} \sim t_{n - 2}$

(b) One should not make predictions about responses at input levels that are far from those used to obtain the estimated regression line. For instance, the data of Example 9.4.c should not be used to predict the height of a male whose father is 42 inches tall.

URL: https://www.sciencedirect.com/science/article/pii/B9780128243466000181

Linear Regression

Sheldon M. Ross, in Introductory Statistics (4th Edition), 2017

12.7 Prediction Intervals for Future Responses

Suppose, in the linear regression model, that input values $x_{i}$ have led to the response values $y_{i}$, $i = 1, \dots, n$. The best prediction of the value of a new response at input $x_{0}$ is, of course, $\hat{α} + \hat{β} x_{0}$. However, rather than give a single number as the predicted value, it is often more useful to be able to present an interval that you predict, with a certain degree of confidence, will contain the response value. Such a prediction interval is given by the following.

Prediction interval for a response at input value $x_{0}$, based on the response values $y_{i}$ at the input values $x_{i}, i = 1, \dots, n$:

With 100( $1 - γ$ ) percent confidence, the response Y at the input value $x_{0}$ will lie in the interval

$\hat{α} + \hat{β} x_{0} \pm t_{n - 2, γ / 2} W$

where $t_{n - 2, γ / 2}$ is the 100( $1 - γ / 2$ )th percentile of the t distribution with $n - 2$ degrees of freedom, and

$W = \sqrt{[1 + \frac{1}{n} + \frac{(x_{0} - \overline{x})^{2}}{S_{xx}}] \frac{SS_{R}}{n - 2}}$

The quantities $\hat{α}, \hat{β}, \overline{x}, S_{xx}$, and $SS_{R}$ are all computed from the data $x_{i}, y_{i}, i = 1, \dots, n$.

Example 12.7

Using the data of Example 12.6, specify an interval that, with 95 percent confidence, will contain the adult height of a newborn son whose father is 70 inches tall.

Solution

From the output of Program 12-1, we obtain

$\begin{matrix} \hat{α} + 70 \hat{β} = 68.497 \\ W = 0.4659 \end{matrix}$

Since from Table D.2, $t_{8, 0.025} = 2.306$, we see that the 95 percent prediction interval of the height of the son of a 70-inch-tall man is

$68.497 \pm 2.306 (0.4659) = 68.497 \pm 1.074$

That is, we can be 95 percent confident that the son's height will be between 67.423 and 69.571 inches.

Example 12.8

A company that runs a hamburger concession at a college football stadium must decide on Monday how much to order for the game that is to be played on the following Saturday. The company bases its order on the number of tickets for the game that have already been sold by Monday. The following data give the advance ticket sales and the number of hamburgers purchased for each game played this year. All data are in units of 1000.

Advance ticket sales	Hamburgers sold
29.4	19.5
21.4	16.2
18.0	15.3
25.2	18.0
32.5	20.4
23.9	16.8

If 26,000 tickets have been sold by Monday for next Saturday's game, determine a 95 percent prediction interval for the amount of hamburgers that will be sold.

Solution

Running Program 12-1 gives the following output, if we request predicted future responses and the value of the input is 26.

The predicted response is 18.04578.

W = 0.3381453

Since $t_{4, 0.025} = 2.776$, we see from the output that the 95 percent prediction interval is

$18.046 \pm 2.776 (0.338) = 18.046 \pm 0.938$

That is, with 95 percent confidence, between 17,108 and 18,984 hamburgers will be sold.
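The calculation in Example 12.8 can be reproduced with a short script. This is only a sketch, not the book's Program 12-1: the function name `regression_prediction_interval` is made up for illustration, and the critical value 2.776 is taken from the text rather than computed.

```python
import math

def regression_prediction_interval(x, y, x0, t_crit):
    """Prediction interval for a new response at input x0 in simple linear regression.

    t_crit is the t percentile with n - 2 degrees of freedom, supplied by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    b = s_xy / s_xx                   # least squares slope
    a = y_bar - b * x_bar             # least squares intercept
    ss_r = s_yy - s_xy ** 2 / s_xx    # residual sum of squares SS_R
    predicted = a + b * x0
    w = math.sqrt((1 + 1 / n + (x0 - x_bar) ** 2 / s_xx) * ss_r / (n - 2))
    return predicted, w, (predicted - t_crit * w, predicted + t_crit * w)

# Data of Example 12.8, in units of 1000.
tickets = [29.4, 21.4, 18.0, 25.2, 32.5, 23.9]
hamburgers = [19.5, 16.2, 15.3, 18.0, 20.4, 16.8]

predicted, w, interval = regression_prediction_interval(tickets, hamburgers, 26.0, 2.776)
print(f"predicted = {predicted:.5f}, W = {w:.5f}")
print(f"95% prediction interval: ({interval[0]:.3f}, {interval[1]:.3f})")
```

Passing the critical value in as an argument keeps the sketch free of third-party dependencies; a statistics library could supply the t percentile instead.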

URL: https://www.sciencedirect.com/science/article/pii/B9780128043172000126

Confidence Intervals

Andrew F. Siegel, in Practical Business Statistics, 2012

9.5 Prediction Intervals

The confidence interval tells you where the population mean is, with known probability. This is fine if you are seeking a summary measure for a large population. If, on the other hand, you want to know about the observed value for an individual case, this confidence interval is not appropriate. Instead, you need a much wider interval that reflects not only the estimated uncertainty $S_{\bar{X}} = S / \sqrt{n}$ of $\bar{X}$ (which may be very small when n is large) but also the estimated uncertainty S of an individual observation.

The prediction interval allows you to use data from a sample to predict a new observation with known probability, provided you obtain this additional observation in the same way as you obtained your past data. The situation is as follows: You have a random sample of n units from a population and have measured each one to obtain $X_{1}, X_{2}, \dots, X_{n}$. You would now like to make a prediction about an additional unit randomly selected from the same population.

The uncertainty measure to use here is the standard error for prediction, a measure of variability of the distance between the sample average and the new observation. Two kinds of randomness are combined: for the sample average and for the new observation. This standard error for prediction is found by multiplying the standard deviation by the square root of (1 + 1/n):

Standard Error for Prediction

$S \sqrt{1 + \frac{1}{n}}$

The standard error for prediction is even larger than the estimator S of the variability of individuals in the population. This is appropriate because the prediction interval must combine the uncertainty of individuals in the population (as measured by S) together with the uncertainty of the sample average (as measured by $S_{\bar{X}} = S / \sqrt{n}$).

Once you have an estimator ( $\bar{X}$ ) and the standard error for prediction, you can form the prediction interval in much the same way as you form an ordinary confidence interval. The t value is found in the table in just the same way for a given prediction confidence level and sample size n (not including the additional observation, of course). Only the standard error is different; be sure to use the standard error for prediction in place of the standard error of the average.

The Prediction Interval for a New Observation

Two-sided

We are 95% certain that the new observation will be between

$\bar{X} - t S \sqrt{1 + 1 / n}$ and $\bar{X} + t S \sqrt{1 + 1 / n}$

One-sided

We are 95% sure that the new observation will be at least

$\bar{X} - t_{\text{one-sided}} S \sqrt{1 + 1 / n}$

We are 95% sure that the new observation will be no larger than

$\bar{X} + t_{\text{one-sided}} S \sqrt{1 + 1 / n}$

What does the figure 95% signify here? It is a probability according to the following random experiment: Get a random sample, find the prediction interval, get a new random observation, and see if the new observation falls in the interval. Note in particular that the 95% probability refers to drawing a new sample as well as a new observation. This is only natural; since one sample differs from another, the proportion of new observations that falls inside the prediction interval will also vary from one sample to another. Averaged over the randomness of the initial sample, the resulting probability is 95% (or another specified confidence level).
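The random experiment just described can be imitated by simulation. The sketch below is illustrative only: it assumes a standard normal population, a sample size of n = 8, and the tabled two-sided 95% t value of 2.365 for 7 degrees of freedom; the function name `estimate_coverage` and the seed are arbitrary choices.

```python
import math
import random

def estimate_coverage(n=8, t_crit=2.365, trials=20000, seed=0):
    """Fraction of trials in which a new observation falls inside the prediction interval."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Step 1: draw a random sample and compute its average and standard deviation.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x_bar = sum(sample) / n
        s = math.sqrt(sum((v - x_bar) ** 2 for v in sample) / (n - 1))
        # Step 2: form the two-sided prediction interval half-width.
        half_width = t_crit * s * math.sqrt(1 + 1 / n)
        # Step 3: draw a new observation and check whether the interval contains it.
        new_obs = rng.gauss(0.0, 1.0)
        if abs(new_obs - x_bar) <= half_width:
            hits += 1
    return hits / trials

estimated_coverage = estimate_coverage()
print(f"estimated coverage: {estimated_coverage:.3f}")
```

Each trial draws both a fresh sample and a fresh observation, so the reported fraction averages over the randomness of the initial sample, as the paragraph above describes.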

The following table summarizes when to use a prediction interval instead of a confidence interval.

When You Need to Learn About	Use
The population mean	Confidence interval
A new observation similar the others	Prediction interval

Example

How Long until Your Order Is Filled?

How long should you wait before ordering new supplies for production inventory? If you order too soon, you pay interest on the capital used to buy them while they sit around costing you rent for the warehouse space they occupy. If you order too late, then you risk being without necessary parts and bringing part of the production line to a halt.

The past 8 times that your supplier has said, "They'll be there in two weeks," you made a note of how many business days it actually took for them to arrive. These numbers were as follows:

$10, 9, 7, 10, 3, 9, 12, 5$

The average is $\bar{X}$ = 8.125 days, and the standard deviation is S = 2.94897 days. The standard error of the average is $S_{\bar{X}}$ = 1.04262 days, but we do not need it. The standard error for prediction is

$\begin{array}{rl} \text{Standard error for prediction} & = S \sqrt{1 + 1 / n} \\ & = 2.94897 \sqrt{1 + 1 / 8} \\ & = 2.94897 \sqrt{1.125} \\ & = 3.12786 \end{array}$

For a two-sided 95% prediction interval, the t value from the table for n = 8 is t = 2.365. The prediction interval extends from

$\begin{array}{rl} \bar{X} - t (\text{Standard error for prediction}) & = 8.125 - (2.365) (3.12786) \\ & = 0.728 \end{array}$

to

$\begin{array}{rl} \bar{X} + t (\text{Standard error for prediction}) & = 8.125 + (2.365) (3.12786) \\ & = 15.52 \end{array}$

You will be assuming that the delivery times are approximately normally distributed, that the n = 8 delivery times observed represent a random sample from the idealized population of "typical delivery times," and that the next delivery time is randomly selected from this same population. The final prediction interval statement is as follows:

We are 95% certain that the next delivery time will be somewhere between 0.7 and 15.5 days.

Why does this prediction interval extend over such a large range? This reflects the underlying uncertainty of the situation. In the past, based on your 8 observations, the delivery times have been quite variable. Naturally, this makes exact predictions difficult.

If you merely want to be assured that the next delivery time will not be too late, you may construct a one-sided prediction interval using t = 1.895 from the one-sided 95% confidence column in the t table. The upper limit is then

$\begin{array}{rl} \bar{X} + t (\text{Standard error for prediction}) & = 8.125 + (1.895) (3.12786) \\ & = 14.1 \end{array}$

You may then make the following one-sided prediction interval statement:

We are 95% certain that the next delivery time will be no more than 14.1 days.

If you are willing to accept a 90% one-sided prediction interval, then the upper limit (using t = 1.415) would be

$\begin{array}{rl} \bar{X} + t (\text{Standard error for prediction}) & = 8.125 + (1.415) (3.12786) \\ & = 12.6 \end{array}$

You would then make the following one-sided prediction interval statement:

We are 90% certain that the next delivery time will be no more than 12.6 days.
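The delivery-time example can be checked with a few lines of code. This is a minimal sketch using only the standard library; the variable names are illustrative, and the three t values are copied from the text rather than computed.

```python
import math
import statistics

delivery_days = [10, 9, 7, 10, 3, 9, 12, 5]
n = len(delivery_days)

mean = statistics.mean(delivery_days)       # sample average
s = statistics.stdev(delivery_days)         # sample standard deviation (n - 1 divisor)
se_prediction = s * math.sqrt(1 + 1 / n)    # standard error for prediction

# t values quoted in the text for n = 8 (7 degrees of freedom).
t_two_sided_95 = 2.365
t_one_sided_95 = 1.895
t_one_sided_90 = 1.415

two_sided = (mean - t_two_sided_95 * se_prediction,
             mean + t_two_sided_95 * se_prediction)
upper_95 = mean + t_one_sided_95 * se_prediction
upper_90 = mean + t_one_sided_90 * se_prediction

print(f"mean = {mean}, S = {s:.5f}, SE for prediction = {se_prediction:.5f}")
print(f"two-sided 95%: ({two_sided[0]:.3f}, {two_sided[1]:.2f})")
print(f"one-sided 95% upper limit: {upper_95:.1f}")
print(f"one-sided 90% upper limit: {upper_90:.1f}")
```

Only the multiplier changes between the two-sided and one-sided limits; the standard error for prediction is the same in all three cases.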

URL: https://www.sciencedirect.com/science/article/pii/B9780123852083000092

Some bug in statistical applications

Kandethody M. Ramachandran, Chris P. Tsokos, in Mathematical Statistics with Applications in R (Third Edition), 2021

14.5.1 A simple model for univariate data

Suppose that we have a data set that characterizes a phenomenon of interest. Suppose our problem is to create a statistical model for the data set in the form of a probability distribution from which the data set came. First we create a dot plot and summary of the basic statistics. The dot plot will provide us with an idea of the probability distribution of the data and any unusual behavior of the data that will not be apparent from the basic statistics such as sample mean and sample standard deviation. Having identified the probability distribution of the sample statistic, we can proceed to obtain 95% confidence limits on parameters such as the mean and variance. In addition, we can obtain a 95% prediction interval of the next observation using the following expression:

$\bar{y} \pm (t\text{-value}) \, s \sqrt{1 + \frac{1}{n}} .$

Note that the prediction interval is always wider than the corresponding confidence interval. The confidence interval provides a measure of reliability for estimating a parameter. The prediction interval provides a measure of reliability for the prediction of an observation. Thus, the prediction interval needs to account for estimation error as well as the natural variability of a single observation. These steps can be considered as the first modeling effort for univariate data. Note that if we have a small sample size, using a t value in the confidence interval and/or prediction interval supposes a modeling assumption of normality for the corresponding population. The preliminary verification of this is done by the dot plot. For more detailed verification of this modeling assumption, use the normal plots.

Example 14.5.1

Consider the following data from an experiment:

0.15	0.14	0.15	0.14	0.26	0.00	0.00	0.47	0.35	0.16
0.15	0.15	0.23	0.13	0.19	0.15	0.22	0.53	0.17	0.23
0.22	0.16	0.12	0.13	0.11	0.14	0.18	0.15	0.14	0.21
0.13	0.12	0.13	0.13	0.21	0.22	0.18	0.20	0.22	0.16
0.17	0.00	0.23	0.21	0.18	0.05	0.16	0.13	0.23	0.18
0.14	0.29	0.21	0.22	0.11	0.16	0.23	0.13	0.07	0.17
0.08	0.14	0.06	0.08	0.07	0.11	0.12	0.14	0.16	0.12
0.10	0.27	0.19	0.13	0.27	0.16	0.07	0.09	0.04	0.53
0.29	0.15	0.12	0.11	0.10	0.14	0.14	0.16	0.16	0.17
0.36	0.46	1.21	0.39	0.01	0.52	0.09	0.18	0.16	0.16
0.14	0.15	0.09	0.09	0.13	0.13	0.08	0.14	0.20	0.09
0.09	0.16	0.08	0.10	0.34	0.24	0.15	0.44	0.08	0.08
0.16	0.14	0.18	0.23	0.19	0.11	0.19	0.10	0.14	0.11
0.14	0.17	0.17	0.17	0.05	0.12	0.14	0.11	0.20	0.14
0.23	0.03	0.10	0.29	0.13	0.26	0.13	0.15	0.27	0.14
0.50	0.16	0.15	0.18	0.16	0.14	0.13	0.08	0.20	0.17
0.17	0.16	0.15	0.11	0.13	0.76	0.18	0.19	0.09	0.12
0.11	0.12	0.08	0.26	0.23	0.20	0.19	0.19	0.16	0.11
0.12	0.13	0.32	0.05	0.18	0.12	0.13	0.50	0.13	0.04
0.00	−0.11	0.18	0.15	0.14	0.15	0.02	0.20

(a): Create a dot plot.
(b): Calculate the basic statistics, sample mean, sample median, and sample standard deviation.
(c): Obtain a 95% confidence interval for the true mean.
(d): Obtain a 95% prediction interval.

Solution

(a): Each dot in Fig. 14.17 represents three points.

Figure 14.17. Dot plot of the data.
(b): We can use Minitab's describe command to obtain the following:

	N	Mean	Median	Tr Mean	StDev	SE mean
C1	198	0.17038	0.15121	0.15982	0.13610	0.00967
	Min	Max	Q1	Q3
	−0.39575	1.22076	0.12059	0.19284

(c): Again using Minitab commands, we can obtain (where data are stored in C1), MTB > ZInterval 95.0 0.136 c1.

The assumed σ = 0.136
	N	Mean	StDev	SE mean	95.0% CI
C1	198	0.17038	0.13610	0.00967	(0.15143, 0.18933)

(d): For the prediction interval use the large sample formula $\bar{y} \pm (z_{α / 2}) s \sqrt{1 + \frac{1}{n}}$ to obtain the 95% prediction interval for the next observation as (−0.097, 0.438).
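The large sample formula in part (d) can be evaluated directly from the summary statistics in the Minitab output. This is a rough sketch, not the book's R or Minitab code: it assumes the reported mean, standard deviation, and sample size, and takes the normal quantile from Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

# Summary statistics reported in the Minitab output above.
y_bar = 0.17038
s = 0.13610
n = 198

z = NormalDist().inv_cdf(0.975)             # upper 2.5% point of the standard normal
half_width = z * s * math.sqrt(1 + 1 / n)   # z * s * sqrt(1 + 1/n)
lower, upper = y_bar - half_width, y_bar + half_width

print(f"z = {z:.4f}, half-width = {half_width:.4f}")
print(f"95% prediction interval: ({lower:.4f}, {upper:.4f})")
```

Because the half-width exceeds the sample mean, the lower limit is negative; the interval is centered on the sample mean by construction.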

URL: https://www.sciencedirect.com/science/article/pii/B9780128178157000142

Some Problems in Statistical Applications

Kandethody M. Ramachandran, Chris P. Tsokos, in Mathematical Statistics with Applications in R (Second Edition), 2015

14.5.1 A Simple Model for Univariate Data

Suppose that we have a data set that characterizes a phenomenon of interest. Suppose our problem is to create a statistical model for the data set in the form of a probability distribution from which the data set came. First we create a dotplot and summary of the basic statistics. The dotplot will provide us with an idea of the probability distribution of the data and any unusual behavior of the data that will not be apparent from the basic statistics such as sample mean and sample standard deviation. Having identified the probability distribution of the sample statistic, we can proceed to obtain 95% confidence limits on parameters such as the mean and variance. In addition, we can obtain a 95% prediction interval of the next observation using the expression

$\bar{y} \pm (t\text{-value}) \, s \sqrt{1 + \frac{1}{n}} .$

Note that the prediction interval is always wider than the corresponding confidence interval. The confidence interval provides a measure of reliability for estimating a parameter. The prediction interval provides a measure of reliability for the prediction of an observation. Thus, the prediction interval needs to account for estimation error as well as the natural variability of a single observation. These steps can be considered as the first modeling effort for univariate data. Note that if we have a small sample size, using a t-value in the confidence interval and/or prediction interval supposes a modeling assumption of normality for the corresponding population. The preliminary verification of this is done by the dotplot. For more detailed verification of this modeling assumption, use the normal plots.

Example 14.5.1

Consider the following data from an experiment:

(a): Obtain a dotplot.
(b): Calculate the basic statistics, sample mean, sample median, and sample standard deviation.
(c): Obtain a 95% confidence interval for the true mean.
(d): Obtain a 95% prediction interval.

0.15	0.14	0.15	0.14	0.26	0.00	0.00	0.47	0.35	0.16
0.15	0.15	0.23	0.13	0.19	0.15	0.22	0.53	0.17	0.23
0.22	0.16	0.12	0.13	0.11	0.14	0.18	0.15	0.14	0.21
0.13	0.12	0.13	0.13	0.21	0.22	0.18	0.20	0.22	0.16
0.17	0.00	0.23	0.21	0.18	0.05	0.16	0.13	0.23	0.18
0.14	0.29	0.21	0.22	0.11	0.16	0.23	0.13	0.07	0.17
0.08	0.14	0.06	0.08	0.07	0.11	0.12	0.14	0.16	0.12
0.10	0.27	0.19	0.13	0.27	0.16	0.07	0.09	0.04	0.53
0.29	0.15	0.12	0.11	0.10	0.14	0.14	0.16	0.16	0.17
0.36	0.46	1.21	0.39	0.01	0.52	0.09	0.18	0.16	0.16
0.14	0.15	0.09	0.09	0.13	0.13	0.08	0.14	0.20	0.09
0.09	0.16	0.08	0.10	0.34	0.24	0.15	0.44	0.08	0.08
0.16	0.14	0.18	0.23	0.19	0.11	0.19	0.10	0.14	0.11
0.14	0.17	0.17	0.17	0.05	0.12	0.14	0.11	0.20	0.14
0.23	0.03	0.10	0.29	0.13	0.26	0.13	0.15	0.27	0.14
0.50	0.16	0.15	0.18	0.16	0.14	0.13	0.08	0.20	0.17
0.17	0.16	0.15	0.11	0.13	0.76	0.18	0.19	0.09	0.12
0.11	0.12	0.08	0.26	0.23	0.20	0.19	0.19	0.16	0.11
0.12	0.13	0.32	0.05	0.18	0.12	0.13	0.50	0.13	0.04
0.00	−0.11	0.18	0.15	0.14	0.15	0.02	0.20

Solution

(a)

Each dot in Figure 14.17 represents 3 points.

(b)

We can use Minitab's describe command to obtain the following.

	N	Mean	Median	TR mean	St. dev	SE mean
C1	198	0.17038	0.15121	0.15982	0.13610	0.00967
	Min	Max	Q1	Q3
	−0.39575	1.22076	0.12059	0.19284

(c)

Again using Minitab commands, we can obtain (where data are stored in C1), MTB > ZInterval 95.0 0.136 c1.

The assumed Sigma = 0.136
	N	Mean	STdev	SE mean	95.0% C.I.
C1	198	0.17038	0.13610	0.00967	(0.15143, 0.18933)

(d)

For the prediction interval use the large sample formula $\bar{y} \pm (z_{α / 2}) s \sqrt{1 + \frac{1}{n}}$ to obtain the 95% prediction interval for the next observation as (−0.097, 0.438).

URL: https://www.sciencedirect.com/science/article/pii/B978012417113800014X

Linear Regression

Ronald N. Forthofer, ... Mike Hernandez, in Biostatistics (Second Edition), 2007

13.3.2 Prediction Interval for Y | X

In the preceding section, we saw how to form the confidence interval for the mean of SBP for a height value. In this section, we shall form the prediction interval, the interval for a single observation. The prediction interval is of interest to a physician because the physician is examining a single person, not an entire community. How does the person's SBP value relate to the standard?

As we saw in Chapter 7 in the material on intervals based on the normal distribution, the prediction interval is wider than the corresponding confidence interval because we must add the individual variation about the mean to the mean's variation. Similarly, the formula for the prediction interval based on the regression equation adds the individual variation to the mean's variation. Thus, the estimated standard error for a single observation is

$\text{est. s.e.}(\hat{y}_{k}) = s_{Y | X} \sqrt{1 + \frac{1}{n} + \frac{(x_{k} - \bar{x})^{2}}{\sum_{i} (x_{i} - \bar{x})^{2}}} .$

The corresponding two-sided (1 − α)*100 percent prediction interval is

$\hat{y}_{k} \pm t_{n - 2, 1 - α / 2} \, \text{est. s.e.}(\hat{y}_{k}) .$

Figure 13.8 shows the 95 percent prediction interval for the data in Table 13.1. The prediction interval is much wider than the corresponding confidence interval because of the addition of the individual variation in the standard error term. The prediction interval here is about 60 mmHg wide. Note that most of the data points are inside the prediction interval band. Inclusion of the individual variation term has greatly reduced the effect of the $(x_{k} - \bar{x})^{2}$ term in the estimated standard error in this example. The upper and lower limits are essentially straight lines, in contrast to the shape of the upper and lower limits of the confidence interval.

Software packages can be used to perform the calculations necessary to create the 95 percent confidence and prediction intervals (see Program Note 13.2 on the website).

Example 13.1

We use the prediction interval to develop the standard for systolic blood pressure. Since we are only concerned about systolic blood pressures that may be too high, we shall use a one-sided prediction interval in the creation of the height-based standard for SBP for girls. The upper (1 − α) * 100 percent prediction interval for SBP is found from

$\hat{y}_{k} + t_{n - 2, 1 - α} \, \text{est. s.e.}(\hat{y}_{k}) .$

Because the standard is the value such that 95 percent of the SBP values fall below it and 5 percent of the values are greater than it, we shall use the upper 95 percent prediction interval to obtain the standard.

The data shown in Figure 13.8 can be used to help create the height-based standards for SBP. The difference between the one- and two-sided interval is the use of $t_{n - 2, 1 - α}$ in place of $t_{n - 2, 1 - α / 2}$. Thus, the amount to be added to $\hat{y}_{k}$ for the upper one-sided interval is just 0.834 (= $t_{48, 0.95} / t_{48, 0.975}$) times the amount added for the two-sided interval. To find the amount added for the two-sided interval, we subtract the predicted SBP value shown from the upper limit of the 95 percent prediction interval. For instance, for a girl 35 inches tall, the amount added, using the two-sided interval, is found by subtracting 88.05 (predicted value) from 118.50 (upper limit of the two-sided prediction interval). This yields a difference of 30.45 mmHg. If we multiply this difference by 0.834, we have the amount to add to the 88.05 value. Thus, the standard for a girl 35 inches tall is

$0.834 (118.50 - 88.05) + 88.05 = 113.45 \text{ mmHg} .$

Table 13.5 shows these calculations and the height-based standards for SBP for girls. As just shown, the calculations in Table 13.5 consist of taking column 2 minus column 3. This is stored in column 4. Column 5 contains 0.834 times column 4. The standard, column 6, is the sum of column 3 with column 5.

Table 13.5. Creation of height-based standards for SBP (mmHg) for girls.

$x_{k}$ (Inches) (1)	Upper Limit of Prediction Interval (2)	$\hat{y}_{k}$ (3)	Difference (4)	Difference Times 0.834 (5)	Standard (6)
35	118.50	88.05	30.45	25.40	113.45
40	121.87	91.89	29.98	25.00	116.89
45	125.40	95.93	29.67	24.74	120.67
50	129.09	99.58	29.51	24.61	124.19
55	132.93	103.42	29.51	24.61	128.03
60	136.93	107.27	29.66	24.74	132.01
65	141.09	111.11	29.98	25.00	136.11
70	145.41	114.95	30.46	25.40	140.35
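The column arithmetic behind Table 13.5 can be sketched as follows. The function name `one_sided_standard` is hypothetical, the factor 0.834 is the ratio of t values quoted in the text, and only three rows of the table (heights 35, 40, and 70 inches) are reproduced here for illustration.

```python
T_RATIO = 0.834  # t_{48, 0.95} / t_{48, 0.975}, as quoted in the text

def one_sided_standard(upper_two_sided, predicted, ratio=T_RATIO):
    """Upper one-sided limit obtained by rescaling the two-sided half-width."""
    difference = upper_two_sided - predicted   # column 4 = column 2 - column 3
    return predicted + ratio * difference      # column 6 = column 3 + column 5

# (height in inches, upper two-sided limit, predicted SBP) from Table 13.5.
rows = [
    (35, 118.50, 88.05),
    (40, 121.87, 91.89),
    (70, 145.41, 114.95),
]

standards = {height: one_sided_standard(upper, pred) for height, upper, pred in rows}
for height, value in standards.items():
    print(f"height {height} in: standard = {value:.2f} mmHg")
```

Rescaling the two-sided half-width works because both limits use the same estimated standard error; only the t multiplier differs.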

The upper one-sided prediction interval is one way of creating height-based standards for SBP. It has the advantage over simply using the observed 95th percentiles of the SBP at the different heights in that it does not require such a large sample size to reach the same precision. If SBP is really linearly related to height, standards based on the prediction interval also smooth out random fluctuations that may be found in considering each height separately.

The standards developed here are illustrative of the procedure. If one were going to develop standards, a larger sample size would be required. We would also prefer to use additional variables or another variable to increase the amount of variation in the SBP that is accounted for by the independent variable(s). In addition, as we just stated, the rationale for having standards for blood pressure in children is much weaker than that for having standards in adults. In adults, there is a direct linkage between high blood pressure and illness, whereas in children no such linkage exists. Additionally, the evidence that relatively high blood pressure in children carries over into adulthood is inconclusive. Use of the 95th percentile or other percentiles as the basis of a standard implies that some children will be identified as having a problem when none may exist.

So far we have focused on a single independent variable. In the next section, we consider multiple independent variables.

URL: https://www.sciencedirect.com/science/article/pii/B9780123694928500182

Book three

A.C. Olivieri , Northward.Yard. Faber , in Comprehensive Chemometrics, 2009

3.03.iv.3 Prediction Intervals

In univariate calibration, expressions for the prediction interval are available in the IUPAC literature. ⁴ They follow from the application of t statistics and the estimated standard error of prediction every bit:

(15) $\hat{c} - t_{υ, α / ii} s (c) \leq c \leq \hat{c} + t_{υ, α / 2} south (c)$

where t _v,α/two is the upper α/ii-per centum point of the t-distribution with v degrees of freedom.

As in univariate calibration, prediction intervals can exist constructed in the multivariate scenario from the estimated standard error of prediction (foursquare root of a variance estimate) and the relevant t statistics. Withal, an important upshot should exist dealt with in the multivariate case: in univariate calibration, it is usually causeless that but the instrument bespeak carries an uncertainty, although a number of works have emphasized the importance of taking into account the errors in both axes when studying a unmarried constituent. ^66,116–120 This leads to exact variance expressions and t statistics. In contrast, multivariate models are often constructed using calibration concentrations that are not error-complimentary (see Equation (13)). As a outcome, only approximations are obtained to the variance expressions and distributional properties of the examination statistic:

(16) $ρ = \frac{\hat{c} - c}{s (c)}$

Still, the central limit theorem justifies the normality assumption for the prediction error (numerator) when the number of spectral values (J) is large: for large J, the prediction ĉ, hence its error, will be approximately normally distributed. Furthermore, the multivariate standard error of prediction (denominator) may have contributions from different sources, and as a consequence, the required χ²-distribution is only approximately obeyed. Hence, the number of degrees of freedom should be established as a compromise between the degrees of freedom corresponding to each error source. It has been proposed ⁷⁴ to calculate an overall number of degrees of freedom using the well-known Satterthwaite's rule. ¹²¹
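As a rough illustration of how an overall number of degrees of freedom might be pooled and then used in Equation (15), the sketch below applies the general Welch–Satterthwaite formula, $ν_{eff} = (\sum_i s_i^2)^2 / \sum_i (s_i^4 / ν_i)$. The exact form used in the cited chemometrics work may differ. The function names, the two variance contributions, and the t value are all hypothetical and chosen only for the example.

```python
import math

def satterthwaite_df(variances, dfs):
    """Effective degrees of freedom for a sum of independent variance
    estimates (general Welch-Satterthwaite form; an assumption here)."""
    total = sum(variances)
    return total ** 2 / sum(v ** 2 / nu for v, nu in zip(variances, dfs))

def prediction_interval(c_hat, s_c, t_crit):
    """Two-sided interval c_hat +/- t_crit * s_c, as in Equation (15)."""
    half_width = t_crit * s_c
    return c_hat - half_width, c_hat + half_width

# Hypothetical variance contributions from two error sources,
# with their own degrees of freedom.
variances = [0.04, 0.01]
dfs = [12, 30]

nu_eff = satterthwaite_df(variances, dfs)   # pooled degrees of freedom
s_c = math.sqrt(sum(variances))             # standard error of prediction

# t_crit would be read from a t-table at roughly nu_eff degrees of freedom;
# 2.101 is the tabulated two-sided 95% value for 18 degrees of freedom.
low, high = prediction_interval(c_hat=1.50, s_c=s_c, t_crit=2.101)
print(nu_eff, low, high)
```

Passing the critical value in as a parameter keeps the sketch free of third-party dependencies; in practice it would come from a table or a statistics library.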

URL:

https://www.sciencedirect.com/science/article/pii/B9780444527011000739

Truth, Possibility and Probability

In North-Holland Mathematics Studies, 1991

Prediction intervals

Assume that we are trying to get a "prediction interval" for $X_{n+1}$, where $X_1, X_2, \ldots, X_n, X_{n+1}$ is a random sample from an approximately normal population with unknown mean and variance, and we have observed the first n results. We have that $X_{n+1} - \bar{X}_n$ is normal with mean 0 and variance $σ^{2} + (σ^{2}/n)$. Thus, it is not difficult to show that

$\sqrt{\frac{n}{n+1}} \frac{X_{n+1} - \bar{X}_n}{S_n}$

has a Student's $T_{n-1}$ distribution. Therefore

$\Pr [- t_{α/2, n-1} \leq \sqrt{\frac{n}{n+1}} \frac{X_{n+1} - \bar{X}_n}{S_n} \leq t_{α/2, n-1}] \approx 1 - α .$

Thus, if $\bar{x}_n$ and $s_n$ are observed, we could think of a prediction interval of the form

$(\bar{x}_n - \sqrt{\frac{n+1}{n}} s_n t_{α/2, n-1}, \bar{x}_n + \sqrt{\frac{n+1}{n}} s_n t_{α/2, n-1}) .$

We have, however, that $X_{n+1}$ is independent of $\bar{X}_n$ and $S_n$, and, hence, we cannot conclude that

$\Pr [\bar{x}_n - \sqrt{\frac{n+1}{n}} s_n t_{α/2, n-1} \leq X_{n+1} \leq \bar{x}_n + \sqrt{\frac{n+1}{n}} s_n t_{α/2, n-1}] \approx 1 - α .$

Only the joint distribution of $\sqrt{n/(n+1)} ((X_{n+1} - \bar{X}_n) / S_n)$ is $T_{n-1}$. It is also clear that there is no set of alternative hypotheses to be accepted or rejected.
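The distinction above can be made concrete with a small simulation. The sketch below computes the interval $\bar{x}_n \pm \sqrt{(n+1)/n}\, s_n\, t_{α/2, n-1}$ and then estimates how often it captures the next observation when the sample and the new value are both redrawn each time, which is the joint statement that the $T_{n-1}$ result supports. The function names, the standard normal population, the seed, and the number of trials are assumptions made for illustration; 2.262 is the tabulated two-sided 95% t value for 9 degrees of freedom.

```python
import math
import random

def next_obs_interval(sample, t_crit):
    """Interval mean +/- t_crit * s * sqrt((n + 1) / n) for one new value."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample standard deviation with the usual n - 1 denominator.
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    half_width = t_crit * s * math.sqrt((n + 1) / n)
    return mean - half_width, mean + half_width

def simulated_coverage(n, t_crit, trials=20000, seed=0):
    """Share of trials in which a freshly drawn X_{n+1} falls inside the
    interval built from a freshly drawn sample of size n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        low, high = next_obs_interval(sample, t_crit)
        hits += low <= rng.gauss(0.0, 1.0) <= high
    return hits / trials

coverage = simulated_coverage(n=10, t_crit=2.262)
print(coverage)
```

The long-run frequency is taken over both the sample and the new observation. For one fixed observed interval, the chance of capturing the next value depends on the unknown mean and variance and need not be close to 1 – α.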

In order to get something similar to a prediction interval, we can use the technique of tolerance limits, which can be easily accommodated in my framework, although not using the estimators of Rule XVIII.2. Suppose, as before, that the possible distributions for the variable X are normal with unknown mean and variance, and that $X_1, X_2, \ldots, X_n, X_{n+1}$ are independent repetitions of X. With knowledge of the first n variables, we would like to have an interval such that X is in this interval with probability 1 – γ, this with confidence 1 – α. Let us call $F_{μ}$ any cumulative distribution function when μ is the mean and some variance. That is

$F_{μ} (x) = {\Pr}_{μ} [X_{n+1} \leq x] .$

There are tables that give the number k such that

${\Pr}_{μ} [F_{μ} (\bar{X}_n + k S_n) - F_{μ} (\bar{X}_n - k S_n) \geq 1 - γ] = 1 - α$

for every μ. Our experiment is the same $K_n$ as in Example XVIII.2. Suppose that we obtain the result $\bar{x}_n$ and $s_n$ and assume that

$F_{μ} (\bar{x}_n + k s_n) - F_{μ} (\bar{x}_n - k s_n) < 1 - γ$

for a certain μ.

A result $(m_n, r_n)$ is at least as bad for μ as $(\bar{x}_n, s_n)$, if

$\frac{‖ m_n - μ ‖}{r_n} \geq \frac{‖ \bar{x}_n - μ ‖}{s_n} ,$

and we have for a worse result

$F_{μ} (m_n + k r_n) - F_{μ} (m_n - k r_n) < 1 - γ .$

But

${\Pr}_{μ} ([F_{μ} (\bar{X}_n + k S_n) - F_{μ} (\bar{X}_n - k S_n) < 1 - γ]) = α$

so that the probability of the rejection set is less than α. This means that we are able to reject any μ such that

$F_{μ} (\bar{x}_n + k s_n) - F_{μ} (\bar{x}_n - k s_n) < 1 - γ$

and, thus, we accept with confidence 1 – α that the next repetition $X_{n+1}$ has a cumulative distribution F with

$F (\bar{x}_n + k s_n) - F (\bar{x}_n - k s_n) \geq 1 - γ$

and, hence, that with probability 1 – γ, the result of $X_{n+1}$ will be in the interval

$(\bar{x}_n - k s_n, \bar{x}_n + k s_n) .$
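To make the role of the tabulated factor k more tangible, the following sketch builds the interval $(\bar{x}_n - k s_n, \bar{x}_n + k s_n)$ and estimates by simulation how often its true content, $F(\bar{x}_n + k s_n) - F(\bar{x}_n - k s_n)$, reaches 1 – γ. Everything here is illustrative. The function names, seed, and trial count are hypothetical, the population is fixed at a standard normal for convenience, and k = 3.379 is the value I recall from standard two-sided normal tolerance tables for n = 10 with 95% content and 95% confidence; it should be checked against a table before use.

```python
import math
import random
from statistics import NormalDist

def tolerance_interval(mean, s, k):
    """Interval (mean - k*s, mean + k*s) for a tabulated factor k."""
    return mean - k * s, mean + k * s

def content_confidence(n, k, gamma, trials=5000, seed=1):
    """Share of simulated samples whose interval contains at least
    1 - gamma of a standard normal population."""
    rng = random.Random(seed)
    population = NormalDist(0.0, 1.0)
    successes = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(xs) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
        low, high = tolerance_interval(mean, s, k)
        # True probability content of the interval under the population.
        content = population.cdf(high) - population.cdf(low)
        successes += content >= 1 - gamma
    return successes / trials

# Assumed table value of k for n = 10, gamma = 0.05, alpha = 0.05.
confidence = content_confidence(n=10, k=3.379, gamma=0.05)
print(confidence)
```

If the assumed k is right, the printed fraction should sit near 1 – α, which is the sense in which the tolerance statement holds "with confidence 1 – α".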

URL:

https://www.sciencedirect.com/science/article/pii/S0304020808724307

Interval Estimation

Ronald N. Forthofer, ... Mike Hernandez, in Biostatistics (Second Edition), 2007

Conclusion

In this chapter, the concept of interval estimation was introduced. We presented prediction, confidence, and tolerance intervals and explained their applications. We showed how distribution-free intervals and intervals based on the normal distribution were calculated. The idea and use of confidence intervals discussed in this chapter will be explored further to introduce methods of testing statistical hypotheses in the next two chapters. Parenthetically, it is worth pointing out that the idea of a confidence interval is often expressed as a margin of error in journalistic reporting, which refers to one-half of the width of a two-sided confidence interval.

We also pointed out that characteristics of the intervals (for instance, size) could be examined before actually conducting the experiment. If the characteristics of the interval are satisfactory, the investigator uses the proposed sample size. If the characteristics are unsatisfactory, the design of the experiment, the topic of the next chapter, needs to be modified.

URL:

https://www.sciencedirect.com/science/article/pii/B9780123694928500121

Linear regression models

Kandethody M. Ramachandran, Chris P. Tsokos, in Mathematical Statistics with Applications in R (Third Edition), 2021

7.4 Predicting a particular value of Y

In the earlier sections, we have seen how to fit a least-squares line for a given set of data. Also using this line, we could find E(Y) for any given value of x. Instead of obtaining this mean value, we may be interested in predicting the particular value of Y for a given x. In fact, one of the primary uses of the estimated regression line is to predict the response value of Y for a given value of x. Prediction problems are very important in several real-world problems; for example, in economics one may be interested in predicting a particular gain associated with an investment.

Let $\hat{Y}_0$ denote a predictor of a particular value of Y = $Y_0$ and let the corresponding value of x be $x_0$. We shall choose $\hat{Y}_0$ to be $E (\hat{Y} | x_0)$. Let $\hat{Y}$ denote a predictor of a particular value of Y. Then the error η of the predictor in comparison to a particular value of Y is

$η = Y - {\hat{Y}}_{0} .$

Both Y and $\hat{Y}$ are normal random variables, and the error is a linear function of Y and $\hat{Y}$. This means that η itself is normally distributed. Also, because $E (\hat{Y}) = E (Y)$, we have

$E (η) = E (Y | x_0) - E (\hat{Y}) = 0.$

Furthermore,

$\mathrm{Var} (η) = \mathrm{Var} (Y - \hat{Y}) = \mathrm{Var} (Y) + \mathrm{Var} (\hat{Y}) - 2 \mathrm{Cov} (Y, \hat{Y}) .$

We can consider Y and $\hat{Y}$ as independent, because we are predicting a different value of Y, not used in the calculation of $\hat{Y}$. Therefore, $\mathrm{Cov} (Y, \hat{Y}) = 0$. In that case,

$\begin{aligned} \mathrm{Var} (η) &= \mathrm{Var} (Y_0) + \mathrm{Var} (\hat{Y}_0) \\ &= σ^{2} + σ^{2} [\frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}] \\ &= [1 + \frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}] σ^{2} . \end{aligned}$

Hence, the error of predicting a particular value of Y, given x, is normally distributed with mean zero and variance $[1 + \frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}] σ^{2} .$

That is,

$η \sim N (0, [1 + \frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}] σ^{2}),$

and

$Z = \frac{Y - \hat{Y}}{σ \sqrt{[1 + \frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}]}} \sim N (0, 1) .$

If we substitute the sample standard deviation S for σ, then we can show that

$T = \frac{Y - \hat{Y}}{S \sqrt{[1 + \frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}]}},$

follows the t-distribution with [n–(k + 1)] degrees of freedom, where k is the number of predictors (so n – 2 in simple linear regression). Using this fact, we now give a prediction interval for the random variable Y, the response of a given situation.

We know that

$P (- t_{α / 2} < T < t_{α / 2}) = 1 - α .$

Substituting for T, we have

$P (- t_{α / 2} < \frac{Y - \hat{Y}}{S \sqrt{[1 + \frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}]}} < t_{α / 2}) = 1 - α,$

which implies that

$P [\hat{Y} - t_{α / 2} S \sqrt{[1 + \frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}]} < Y < \hat{Y} + t_{α / 2} S \sqrt{[1 + \frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}]}] = 1 - α .$

Hence, we have the following.

A (1 − α)100% prediction interval for Y is

$\hat{Y} \pm t_{α / 2} S \sqrt{[1 + \frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}]}$

where $t_{α / 2}$ is based on (n − 2) degrees of freedom and $S^{2} = \frac{SSE}{n - 2} = MSE$, so that $S = \sqrt{MSE}$.
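The interval just stated can be sketched in code as follows. This is a minimal illustration rather than the book's implementation. It fits the least-squares line from raw data, forms $S = \sqrt{SSE/(n-2)}$, and applies the formula above. The function name and the small data set are made up, and the critical value is passed in as it would be read from a t-table (3.182 is the tabulated two-sided 95% value for 3 degrees of freedom).

```python
import math

def regression_prediction_interval(xs, ys, x0, t_crit):
    """Prediction interval for a new Y at x0 in simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and S = sqrt(SSE / (n - 2)).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    y_hat = intercept + slope * x0
    half_width = t_crit * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
    return y_hat - half_width, y_hat + half_width

# Made-up data for illustration only (n = 5, so n - 2 = 3 degrees of freedom).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
low, high = regression_prediction_interval(xs, ys, x0=3, t_crit=3.182)
print(low, high)
```

Because of the $(x - \bar{x})^{2} / S_{xx}$ term, the interval is narrowest at $x = \bar{x}$ and widens as the prediction point moves away from the centre of the data.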

We illustrate this statistical procedure with the following example.

Example 7.4.1

Using the data given in Example 7.2.1, obtain a 95% prediction interval at x = 5.

Solution

We have shown that $\hat{y} = - 3.1011 + 2.0266 x .$ Hence, at x = 5, $\hat{y} = 7.0319.$

Also, $\bar{x} = 3.8,$ $S_{xx} = 263.6$, SSE = 7.79028, and $S = \sqrt{\frac{7.79028}{8}} = 0.98681.$

From the t-table, $t_{0.025, 8} = 2.306$.

Thus, we have

$7.0319 \pm (2.306) (0.98681) \sqrt{[1 + \frac{1}{10} + \frac{(5 - 3.8)^{2}}{263.6}]},$

which gives the 95% prediction interval as (4.6393, 9.4245).

We can conclude with at least 95% confidence that the true value of Y at the point x = 5 will be somewhere between 4.6393 and 9.4245.
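The arithmetic in this example can be checked with a few lines of code. The sketch below uses only the summary quantities quoted in the solution (n = 10, $\bar{x}$ = 3.8, $S_{xx}$ = 263.6, SSE = 7.79028, and the table value 2.306); the variable names are my own.

```python
import math

# Summary quantities quoted in the worked example.
n = 10
x_bar = 3.8
s_xx = 263.6
sse = 7.79028
t_crit = 2.306          # t_{0.025, 8} from the t-table
x0 = 5

y_hat = -3.1011 + 2.0266 * x0      # fitted value at x0
s = math.sqrt(sse / (n - 2))       # S = sqrt(SSE / (n - 2))
half_width = t_crit * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)

lower, upper = y_hat - half_width, y_hat + half_width
print(round(s, 5), round(lower, 4), round(upper, 4))
```

Note that S is about 0.98681 here; the value 2.306 is the t critical value, and the two should not be confused when substituting into the formula.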

URL:

https://www.sciencedirect.com/science/article/pii/B9780128178157000075

henryyousuponchis.blogspot.com

Source: https://www.sciencedirect.com/topics/mathematics/prediction-interval

Henry Yousuponchis
