Geography 360:
Problem Set Five
{The following problems may be worked by hand, in S-Plus, or in a
spreadsheet; see below.}
1. In the previous problem set, you worked with data representing the
heights of a sample of five-year-old boys [113.7, 104.6, 108.6, 105.9, 110.0,
106.7, 108.5, 107.7, 114.3, 116.7, 103.5, 96.1, 110.8, 97.2, 109.6, 110.5,
105.9, 106.2]. By assuming that the frequency distribution of height for the
population approaches a normal distribution (we would expect the majority of
boys' heights to be close to the population mean, while exceptionally short or
tall boys would be few in number), you were able to calculate the probability
of a child being within a particular range of heights, or the probability of
encountering a child taller or shorter than a given height. Now we want to
draw some conclusions about the population of five-year-old boys from which
the sample was taken. In other words, what does the sample mean tell us about
the population mean?
a) Please calculate a 90% confidence interval around your sample mean. Assume
that the population is all five-year-old boys and is sufficiently large to
eliminate the need for a finite population correction. Because we have a small
sample size, use the t-distribution (remember that your degrees of freedom are
equal to the sample size minus one, n - 1). Now calculate a 95% confidence
interval around the mean. Why does your 95% confidence interval have a wider
range than the 90% interval? Finally, calculate a 95% confidence interval
using the Z-distribution (remember, the only thing that changes is that you
replace the value of t associated with 95% with the value of Z associated with
95%). Express the added uncertainty of the small sample size as the difference
between the ranges of the confidence intervals calculated with t and Z values.
How much taller or shorter than the sample mean might the population mean be
when you are 95% certain?
b) Now assume that the above sample was selected from a group of 150
five-year-olds who are participants in a study on the impact of dietary
supplements. Recalculate a 95% confidence interval of the mean, this time
including the finite population correction. What is the absolute difference in
the range of the interval compared to that in part a)? Why can you be more
confident that the population mean in this case is closer to the sample mean?
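If you would rather check your hand or spreadsheet work for problem 1 with a short script, the calculation can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `margin_of_error`, `z_critical`, and `T_CRITICAL_DF17` are made up for this sketch, the t values are copied from a standard t table for 17 degrees of freedom (check them against your own table), and the finite population correction uses the same form as the spreadsheet pattern, sqrt((N - n) / N).

```python
import math
import statistics

heights = [113.7, 104.6, 108.6, 105.9, 110.0, 106.7, 108.5, 107.7, 114.3,
           116.7, 103.5, 96.1, 110.8, 97.2, 109.6, 110.5, 105.9, 106.2]

# Two-sided t values for n - 1 = 17 degrees of freedom, copied from a
# standard t table (an assumption of this sketch; check your own table).
T_CRITICAL_DF17 = {0.90: 1.740, 0.95: 2.110}


def z_critical(confidence):
    """Two-sided Z value for a confidence level such as 0.95."""
    return statistics.NormalDist().inv_cdf(0.5 + confidence / 2)


def margin_of_error(data, critical_value, population_size=None):
    """Half-width of the confidence interval around the sample mean."""
    n = len(data)
    margin = critical_value * statistics.stdev(data) / math.sqrt(n)
    if population_size is not None:
        # Finite population correction in the form the spreadsheet
        # pattern uses: sqrt((N - n) / N).
        margin *= math.sqrt((population_size - n) / population_size)
    return margin


sample_mean = statistics.mean(heights)
margins = {
    "90% t": margin_of_error(heights, T_CRITICAL_DF17[0.90]),
    "95% t": margin_of_error(heights, T_CRITICAL_DF17[0.95]),
    "95% Z": margin_of_error(heights, z_critical(0.95)),
    "95% t, N = 150": margin_of_error(heights, T_CRITICAL_DF17[0.95], 150),
}
for label, margin in margins.items():
    print(f"{label}: {sample_mean - margin:.2f} to {sample_mean + margin:.2f}")
```

Each printed line is the lower and upper limit of one interval. Comparing the entries of `margins` gives the differences in range that parts a) and b) ask you to explain.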
2. The food service administrator at Whatcha Eatin U. wants to know
how well the food served at the school's cafeteria for evening meals is being
received. She decides that this can be
calculated based on the level of students' expressed preference for alternative
meal sources. Because the local pizza
delivery service is the only place to buy prepared meals in Tie-Knee-Town (home
of WEU), she conducted a survey of student use of the delivery service. Because only 30% of students are on the
school's meal plan, she decided to conduct a stratified survey of 40 students
(12 on the meal plan, 28 not on the plan) chosen at random from WEU's 11,315
students (3,394 on plan and 7,921 off plan).
Her survey produced the following data for number of times pizza was
ordered per month by students: on plan students [5,5,3,1,1,6,2,1,6,8,3,6]; off
plan [1,7,9,5,4,8,3,5,2,6,3,8,2,7,0,3,3,3,7,6,4,6,8,9,3,10,7,1].
a) Calculate a 90% confidence interval of the mean number of pizzas
ordered by students during the previous month.
b) Calculate an 85% confidence interval of the total pizzas consumed
by students during the previous month.
c) The administrator feels she is successfully meeting the meal
preferences of students if the proportion of students ordering pizza more than
1.5 times per week (> 6 times per month) is less than 20%. Calculate a 95% confidence interval of the
proportion in the sample and state whether she should be fairly confident of
meeting her goal.
d) Barney, the owner of the town's pizzeria, decides to acquire the
results of the food service survey. He
is convinced that the information will help him plan his purchases for the
coming month (that is, how many pizzas should he expect to be ordered by
students). He is not happy, however,
with the uncertainty (the potential error) associated with the small sample
size. How many students would he need
to interview in order to be 90% certain that the sample mean (average per
month) was within 1 pizza of the population mean? How many students would he have to interview in order to be 95% certain
the total pizzas consumed by the sample were within 500 pizzas of the total
consumed by the students as a whole? Do you think that this sample size would
allow for a cost-effective means of estimating student demand for pizzas?
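For part d), the minimum sample size follows from solving the margin-of-error formula for n. A minimal Python sketch is given here, with several simplifying assumptions that you should weigh for yourself: it uses Z rather than t, it ignores both the stratification and the finite population correction, and it estimates the standard deviation from all 40 survey responses pooled together. The function names are made up for this sketch.

```python
import math
import statistics

on_plan = [5, 5, 3, 1, 1, 6, 2, 1, 6, 8, 3, 6]
off_plan = [1, 7, 9, 5, 4, 8, 3, 5, 2, 6, 3, 8, 2, 7,
            0, 3, 3, 3, 7, 6, 4, 6, 8, 9, 3, 10, 7, 1]


def required_sample_size(std_dev, max_error, confidence):
    """Smallest n for which Z * s / sqrt(n) is no larger than max_error."""
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * std_dev / max_error) ** 2)


def required_sample_size_for_total(std_dev, max_total_error, confidence,
                                   population_size):
    """An error of E on the total is an error of E / N on the mean."""
    return required_sample_size(std_dev, max_total_error / population_size,
                                confidence)


# Simplifying assumption: pool both strata to estimate the standard deviation.
pooled_std_dev = statistics.stdev(on_plan + off_plan)

n_for_mean = required_sample_size(pooled_std_dev, 1, 0.90)
n_for_total = required_sample_size_for_total(pooled_std_dev, 500, 0.95, 11315)
print("students needed (mean within 1 pizza):", n_for_mean)
print("students needed (total within 500 pizzas):", n_for_total)
```

Compare the second result with the number of students at WEU before you answer the question about cost effectiveness.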
You can construct a spreadsheet to calculate the confidence intervals based on
the pattern in the table below. Type the characters between the quotation
marks into the spreadsheet cells indicated. Enter the necessary values into
the cells with square brackets. For problem 1, you will need to enter the data
points in cells A4 through A21. For problem 1b, enter "finite population
correction" in cell C3 and "=sqrt((150-18)/150)" in cell C4; change B4 to
"=(C2*(B2/sqrt(18))*C4)".
In order to account for the stratified nature of the sampling procedure in
problem 2, you will need to move the characters in C1 and C2 to E1 and E2.
Change A1 to "On Plan Mean"; B1 to "On Plan Std. Dev."; C1 to "Off Plan
Mean"; D1 to "Off Plan Std. Dev."; A2 to "=average(A4:A15)"; B2 to
"=stdev(A4:A15)"; C2 to "=average(A16:A43)"; D2 to "=stdev(A16:A43)"; B4 to
"=(E2*(sqrt((((3394^2)*((B2^2)/12))+((7921^2)*((D2^2)/28)))/(11315^2))))"; C3
to "estimated mean"; C4 to "=(((A2*12)+(C2*28))/40)"; B6 to "=C4-B4"; and C6 to
"=C4+B4". If you are still up for it, try changing the formulas to estimate
the total number of pizzas and the proportion of students ordering more than 6
pizzas per month.
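The stratified calculation can also be sketched in Python if you want to check your spreadsheet. The sketch below is an illustration rather than the required method: the function names are made up, each stratum is weighted by its share of the student population (N_h / N) instead of by its sample size as in cell C4 (the two are nearly identical here because the sample was drawn in proportion to the strata), the variance of each stratum proportion is approximated as p(1 - p) / n, and no finite population correction is applied.

```python
import math
import statistics

# (number of students in the stratum, survey responses) from problem 2
strata = {
    "on plan": (3394, [5, 5, 3, 1, 1, 6, 2, 1, 6, 8, 3, 6]),
    "off plan": (7921, [1, 7, 9, 5, 4, 8, 3, 5, 2, 6, 3, 8, 2, 7,
                        0, 3, 3, 3, 7, 6, 4, 6, 8, 9, 3, 10, 7, 1]),
}
population = sum(size for size, _ in strata.values())


def stratified_mean(strata):
    """Weighted mean and its standard error, with weights N_h / N."""
    mean, variance = 0.0, 0.0
    for size, sample in strata.values():
        weight = size / population
        mean += weight * statistics.mean(sample)
        variance += weight ** 2 * statistics.variance(sample) / len(sample)
    return mean, math.sqrt(variance)


def stratified_proportion(strata, threshold):
    """Weighted share of responses above threshold, and its standard error."""
    share, variance = 0.0, 0.0
    for size, sample in strata.values():
        weight = size / population
        p = sum(1 for x in sample if x > threshold) / len(sample)
        share += weight * p
        variance += weight ** 2 * p * (1 - p) / len(sample)
    return share, math.sqrt(variance)


mean, mean_se = stratified_mean(strata)
total, total_se = population * mean, population * mean_se
share, share_se = stratified_proportion(strata, threshold=6)

# Multiply each standard error by your chosen t or Z value (cell E2 in the
# spreadsheet) to get the half-width of the confidence interval.
print(f"mean {mean:.3f}, standard error {mean_se:.3f}")
print(f"total {total:.0f}, standard error {total_se:.0f}")
print(f"proportion above 6 {share:.3f}, standard error {share_se:.3f}")
```

The standard error of the total is simply the standard error of the mean multiplied by the number of students, which is why the same pattern serves for parts a) and b).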
Once you have made the above calculations, figuring the minimum sample
size is relatively straightforward.
|     | A                         | B                      | C                  |
|-----|---------------------------|------------------------|--------------------|
| 1   | "Sample Mean"             | "Sample Std. Dev."     | "Z or t value"     |
| 2   | "=average(A4:A21)"        | "=stdev(A4:A21)"       | [enter value here] |
| 3   | "Values"                  | "confidence interval"  |                    |
| 4   | [enter first data value]  | "=(C2*(B2/sqrt(18)))"  |                    |
| 5   | [enter second data value] | "range of confidence"  |                    |
| ... | [enter data to A21]       | "=A2-B4"               | "=A2+B4"           |