Geography 360:

Problem Set Five

{The following problems may be worked by hand, in S-Plus, or in a spreadsheet (see below).}

 

1.  In the previous problem set, you worked with data representing the height of a sample of five-year-old boys [113.7, 104.6, 108.6, 105.9, 110.0, 106.7, 108.5, 107.7, 114.3, 116.7, 103.5, 96.1, 110.8, 97.2, 109.6, 110.5, 105.9, 106.2].  By assuming that the frequency distribution of height for the population approaches a normal distribution (we would expect the majority of boys' heights to be close to the population mean, while exceptionally short or tall boys would be few in number), you were able to calculate the probability of a child being within a particular range of heights or the probability of encountering a child larger than or smaller than a given height.  Now we want to draw some conclusions about the population of five-year-old boys from which the sample was taken.  In other words, what does the sample mean tell us about the population mean? 

 

a)  Please calculate a 90% confidence interval around your sample mean.  Assume that the population is all five-year-old boys and is sufficiently large to eliminate the need for a finite population correction.  Because we have a small sample size, use the t-distribution (remember that the degrees of freedom equal the sample size minus one, n-1).  Now calculate a 95% confidence interval around the mean.  Why does your 95% confidence interval have a wider range than the 90% interval?  Finally, calculate a 95% confidence interval using the Z-distribution (remember, the only thing that changes is that you replace the value of t associated with 95% with the value of Z associated with 95%).  Express the added uncertainty of the small sample size as the difference between the ranges of the confidence intervals calculated with t and Z values.  How much taller or shorter than the sample mean might the population mean be when you are 95% certain?

 

b)  Now assume that the above sample was selected from a group of 150 five-year-olds who are participants in a study on the impact of dietary supplements.  Recalculate a 95% confidence interval of the mean, this time including the finite population correction.  What is the absolute difference in the range of the interval compared to that in part a)?  Why can you be more confident that the population mean in this case is closer to the sample mean?
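For those who would rather script the calculations in problem 1 than work them by hand, the following is a minimal Python sketch (Python is not one of the tools named in this problem set; it is offered only as an alternative).  The function name `mean_ci` is hypothetical.  The critical values are assumptions read from standard t and Z tables (t with 17 degrees of freedom: 1.740 for 90% and 2.110 for 95%; Z: 1.960 for 95%), and the finite population correction uses the same form as the spreadsheet instructions, sqrt((N-n)/N).

```python
import math
import statistics

# Heights of the 18 five-year-old boys from problem 1.
heights = [113.7, 104.6, 108.6, 105.9, 110.0, 106.7, 108.5, 107.7, 114.3,
           116.7, 103.5, 96.1, 110.8, 97.2, 109.6, 110.5, 105.9, 106.2]


def mean_ci(data, critical_value, population_size=None):
    """Return (lower, upper) bounds of a confidence interval for the mean.

    critical_value is the t or Z value looked up from a table.
    If population_size is given, the standard error is multiplied by the
    finite population correction sqrt((N - n) / N).
    """
    n = len(data)
    mean = statistics.mean(data)
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    std_error = statistics.stdev(data) / math.sqrt(n)
    if population_size is not None:
        std_error *= math.sqrt((population_size - n) / population_size)
    margin = critical_value * std_error
    return mean - margin, mean + margin


# Critical values are assumed from standard tables (17 degrees of freedom).
print("90% interval, t:", mean_ci(heights, 1.740))
print("95% interval, t:", mean_ci(heights, 2.110))
print("95% interval, Z:", mean_ci(heights, 1.960))
print("95% interval, t, N = 150:", mean_ci(heights, 2.110, population_size=150))
```

Comparing the widths of the printed intervals is one way to check your hand calculations for parts a) and b).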

 

 

2)  The food service administrator at Whatcha Eatin U. wants to know how well the food served at the school's cafeteria for evening meals is being received.  She decides that this can be calculated based on the level of students' expressed preference for alternative meal sources.  Because the local pizza delivery service is the only place to buy prepared meals in Tie-Knee-Town (home of WEU), she conducted a survey of student use of the delivery service.  Because only 30% of students are on the school's meal plan, she decided to conduct a stratified survey of 40 students (12 on the meal plan, 28 not on the plan) chosen at random from WEU's 11,315 students (3,394 on plan and 7,921 off plan).  Her survey produced the following data for number of times pizza was ordered per month by students: on plan students [5,5,3,1,1,6,2,1,6,8,3,6]; off plan [1,7,9,5,4,8,3,5,2,6,3,8,2,7,0,3,3,3,7,6,4,6,8,9,3,10,7,1].

 

a)  Calculate a 90% confidence interval of the mean number of pizzas ordered by students during the previous month. 

 

b)  Calculate an 85% confidence interval of the total pizzas consumed by students during the previous month.

 

c)  The administrator feels she is successfully meeting the meal preferences of students if the proportion of students ordering pizza more than 1.5 times per week (> 6 times per month) is less than 20%.   Calculate a 95% confidence interval of the proportion in the sample and state whether she should be fairly confident of meeting her goal.

 

d)  Barney, the owner of the town's pizzeria, decides to acquire the results of the food service survey.  He is convinced that the information will help him plan his purchases for the coming month (that is, how many pizzas should he expect to be ordered by students).  He is not happy, however, with the uncertainty (the potential error) associated with the small sample size.  How many students would he need to interview in order to be 90% certain that the sample mean (average per month) was within 1 pizza of the population mean?  How many students would he have to interview in order to be 95% certain the total pizzas consumed by the sample were within 500 pizzas of the total consumed by the students as a whole?  Do you think that this sample size would allow for a cost effective means of estimating student demand for pizzas?
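The sample-size reasoning in part d) can be sketched in Python as follows.  This is a simplified illustration only: the function name `required_sample_size` is hypothetical, the pooled sample standard deviation is assumed as the planning value, Z values are used in place of t, and both the stratification and the finite population correction are ignored.  An allowable error in the total is converted to a per-student error by dividing by the 11,315 students.

```python
import math
import statistics

# Pizza orders per month from problem 2.
on_plan = [5, 5, 3, 1, 1, 6, 2, 1, 6, 8, 3, 6]
off_plan = [1, 7, 9, 5, 4, 8, 3, 5, 2, 6, 3, 8, 2, 7,
            0, 3, 3, 3, 7, 6, 4, 6, 8, 9, 3, 10, 7, 1]


def required_sample_size(std_dev, max_error, confidence):
    """Smallest n for which Z * std_dev / sqrt(n) <= max_error.

    Uses the two-sided Z value for the given confidence level
    (for example 0.90 or 0.95).
    """
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * std_dev / max_error) ** 2)


# Assumption: the pooled sample standard deviation stands in for the
# unknown population standard deviation.
s = statistics.stdev(on_plan + off_plan)

# Mean within 1 pizza, 90% confidence.
n_mean = required_sample_size(s, 1.0, 0.90)

# Total within 500 pizzas, 95% confidence: 500 / 11315 pizzas per student.
n_total = required_sample_size(s, 500 / 11315, 0.95)

print("sample size for the mean:", n_mean)
print("sample size for the total:", n_total)
```

Comparing the second result with the size of the student body is a useful starting point for the cost-effectiveness question.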

 

 

You can construct a spreadsheet to calculate the confidence intervals based on the following pattern.  Type the characters between the quotation marks into the spreadsheet cells indicated.  Enter the necessary values into the cells with square brackets.  For problem 1, you will need to fill data points in cell A4 to cell A21.  For problem 1b, enter "finite population correction" in cell C3; "=sqrt((150-18)/150)" in cell C4; change B4 to "=(C2*(B2/sqrt(18))*C4)".

 

In order to account for the stratified nature of the sampling procedure in problem 2, you will need to move the characters in C1 and C2 to E1 and E2.  Change A1 to "On Plan Mean"; B1 to "On Plan Std. Dev."; C1 to "Off Plan Mean"; D1 to "Off Plan Std. Dev."; A2 to "=average(A4:A15)"; B2 to "=stdev(A4:A15)"; C2 to "=average(A16:A43)"; D2 to "=stdev(A16:A43)"; B4 to "=(E2*sqrt((((3394^2)*((B2^2)/12))+((7921^2)*((D2^2)/28)))/(11315^2)))"; C3 to "estimated mean"; C4 to "=(((A2*12)+(C2*28))/40)"; B6 to "=C4-B4"; and C6 to "=C4+B4".  Note that in the B4 formula the sum of both strata terms is divided by 11315^2.  If you are still working in the spreadsheet at this point, try changing the formulas to match those for estimating the total pizzas and the proportion ordering more than 6 pizzas per month.  Once you have made the above calculations, figuring the minimum sample size is relatively straightforward.
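The stratified calculation described above can also be sketched in Python.  The function name `stratified_mean_ci` is hypothetical, and the sketch makes two assumptions: the mean is weighted by stratum population sizes (the spreadsheet weights by the sample sizes 12 and 28, which is nearly the same here because the sample is allocated almost proportionally), and a Z value for 90% confidence is used as the critical value (a t value from a table could be substituted).

```python
import math
import statistics

# Pizza orders per month from problem 2.
on_plan = [5, 5, 3, 1, 1, 6, 2, 1, 6, 8, 3, 6]
off_plan = [1, 7, 9, 5, 4, 8, 3, 5, 2, 6, 3, 8, 2, 7,
            0, 3, 3, 3, 7, 6, 4, 6, 8, 9, 3, 10, 7, 1]

# Each stratum: (number of students in the population, sample values).
strata = [(3394, on_plan), (7921, off_plan)]


def stratified_mean_ci(strata, critical_value):
    """Return (lower, upper) for a stratified estimate of the mean.

    The variance of the estimate is the sum over strata of
    (N_h / N)^2 * s_h^2 / n_h, with no finite population correction.
    """
    total = sum(size for size, _ in strata)
    mean = sum(size * statistics.mean(sample) for size, sample in strata) / total
    variance = sum((size / total) ** 2 * statistics.variance(sample) / len(sample)
                   for size, sample in strata)
    margin = critical_value * math.sqrt(variance)
    return mean - margin, mean + margin


# Assumption: two-sided Z value for 90% confidence.
z90 = statistics.NormalDist().inv_cdf(0.95)
low, high = stratified_mean_ci(strata, z90)
print("90% interval for the mean:", (low, high))

# Multiplying both bounds by the number of students gives an interval
# for the total at the same confidence level.
print("90% interval for the total:", (low * 11315, high * 11315))
```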

 

 

 

       A                            B                         C
1      "Sample Mean"                "Sample Std. Dev."        "Z or t value"
2      "=average(A4:A21)"           "=stdev(A4:A21)"          [enter value here]
3      "Values"                     "confidence interval"
4      [enter first data value]     "=(C2*(B2/sqrt(18)))"
5      [enter second data value]    "range of confidence"
...    [enter data to A21]          "=A2-B4"                  "=A2+B4"