Geography 360:

Standard deviation, skewness, kurtosis

To better explain the coefficient of variation, consider the following data taken from the unemployment statistics provided in class. The data were reported as thousands of people who report being unemployed. Below, I have taken the values for every fifth state in the alphabetical listing. In the first column, I have rounded the value for each state to the nearest 100,000 and expressed it in units of 100,000 people. The second column gives the data as reported, in thousands. You would expect the variability within each data set to be comparable because the two columns represent the same values. How do the calculated standard deviations compare? Why is the standard deviation for the second column so much higher? What happens when the standard deviation is divided by the mean?

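| # reported unemployed | 100,000s | 1,000s |
|---|---|---|
| | 1 | 126.1 |
| | 2 | 158.3 |
| | 3 | 263 |
| | 3 | 333.4 |
| | 1 | 113.2 |
| | 0 | 13.2 |
| | 0 | 11.2 |
| Mean | 2.5 | 248.37 |
| Standard deviation | 3.32 | 323.65 |
| Coefficient of variation | 1.33 | 1.30 |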
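One way to explore the last question is to compute both statistics for the same values expressed in two different units. The sketch below is a minimal Python illustration and not part of the original exercise. It uses only the seven values listed in the 1,000s column, so its output need not match the summary rows of the table. It also assumes the standard deviation is calculated by dividing by N rather than N - 1, and the function name `coefficient_of_variation` is chosen here for illustration.

```python
from statistics import fmean, pstdev

# The seven values listed in the 1,000s column (thousands of people).
thousands = [126.1, 158.3, 263.0, 333.4, 113.2, 13.2, 11.2]

# The same counts re-expressed in units of 100,000 people, without rounding.
hundred_thousands = [x / 100 for x in thousands]

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (a unitless ratio)."""
    return pstdev(values) / fmean(values)

for label, values in [("1,000s", thousands), ("100,000s", hundred_thousands)]:
    print(f"{label:>9}: mean = {fmean(values):8.3f}  "
          f"std dev = {pstdev(values):8.3f}  "
          f"CV = {coefficient_of_variation(values):.3f}")
```

The standard deviation changes by the same factor of 100 as the units, while the coefficient of variation is the same for both lists. Rounding, as in the first column of the table, changes the values themselves, so the coefficient of variation for rounded data can differ somewhat.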

 

Skewness: There are various ways to calculate skewness, that is, to quantify the asymmetry of a frequency distribution. Each method emphasizes a particular aspect of skewness, and values should only be compared if they were calculated the same way. All of these procedures effectively emphasize outliers, those values that deviate greatly from a symmetrical frequency distribution. Because the mean is affected by outliers to a greater extent than the median, subtracting the median from the mean (mean - median) is a simple way to quantify skewness. Other calculations do something similar by cubing the deviation between each value and the mean; cubing keeps the sign of each deviation and magnifies the large ones. A positively skewed distribution is skewed to the right: it has outliers above the mean. A negatively skewed distribution is skewed to the left: it has outliers below the mean.

Draw the frequency distribution for columns 1-5 (because of the small number of values, grouping them into categories with a range of five will make the result easier to read). When comparing column 1 to column 2 and column 3 to column 4, note the impact of adding an additional outlier similar in value to the one already present.
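The two approaches described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the method used to build the handout's table: the data are columns 1-4 of the table of seven distributions in this handout, `moment_skewness` is a name chosen here, and the cubed-deviation version assumes the population form (division by N). Under that assumption column 1 comes out at about 2.08.

```python
from statistics import fmean, median, pstdev

# Columns 1-4 of the table of seven distributions (blank cells omitted).
columns = {
    1: [18, 21, 23, 25, 25, 27, 30, 50, 112],
    2: [18, 21, 23, 25, 25, 27, 30, 50, 112, 112],
    3: [23, 69, 75, 78, 87, 89, 92, 99, 112],
    4: [22, 23, 69, 75, 78, 87, 89, 92, 99, 112],
}

def moment_skewness(values):
    """Mean cubed deviation from the mean, divided by the cubed std dev."""
    mu = fmean(values)
    sigma = pstdev(values)  # population form: divides by N
    return sum((x - mu) ** 3 for x in values) / len(values) / sigma ** 3

for number, values in columns.items():
    simple = fmean(values) - median(values)
    print(f"column {number}: mean - median = {simple:6.2f}, "
          f"skewness = {moment_skewness(values):5.2f}")
```

Both measures are positive for columns 1 and 2, where the outliers lie above the mean, and negative for columns 3 and 4, where they lie below it.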

Kurtosis, like skewness, describes the shape of the frequency distribution: it quantifies how peaked the distribution is. Consider distributions 5-7, treated as you did in the previous exercise (or adjust the category ranges as seems appropriate). How do these compare visually?
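As a rough aid for this comparison, the Python sketch below prints a text histogram and a kurtosis value for distributions 5-7. Several things here are assumptions made for illustration: the category width of 20, the names `excess_kurtosis` and `binned_counts`, and the definition of kurtosis itself (mean fourth-power deviation divided by the fourth power of the population standard deviation, minus 3 so that a normal curve scores 0). With that definition the three columns come out at about -1.00, -0.01 and 1.74.

```python
from collections import Counter
from statistics import fmean, pstdev

# Columns 5-7 of the table of seven distributions.
columns = {
    5: [23, 26, 40, 53, 56, 58, 61, 74, 88, 91],
    6: [15, 42, 55, 66, 70, 70, 74, 85, 98, 125],
    7: [23, 62, 66, 69, 69, 69, 69, 72, 76, 115],
}

WIDTH = 20  # category range; an arbitrary choice, adjust as seems appropriate

def excess_kurtosis(values):
    """Mean fourth-power deviation over sigma^4, minus 3 (normal curve = 0)."""
    mu = fmean(values)
    sigma = pstdev(values)  # population form: divides by N
    return sum((x - mu) ** 4 for x in values) / len(values) / sigma ** 4 - 3

def binned_counts(values, width=WIDTH):
    """Count observations per category, keyed by the category's lower bound."""
    return Counter((x // width) * width for x in values)

for number, values in columns.items():
    print(f"column {number}: kurtosis = {excess_kurtosis(values):5.2f}")
    counts = binned_counts(values)
    for lower in sorted(counts):
        print(f"  {lower:3d}-{lower + WIDTH - 1:3d} {'*' * counts[lower]}")
```

All three distributions are symmetric, so the difference between them is in how tightly the values cluster around the center relative to the tails.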

 

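| | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| | 18 | 18 | 23 | 22 | 23 | 15 | 23 |
| | 21 | 21 | 69 | 23 | 26 | 42 | 62 |
| | 23 | 23 | 75 | 69 | 40 | 55 | 66 |
| | 25 | 25 | 78 | 75 | 53 | 66 | 69 |
| | 25 | 25 | 87 | 78 | 56 | 70 | 69 |
| | 27 | 27 | 89 | 87 | 58 | 70 | 69 |
| | 30 | 30 | 92 | 89 | 61 | 74 | 69 |
| | 50 | 50 | 99 | 92 | 74 | 85 | 72 |
| | 112 | 112 | 112 | 99 | 88 | 98 | 76 |
| | | 112 | | 112 | 91 | 125 | 115 |
| Mean | 36.78 | 44.3 | 80.44 | 74.6 | 57 | 70 | 69 |
| Std Dev | 27.98 | 34.84 | 23.7 | 28.51 | 22.01 | 28.46 | 20.85 |
| C.V. | 0.76 | 0.79 | 0.29 | 0.38 | 0.39 | 0.41 | 0.3 |
| Variance | 782.62 | 1,213.61 | 561.8 | 813.04 | 484.6 | 810 | 434.8 |
| Skewness | 2.08 | 1.32 | -1.26 | -0.89 | 0 | 0 | 0 |
| Kurtosis | 2.86 | -0.05 | 1.24 | -0.4 | -1 | -0.01 | 1.74 |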
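The Mean, Std Dev, C.V. and Variance rows can be checked with a short Python sketch. It assumes the variance is calculated by dividing by N (the population form), since that is what reproduces, for example, the 782.62 shown for column 1 and the 810 shown for column 6. The function name `summarize` is hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance

# The seven distributions from the table (blank cells omitted).
columns = {
    1: [18, 21, 23, 25, 25, 27, 30, 50, 112],
    2: [18, 21, 23, 25, 25, 27, 30, 50, 112, 112],
    3: [23, 69, 75, 78, 87, 89, 92, 99, 112],
    4: [22, 23, 69, 75, 78, 87, 89, 92, 99, 112],
    5: [23, 26, 40, 53, 56, 58, 61, 74, 88, 91],
    6: [15, 42, 55, 66, 70, 70, 74, 85, 98, 125],
    7: [23, 62, 66, 69, 69, 69, 69, 72, 76, 115],
}

def summarize(values):
    """Return the mean, variance, std dev and coefficient of variation."""
    mean = fmean(values)
    variance = pvariance(values)  # population variance: divides by N
    std_dev = sqrt(variance)
    return {"mean": mean, "variance": variance,
            "std_dev": std_dev, "cv": std_dev / mean}

for number, values in columns.items():
    s = summarize(values)
    print(f"column {number}: mean = {s['mean']:6.2f}  std dev = {s['std_dev']:6.2f}  "
          f"C.V. = {s['cv']:.2f}  variance = {s['variance']:8.2f}")
```

Dividing by N - 1 instead (the sample form, `statistics.variance`) would give somewhat larger variances than those in the table.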

 

 

The following worktable can be used to help calculate the mean, standard deviation, coefficient of variation, skewness, and kurtosis for a data set. Please see me if you would like a sample data set. I cannot post these on the site due to copyright issues.

 

Worktable for Calculating Mean, Standard Deviation, Coefficient of Variation, Skewness, Kurtosis

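| Observation i | Xi | Xi - μ | (Xi - μ)² | μ² | (Xi - μ)³ | (Xi - μ)⁴ |
|---|---|---|---|---|---|---|
| 1 | | | | | | |
| 2 | | | | | | |
| 3 | | | | | | |
| 4 | | | | | | |
| 5 | | | | | | |
| 6 | | | | | | |
| 7 | | | | | | |
| 8 | | | | | | |
| 9 | | | | | | |
| 10 | | | | | | |
| 11 | | | | | | |
| 12 | | | | | | |
| 13 | | | | | | |
| 14 | | | | | | |
| 15 | | | | | | |
| Sum of column | | | | | | |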

Please see me for sample data to use with the above table.

N =                                   μ = (1/N)(ΣXi) =                    =

σ =                                    Coefficient of variation =

skewness =

kurtosis =
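For reference, these blanks can be filled in from the column sums of the worktable. The formulas below, with μ as the mean and σ as the standard deviation, are a sketch of one common set of definitions. They are assumed here because they reproduce the summary values in the table of seven distributions: they divide by N rather than N - 1, and they report kurtosis relative to the value of 3 for a normal distribution. Other textbooks and software packages use slightly different forms, so check which one applies before comparing results.

```latex
\begin{align*}
\mu &= \frac{1}{N}\sum_{i=1}^{N} X_i \\
\sigma &= \sqrt{\frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2} \\
\text{coefficient of variation} &= \frac{\sigma}{\mu} \\
\text{skewness} &= \frac{\frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^3}{\sigma^3} \\
\text{kurtosis} &= \frac{\frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^4}{\sigma^4} - 3
\end{align*}
```

Each sum corresponds to the "Sum of column" entry for the matching worktable column, so skewness uses the cubed deviations and kurtosis uses the deviations raised to the fourth power.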