# Comparing Normal Distributions & Histograms

Hi there! I’m having difficulty answering questions when they ask us to compare a normal distribution with a histogram (e.g. 3bii 2016).

Shape: I understand we discuss bell-curved and symmetrical for the normal; and asymmetrical (left or right skew) for the histogram.

Centre: The normal is unimodal, and the mean = mode = median. The histogram may or may not be bimodal, and then we say if the median and mean are left/right of the centre. How do we know if mean and median are left or right of the centre? Do we calculate them somehow?

Spread: Find and compare the range (don’t normal distributions technically extend forever? Where does the graph ‘start’ and ‘stop’?) If we were to calculate standard deviation to show range, how could we do this?

Proportions: What does this mean?

Thank you, any help is appreciated!!

Hello! Welcome back

Shape: I understand we discuss bell-curved and symmetrical for the normal; and asymmetrical (left or right skew) for the histogram.

That is correct. For the histogram you have to say that it is skewed to the right (the side where the “tail” is).

Centre: The normal is unimodal, and the mean = mode = median. The histogram may or may not be bimodal, and then we say if the median and mean are left/right of the centre. How do we know if mean and median are left or right of the centre? Do we calculate them somehow?

1. You described the first graph correctly.
2. For the second graph you should mention that it is also unimodal. A bimodal histogram would show two obvious peaks. Because the histogram in the question doesn’t have them, you need to say that it is unimodal.
You can see the mode visually (it is the highest peak). The median and mean are often not so obvious on a histogram, and you have to calculate or estimate them. For this example you can estimate that about 54 bottles (20 from 298-300g and about 34 from 300-302g) weigh less than the 302-304g class, and about 54 bottles (28 from 304-306g, 15 from 306-308g, 8 from 308-310g and 3 from 310-312g) weigh more than it, while 42 bottles weigh between 302 and 304g. So you can conclude that the median weight of the bottles is between 302 and 304g. You can also mention that the mean is located to the left of the middle of the histogram’s range (about 305g), because most of the bottles are bunched at the lower weights and the graph is skewed to the right.
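The counting argument above can be checked with a short Python sketch. The class intervals and counts are the ones quoted in this reply. The function names are made up for illustration, and the mean is only an estimate that assumes every bottle sits at the midpoint of its class.

```python
# Class intervals (grams) and bottle counts as quoted in the reply above.
bins = [(298, 300), (300, 302), (302, 304), (304, 306),
        (306, 308), (308, 310), (310, 312)]
counts = [20, 34, 42, 28, 15, 8, 3]

total = sum(counts)  # number of bottles in the sample


def median_class(bins, counts):
    """Return the class interval that contains the middle observation."""
    half = sum(counts) / 2
    running = 0
    for interval, count in zip(bins, counts):
        running += count
        if running >= half:
            return interval
    raise ValueError("counts must not be empty")


def grouped_mean(bins, counts):
    """Estimate the mean, assuming each bottle weighs its class midpoint."""
    midpoints = [(lo + hi) / 2 for lo, hi in bins]
    return sum(m * c for m, c in zip(midpoints, counts)) / sum(counts)


print("bottles:", total)
print("median class:", median_class(bins, counts))
print("estimated mean:", round(grouped_mean(bins, counts), 2))
```

With these counts the median class comes out as 302-304g and the estimated mean as roughly 303.3g, which is below the 305g midpoint of the histogram’s range.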

Spread: Find and compare the range (don’t normal distributions technically extend forever? Where does the graph ‘start’ and ‘stop’?)

You are correct: the theoretical graph does go on forever, asymptotically approaching the x-axis. However, in a practical situation we can accept that very close to zero is “about zero”, and that is where the graph “starts” and “ends”. The graph shows values of 296g and 324g as the “start” and “end”, so you can use these numbers to find the range. Your answer should state a range of about 28g. Saying “about” acknowledges the infinite nature of the distribution while showing that you can still use the graph for practical purposes.

If we were to calculate standard deviation to show range, how could we do this?

You learn in this course that in order to find the standard deviation you need to use the probability of a value being within a certain range. You don’t have this information, so you can’t use this method. But if you really want to, you can estimate it. You know that almost all of the data (99.7%) is located within +/- 3 standard deviations of the center. Here that is about +/- 14g, so your standard deviation should be around 4.67g. But this estimate again relies on assuming that the range (almost all of your data) lies between 296g and 324g.
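Written out as a rough calculation (this only works under the assumption that 296g and 324g mark 3 standard deviations either side of the center):

$$
\sigma \approx \frac{324 - 296}{6} = \frac{28}{6} \approx 4.67\ \text{g}
$$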

Proportions: What does this mean?

Comparing proportions is worth mentioning if you are aiming for excellence, as you were asked to compare the graphs. The obvious mismatch is that in your histogram a reasonably large proportion of the bottles weigh between 298 and 300g (20 out of 150), compared to the first graph, where only a very small proportion of the bottles have this weight (it is very close to the left “end” of the graph). You may also notice that on the histogram a very small proportion of the bottles weigh between 310 and 312g (3 out of 150), while on the first graph this weight is very close to the center and is therefore very common.
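If you want to put rough numbers on this comparison, here is a minimal Python sketch. It is not something the exam expects, and it rests on two assumptions: that the normal graph is centered at 310g (the midpoint of 296g and 324g), and that its standard deviation is the rough 14/3 g estimate from earlier. The helper name `normal_proportion` is made up for illustration.

```python
from statistics import NormalDist

# Assumed model: center 310 g (midpoint of 296 g and 324 g) and
# standard deviation 14/3 g (the rough "range is about 6 sd" estimate).
model = NormalDist(mu=310, sigma=14 / 3)


def normal_proportion(lo, hi):
    """Proportion of the assumed normal model lying between lo and hi grams."""
    return model.cdf(hi) - model.cdf(lo)


# Histogram counts quoted in the reply, out of 150 bottles.
histogram_counts = {(298, 300): 20, (310, 312): 3}
total_bottles = 150

for (lo, hi), count in histogram_counts.items():
    observed = count / total_bottles
    expected = normal_proportion(lo, hi)
    print(f"{lo}-{hi} g: histogram {observed:.3f}, normal model {expected:.3f}")
```

Under these assumptions the normal model puts only about 1% of bottles in 298-300g (the histogram has about 13%) and roughly 17% in 310-312g (the histogram has 2%), which matches the mismatch described above.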