Normal Distribution
Data can be "distributed" (spread out) in different ways.
It can be spread out more on the left, or more on the right.
Or it can be all jumbled up.
But there are many cases where the data tends to be around a central value with no bias left or right, and it gets close to a "Normal Distribution" like this:
A Normal Distribution
The "Bell Curve" is a Normal Distribution.
It is often called a "Bell Curve" because it looks like a bell.
Many things closely follow a Normal Distribution:
- heights of people
- size of things produced by machines
The Normal Distribution has:
- mean = median = mode
- symmetry about the center
- 50% of values less than the mean and 50% greater than the mean
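These properties can be checked numerically. The following is a minimal sketch, assuming simulated data drawn with Python's standard `random.gauss`; the function name `summarize_normal_sample`, the sample size and the seed are illustrative choices, not part of the original article. Simulated values only approximate the ideal curve, so the mean and median come out close rather than exactly equal.

```python
import random
import statistics


def summarize_normal_sample(n=100_000, mu=0.0, sigma=1.0, seed=42):
    """Draw n values from a Normal Distribution and summarize them.

    Returns (mean, median, share of values below the mean).
    The defaults are arbitrary illustrative choices.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    values = [rng.gauss(mu, sigma) for _ in range(n)]
    mean = statistics.fmean(values)
    median = statistics.median(values)
    share_below_mean = sum(1 for v in values if v < mean) / n
    return mean, median, share_below_mean


mean, median, share_below_mean = summarize_normal_sample()
print(f"mean={mean:.3f}  median={median:.3f}  share below mean={share_below_mean:.3f}")
```

With a large sample, the mean and median should nearly coincide and the share below the mean should be close to 0.5.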
Standard Deviation and Variance
Deviation just means how far from the normal.
Standard Deviation
The Standard Deviation is a measure of how spread out numbers are.
Its symbol is σ (the Greek letter sigma).
The formula is easy: it is the square root of the Variance. So now you ask, "What is the Variance?"
Variance
The Variance is defined as:
The average of the squared differences from the Mean.
To calculate the Variance, follow these steps:
- Work out the Mean (the simple average of the numbers)
- Then for each number: subtract the Mean and square the result (the squared difference).
- Then work out the average of those squared differences.
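The three steps above can be sketched in Python as follows. The function names `population_variance` and `population_std_dev` and the sample list are illustrative assumptions, not taken from the article.

```python
import math


def population_variance(numbers):
    """Average of the squared differences from the Mean (divides by N)."""
    # Step 1: work out the Mean.
    mean = sum(numbers) / len(numbers)
    # Step 2: for each number, subtract the Mean and square the result.
    squared_differences = [(x - mean) ** 2 for x in numbers]
    # Step 3: work out the average of those squared differences.
    return sum(squared_differences) / len(squared_differences)


def population_std_dev(numbers):
    """The Standard Deviation is the square root of the Variance."""
    return math.sqrt(population_variance(numbers))


# Illustrative values only (not from the article): the Mean is 5.
example = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(example))  # 4.0
print(population_std_dev(example))   # 2.0
```

Keeping the Variance in its own function mirrors the text: the Standard Deviation is defined in terms of it, so one simply wraps the other.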
Why square the differences?
If we just added up the differences from the mean ... the negatives would cancel the positives:
(4 + 4 - 4 - 4) / 4 = 0
So that won't work. How about we use absolute values?
(|4| + |4| + |-4| + |-4|) / 4 = (4 + 4 + 4 + 4) / 4 = 4
That looks good (and is the Mean Deviation), but what about this case:
(|7| + |1| + |-6| + |-2|) / 4 = (7 + 1 + 6 + 2) / 4 = 4
Oh no! It also gives a value of 4, even though the differences are more spread out!
So let us try squaring each difference (and taking the square root at the end):
√((4² + 4² + 4² + 4²) / 4) = √(64 / 4) = 4
√((7² + 1² + 6² + 2²) / 4) = √(90 / 4) = 4.74...
That is nice! The Standard Deviation is bigger when the differences are more spread out ... just what we want!
In fact this method is a similar idea to distance between points, just applied in a different way.
And it is easier to use algebra on squares and square roots than absolute values, which makes the standard deviation easy to use in other areas of mathematics.
Example
You and your friends have just measured the heights of your dogs (in millimeters).
The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and 300mm.
Find out the Mean, the Variance, and the Standard Deviation.
Your first step is to find the Mean:
Answer:
Mean = (600 + 470 + 170 + 430 + 300) / 5 = 1970 / 5 = 394
so the mean (average) height is 394 mm.
Now we calculate each dog's difference from the Mean:
600 - 394 = 206, 470 - 394 = 76, 170 - 394 = -224, 430 - 394 = 36, 300 - 394 = -94
To calculate the Variance, take each difference, square it, and then average the result:
Variance = (206² + 76² + (-224)² + 36² + (-94)²) / 5 = (42,436 + 5,776 + 50,176 + 1,296 + 8,836) / 5 = 108,520 / 5 = 21,704
So, the Variance is 21,704.
And the Standard Deviation is just the square root of Variance, so:
Standard Deviation: σ = √21,704 = 147.32... = 147 (to the nearest mm)
And the good thing about the Standard Deviation is that it is useful. Now we can show which heights are within one Standard Deviation (147mm) of the Mean, that is, between 247mm and 541mm: 470mm, 430mm and 300mm are inside this range, while 600mm and 170mm are outside it.
So, using the Standard Deviation we have a "standard" way of knowing what is normal, and what is extra large or extra small.
But ... there is a small change with Sample Data
Our example was for a Population (the 5 dogs were the only dogs we were interested in).
But if the data is a Sample (a selection taken from a bigger Population), then the calculation changes!
When you have "N" data values that are:
- The Population: divide by N when calculating Variance (like we did)
- A Sample: divide by N-1 when calculating Variance
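The two cases can be compared with a short Python sketch that reuses the dog heights from the example. It relies on the standard library's `statistics` module, where `pvariance` and `pstdev` divide by N and `variance` and `stdev` divide by N-1. Treating the same five dogs as a Sample is an assumption made purely for illustration.

```python
import statistics

# Dog heights (mm) from the worked example.
heights_mm = [600, 470, 170, 430, 300]

# Population: the 5 dogs are the only dogs of interest, so divide by N.
population_variance = statistics.pvariance(heights_mm)
population_sd = statistics.pstdev(heights_mm)

# Sample: pretend (for illustration only) the 5 dogs were drawn from a
# bigger Population, so divide by N-1.
sample_variance = statistics.variance(heights_mm)
sample_sd = statistics.stdev(heights_mm)

print(f"Population: variance={population_variance}, sd={population_sd:.2f}")
print(f"Sample:     variance={sample_variance}, sd={sample_sd:.2f}")
```

The population figures match the worked example (Variance 21,704, Standard Deviation about 147). Dividing the same sum of squared differences, 108,520, by 4 instead of 5 gives a Sample Variance of 27,130 and a Sample Standard Deviation of about 165, so the sample figures are always a little larger.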
All other calculations stay the same, including how we calculated the mean.
Think of it as a "correction" when your data is only a sample.
Formulas
Here are the two formulas, explained at Standard Deviation Formulas if you want to know more:
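The "Population Standard Deviation":
σ = √( (1/N) × Σ (xᵢ - μ)² )
The "Sample Standard Deviation":
s = √( (1/(N-1)) × Σ (xᵢ - x̄)² )
Here μ is the Population mean, x̄ is the Sample mean, and the sum Σ runs over all N values xᵢ. These look complicated, but the important change is to divide by N-1 (instead of N) when calculating a Sample Variance.
- Reference https://www.mathsisfun.com/data/standard-normal-distribution.html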