Averages, statistical diagrams, cumulative frequency, box plots and histograms.
Types of Data
Statistics begins with knowing what kind of data you are handling, because it decides which averages and diagrams are sensible.
Key terms Primary data is collected first-hand by you; secondary data is taken from an existing source.
Population is the whole group of interest; a sample is a smaller part chosen to represent it. A good sample is large and unbiased.
Mean, Median, Mode and Range
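For a simple list of values: the mean is the total of the values divided by how many there are; the median is the middle value once the list is in order; the mode is the most common value; the range is the largest value minus the smallest.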
Worked example For : mean , median (3rd value), mode , range .
Mean from a Frequency Table
When data is grouped by frequency, multiply each value $x$ by its frequency $f$, total the products $fx$, then divide by the total frequency.
| Goals $x$ | Frequency $f$ | $fx$ |
|---|---|---|
| 0 | 5 | 0 |
| 1 | 8 | 8 |
| 2 | 4 | 8 |
| 3 | 3 | 9 |
| Total | 20 | 25 |
Mean $= \frac{25}{20} = 1.25$ goals. The mode is 1 (highest frequency). The median is the $\frac{20+1}{2} = 10.5$th value, which falls in the "1" group, so the median is 1.
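The same calculation can be checked with a short script. This is a minimal sketch using the goals table above; the helper name `value_at` is made up for illustration.

```python
# Goals table from the worked example: value -> frequency
table = {0: 5, 1: 8, 2: 4, 3: 3}

total_f = sum(table.values())                    # total frequency
total_fx = sum(x * f for x, f in table.items())  # sum of value * frequency
mean = total_fx / total_f

# Mode: the value with the highest frequency
mode = max(table, key=table.get)


def value_at(position):
    """Return the value at a 1-based position in the ordered data."""
    running = 0
    for x in sorted(table):
        running += table[x]
        if position <= running:
            return x
    raise ValueError("position is beyond the end of the data")


# Average the two middle values; for an odd total they are the same value.
lower_mid = value_at((total_f + 1) // 2)
upper_mid = value_at((total_f + 2) // 2)
median = (lower_mid + upper_mid) / 2

print(total_f, total_fx, mean, mode, median)
```

The running total inside `value_at` is the same idea as counting down the frequency column by hand until you pass the middle position.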
Estimated Mean from Grouped Data
With grouped continuous data you do not know exact values, so use the midpoint of each class as a best estimate of $x$.
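Worked example Times (minutes) for 40 runners:
| Time (min) | Freq $f$ | Midpoint $x$ | $fx$ |
|---|---|---|---|
| | 6 | 5 | 30 |
| | 14 | 15 | 210 |
| | 13 | 25 | 325 |
| | 7 | 35 | 245 |
Estimated mean $= \frac{810}{40} = 20.25$ min, where 810 is the total of the $fx$ column and 40 is the total frequency.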
Watch out The result is only an estimate, so call it the "estimated mean" and do not quote more accuracy than the grouped data supports. Use midpoints, not class boundaries.
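As a quick check of the midpoint method, here is a minimal sketch that takes the midpoints and frequencies straight from the runners table. The function name `estimated_mean` is an illustrative choice, not part of any library.

```python
# (midpoint, frequency) pairs copied from the runners table
classes = [(5, 6), (15, 14), (25, 13), (35, 7)]


def estimated_mean(pairs):
    """Estimate the mean of grouped data from (midpoint, frequency) pairs."""
    total_f = sum(f for _, f in pairs)
    total_fx = sum(midpoint * f for midpoint, f in pairs)
    return total_fx / total_f


print(estimated_mean(classes))
```

Each midpoint stands in for every value in its class, which is why the answer is an estimate rather than the true mean.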
Statistical Diagrams
Exam tip In a pie chart question, to go back from an angle to a frequency, divide the angle by $360^\circ$ and multiply by the total.
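The conversion works in both directions because a full circle of 360 degrees represents the total frequency. A minimal sketch, with hypothetical numbers (a total of 40 and a 90-degree sector) chosen only for illustration:

```python
def angle_to_frequency(angle_deg, total):
    """Frequency represented by a sector of the given angle."""
    return angle_deg / 360 * total


def frequency_to_angle(frequency, total):
    """Sector angle (degrees) needed to show the given frequency."""
    return frequency / total * 360


# Hypothetical example: 40 people surveyed, one sector measures 90 degrees
print(angle_to_frequency(90, 40))
print(frequency_to_angle(10, 40))
```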
Scatter Graphs and Correlation
A scatter graph plots paired data to reveal a relationship.
A line of best fit is a straight line following the trend with roughly equal numbers of points either side, passing through the mean point $(\bar{x}, \bar{y})$. Use it to estimate values: interpolation (within the data range) is reliable; extrapolation (beyond it) is risky.
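In an exam the line is drawn by eye, but a computed least-squares line (a different technique, swapped in here for illustration) shows the same key property: it always passes through the mean point. The data below is hypothetical and the names `predict` and `is_interpolation` are made up for this sketch.

```python
from statistics import mean

# Hypothetical paired data, invented for illustration only
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

x_bar, y_bar = mean(hours), mean(score)  # the mean point

# Least-squares slope and intercept
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, score))
sxx = sum((x - x_bar) ** 2 for x in hours)
slope = sxy / sxx
intercept = y_bar - slope * x_bar  # forces the line through the mean point


def predict(x):
    """Estimate y from the fitted line."""
    return slope * x + intercept


def is_interpolation(x):
    """True if x lies within the range of the observed data."""
    return min(hours) <= x <= max(hours)


print(predict(x_bar), y_bar)
print(is_interpolation(3.5), is_interpolation(8))
```

An estimate at `x = 8` would still come out of `predict`, but `is_interpolation` flags it as outside the data, so it should be treated with caution.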
Watch out Correlation does not prove causation. Two things may rise together because of a third hidden factor.
Cumulative Frequency
Cumulative frequency is a running total of frequencies. Plot it against the upper class boundary of each group, then join the points with a smooth curve.
To find the median, read across from $\frac{n}{2}$ (for a curve, use $\frac{n}{2}$, not $\frac{n+1}{2}$). The lower quartile is read at $\frac{n}{4}$ and the upper quartile at $\frac{3n}{4}$.
Worked example For : median at min; at min; at min.
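Reading a cumulative frequency graph can be imitated in code. The sketch below uses the frequencies from the runners table, but the class boundaries are an assumption (equal 10-minute classes starting at 0), and it joins the points with straight lines rather than a smooth curve, so its readings will differ slightly from a hand-drawn graph.

```python
from itertools import accumulate

# Frequencies from the runners table. The boundaries below are an ASSUMPTION
# (equal 10-minute classes starting at 0); they are not stated in the text.
upper_bounds = [10, 20, 30, 40]
freqs = [6, 14, 13, 7]
lowest_bound = 0  # assumed start of the first class

cum_freq = list(accumulate(freqs))  # running totals, plotted at upper bounds
n = cum_freq[-1]


def read_off(target):
    """Estimate the value at cumulative frequency `target`.

    Uses straight-line interpolation between plotted points, as a stand-in
    for reading across and down on a smooth curve.
    """
    xs = [lowest_bound] + upper_bounds
    cfs = [0] + cum_freq
    for i in range(1, len(xs)):
        if target <= cfs[i]:
            fraction = (target - cfs[i - 1]) / (cfs[i] - cfs[i - 1])
            return xs[i - 1] + fraction * (xs[i] - xs[i - 1])
    raise ValueError("target exceeds the total frequency")


median = read_off(n / 2)
q1 = read_off(n / 4)
q3 = read_off(3 * n / 4)
iqr = q3 - q1

print(cum_freq, n)
print(round(q1, 1), round(median, 1), round(q3, 1), round(iqr, 1))
```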
Box Plots
A box plot (box-and-whisker) summarises five numbers: minimum, $Q_1$, median, $Q_3$, maximum. The box spans the IQR; the line inside is the median; whiskers reach the extremes.
Box plots make it easy to compare two distributions: compare medians for average and IQRs (box widths) for consistency.
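The comparison comes down to two subtractions and two inequalities. A minimal sketch, using hypothetical five-number summaries for two classes (all figures and the `FiveNumber` type are invented for illustration):

```python
from typing import NamedTuple


class FiveNumber(NamedTuple):
    """The five values a box plot displays."""
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    @property
    def iqr(self):
        # Interquartile range: the width of the box
        return self.q3 - self.q1


# Hypothetical test-score summaries for two classes
class_a = FiveNumber(35, 48, 56, 63, 80)
class_b = FiveNumber(20, 40, 60, 75, 95)

# Higher median means higher on average; smaller IQR means more consistent
higher_average = "A" if class_a.median > class_b.median else "B"
more_consistent = "A" if class_a.iqr < class_b.iqr else "B"

print(class_a.iqr, class_b.iqr, higher_average, more_consistent)
```

In a written answer, quote both figures and interpret them in context, for example "Class B has the higher median, so scored better on average; Class A has the smaller IQR, so its scores were more consistent."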
Histograms with Unequal Class Widths
When class widths differ, bar heights must show frequency density, not frequency — otherwise wide classes look misleadingly large. The area of each bar equals the frequency.
Worked example
| Mass (kg) | Freq | Width | Freq density |
|---|---|---|---|
| | 8 | 10 | 0.8 |
| | 18 | 10 | 1.8 |
| | 24 | 20 | 1.2 |
| | 9 | 30 | 0.3 |
Exam tip The golden rule is frequency = area = frequency density × class width. If a question gives you density and width, multiply; if it gives frequency and width, divide. Bars in a histogram have no gaps.
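Both directions of the golden rule can be checked against the mass table. This is a minimal sketch using the frequencies and class widths from the worked example; the variable names are illustrative.

```python
# (frequency, class width) pairs from the mass table
rows = [(8, 10), (18, 10), (24, 20), (9, 30)]

# Divide to get bar heights: frequency density = frequency / class width
densities = [f / w for f, w in rows]

# Multiply to get back to frequencies: area = frequency density * class width
areas = [d * w for d, (_, w) in zip(densities, rows)]

print([round(d, 2) for d in densities])
print([round(a, 2) for a in areas])
```

The widest class (width 30) ends up with the shortest bar even though its frequency is not the smallest, which is exactly the distortion that plotting raw frequency would hide.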