Year 9/10 Maths Tutorials

Summarising Data

Types of Data

Data can be:

Categorical - e.g. makes of cars: Ford, Toyota, Holden, Mazda
Numerical - height of a person in centimetres
Nominal - yes, no
Ordinal - first, second, third etc

Data can also be classified as

Discrete – count data, categorical data, etc.
Continuous – data that can take any value between two prescribed values, e.g. time to run 100m.

Variables

A variable is a name given to a set of data, e.g. Let X be the heights of 100 selected people. A variable is denoted by a (capital) letter.

A particular value of X is called a score, e.g. the height of the tenth person is 74.44cm. To talk about scores in general, use the lower case version of the variable's letter. So x is a score relating to the variable X.

Stem and Leaf Plots

Large amounts of data often need to be summarised in order to make it easier to analyse.

Stem and leaf plots are a method for showing the frequency with which certain classes of values occur. A stem and leaf plot is a two-column table: one column for the stem and one for the leaf. Each row in the table represents data points with the same stem but different leaves.

There is a lot of choice in how you decide on which part of a number is a stem and which part is a leaf.

For two-digit numbers, you can make the stem the left-most digit of the number. The leaf is the final digit of the number. The digits in the leaf are arranged in increasing order. There is one leaf for each data item, so repeated leaf values can occur.

For numbers with more than two digits, other types of stem/leaf division may need to be used.

Example Here are the scores of 13 students in a test:

and here is the stem and leaf plot

When summarising a set of data, making a stem and leaf plot before you do anything else helps to get a feel for the data.

Example Here are the lengths in centimetres of 45 examples of a certain species of tropical fish:

14	19	15	30	25	24	36	21	43
20	13	18	16	31	26	23	37	20
27	21	12	17	17	32	27	22	38
22	26	22	11	16	18	33	28	21
41	23	28	23	32	15	19	34	29

and a possible stem and leaf plot is

You can see from the stem and leaf plot that just using the first digit to break the data values into groups may not give a very good picture of the spread of the data. So the grouping may need to be refined.
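
As a minimal sketch, the first-digit plot for the fish lengths can be built in Python. It assumes the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is illustrative only.

```python
from collections import defaultdict

# Fish lengths in centimetres, copied from the table above.
lengths = [
    14, 19, 15, 30, 25, 24, 36, 21, 43,
    20, 13, 18, 16, 31, 26, 23, 37, 20,
    27, 21, 12, 17, 17, 32, 27, 22, 38,
    22, 26, 22, 11, 16, 18, 33, 28, 21,
    41, 23, 28, 23, 32, 15, 19, 34, 29,
]


def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem); leaves are the units digits."""
    rows = defaultdict(list)
    # Sorting first means the leaves in each row come out in increasing order.
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    return dict(rows)


plot = stem_and_leaf(lengths)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Running this prints four rows (stems 1 to 4). The row for stem 2 holds 20 of the 45 leaves, which is why a finer grouping can give a better picture of the spread.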


Dot Plots

Data can be plotted with dot plots, in which a vertical stack of dots above each data value shows how many times that value occurs.

Example Here are the brands of mobile phones owned by 26 people

Brand	Number
Apple	5
Samsung	14
Sony	4
Motorola	2
Telstra	1

and here is a dot plot of the data:
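
One way to draw such a plot as plain text is sketched below in Python. The counts come from the table above; the function name `dot_plot` and the use of the letter `o` as the dot are illustrative choices.

```python
# Number of phones of each brand, copied from the table above.
counts = {"Apple": 5, "Samsung": 14, "Sony": 4, "Motorola": 2, "Telstra": 1}


def dot_plot(counts):
    """Return a text dot plot: one column per category, one dot per data item."""
    width = max(len(name) for name in counts) + 2  # column width incl. padding
    tallest = max(counts.values())
    lines = []
    # Build the plot from the top row down to the bottom row.
    for level in range(tallest, 0, -1):
        row = "".join(
            ("o" if n >= level else "").center(width) for n in counts.values()
        )
        lines.append(row.rstrip())
    # The last line labels each column with its category name.
    lines.append("".join(name.center(width) for name in counts).rstrip())
    return "\n".join(lines)


print(dot_plot(counts))
```

Each column's height equals the brand's count, so the tallest column (Samsung, 14 dots) stands out immediately.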

Frequency Tables

A frequency table summarises the data by grouping it in ranges of values. The ranges are called classes.

A frequency table has at least two columns: one for the classes the data is grouped into and one for the number of data items in each class. The second column is the frequency.

The process for constructing a frequency table is

Decide on a convenient number of (generally equal size) classes. There are generally between 5 and 15 classes, depending on the amount of data.
Find the class width - take the difference between the highest and lowest scores, round this number up so that it is divisible by the number of classes, then divide it by the number of classes.
Allocate each score to its class.
Count the number of items belonging to each class. This count is called the class frequency.
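
The class width step can be sketched in Python as follows. The function name `class_width` is illustrative, and the sample values (lowest score 11, highest score 43, seven classes) are taken from the fish-length example later in this section.

```python
def class_width(lowest, highest, num_classes):
    """Return (rounded range, class width) following the rule described above."""
    score_range = highest - lowest
    # Round the range up to the next multiple of num_classes
    # (ceiling division using integer arithmetic).
    rounded = -(-score_range // num_classes) * num_classes
    return rounded, rounded // num_classes


rounded_range, width = class_width(11, 43, 7)
print(rounded_range, width)  # 35 5
```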

There are some terms associated with the resulting table.

Class interval - the width or size of each class
Upper and Lower boundaries - the largest and smallest values in a class
Class mark - the mid point of the class interval. This number is used as the score for all members of the class.

Example Continuing with our sample of 45 examples of a certain species of tropical fish, you saw from the stem and leaf plot that just using four classes did not give a very enlightening picture of the spread of the data. Using more classes will help.

For this data, seven classes were selected. To calculate the class width, first find the maximum and minimum scores in the data and take their difference: \[ \text{max} - \text{min} = 43 - 11 = 32 \] Rounding 32 up to 35 makes it divisible by 7, which gives a class width of \( 35 \div 7 = 5 \).

Here is the frequency table constructed from the data:
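
A table of this kind can be computed with the Python sketch below. It uses seven classes of width 5, as calculated above. It also assumes that the first class starts at the lowest score (11) and that both ends of each class are included; the text does not state where the classes start, so a different starting point would give different frequencies.

```python
# Fish lengths in centimetres, copied from the table above.
lengths = [
    14, 19, 15, 30, 25, 24, 36, 21, 43,
    20, 13, 18, 16, 31, 26, 23, 37, 20,
    27, 21, 12, 17, 17, 32, 27, 22, 38,
    22, 26, 22, 11, 16, 18, 33, 28, 21,
    41, 23, 28, 23, 32, 15, 19, 34, 29,
]

num_classes = 7
width = 5
start = min(lengths)  # ASSUMPTION: the first class starts at the lowest score

table = []
for i in range(num_classes):
    lower = start + i * width          # smallest value in the class
    upper = lower + width - 1          # largest value in the class
    mark = (lower + upper) / 2         # class mark: mid point of the class
    frequency = sum(lower <= x <= upper for x in lengths)
    table.append((lower, upper, mark, frequency))

print("Class\tClass mark\tFrequency")
for lower, upper, mark, frequency in table:
    print(f"{lower}-{upper}\t{mark}\t{frequency}")
```

Under these assumptions the classes are 11-15, 16-20, 21-25, 26-30, 31-35, 36-40 and 41-45, with frequencies 6, 10, 11, 8, 5, 3 and 2, which add up to 45.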

Once you have a frequency table you can use it to create different visual presentations of your data.
