When running an experiment or conducting a survey, we can end up with hundreds, thousands or even millions of values in the resulting data set. So many data can be overwhelming, so we need to reduce them or represent them in a way that is easier to understand and communicate.
Statistics is about summarising data. The methods of statistics allow us to represent the essential information in a data set while disregarding the unimportant information. We have to be careful to make sure that we do not accidentally throw away some of the important aspects of a data set.
By applying statistics properly we can highlight the important aspects of data and make the data easier to interpret. By applying statistics poorly or dishonestly we can also hide important information and lead people to draw the wrong conclusions.
In this chapter we will look at a few numerical and graphical ways in which data sets can be represented, to make them easier to interpret.
10.1 Collecting data
 Data

Data are the pieces of information that have been observed and recorded from an experiment or a survey.
The word data is the plural of the word datum, and therefore one should say, “the data are” and not “the data is”.
We distinguish between two main types of data: quantitative and qualitative.
 Quantitative data

Quantitative data are data that can be written as numbers.
Quantitative data can be discrete or continuous. Discrete quantitative data can be represented by integers and usually occur when we count things, for example, the number of learners in a class, the number of molecules in a chemical solution, or the number of SMS messages sent in one day.
Continuous quantitative data can be represented by real numbers, for example, the height or mass of a person, the distance travelled by a car, or the duration of a phone call.
 Qualitative data

Qualitative data are data that cannot be written as numbers.
Two common types of qualitative data are categorical and anecdotal data. Categorical data take one of a limited number of possible values, for example, your favourite cooldrink, the colour of your cell phone, or the language that you learnt to speak at home.
Anecdotal data take the form of an interview or a story, for example, when you ask someone what their personal experience was when using a product, or what they think of someone else's behaviour.
Categorical qualitative data are sometimes turned into quantitative data by counting the number of times that each category appears. For example, in a class with \(\text{30}\) learners, we ask everyone what the colours of their cell phones are and get the following responses:
black  black  black  white  purple  red  red  black  black  black 
white  white  black  black  black  black  purple  black  black  white 
purple  black  red  red  white  black  orange  orange  black  white 
This is a categorical qualitative data set since each of the responses comes from one of a small number of possible colours.
We can represent exactly the same data in a different way, by counting how many times each colour appears.
Colour  Count 
black  \(\text{15}\) 
white  \(\text{6}\) 
red  \(\text{4}\) 
purple  \(\text{3}\) 
orange  \(\text{2}\) 
This is a discrete quantitative data set since each count is an integer.
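The counting step above can also be done by a computer. The following is a minimal Python sketch, not part of the original lesson; the variable names `responses` and `counts` are our own choices, and `Counter` comes from Python's standard library.

```python
from collections import Counter

# Cell phone colours reported by the 30 learners, in the order listed above.
responses = (
    "black black black white purple red red black black black "
    "white white black black black black purple black black white "
    "purple black red red white black orange orange black white"
).split()

# Counter tallies how many times each category appears.
counts = Counter(responses)

# most_common() lists the categories from the highest count to the lowest.
for colour, count in counts.most_common():
    print(f"{colour:<8}{count}")
```

The responses themselves are categorical, but each tally stored in `counts` is an integer, which is why the table of counts is a discrete quantitative data set.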
Worked example 1: Qualitative and quantitative data
Thembisile is interested in becoming an airtime reseller to his classmates. He would like to know how much business he can expect from them. He asked each of his \(\text{20}\) classmates how many SMS messages they sent during the previous day. The results were:
\(\text{20}\)  \(\text{3}\)  \(\text{0}\)  \(\text{14}\)  \(\text{30}\)  \(\text{9}\)  \(\text{11}\)  \(\text{13}\)  \(\text{13}\)  \(\text{15}\) 
\(\text{9}\)  \(\text{13}\)  \(\text{16}\)  \(\text{12}\)  \(\text{13}\)  \(\text{7}\)  \(\text{17}\)  \(\text{14}\)  \(\text{9}\)  \(\text{13}\) 
Is this data set qualitative or quantitative? Explain your answer.
The number of SMS messages is a count represented by an integer, which means that it is quantitative and discrete.
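The reasoning in this answer can be illustrated with a rough Python sketch. The function `describe` is a hypothetical helper written for this illustration, and it relies on an assumption: that whole-number values are counts. Measurements that were simply rounded to whole numbers would be labelled incorrectly, so the context of the data still has to be considered.

```python
# Number of SMS messages sent by each of the 20 classmates.
sms_counts = [20, 3, 0, 14, 30, 9, 11, 13, 13, 15,
              9, 13, 16, 12, 13, 7, 17, 14, 9, 13]


def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def describe(values):
    """Roughly classify a list of observations (illustrative heuristic only)."""
    if all(is_number(v) for v in values):
        # Assumption: values stored as integers are counts.
        if all(isinstance(v, int) for v in values):
            return "quantitative discrete"
        return "quantitative continuous"
    return "qualitative"


print(describe(sms_counts))
```

For the SMS data every value is an integer, so the sketch reports the same classification as the answer above.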
Worked example 2: Qualitative and quantitative data
Thembisile would like to know which cellular provider is the most popular among learners in his school. This time Thembisile randomly selects \(\text{20}\) learners from the entire school and asks them which cellular provider they currently use. The results were:
Cell C  Vodacom  Vodacom  MTN  Vodacom 
MTN  MTN  Virgin Mobile  Cell C  8ta 
Vodacom  MTN  Vodacom  Vodacom  MTN 
Vodacom  Vodacom  Vodacom  Virgin Mobile  MTN 
Is this data set qualitative or quantitative? Explain your answer.
Since each response is not a number, but one of a small number of possibilities, these are categorical qualitative data.
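Although the responses are not numbers, counting them gives Thembisile the information he wants. The sketch below is one possible way to do this in Python; the names `providers` and `provider_counts` are our own. Keep in mind that a sample of only 20 learners suggests, but does not prove, which provider is the most popular in the whole school.

```python
from collections import Counter

# Provider named by each of the 20 randomly selected learners.
providers = [
    "Cell C", "Vodacom", "Vodacom", "MTN", "Vodacom",
    "MTN", "MTN", "Virgin Mobile", "Cell C", "8ta",
    "Vodacom", "MTN", "Vodacom", "Vodacom", "MTN",
    "Vodacom", "Vodacom", "Vodacom", "Virgin Mobile", "MTN",
]

# Count how many learners named each provider.
provider_counts = Counter(providers)

# most_common(1) returns a list holding the single most frequent category
# together with its count.
most_popular, top_count = provider_counts.most_common(1)[0]
print(most_popular, top_count)
```

In this sample, Vodacom is named most often, by 9 of the 20 learners.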
Exercise 10.1
The following data set of dreams that learners have was collected from Grade 12 learners just after their final exams.
\(\{\text{"I want to build a bridge!"; "I want to help the sick."; "I want running water!"}\}\)
Categorise the data set.
This data set cannot be written as numbers and so must be qualitative.
This data set is anecdotal since it takes the form of a story.
Therefore the data set is qualitative anecdotal.
The following data set of sweets in a packet was collected from visitors to a sweet shop.
\(\{23; 25; 22; 26; 27; 25; 21; 28\}\)
Categorise the data set.
This data set is a set of numbers and so must be quantitative.
This data set is discrete since it can be represented by integers and is a count of the number of sweets.
Therefore the data set is quantitative discrete.
The following data set of questions answered correctly was collected from a class of maths learners.
\(\{3; 5; 2; 6; 7; 5; 1; 2\}\)
Categorise the data set.
This data set is a set of numbers and so must be quantitative.
This data set is discrete since it can be represented by integers and is a count of the number of questions answered correctly.
Therefore the data set is quantitative discrete.