English Pages [753] Year 2023
VCE GENERAL MATHEMATICS Units 3 & 4
Robert Borg, Duyen Duong, James Boyce, Zephyr Howson, Sophie Watt, Clinton Bouphasavanh, Joshua Clements, Victoria Flynn, Nina Miriyagalla, Angus Plowman, Talia Scott-Hayward, Justin Tan, Ying Qin, Patrick Robertson
Need help? Email our School Support team at [email protected] or call 1300 EDROLO | 1300 337 656
At Edrolo, we’re transforming the way students learn and teachers teach. Our mission is simple: to improve education.
PUBLISHED IN AUSTRALIA BY Edrolo 321 Exhibition Street Melbourne VIC 3000, Australia
© Edrolo 2023
Ref: 1.1.1
The moral rights of the authors have been asserted. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Edrolo, or as expressly permitted by law, by licence, or under terms agreed with the appropriate reprographics rights organisation. Enquiries concerning reproduction outside the scope of the above should be sent to Edrolo, at the address above. You must not circulate this work in any other form and you must impose this same condition on any acquirer.
National Library of Australia Cataloguing-in-Publication data
TITLE: Edrolo VCE General Mathematics Units 3 & 4
CREATOR: Edrolo et al.
ISBN: 978-1-922901-01-9
TARGET AUDIENCE: For secondary school age.
SUBJECTS: General Mathematics--Study and teaching (Secondary)--Victoria. General Mathematics--Victoria--Textbooks. General Mathematics--Theory, exercises, etc.
OTHER CREATORS/CONTRIBUTORS: Daniel Tram, Hannah Liu, Simon Hamlet, Odette Mawal, Irene Platis, James Vella, James Wallace
REPRODUCTION AND COMMUNICATION FOR EDUCATIONAL PURPOSES
The Australian Copyright Act 1968 (the Act) allows a maximum of one chapter or 10% of the pages of this work, whichever is the greater, to be reproduced and/or communicated by any educational institution for its educational purposes provided that the educational institution (or the body that administers it) has given a remuneration notice to Copyright Agency Limited (CAL) under the Act.
FOR DETAILS OF THE CAL LICENCE FOR EDUCATIONAL INSTITUTIONS CONTACT:
Copyright Agency Limited, Level 15, 233 Castlereagh Street, Sydney NSW 2000
Telephone: (02) 9394 7600
Facsimile: (02) 9394 7601
Email: [email protected]
LAYOUT DESIGN: Emma Wright and Edrolo
TYPESET BY: Emma Wright, Arslan Khan, Belle Gibson, Esra Yang, Dean Dragonetti
COVER DESIGN BY: Cat MacInnes
Labelled images used under licence from Shutterstock.com. Every effort has been made to trace the original source of copyright material in this book. The publisher will be pleased to hear from copyright holders to rectify any errors or omissions.
DISCLAIMER: Extracts from the VCE General Mathematics Study Design (2023-2027) used with permission. VCE is a registered trademark of the VCAA. The VCAA does not endorse or make any warranties regarding this study resource. Current VCE Study Designs, VCE exams and related content can be accessed directly at www.vcaa.vic.edu.au.
Printed in Australia by Ligare Printing Pty Ltd
The paper this book is printed on is in accordance with the standards of the Forest Stewardship Council®. The FSC® promotes environmentally responsible, socially beneficial and economically viable management of the world’s forests.
CONTENTS
FEATURES OF THIS BOOK ........ IV
AOS 1 Data analysis ........ VI
Calculator quick look-up guide
01 Investigating data distributions ........ 1
1A Types of data ........ 2
1B Displaying and describing categorical data ........ 9
1C Displaying numerical data ........ 20
1D Log scales and graphs ........ 37
1E The five-number summary and boxplots ........ 47
1F Describing numerical data ........ 63
1G Introduction to standard deviation ........ 80
1H The normal distribution ........ 87
1I z-scores ........ 97
02 Investigating associations between two variables ........ 105
2A Associations between two categorical variables ........ 106
2B Associations between numerical and categorical variables ........ 119
2C Associations between two numerical variables ........ 131
2D Correlation and causation ........ 145
03 Investigating and modelling linear associations ........ 155
3A Fitting a least squares regression line ........ 156
3B Interpreting a least squares regression line ........ 169
3C Performing a regression analysis ........ 179
3D Data transformations ........ 192
3E Data transformations – applications ........ 206
04 Investigating and modelling time series data ........ 221
4A Time series data and their graphs ........ 222
4B Smoothing – moving means ........ 234
4C Smoothing – moving medians ........ 254
4D Seasonal adjustments ........ 268
4E Time series data and least squares regression modelling ........ 280
AOS 2 Recursion and financial modelling ........ 292
Calculator quick look-up guide
05 Recurrence relations and basic financial applications ........ 293
5A Recurrence relations and their graphs ........ 294
5B Flat rate and unit cost depreciation – recurrence relations ........ 306
5C Reducing balance depreciation – recurrence relations ........ 316
5D Depreciation – finding the rule for the nth term ........ 325
5E Simple interest ........ 333
5F Compound interest ........ 339
5G Nominal and effective interest rates ........ 348
06 Advanced financial mathematics ........ 353
6A Introducing financial applications ........ 354
6B Reducing balance loans ........ 362
6C Interest-only loans ........ 375
6D Amortising annuities ........ 383
6E Perpetuities ........ 395
6F Annuity investments ........ 405
AOS 2 Matrices ........ 416
Calculator quick look-up guide
07 Matrices ........ 417
7A Introduction to matrices ........ 418
7B Operations with matrices ........ 429
7C Advanced operations with matrices ........ 439
7D Inverse matrices ........ 451
7E Binary and permutation matrices ........ 460
7F Communication and dominance matrices ........ 469
7G Introduction to transition matrices ........ 481
7H The equilibrium state matrix ........ 496
7I Applications of transition matrices ........ 504
AOS 2 Networks and decision mathematics ........ 516
08 Networks and decision mathematics ........ 517
8A Introduction to graphs and networks ........ 518
8B Graphs, networks and matrices ........ 532
8C Exploring and travelling problems ........ 547
8D Minimum connector problems ........ 562
8E Flow problems ........ 573
8F Shortest path problems ........ 585
8G Matching problems ........ 596
8H Activity networks and precedence tables ........ 608
8I Critical path analysis ........ 622
8J Crashing ........ 636
Answers ........ 647
GLOSSARY ........ 742
FEATURES OF THIS BOOK
Edrolo’s VCE General Mathematics Units 3 & 4 product has the following features.
Textbook theory
Key terms identify newly defined mathematical terminology and provide a reference for navigating glossary definitions.
Study design dot points provide explicit links between the content covered in each lesson and the VCAA curriculum.
Key skills break the theory down into smaller chunks that focus on only one skill at a time, with key skill headings replicated throughout the theory, questions, and answers for easy navigation.
Calculator methods with screenshots step students through using the ‘TI-Nspire’ and ‘Casio ClassPad’ CAS calculators.
Worked examples provide fully stepped out exemplar solutions.
Introductions provide a launchpad for the lesson and serve to give context for the theory.
Exam question breakdowns provide an extra level of support by stepping through past exam questions, including the percentage of students who answered the question correctly, as well as common misconceptions and errors made, based on VCAA statistics.
Exam question breakdown
VCAA 2019 Exam 2 Recursion and financial modelling Q9a
Phil would like to purchase a block of land. He will borrow $350 000 to make this purchase. Interest on this loan will be charged at the rate of 4.9% per annum, compounding fortnightly. After three years of equal fortnightly repayments, the balance of Phil’s loan will be $262 332.33.
What is the value of each fortnightly repayment Phil will make? Round to the nearest cent. (1 MARK)
27% of students answered this question correctly.
Explanation
Step 1: Determine the financial solver inputs.
N: 78 (there are 78 fortnights in 3 years)
I(%): 4.9 (annual interest rate)
PV: 350 000 (this is positive because Phil receives it from the lender)
PMT: (to be solved)
FV: −262 332.33 (this is negative because Phil still owes the lender)
PpY: 26 (payments made fortnightly)
CpY: 26 (interest compounds fortnightly)
Step 2: Use the financial solver to solve for PMT.
PMT: −1704.0300…
The PMT is negative because Phil pays the lender.
Answer
$1704.03
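The financial solver result above can be cross-checked without a CAS calculator. The following is a minimal sketch, not the textbook's method: it assumes repayments are made at the end of each compounding period and rearranges the standard reducing balance formula, balance after n payments = R^n × V0 − d × (R^n − 1) / (R − 1), to solve for the repayment d. The function name `repayment` and its parameter names are hypothetical, chosen only for this illustration.

```python
def repayment(principal, annual_rate_pct, periods_per_year, n_payments, balance_after):
    """Return the regular repayment (as a positive amount) for a reducing balance loan.

    Assumes one repayment per compounding period, paid at the end of the period.
    """
    # Growth factor per compounding period: R = 1 + r / 100
    growth_factor = 1 + annual_rate_pct / (periods_per_year * 100)
    compounded = growth_factor ** n_payments
    # Rearranged from: balance_after = R^n * V0 - d * (R^n - 1) / (R - 1)
    return (principal * compounded - balance_after) * (growth_factor - 1) / (compounded - 1)


# Values taken from the exam question above.
pmt = repayment(350_000, 4.9, 26, 78, 262_332.33)
print(f"{pmt:.2f}")  # 1704.03
```

The function returns a positive amount, whereas the financial solver reports PMT as negative because of its cash-flow sign convention.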
A significant number of students incorrectly entered a positive FV value into the financial solver. The future value of a loan needs to be negative as it represents the money that is owed, or yet to be paid. A few students incorrectly rounded to $1704.05 or $1704.
Textbook questions
Joining it all together questions scaffold students to link multiple skills from the lesson together.
Exam practice questions provide students with past VCAA exam questions to get them ready for exams.
Key skills questions link to key skills in the theory and ask students to apply only one skill at a time.
Questions from multiple lessons provide ongoing revision from a range of topics.
Textbook answers
Online – Other resources
Question sets provide the ability to complete all questions online, with instant feedback on student responses.
Fully worked solutions are provided for all exam practice questions, complete with commentary on common misconceptions and errors made, based on VCAA statistics (where applicable).
Static solutions provide fully worked solutions for all questions.
Video solutions for every question provide extra guidance on how to answer questions, complete with guided calculator solutions for TI-Nspire and Casio ClassPad CAS calculators.
Chapter reviews provide visual theory summaries and application questions that scaffold students towards answering questions using multiple skills within the chapter. Area of study reviews provide teachers with a practice assessment that links concepts from an entire area of study.
AOS 1 Data analysis
CALCULATOR QUICK LOOK-UP GUIDE
Displaying data using histograms ........ 27
Calculating logarithmic values ........ 38
Displaying data using a logarithmic scale ........ 39
Calculating the five-number summary ........ 48
Displaying data using boxplots ........ 53
Calculating the sample mean and standard deviation ........ 81
Displaying data using scatterplots ........ 133
Calculating the Pearson correlation coefficient ........ 146
Calculating the least squares regression equation ........ 157
Constructing residual plots ........ 181
Applying a squared transformation ........ 193
Applying a log transformation ........ 196
Applying a reciprocal transformation ........ 198
Calculating the least squares regression equation for transformed data ........ 207
Displaying time series data using scatterplots ........ 223
Smoothing time series data over an odd number of points using moving means ........ 235
Smoothing time series data over an even number of points using moving means ........ 240
Calculating the least squares regression equation for time series data ........ 280
Calculating the least squares regression equation for seasonal data ........ 282
CHAPTER 1
Investigating data distributions
LESSONS
1A Types of data
1B Displaying and describing categorical data
1C Displaying numerical data
1D Log scales and graphs
1E The five-number summary and boxplots
1F Describing numerical data
1G Introduction to standard deviation
1H The normal distribution
1I z-scores
KEY KNOWLEDGE
• types of data
• representation, display and description of the distributions of categorical variables: data tables, two-way frequency tables and their associated segmented bar charts
• representation, display and description of the distributions of numerical variables: dot plots, stem plots, histograms; the use of a logarithmic (base 10) scale to display data ranging over several orders of magnitude and their interpretation in terms of powers of ten
• use of the distribution(s) of one or more categorical or numerical variables to answer statistical questions
• summary of the distributions of numerical variables; the five-number summary and boxplots (including the use of the lower fence (Q1 − 1.5 × IQR) and upper fence (Q3 + 1.5 × IQR) to identify and display possible outliers); the sample mean and standard deviation and their use in comparing data distributions in terms of centre and spread
• the normal model for bell-shaped distributions and the use of the 68–95–99.7% rule to estimate percentages and to give meaning to the standard deviation; standardised values (z-scores) and their use in comparing data values across distributions.
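The fence rule named in the key knowledge above, lower fence = Q1 − 1.5 × IQR and upper fence = Q3 + 1.5 × IQR, can be illustrated with a short sketch. The data set is hypothetical, and the quartile convention used here (Q1 and Q3 as the medians of the lower and upper halves, leaving out the middle value when the number of values is odd) is an assumption; other conventions give slightly different quartiles.

```python
from statistics import median


def fences(values):
    """Return (lower fence, upper fence) using the 1.5 x IQR rule.

    Assumes at least two values. Quartiles are taken as the medians of the
    lower and upper halves of the sorted data (an assumed convention).
    """
    data = sorted(values)
    half = len(data) // 2          # size of each half; middle value excluded if n is odd
    q1 = median(data[:half])       # median of the lower half
    q3 = median(data[-half:])      # median of the upper half
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


sample = [2, 4, 5, 6, 7, 8, 9, 11, 30]  # hypothetical data, not from the textbook
lower, upper = fences(sample)
outliers = [v for v in sample if v < lower or v > upper]
print(lower, upper, outliers)  # -3.75 18.25 [30]
```

Any value outside the two fences is flagged as a possible outlier, which is how such points are singled out on a boxplot.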
Image: ChristianChan/Shutterstock.com
1A Types of data
STUDY DESIGN DOT POINT
• types of data
KEY SKILLS
During this lesson, you will be:
• classifying data as categorical or numerical
• classifying categorical data as nominal or ordinal
• classifying numerical data as discrete or continuous.
KEY TERMS
• Data
• Categorical data
• Numerical data
• Nominal data
• Ordinal data
• Discrete data
• Continuous data
In the Information Age, data is becoming increasingly important to everyday life. Data must be classified by type before it can be analysed or the most appropriate data visualisations can be constructed.
Classifying data as categorical or numerical
Data is a set of values, words or responses that is collected and ordered by variables.
Data that can be organised into categories or groups is known as categorical data. It is also referred to as qualitative data, as it represents a quality or attribute.
Data that can be counted or measured is known as numerical data. It is also referred to as quantitative data, as it represents a quantity.
Worked example 1
Classify the following variables as either categorical or numerical.
a. type of pasta
Explanation
The variable type of pasta is categorised into different pasta types such as gnocchi, fettuccine, spaghetti or lasagne.
Answer
Categorical
b. number of candles
Explanation
The variable number of candles is counted.
Answer
Numerical
1A THEORY
Classifying categorical data as nominal or ordinal
Categorical data can be further classified as either nominal or ordinal.
Categorical data that cannot be sorted into a logical ordered list or hierarchy is called nominal data. For example, type of bread (white bread, multigrain, sourdough) has no inherent ranking system and is classified as nominal categorical data.
Categorical data that can be sorted into a logical ordered list or hierarchy is called ordinal data. For example, drink size (small, medium, large) can be ordered such that medium is greater than small, and large is greater than medium. This is an inherent ranking system, so it is classified as ordinal categorical data.
Worked example 2
Classify the following categorical variables as either nominal or ordinal.
a. type of shoe (runners, boots, sandals, slides)
Explanation
The categories within the variable type of shoe cannot be inherently ordered.
Answer
Nominal
b. shirt size (small, medium, large)
Explanation
The categories within the variable shirt size can be inherently ordered (small to medium to large).
Answer
Ordinal
Classifying numerical data as discrete or continuous
Numerical variables can be further classified as either discrete or continuous.
Numerical data that can only consist of a set of fixed values within a range is called discrete data. Discrete data usually consists of whole numbers and would typically be collected by counting. For example, the number of steps taken in a day can only be represented by whole numbers starting from zero, and is classified as discrete numerical data.
Numerical data that can consist of any value within a range is called continuous data. Continuous data usually consists of both whole numbers and decimals and would typically be collected by measuring. For example, the distance (km) walked in a day is classified as continuous numerical data as it is measured and can consist of any positive value, such as 5.1, 5.01 or even 5.001. Continuous data that has been rounded to the nearest whole number is still considered to be continuous.
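The classification rules in this lesson can be summarised as a short decision process. The following Python sketch is only an illustration: the function name `classify_variable` and its three yes/no parameters are hypothetical, and the judgements themselves (for example, whether categories have an inherent order) still have to be made by the person classifying the data.

```python
def classify_variable(is_numerical, has_inherent_order=False, is_counted=False):
    """Return the data type implied by three yes/no judgements (hypothetical helper)."""
    if not is_numerical:
        # Categorical data: ordinal if the categories can be ranked, otherwise nominal.
        return "ordinal" if has_inherent_order else "nominal"
    # Numerical data: discrete if counted (fixed values), continuous if measured.
    return "discrete" if is_counted else "continuous"


# Examples taken from the theory in this lesson.
examples = {
    "type of bread": classify_variable(False, has_inherent_order=False),
    "drink size": classify_variable(False, has_inherent_order=True),
    "number of steps": classify_variable(True, is_counted=True),
    "distance walked (km)": classify_variable(True, is_counted=False),
}

for name, data_type in examples.items():
    print(f"{name}: {data_type}")
```

Note that a variable recorded with number codes, such as 1 = easy, 2 = medium, 3 = hard, is still categorical, so `is_numerical` would be `False` for it.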
Worked example 3
Classify the following numerical variables as either discrete or continuous.
a. length (m)
Explanation
The variable length (m) can be expressed in decimals and can consist of any value measured on a continuous scale.
Answer
Continuous
b. number of tennis racquets
Explanation
The variable number of tennis racquets cannot be expressed in decimals and can only be counted.
Answer
Discrete
Exam question breakdown
VCAA 2016 Exam 1 Data analysis Q2
The variables blood pressure (low, normal, high) and age (under 50 years, 50 years or over) are
A. both nominal variables.
B. both ordinal variables.
C. a nominal variable and an ordinal variable respectively.
D. an ordinal variable and a nominal variable respectively.
E. a continuous variable and an ordinal variable respectively.
Explanation
Step 1: Classify the variable blood pressure (low, normal, high).
The variable blood pressure has three categories: low, normal and high. As such, this is a categorical variable.
These categories can be sorted into ascending or descending order. Therefore, blood pressure (low, normal, high) can be further classified as an ordinal variable.
Step 2: Classify the variable age (under 50 years, 50 years or over).
The variable age has two categories: ‘under 50 years’ and ‘50 years or over’. As such, this is a categorical variable.
These categories can also be sorted into ascending or descending order. Therefore, age (under 50 years, 50 years or over) can be further classified as an ordinal variable.
Answer
B
31% of students answered this question correctly.
45% of students incorrectly chose option D, as they identified the variable age (under 50 years, 50 years or over) as a nominal variable. The variable age is ordinal since one group of people can be classified as younger than the other group, creating an inherent order between the two categories.
1A Questions
Classifying data as categorical or numerical
1. Which of the following variables is categorical?
A. number of lamps
B. number of wardrobes
C. cost of a house
D. type of kitchen
2. Which of the following variables is numerical?
A. number of teachers
B. type of cake
C. type of painting
D. laptop brand (1 = Apple, 2 = ASUS, 3 = HP, 4 = other)
3. Classify the following variables as either categorical or numerical.
a. age
b. exam difficulty (1 = easy, 2 = medium, 3 = hard)
Classifying categorical data as nominal or ordinal
4. Which of the following categorical variables is nominal?
A. clay quality (low, medium, high)
B. class participation (low, moderate, high)
C. weather forecast (sunny, clear, cloudy, raining)
D. level of processing (shallow, moderate, deep)
5. Which of the following categorical variables is ordinal?
A. keyboard switch type (blue, red, brown)
B. difficulty ranking (1 = easy, 2 = moderate, 3 = hard)
C. personality type (INTP, ISTJ, ENTJ, etc.)
D. favourite ice cream flavour (black sesame, green tea, vanilla)
6. Classify the following categorical variables as either nominal or ordinal.
a. type of car (1 = sedan, 2 = sports, 3 = convertible, 4 = other)
b. assessment grade (A, B, C, D, E, F)
Classifying numerical data as discrete or continuous
7. Which of the following numerical variables is discrete?
A. time elapsed
B. height
C. number of keyboards
D. volume of CO2 output
8. Which of the following numerical variables is continuous?
A. student enrolments
B. tennis tournaments won
C. number of dogs
D. bone mass
9. Classify the following numerical variables as either discrete or continuous.
a. The number of parrots found in different rainforests.
b. The haemoglobin count of a group of people, in g/dL.
Joining it all together
10.
Fill in the gaps with the following terms: nominal data, discrete data, numerical data, and ordinal data. data
categorical data
continuous data
postcode
i.
number of users
h.
student number
user rating (1 = not satisfactory, 2 = neutral, 3 = satisfactory)
A tennis coach collected data on the number of tennis racquets used and the serve speed (km/h) for several tennis players for the upcoming Australian Open. a.
Which of the two variables is continuous?
b.
Which of the two variables is discrete?
A. Nominal
B.
C.
Ordinal
The program is easy to navigate strongly disagree
1
2
3
4
5
A software company wants to see if they need to upgrade their program. They conduct a survey where the participants are asked to comment on the statement 'The program is easy to navigate'. They collect the responses under the variable response (1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree).
What type of data are they collecting?
6
g.
exam grades (HD = high distinction, D = distinction, C = credit, P = pass, N = fail)
j.
perfume brand
e.
height of basketball players (cm)
f.
d.
weight of textbook (kg)
c.
13.
number of employees
b.
12.
car brand (1 = Toyota, 2 = Holden, 3 = Ford, 4 = other)
a.
Classify the following variables as either nominal, ordinal, discrete or continuous.
11.
Discrete
D. Continuous
strongly agree
Exam practice
14. The table shows the day number and the minimum temperature, in degrees Celsius, for 15 consecutive days in May 2017.
day number | minimum temperature (°C)
1 | 12.7
2 | 11.8
3 | 10.7
4 | 9.0
5 | 6.0
6 | 7.0
7 | 4.1
8 | 4.8
9 | 9.2
10 | 6.7
11 | 7.5
12 | 8.0
13 | 8.6
14 | 9.8
15 | 7.7
Which of the two variables in this data set is an ordinal variable? (1 MARK)
VCAA 2019 Exam 2 Data analysis Q1a
81% of students answered this question correctly.
15. Data relating to the following five variables was collected from insects that were caught overnight in a trap:
• colour
• name of species
• number of wings
• body length (in millimetres)
• body weight (in milligrams)
The number of these variables that are discrete variables is
A. 1
B. 2
C. 3
D. 4
E. 5
VCAA 2020 Exam 1 Data analysis Q7
69% of students answered this question correctly.
16. In the sport of heptathlon, athletes compete in seven events.
These events are the 100 m hurdles, high jump, shot-put, javelin, 200 m run, 800 m run and long jump. Fifteen female athletes competed to qualify for the heptathlon at the Olympic Games.
Their results for three of the heptathlon events – high jump, shot-put and javelin – are shown in the table.
athlete number | high jump (metres) | shot-put (metres) | javelin (metres)
1 | 1.76 | 15.34 | 41.22
2 | 1.79 | 16.96 | 42.41
3 | 1.83 | 13.87 | 46.53
4 | 1.82 | 14.23 | 40.53
5 | 1.87 | 13.78 | 40.62
6 | 1.73 | 14.50 | 45.62
7 | 1.68 | 15.08 | 42.33
8 | 1.82 | 13.13 | 40.88
9 | 1.83 | 14.22 | 39.22
10 | 1.87 | 13.62 | 42.51
11 | 1.87 | 12.01 | 42.75
12 | 1.80 | 12.88 | 38.12
13 | 1.83 | 12.68 | 42.65
14 | 1.87 | 12.45 | 41.32
15 | 1.78 | 11.31 | 42.88
Write down the number of numerical variables in the table. (1 MARK)
VCAA 2021 Exam 2 Data analysis Q1a
52% of students answered this question correctly.
17. The variables number of moths (less than 250, 250–500, more than 500) and trap type (sugar, scent, light) are
A. both nominal variables.
B. both ordinal variables.
C. a numerical variable and a categorical variable respectively.
D. a nominal variable and an ordinal variable respectively.
E. an ordinal variable and a nominal variable respectively.
VCAA 2017 Exam 1 Data analysis Q7
46% of students answered this question correctly.
Questions from multiple lessons
Data analysis Year 11 content
18. Ashleigh and Savannah are training to run a marathon by running as far as they can inside 3 hours and 30 minutes. The dot plot displays the difference in distance run by Ashleigh in relation to Savannah (i.e. 0.5 means Ashleigh ran 500 m more than Savannah, while −0.5 means Ashleigh ran 500 m less than Savannah). They ran together 28 times.
[Dot plot: difference in running distance (km), horizontal axis from −4 to 4, n = 28]
The percentage of days on which Ashleigh ran one kilometre less than Savannah is
A. 7.1%
B. 10.7%
C. 14.3%
D. 25.0%
E. 28.0%
Adapted from VCAA 2018 Exam 1 Data analysis Q1
Recursion and financial modelling Year 11 content
19. Arthur gets $1000 for his birthday and wants to save his money. He opens a savings account and deposits his $1000. The account earns interest at a rate of 3% per annum, compounding annually.
Let $V_n$ be the value of Arthur’s account $n$ years after he initially deposits his money.
The expected growth of Arthur’s savings account can be modelled by
A. $V_0 = 1000,\ V_{n+1} = V_n + 30$
B. $V_1 = 1030,\ V_{n+1} = V_n + 30$
C. $V_0 = 1000,\ V_{n+1} = 1.03\,V_n$
D. $V_0 = 1030,\ V_{n+1} = 1.03\,V_n$
E. $V_1 = 1000,\ V_{n+1} = 1.3\,V_n$
Adapted from VCAA 2015 Exam 1 Number patterns Q3
Data analysis Year 11 content
20. The number of cars that park in a particular car park on each day of one week are counted and recorded in the following table.
day | Mon | Tue | Wed | Thu | Fri | Sat | Sun
number of cars | 103 | 84 | 92 | 79 | 93 | 64 | 48
From the information given, determine
a. the range. (1 MARK)
b. the percentage of days that had fewer than 90 cars parked, correct to one decimal place. (1 MARK)
Adapted from VCAA 2017 Exam 2 Data analysis Q1
1B
Displaying and describing categorical data
STUDY DESIGN DOT POINTS
• representation, display and description of the distributions of categorical variables: data tables, two-way frequency tables and their associated segmented bar charts
• use of the distribution(s) of one or more categorical or numerical variables to answer statistical questions
KEY SKILLS
During this lesson, you will be:
• constructing frequency tables
• constructing bar charts
• constructing segmented bar charts
• describing the distribution of categorical data.
KEY TERMS
• Frequency table
• Percentage frequency
• Bar chart
• Segmented bar chart
• Percentage segmented bar chart
• Mode
Lists of categorical information can be converted into tables, graphs and charts so that they can be easily read and interpreted. These displays can be used to identify the number, or percentage, of data for each category, as well as the most frequently occurring category.
Constructing frequency tables
A frequency table is a table that tallies how often each value in a data set occurs. This is the first step in making a set of data easier to summarise and analyse.
Data can be recorded within a frequency table as either frequency or percentage frequency. The percentage frequency is the proportion of times each value or category occurs in relation to the entire data set, represented as a percentage.
$\text{percentage frequency} = \dfrac{\text{frequency}}{\text{total frequency}} \times 100$
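As an illustration of the percentage frequency calculation, the following Python sketch tallies a small list of responses and converts each tally to a percentage. The survey responses and the function name `frequency_table` are hypothetical and chosen only for this example; percentages are rounded to one decimal place.

```python
from collections import Counter


def frequency_table(responses, decimals=1):
    """Map each category to (frequency, percentage frequency)."""
    counts = Counter(responses)          # tally how often each category occurs
    total = sum(counts.values())         # total frequency
    return {
        category: (number, round(number / total * 100, decimals))
        for category, number in counts.items()
    }


# Hypothetical survey responses, for illustration only.
responses = ["yes", "no", "yes", "yes", "unsure", "no", "yes", "yes"]
table = frequency_table(responses)

for category, (number, percent) in table.items():
    print(f"{category:<8}{number:>4}{percent:>8.1f}")
```

When percentages are rounded, the percentage column may not add up to exactly 100, so it is worth checking the total separately.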
Worked example 1
The students in a prep class were asked the question, ‘Would you describe your teacher’s height as short, average or tall?’. Their responses were as follows:
average, short, tall, tall, short, short
average, tall, tall, average, average, tall
average, average, average, average, tall, average
short, tall, average, tall, short, short
Use this data to create a frequency table displaying both frequency and percentage frequency, correct to one decimal place.
Explanation
Step 1: Set up a frequency table.
The table should have 3 columns for the variable collected, and the frequency as a number and percentage. There should be an appropriate number of rows to include all the categories. Finally, a row should be included for the total.
teacher’s height | frequency (number) | frequency (%)
short | |
average | |
tall | |
total | |
Step 2: Fill in the frequency number column by counting from the data set, including the total.
teacher’s height | frequency (number) | frequency (%)
short | 6 |
average | 10 |
tall | 8 |
total | 24 |
Step 3: Calculate the frequency as a percentage for each category, making sure the percentages add up to 100.
$\text{percentage frequency} = \dfrac{\text{frequency}}{\text{total frequency}} \times 100$
Remember that the question asks for percentages given to one decimal place.
teacher’s height | frequency (number) | frequency (%)
short | 6 | $\frac{6}{24} \times 100 = 25.0$
average | 10 | $\frac{10}{24} \times 100 \approx 41.7$
tall | 8 | $\frac{8}{24} \times 100 \approx 33.3$
total | 24 | 100.0
Note: When percentages have been rounded, they may not add up to exactly 100. In these situations this is okay, as long as the rounding has been done accurately.
Answer
teacher’s height | frequency (number) | frequency (%)
short | 6 | 25.0
average | 10 | 41.7
tall | 8 | 33.3
total | 24 | 100.0
Constructing bar charts
A bar chart is a graphical display that is commonly used to display categorical data. The frequency or percentage frequency of each category is represented by columns of varied height. Spaces are included between columns to indicate that the categories are separate.
Worked example 2
24 students in a prep class were asked the question, ‘Would you describe your teacher’s height as short, average or tall?’. Their responses are recorded in the frequency table shown.
teacher’s height | frequency (number) | frequency (%)
short | 6 | 25.0
average | 10 | 41.7
tall | 8 | 33.3
total | 24 | 100.0
Use the frequency table to construct a frequency bar chart.
Explanation
Step 1: Construct axes with ‘frequency’ on the vertical axis and ‘teacher’s height’ on the horizontal axis.
The vertical axis should at least extend to the maximum value.
The horizontal axis should include labels for each of the categories.
Step 2: Draw vertical columns for each category according to their value in the frequency table.
Remember that each column should be separated by a gap.
Answer
[Bar chart: frequency (0 to 10) against teacher’s height, with columns short = 6, average = 10, tall = 8]
Constructing segmented bar charts
A segmented bar chart is a variation of a bar chart with each category stacked into one column. They are particularly useful for comparing the distribution of categories across different sets of data. This will be explored further later.
Each category within a segmented bar chart has its own segment, with no gaps between segments. The height of each segment indicates the frequency of each category. A legend indicates which segments of the bar relate to which categories. Segmented bar charts can also be constructed for the percentage frequency of a data set. This is called a percentage segmented bar chart.
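Because the segments are stacked, the top of each segment sits at a running total of the frequencies. The following Python sketch shows one way to calculate these positions, using the teacher’s height frequencies from the earlier worked examples. The variable names are illustrative only, and the percentage positions are rounded to one decimal place.

```python
from itertools import accumulate

# Frequencies in the order the segments are stacked (bottom to top).
frequencies = {"short": 6, "average": 10, "tall": 8}
total = sum(frequencies.values())

# Top of each segment on a frequency segmented bar chart (running total).
segment_tops = list(accumulate(frequencies.values()))

# Top of each segment on a percentage segmented bar chart.
percentage_tops = [round(top / total * 100, 1) for top in segment_tops]

for category, top, percent_top in zip(frequencies, segment_tops, percentage_tops):
    print(f"{category}: ends at {top} ({percent_top}%)")
```

The last segment should always end at the total frequency, or at 100 on a percentage segmented bar chart, which gives a quick check of the arithmetic.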
Worked example 3
24 students in a prep class were asked the question, ‘Would you describe your teacher’s height as short, average or tall?’. Their responses are recorded in the frequency table shown.
teacher’s height | frequency (number) | frequency (%)
short | 6 | 25.0
average | 10 | 41.7
tall | 8 | 33.3
total | 24 | 100.0
a. Use the frequency table to construct a segmented bar chart.
Explanation
Step 1: Construct axes with ‘frequency’ on the vertical axis and ‘teacher’s height’ on the horizontal axis.
The vertical axis should at least extend to the total frequency.
Step 2: Construct the column by adding the value of each segment.
For this segmented bar chart, we will go from short, to average, to tall.
The ‘short’ segment should end at 6.
The ‘average’ segment should end at 6 + 10 = 16.
The ‘tall’ segment should end at 16 + 8 = 24.
Ensure each segment is clearly defined.
Step 3: Add a legend so the graph can be interpreted correctly.
Answer
[Segmented bar chart: frequency (0 to 24) for teacher’s height, with segments short (0 to 6), average (6 to 16) and tall (16 to 24); legend: short, average, tall]
b. Use the frequency table to construct a percentage segmented bar chart.
Explanation
Step 1: Construct axes with the frequency as a percentage on the vertical axis and ‘teacher’s height’ on the horizontal axis.
The vertical axis should extend to 100%.
Step 2: Construct the column by adding the percentage of each segment.
For this percentage segmented bar chart, we will go from short, to average, to tall.
The ‘short’ segment should end at 25.
The ‘average’ segment should end at 25 + 41.7 = 66.7.
The ‘tall’ segment should end at 66.7 + 33.3 = 100.
Ensure each segment is clearly defined.
Step 3: Add a legend so the graph can be interpreted correctly.
Answer
[Percentage segmented bar chart: frequency (%) (0 to 100) for teacher’s height, with segments short (0 to 25), average (25 to 66.7) and tall (66.7 to 100); legend: short, average, tall]
Describing the distribution of categorical data
When describing data, the mean, median and mode are often mentioned as measures of centre. A measure of centre describes the middle, or ‘average’, value of a distribution. The mode is the only available measure of centre for categorical data, as the mean and median only apply to numerical data. The mode is the most frequently occurring value in the data set. It can be identified from a bar chart or segmented bar chart by looking at the column or segment with the greatest vertical height.
An interpretation of frequency tables, bar charts and segmented bar charts often involves writing a report which can:
• summarise the data type and the number of values represented in the data set
• identify the modal category (if it is obvious)
• compare the percentage frequencies of different categories.
In larger data sets, not all categories need to be mentioned. It might be easier to draw attention to the largest and smallest columns.
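The modal category and the percentages needed for a report can also be read directly from a frequency table. The following Python sketch does this for the teacher’s height frequencies used in this lesson. It is a minimal illustration, with variable names chosen for this example; it returns a list of modal categories in case two categories tie for the highest frequency.

```python
# Frequency table from the teacher's height survey in this lesson.
frequencies = {"short": 6, "average": 10, "tall": 8}
total = sum(frequencies.values())

# The mode is the category (or categories) with the highest frequency.
highest = max(frequencies.values())
modal_categories = [c for c, n in frequencies.items() if n == highest]

# Percentage frequencies, rounded to one decimal place, for the report.
percentages = {c: round(n / total * 100, 1) for c, n in frequencies.items()}

print(f"{total} responses were recorded.")
print(f"Modal category: {', '.join(modal_categories)}")
for category, percent in percentages.items():
    print(f"{category}: {percent}%")
```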
Worked example 4
24 students in a prep class were asked the question, ‘Would you describe your teacher’s height as short, average or tall?’. Their responses are shown in the given percentage segmented bar chart.
[Percentage segmented bar chart: frequency (%) (0 to 100) for teacher’s height; legend: short, average, tall]
a. Find the modal category of the data set.
Explanation
Identify the segment with the greatest vertical height.
Answer
Average
b. Describe the distribution of the data set.
Explanation
Consider the elements to be included in the report describing the distribution.
• Number of people surveyed
• Modal category
• Other significant percentages
Answer
24 prep students were surveyed on how tall they thought their teacher was. The most common response was average, accounting for 41.7% of responses, while 25% said their teacher was short, and 33.3% said their teacher was tall.
1B Questions
Constructing frequency tables
1. A group of people were asked whether they preferred coffee, tea, or neither in the morning. Their results are displayed in the following frequency table, with percentages rounded to the nearest whole number.
drink preference | frequency (number) | frequency (%)
coffee | 19 | 59
neither | 8 | 25
tea | 5 | 16
total | 32 | 100
Which of the following statements is true?
A. 19 people were surveyed, and 59% preferred coffee.
B. 32 people were surveyed, and 8% preferred neither.
C. 32 people were surveyed, and 59% preferred coffee.
D. 32 people were surveyed, and 16 of them preferred tea.
2. 20 members of the Italian Club were asked what their favourite type of pasta is. Their results were as follows:
penne, fettuccine, spaghetti, spaghetti, penne
macaroni, fettuccine, fettuccine, spaghetti, spaghetti
penne, spaghetti, fettuccine, penne, macaroni
penne, penne, spaghetti, penne, fettuccine
Use these results to construct a frequency table including frequencies and percentages.
Constructing bar charts
3. A class of 312 Year 12 boys were asked their shoe size. Their results are recorded in the given bar chart.
[Bar chart: frequency (0 to 200) against shoe size (7, 8, 9, 10, 11)]
The number of Year 12 boys with a size 9 shoe is closest to
A. 10
B. 30
C. 45
D. 55
4. The shirt sizes (extra small, small, medium, large, extra large) of 49 people are displayed in the given frequency table. Percentages are rounded to one decimal place.
shirt size | frequency (number) | frequency (%)
extra small | 3 | 6.1
small | 14 | 28.6
medium | 17 | 34.7
large | 9 | 18.4
extra large | 6 | 12.2
total | 49 | 100.0
a. Use the frequency table to construct a frequency bar chart.
b. Use the frequency table to construct a percentage frequency bar chart.
Constructing segmented bar charts
5. A group of people were asked what their preferred streaming service was. Their responses are shown in the frequency segmented bar chart shown.
[Segmented bar chart: frequency (0 to 150) for streaming service; legend: Netflix, Stan, Disney+, Amazon Prime, Other]
Which of the following statements is false?
A. 20 people prefer Stan.
B. More people prefer Disney+ than Netflix and Stan combined.
C. 140 people were surveyed.
D. 15 people prefer Amazon Prime.
6. 139 people were asked what their favourite animal was. The results are shown in the frequency table shown. Percentages have been rounded to the nearest whole number.
favourite animal | frequency (number) | frequency (%)
dog | 47 | 34
guinea pig | 22 | 16
cat | 52 | 37
horse | 14 | 10
snake | 4 | 3
total | 139 | 100
a. Use the data from the frequency table to construct a frequency segmented bar chart.
b. Use the data from the frequency table to construct a percentage segmented bar chart.
Describing the distribution of categorical data
7. 88 people were asked who their favourite Avenger was. The results are shown in the bar chart provided.
[Bar chart: frequency (0 to 40) against favourite Avenger (Iron Man, Captain America, Hulk, Black Widow, Hawkeye)]
The modal superhero is
A. Hawkeye
B. Iron Man
C. Captain America
D. Hulk
E. Black Widow
8. The brands of 50 cars entering a car park were recorded, with the results shown in the percentage segmented bar chart provided.
[Percentage segmented bar chart: frequency (%) (0 to 100) for brand of car; legend: Toyota, Holden, Ford]
Use this data to fill out the following report template.
The brands of _ cars were recorded as they entered a car park. All the cars were either ‘Holden’, ‘Ford’, or ‘Toyota’. The most commonly occurring brand of car was _, accounting for _% of all cars. The next most commonly occurring brand was _, representing _%. Finally, the last _% of the cars were _.
Joining it all together
9. A barista collected information on the type of milk that customers ordered with their coffee. The results are shown in the given percentage segmented bar chart.
[Percentage segmented bar chart: frequency (%) (0 to 100) for milk choice; legend: almond, oat, soy, regular, skim]
The barista remembered that 36 people ordered almond milk. The number of people that ordered regular milk is closest to
A. 8
B. 10
C. 15
D. 25
10. A group of musicians were asked who their favourite jazz drummer was. Their responses were as follows:
Tony Williams, Tony Williams, Elvin Jones, Buddy Rich, Brian Blade
Elvin Jones, Brian Blade, Elvin Jones, Art Blakey, Elvin Jones
Elvin Jones, Buddy Rich, Tony Williams, Elvin Jones, Buddy Rich
Brian Blade, Art Blakey, Buddy Rich, Buddy Rich, Tony Williams
a. Use these results to construct a frequency table.
b. Using the frequency table from part a, construct a percentage bar chart to show the results.
c. Using the frequency table from part a, construct a frequency segmented bar chart to show the results.
d. Use the data to write a paragraph on the distribution of favourite jazz drummers amongst the musicians.
11. A group of toddlers were asked about their least favourite vegetable. The results are represented in the given bar chart.
[Bar chart: frequency (0 to 14) against least favourite vegetable (kale, brussels sprouts, beetroot, onion)]
Draw a percentage segmented bar chart to represent this data, correct to the nearest percentage.
12. A group of office workers were asked what their favourite TV show was. The results are displayed in the bar chart shown.
[Bar chart: frequency (%) (0 to 60) against favourite TV show (The Bachelor, Squid Game, Peaky Blinders)]
6 people said Peaky Blinders was their favourite show. Use this information and the bar chart to construct a frequency table that represents this information.
Exam practice
13. A study was conducted that investigated the number of moths caught in a sugar moth trap (less than 250, 250–500, more than 500). The results are summarised in the percentage segmented bar chart shown.
[Percentage segmented bar chart: frequency (%) (0 to 100) for number of moths; legend: more than 500, 250–500, less than 250]
There were 300 sugar traps.
The number of sugar traps that caught less than 250 moths is closest to
A. 30
B. 90
C. 250
D. 300
E. 500
Adapted from VCAA 2017 Exam 1 Data analysis Q5
76% of students answered this type of question correctly.
14. The given bar chart shows the distribution of wind directions recorded at a weather station at 9:00 am on each of 214 days in 2011.
[Bar chart: frequency (0 to 45) against wind direction (north, north-east, east, south-east, south, south-west, west, north-west)]
According to the bar chart, the percentage of the 214 days on which the wind direction was observed to be east or south-east is closest to
A. 10%
B. 16%
C. 25%
D. 33%
E. 35%
VCAA 2012 Exam 1 Data analysis Q2
68% of students answered this question correctly.
Questions from multiple lessons
Data analysis Year 11 content
15. The clothing size (small, medium, large), and age (under 10 years, 10 years or over) of students at a primary school were collected. In this context, the variables clothing size and age are
A. ordinal and nominal respectively
B. nominal and ordinal respectively
C. ordinal and continuous respectively
D. both ordinal
E. both nominal
Adapted from VCAA 2016 Exam 1 Data analysis Q2
Recursion and financial modelling Year 11 content
16. As part of her new year resolutions, Sarah decides to read every month from January to December for one year. Each month she counts the number of pages that she has read. In January, she reads 12 pages of a book. In February, she reads 18 pages. In March, she reads 24 pages. In April, she reads 30 pages. The number of pages she reads each month continues to increase according to this pattern. The number of pages she reads in September is
A. 48
B. 54
C. 60
D. 66
E. 72
Adapted from VCAA 2014 Exam 1 Number patterns Q1
Recursion and financial modelling Year 11 content
17. Alex invested $1000 in a savings account, with interest compounding annually.
$M_n$ is the amount of money in the account after $n$ years.
The following calculations show the amount of money in Alex’s account initially, and after one and two years.
$M_0 = 1000$
$M_1 = 1.04 \times 1000 = 1040$
$M_2 = 1.04 \times 1040 = 1081.60$
a. Find a recurrence relation in terms of $M_0$, $M_{n+1}$ and $M_n$ that models the amount of money in Alex’s savings account after $n$ years. (1 MARK)
b. Alex wants to buy a new laptop for $1250. What is the minimum interest rate per annum that would have been required for Alex to afford this laptop after two years? Give your answer correct to two decimal places. (1 MARK)
Adapted from VCAA 2018NH Exam 2 Recursion and financial modelling Q7c,d
1C
Displaying numerical data
STUDY DESIGN DOT POINT
• representation, display and description of the distributions of numerical variables: dot plots, stem plots, histograms; the use of a logarithmic (base 10) scale to display data ranging over several orders of magnitude and their interpretation in terms of powers of ten
KEY SKILLS
During this lesson, you will be:
• displaying data using dot plots
• displaying data using stem plots
• constructing grouped frequency tables
• displaying data using histograms.
KEY TERMS
• Dot plot
• Stem plot
• Grouped frequency table
• Histogram
Dot plots, stem plots and histograms are displays that help us visualise the distribution of numerical data. These displays can then be used to identify the number, or percentage, of data within certain ranges of values, as well as the most frequently occurring values.
Displaying data using dot plots
A dot plot is a simple way to display discrete numerical data, where each data point is represented by a dot above a single axis.
The number of dots above a value on the axis represents the frequency of the value. The mode of the data set (also known as the modal value) is the value with the greatest number of dots.
Dot plots are ideal for displaying small/medium-sized data sets with a small range of values.
[Dot plot: board games owned, horizontal axis from 2 to 7]
Worked example 1
Sophie surveyed 12 of the families living on her street.
She asked for the number of pets each of them owned and the results were recorded. 3 2 0 1 1 3 0 5 2 1 1 2
a. Construct a dot plot to display this data.
Explanation
0 0 1 1 1 1 2 2 2 3 3 5
The highest value is five.
The lowest value is zero.
Step 2: Construct a number line with an appropriate scale.
Step 1: Rearrange the data set into ascending order and determine the lowest and highest value.
The scale should cover all values between zero and five.
0
1
2 3 number of pets
4
5
ti
i
s r bu ons t
t
d
ti
ti
Chap er 1: Inves ga ng a a t
20
di
Continues →
Back to contents
Mark a dot above the number on the number line each time a value appears in the data set.
Spacing between each of the vertical dots should be consistent to allow for comparison of frequency across different values.
1C THEORY
Step 3: Represent each value with a dot.
If the same data value appears multiple times, illustrate this by placing the corresponding number of dots in a vertical line.
Answer
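[Figure: dot plot of number of pets on an axis from 0 to 5, with 2 dots above 0, 4 dots above 1, 3 dots above 2, 2 dots above 3, no dots above 4 and 1 dot above 5]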
b. What was the modal number of pets owned by families that Sophie surveyed?
Explanation
Find the value with the most dots on the dot plot.
[Figure: dot plot of number of pets]
Answer 1 pet
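The steps in worked example 1 can also be sketched in Python. This is a minimal illustration only, not part of the method required in this lesson. The function names `dot_plot_rows` and `modes` are assumptions introduced here, and the plot is printed sideways (one row per value) using `*` for each dot.

```python
from collections import Counter

# Number of pets owned by each family, from worked example 1.
pets = [3, 2, 0, 1, 1, 3, 0, 5, 2, 1, 1, 2]


def dot_plot_rows(data):
    """Return one text row per axis value, from the lowest to the highest value."""
    counts = Counter(data)
    rows = []
    for value in range(min(data), max(data) + 1):
        # One '*' per occurrence; values that never occur get an empty row.
        rows.append(f"{value} | {'*' * counts[value]}".rstrip())
    return rows


def modes(data):
    """Return the value(s) with the greatest frequency."""
    counts = Counter(data)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


for row in dot_plot_rows(pets):
    print(row)
print("Modal value(s):", modes(pets))
```

The row for 1 has the most dots, which agrees with the answer to part b.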
Displaying data using stem plots
A stem plot is a way to display numerical data, where data points are grouped by their leftmost digit(s). Each leaf represents the last digit of an individual data value, and each stem represents the leftmost digit(s) of a group of leaves.
See worked example 2
Stems are shown vertically, to the left of a vertical line, ordered from smallest to largest.
Leaves are positioned to the right of the vertical line, in line with their corresponding stem. Within each stem, leaves should be ordered from smallest to largest.
[Figure: example stem plot. Key: 4 | 3 = 43]
When constructing a stem plot, always remember to include a key. The key demonstrates the scale of the data. The key allows for data of many forms to be shown in a stem plot. This includes decimals and three (or more) digit numbers.
[Figure: example stem plot for decimal data. Key: 1 | 2 = 1.2]
[Figure: example stem plot for three-digit data. Key: 20 | 0 = 200]
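The way a key sets the scale can be sketched in Python. This is a minimal illustration under stated assumptions: the function name `split_value` and the parameter `leaf_unit` (the place value that one leaf digit represents) are introduced here and are not part of the lesson.

```python
def split_value(value, leaf_unit):
    """Split a data value into (stem, leaf), where each leaf digit is worth leaf_unit."""
    # Rescale so the leaf becomes the ones digit, rounding away floating-point error.
    scaled = round(value / leaf_unit)
    stem, leaf = divmod(scaled, 10)
    return stem, leaf


# The three keys shown in this section.
print(split_value(43, 1))     # Key: 4 | 3 = 43
print(split_value(1.2, 0.1))  # Key: 1 | 2 = 1.2
print(split_value(200, 1))    # Key: 20 | 0 = 200
```

In each case the leaf is a single digit and the stem holds all remaining leftmost digits, so the same plot layout works for decimals and for three-digit numbers.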
The frequency of a single data value can be found by finding the corresponding stem and counting the number of corresponding leaves within it. The modal value is the value with the greatest number of identical leaves within a single stem.
In some cases it can be difficult to see the underlying distribution due to having a lot of data within a small range. This problem is solved by ‘splitting’ the stems. Usually each stem is split into either two or five stems, depending on how close together the data is.
[Figure: example split stem plot. Key: 1 | 2 = 12]
See worked example 3
Stem plots are ideal for displaying small/medium-sized data sets with a large range of values.
Worked example 2
Ms Smyth’s maths class of 25 students sat their end-of-year exam. Their results (%) were recorded.
55 68 76 90 83 89 75 66 59 84 48 62 58 95 80 77 61 92 99 63 84 65 70 81 96
a. Construct a stem plot to display this data.
Explanation
Step 1: Consider the most appropriate scale.
The data values are two-digit numbers.
The stems will refer to ‘tens’.
The leaves will refer to ‘ones’.
Step 2: Fill in the appropriate stems.
The data values range from 48 to 99.
All values which fall in the 40s, 50s, 60s, 70s, 80s and 90s need to be covered.
The appropriate stems are 4, 5, 6, 7, 8 and 9.
Note: Each stem within the range of the data needs to be included, even if there are no data values within it.
Step 3: Fill in the leaves for each stem.
Start with the smallest stem and fill the corresponding leaves in ascending order.
Repeat this for each stem.
4 | 8
5 | 5 8 9
6 | 1 2 3 5 6 8
7 | 0 5 6 7
8 | 0 1 3 4 4 9
9 | 0 2 5 6 9
Step 4: Construct a key.
A key shows the scale in which the data is represented.
As decided in step 1, the stems refer to ‘tens’ and the leaves refer to ‘ones’. Demonstrate this scale with an example.
Answer
Key: 4 | 8 = 48%
4 | 8
5 | 5 8 9
6 | 1 2 3 5 6 8
7 | 0 5 6 7
8 | 0 1 3 4 4 9
9 | 0 2 5 6 9
b. How many students scored above 70% on the exam?
Explanation
Count the number of leaves that represent a value greater than 70.
This will include any leaves on the ‘7’ stem that are greater than 0 and all leaves on stems greater than 7.
Key: 4 | 8 = 48%
4 | 8
5 | 5 8 9
6 | 1 2 3 5 6 8
7 | 0 5 6 7
8 | 0 1 3 4 4 9
9 | 0 2 5 6 9
Answer
14 students
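The construction in worked example 2 can be sketched in Python as a check. This is a minimal illustration that assumes two-digit whole-number data, with stems as ‘tens’ and leaves as ‘ones’. The function name `stem_plot` is an assumption introduced here.

```python
from collections import defaultdict

# Exam results (%) for Ms Smyth's class, from worked example 2.
results = [55, 68, 76, 90, 83, 89, 75, 66, 59, 84, 48, 62, 58,
           95, 80, 77, 61, 92, 99, 63, 84, 65, 70, 81, 96]


def stem_plot(data):
    """Map each stem (tens) to its leaves (ones) in ascending order."""
    leaves = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)
    # Include every stem within the range of the data, even if it has no leaves.
    stems = range(min(data) // 10, max(data) // 10 + 1)
    return {stem: leaves[stem] for stem in stems}


plot = stem_plot(results)
print("Key: 4 | 8 = 48%")
for stem, stem_leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in stem_leaves)}")

# Part b: count the results strictly greater than 70.
above_70 = sum(1 for value in results if value > 70)
print(above_70, "students scored above 70%")
```

The count uses a strict inequality, so the result of exactly 70% is not included, matching the explanation in part b.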
Worked example 3
Ms Goyle’s maths class of 25 students sat their end-of-year exam. Their results (%) were recorded.
75 68 76 80 83 69 65 66 79 84 78 62 88 75 80 77 61 62 69 73 84 75 60 81 66
Construct a split stem plot to display this data, with stem intervals of 5%.
Explanation
Step 1: Consider the most appropriate scale.
The data values are two-digit numbers.
The leaves will refer to ‘ones’.
The stems will refer to ‘tens’.
Step 2: Fill in the appropriate stems.
The data values range from 60 to 88.
The question specifies stem intervals of 5%.
All values which fall in the 60s, 70s, and 80s need to be covered. The appropriate stems are 6, 6, 7, 7, 8 and 8.
The top stem for each stem value will include leaves from 0–4 and the bottom stem for each value will include leaves from 5–9.
Step 3: Fill in the leaves for each stem.
Start with the smallest stem and fill the corresponding leaves in ascending order.
Repeat this for each stem.
6 | 0 1 2 2
6 | 5 6 6 8 9 9
7 | 3
7 | 5 5 5 6 7 8 9
8 | 0 0 1 3 4 4
8 | 8
Step 4: Construct a key.
A key shows the scale in which the data is represented.
As decided in step 1, the stems refer to ‘tens’ and the leaves refer to ‘ones’. Demonstrate this scale with an example.
Answer
Key: 6 | 0 = 60%
6 | 0 1 2 2
6 | 5 6 6 8 9 9
7 | 3
7 | 5 5 5 6 7 8 9
8 | 0 0 1 3 4 4
8 | 8
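The split stem plot in worked example 3 can be sketched in Python as a check. This is a minimal illustration that assumes two-digit whole-number data and stem intervals of 5, so each stem appears twice: once for leaves 0–4 and once for leaves 5–9. The function name `split_stem_plot` is an assumption introduced here.

```python
# Exam results (%) for Ms Goyle's class, from worked example 3.
results = [75, 68, 76, 80, 83, 69, 65, 66, 79, 84, 78, 62, 88,
           75, 80, 77, 61, 62, 69, 73, 84, 75, 60, 81, 66]


def split_stem_plot(data):
    """Return (stem, leaves) rows, one row for each interval of width 5."""
    first = min(data) // 5  # index of the interval containing the lowest value
    last = max(data) // 5   # index of the interval containing the highest value
    rows = []
    for interval in range(first, last + 1):
        low = interval * 5
        stem = low // 10
        # Leaves are the ones digits of the values in this interval, in ascending order.
        leaves = sorted(value % 10 for value in data if low <= value < low + 5)
        rows.append((stem, leaves))
    return rows


rows = split_stem_plot(results)
print("Key: 6 | 0 = 60%")
for stem, leaves in rows:
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each stem value produces two rows, so the stems print as 6, 6, 7, 7, 8 and 8, as in step 2.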
Constructing grouped frequency tables
test mark
number
%
50–