# Analyses Report

General instructions
• You need to answer Questions 1 to 4 and then write a report that summarises and explains your
results from Questions 1 to 4.
• In each question, you will need to decide on the most appropriate hypothesis test to perform or a
statistical technique to apply.
• You will first need to download the data file SleepStudy2020.xlsx from the Data tab in learnonline
and decide which variables are categorical and which are numerical. Variable descriptions are given
in Appendix A, Table A.1. The decision tree from Week 7 called ‘Which statistical test?’ will also help.
Question 1 (20 marks)
Are you a Lark or an Owl? Studies indicate that about 10% of us are morning people (Larks) while 20%
are evening people (Owls) and the rest are neither. Studies also indicate that this circadian (owl/lark)
preference may not be settled until the age of 22 or later. In this question you are going to analyse
circadian preferences among university students.
(a) (10 marks) Is there evidence that the circadian (owl/lark) preferences for university students differ
from the claimed proportions? Formulate and perform an appropriate hypothesis test at a 5%
significance level using the summarised data shown in Table 1. Use the STATE-FORMULATE-SOLVE-CONCLUDE
procedure and perform follow-up analysis if appropriate. For full marks, include
appropriate Minitab output.
| Type | Count | Claimed proportion |
|---|---|---|
| Lark | 41 | 0.1 |
| Neither | 163 | 0.7 |
| Owl | 49 | 0.2 |
Table 1. Circadian preference: Summary of survey responses and claimed proportions
Additional Minitab instructions: In order to complete this question, enter the data from Table 1
above into Minitab and perform a ‘Chi-Square Goodness-of-Fit (One Variable)’ test using the option
to ‘Test specific proportions’.
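As a cross-check on the Minitab output, the goodness-of-fit statistic can be reproduced by hand. The following is a minimal Python sketch, not a substitute for the required Minitab output. It uses the counts and claimed proportions from Table 1; the variable names are illustrative. With three categories the test has 2 degrees of freedom, and for that case only the chi-square upper-tail probability has the closed form exp(-x/2), which the sketch relies on.

```python
import math

# Observed counts and claimed proportions from Table 1.
observed = {"Lark": 41, "Neither": 163, "Owl": 49}
claimed = {"Lark": 0.1, "Neither": 0.7, "Owl": 0.2}

n = sum(observed.values())
# Expected count in each category if the claimed proportions hold.
expected = {k: n * p for k, p in claimed.items()}

# Contribution of each category to the chi-square statistic.
contributions = {
    k: (observed[k] - expected[k]) ** 2 / expected[k] for k in observed
}
chi_square = sum(contributions.values())

df = len(observed) - 1
# Closed form valid for df = 2 only: P(X > x) = exp(-x / 2).
p_value = math.exp(-chi_square / 2)

for k in observed:
    print(f"{k:8s} observed={observed[k]:4d} "
          f"expected={expected[k]:7.1f} contribution={contributions[k]:.3f}")
print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.4f}")
```

The per-category contributions show which category departs most from its claimed proportion, which is a natural starting point for any follow-up analysis.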
(b) (10 marks) How much do Larks and Owls differ in their sleep habits? Use Minitab to obtain the
confidence intervals for a comparison between Larks and Owls (ignore the Neither category) on
each of the variables WeekdayBed, WeekdayRise and WeekdaySleep. These variables are stored in
the SleepStudy2020.xlsx data file and their descriptions can be found in Table A.1 in Appendix A.
For full marks, include appropriate Minitab output and check the requirements. Do not use the full
STATE-FORMULATE-SOLVE-CONCLUDE procedure. Instead, identify and interpret confidence
intervals that correspond to statistically significant differences between Larks and Owls.
Additional Minitab instructions: In order to complete this question, unstack the columns
WeekdayBed, WeekdayRise and WeekdaySleep using categories in variable LarkOwl.
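To make the unstacking step concrete, here is a small Python sketch of what the operation does to the data layout: each measurement column is split into one column per LarkOwl category. The rows below are made up for illustration and are not taken from SleepStudy2020.xlsx; the `Name_Category` column naming is an assumption chosen for readability.

```python
from collections import defaultdict

# Hypothetical stacked rows: (LarkOwl, WeekdayBed, WeekdayRise, WeekdaySleep).
# Illustrative values only, not real study data.
rows = [
    ("Lark", 23.50, 7.00, 7.50),
    ("Owl", 25.25, 9.00, 7.25),
    ("Neither", 24.00, 8.00, 8.00),
    ("Owl", 26.00, 9.50, 7.00),
    ("Lark", 23.00, 6.50, 7.25),
]
columns = ["WeekdayBed", "WeekdayRise", "WeekdaySleep"]


def unstack(rows, columns):
    """Split each measurement column into one list per LarkOwl category."""
    out = defaultdict(list)
    for category, *values in rows:
        for name, value in zip(columns, values):
            out[f"{name}_{category}"].append(value)
    return dict(out)


unstacked = unstack(rows, columns)
print(unstacked["WeekdayBed_Lark"])
print(unstacked["WeekdayBed_Owl"])
```

After unstacking, the Lark and Owl columns for a given variable can be compared directly, and the Neither columns can simply be left out of the comparison.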
Question 2 (18 marks)
Does circadian preference matter when it comes to sleep quality? In order to address this question,
you are going to work with variables PoorSleepQuality and LarkOwl from the SleepStudy2020.xlsx data
file. Variable descriptions are given in Appendix A, Table A.1.
(a) (6 marks) Use Minitab to produce boxplots of PoorSleepQuality by circadian preference (three
categories in variable LarkOwl) shown horizontally within the same graph. Comment briefly on how
sleep quality compares across circadian preference categories (Owl, Lark, Neither) and whether
you expect to find any statistically significant differences.
(b) (10 marks) Is there a statistically significant difference in sleep quality based on students’ circadian
preferences? Formulate and perform an appropriate hypothesis test at a 5% significance level. Use
the STATE-FORMULATE-SOLVE-CONCLUDE procedure. For full marks, include appropriate Minitab
output.
(c) (2 marks) Is it appropriate to argue cause and effect, in either direction, based on these results?
Why or why not? Explain briefly. Hint: What type of study is this?
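When a numerical variable is compared across three groups, one common candidate for part (b) is a one-way analysis of variance; whether it is appropriate here depends on the conditions you check. The Python sketch below shows how the F statistic is assembled from between-group and within-group variation. The scores are hypothetical and the function name is illustrative; the P-value itself is left to Minitab.

```python
from statistics import mean

# Hypothetical PoorSleepQuality scores per group (illustrative only).
groups = {
    "Lark": [4, 5, 6, 5],
    "Neither": [6, 7, 5, 6],
    "Owl": [7, 8, 6, 7],
}


def one_way_anova(groups):
    """Return (F statistic, between-group df, within-group df)."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total sample size

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values()
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - mean(v)) ** 2 for v in groups.values() for x in v
    )

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova(groups)
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```

A large F statistic means the group means differ by more than the spread within groups would suggest, which is the same comparison the boxplots in part (a) make visually.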
Question 3 (10 marks)
Which sleep related habits might influence academic performance? In order to answer this question,
you are going to investigate the relationship between GPA and each of the following variables: time to
rise on weekdays, time to go to bed on weekdays, amount of sleep per night on weekdays, and the
number of missed classes. Answer the questions that follow.
(a) (4 marks) Use Minitab to obtain the Pearson correlation coefficient and the corresponding P-value
for GPA and each of the following variables: WeekdayRise, WeekdayBed, WeekdaySleep, and
ClassesMissed. Variable descriptions are given in Table A.1 in Appendix A. For full marks, include
relevant Minitab output here (i.e. four correlations and their P-values). Do not interpret that
output; you will do that in part (b).
(b) (6 marks) Based on your Minitab output in part (a), answer the following questions:
• Do sample correlations provide sufficient evidence of an association between GPA and the
other variables? In other words, which correlation estimates are statistically significant? How
do you know?
• How strong are the relationships that turned out to be statistically significant?
• Are your statistically significant correlations positive or negative? What does it mean in
practical terms for each relationship? Explain briefly.
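The quantities Minitab reports for each pair of variables can be sketched in a few lines of Python. The example computes the Pearson correlation coefficient r and the test statistic t = r·sqrt((n − 2)/(1 − r²)), which is compared against a t distribution with n − 2 degrees of freedom to obtain the P-value. The data values are hypothetical and the function names are illustrative; the P-value lookup is left to Minitab.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Test statistic for H0: no linear association (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical values for illustration only, not real study data.
classes_missed = [0, 1, 2, 4, 6, 8]
gpa = [3.8, 3.6, 3.5, 3.1, 3.0, 2.6]

r = pearson_r(classes_missed, gpa)
t = t_statistic(r, len(gpa))
print(f"r = {r:.3f}, t = {t:.2f}, df = {len(gpa) - 2}")
```

The sign of r gives the direction of the relationship and its magnitude gives the strength, while the P-value attached to t indicates whether the correlation is statistically significant.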
Question 4 (24 marks)
Sleep Quality and DAS score. In the study students were rated on sleep quality (PoorSleepQuality) as
well as on Depression, Anxiety and Stress scales, with the DAS score (DASScore) giving a composite of
the three scores. How well does the DAS score predict sleep quality? Answer the questions that follow.
Variable descriptions are given in Table A.1 in Appendix A.
(a) (3 marks) Use Minitab to obtain a scatterplot with DASScore as the independent variable (x) and
PoorSleepQuality as the dependent variable (y). Does it make sense to fit a linear regression model
in this case? Justify your answer briefly.
(b) (2 marks) Use Minitab to fit a simple linear regression model including residual plots. You can
generate a fitted line plot if you wish (it is not required), but you must show regression tables from
the session window, together with residual plots. You will use the Minitab output from here to
answer questions that follow in parts (c) to (f).
(c) (6 marks) Are conditions for linear regression satisfied? Answer in terms of Linearity, Independence,
Normality and Population standard deviations.
(d) (2 marks) Comment on the strength of the relationship between sleep quality and DAS score using
the coefficient of determination. What is its value? What precisely does it measure in this scenario?
(e) (2 marks) What is the value of the slope? What does it measure in this scenario?
(f) (3 marks) Is the relationship between sleep quality and DAS score statistically significant? In other
words, is the slope estimate statistically significant at the 5% level? How do you know? Explain briefly.
(g) (6 marks) Suppose that one of the students at this university has a fairly high DAS score of 40. Use
Minitab to obtain a prediction of sleep quality for this student, including an appropriate interval for
that prediction. Discuss the accuracy of that prediction as shown in the Week 9 workshop.
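The regression quantities asked about in parts (d) to (g) can be sketched in Python as follows. The example fits a least-squares line, reports the slope, intercept and coefficient of determination, and computes a point prediction together with its standard error. The data values are hypothetical and the function names are illustrative. The interval itself is the prediction plus or minus a t critical value (n − 2 degrees of freedom) times that standard error; the critical value is left to Minitab.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "n": n, "x_bar": x_bar, "sxx": sxx,
        "slope": slope, "intercept": intercept,
        "r_squared": 1 - sse / sst,       # share of variation in y explained
        "s": math.sqrt(sse / (n - 2)),    # residual standard deviation
    }


def predict(model, x0):
    """Point prediction at x0 and the standard error of a new observation."""
    y_hat = model["intercept"] + model["slope"] * x0
    se_pred = model["s"] * math.sqrt(
        1 + 1 / model["n"] + (x0 - model["x_bar"]) ** 2 / model["sxx"]
    )
    return y_hat, se_pred


# Hypothetical values for illustration only, not real study data.
das_score = [5, 10, 15, 20, 30, 45]
poor_sleep_quality = [4, 5, 6, 6, 8, 11]

model = fit_line(das_score, poor_sleep_quality)
y_hat, se_pred = predict(model, 40)
print(f"slope = {model['slope']:.3f}, R-sq = {model['r_squared']:.3f}")
print(f"prediction at DAS 40 = {y_hat:.2f}, SE of prediction = {se_pred:.2f}")
```

The standard error grows as the chosen DAS score moves away from the sample mean, so a prediction near the edge of the observed range is less precise than one near the centre.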
Statistical Analysis Report (28 marks)
Your report should consist of sections described below.
Introduction (3 marks)
Provide the context and rationale for the study. Use your own words!
There is no word limit, just ensure you have explained what the report will contain. As a guideline, one
paragraph is sufficient.
Methods (6 marks)
Discuss the methods used to collect and analyse data from this study:
• What type of study was conducted? Name the study design and briefly describe, in your own words,
the interventions that were part of this study.
• Describe the sample (including the sample size and any demographic information, e.g. who the
study participants were, their age, gender split etc).
• Briefly describe variables that you have analysed.
• Provide a list of statistical displays and procedures that you have used, along with confidence and
significance levels used in the analysis.
There is no word limit. As a guideline, one paragraph for this section is sufficient.
Results & Discussion (16 marks)
First, summarise the main results of your analyses from Questions 1 to 4. You may use subsections,
tables etc. as you see fit. Present and discuss results in a clear and simple way:
• Present findings of statistical analyses in a logical sequence. Descriptive statistics about variables
of interest are usually presented first, followed by the results of further statistical analyses.
• Include copies of key diagrams from Questions 1 to 4 as relevant to your presentation of results.
Useful diagrams to include in a report are bar charts, histograms, boxplots, error diagrams,
scatterplots, etc. Normal Probability Plots or residual plots should not be included in the report.
• State each result and the corresponding statistical procedure, and report P-values to three decimal
places. However, do not include numerical calculations or full details of statistical procedures and
condition checking (e.g. full Minitab output).
Next, interpret your statistical findings by discussing their practical significance. Use plain language;
there should be no technical details or statistical terminology. Are any of the results surprising?
Finally, in another short paragraph indicate shortcomings, if any, of the study design and analyses that
were performed. Are there any issues with internal and external validity of this study?
There is no word limit. As a guideline, one and a half pages (two pages at most) will be sufficient for this
section, including any tables and graphs. Remember, marks are awarded for quality not quantity!
Conclusion (3 marks)
What can you conclude from your analysis about sleep quality, mood, circadian preference and
academic performance? Which factors appear to be important?
There is no word limit. As a guideline, one paragraph will be sufficient. Do not introduce any new
information in this section, and do not simply repeat statements made elsewhere in your report!
Note: You are not required to include additional sources (e.g. internet articles or scientific papers) but
if you do, ensure you include a reference list and cite them in text appropriately.
Appendix A. Data file and variable descriptions
Some of the data from the Onyper et al. (2012) study is stored in the file called SleepStudy2020.xlsx that
can be downloaded from the Data tab within the course website. Below are descriptions of variables in
that data file:
| Name | Description |
|---|---|
| ClassesMissed | Number of classes missed in a semester |
| ClassYear | Year level from first to fourth year, coded 1 to 4 |
| DASScore | Combined score on Depression, Anxiety and Stress scale, commonly used to assess mood. Higher values indicate more mood complaints, e.g. depression, anxiety and/or stress |
| Gender | 0 = Female and 1 = Male |
| GPA | Grade point average measured on 0-4 scale, self-reported |
| LarkOwl | Responses to the following survey question: ‘Are you an early riser or a night owl?’ Possible categories: Lark, Neither, or Owl |
| PoorSleepQuality | Measure of sleep quality derived from responses to the Pittsburgh Sleep Quality Index (PSQI) questionnaire. Higher values indicate poorer sleep |
| WeekdayBed | Typical weekday bedtime derived from responses on PSQI and sleep diaries, reported in hours since previous midnight, e.g. a bedtime of 25 corresponds to going to bed at 1:00 am |
| WeekdayRise | Typical weekday rise time derived from responses on PSQI and sleep diaries, reported in hours since midnight, e.g. a rise time of 7.25 corresponds to getting up at 7:15 am |
| WeekdaySleep | Typical weekday sleep duration, estimated as a period of time from shutting the eyes with intent to go to sleep until the time the participants awoke and did not close their eyes to go back to sleep. Derived from responses on PSQI and sleep diaries and reported in |

