Objectives
This assessment addresses Unit Learning Outcomes 1, 2, 3 & 4:
• Explain how statistical choices in analysis link directly to the research study design that generated the data, and the type of data,
• Explain the rationale behind hypothesis testing, and the concept of Type I and II errors,
• Differentiate the most appropriate descriptive and inferential statistics to use for common types of health data,
• Analyse health data using a statistical software package, and interpret the results.
Task
Your task is to answer a set of research questions using data from the Framingham Heart Study (sent by email). The data were obtained at a single point in time (cross-sectional) from adult volunteers recruited from the community and contain a variety of clinical measurements. Details of the data set and its variables are provided below. Additional information about the Framingham Heart Study, including some published results, can be found in Learning Resources.
Description of Variables
Variable | Description | Values
ID | Unique identification number for each participant | 2448 – 9999312
SEX | Participant sex | 1 = Male; 2 = Female; 999 = Missing
AGE | Age at exam (years) | 32 – 81; 999 = Missing
SYSBP | Systolic Blood Pressure (mean of last two of three measurements) (mmHg) | 83.5 – 295; 999 = Missing
DIABP | Diastolic Blood Pressure (mean of last two of three measurements) (mmHg) | 30 – 150; 999 = Missing
CIGPDAY | Number of cigarettes smoked each day | 0 – 90; 999 = Missing
EDU | Attained education level | 0 = zero to 5years; 1 = 6_to_11years; 2 = High_school_dipl; 3 = Some_college_or_TAFE; 4 = College_degree_or_more; 999 = Missing
TOTCHOL | Serum Total Cholesterol (mg/dL) | 107 – 696; 999 = Missing
HDLC | High Density Lipoprotein Cholesterol (mg/dL) | 10 – 189; 999 = Missing
LDLC | Low Density Lipoprotein Cholesterol (mg/dL) | 20 – 565; 999 = Missing
BMI | Body Mass Index (weight in kilograms / height in meters squared) | 14.43 – 56.8; 999 = Missing
GLUCOSE | Casual serum glucose (mg/dL) | 39 – 478; 999 = Missing
DIABETES | Diabetic according to criteria of first exam treated or first exam with casual glucose of 200 mg/dL or more | 0 = No; 1 = Yes; 999 = Missing
HEARTRTE | Heart rate (Ventricular rate) in beats/min | 37 – 220; 999 = Missing
ANGINA | History of Angina Pectoris at exam | 0 = No; 1 = Yes; 999 = Missing
ANYCHD | History of Coronary Heart Disease defined as preexisting Angina Pectoris, Myocardial Infarction (hospitalized, silent or unrecognized), or Coronary Insufficiency (unstable angina) | 0 = No; 1 = Yes; 999 = Missing
STROKE | History of stroke | 0 = No; 1 = Yes; 999 = Missing
HYPERTEN | History of hypertension. Subject was defined as hypertensive if treated or if second exam at which mean systolic was >=140 mmHg or mean Diastolic >=90 mmHg | 0 = No; 1 = Yes; 999 = Missing

A cleaned (error-free) subset of the original dataset, unique to you, has been sent by email, and you must use this version for the Assignment. Note: In SPSS you will need to specify which value(s) represent missing values and the measure type for each variable (e.g. scale, nominal, etc.). These are the only edits you need to make, as value labels are already defined in the dataset.
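The effect of declaring 999 as a missing value can be illustrated outside SPSS. The following is a minimal sketch in Python (standard library only), not part of the required SPSS workflow. The rows and the helper name `recode_missing` are hypothetical and are not taken from the real dataset. It shows why the code 999 must be excluded before any statistic is calculated.

```python
import math

MISSING_CODE = 999  # missing-value code listed in the variable table (not used for ID)


def recode_missing(value, missing_code=MISSING_CODE):
    """Return NaN when the value equals the missing code, else the value as a float."""
    return math.nan if value == missing_code else float(value)


# Hypothetical rows for illustration only (not real Framingham records).
rows = [
    {"ID": 2448, "AGE": 39, "SYSBP": 106.0},
    {"ID": 6238, "AGE": 999, "SYSBP": 121.0},
    {"ID": 9428, "AGE": 48, "SYSBP": 999},
]

# ID is an identifier, so it is left untouched; every other field is recoded.
cleaned = [
    {key: (value if key == "ID" else recode_missing(value)) for key, value in row.items()}
    for row in rows
]

# Statistics should use valid values only; a naive mean of AGE would include 999.
valid_ages = [row["AGE"] for row in cleaned if not math.isnan(row["AGE"])]
mean_age = sum(valid_ages) / len(valid_ages)
print(mean_age)
```

Without the recoding step, the mean age of these three hypothetical rows would be inflated by the 999 code, which is the kind of error the missing-value declaration is meant to prevent.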
In your analytical report you are required to answer the following research questions:
Answer all of the questions (Q1, 2, 3, 4 & 5):
Q1: Is there an association between smoking (Yes/No) and hypertension? (Note: you will need to create a new variable called ‘SMOKER’ containing two groups (‘Yes’ or ‘No’), using the information on the number of cigarettes smoked per day. Include the number of your assigned dataset in the new variable name, e.g. if you have been assigned the dataset ‘Framingham_42.sav’, name the variable ‘SMOKER_42’.)
Q2: Are there differences in systolic blood pressure between people who are underweight, normal weight, overweight or obese? (Note: you will need to create a new variable called ‘BMI_4grps’ using the existing variable ‘BMI’. Define: BMI < 20.0 = ‘underweight’; 20.0 ≤ BMI < 25.0 = ‘normal’; 25.0 ≤ BMI < 30.0 = ‘overweight’; BMI ≥ 30.0 = ‘obese’. Include the number of your assigned dataset in the new variable name, e.g. if you have been assigned the dataset ‘Framingham_42.sav’, name the variable ‘BMI_4grps_42’.)
Q3: Is there a difference in serum total cholesterol between males and females?
Q4: Is there an association between a participant’s age and their heart rate?
Q5: Are age, low density lipoprotein cholesterol, serum total cholesterol, casual serum glucose level and body mass index significant predictors of a person’s systolic blood pressure? Which of these variables explains the largest amount of variation in systolic blood pressure?
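The recoding rules for the two new variables in Q1 and Q2 can be sketched as follows. This is an illustrative Python sketch only, since the assignment itself must be completed in SPSS. The function names are hypothetical. The smoker rule (any value of CIGPDAY above zero counts as ‘Yes’) is an assumption, because the brief does not state a cut-off. The dataset number 42 is the example number used in the questions above.

```python
MISSING_CODE = 999  # missing-value code from the variable table


def smoker_status(cigs_per_day):
    """Return 'Yes' or 'No' from CIGPDAY, or None when the value is missing.

    Assumption: anyone smoking more than 0 cigarettes per day is a smoker.
    """
    if cigs_per_day is None or cigs_per_day == MISSING_CODE:
        return None
    return "Yes" if cigs_per_day > 0 else "No"


def bmi_group(bmi):
    """Return the BMI category defined in Q2, or None when the value is missing."""
    if bmi is None or bmi == MISSING_CODE:
        return None
    if bmi < 20.0:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"


DATASET_NUMBER = 42  # example number from the brief; use your own dataset number

# Hypothetical record for illustration only.
record = {"CIGPDAY": 20, "BMI": 25.0}
record[f"SMOKER_{DATASET_NUMBER}"] = smoker_status(record["CIGPDAY"])
record[f"BMI_4grps_{DATASET_NUMBER}"] = bmi_group(record["BMI"])
print(record)
```

Note that the boundaries follow the definitions in Q2 exactly: a BMI of exactly 25.0 falls in ‘overweight’ and a BMI of exactly 30.0 falls in ‘obese’. Missing values stay missing rather than being assigned to a group.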
For each research question (Q1 to 5) you are required to fully detail an analytical plan, similar to that used in the PUN105 Activity Workbook, Week 6 (page 35).
Please use the marking guide on pages 6 and 7 to guide the extent of the analysis and answers presented for each question.
This should include, at a minimum, the following:
• State the question
• Develop and clearly articulate an analysis plan that will allow you to answer the question
• Implement the analysis plan using SPSS and report all relevant output. If you need to modify or create new variables to implement the plan, describe these new or modified variables and how they were calculated.
• Interpret the results of the analysis
• Write a summary paragraph describing the question, the data and the results. Graphics should be incorporated if relevant.
• Tables and figures in the report should be professionally presented with clear numbering, titles and appropriate referencing in the written sections of the report, e.g. ‘Table 1.1 shows the results from a Chi-square test examining the association between …’
• The original (raw) numerical output from SPSS used to present the results should be presented in an Appendix to the report, with clear sections indicating the question and results the output refers to. Failure to provide the original (raw) output from SPSS in an Appendix will result in the assignment being returned to you.
Formatting and Word Limits
Your report should contain a title page clearly identifying the unit code, your name and student number. You should also indicate the word count for each section of your report, as outlined below, and the file name and number of the dataset you used. The number of your dataset also needs to be included in the name of any new variables you create (where a question requires them). Failure to do so will result in the assignment not being marked.
Each research question should be treated as a separate section in your report and it is expected that you will use appropriate headings within each section.
You are not required to provide a formal introduction, or search any literature or provide references in your analytical report.
The report must:
• have a minimum of 1.5 line spacing, and
• have page margins no smaller than 2 cm
It is expected the report will be well written using professional language and be free from grammatical and spelling errors. The written sections of the report should be no longer than 3,000 words excluding the analysis plans and SPSS output. The word count (excluding the analysis plan and SPSS output) for each section should be stated on the title page of the report.
Marking Criteria
The analytical report will be marked out of a total of 345 marks according to the criteria on pages 6 & 7 (last 2 pages). Please ensure you review the criteria prior to submitting your assessment.
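The section maxima in the marking criteria table sum to 345, and the table states that this total will be converted to a final mark of 60. The sketch below, in Python, checks that arithmetic and shows one way the conversion could work. Simple proportional scaling is an assumption here, as the brief does not say how the conversion is performed, and the function name `convert_mark` is hypothetical.

```python
RAW_TOTAL = 345    # total raw marks shown in the marking criteria table
FINAL_TOTAL = 60   # final mark the raw total is converted to

# Maximum marks per section, as listed in the marking criteria table.
SECTION_MAX = {
    "Question 1": 60,
    "Question 2": 75,
    "Question 3": 50,
    "Question 4": 50,
    "Question 5": 90,
    "Overall Report": 20,
}


def convert_mark(raw_mark):
    """Scale a raw mark out of 345 to a mark out of 60.

    Assumption: the conversion is a simple proportional rescaling.
    """
    if not 0 <= raw_mark <= RAW_TOTAL:
        raise ValueError(f"raw mark must be between 0 and {RAW_TOTAL}")
    return raw_mark * FINAL_TOTAL / RAW_TOTAL


# The section maxima should add up to the stated raw total.
assert sum(SECTION_MAX.values()) == RAW_TOTAL

print(round(convert_mark(230), 1))
```

Under this assumption each raw mark is worth 60/345 of a final mark, so Question 5 (90 raw marks) carries the most weight of any single section.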
Marking Criteria

Element | Max. marks
Question 1 | (60)
• Clear & comprehensive analytical plan to answer question that is technically correct including scientific hypothesis, statistical test & assumptions | 10
• Clear description of new variable created and process used | 10
• All SPSS output included and matches the analytical plan | 10
• Clearly documented evidence that all test assumptions have been tested for validity (& revision of analysis if required) | 10
• Concise & accurate written summary describing the data | 10
• Comprehensive and correct interpretation and reporting of statistical results | 10
Question 2 | (75)
• Clear & comprehensive analytical plan to answer question that is technically correct including scientific hypothesis, statistical test & assumptions | 10
• Clear description of new variable created and process used | 10
• All SPSS output included to match the analytical plan | 10
• Clearly documented evidence that all test assumptions have been tested for validity (& revision of analysis if required) | 15
• Concise & accurate written summary describing the data | 10
• Comprehensive and correct interpretation and reporting of statistical results | 20
Question 3 | (50)
• Clear & comprehensive analytical plan to answer question that is technically correct including scientific hypothesis, statistical test & assumptions | 10
• All SPSS output included to match the analytical plan | 10
• Clearly documented evidence that all test assumptions have been tested for validity (& revision of analysis if required) | 10
• Concise & accurate written summary describing the data | 10
• Comprehensive and correct interpretation and reporting of statistical results | 10
Question 4 | (50)
• Clear & comprehensive analytical plan to answer question that is technically correct including scientific hypothesis, statistical test & assumptions | 10
• All SPSS output included to match the analytical plan | 10
• Clearly documented evidence that all test assumptions have been tested for validity (& revision of analysis if required) | 10
• Concise & accurate written summary describing the data | 10
• Comprehensive and correct interpretation and reporting of statistical results | 10
Question 5 | (90)
• Clear & comprehensive analytical plan to answer question that is technically correct including scientific hypothesis, statistical test & assumptions | 20
• Clear description of univariate & bivariate analysis undertaken | 20
• Clearly documented evidence that all test assumptions have been tested for validity, and test all relevant correlations, describe significant single linear relationships with regression and multiple regression | 25
• All SPSS output included to match the analytical plan | 10
• Concise & accurate written summary describing the data | 15
Overall Report | (20)
• Written report contains all of the required information and adheres to formatting requirements including maximum prescribed length | 10
• Written report uses professional language to clearly articulate meaning with minimal typographic and grammatical errors | 10
TOTAL (this will be converted to a final mark of 60) | 345