Q1: Is there an association between smoking (Yes/No) and hypertension? (Note: you will need to create a new variable called 'SMOKER' which will contain two groups ('Yes' or 'No') using information on the number of cigarettes smoked per day.) In the new variable name include the number of the dataset you have been assigned, e.g., if you have been assigned the dataset 'Framingham_42.sav', name the variable 'SMOKER_42'.
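The recoding described in the note can be sketched in Python as follows. This is only an illustrative sketch, not the SPSS procedure used for the assignment: the function name `recode_smoker`, the sample values, and the treatment of 999 as the missing-value code (suggested by the "Missing 999" row in the frequency output) are assumptions.

```python
from typing import Optional

MISSING_CODE = 999  # assumed missing-value code for cigarettes per day


def recode_smoker(cigs_per_day: Optional[int]) -> Optional[str]:
    """Collapse cigarettes smoked per day into 'Yes'/'No' (None if missing)."""
    if cigs_per_day is None or cigs_per_day == MISSING_CODE:
        return None
    return "Yes" if cigs_per_day > 0 else "No"


# Hypothetical values, for illustration only
cigs = [0, 20, 5, 999, 0]
smoker_29 = [recode_smoker(c) for c in cigs]
print(smoker_29)
```

Keeping missing values as missing, rather than coding them as 'No', is what leaves 4 cases out of the valid count later in the analysis.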
Analytical plan:
STUDY DESIGN
Observational: the Framingham Heart Study sample is analysed here as a cross-sectional study
VARIABLES
IV: Smoker (categorical, dichotomous)
DV: Hypertension (categorical, dichotomous)
HYPOTHESIS
H0: The proportion of people who smoke is the same for people with hypertension and those without hypertension
H1: The proportion of people who smoke is different for people with hypertension and those without hypertension
UNIVARIATE ANALYSIS
Smoker: numerical summary = proportion (or %) of the sample who smoke; graphical summary could be a bar graph {note: since the data are dichotomous it is usually better to summarize numerically}
Hypertension: numerical summary = proportion (or %) of the sample with hypertension; graphical summary could be a bar graph.
BIVARIATE ANALYSIS
In the bivariate analysis, a cross-tabulation (contingency table) is used to examine the relationship between hypertension and smoking status.
Numerical summary = 2 x 2 contingency table. Graphical summary = side-by-side bar chart.
STATISTICAL TESTS AND ASSUMPTIONS
Chi-square test of independence. Because the table is 2 x 2, Fisher's exact test or the continuity correction should be used rather than Pearson's chi-square. I will use Fisher's exact test. {Note: the choice of which test to use is up to you. You should state which test you are going to use and then list the assumptions for the test.}
Fisher's exact test assumptions: 1) observations are independent
SIGNIFICANCE LEVELS
P < 0.05
Results:
Univariate analysis:
This sample contains 300 individuals. For SMOKER_29, 296 observations are valid and 4 are missing. Of the 300 individuals, 166 (55.3%) do not smoke and 130 (43.3%) smoke; smoking status could not be determined for 4 (1.3%) because of missing data. In the same sample, 78 (26.0%) do not have incident hypertension and 222 (74.0%) have incident hypertension.
The tables below summarise the results from the frequency tables.
Statistics
| | | Smoker_29 | Incident Hypertension |
|---|---|---|---|
| N | Valid | 296 | 300 |
| | Missing | 4 | 0 |
SUMMARY TABLE SHOWING UNIVARIABLE ANALYSIS - SMOKER_29
| | | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|---|
| Valid | No | 166 | 55.3 | 56.1 | 56.1 |
| | Yes | 130 | 43.3 | 43.9 | 100.0 |
| | Total | 296 | 98.7 | 100.0 | |
| Missing | 999 | 4 | 1.3 | | |
| Total | | 300 | 100.0 | | |
SUMMARY TABLE SHOWING UNIVARIABLE ANALYSIS - INCIDENT HYPERTENSION
| | | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|---|
| Valid | No | 78 | 26.0 | 26.0 | 26.0 |
| | Yes | 222 | 74.0 | 74.0 | 100.0 |
| | Total | 300 | 100.0 | 100.0 | |
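The Percent, Valid Percent and Cumulative Percent columns of the frequency tables can be checked by hand from the counts. The sketch below is one way to do that in Python; the helper name `frequency_table` is hypothetical, and the rounding to one decimal place is an assumption chosen to mirror the SPSS display.

```python
def frequency_table(counts, n_missing=0):
    """Return {label: (percent, valid_percent, cumulative_percent)}.

    Percent uses all cases (valid + missing) as the denominator;
    Valid Percent uses only the valid cases.
    """
    n_valid = sum(counts.values())
    n_total = n_valid + n_missing
    rows = {}
    cumulative = 0.0
    for label, count in counts.items():
        percent = 100 * count / n_total
        valid_percent = 100 * count / n_valid
        cumulative += valid_percent
        rows[label] = (round(percent, 1), round(valid_percent, 1), round(cumulative, 1))
    return rows


# Counts taken from the frequency tables in this report
smoker = frequency_table({"No": 166, "Yes": 130}, n_missing=4)
hypertension = frequency_table({"No": 78, "Yes": 222})
print(smoker)
print(hypertension)
```

The difference between the two denominators is why 166 non-smokers appear as 55.3% in the Percent column but 56.1% in the Valid Percent column.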
Graphical Representation
In the univariate analysis each variable is displayed as a bar chart because both variables are categorical. For incident hypertension, the categories (No/Yes) are on the x-axis and the count is on the y-axis.
GRAPH SHOWING UNIVARIATE VARIABLE (INCIDENT HYPERTENSION)
The bar chart for SMOKER_29 shows the count on the y-axis and the SMOKER_29 categories (Yes and No) on the x-axis. There are more non-smokers (166) than smokers (130).
Bivariate analysis
In this analysis, 296 (98.7%) individuals have valid data on both variables and 4 (1.3%) are missing.
Case processing summary (cases)
| | Valid N | Valid Percent | Missing N | Missing Percent | Total N | Total Percent |
|---|---|---|---|---|---|---|
| Incident Hypertension * SMOKER_29 | 296 | 98.7% | 4 | 1.3% | 300 | 100.0% |
The contingency table summarises the relationship between incident hypertension and SMOKER_29. Among the 78 individuals without hypertension, 31 (39.7%) do not smoke and 47 (60.3%) smoke. Among the 218 individuals with hypertension, 135 (61.9%) do not smoke and 83 (38.1%) smoke.
Incident Hypertension * SMOKER_29 crosstabulation
| Incident Hypertension | | SMOKER_29: No | SMOKER_29: Yes | Total |
|---|---|---|---|---|
| No | Count | 31 | 47 | 78 |
| | % within Incident Hypertension | 39.7% | 60.3% | 100.0% |
| Yes | Count | 135 | 83 | 218 |
| | % within Incident Hypertension | 61.9% | 38.1% | 100.0% |
| Total | Count | 166 | 130 | 296 |
| | % within Incident Hypertension | 56.1% | 43.9% | 100.0% |
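The "% within Incident Hypertension" rows are row percentages: each count divided by its row total. A minimal Python sketch of that calculation is given here, using the counts from the crosstabulation; the variable and function names are illustrative only.

```python
# Counts from the crosstabulation: rows = incident hypertension, columns = SMOKER_29
crosstab = {
    "No": {"No": 31, "Yes": 47},
    "Yes": {"No": 135, "Yes": 83},
}


def row_percentages(row):
    """Return each cell as a percentage of its row total, rounded to 1 decimal."""
    total = sum(row.values())
    return {label: round(100 * count / total, 1) for label, count in row.items()}


within_hypertension = {status: row_percentages(row) for status, row in crosstab.items()}

# Column totals give the "Total" row of the table
totals = {col: sum(row[col] for row in crosstab.values()) for col in ("No", "Yes")}
within_hypertension["Total"] = row_percentages(totals)

for status, percentages in within_hypertension.items():
    print(status, percentages)
```

Comparing the "Yes" column across the two hypertension rows is the comparison that the hypothesis test formalises.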
Graphical representation
In the bivariate analysis, the relationship between incident hypertension and SMOKER_29 is shown with a side-by-side bar chart.
Chi-square analysis:
Chi-square tests are run because this is a 2 x 2 contingency table. The output gives the degrees of freedom, the continuity correction and Fisher's exact test.
Fisher's exact test is used with a two-tailed hypothesis. The p-value for this sample is 0.001, which is less than 0.05, so the result is statistically significant and we reject the null hypothesis. We conclude that the proportion of people who smoke differs between people with hypertension and those without hypertension. The assumption of Fisher's exact test, that observations are independent, is met because of the study design.
The proportion of people who smoke is different for people with hypertension (38.1%) and those without hypertension (60.3%) (Fisher's exact test, p = 0.001).
This summary describes the data and the relationship between SMOKER_29 and hypertension.
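For readers who want to see what Fisher's exact test does with the 2 x 2 counts, the following Python sketch computes a two-sided p-value from the hypergeometric distribution. It is an illustration under stated assumptions, not the SPSS implementation: the function name is hypothetical, and the two-sided rule used here (summing the probabilities of all tables that are no more likely than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # given fixed row and column totals
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so floating-point ties are counted
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# Rows: hypertension No / Yes; columns: SMOKER_29 No / Yes
p_value = fisher_exact_two_sided(31, 47, 135, 83)
print(f"Fisher's exact test, two-sided p = {p_value:.4f}")
```

Because the test conditions on the row and column totals, it does not rely on a large-sample approximation, which is why it is often preferred for 2 x 2 tables.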
Chi-Square Tests
| | Value | df | Asymptotic Significance (2-sided) | Exact Sig. (2-sided) | Exact Sig. (1-sided) |
|---|---|---|---|---|---|
| Pearson Chi-Square | 11.477a | 1 | .001 | | |
| Continuity Correction | 10.594 | 1 | .001 | | |
| Likelihood Ratio | 11.440 | 1 | .001 | | |
| Fisher's Exact Test | | | | .001 | .001 |
| Linear-by-Linear Association | 11.438 | 1 | .001 | | |
| N of Valid Cases | 296 | | | | |
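The Pearson chi-square and continuity-corrected values in the table can be reproduced from the four cell counts with the shortcut formula for a 2 x 2 table. The Python sketch below is a hand check under that formula, not SPSS code; the function name is hypothetical, and the p-value uses the fact that for 1 degree of freedom the upper tail of the chi-square distribution equals erfc(sqrt(x / 2)).

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d, continuity=False):
    """Chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if continuity:
        # Yates' continuity correction shrinks the difference by n / 2
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p


# Rows: hypertension No / Yes; columns: SMOKER_29 No / Yes
pearson, p_pearson = chi_square_2x2(31, 47, 135, 83)
corrected, p_corrected = chi_square_2x2(31, 47, 135, 83, continuity=True)
print(f"Pearson chi-square = {pearson:.3f}, p = {p_pearson:.4f}")
print(f"Continuity correction = {corrected:.3f}, p = {p_corrected:.4f}")
```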
SUMMARY:
The data consist of two dichotomous categorical variables: SMOKER_29 (independent variable) and hypertension (dependent variable). Each variable is displayed with a bar chart, and the bivariate relationship is displayed with a side-by-side bar chart. The statistical analysis uses the chi-square tests, from which Fisher's exact test gives a p-value of 0.001. This is less than 0.05, so we reject the null hypothesis and conclude that the proportion of people who smoke is different for people with hypertension and those without hypertension.
APPENDIX:
Statistics
| | | SMOKER_29 | Incident Hypertension |
|---|---|---|---|
| N | Valid | 296 | 300 |
| | Missing | 4 | 0 |
FREQUENCY TABLE
SMOKER_29
| | | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|---|
| Valid | No | 166 | 55.3 | 56.1 | 56.1 |
| | Yes | 130 | 43.3 | 43.9 | 100.0 |
| | Total | 296 | 98.7 | 100.0 | |
| Missing | 999 | 4 | 1.3 | | |
| Total | | 300 | 100.0 | | |
Incident Hypertension
| | | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|---|
| Valid | No | 78 | 26.0 | 26.0 | 26.0 |
| | Yes | 222 | 74.0 | 74.0 | 100.0 |
| | Total | 300 | 100.0 | 100.0 | |
CROSS TABS
Case Processing Summary (cases)
| | Valid N | Valid Percent | Missing N | Missing Percent | Total N | Total Percent |
|---|---|---|---|---|---|---|
| Incident Hypertension * SMOKER_29 | 296 | 98.7% | 4 | 1.3% | 300 | 100.0% |
Incident Hypertension * SMOKER_29 Crosstabulation
Count
| Incident Hypertension | SMOKER_29: No | SMOKER_29: Yes | Total |
|---|---|---|---|
| No | 31 | 47 | 78 |
| Yes | 135 | 83 | 218 |
| Total | 166 | 130 | 296 |
Incident Hypertension * SMOKER_29 Crosstabulation
| Incident Hypertension | | SMOKER_29: No | SMOKER_29: Yes | Total |
|---|---|---|---|---|
| No | Count | 31 | 47 | 78 |
| | % within Incident Hypertension | 39.7% | 60.3% | 100.0% |
| Yes | Count | 135 | 83 | 218 |
| | % within Incident Hypertension | 61.9% | 38.1% | 100.0% |
| Total | Count | 166 | 130 | 296 |
| | % within Incident Hypertension | 56.1% | 43.9% | 100.0% |
Chi-Square Tests
| | Value | df | Asymptotic Significance (2-sided) | Exact Sig. (2-sided) | Exact Sig. (1-sided) |
|---|---|---|---|---|---|
| Pearson Chi-Square | 11.477a | 1 | .001 | | |
| Continuity Correctionb | 10.594 | 1 | .001 | | |
| Likelihood Ratio | 11.440 | 1 | .001 | | |
| Fisher's Exact Test | | | | .001 | .001 |
| Linear-by-Linear Association | 11.438 | 1 | .001 | | |
| N of Valid Cases | 296 | | | | |