- This assignment should be completed individually. It is worth 20% of the total assessment (maximum total allocated marks = 100).
- Use Excel and/or SPSS in your computational work.
- Please provide manual calculations for all tasks; otherwise marks will be deducted.
- Present your computer results of Tasks 3 to 8 as an appendix to support your case.
- If you need extra help, visit the help desk, attend lab sessions, or see your tutor or lecturer during their consulting times.
- Please do not copy your data set into your Word files, since it will trigger the plagiarism issue and the University regulations are very strict on this.
- Please do not write the questions in your assignment, since it will trigger the plagiarism issue; refer to the task number instead.
- Please copy your data set into the Excel files that you need to submit with the assignment.
- Please prepare the manual calculations together with the SPSS or Excel calculations (unless advised otherwise by the tutor).
The data set for the end-of-semester assignment is in a file called Data Set for Assignment.xls, which presents Weekly Income (WI), Weekly Expenditure on Food (WEF), Highest Level of Education (HLE), Family Size (FS), and Gender of the Head of Household (GHH) for a population of 1000 households.
- Column A lists the households, numbered from 1 to 1000.
- Columns B to F record these households’ WI, WEF, HLE, FS and GHH, respectively.
- Column I lists the 100 households that make up your sample, chosen as follows:
- 1st household is based on the first 3 digits of your student ID
- 2nd household is based on the last 3 digits of your student ID
- 3rd to 5th households are based on the day, the month, and the last 2 digits of the year of your birthday, respectively.
- 6th to 100th households are randomly chosen.
- Example: if your student ID is 181XX728 and you were born on 31 March 1985
- 1st household is no. 181
- 2nd household is no. 728
- 3rd household is no. 31, 4th is no. 03, and 5th is no. 85
- 6th to 100th households are randomly chosen
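The selection rule for the first five households can be sketched in Python as a quick check of your spreadsheet. This is only an illustration: the function name `fixed_households` is hypothetical, and the brief does not say what to do if a number is 0 or repeats, so the sketch does not handle those cases.

```python
from datetime import date

def fixed_households(student_id: str, birthday: date) -> list[int]:
    """Return the five household numbers fixed by student ID and birthday.

    Hypothetical helper, not part of the assignment files.
    """
    first = int(student_id[:3])    # first 3 digits of the student ID
    second = int(student_id[-3:])  # last 3 digits of the student ID
    # Day, month, and last two digits of the birth year, in that order.
    return [first, second, birthday.day, birthday.month, birthday.year % 100]

# The worked example from the brief: ID 181XX728, born 31 March 1985.
example = fixed_households("181XX728", date(1985, 3, 31))
print(example)
```

The remaining 95 households (6th to 100th) are the randomly chosen ones described above.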
Task 1 (10 marks)
- Organize your sample data in a spreadsheet as per the instructions above. (Students who fail to follow these instructions will not be marked and will be awarded a mark of 0.)
- What sampling method is used to select your sample data?
- Do you think that is the best method of sampling? Why or why not?
- What is the best statistic used to compare the volatility in WEF, WI, and FS values? Why?
Task 2 (10 marks)
Based on your sample data:
- Construct a frequency table and a bar chart of WI based on these intervals:
- 1st Class = Very Poor
- 2nd Class = Poor
- 3rd Class = Moderate
- 4th Class = Rich
- 5th Class = Very Rich
- What is the most frequent group in your WI sample data? What does that indicate about the distribution of your data?
- Do you think the WI values in your sample are normally distributed? Provide the “statistical reason” for your answer.
Task 3 (10 marks)
- What are the top 10% and bottom 10% of the WEF values in your sample?
- What is the probability that your WI values will be less than or equal to $200?
- What is the probability that FS will be equal to 2?
- Are there any outliers in your sample WEF data? Show a graph or other proof.
If yes, what is the best statistic to measure the dispersion of your WEF?
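One common way to back up an outlier claim, alongside a box plot, is the 1.5 × IQR rule (Tukey's fences). The sketch below is an illustration under stated assumptions: the function name `tukey_outliers` and the WEF values are made up, and quartile conventions differ between software packages, so Excel or SPSS may report slightly different fences.

```python
from statistics import quantiles

def tukey_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences."""
    # "inclusive" treats the data as spanning its own min and max;
    # other quartile conventions can give slightly different fences.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical WEF values, for illustration only (not from the data set).
wef = [10, 12, 14, 15, 16, 18, 20, 95]
outliers = tukey_outliers(wef)
print(outliers)
```

This is a cross-check only; the manual calculation is still required.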
Task 4 (15 marks)
- What is the probability that the head of household is a woman and her HLE is Primary?
- What is the probability that the head of household is a man and has a College degree?
- Among male heads of household, what proportion have Secondary as their highest level of education?
- Among male heads of household, what proportion have Intermediate as their highest level of education?
- Do you think that the head of household being male and having a College degree are independent events?
For Task 5 onward, assume that your sample data are normally distributed.
Task 5 (15 marks)
- Provide the most accurate interval estimate of WI and interpret your result.
- Provide the least accurate interval estimate of WEF and interpret your result.
- Provide the most and least accurate interval estimates of FS and interpret your results.
- Explain the main differences between the most and least accurate interval estimates.
Why are they called the most and least accurate interval estimates?
Task 6 (15 marks)
- After surveying many countries, Michael Scott, a La Trobe University researcher, believes that for a city to be considered wealthy, the average weekly household income must be at least $1200. Based on this statement, can you consider your sample to be from a wealthy city? (𝛼 = 0.10)
- Michael Scott also believes that a city can be considered fertile if the average household family size is greater than 8 (𝜇𝐹𝑆 > 8). Based on this statement, can you consider your sample to be from a fertile city? (𝛼 = 0.05)
- Michael Scott also believes that a city can be considered obese if the average weekly household expenditure on food is greater than or equal to 50. Based on this statement, can you consider your sample to be from an obese city? (𝛼 = 0.01)
- Based on the calculation above, which prediction is the most accurate and why?
Task 7 (15 marks)
- What is the relationship between the amount of WEF and FS in your sample?
- What is the relationship between the amount of WI and gender of the family head in your sample?
- How do HLE and WI affect WEF in your selected sample?
NB: Use the estimated linear regression line, R, R², and a graph to explain each relationship.
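As a cross-check on the manual and Excel/SPSS results, a simple least-squares line with R and R² can be sketched as follows. The function name `simple_regression` and the numbers are illustrative assumptions, not values from the data set, and the sketch covers only one predictor at a time.

```python
from math import sqrt

def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, with R and R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r, r ** 2

# Hypothetical FS (x) and WEF (y) values lying exactly on y = 1 + 2x.
slope, intercept, r, r_squared = simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept, r, r_squared)
```

With two predictors, as in the HLE and WI question, a multiple regression is needed instead, which this sketch does not cover.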
Task 8 (10 marks)
New York, one of the largest cities in the USA, is also known as the food city. People in this city spend a great deal of money on food, and Bill de Blasio, Mayor of New York, believes that the average amount of weekly income (WI) spent by households there is not equal to that in your sample data. To prove this, he collects a random sample of data on 100 households in his city. (The data are in the New York tab of the Excel file.)
Based on Bill de Blasio’s statement, perform a hypothesis test at the 5% level of significance. Do you think Bill de Blasio’s statement is correct?
You may consider the following assumptions while performing this test:
- The populations underlying both your sample data and the New York sample are normally distributed, and the samples are independent.
- Population variances of Weekly Income (WI) are unknown and unequal.
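Under these assumptions (independent samples, unknown and unequal variances), one standard choice is the unequal-variance (Welch) two-sample t test. The sketch below computes only the test statistic and the Welch-Satterthwaite degrees of freedom, so the result can be compared with a critical value from a t table. The function name `welch_t` and the numbers are illustrative assumptions, not values from either data set.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return the Welch t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Estimated variance of each sample mean (sample variance / n).
    a = variance(sample_a) / n_a
    b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(a + b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (n_a - 1) + b ** 2 / (n_b - 1))
    return t_stat, df

# Hypothetical WI values, for illustration only.
t_stat, df = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
print(round(t_stat, 4), round(df, 3))
```

For a two-sided test at the 5% level, the absolute value of the statistic is compared with the t critical value for the computed degrees of freedom.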