# Hypothesis calculations

Milestone 3: Inference

Overview and Objectives

The primary objective of this project is to state null and alternative hypotheses, perform a few hypothesis tests, and draw inferences from the test statistics. Modern inferential research typically relies on statistical software beyond our current means (SPSS, R, SAS); nonetheless, we can work through a few examples that illustrate the research process.

You will work with populations, samples, and proportions in these exercises. You’ll decide whether a Z-statistic or a T-statistic is appropriate, use it to decide whether to reject the null hypothesis, and connect hypothesis testing to confidence intervals. You’ll also have the opportunity to explore real-world datasets in the fields of criminal justice, equity, and surveying.

With a firm grasp of hypothesis testing, you’ll have the conceptual foundation needed to pursue these techniques further if they interest you.

Part One: Crime Analysis from 2016-2017

National law enforcement noticed an overall decline in crime in the United States. This data set records the number of crimes in each state and the percent change from 2016 to 2017. Suppose that leadership would like to conduct a hypothesis test at the 1% significance level (99% confidence) that crime has decreased over time, using percent change. For this part, use the total percent change column to run your test.

Performing the Test

Step 1: Write the null hypothesis/alternative hypotheses below. Determine the tail type.

:

Step 2: Perform the Hypothesis Test in Excel. Fill out the cells indicated on the spreadsheet and determine whether to reject or fail to reject the null hypothesis.

Step 3: Find a 99% confidence interval for the percent change from 2016 to 2017 by filling out the cells in the Excel sheet.
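
If you would like to cross-check your spreadsheet, the steps above can be sketched in Python. This is only a minimal sketch under stated assumptions, not the required method: the percent-change values are made up, the function name `left_tailed_z_test` is hypothetical, the null value is assumed to be 0 (no change), and the sample is assumed large enough (n ≥ 30) for the Z-statistic.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Made-up percent changes for illustration only (NOT the real state data).
percent_change = [-1.2, -0.5, -2.3, 0.8, -1.9, -0.4, -3.1, 1.1, -0.9, -1.6] * 3


def left_tailed_z_test(sample, mu0=0.0, alpha=0.01):
    """Test H0: mu >= mu0 against H1: mu < mu0 and build a (1 - alpha) CI."""
    n = len(sample)
    x_bar = mean(sample)
    se = stdev(sample) / sqrt(n)                   # standard error of the mean
    z = (x_bar - mu0) / se                         # test statistic
    p_value = NormalDist().cdf(z)                  # area in the left tail
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    ci = (x_bar - z_crit * se, x_bar + z_crit * se)
    return z, p_value, ci, p_value < alpha


z, p_value, ci, reject_null = left_tailed_z_test(percent_change)
print(f"z = {z:.2f}, p-value = {p_value:.6f}, reject H0: {reject_null}")
print(f"99% CI for the mean percent change: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Note that the interval here is two-sided, so it corresponds exactly to a two-tailed test at the 1% level. A one-tailed test at the same level uses a less extreme cutoff, so in borderline cases the interval and the one-tailed test can disagree.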

Analysis of Results

What is the conclusion of the hypothesis test?

Was your null hypothesis mean contained within your confidence interval? Does this agree with the results of your hypothesis test?

Part Two: Gender Perception in the Workforce

Gender perception in the workplace can often be skewed. For example, men are often overrepresented in STEM jobs, which can have negative consequences for budding female scientists. But how can we quantify the extent of this gender misperception? This data set is taken from Pew Research Center, which used an algorithm to estimate the proportion of women shown in images that represent various occupations. They then compared this to the proportion of women who actually work in each field, taken from the Bureau of Labor Statistics (BLS).

Suppose researchers want to show, at a 95% level of confidence, that women are underrepresented by at least 5% in STEM-related jobs. Our data set then serves as a representative sample of all occupations and gives us a good look at whether women are accurately portrayed in images.

Performing the Test

Step 1: Write the null hypothesis/alternative hypotheses below. Determine the tail type.

:

Step 2: For each occupation, find the difference between the true proportion of women who work in the field and the image search proportion: (bls_proportion_women) – (image_search_proportion_women).

Step 3: Perform the Hypothesis Test in Excel. Remember that you’re working with a proportion, so you’ll need to use the corresponding formula. Fill out the cells indicated on the spreadsheet and determine whether to reject or fail to reject the null hypothesis.
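
As a rough cross-check on the spreadsheet, Steps 2 and 3 might look like the Python sketch below. Several things here are assumptions rather than part of the assignment: the rows are made up, the function name `right_tailed_proportion_z_test` is hypothetical, the test is taken to be right-tailed with a null value of 0.05, and the "corresponding formula" is assumed to be the standard one-sample proportion statistic z = (p̂ − p0) / sqrt(p0(1 − p0)/n), with the average difference standing in for p̂.

```python
from math import sqrt
from statistics import NormalDist, mean

# Made-up rows: (bls_proportion_women, image_search_proportion_women).
# Far too few rows for a real test; they only show the mechanics.
occupations = [
    (0.50, 0.40), (0.30, 0.28), (0.75, 0.60),
    (0.20, 0.22), (0.45, 0.35), (0.60, 0.50),
]

# Step 2: difference for each occupation (positive = underrepresented in images).
differences = [bls - image for bls, image in occupations]


def right_tailed_proportion_z_test(p_hat, p0, n, alpha=0.05):
    """Test H0: p <= p0 against H1: p > p0 with the proportion z statistic."""
    se = sqrt(p0 * (1 - p0) / n)          # standard error under H0
    z = (p_hat - p0) / se                 # test statistic
    p_value = 1 - NormalDist().cdf(z)     # area in the right tail
    return z, p_value, p_value < alpha


# Step 3: assumed null value of 0.05 (a 5% gap).
p_hat = mean(differences)
z, p_value, reject_null = right_tailed_proportion_z_test(p_hat, 0.05, len(differences))
print(f"mean difference = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.3f}")
print("reject H0" if reject_null else "fail to reject H0")
```

With these made-up rows the test fails to reject the null hypothesis, which is a reminder that "fail to reject" is not the same as showing the null hypothesis is true.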

Analysis of Results

What is the conclusion of the hypothesis test? Were you able to reject the null hypothesis?

In what kinds of fields were women most underrepresented? Most overrepresented? Would these misrepresentations significantly skew your results?

Part Three: Survey Data/Construct Your Own Hypothesis Test

In 2013, students in a statistics class at FSEV UK were asked to survey their friends aged 15-30 in order to explore the interests of young people.

The survey has a variety of questions, which are detailed in the Survey Key. For example, the column header “Music” is short for the question “I enjoy listening to music” and is rated 1-5, where 1 represents strongly disagree and 5 represents strongly agree.

There is also some categorical data. Survey participants recorded their gender, how often they smoke/drink, their internet usage, and education (just to name a few!). Please perform a hypothesis test based on one of the categorical variables and one survey question. For example, you could choose to test the following: Survey participants who identify as an “only child” enjoy meeting new people (Survey score ).

Here is another example: people who are right-handed (as identified in the data set) enjoy listening to music. Please be sure to use two columns for your test: one that identifies your participants and one that provides a rating.

Performing the Test

Step 1: Write the null hypothesis/alternative hypotheses below for the hypothesis test you would like to carry out based on the data. Determine the tail type. Also include your desired confidence level.

:

Step 2: Create a new tab and copy over only the two columns you’re interested in studying, so you have plenty of space to make calculations.

Step 3: Perform the hypothesis test. You’ll need to do something similar to what you did in the other parts, with one major difference: since the population standard deviation is unknown and must be estimated from your sample, you’ll need to use the T-statistic.
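
For those who want to check their work outside Excel, the test could be sketched in Python as follows. Everything specific here is an assumption for illustration: the rows are made up, the helper names `t_pdf` and `t_upper_tail` are hypothetical, and the null value of 3 (the neutral midpoint of the 1-5 scale) and the 5% significance level are simply example choices. The p-value is obtained by numerically integrating the t density, since the Python standard library has no built-in t distribution.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t), integrating the density from 0 to |t| with Simpson's rule."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3
    return upper if t >= 0 else 1 - upper


# Made-up rows: (handedness, Music rating on the 1-5 scale).
rows = [
    ("right", 5), ("left", 4), ("right", 4), ("right", 5), ("right", 3),
    ("left", 5), ("right", 5), ("right", 4), ("right", 5), ("right", 4),
    ("right", 5), ("right", 4),
]

# Keep only the group being tested (the "two columns" from Step 2).
ratings = [score for hand, score in rows if hand == "right"]

mu0 = 3        # assumed null value: the neutral midpoint of the scale
alpha = 0.05   # assumed significance level (95% confidence)

n = len(ratings)
se = stdev(ratings) / sqrt(n)                # estimated standard error
t_stat = (mean(ratings) - mu0) / se          # T-statistic
p_value = t_upper_tail(t_stat, df=n - 1)     # right-tailed p-value
reject_null = p_value < alpha

print(f"n = {n}, t = {t_stat:.2f}, p-value = {p_value:.5f}, reject H0: {reject_null}")
```

The degrees of freedom are n − 1 because the sample standard deviation is estimated from the same n ratings.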

Analysis of Results

Provide a paragraph that describes your test. What were your null/alternative hypotheses? Were you able to reject the null hypothesis? What was the p-value associated with your test?

Thinking about your sample, what population does it represent? All young people ages 15-30? Or some other population? Explain why.

