Wading Through the Data Swamp:
Program Evaluation 201
Marijuana Use
Chi-Square Calculation for Marijuana Use (steps 1-6)
Step 1. State your hypotheses.
Here are the hypotheses that relate to marijuana use:
H0 (null): There is no difference between the participants' and comparison group's use of marijuana at posttest.
H1 (research): There is a difference between the participants' and the comparison group's use of marijuana at posttest.
Step 2. Collapse your data.
In real life, when data are collected for the number of days someone used a drug in the past 30 days, many report no use and a few report many days. Because of the extreme scores reported, an evaluator collapses the data or recodes them.
What does that mean? Kids who report any drug use in the past 30 days are coded as "users." Kids who report no days are coded as "nonusers." Thus, if a participant has used a drug in the past 30 days, that participant will be coded as a user. It doesn't matter how many days the person used a drug. If the participant has not used any drugs within the past 30 days, he or she will be coded as a nonuser.
Chi-square is known as a "distribution free" statistic: it is a test that does not require normally distributed data, so it works with data that have been collapsed into nominal-level categories. We can test for a relationship between program participation and drug use without looking at number of days used. We just need to know whether or not kids used drugs.
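The recoding described in this step can be sketched in a few lines of Python. This is only an illustration: the function name collapse_days and the sample responses are hypothetical, and treating an unanswered question as missing (None) is an assumption rather than something the evaluator's procedure spells out.

```python
def collapse_days(days_used):
    """Recode days of use in the past 30 days into 'user' or 'nonuser'.

    An unanswered question (None) stays missing rather than being coded.
    """
    if days_used is None:
        return None
    return "user" if days_used > 0 else "nonuser"


# Hypothetical survey responses, for illustration only.
responses = [0, 0, 3, 0, 30, None, 1]
coded = [collapse_days(days) for days in responses]
print(coded)
```

Note that 3 days and 30 days both end up as "user": the number of days no longer matters once the data are collapsed.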
Step 3. Insert the collapsed data into your contingency table.
| | Participants | Comparisons |
| Number and proportion of kids who smoked marijuana in past month | Cell A | Cell B |
| Number and proportion of kids who did not smoke marijuana in past month | Cell C | Cell D |
Step 4. Add up totals for each row and column.
| | Participants | Comparisons | Total |
| Number and proportion of kids who smoked marijuana in past month | Cell A | Cell B | |
| Number and proportion of kids who did not smoke marijuana in past month | Cell C | Cell D | |
| Total | | | |
Step 5. Compare the frequencies.
As you can see from the table, 20 participants and 20 comparison group members smoked marijuana within the past month. Thirty comparison group members and 28 participants did NOT smoke marijuana. (Remember that two participants didn't answer the marijuana question at posttest.) The observed frequencies for the participant and comparison groups are almost identical.
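To make the totals and proportions concrete, here is a minimal Python sketch that arranges the counts quoted above into the contingency table and adds up each row and column. The variable names are hypothetical, chosen only for this illustration.

```python
# Observed counts from the text: who smoked and who did not, by group.
observed = {
    "Participants": {"smoked": 20, "did_not_smoke": 28},
    "Comparisons": {"smoked": 20, "did_not_smoke": 30},
}

# Column totals: everyone in each group who answered the question.
column_totals = {group: sum(cells.values()) for group, cells in observed.items()}

# Row totals: everyone who gave each answer, across both groups.
row_totals = {
    answer: sum(observed[group][answer] for group in observed)
    for answer in ("smoked", "did_not_smoke")
}

grand_total = sum(column_totals.values())

# Proportion of each group that smoked marijuana in the past month.
proportion_smoked = {
    group: observed[group]["smoked"] / column_totals[group] for group in observed
}

print(column_totals, row_totals, grand_total)
print({group: round(p, 3) for group, p in proportion_smoked.items()})
```

The participant column adds up to 48 rather than 50 because of the two participants who did not answer the marijuana question.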
However, we cannot just eyeball the numbers. We need to use statistics to make sure these differences are "real." For contingency tables, we always use a chi-square statistic to do this. Chi-square can help us determine if differences are statistically significant.
This is how we determine that the differences are not due to chance. Jack's evaluator used the chi-square test to determine statistical significance. He used this particular statistic because both the independent variable (program participation) and dependent variable (marijuana use) are nominal-level variables, in this case "yes" or "no."
To determine if differences are statistically significant, we need to go through several more steps. Hang in there. It's not like climbing the Washington Monument--that has way more steps than this.
Step 6. Select a probability level.
In the social sciences, findings with more than a 5 percent likelihood of happening by chance are generally considered to be "not significant." We express this cutoff as p = 0.05: a finding counts as statistically significant only when its likelihood of happening by chance is 5 percent or less. Put informally, this means that we are 95 percent sure that the differences are "real." (Think of it this way: 100 percent certainty - 5 percent chance = 95 percent certainty.)
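As a preview of how the probability level gets used, the sketch below runs the marijuana counts through the standard Pearson chi-square formula for a 2 x 2 table and compares the resulting p value with 0.05. It is an illustration under stated assumptions, not necessarily the evaluator's exact procedure: it applies no continuity correction, the names chi_square_2x2 and ALPHA are hypothetical, and the p value relies on the fact that a 2 x 2 table has 1 degree of freedom.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table.

    a, b are the first row (Cell A, Cell B); c, d are the second row.
    Returns the chi-square statistic and its p value.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if group and drug use were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


ALPHA = 0.05  # the probability level selected in Step 6

chi2, p_value = chi_square_2x2(20, 20, 28, 30)
significant = p_value <= ALPHA
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}, significant: {significant}")
```

For these counts the statistic is tiny (about 0.03) and the p value is far above 0.05, which matches the observation that the two groups look almost identical.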








