Wading Through the Data Swamp:
Program Evaluation 201
Mean Marijuana Use

Jack's evaluator told him that kids were smoking pot "on average, between 1 and 2 times in the last month (1.7, to be exact)." By "average," he meant the mean. But the kids' actual reports of the number of days they smoked marijuana during the past 30 days tell a different story.
Days of Marijuana Use, Past 30 Days
| Student # | Days Used (x) |
|---|---|
| 1 | 0 |
| 2 | 0 |
| 3 | 6 |
| 4 | 0 |
| 5 | 4 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 6 |
| 10 | 0 |
| 11 | 3 |
| 12 | 5 |
| 13 | 0 |
| 14 | 0 |
| 15 | 3 |
| 16 | 0 |
| 17 | 0 |
| 18 | 5 |
| 19 | 0 |
| 20 | 0 |
| 21 | 0 |
| 22 | 0 |
| 23 | 0 |
| 24 | 4 |
| 25 | 0 |
| 26 | 0 |
| 27 | 0 |
| 28 | 0 |
| 29 | 0 |
| 30 | 7 |
| 31 | 0 |
| 32 | 0 |
| 33 | 7 |
| 34 | 0 |
| 35 | 5 |
| 36 | 3 |
| 37 | 0 |
| 38 | 0 |
| 39 | 8 |
| 40 | 0 |
| 41 | 0 |
| 42 | 0 |
| 43 | 0 |
| 44 | 0 |
| 45 | 7 |
| 46 | 0 |
| 47 | 0 |
| 48 | 5 |
| 49 | 7 |
| 50 | 0 |
Here's what the numbers look like when plotted on a graph.

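Mean = 1.7: the sum of all scores (85) divided by the number of scores (50).
Median = 0: the middle score when all 50 are ranked from lowest to highest. Because 34 of the 50 scores are 0, the middle of the ranking falls at 0.
Mode = 0: the most frequently reported number.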
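If you want to check these three numbers yourself, here is a minimal sketch in Python. It assumes the 50 scores from the table are typed into a list in student order; the variable names are just for illustration.

```python
from statistics import mean, median, mode

# Days of marijuana use in the past 30 days, students 1 through 50,
# copied from the table (ten students per row).
days_used = [
    0, 0, 6, 0, 4, 0, 0, 0, 6, 0,
    3, 5, 0, 0, 3, 0, 0, 5, 0, 0,
    0, 0, 0, 4, 0, 0, 0, 0, 0, 7,
    0, 0, 7, 0, 5, 3, 0, 0, 8, 0,
    0, 0, 0, 0, 7, 0, 0, 5, 7, 0,
]

total = sum(days_used)          # sum of all scores
mean_days = mean(days_used)     # total divided by the number of scores
median_days = median(days_used) # middle of the ranked scores
mode_days = mode(days_used)     # most frequently reported score

print(total, len(days_used))    # 85 50
print(mean_days)                # 1.7
print(median_days)              # 0.0
print(mode_days)                # 0
```

The mean lands at 1.7 even though no student actually reported 1 or 2 days of use, which is a first hint that it may not describe the typical student here.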
What the evaluator's report didn't mention was that 68 percent of the kids (34 of 50) reported no marijuana use at all. In other words, the great majority didn't report smoking marijuana, although the few kids who did were smoking a lot! Scores for the kids who smoked a lot of pot are called extreme scores or outliers.
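One rough way to see the split is to separate the students who reported no use from those who reported some. The sketch below assumes the same 50 scores from the table; the "mean among users" figure is not from the evaluator's report, just an illustration computed from the same data.

```python
# Days of marijuana use in the past 30 days, students 1 through 50 (from the table).
days_used = [
    0, 0, 6, 0, 4, 0, 0, 0, 6, 0,
    3, 5, 0, 0, 3, 0, 0, 5, 0, 0,
    0, 0, 0, 4, 0, 0, 0, 0, 0, 7,
    0, 0, 7, 0, 5, 3, 0, 0, 8, 0,
    0, 0, 0, 0, 7, 0, 0, 5, 7, 0,
]

# Students who reported zero days of use.
non_users = sum(1 for d in days_used if d == 0)
share_non_users = non_users / len(days_used)

# Students who reported at least one day of use.
users = [d for d in days_used if d > 0]
mean_among_users = sum(users) / len(users)

print(non_users, f"{share_non_users:.0%}")  # 34 68%
print(len(users), mean_among_users)         # 16 5.3125
```

So the overall mean of 1.7 blends two very different groups: 34 students at zero and 16 students averaging a little over 5 days each.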
Example: The Trouble With Means
Let's say we look at the average income of a group consisting of you, a few of your friends, and Bill Gates. Mr. Gates' income is an outlier, because it's way off the curve. Your group's average income would be a few hundred million dollars, but sadly, this probably misrepresents you and your buddies. That's why you usually see reports about "median income" instead of the average or mean income of a group.
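To put numbers on it, here is a small sketch. Every income below is made up for illustration, including the figure for the one very high earner; nothing here is real data about anyone.

```python
from statistics import mean, median

# Hypothetical annual incomes in dollars: four friends plus one
# extremely high earner. All figures are invented for this example.
incomes = [50_000, 60_000, 55_000, 65_000, 1_000_000_000]

mean_income = mean(incomes)      # pulled far upward by the single outlier
median_income = median(incomes)  # middle value of the ranked incomes

print(mean_income)    # 200046000
print(median_income)  # 60000
```

With these invented figures, the mean says the group earns about $200 million apiece, while the median of $60,000 is much closer to what four of the five people actually make.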
Extreme scores can often disrupt that nice "bell curve" we like to see when a distribution is considered normal. Instead, the curve is replaced by something that looks much less symmetrical. We call this a skewed distribution. When you see one, you should watch out for statements about "averages."
For many questions, such as those about illegal drug use, the distribution of scores is "skewed." A few really large numbers are way out in the "tail," representing extreme values. In such cases, the median is generally a better indicator of where the "middle" is than the mean.
Need an example? I thought so. This can get tricky.








