Evaluation for the Unevaluated:
Module 2: What Does an Evaluation Project Look Like
Step 4. Choose and test instruments and procedures
This step is very important--and fun. This is where the rubber starts to hit the road and, with luck, doesn't come to a screeching halt! You need to decide how you'll collect information about your program.
You will probably be able to use existing instruments. Hundreds of instruments have been scientifically validated, and quite a lot of them are free. Evaluation Tools provides information on many instruments.

Many programs develop their own instruments. This can be problematic for several reasons:
- Validity and reliability have not been established.
- Much effort has gone into the construction, development, and validation of a wide range of standardized measures used to evaluate prevention programs. Your own efforts would likely duplicate that work, and you'd be reinventing the wheel.
- An "expert" has probably done a better job of developing a measure than you as a director or evaluator for a prevention project can do in the time available.
- If you decide to construct your own scale, you may find it difficult to write a set of questions that are clear to respondents, tap different aspects of important attitudes, beliefs, and behaviors, and "hang together" well enough to make a good, reliable scale. You will probably make some errors in your first attempts and will need to rewrite or drop some items.
- Even after you think you've developed a good measure, you should try to conduct studies of the measure's reliability and validity. You'll need to look at whether it seems to have enough room for change in the groups of participants you propose to use it with and whether it is culturally sensitive in the settings where you plan to use it.
- Developing and assessing your new scale may be very time-consuming, and you may not have the resources to support the effort.
- Results from a homegrown measure are hard to compare with other evaluations. If you use existing measures, it is easier to compare your results to other evaluations that have used the same measure.
- Critics will be able to point to the ad hoc nature of your measures and question their reliability and validity--unless you are able to publish studies of the measures' properties before you need to present the results of program evaluations that use the measures.
If you do decide to develop your own instruments, it is best to consult an expert.
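As a rough illustration of what checking whether items "hang together" and have "room for change" can look like, here is a minimal sketch in Python. It computes Cronbach's alpha, one common internal-consistency statistic, and the share of respondents already at the top score. The function names, the 1-5 scoring, and the pilot data are all hypothetical and chosen only for illustration; this is not the method of any particular instrument, and it covers only one narrow aspect of reliability.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def share_at_ceiling(total_scores, max_score):
    """Fraction of respondents already at the top score (no room to improve)."""
    at_top = sum(1 for score in total_scores if score == max_score)
    return at_top / len(total_scores)


# Hypothetical pilot data: 5 respondents, 4 items, each scored 1-5.
pilot = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]
totals = [sum(row) for row in pilot]
print(f"alpha = {cronbach_alpha(pilot):.2f}")
print(f"share at ceiling = {share_at_ceiling(totals, max_score=20):.2f}")
```

A commonly cited rule of thumb treats an alpha of about 0.7 or higher as acceptable, but a value from a small pilot sample is only a rough signal. A large share of respondents at the ceiling suggests the scale may not be able to show improvement in that group. Neither check replaces a proper validity study or an expert's review.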
To decide the best sources for information (for example, attendance records, grades, interviews, focus groups, surveys), ask yourself three questions:
- What sources are likely to provide the most accurate information?
- What sources are the least costly or time-consuming? Cost and time depend on various factors. For example, interviews may be more costly because of the time spent arranging them and traveling. Record reviews may be cheaper because you can have copies sent to you.
- Does the information collection pose an undue burden on the sources? You don't want to alienate your sources. For example, trying to interview a single parent with two jobs may be difficult. It's important to respect participants.
Having accurate data sources for the evaluation is the most important factor. For example, it may be cheaper and quicker to interview program staff, but staff may not provide information about services that is as accurate as case records. When you interview staff, you are relying on their memories. When you review case records, you should be able to obtain information about what actually happened.
Once you select or develop instruments, you need to road test them to see if they work as intended. You will also have an opportunity to test your logistical procedures related to data collection, handling, and storage. For example:
- Who will administer the instruments?
- Where will they be administered?
- How long does it take to administer the instrument?
- How will the instruments be collected (e.g., tear-off sheet, mail in, hand delivered in a sealed envelope)?
- How many copies do you need of completed instruments?
- Where will original completed instruments be kept?
Remember that a pilot test is the stage at which you use your instruments (surveys, record review forms, personal interview forms, etc.) in as "real" a setting as possible.
The pilot test can tell you:
- If consent forms or letters about the evaluation can be easily delivered
- How long the instruments take to administer
- If the subjects understand the questions
- If any records you need are readily available
- If you can collect needed information in the established timeframe
- If the instruments give you the information you want
Tips for Conducting a Pilot Test
- Always ask for feedback from the subjects in your pilot test. They will tell you what worked for them and what didn't. You should instruct them to take notes and make comments on the process of using the instruments. These notes and comments can be used to determine any needed changes. Jack's lucky. Middle school kids are especially willing to inform the adults where they "messed up"!
- Record the time it takes to complete each item. It may take only 10 minutes to complete a survey, but if half of that time is spent on one question, that question needs to be rewritten.
- Talk through the instrument. Some respondents feel comfortable talking out loud while answering. When they do, note their comments.
- Record questions asked and clarifications made. When the interviewee asks a question, record key words or verbatim text as well as your own response next to the relevant item. These comments will help in rewriting items.
- Note nonverbal behavior. Record any nonverbal behavior and body language that coincide with particular questions. Hesitance in responding, foot tapping, fidgeting, and similar behaviors may indicate item-design faults, question difficulty, or lack of relevance.
- Note whether instructions and format were easy to follow. Question instructions and format vary from item to item. Notice how smoothly and quickly the interviewee reads directions and moves from one item to another.
- Note erasures, incomplete items, errors, and inconsistencies. These types of responses may indicate questionnaire design flaws.
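If you record item timings and blank responses during the pilot, a few lines of Python can summarize them. The sketch below flags any item that takes up a large share of the total completion time and reports how often each item was left incomplete. The function names, the item labels, and the convention of recording a skipped item as None are assumptions made for illustration. The default threshold of one half mirrors the example in the tips above, and you may want a lower one for longer instruments.

```python
def flag_slow_items(seconds_by_item, share_threshold=0.5):
    """Return items that use at least share_threshold of the total time."""
    total = sum(seconds_by_item.values())
    if total <= 0:
        return []
    return [
        item
        for item, seconds in seconds_by_item.items()
        if seconds / total >= share_threshold
    ]


def skip_rates(responses):
    """Fraction of respondents who left each item blank (recorded as None)."""
    items = responses[0].keys()
    return {
        item: sum(1 for r in responses if r.get(item) is None) / len(responses)
        for item in items
    }


# Hypothetical timings (in seconds) for one pilot respondent.
timings = {"q1": 60, "q2": 300, "q3": 240}
print("Items to review for timing:", flag_slow_items(timings))

# Hypothetical answers from three pilot respondents; None means skipped.
answers = [
    {"q1": 3, "q2": None, "q3": 4},
    {"q1": 4, "q2": None, "q3": 2},
    {"q1": None, "q2": 2, "q3": 5},
]
for item, rate in skip_rates(answers).items():
    print(f"{item}: skipped by {rate:.0%} of respondents")
```

Numbers like these only point to items worth a second look. The notes, questions, and nonverbal cues you recorded during the pilot are what explain why an item was slow or skipped.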
Jack's team selected an instrument recommended by the Center for Substance Abuse Prevention: The Government Performance and Results Act (GPRA) Client Outcome Measures for Discretionary Programs Youth Survey.
Of course, he pilot tested it.