How to Design Program Evaluation Surveys That Prove Impact and Secure Funding
A comprehensive guide to building program evaluation surveys using logic models, Theory of Change, and mixed-methods approaches to measure outcomes and demonstrate impact to funders.
Every program exists to create change. But whether you're running an after-school literacy initiative, a workforce development program, or a community health intervention, the question funders, boards, and stakeholders inevitably ask is: How do you know it's working?
Program evaluation surveys are the bridge between good intentions and demonstrable impact. When designed well, they don't just satisfy reporting requirements — they generate actionable insights that help you improve your program in real time, build a compelling case for continued funding, and scale what works.
This guide walks you through everything you need to design rigorous, practical program evaluation surveys — from grounding your evaluation in a logic model to choosing the right question types for measuring outcomes at every level.
Why Program Evaluation Surveys Matter More Than Ever
The landscape for nonprofits, educational institutions, and social programs has shifted dramatically. Funders no longer accept anecdotal evidence of impact. The W.K. Kellogg Foundation's Evaluation Handbook emphasizes that systematic evaluation is essential for organizational learning and accountability.
Here's what's at stake:
- Funding retention: 78% of institutional funders now require quantitative outcome data in grant reports
- Program improvement: Without measurement, you can't identify which components drive results
- Scaling decisions: Expansion requires evidence that your model works across contexts
- Stakeholder trust: Participants, communities, and partners deserve transparency about what's working
The CDC Framework for Program Evaluation identifies six essential steps: engage stakeholders, describe the program, focus the evaluation design, gather credible evidence, justify conclusions, and ensure use of findings. Your survey design touches every one of these steps.
Start with Your Logic Model
Before writing a single survey question, you need a clear logic model. A logic model is a visual representation of how your program's resources lead to activities, which produce outputs, which generate outcomes, and ultimately create impact.
The W.K. Kellogg Foundation Logic Model Development Guide remains the gold standard framework. Here's how each component maps to survey design:
Inputs → Process Evaluation Questions
Inputs are your resources: staff, funding, materials, partnerships. Survey questions here assess whether resources were adequate and well-deployed.
Example questions:
- Scale (1-5): "How adequate were the resources and materials provided during the program?" (Strongly Inadequate to Strongly Adequate)
- Single choice: "Which program resource was most valuable to your experience?" (Mentorship sessions / Workshop materials / Online resources / Peer networking / Guest speakers)
Activities → Implementation Fidelity Questions
Activities are what your program actually does. Survey questions here verify that the program was delivered as designed.
Example questions:
- Scale (1-7): "How closely did the program sessions follow the described curriculum?" (Not at all closely to Extremely closely)
- Yes/No: "Did you receive all scheduled coaching sessions?"
- Open-ended: "Describe any sessions or activities that felt significantly different from what was promised."
Outputs → Participation and Dosage Questions
Outputs measure the direct products of your activities: number of sessions attended, hours of service, materials distributed.
Example questions:
- Single choice: "How many program sessions did you attend?" (All sessions / Most sessions / About half / Fewer than half / Only 1-2 sessions)
- Multiple choice: "Which program components did you participate in?" (Select all that apply: Workshops / Individual coaching / Group mentoring / Online modules / Community events)
Short-Term Outcomes → Knowledge and Awareness Questions
These measure immediate changes in knowledge, attitudes, skills, or awareness.
Example questions:
- Scale (1-5): "After completing the program, how confident are you in your ability to [specific skill]?" (Not at all confident to Extremely confident)
- Ranking: "Rank the following skills from most improved to least improved as a result of this program" (Financial literacy / Resume writing / Interview skills / Networking / Time management)
Medium-Term Outcomes → Behavior Change Questions
These assess whether participants are actually doing things differently.
Example questions:
- Scale (1-7): "To what extent have you applied the skills learned in this program to your daily work?" (Not at all to A great extent)
- Single choice: "In the 3 months since completing the program, how frequently have you used the budgeting techniques taught?" (Daily / Several times a week / Weekly / A few times a month / Rarely / Never)
- Open-ended: "Describe a specific situation where you applied something you learned in the program."
Long-Term Impact → Systemic Change Questions
These measure the ultimate goals: improved employment, better health outcomes, stronger communities.
Example questions:
- Yes/No: "Since completing the program, have you obtained employment in your field of training?"
- Scale (1-10): "How would you rate your overall quality of life compared to before the program?" (1 = Much worse, 10 = Much better)
- Open-ended: "How has this program changed your life trajectory? Please be as specific as possible."
Theory of Change: Going Deeper Than Logic Models
While a logic model shows what happens, a Theory of Change explains why it happens. Developed by the Aspen Institute Roundtable on Community Change, Theory of Change identifies the causal mechanisms and assumptions underlying your program.
For survey design, this means testing your assumptions explicitly:
If your theory assumes that mentorship builds self-efficacy, which leads to job-seeking behavior, which leads to employment:
- Test the mechanism: Scale question measuring self-efficacy before and after mentorship
- Test the link: Questions about whether increased confidence actually led to more job applications
- Test the outcome: Employment status at follow-up intervals
This layered approach gives you diagnostic power. If outcomes aren't materializing, you can identify exactly where in the causal chain the breakdown occurs.
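If you collect these measures in one dataset, checking each link becomes a simple analysis step. Here's a rough Python sketch, with hypothetical column names and illustrative values, that correlates adjacent links in the example chain above (correlation is a diagnostic signal here, not proof of causation):

```python
# A sketch of checking each link in the assumed causal chain
# (mentorship -> self-efficacy -> job applications -> employment).
# Column names are hypothetical and values illustrative.

import pandas as pd

df = pd.DataFrame({
    "self_efficacy_gain": [1, 2, 0, 3, 1, 2, 0, 2],  # post minus pre, 1-5 scale
    "job_applications":   [2, 5, 1, 8, 3, 6, 0, 4],  # count at follow-up
    "employed":           [0, 1, 0, 1, 0, 1, 0, 1],  # 0/1 at follow-up
})

# Link 1: does a bigger self-efficacy gain go with more job-seeking behavior?
print(df["self_efficacy_gain"].corr(df["job_applications"]))

# Link 2: does more job-seeking go with the employment outcome?
print(df["job_applications"].corr(df["employed"]))
```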
Formative vs. Summative Evaluation Surveys
Your evaluation timing dramatically affects survey design. The distinction between formative and summative evaluation, first articulated by Michael Scriven, should guide your approach.
Formative Evaluation Surveys (During the Program)
Purpose: Improve the program while it's running.
Design principles:
- Keep surveys short (5-8 questions) to minimize participant burden
- Focus on process and experience rather than outcomes
- Include open-ended questions that surface unexpected issues
- Administer at regular intervals (after each session or module)
Sample formative survey structure:
- Scale (1-5): "How useful was today's session for your learning goals?"
- Scale (1-5): "How would you rate the pace of today's session?"
- Single choice: "What was the most valuable part of today's session?" (Content / Discussion / Activities / Guest speaker / Materials)
- Yes/No: "Do you feel you can apply what you learned today?"
- Open-ended: "What would have made this session more valuable for you?"
Summative Evaluation Surveys (After the Program)
Purpose: Judge overall program effectiveness and impact.
Design principles:
- More comprehensive (15-25 questions across all logic model levels)
- Include retrospective pre-post questions to measure perceived change
- Balance quantitative scales with qualitative depth
- Collect demographic data for subgroup analysis
Retrospective pre-post technique:
Rather than administering a pre-test and post-test separately, ask participants to rate their current level and recall their pre-program level in the same survey. Research published in the American Journal of Evaluation shows this reduces response-shift bias — the tendency for participants' frame of reference to change during a program, making traditional pre-post comparisons unreliable.
Example:
- "Rate your confidence in managing personal finances BEFORE the program" (Scale 1-5)
- "Rate your confidence in managing personal finances NOW" (Scale 1-5)
Choosing the Right Outcome Indicators
Strong evaluation surveys measure indicators that are:
- Valid: They actually measure what you claim to measure
- Reliable: They produce consistent results across time and contexts
- Sensitive: They can detect meaningful change
- Practical: They can be collected within your resource constraints
- Meaningful: Stakeholders understand and value them
The Urban Institute's Outcome Indicators Project provides validated indicator sets for common program areas including education, workforce development, health, and housing.
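Of these criteria, reliability is the easiest to check quantitatively once you have pilot data. For a multi-item scale, Cronbach's alpha is the standard internal-consistency statistic; here's a short sketch with illustrative responses:

```python
# A sketch of checking the "reliable" criterion with Cronbach's alpha.
# Rows are respondents, columns are items on the same scale; values
# are illustrative.

import numpy as np

items = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
])

k = items.shape[1]                          # number of items
item_vars = items.var(axis=0, ddof=1)       # variance of each item
total_var = items.sum(axis=1).var(ddof=1)   # variance of summed scores

alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")  # >= 0.70 is a common rule of thumb
```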
Indicator Categories for Survey Design
Knowledge indicators: Test factual understanding or awareness
- "Which of the following is the recommended daily sugar intake?" (Single choice — correct answer measurable)
- Scale: "How well do you understand the process for filing a tax return?"
Attitude indicators: Measure beliefs, values, and dispositions
- Scale (1-7): "How important is regular physical activity to your overall health?"
- Ranking: "Rank these factors by how much they influence your career decisions"
Skill indicators: Assess capability and self-efficacy
- Scale (1-5): "How confident are you in your ability to create a monthly budget?"
- Self-reported behavioral frequency: "How often do you use conflict resolution techniques in your workplace?"
Behavior indicators: Track actions and practices
- Single choice frequency: "In the past month, how many times did you exercise for 30+ minutes?"
- Yes/No: "Have you scheduled a preventive health screening in the past 6 months?"
Condition indicators: Measure status changes
- Single choice: "What is your current employment status?"
- Scale (1-10): "How would you rate your current financial stability?"
Designing for Equity in Program Evaluation
The American Evaluation Association's Statement on Cultural Competence and the growing field of equitable evaluation emphasize that evaluation design itself can either reinforce or challenge power dynamics.
Practical survey design considerations:
- Language accessibility: Offer surveys in participants' primary languages
- Cultural relevance: Ensure response options reflect diverse experiences
- Inclusive demographics: Use inclusive categories for gender, race/ethnicity, and disability status
- Trauma-informed design: Avoid re-traumatizing questions; provide opt-out options for sensitive items
- Participatory design: Involve program participants in survey development
Example of a culturally responsive demographic question:
- "How do you describe your racial or ethnic identity?" (Open-ended, rather than forcing predefined categories)
- Follow-up single choice for reporting: "For reporting purposes, which category best represents your racial/ethnic identity?" (With comprehensive, inclusive options including "Prefer to self-describe")
Survey Administration Best Practices
Timing and Frequency
| Evaluation Type | When to Survey | Recommended Length |
|---|---|---|
| Baseline/Intake | Program start | 10-15 minutes |
| Formative check-in | After each module/phase | 3-5 minutes |
| Mid-point assessment | Program midpoint | 8-10 minutes |
| Summative/Exit | Program completion | 15-20 minutes |
| Follow-up | 3, 6, or 12 months post | 10-15 minutes |
Response Rate Strategies
The American Association for Public Opinion Research notes that response rates below 40% raise serious concerns about representativeness. To maximize participation:
- Embed surveys in program activities rather than sending separate links
- Explain the purpose — participants respond more when they understand how data improves the program
- Keep it proportional — survey length should match program intensity
- Follow up personally with non-respondents
- Offer multiple modalities — online, phone, in-person, paper
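Monitoring response rates by subgroup, not just overall, tells you whether low participation is concentrated in particular cohorts. Here's a quick sketch, with illustrative counts, that flags any group below the 40% threshold mentioned above:

```python
# A quick sketch: response rate overall and by subgroup, flagging
# anything under the 40% threshold. Counts are illustrative.

invited   = {"cohort_a": 60, "cohort_b": 45, "cohort_c": 80}
responded = {"cohort_a": 41, "cohort_b": 15, "cohort_c": 52}

THRESHOLD = 0.40

for group in invited:
    rate = responded[group] / invited[group]
    flag = "" if rate >= THRESHOLD else "  <-- below 40%, check representativeness"
    print(f"{group}: {rate:.0%}{flag}")

overall = sum(responded.values()) / sum(invited.values())
print(f"overall: {overall:.0%}")
```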
How Koji Transforms Program Evaluation
Traditional program evaluation surveys face a fundamental tradeoff: quantitative scales provide measurable data but miss nuance, while qualitative interviews capture depth but are expensive and time-consuming to analyze.
Koji eliminates this tradeoff entirely.
With Koji's AI-powered conversational interviews, you can:
- Embed structured questions naturally: Koji's AI interviewer weaves scale, single-choice, multiple-choice, ranking, and yes/no questions into a flowing conversation. Participants don't feel like they're filling out a form — they feel heard.
- Automatically probe for depth: When a participant rates a program component as 2 out of 5, Koji's AI naturally follows up: "Tell me more about what made that experience fall short." You get the quantitative data point AND the qualitative explanation.
- Conduct evaluations at scale: Whether you have 20 participants or 2,000, every single one gets a personalized, in-depth evaluation interview. No more choosing between depth and breadth.
- Reduce evaluator bias: The AI interviewer asks consistent questions while adapting follow-ups, eliminating the variability that comes with multiple human interviewers across program sites.
- Generate analysis automatically: Koji aggregates quantitative responses, identifies themes in qualitative data, and produces evaluation reports that are ready for funders.
- Support multilingual evaluation: Conduct evaluation interviews in participants' preferred languages without hiring multilingual evaluation staff.
For programs serving vulnerable populations, Koji's conversational approach is especially powerful. Participants who struggle with written surveys or feel intimidated by formal evaluation processes often open up in a natural conversation — leading to richer, more honest data.
Building Your Evaluation Survey: A Step-by-Step Checklist
1. Map your logic model — Identify inputs, activities, outputs, short-term outcomes, medium-term outcomes, and long-term impact
2. Articulate your Theory of Change — Document the causal assumptions connecting each level
3. Select indicators for each outcome level — Choose valid, reliable, sensitive, practical, and meaningful indicators
4. Write survey questions for each indicator — Use appropriate question types (scales for attitudes, single choice for behaviors, open-ended for context)
5. Design your evaluation timeline — Plan formative, summative, and follow-up data collection points
6. Pilot test with participants — Run through the survey with 3-5 representative participants and refine based on feedback
7. Set up in Koji — Build your evaluation interview with structured questions embedded in a conversational flow
8. Establish baseline data — Collect pre-program measures before intervention begins
9. Analyze and report — Use Koji's built-in analysis to generate funder-ready reports
10. Close the loop — Share findings with participants, staff, and stakeholders to inform program improvement
Common Mistakes to Avoid
- Measuring only satisfaction: Participants can love a program that doesn't produce outcomes. Satisfaction is necessary but not sufficient.
- Asking leading questions: "How much did this excellent program improve your life?" biases responses. Keep questions neutral.
- Ignoring attribution: Correlation isn't causation. Include questions about other factors that may have contributed to observed changes.
- Survey fatigue: Don't ask 50 questions when 20 will do. Respect participants' time.
- Collecting data you won't use: Every question should map to a specific evaluation question and reporting need.
- Waiting until the end: Build evaluation into program design from day one, not as an afterthought.
Program evaluation surveys are your most powerful tool for demonstrating impact, improving programs, and securing the resources to serve more people. With a clear logic model, the right indicators, and a platform like Koji that combines quantitative rigor with qualitative depth, you can build an evaluation system that truly serves your mission.