Part 3: What 17 Experts Taught Us About AI Feedback

May 28

The final part of a three-part series on integrating generative AI into an eLearning module, a project with Jennifer Chien published as a case study in Training Industry. [Read Part 1] · [Read Part 2]

In Part 1, I shared how this project began. In Part 2, I covered how we designed the study and built the modules, and how the prompt turned out to be the hardest part. This post is about what happened when we finally put the module in front of people who would know whether it worked.

Before any of that, the project went through our organization's AI Governance Council for review and approval. That step mattered. Doing innovative AI work inside institutional guardrails, rather than around them, is part of what made this credible. Only after approval did we recruit reviewers.

Who reviewed it

Seventeen people completed the evaluation: a mix of instructional designers and training professionals, plus subject matter experts in suicide prevention. They worked through the AI-integrated module, experimented with different responses, and answered a survey of Likert-scale and open-ended questions about their experience.

What the data showed

The reception was strong.

Across all the rated questions, 87.5% of responses were favorable (the top two points on a five-point scale), and roughly 45% were the highest possible rating. Every reviewer gave the feedback one of the top two ratings for clarity and consistency. Ninety-four percent rated it appropriately sensitive to the subject matter. The highest-scoring dimensions were practical, actionable guidance; learner engagement; and the feedback's contribution to the module's overall effectiveness.
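If you want to run the same kind of tally on your own survey data, here is a minimal sketch in Python. It counts the share of ratings in the top two points of the scale (the "favorable" figure) and the share at the very top. The function name and the sample ratings are made up for illustration; they are not the study's data, and this is not necessarily how the published analysis was done.

```python
from collections import Counter


def summarize_likert(ratings, scale_max=5):
    """Return (favorable_pct, top_pct) for a list of ratings from 1 to scale_max.

    "Favorable" means the top two points on the scale;
    "top" means the highest point only.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    total = len(ratings)
    favorable = counts[scale_max] + counts[scale_max - 1]
    top = counts[scale_max]
    return 100 * favorable / total, 100 * top / total


# Hypothetical ratings for illustration only (not the study data).
sample = [5, 5, 4, 4, 3, 5, 4, 2, 4, 5]
favorable_pct, top_pct = summarize_likert(sample)
print(f"favorable: {favorable_pct:.1f}%, top rating: {top_pct:.1f}%")
```

Reporting both numbers, as the case study does, is useful because a high favorable share can hide a low share of top ratings.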

For a generative AI tool giving feedback on suicide prevention, evaluated by people who design learning and people who know the subject deeply, that is a meaningful result. The open-ended responses reinforced it. Reviewers described the feedback as gently guiding, helpful, and effective, and several said the ability to experiment in a low-stakes space changed how they engaged with the content.

The most useful finding was a limitation

Here is the part I think matters most.

The two lowest-scoring items were about personalization: whether the feedback felt specific to the learner's individual response, and whether it was tailored enough to unique inputs. Several reviewers noticed that the AI sometimes returned similar feedback regardless of what they wrote, or offered a suggestion that repeated language the learner had already used.

We did not bury that. We reported it plainly in the published case study, and it became the explicit basis for what a future iteration should improve. A few reviewers also flagged practical issues, like the "try again" button not always resetting the text field. Those traced back to the JavaScript and the hosting service and were fixable once understood. One reviewer made a sharp point worth keeping: a closed-source AI tool, one a designer could constrain to a specific curriculum, might feel safer than an open-source tool for high-stakes content. That's a genuine design consideration, not a footnote.

Reporting weaknesses as openly as strengths isn't a confession. It's the part that makes the rest trustworthy.

What I actually believe after doing this

The lesson I carry out of this project is not "AI can do feedback now." It's more specific than that.

The value was never AI replacing the instructional designer or the subject matter expert. It was AI working as a thought partner inside a structure that humans designed and reviewed at every layer. We wrote the scenario. We engineered and tested the prompt. A SME checked the guidance. The governance council approved the approach. The learner experiments in a safe space, the AI responds in the moment, and humans make sure it's right.

That's the model I think works for AI in sensitive, high-stakes learning. Not automation. Collaboration, with a human firmly in the loop.

The full case study, with the complete methodology and results, was published in Training Industry. If you want the detailed version, it's there. If you've read this far, thank you for following the whole arc. The next project is already taking shape, and I plan to document that one too.

Read the published case study in Training Industry.

Carly Becker

Hi! I’m Carly. I’m an instructional designer by trade, but a tarot reader by heart. What is instructional design? It means that I have an advanced degree in designing courses and in understanding how people learn. I’m also an avid tarot reader (mostly for myself, but for others, too). I’m here to combine my skills with course design and my passion for teaching people tarot! Let me know how I can help you by emailing me at carly@inkspellshop.com, or DMing me on Instagram @inkspellshop.

https://inkspellshop.com
