
AI Feedback as a Complement to Professional Learning

Abstract

AI-powered professional learning tools that provide teachers with individualized feedback on their instruction have proven effective at improving instruction and student engagement in virtual learning contexts. Despite the need for consistent, personalized professional learning in K-12 settings, the effectiveness of automated feedback tools in traditional classrooms remains unexplored. We present results from 224 Utah mathematics and science teachers who engaged in a pre-registered randomized controlled trial, conducted in partnership with TeachFX, to assess the impact of automated feedback in K-12 classrooms.
 
This feedback targeted “focusing questions” — questions that probe students’ thinking by pressing for explanations and reflection. We find that teachers opened emails containing the automated feedback about 53–65% of the time, and the feedback increased their use of focusing questions by 20% (p < 0.01) compared to the control group. The feedback did not impact other teaching practices. Qualitative interviews with 13 teachers revealed mixed perceptions of the automated feedback: some teachers appreciated the reflective insights, while others faced barriers such as skepticism about accuracy, data privacy concerns, and time constraints. Our findings highlight the promise of, and areas for improvement in, implementing effective and teacher-friendly automated professional learning tools in brick-and-mortar classrooms.

Introduction

Formative feedback grounded in teachers’ practices can enhance instruction and improve student outcomes (Kraft, Blazar, & Hogan, 2018; Shute, 2008; Steinberg & Sartain, 2015; Taylor & Tyler, 2012). Instructional coaches or mentor teachers often provide such feedback by observing classrooms, guiding teacher reflections and offering improvement suggestions (Kraft et al., 2018).
 
However, expert coaching is expensive and time-consuming, limiting most teachers’ access to consistent, high-quality feedback. In the United States, only ∼40% of schools provide teachers access to an instructional coach (Taie & Goldring, 2017) and in many schools, teachers primarily receive feedback from their principals, who often lack the time and knowledge to support teachers’ thorough analysis and synthesis of evaluation data (Firestone & Donaldson, 2019; Rigby et al., 2017).
 
Technology has emerged as a promising way to fill the gaps in teacher professional learning by providing teachers with data-driven opportunities to facilitate instructional improvement. Computerized tools can help teachers refresh their pedagogical content knowledge and rehearse in simulated environments (Copur-Gencturk, Li, Cohen, & Orrill, 2024; Markel, Opferman, Landay, & Piech, 2023), better respond to students’ written explanations in between class sessions (Bywater et al., 2019, 2023), and reflect on recordings of their instruction through video (Chen, 2020) or text analytics (Demszky, Liu, Hill, Jurafsky, & Piech, 2023; Jacobs et al., 2022). Different from tools that provide computer-scaffolded instruction directly to students (e.g., Cognitive Tutor; Anderson, Corbett, Koedinger, & Pelletier, 1995), these teacher-facing tools represent the potential of computer technology to influence human-to-human instructional quality broadly through numerous mechanisms.
 
Using natural language processing (NLP), tools that provide formative feedback to teachers based on their instruction may complement human observation and coaching, improving instructional practice (Jacobs, Scornavacco, Clevenger, Suresh, & Sumner, 2024) and even student outcomes in online settings (Demszky & Liu, 2023; Demszky et al., 2023).
 
Such automated feedback tools take a recording of a teacher’s lesson as input, transcribe and analyze the recording to identify high-leverage instructional practices, and deliver insights to the teacher to facilitate reflection and instructional improvement. Since such feedback is cost-efficient, scalable, and can be delivered privately, quickly, and frequently, researchers and technology providers (e.g., TeachFX, EdThena) are seeking to understand how such tools can best be put to teachers’ service.
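The record-transcribe-analyze-deliver pipeline described above can be sketched schematically. Everything below is a hypothetical illustration: the function names, the keyword-based classifier, and the message format are our own stand-ins, not TeachFX’s actual implementation (which uses NLP models rather than keyword matching).

```python
# Hypothetical sketch of an automated-feedback pipeline; all names and the
# toy classifier are illustrative, not the TeachFX implementation.
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # "teacher" or "student"
    text: str


def is_focusing_question(utterance: Utterance) -> bool:
    """Toy stand-in for an NLP classifier that flags teacher questions
    pressing students to explain or reflect on their thinking."""
    probes = ("why", "how do you know", "what do you notice", "explain")
    text = utterance.text.lower()
    return (
        utterance.speaker == "teacher"
        and text.endswith("?")
        and any(p in text for p in probes)
    )


def generate_feedback(transcript: list[Utterance]) -> str:
    """Turn a transcribed lesson into a short reflective insight."""
    focusing = [u.text for u in transcript if is_focusing_question(u)]
    if focusing:
        return (
            f"You asked {len(focusing)} focusing question(s) this lesson. "
            f'For example: "{focusing[0]}"'
        )
    return "Try asking students to explain their reasoning."


# A tiny transcript, as if produced by the transcription step.
transcript = [
    Utterance("teacher", "What is 3 times 4?"),
    Utterance("student", "Twelve."),
    Utterance("teacher", "How do you know that's right?"),
]
print(generate_feedback(transcript))
```

In a real system, the insight string would be rendered into the weekly email and the platform’s insight page that the study tracks.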
 
Despite the need for scalable K-12 classroom observation and feedback tools as well as encouraging studies with such tools in online teaching settings, to our knowledge, there exists no rigorous experimental evaluation of whether automated feedback might work in K-12 in-person learning contexts. To this end, we present results from an experiment in which we provided teachers with feedback related to focusing questions — a high-leverage teaching practice involving asking questions that probe student thinking and encourage students to reflect on their thoughts and those of their classmates (Alic et al., 2022; Herbel-Eisenmann & Breyfogle, 2005; National Council of Teachers of Mathematics, 2014; Wood, 1998). Our experimental design allows us to causally estimate whether providing teachers feedback about focusing questions increases the number of such questions and whether it yields related improvements in instruction, such as increasing the amount of student talk and student reasoning.
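The causal comparison the randomized design licenses can be illustrated with a toy difference-in-means on simulated counts. This is purely illustrative: the data are simulated, the effect size is made up, and this is not the paper’s actual data or statistical model.

```python
# Purely illustrative: a difference-in-means estimate of the kind of
# treatment effect the experiment is designed to identify. Data are
# simulated; counts represent focusing questions per lesson.
import random

random.seed(0)
control = [random.gauss(10, 3) for _ in range(200)]  # no feedback
treated = [random.gauss(12, 3) for _ in range(200)]  # received feedback


def mean(xs):
    return sum(xs) / len(xs)


effect = mean(treated) - mean(control)
pct_change = 100 * effect / mean(control)
print(f"Estimated effect: {effect:+.1f} questions per lesson ({pct_change:+.0f}%)")
```

Because assignment is random, the difference in group means is an unbiased estimate of the average treatment effect; the paper’s analysis would additionally account for covariates and clustering.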
 
We also add to the literature on human-computer interaction by seeking to understand how teachers engage with and perceive the utility of our automated feedback. Prior research on technology integration in classrooms indicates that teachers’ perceived utility of the technology has a strong influence on their technology adoption (Backfisch et al., 2021; Ertmer et al., 2012; Fütterer et al., 2023; Kale, 2018; Lachner et al., 2021; Scherer et al., 2019; Wang & Zhao, 2023). This suggests that teachers need to see the value of receiving automated feedback on their instruction in order to effectively use the tool. Jacobs et al. (2022, 2024) have found that many teachers see automated feedback as a valuable vehicle for self-reflection, but that perceptions of accuracy can impact their engagement with such feedback. We seek to deepen this knowledge about factors that may impact teachers’ perception of and engagement with automated feedback by conducting qualitative interviews with a subset of teachers in the experimental study.
 
Thus, this mixed-methods study is the first to combine a pre-registered randomized controlled trial and qualitative interviews to experimentally test the impact of, and describe K-12 teacher engagement with, automated feedback on instruction. In doing so, we address the following research questions:
 
  1. To what extent do K-12 teachers engage with the automated feedback on focusing questions?
  2. Does the automated feedback on focusing questions impact instruction, including teachers’ use of focusing questions, student talk time, and student reasoning?

    We augment these questions with a third question, which we answer with our qualitative interviews in this mixed-methods study:

  3. How do teachers perceive the automated feedback on both focusing questions and other teaching practices? What are the barriers for them to engage with the feedback?
 
To answer these questions, we partnered with TeachFX, a company that delivers feedback to teachers based on classroom recordings via a phone application. We leveraged TeachFX’s newly established partnership with the state of Utah to facilitate professional learning for mathematics and science teachers. We randomly assigned teachers to a treatment or control condition based on whether they received automated feedback on focusing questions. We collected recordings of their instruction, post-treatment surveys, and interview data to understand the impact of the treatment as well as teachers’ engagement with and perceptions of automated feedback.
 
In the following sections, we begin with an overview of related work on technology-based professional learning for teachers and teachers’ technology integration. Subsequently, we provide a background for our current study (Sections 3 and 4), with details on the technology platform and participants. In Section 5, we describe the experimental design, including the randomized setup, study procedures and the interview protocol. In Section 6, we provide an overview of the approach we took to answer each research question.
 
In Section 7, we provide the results of our research questions. We conclude by discussing the implications of these results for both research and practice related to using computerized tools in teacher professional learning.

Productive teacher talk

A large body of education research has shown that teacher talk that encourages students to verbalize, share and co-construct knowledge improves student learning, agency and sense of belonging (Alexander, 2020; Asterhan, Clarke, & Resnick, 2015; Chapin, O’Connor, & Anderson, 2009; Howe, Hennessy, Mercer, Vrikki, & Wheatley, 2019). Conceptualized under several related frameworks (dialogic instruction, accountable talk, and academically productive talk; Michaels, O’Connor, & Resnick, 2008; Resnick,

Feedback via the TeachFX platform & email

In this section, we describe TeachFX (Section 3.1), the platform we partnered with to deliver feedback as part of the study. We then provide an overview of how we delivered feedback on focusing questions via the platform (Section 3.2) and an email (Section 3.3).

Participants

TeachFX had recently formed a new research partnership with the state of Utah, as part of which mathematics and science teachers were encouraged to use TeachFX and received professional development opportunities related to automated feedback. Because these new users were not biased by exposure to automated feedback beforehand, they were ideal participants for the study. Furthermore, our target construct – focusing questions – applies to both mathematics and science instruction (Hagenah et al.,

Experimental design & procedures

We conducted a randomized controlled trial to evaluate the effectiveness of providing feedback to mathematics and science teachers on focusing questions. The study was approved under Stanford’s IRB. Our experiment ran for five months between October 10, 2022, and March 10, 2023. We ended the study in March because of the start of the standardized testing season, which interfered with teachers’ bandwidth to use the tool. After a teacher completed five weeks of recordings during the study period, 

Analytic approach

This section outlines the analytic approach we took to answer each of the three research questions, leveraging quantitative data from TeachFX and qualitative data collected from the interviews.

RQ1: To what extent do teachers engage with the automated feedback on focusing questions?

We tracked two key metrics of engagement with the automated feedback: email opens and views of the focusing question insight on the TeachFX platform. Overall, treatment group teachers opened the feedback emails at a much higher rate than they viewed the insight on the platform. Between 53% and 65% of treatment teachers opened their emails across weeks (65% for the 1st email, 53% for the 2nd, 61% for the 3rd, 55% for the 4th, and 56% for the 5th), but only 17–23% of them viewed the focusing insight page (23% for 

Implications of study findings

Computerized tools are emerging as a scalable complement to human-based solutions for teacher professional learning. In particular, NLP-powered formative feedback grounded in teaching practices has proven effective in a few online learning contexts (Demszky & Liu, 2023; Demszky, Liu, Hill, Jurafsky, & Piech, 2023). The present study was among the first to investigate the impact of automated feedback in brick-and-mortar classrooms using a randomized controlled trial targeting a

Conclusion

This study provides the first experimental evidence of the impact of automated feedback on focusing questions in K-12 brick-and-mortar classrooms. We find that such feedback significantly increases teachers’ use of this high-leverage instructional practice, demonstrating the potential of automated feedback to enhance teacher professional learning. We also highlight critical challenges in effectively engaging teachers with these tools, especially around transcription accuracy, feedback

CRediT authorship contribution statement

Dorottya Demszky: Writing – review & editing, Writing – original draft, Validation, Supervision, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization. Jing Liu: Writing – review & editing, Writing – original draft, Validation, Supervision, Methodology, Investigation, Funding acquisition, Formal analysis, Conceptualization. Heather C. Hill: Writing – review & editing, Writing – original draft, Supervision, Methodology, Funding 
 
