The Texas Education Agency began transitioning the STAAR tests from human grading to artificial intelligence grading in 2023. The agency recently updated the scoring rubrics to improve the experience for students.
“AI is one of those things that is here now and needs to be embraced,” testing coordinator Nicole Bowden said. “It is not going away. It is something that can be incredibly beneficial in a classroom, but as with any other tool, educators will need to learn how best to use it in their classrooms and schools.”
Previously, two human scorers graded each written STAAR response. If the two graders disagreed on a score, a third person graded the assessment to break the tie.
“I do believe it is something that needed to be done,” Bowden said. “It does make the scoring have less bias than if completely graded by humans. Of course, with all things computer, it is only as good as the programming. So as long as they keep it updated and calibrated, I think it is a great addition to state testing.”
Using AI to grade STAAR assessments saves a significant amount of time and allows scores to be returned to students more quickly. However, not all students agree with the change.
“I personally do not agree with the concept of AI grading papers and saving time because it ultimately can do more harm than good,” sophomore Nikhil Sridharan said. “I have heard stories about AI miscalculating grades and negatively impacting the students’ score. So I do not think it is worth it that results are able to come back sooner since all these issues are providing setbacks and really prolonging the grading process.”
AI grading also saves the state money, since it no longer has to pay graders to score the tests. Sridharan said he believes efficiency should not come at the cost of accuracy.
“I believe that it is not viable for the state to push out grades and use AI because of these miscalculations that affect students’ scores,” Sridharan said. “Although it may save money, budgeting an amount and spending it on graders allows for better calculated scores and ensures that there are fewer or no mistakes.”
One issue schools saw last spring was an increase in responses receiving a score of zero. The jump was partly due to a rubric change requiring students to clearly answer the prompt in addition to having a well-written response.
“With AI grading, versus people grading, the zero was given out more consistently because there was not the human factor involved with graders ‘feeling bad’ about giving a zero for a well written response that did not answer the question,” Bowden said. “Part of this is restating the question and answering it, which many English departments across the state had somewhat gone away from teaching. The answer has to be an ‘obvious’ answer to the question in that the student needs to restate the question in the answer.”
Concerns about AI grading extend beyond speed and money. Some students also question AI’s ability to properly evaluate a student’s thinking.
“It seems like it is useful for multiple choice questions because it can have a definite answer,” freshman Nathaniel Bauermeister said. “Essay grading is different because it is based on multiple factors, not just if it is right or wrong. If someone were to look over it after, I think it could work. I do not think it should grade essays since it is more complicated, and it doesn’t understand the human level that you need to be writing at.”