Google Scholar

Ensuring fairness of human- and AI-generated test items

WCM Belzak, B Naismith, J Burstein - International Conference on Artificial …, 2023 - Springer

Large language models (LLMs) have been a catalyst for the increased use of AI for
automatic item generation on high-stakes assessments. Standard human review processes
applied to human-generated content are also important for AI-generated content because AI-
generated content can reflect human biases. However, human reviewers have implicit
biases and gaps in cultural knowledge which may emerge where the test population is
diverse. Quantitative analyses of item responses via differential item functioning (DIF) can …

Cited by 11

[PDF] researchgate.net

[PDF] Ensuring Fairness of Human- and AI-Generated Test Items

J Burstein - researchgate.net

Large language models (LLMs) have been a catalyst for the increased use of AI for automatic
item generation on high-stakes assessments. Standard human review processes applied to
human-generated content are also important for AI-generated content because AI-generated
content can reflect human biases. However, human reviewers have implicit biases and gaps
in cultural knowledge which may emerge where the test population is diverse. Quantitative
analyses of item responses via differential item functioning (DIF) can help to identify these …
