
What predicts variation in reliability and validity of online peer assessment? A large‐scale cross‐context study.

  • Published In: Journal of Computer Assisted Learning, 2023, v. 39, n. 6, p. 2004

  • Database: Academic Search Ultimate

  • Authored By: Xiong, Yao; Schunn, Christian D.; Wu, Yong

Abstract

Background: For peer assessment, reliability (i.e., consistency in ratings across peers) and validity (i.e., consistency of peer ratings with instructor or expert ratings) are frequently examined in the research literature to address a central concern of instructors and students. Although average levels are generally promising, both reliability and validity can vary substantially from context to context. Meta‐analyses have identified a few moderators related to peer assessment reliability/validity, but they have lacked the statistical power to systematically investigate many moderators or to disentangle correlated moderators.

Objectives: The current study fills this gap by examining which variables influence peer assessment reliability/validity, using a large‐scale, cross‐context dataset from a shared online peer assessment platform.

Methods: Using multi‐level structural equation models, we examined three categories of variables: (1) variables related to the context of peer assessment; (2) variables related to the peer assessment task itself; and (3) variables related to the rating rubrics of peer assessment.

Results and Conclusions: We found that the extent to which assessed documents varied in quality on the given rubric played a central role in mediating the effects of the different predictors on peer assessment reliability/validity. Other variables significantly associated with reliability and validity included Education Level, Language, Discipline, Average Ability of Peer Raters, Draft Number, Assignment Number, Class Size, Average Number of Raters, and Length of Rubric Description. The results provide guidance to practitioners on how to improve the reliability and validity of peer assessments.

Lay Description:

What is already known about this topic:

  • The reliability and validity of peer assessment have been found to be, on average, at or above acceptable levels.
  • Inter‐rater reliabilities, in particular, have been reported to be medium to high in online peer assessment studies in both K‐12 and higher education settings.
  • Correlations between peer ratings and expert ratings have been reported to be medium to high across studies.

What this paper adds:

  • Document Quality Variability (DQV) is the degree to which submitted documents vary in quality on a given rubric dimension, rather than all being of roughly similar quality.
  • DQV strongly predicts peer assessment reliability and validity.
  • Class Size, Average Number of Raters, and Length of Rubric Descriptions are directly associated with reliability.
  • Average Number of Raters and Assignment Number predict validity directly.
  • Average Ability of Peer Raters, Draft Number, and Assignment Number predict validity indirectly through DQV.

Implications for practice and/or policy:

  • To increase inter‐rater reliability and the validity of ratings, instructors can include students with a range of proficiency levels in the subject area in peer assessment.
  • Instructors should avoid rubric scales on which students will mainly cluster at the top (or bottom) of the scale.
  • The appropriate number of peer raters usually lies between 3 and 5 raters per document.
  • Instructors should use peer assessment activities primarily for first drafts rather than for second or later drafts of an assignment. [ABSTRACT FROM AUTHOR]
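The abstract operationalizes reliability as consistency among peer raters and validity as agreement between peer and expert ratings. The sketch below is a minimal illustration of those two quantities using pairwise Pearson correlations; it is not the authors' analysis (they used multi‐level structural equation models), and all data values and names are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical example: rows are submitted documents, columns are peer raters.
# All values below are made up for illustration; nothing here is study data.
peer_ratings = np.array([
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 2, 3],
    [4, 4, 5],
])
expert_ratings = np.array([4, 2, 5, 3, 4])  # one expert score per document

# Reliability proxy: average pairwise Pearson correlation among peer raters.
n_raters = peer_ratings.shape[1]
pairwise_rs = [
    stats.pearsonr(peer_ratings[:, i], peer_ratings[:, j])[0]
    for i in range(n_raters)
    for j in range(i + 1, n_raters)
]
reliability = float(np.mean(pairwise_rs))

# Validity proxy: correlation of the mean peer rating with the expert rating.
mean_peer = peer_ratings.mean(axis=1)
validity = stats.pearsonr(mean_peer, expert_ratings)[0]

print(f"inter-rater reliability (mean pairwise r): {reliability:.2f}")
print(f"validity (r between mean peer and expert): {validity:.2f}")
```

Note that if all documents were of similar quality (low DQV), both correlations would attenuate toward zero even with careful raters, since restriction of range shrinks correlations; this is consistent with the paper's finding that DQV mediates reliability and validity.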

Additional Information

  • Source: Journal of Computer Assisted Learning, 2023/12, Vol. 39, Issue 6, p. 2004
  • Document Type: Article
  • Subject Area: Education
  • Publication Date: 2023
  • ISSN: 0266-4909
  • DOI: 10.1111/jcal.12861
  • Accession Number: 173586102
  • Copyright Statement:Copyright of Journal of Computer Assisted Learning is the property of Wiley-Blackwell and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
