Journal of Educational Measurement

Papers
(The median citation count of Journal of Educational Measurement is 1. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-11-01 to 2025-11-01.)
| Article | Citations |
| --- | --- |
| Measuring the Uncertainty of Imputed Scores | 22 |
|  | 22 |
| How Many Plausible Values? | 18 |
| NCME Presidential Address 2022: Turning the Page to the Next Chapter of Educational Measurement | 17 |
| The Automated Test Assembly and Routing Rule for Multistage Adaptive Testing with Multidimensional Item Response Theory | 15 |
| Optimal Calibration of Items for Multidimensional Achievement Tests | 14 |
| A Statistical Test for the Detection of Item Compromise Combining Responses and Response Times | 12 |
| A Note on the Use of Categorical Subscores | 11 |
| The Precision and Bias of Cut Score Estimates from the Beuk Standard Setting Method | 11 |
| Comparing Data‐Driven Methods for Removing Options in Assessment Items | 9 |
| Linking Error on Achievement Levels Accounting for Dependencies and Complex Sampling | 9 |
| Using Linkage Sets to Improve Connectedness in Rater Response Model Estimation | 8 |
| A Deterministic Gated Lognormal Response Time Model to Identify Examinees with Item Preknowledge | 8 |
| Issue Information | 8 |
| Two IRT Characteristic Curve Linking Methods Weighted by Information | 8 |
| Using Item Parameter Predictions for Reducing Calibration Sample Requirements—A Case Study Based on a High‐Stakes Admission Test | 7 |
| Historical Perspectives on Score Comparability Issues Raised by Innovations in Testing | 7 |
| Automated Coding of Communications in Collaborative Problem‐Solving Tasks Using ChatGPT | 6 |
| Issue Information | 6 |
| Model Selection Posterior Predictive Model Checking via Limited‐Information Indices for Bayesian Diagnostic Classification Modeling | 5 |
| Validity Arguments for AI‐Based Automated Scores: Essay Scoring as an Illustration | 5 |
| An Exponentially Weighted Moving Average Procedure for Detecting Back Random Responding Behavior | 5 |
| Briggs, Derek C. Historical and Conceptual Foundations of Measurement in the Human Sciences: Credos and Controversies | 5 |
|  | 5 |
| Using Response Time in Multidimensional Computerized Adaptive Testing | 4 |
| On the Positive Correlation between DIF and Difficulty: A New Theory on the Correlation as Methodological Artifact | 4 |
| Parametric Bootstrap Mantel‐Haenszel Statistic for Aggregated Testlet Effects | 4 |
| Information Functions of Rank‐2PL Models for Forced‐Choice Questionnaires | 4 |
| Differential and Functional Response Time Item Analysis: An Application to Understanding Paper versus Digital Reading Processes | 4 |
| Likelihood‐Based Estimation of Model‐Derived Oral Reading Fluency | 4 |
| A Generalized Objective Function for Computer Adaptive Item Selection | 3 |
| Detecting Group Collaboration Using Multiple Correspondence Analysis | 3 |
| Special Issue: Adaptive Testing in Large‐Scale Assessments | 3 |
| Addressing Bias in Spoken Language Systems Used in the Development and Implementation of Automated Child Language‐Based Assessment | 3 |
| Gender Bias in Test Item Formats: Evidence from PISA 2009, 2012, and 2015 Math and Reading Tests | 3 |
| Score Comparability between Online Proctored and In‐Person Credentialing Exams | 3 |
| Issue Information | 3 |
| DIF Detection for Multiple Groups: Comparing Three‐Level GLMMs and Multiple‐Group IRT Models | 3 |
| Using Eye‐Tracking Data as Part of the Validity Argument for Multiple‐Choice Questions: A Demonstration | 3 |
| Sensemaking of Process Data from Evaluation Studies of Educational Games: An Application of Cross‐Classified Item Response Theory Modeling | 3 |
| Controlling the Speededness of Assembled Test Forms: A Generalization to the Three‐Parameter Lognormal Response Time Model | 3 |
| Issue Information | 2 |
| A Highly Adaptive Testing Design for PISA | 2 |
| Comparing and Combining IRTree Models and Anchoring Vignettes in Addressing Response Styles | 2 |
| Utilizing Response Time for Item Selection in On‐the‐Fly Multistage Adaptive Testing for PISA Assessment | 2 |
| An Item Response Tree Model for Items with Multiple‐Choice and Constructed‐Response Parts | 2 |
| Measuring the Impact of Peer Interaction in Group Oral Assessments with an Extended Many‐Facet Rasch Model | 2 |
| Cognitive Diagnostic Multistage Testing by Partitioning Hierarchically Structured Attributes | 2 |
| The Vulnerability of AI‐Based Scoring Systems to Gaming Strategies: A Case Study | 2 |
| Issue Information | 2 |
| Subscores: A Practical Guide to Their Production and Consumption. Shelby Haberman, Sandip Sinharay, Richard Feinberg, and Howard Wainer. Cambridge, Cambridge University Press 2024, 176 pp. (paperback) | 2 |
| Betty Lanteigne, Christine Coombe, & James Dean Brown. 2021. Challenges in Language Testing around the World: Insights for language test users. Singapore: Springer, 2021, 129.99 € (hardcover) | 2 |
| Explanatory Cognitive Diagnostic Modeling Incorporating Response Times | 2 |
|  | 2 |
| On the Choice of Parameters for the Lognormal Model for Response Times: Commentary on Becker et al. (2013) | 2 |
|  | 2 |
|  | 2 |
| Using Multilabel Neural Network to Score High‐Dimensional Assessments for Different Use Foci: An Example with College Major Preference Assessment | 2 |
| MSAEM Estimation for Confirmatory Multidimensional Four‐Parameter Normal Ogive Models | 2 |
| Detecting Multidimensional DIF in Polytomous Items with IRT Methods and Estimation Approaches | 2 |
| Exploring the Impact of Random Guessing in Distractor Analysis | 2 |
| Issue Information | 2 |
| Issue Information | 2 |
| Online Monitoring of Test‐Taking Behavior Based on Item Responses and Response Times | 2 |
| Detecting Differential Item Functioning Using Posterior Predictive Model Checking: A Comparison of Discrepancy Statistics | 1 |
| Validity Arguments Meet Artificial Intelligence in Innovative Educational Assessment | 1 |
| Modeling Hierarchical Attribute Structures in Diagnostic Classification Models with Multiple Attempts | 1 |
| From Item Estimates to Test Operations: The Cascading Effect of Rapid Guessing | 1 |
| Simultaneous Detection of Cheaters and Compromised Items Using a Biclustering Approach | 1 |
| Influence of Intersectional Routing Modules between Dimensions on Measurement Precision in Multidimensional Multistage Testing | 1 |
| Constructing a Robust Score Scale from IRT Scores with Informed Boundaries | 1 |
| Using Item Scores and Distractors in Person‐Fit Assessment | 1 |
| The Impact of Cheating on Score Comparability via Pool‐Based IRT Pre‐equating | 1 |
| Finding Words Associated with DIF: Predicting Differential Item Functioning Using LLMs and Explainable AI | 1 |
| Fully Gibbs Sampling Algorithms for Bayesian Variable Selection in Latent Regression Models | 1 |
| IRT Observed‐Score Equating for Rater‐Mediated Assessments Using a Hierarchical Rater Model | 1 |
|  | 1 |
| Validity Arguments Meet Artificial Intelligence in Innovative Educational Assessment: A Discussion and Look Forward | 1 |
| Using Keystroke Dynamics to Detect Nonoriginal Text | 1 |
|  | 1 |
| Evaluating the Consistency and Reliability of Attribution Methods in Automated Short Answer Grading (ASAG) Systems: Toward an Explainable Scoring System | 1 |
| Reckase, M. The Psychometrics of Standard Setting: Connecting Policy and Test Scores: First edition published 2023 by CRC Press, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487‐274 | 1 |
| A Topic Testlet Model for Calibrating Testlet Constructed Responses | 1 |
| A Dual‐Purpose Model for Binary Data: Estimating Ability and Misconceptions | 1 |
| A Factor Mixture Model for Item Responses and Certainty of Response Indices to Identify Student Knowledge Profiles | 1 |
| Modeling the Intraindividual Relation of Ability and Speed within a Test | 1 |
| Curvilinearity in the Reference Composite and Practical Implications for Measurement | 1 |
| Argument‐Based Approach to Validity: Developing a Living Document and Incorporating Preregistration | 1 |
|  | 1 |
| Robustness of Item Response Theory Models under the PISA Multistage Adaptive Testing Designs | 1 |
|  | 1 |