OOIR: Observatory of International Research

Papers

(The median citation count of Applied Measurement in Education is 1. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-08-01 to 2025-08-01.)

Article	Citations
Impact of Violating Unidimensionality on Rasch Calibration for Mixed-Format Tests	14
Detecting Item Parameter Drift in Small Sample Rasch Equating	8
Shifting Educational Measurement from an Agent of Systemic Racism to an Anti-Racist Endeavor	8
IRT Characteristic Curve Linking Methods Weighted by Information for Mixed-Format Tests	7
Computer-Based Listening Test with Full Video, Visual-Limited Video, and Audio: A Comparative Analysis Based on Difficulty, Discrimination Power, and Response Time	7
The Standards Will Never Be Enough: A Racial Justice Extension	7
Characterizing the Latent Classes in a Mixture IRT Model Using DIF	6
Analyzing Student Response Processes to Evaluate Success on a Technology-Based Problem-Solving Task	5
Comparing Examinee-Based and Response-Based Motivation Filtering Methods in Remote Low-Stakes Testing	5
Not-Reached Items: An Issue of Time and of Test-Taking Disengagement? The Case of PISA 2015 Reading Data	5
Comparing Drift Detection Methods for Accurate Rasch Equating in Different Sample Sizes	5
Response Demands of Reading Comprehension Test Items: A Review of Item Difficulty Modeling Studies	4
Applying a Culturally Responsive Pedagogical Framework to Design and Evaluate Classroom Performance-Based Assessments in Hawai‘i	4
The Impact of Non-Effortful Responding on Item and Person Parameters in Item-Pool Scaling Linking	4
Measurement Invariance in Relation to First Language: An Evaluation of German Reading and Spelling Tests	4
New Tests of Rater Drift in Trend Scoring	4
Modeling Dimensions Converging at the Upper Anchor in Learning Progressions: An Example of Micro-Evolution	3
Don’t Test After Lunch: The Relationship Between Disengagement and the Time of Day That Low-Stakes Testing Occurs	3
Dissecting Knowledge, Guessing, and Blunder in Multiple Choice Assessments	3
The Consideration of Admissions Testing at Colleges and Universities: A Perspective	3
A Method for Displaying Incremental Validity with Expectancy Charts	3
Does the Response Options Placement Provide Clues to the Correct Answers in Multiple-choice Tests? A Systematic Review	3
Automated Scoring of Short-Answer Questions: A Progress Report	3
Between- versus Within-Examinee Variability in Test-Taking Effort and Test Emotions during a Low-Stakes Test	3
Enacting a Process for Developing Culturally Relevant Classroom Assessments	2
Identifying Careless Responses in Computer-Adaptive Affective Surveys Using Person Fit Analysis	2
Gender Differences and Types of Test-Taking Behaviors in PIRLS 2021	2
Change in Engagement During Test Events: An Argument for Weighted Scoring?	2
Gauging Misclassification in Rapid Guessing Identification in a Fast-Paced Vocabulary Test	2
Item-Writing Guidelines on Response Option Placement: A Systematic Review	2
Multi-Group Generalizations of SIBTEST and Crossing-SIBTEST	2
Are Online and Paper Tests Comparable? Evidence from Statewide K-12 Tests	2
Efficient Assessment of Students’ Proportional Reasoning	2
A Method of Empirical Q-Matrix Validation for Multidimensional Item Response Theory	2
Comparing School Reports and Empirical Estimates of Relative Reliance on Tests Vs Grades in College Admissions	2
Performance Decline as an Indicator of Generalized Test-Taking Disengagement	1
Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing	1
An Examination of Individual Ability Estimation and Classification Accuracy Under Rapid Guessing Misidentifications	1
Personalized Online Learning, Test Fairness, and Educational Measurement: Considering Differential Content Exposure Prior to a High Stakes End of Course Exam	1
A Critical Review of Fairness from Multiple Perspectives: Implications for Classroom Assessment Theory	1
Reconceptualizing Rapid Responses as a Speededness Indicator in High-Stakes Assessments	1
Maintaining Score Scales Over Time: A Comparison of Five Scoring Methods	1
Using Bayesian Networks for Cognitive Assessment of Student Understanding of Buoyancy: A Granular Hierarchy Model	1
Can Adaptive Testing Improve Test-Taking Experience? A Case Study on Educational Survey Assessment	1