OOIR: Observatory of International Research

Papers

(The median citation count of Educational and Psychological Measurement is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-01-01 to 2026-01-01.)

Article	Citations
Iterative Item Selection of Neighborhood Clusters: A Nonparametric and Non-IRT Method for Generating Miniature Computer Adaptive Questionnaires	234
Summary Intervals for Model-Based Classification Accuracy and Consistency Indices	31
Using Deep Reinforcement Learning to Decide Test Length	21
Assessing the Properties and Functioning of Model-Based Sum Scores in Multidimensional Measures With Local Item Dependencies: A Comprehensive Proposal	19
Model Specification Searches in Structural Equation Modeling Using Bee Swarm Optimization	18
Functional Approaches for Modeling Unfolding Data	16
Collapsing Sparse Responses in Likert-Type Scale Data: Advantages and Disadvantages for Model Fit in CFA	16
Optimal Number of Replications for Obtaining Stable Dynamic Fit Index Cutoffs	15
Detecting Differential Item Functioning Using Response Time	15
Using Item Scores and Response Times to Detect Item Compromise in Computerized Adaptive Testing	14
Generalized Mantel–Haenszel Estimators for Simultaneous Differential Item Functioning Tests	13
Investigating Confidence Intervals of Item Parameters When Some Item Parameters Take Priors in the 2PL and 3PL Models	13
An Explanatory Multidimensional Random Item Effects Rating Scale Model	11
An Illustration of an IRTree Model for Disengagement	11
Detecting Preknowledge Cheating via Innovative Measures: A Mixture Hierarchical Model for Jointly Modeling Item Responses, Response Times, and Visual Fixation Counts	11
Improving the Use of Parallel Analysis by Accounting for Sampling Variability of the Observed Correlation Matrix	10
Examination of ChatGPT’s Performance as a Data Analysis Tool	10
Identifying Ability and Nonability Groups: Incorporating Response Times Using Mixture Modeling	10
On the Benefits of Using Maximal Reliability in Educational and Behavioral Research	10
Assessing the Speed–Accuracy Tradeoff in Psychological Testing Using Experimental Manipulations	9
Reconceptualizing Scoring Reliability Through Linguistic Similarity	9
An Omega-Hierarchical Extension Index for Second-Order Constructs With Hierarchical Measuring Instruments	9
Item Parameter Recovery: Sensitivity to Prior Distribution	8
Rotation Local Solutions in Multidimensional Item Response Theory Models	8
What Affects the Quality of Score Transformations? Potential Issues in True-Score Equating Using the Partial Credit Model	8
How to Improve the Regression Factor Score Predictor When Individuals Have Different Factor Loadings	8
From Linear Geometry to Nonlinear and Information-Geometric Settings in Test Theory: Bregman Projections as a Unifying Framework	7
On Bank Assembly and Block Selection in Multidimensional Forced-Choice Adaptive Assessments	7
The Impact and Detection of Uniform Differential Item Functioning for Continuous Item Response Models	7
A New Stopping Criterion for Rasch Trees Based on the Mantel–Haenszel Effect Size Measure for Differential Item Functioning	7
Examining the Dynamic of Clustering Effects in Multilevel Designs: A Latent Variable Method Application	7
Measuring Unipolar Traits With Continuous Response Items: Some Methodological and Substantive Developments	7
On the Complex Sources of Differential Item Functioning: A Comparison of Three Methods	6
Are Speeded Tests Unfair? Modeling the Impact of Time Limits on the Gender Gap in Mathematics	6
A Bayesian General Model to Account for Individual Differences in Operation-Specific Learning Within a Test	6
Agreement Lambda for Weighted Disagreement With Ordinal Scales: Correction for Category Prevalence	6
Separation of Traits and Extreme Response Style in IRTree Models: The Role of Mimicry Effects for the Meaningful Interpretation of Estimates	6
Linear and Nonlinear Indices of Score Accuracy and Item Effectiveness for Measures That Contain Locally Dependent Items	5
Examining the Instructional Sensitivity of Constructed-Response Achievement Test Item Scores	5
Differential Item Functioning Effect Size Use for Validity Information	5
Impacts of DIF Item Balance and Effect Size Incorporation With the Rasch Tree	5
Obtaining a Bayesian Estimate of Coefficient Alpha Using a Posterior Normal Distribution	5
The One-Parameter Logistic Model Can Be True With Zero Probability for a Unidimensional Measuring Instrument: How One Could Go Wrong Removing Items Not Satisfying the Model	5
An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models	5
Using Multiple Imputation to Account for the Uncertainty Due to Missing Data in the Context of Factor Retention	5
Detecting Rating Scale Malfunctioning With the Partial Credit Model and Generalized Partial Credit Model	5
Evaluating Model Fit of Measurement Models in Confirmatory Factor Analysis	5
An Item Response Theory Model for Incorporating Response Times in Forced-Choice Measures	4
Coefficients of Factor Score Determinacy for Mean Plausible Values of Bayesian Factor Analysis	4
Evaluation of Polytomous Item Locations in Multicomponent Measuring Instruments: A Note on a Latent Variable Modeling Procedure	4
Reducing Calibration Bias for Person Fit Assessment by Mixture Model Expansion	4
Assessing Essential Unidimensionality of Scales and Structural Coefficient Bias	4
Investigating Heterogeneity in Response Strategies: A Mixture Multidimensional IRTree Approach	4
Evaluating the Performance of a Regularized Differential Item Functioning Method for Testlet-Based Polytomous Items	4
Historical Measurement Information Can Be Used to Improve Estimation of Structural Parameters in Structural Equation Models With Small Samples	4
Modeling Misspecification as a Parameter in Bayesian Structural Equation Models	4
Is Effort Moderated Scoring Robust to Multidimensional Rapid Guessing?	4
Awareness Is Bliss: How Acquiescence Affects Exploratory Factor Analysis	4
Overestimation of Internal Consistency by Coefficient Omega in Data Giving Rise to a Centroid-Like Factor Solution	4
A Small Sample Correction for Factor Score Regression	3
Discriminant Validity of Interval Response Formats: Investigating the Dimensional Structure of Interval Widths	3
The Dominant Trait Profile Method of Scoring Multidimensional Forced-Choice Questionnaires	3
Field-Testing Multiple-Choice Questions With AI Examinees: English Grammar Items	3
Is the Area Under Curve Appropriate for Evaluating the Fit of Psychometric Models?	3
Exploration of the Stacking Ensemble Machine Learning Algorithm for Cheating Detection in Large-Scale Assessment	3
A Note on Evaluation of Polytomous Item Locations With the Rating Scale Model and Testing Its Fit	3
Resolving Dimensionality in a Child Assessment Tool: An Application of the Multilevel Bifactor Model	3
Two-Method Measurement Planned Missing Data With Purposefully Selected Samples	3
Detecting Cheating in Large-Scale Assessment: The Transfer of Detectors to New Tests	3
The Impact of Sample Size and Various Other Factors on Estimation of Dichotomous Mixture IRT Models	3
Comparing Accuracy of Parallel Analysis and Fit Statistics for Estimating the Number of Factors With Ordered Categorical Data in Exploratory Factor Analysis	3
Model-Based Person Fit Statistics Applied to the Wechsler Adult Intelligence Scale IV	3
Croon’s Bias-Corrected Estimation for Multilevel Structural Equation Models with Non-Normal Indicators and Model Misspecifications	3
Equidistant Response Options on Likert-Type Instruments: Testing the Interval Scaling Assumption Using Mplus	3
Investigating the Ordering Structure of Clustered Items Using Nonparametric Item Response Theory	3
When Cluster-Robust Inferences Fail	3
The Effect of Modeling Missing Data With IRTree Approach on Parameter Estimates Under Different Simulation Conditions	2
On the Importance of Coefficient Alpha for Measurement Research: Loading Equality Is Not Necessary for Alpha’s Utility as a Scale Reliability Index	2
Interpretation of the Standardized Mean Difference Effect Size When Distributions Are Not Normal or Homoscedastic	2
A Regression Discontinuity Design Framework for Controlling Selection Bias in Evaluations of Differential Item Functioning	2
Added Value of Subscores for Tests With Polytomous Items	2
Assessing Ability Recovery of the Sequential IRT Model With Unstructured Multiple-Attempt Data	2
Developing Situated Measures of Science Instruction Through an Innovative Electronic Portfolio App for Mobile Devices: Reliability, Validity, and Feasibility	2
Treating Noneffortful Responses as Missing	2
Range Restriction Affects Factor Analysis: Normality, Estimation, Fit, Loadings, and Reliability	2
Why Forced-Choice and Likert Items Provide the Same Information on Personality, Including Social Desirability	2
Dimensionality Assessment in Forced-Choice Questionnaires: First Steps Toward an Exploratory Framework	2
Comparing RMSEA-Based Indices for Assessing Measurement Invariance in Confirmatory Factor Models	2
A Comparison of the Next Eigenvalue Sufficiency Test to Other Stopping Rules for the Number of Factors in Factor Analysis	2
Dominance Analysis for Latent Variable Models: A Comparison of Methods With Categorical Indicators and Misspecified Models	2
Evaluating Equating Methods for Varying Levels of Form Difference	2
Disentangling Qualitatively Different Faking Strategies in High-Stakes Personality Assessments: A Mixture Extension of the Multidimensional Nominal Response Model	2
Diagnostic Classification Model for Forced-Choice Items and Noncognitive Tests	2
Testing the Performance of Level-Specific Fit Evaluation in MCFA Models With Different Factor Structures Across Levels	2
Scoring Graphical Responses in TIMSS 2019 Using Artificial Neural Networks	2
Evaluating Imputation-Based Fit Statistics in Structural Equation Modeling With Ordinal Data: The MI2S Approach	2
On the Utility of Indirect Methods for Detecting Faking	2