A comparison of the polytomous Rasch analysis output of RUMM2030 and R (ltm/eRm/TAM/lordif)
BMC Medical Research Methodology
© 2019 The Author(s).
Background: Patient-reported outcome measures developed using Classical Test Theory commonly comprise ordinal-level items on a Likert response scale. Such scales are problematic because they do not permit results to be compared between patients. Rasch analysis offers a solution by evaluating the measurement characteristics of the rating scales using probability estimates. This is typically achieved using commercial software dedicated to Rasch analysis; however, it is also possible to conduct the analysis using non-specific open-source software such as R.
Methods: Rasch analysis was conducted on a common data set (6-month post-injury PRWE Questionnaire responses) using the most commonly used commercial software package, RUMM2030, and R with four open-source packages (ltm, eRm, TAM and lordif), and the statistical results were evaluated for consistency. The analysis plan followed recommendations used in a similar study, supported by the software packages' instructions, in order to obtain category thresholds, item and person fit statistics, and measures of reliability, and to evaluate the data for construct validity, differential item functioning, local dependency and unidimensionality of the items.
Results: There was substantial agreement between RUMM2030 and R for most of the results; however, there were some small discrepancies between the output of the two programs.
Conclusions: While the differences in output between RUMM2030 and R can easily be explained by comparing the underlying statistical approaches taken by each program, the two programs disagree on critical statistical decisions. This disagreement should not be an issue, however, as Rasch analysis requires users to apply their own subjective judgement.
While researchers might expect Rasch analysis performed on a large sample to be stable, two authors who completed Rasch analyses of the PRWE reported somewhat dissimilar findings. So, while some variation in results may be due to differences between samples, this paper adds that some variation in findings may be software dependent.