An analysis of item response theory using program R

Ali Ali, Edi Istiyono


Tests are among the instruments used to assess how well students understand what they have learned, and multiple choice is a commonly used test format. Beyond measuring student understanding, the quality of the test itself also needs to be evaluated. This study aims to determine the characteristics of the national mathematics test items in Baubau in the 2015/2016 academic year, together with the test information function, using the item response theory approach. This research is an ex post facto study with a sample of 574 students selected by random sampling. Data were collected through documentation and analyzed using the ltm package in R. The findings indicated that four items (I1, I2, I4, and I8) fit the 1-PL model, six items (I1, I2, I4, I7, I8, and I10) fit the 2-PL model, and seven items (I1, I2, I3, I4, I7, I9, and I10) fit the 3-PL model. The percentage of items with good parameter values was 90% for difficulty (b) under the 1-PL model; 90% for b and 100% for discrimination (a) under the 2-PL model; and 90% for b, 10% for a, and 70% for pseudo-guessing (c) under the 3-PL model. The percentage of good-quality items was 40% (four items) for the 1-PL model, 60% (six items) for the 2-PL model, and 0% (no items) for the 3-PL model.


Item Analysis; Item Response Theory; R Package Program.








Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Al-Jabar : Jurnal Pendidikan Matematika is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.