FMCE-PHQ-9 Assessment with Rasch Model in Detecting Concept Understanding, Cheating, and Depression amid the Covid-19 Pandemic

*Correspondence Address: fauzansulman@uinjambi.ac.id

Abstract: This research examines the ability of the FMCE-PHQ-9 test instrument to measure students' conceptual understanding, cheating, and depression amid the Covid-19 pandemic. The research was conducted on 64 physics education students at Sulthan Thaha Saifuddin State Islamic University Jambi. The instrument consists of 47 items on force and motion, analyzed with the Winsteps 3.65.0 program. The Rasch model analysis showed an Outfit MNSQ of 1.00 in the person column and 0.1 in the item column, and an Outfit ZSTD of 0.57 for the person and 0.1 for the item. The point-measure correlation values were between 0.4 and 0.85, the item reliability was 0.73, and the Cronbach's alpha value was 0.56. On this basis, the Rasch model analysis found 31 fit items. The analysis shows that conceptual ability was poor since, on average, the students could only answer questions in the low difficulty category. On cheating, none of the students (100 percent) was indicated to share the same answer pattern with another student. Lastly, only 16 percent of students did not experience depression, while 84 percent did.


INTRODUCTION
In facing the changes caused by Covid-19, higher education (Chang et al., 2020; Mishra et al., 2020; Platto et al., 2020) must not only provide a learning system but also assess the complex and sensitive tools used in the learning process (Ozkazanc & Yuksel, 2015; Watts et al., 2003). Assessment must have a clear and transparent system and function (Liaupsin, 2002; Reed et al., 2020) and must take students' psychological condition into account (Lavasani et al., 2011; Sremcev et al., 2018) so that students can support the government's MBKM (freedom to learn campus) program (Manda Negara & Hidayati, 2021; Sudaryanto et al., 2020). MBKM in physics education is related to students' conceptual understanding. Conceptual understanding is a special concern in physics, especially in the force and motion learning materials. Various efforts to improve mastery of physics concepts have been made (Malone, 2008; Scott & Schumayer, 2017) through learning models, the development of learning media, and the development of assessments in learning (Sofiuddin et al., 2018; Sulman, 2019). The results remain unsatisfactory (Sulman et al., 2021), so an assessment that measures accurately is needed. One commonly used measuring tool is the multiple-choice test because it provides space for more precise and more focused thinking (Fidan & Tuncel, 2019; Holbrook & Kasales, 2020). However, many multiple-choice test instruments still fail to measure what they should measure (Kusairi, 2013; Ozkazanc & Yuksel, 2015).
The selected test instrument must meet the criteria of a good measuring tool and provide an overview of the competencies students possess (Sri Hastuti, 2021; Wijayanti & Mundilarto, 2015). Most existing research instruments cannot detect measurement errors, especially errors in the measuring instrument itself. For example, when the lecturer asks students to analyze what happens when we dive into a deep place, students accustomed to diving can answer the question with relative ease. Conversely, students who do not know the phenomenon proposed by the lecturer may need additional data. Therefore, a good assessment is needed. Formative assessment must take many points of view into account. Formative assessment has an insignificant impact on the learning process (Kusairi, 2013; Xiao & Yang, 2019) and does not pay attention to students' mental states (Xiao & Yang, 2019). This factor is important and can affect student learning outcomes. Therefore, a test instrument must be developed that reflects students' actual abilities by detecting bias in the questions and cheating in the test, and that considers the condition and mental state of the test takers. Students' psychological state fluctuates, so it easily undergoes changes that can hinder the process of understanding the material (Thees et al., 2020). In carrying out their activities, students must focus on their scientific fields while many other factors in their lives also affect them, for example, the social environment, organizations, family, and lectures.
The learning process has shifted from offline to online. These and other factors influence how students complete their current courses. When tests are taken online at students' respective homes, technology does not let us observe students' mental conditions. Students may understand the concepts, cheat, or be depressed, and teachers cannot tell which, so they cannot measure students' abilities. Multiple-choice assessment gives a score for each correct answer and no score for an incorrect answer. The more questions a student answers correctly, the higher that student's ability is taken to be. This practice must be reviewed in depth from the psychological aspect. The classical procedure usually pays less attention to the interaction between each student and the questions (Shute & Rahimi, 2021). Hence, it is not easy to measure students' actual ability in an online test, which allows collaboration and the use of technology to answer the questions.
Assessments must measure in depth to support decisions. Therefore, item response theory and questionnaires on psychological or mental condition can be a solution to the weaknesses of the usual classical test theory. Item response theory is an alternative approach for analyzing a test using probabilistic principles: the probability that a subject answers an item correctly depends on the subject's ability and the item's characteristics. In item response theory models, each participant responds to each item in the test. One item response model that can be used is the Rasch model, which Benjamin Wright later popularized (Ho, 2019; Ponkilainen et al., 2021). Its raw data are dichotomous (correct or incorrect). Rasch modeling aims to overcome measurement results that depend on the test used (test-dependent scoring) and on the subjects measured (sample-dependent scoring), where the percentage or number of correct answers obtained from one sample is applied to all subjects (Andrich & Pedler, 2019; Rabbitt, 2018). The Rasch measurement model is considered to meet the five conditions of objective measurement. Objective measurements produce data independent of the type of subject, the characteristics of the assessor, and the characteristics of the measuring instrument (Zehirlioglu & Mert, 2020). This research uses the Rasch model to analyze a test instrument developed from the perspective of assessment for learning. The instrument, developed using the Rasch model with the Winsteps program version 3.65.0, can detect item bias and examine answer patterns during an online test. Therefore, students will not feel disadvantaged by inaccurate and unfair measurements. The Rasch model can provide an accurate measurement (Gordon et al., 2021; Lu et al., 2021). To probe the mental state of the test takers in more depth, the instrument is combined with the Patient Health Questionnaire 9 (PHQ-9).
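As a point of reference, the probabilistic principle described above is commonly written as the dichotomous Rasch model. The symbols are notational assumptions that this paper itself does not use: $\theta_n$ denotes the ability of person $n$, $\delta_i$ the difficulty of item $i$ (both in logits), and $X_{ni}$ the scored response (1 for correct, 0 for incorrect).

```latex
P(X_{ni} = 1 \mid \theta_n, \delta_i)
  = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}
```

When ability equals difficulty the probability is 0.5, and it rises above 0.5 as ability exceeds difficulty.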
The PHQ-9 is a questionnaire often used for screening in primary health facilities. Several studies have revealed the advantages of this test in analyzing depression (Patrick & Connick, 2019) amid Covid-19. Every person is unique, so obstacles arise if we cannot diagnose students' problems. An objective and sensitive questionnaire is needed as a screening tool. Among the several available psychological questionnaires, the PHQ-9 can be used (Levis et al., 2019; Patrick & Connick, 2019). The PHQ-9 is a psychometric instrument that was originally part of a broader mental disorder evaluation intended to screen for mental disorders in general. In 2001, Robert L. Spitzer and his colleagues from Columbia University, New York, developed it as a separate questionnaire for detecting depression. For this reason, the researchers integrated the PHQ-9 with the FMCE concept understanding assessment, which is widely used to determine students' conceptual understanding, and analyzed the instrument with the Rasch model to obtain comprehensive results and explore the subjects. The Covid-19 pandemic requires lectures to be conducted online.

METHOD
The research used the research and development method with the development model proposed by Borg and Gall (Tristiantari, 2019). The trial subjects were 64 students who had taken basic physics courses at Sulthan Thaha Saifuddin State Islamic University Jambi. The quantitative data were analyzed with the Winsteps Version 3.65.0 program, which was used to obtain the item parameters that fit the Rasch model. Reliability was judged from the item reliability and from the overall reliability indicated by the Cronbach's alpha value. An item was declared to fit the model if it had an Outfit MNSQ of 0.5 to 1.5, an Outfit ZSTD between -2.0 and 2.0, and a correlation value from 0.4 to 0.85. After developing the instrument, the researchers analyzed and drew conclusions about students' conceptual understanding, cheating, and depression on force and motion material during the Covid-19 pandemic.

RESULTS AND DISCUSSION
Based on data analysis using Winsteps software, 31 items fit the Rasch model and the other 16 items did not. The results are summarized in Table 1. Table 1 shows that the person measure is 0.00 logits and the item measure is 0.78 logits. The person measure is therefore much lower than the item measure, which means that the ability of the students at Sulthan Thaha Saifuddin State Islamic University Jambi is far below the difficulty level of the items. In other words, it is tough for students to answer all questions correctly. Students with low abilities answer only the questions with a low difficulty index and cannot work on the most difficult questions.
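To give a rough sense of what a gap of this size means, the sketch below evaluates the standard dichotomous Rasch response function at the person and item measures quoted above. Treating the two summary measures as if they were one student and one item is a simplifying assumption made only for illustration, and the function name `rasch_probability` is hypothetical; it is not part of Winsteps.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct answer under the dichotomous Rasch model.

    Both arguments are in logits. This is an illustrative helper, not a
    Winsteps routine.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Measures as reported from Table 1 (logits).
person_measure = 0.00
item_measure = 0.78

p_correct = rasch_probability(person_measure, item_measure)
print(f"{p_correct:.2f}")  # prints 0.31
```

Under these assumptions, a student at the reported person measure would have roughly a 31 percent chance of answering an item at the reported item measure correctly.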
On the other hand, the item reliability is adequate (0.73). The person reliability is 0.44, and the Cronbach's alpha value is 0.56. Therefore, the consistency of students' answers is in the sufficient category.
Table 1 also presents an Outfit Mean Square (Outfit MNSQ) value of 1, and of 0.7 for the person. These values fall within the fit criterion 0.5 < MNSQ < 1.5, so the test instrument is appropriate for measuring students' understanding. Furthermore, the Outfit Z-Standardized (Outfit ZSTD) value is 0.57 for the person and 0.1 for the item. Both values lie within -2.0 < ZSTD < 2.0, so the data can be regarded as having rational values. In other words, the items follow the Rasch model (Shea et al. 2012; Tseng et al. 2019) and can be used as an instrument to test students' conceptual understanding of the force and motion material.
The results show the ability of the Rasch model in developing assessment instruments. The distribution of items is considered appropriate according to the Rasch model. An item is declared to fit the model if it meets one or more of the following conditions: the Outfit MNSQ value is between 0.5 and 1.5; the Outfit ZSTD value is between -2.0 and 2.0; and the correlation of the item with the total score is between 0.4 and 0.85. The developed questions can serve as a measuring tool that is fit for the final test and for assessments that help lecturers during lectures (Chang and Chang 2014; Veugen, Gulikers, and den Brok 2021).
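The decision rule stated above can be sketched as a small check on the three statistics. This is a minimal illustration, not the Winsteps procedure: the function names, the strict inequalities at the boundaries, the `min_criteria` parameter, and the item statistics are all assumptions introduced here.

```python
def fit_checks(outfit_mnsq, outfit_zstd, pt_measure_corr):
    """Return which of the three fit criteria an item satisfies."""
    return {
        "mnsq": 0.5 < outfit_mnsq < 1.5,
        "zstd": -2.0 < outfit_zstd < 2.0,
        "corr": 0.4 < pt_measure_corr < 0.85,
    }


def is_fit(outfit_mnsq, outfit_zstd, pt_measure_corr, min_criteria=1):
    """Declare an item fit when at least `min_criteria` criteria are met.

    min_criteria=1 mirrors the "one or more" rule; min_criteria=3 would
    require all three criteria at once (a stricter, assumed variant).
    """
    checks = fit_checks(outfit_mnsq, outfit_zstd, pt_measure_corr)
    return sum(checks.values()) >= min_criteria


# Hypothetical item statistics: (Outfit MNSQ, Outfit ZSTD, correlation).
items = {
    "item_A": (1.02, 0.1, 0.55),
    "item_B": (1.80, 2.6, 0.48),
    "item_C": (2.10, 3.1, 0.12),
}

for name, stats in items.items():
    print(name, fit_checks(*stats), is_fit(*stats))
```

In this made-up example, item_B passes only the correlation criterion, so it counts as fit under the "one or more" rule but would be rejected if all three criteria were required.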
The Rasch model has a special characteristic: raw data must be converted into odds ratios (Blanchin et al. 2020; Shea et al. 2012) and then transformed into logarithmic units (logits) that express the probability of a respondent's response to an item. This returns the data to their natural condition, which is a continuum. Through the Rasch model, ordinal responses can be converted into a ratio form with a better level of accuracy based on the probability principle (Mamat et al., 2014). The Rasch model evaluates quality both from the items and from the perspective of the persons working on the instrument, so that the instrument can measure consistently without being influenced to the extreme by other factors (San Martín & Rolin, 2013; Scoulas et al., 2021). The complete item misfit order can be seen in Figure 1.
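The conversion from a raw score to odds and then to logarithmic units can be illustrated as follows. This is only the elementary log-odds transformation of a proportion; the measures that Winsteps reports come from an iterative estimation and will generally differ from this raw value. The function name and the example score of 12 out of 47 are hypothetical.

```python
import math


def proportion_to_logit(correct: int, total: int) -> float:
    """Convert a raw score into log-odds (logits).

    Extreme scores (0 or all correct) have no finite logit, so they are
    rejected here.
    """
    if not 0 < correct < total:
        raise ValueError("score must be strictly between 0 and total")
    p = correct / total
    odds = p / (1.0 - p)  # odds of a correct answer
    return math.log(odds)  # natural log of the odds


# Hypothetical student who answers 12 of the 47 items correctly.
print(round(proportion_to_logit(12, 47), 2))  # prints -1.07
```

A score of exactly half the items maps to 0 logits, scores below half map to negative logits, and scores above half map to positive logits.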

Students' Conceptual Understanding of Force and Motion Materials
The analysis of conceptual understanding with the FMCE-PHQ-9 instrument, developed using the Rasch model for students of the Physics Education Study Program at Sulthan Thaha Saifuddin State Islamic University Jambi, compares the students' ability level with the item difficulty level. The data are displayed in Figure 2. Figure 2 shows that students' conceptual understanding on the FMCE-PHQ-9 test questions can be analyzed using the Winsteps output known as the Wright map. Students' conceptual understanding of force and motion was poor, because the FMCE items had a much higher difficulty level than the students' abilities. Out of 47 items, the students could do only the low-difficulty items, namely questions no. 32, 1, and 19. On average, the students could not answer correctly. This indicates that the students did not understand the concepts well (Ramlo, 2008; Smith & Wittmann, 2008; Wells et al., 2020).

Detecting Online Test Cheating
Conceptual understanding is directly related to honest behavior. Education in the industrial era 4.0 is directed more at comprehensive understanding (Honey et al., 2014; Tatus et al., 2014). Honesty is one way to study a phenomenon of conscience as a whole, so that education is not merely a process of knowledge transfer but also produces knowledge that can be applied and developed in the future (Detel, 2015; Ryckman, 2015; Sulman et al., 2021). Assessment is not merely a final test tool; it can also accompany teachers in carrying out the learning process (Kusairi, 2013). This makes it important for teachers to know the true result of a given test. An online test taken at home presents an opportunity to cheat. Cheating can be examined through the Rasch analysis in the Winsteps application by looking at the pattern of student answers on the Guttman scalogram of responses shown in Figure 3. The scalogram shows the students' attitudes or characters in the given test. No two students had the same answer pattern, so none was indicated to have cheated on the test.
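The idea of looking for students who share an answer pattern can be sketched with a simple pairwise comparison. This is not the Guttman scalogram produced by Winsteps and inspected in this study; it is a plain substitute technique shown only to make the idea concrete. The function name, the `min_agreement` threshold, the student IDs, and the answer strings are all hypothetical.

```python
from itertools import combinations


def shared_pattern_pairs(responses, min_agreement=1.0):
    """Return pairs of student IDs whose answers agree on at least
    `min_agreement` (a fraction between 0 and 1) of the items.

    `responses` maps a student ID to a string of chosen options,
    one character per item.
    """
    flagged = []
    for (id_a, ans_a), (id_b, ans_b) in combinations(sorted(responses.items()), 2):
        if len(ans_a) != len(ans_b):
            raise ValueError("all answer strings must have the same length")
        same = sum(a == b for a, b in zip(ans_a, ans_b))
        if same / len(ans_a) >= min_agreement:
            flagged.append((id_a, id_b))
    return flagged


# Hypothetical answers of three students on a five-item test.
responses = {"S01": "ABCDA", "S02": "ABCDA", "S03": "BBCDA"}

print(shared_pattern_pairs(responses))  # prints [('S01', 'S02')]
```

An identical pattern is only an indication that deserves a closer look, not proof of cheating, since two students can legitimately arrive at the same answers.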

Detecting Students' Depression
When learning patterns changed due to the Covid-19 pandemic, students responded in different ways (Zb et al., 2020). An appropriate learning model is needed to maintain students' learning outcomes as well as their enthusiasm and motivation in the learning process.
A pattern of lectures that students have never experienced before can be unwelcome because it is very difficult to change the learning patterns that have become characteristic of students. The change in habits is not just a change in the pattern of action; it changes all components of life that can be cause and effect in the learning process. Changing learning patterns can affect students' comfort and reduce their interest and motivation in learning, which results in student psychological disorders (Mozaffari et al., 2020; Wang et al., 2021). One section of the FMCE-PHQ-9 instrument can be used to detect student depression during the Covid-19 pandemic in the force and motion material. It shows the category or level of students' psychological state during the test process, so the test results can be correlated with mental health. The levels or categories of student depression are shown in Figure 4. This is a modern form of assessment that can provide a more comprehensive and in-depth picture through the test instrument, whereas assessment results usually come only from the cognitive aspect (Putra et al., 2021; Sulman et al., 2021; Thornton & Sokoloff, 1998). Of the 64 respondents who filled out the questionnaire, about 16 percent, or 10 students, did not experience depression, and 84 percent experienced depression. There were 53 percent, or 34 students, who were depressed. This early warning must make teachers aware that students are ordinary people. Many other variables can affect their mental and psychological condition. The data in the diagram must be taken into consideration and evaluated. This phenomenon must be dealt with quickly. Teachers must always think about finding the right solution in learning (Sulman et al., 2020), especially during the Covid-19 pandemic.
One solution is to start by understanding students' constraints and using the right learning model.
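This paper does not state which cut-offs were used to assign the depression categories. As a hedged illustration only, the sketch below scores one PHQ-9 response set with the severity bands commonly published for the questionnaire (total score 0 to 4 minimal, 5 to 9 mild, 10 to 14 moderate, 15 to 19 moderately severe, 20 to 27 severe). The function name and the example responses are hypothetical, and the bands may differ from the categories behind Figure 4.

```python
def phq9_severity(item_scores):
    """Return (total score, severity label) for one PHQ-9 response set.

    Each of the nine items is scored 0 to 3, so the total ranges from
    0 to 27. The bands below are the commonly published ones and are an
    assumption with respect to this study.
    """
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 has exactly nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    total = sum(item_scores)
    bands = [
        (4, "minimal"),
        (9, "mild"),
        (14, "moderate"),
        (19, "moderately severe"),
        (27, "severe"),
    ]
    for upper_bound, label in bands:
        if total <= upper_bound:
            return total, label


# Hypothetical responses of one student.
print(phq9_severity([1, 2, 1, 0, 2, 1, 1, 0, 0]))  # prints (8, 'mild')
```

A screening score of this kind is a prompt for follow-up rather than a diagnosis.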

CONCLUSION
The test instrument used for the fundamental physics course on force and motion fits the Rasch model, as indicated by an item reliability of 0.73, a person reliability of 0.00, and a Cronbach's alpha value of 0.56. Furthermore, the Outfit Mean Square (Outfit MNSQ) value is 1 in the person column and 0.1 in the item column. The Outfit Z-Standardized (Outfit ZSTD) value is 0.57 in the person column and 0.1 in the item column.
There are 31 fit items and 16 unfit items. In terms of conceptual understanding, the students could only work on questions with a low difficulty index. No students had sufficient conceptual understanding to work on questions in the difficult category.
Furthermore, the answer patterns indicate that the students did not cheat. As for depression, many students were depressed during lectures in the Covid-19 pandemic: only about 16% of students were not depressed, while 84% were. Further researchers are recommended to increase the number of research subjects so that the analysis using the Rasch model can be improved.