The Development of Higher Order Thinking Skill (HOTS) Questions in Chemistry Study (Solubility and Solubility Product Constant)

Article history: Submission: 2021-01-04; Revised: 2021-02-16; Accepted: 2021-03-08

Abstract
Developing higher order thinking skill (HOTS) questions in chemistry is needed to drive students' critical thinking skills. This research aims to analyze the quality of HOTS chemistry questions and to observe teachers' and students' responses. The method used was research and development. The sample, determined by random sampling, comprised 104 students from several high schools in Banda Aceh. The research procedure consisted of collecting data and information, product design, product development, expert validation, initial product revision, small-scale trials, product revision, large-scale trials, and final product revision. The data were analyzed by calculating the percentage score of the quality assessment and by analyzing the question items quantitatively using the Proanaltes program. The results showed that the quality of the HOTS questions developed for the Ksp (solubility product constant) lesson was 98.1%, which meets the very feasible criteria. In the quantitative item analysis, 95% of the items were valid, the reliability coefficient was 0.740 (high category), 95% of the items were of medium difficulty, and 65% had good discriminating power. These results, together with the positive responses of teachers and students, indicate that the WQC-based HOTS test instrument for the Ksp lesson helps improve students' critical thinking skills significantly.

INTRODUCTION
The implementation of Curriculum 2013 in Indonesia in recent years requires students' learning to emphasize creating, evaluating, and analyzing. For educators, the curriculum focuses on skills in developing HOTS assessment instruments, which are evaluation tools that train students' critical and creative thinking processes. Furthermore, education in the 21st century integrates knowledge, skills, attitudes, and mastery of technology (Phito, et al. 2019). For these reasons, students need to be prepared to think more critically and creatively, work well in teams, communicate well, and be computer literate. This is consistent with the statement of Janssen et al. (2019).
The results of the item review conducted by the Directorate of Senior High School Development (2019), as part of USBN assistance for 26 subjects in 136 referral schools across 34 provinces during 2018 to 2019, showed that of the 1,779 items analyzed, most were at Level-1 and Level-2. Of the 136 referral schools, only 27 compiled higher-order thinking skills questions amounting to 20% of all their USBN questions, 84 were under 20%, and the rest were still struggling with HOTS questions. This review shows that the ability of educators in Indonesia to work with HOTS questions is still low, while the curriculum demands more.
Critical thinking is the ability to analyze ideas logically, reflectively, systematically, and productively in order to understand and evaluate information and to decide whether it should be accepted, rejected, or judgment on it suspended (Indah, 2020). Critical thinking is also a major topic in modern education (Damayanti, et al., 2017). To measure the achievement of critical and creative thinking, an assessment is necessary. This assessment plays an important role in determining students' HOTS. HOTS is the ability to use the mind broadly to discover something new, and it requires a person to apply new information or prior knowledge to solve problems in dynamic situations (Kusumastuti, et al., 2019). According to Jainal and Louise (2019), HOTS is one of the competencies that students must have in chemistry learning. To determine a person's HOTS, indicators that can measure this ability are needed (Kurniati, et al., 2016). This is where HOTS question items serve as the most critical instruments.
According to Ghani, et al. (2017), using HOTS questions in exams can spur students to think deeply about how to answer and can help increase their learning motivation, because HOTS questions provide stimuli drawn from the surrounding environment. Lee and Choi (2017) also argue that HOTS plays an important role in success in education, both in academics and in the workplace, because it makes students better prepared for challenges and more creative in solving problems. In addition, HOTS questions can improve student learning outcomes so that students can be competitive both nationally and internationally (Fanani, 2018).
Based on observations and interviews with high school chemistry teachers in several areas of Banda Aceh, some teachers had never included HOTS-category questions in daily tests or semester exams. In classroom learning evaluation, teachers only provided questions at the three lowest cognitive levels, namely C1, C2, and C3. The practice questions given to students were still paper-based and rarely made use of computer facilities. It was also found that the material on solubility and the solubility product was still difficult for students to understand properly. Students' average mastery score for this material was still less than optimal, as shown by scores below the specified minimum completeness criterion (KKM) of 80 points. Given these facts, it is essential to change the assessment system. The instruments developed by teachers are expected to better improve students' higher order thinking.
One way stakeholders can take advantage of existing computer facilities in the evaluation process is to use suitable software, such as Wondershare Quiz Creator (WQC). WQC is software for creating questions, quizzes, or online (web-based) tests. The questions created can be saved in Flash format and run stand-alone on a website. With WQC, users can arrange questions of various forms and levels. Images and Flash movies can also be inserted to support students' understanding when working on the questions (Hernawati, 2009). WQC is free software that can be downloaded from https://downloads.tomsguide.com/Wondershare-QuizCreator,0301-31201.html.
HOTS-based test instruments have been developed by several researchers and have shown positive effects. Research conducted by Kusuma, et al. (2017) showed that the HOTS assessment instrument is an effective learning assessment for training HOTS and measuring students' thinking skills. Ramadhan, et al. (2019) and Kurniawan and Lestari (2019) likewise reported that the test instruments they developed could be used to measure students' HOTS. The research by Kusumadani et al. (2016) also showed that the HOTS test instrument was effective in measuring students' HOTS.
Research by Sa'adah, et al. (2019) showed that HOTS questions developed with WQC as the display medium for stoichiometry material are suitable for use. In addition, the responses of teachers and students to these questions were high, at 90.67% and 91.20% respectively, both meeting the very good criteria. Maifajir, et al. (2019) stated that the HOTS questions they developed on chemical bonding material using WQC were very feasible as independent practice for training students' HOTS and received positive responses from teachers and students. Similarly, Hidayat et al. (2020) found that the development of HOTS assessments in Bahasa Indonesia subjects also drew positive responses from teachers and students. However, the use of HOTS-based test instruments for chemistry materials is still minimal, so it is important to develop them further.
https://jurnal.unimus.ac.id/index.php/JPKIMIA/index

METHOD
The research method used is Research and Development (R&D). This research aims to produce a HOTS-based test assessment instrument. The model used is the Borg & Gall development model. The sample consisted of 104 students selected randomly from several schools in Banda Aceh, namely SMAN 4, SMAN 5, SMAN 8, and SMAN 11. The research procedure consisted of collecting data and information, product planning, product development, trials, revisions, field trials, and final product revisions. The research instruments consisted of validation sheets, question quality assessment sheets, quality assessment sheets for the computer-based test media, and student and teacher response questionnaires. The data analysis covered the quality of the instrument at the item level, namely the validity, reliability, difficulty level, and discriminating power tests, which were carried out using the Excel program Proanaltes version 6 (Khaldun, 2017). This was followed by the instrument feasibility test, analysis of the result data, and analysis of the teacher and student questionnaires.
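The reliability analysis mentioned above can be illustrated with a short sketch. The paper does not state which reliability coefficient the Proanaltes program computes, so the example below assumes the Kuder-Richardson 20 (KR-20) formula, a common choice for multiple choice items scored 0 or 1. The function name `kr20` and the sample data are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 reliability for dichotomous (0/1) item scores.

    responses: list of rows, one per student; each row is a list of
    0/1 scores, one per item. ASSUMPTION: KR-20 is used here only for
    illustration; the source does not name its reliability formula.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")

    # Variance of the students' total scores (population variance).
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    # Sum of p * q over items, where p is the proportion answering
    # correctly and q = 1 - p.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_students
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)


# Hypothetical scores: 4 students, 3 items.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(sample), 3))
```

A coefficient closer to 1 indicates more consistent measurement. The category boundaries (for example, what counts as "high") follow the criteria used by the authors and are not reproduced here.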

RESULTS AND DISCUSSION
Quality of the Developed HOTS Chemistry Questions for the Solubility and Solubility Product Constant Chapter
The validation process was carried out by material and instrument experts to determine the quality of the HOTS chemistry questions developed for the solubility and solubility product constant subject. The results of the expert validation of the material and of the WQC-based test instrument media can be seen in Table 1. A study from 2020 reported that the validators' assessment yielded scores of 98% from material experts, 87% from evaluation experts, and 92% from linguists, all of which meet the "very strong" criteria. Based on these results, the average expert validation score was 92%, with "very strong" criteria. A similar study by Khaldun, et al. (2019) found that the feasibility of computer-based HOTS-category chemistry questions developed using WQC was 85%, which qualifies as "highly valid".
The next step was validating the WQC media. This validation was carried out by instrument experts to determine the validators' assessment of WQC quality. The results of the validation can be seen in Table 2. The results in Table 3 show that the teacher assessment items are in the qualified category. The validation results also support the expectation that the WQC-based HOTS instrument will have a positive impact on chemistry learning.

Results of the Small-Scale Item Test
The trial was carried out on a small sample of 27 students with 25 question items. The results can be seen in the following table. In the validity test, 48% of the items fell into the high category, more than in any other category. The reliability test yielded a coefficient of 0.874, which is in the high and reliable category. In the difficulty test, the medium category had the largest share, at 84%. This accords with Khaldun et al. (2019), who state that items in the medium category are retained and should be recorded in the question bank. Questions in the difficult and easy categories must be revised and replaced with questions that some students are able to answer, and their wording should be more complex so that students must make more effort and think critically. Furthermore, in the discriminating power test, the largest percentage of items fell into the good and excellent categories. This suggests that the HOTS test instrument can also be used on a large sample. The discriminating power test was carried out to determine the difference between students who have mastered the concept and those who have not. This is in line with Rahayu & Djazari (2016), who state that discriminating power measures the extent to which an item can differentiate students who have mastered a competence from students who have not. The discriminating power can be seen from the size of the discrimination index of the questions.
Based on the analysis of validity, reliability, difficulty level, and discriminating power in the initial trial, 20 questions were in the good category, 4 questions were discarded, and 1 question was revised. It can be concluded that of the 25 questions tested, 20 were worthy of proceeding to the final stage.

Results of the Large-Sample Item Test
The large group trial was carried out in two groups, namely the experiment group and the control group. The HOTS question instrument was administered in the experiment group using WQC media, while in the control group it was administered manually, without media. The test instrument consisted of 20 multiple choice questions designed according to the HOTS criteria, namely analyzing (C4), evaluating (C5), and creating (C6). The large-scale sample consisted of 104 students in the experiment group and 53 students in the control group. The difference in sample sizes was caused by the Ramadhan national holiday and the students' end-of-semester exams. The test results for the large-sample items are as follows.

Validity Test
The chart indicates that in the experiment group the largest share of items, 50%, fell in the neutral category, and the control group shows a similar result. The calculation in this validity test is called the empirical validity test. This accords with Riyani et al. (2017), who note that empirical validity contains the word "empirical", which means "experience". An instrument is said to be fit for use once its empirical validity has been tested.

Reliability Test
The reliability test of the HOTS instrument on a large scale shows that the experiment group, with a coefficient r1 of 0.704, is in the high and reliable category, while the control group, with a coefficient r1 of 0.090, is in the very poor and unreliable category.
For the difficulty level, the questions in the experiment group were found to be in the medium category, while those in the control group were in the difficult category. According to Solichin (2017), a question is of good quality when its difficulty is average, neither too easy nor too difficult. Easy questions do not stimulate students to increase their effort in solving them. On the other hand, questions that are too difficult cause students to become discouraged and lose the enthusiasm to try, because the questions are out of their reach. Rahayu & Djazari (2016) also state that the difficulty level is the proportion of students who answer an item correctly out of the total number of test participants. Rahmaini & Taufiq (2018) state that the number showing whether a question is easy or difficult is called the difficulty index, symbolized in the formula by the letter P. The greater the difficulty index, the easier the question, and vice versa.
Chart 3. The results of the discriminating power of the HOTS questions in the large sample
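The difficulty index P discussed in this section, the proportion of test participants who answer an item correctly, can be sketched as follows. The function names are hypothetical, and the cut-offs of 0.30 and 0.70 used to label an item as difficult, medium, or easy are a commonly used convention assumed here for illustration; the paper does not state the cut-offs applied by the Proanaltes program.

```python
def difficulty_index(item_scores):
    """Return P: the proportion of participants who answered correctly.

    item_scores: list of 0/1 scores for ONE item, one entry per student.
    """
    if not item_scores:
        raise ValueError("item_scores must not be empty")
    return sum(item_scores) / len(item_scores)


def difficulty_category(p, low=0.30, high=0.70):
    """Label an item by its difficulty index.

    ASSUMPTION: the 0.30 / 0.70 cut-offs are illustrative defaults,
    not values taken from the study.
    """
    if p < low:
        return "difficult"
    if p <= high:
        return "medium"
    return "easy"


# Hypothetical item answered correctly by 3 of 5 students.
scores = [1, 1, 0, 0, 1]
p = difficulty_index(scores)
print(p, difficulty_category(p))
```

Because P counts correct answers, a larger P means an easier item, which matches the relationship described by Rahmaini & Taufiq (2018).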

Discriminating Power Test
The discriminating power of the questions in the experiment group is better than that in the control group. The result indicates that the questions tested in the experiment group have good discriminating power and can be used more widely. Susdelina, et al. (2018) state that the discriminating power of a question is its ability to distinguish between students with a high level of ability and students with low ability. According to Supriadi, et al. (2018), the discrimination index of a question is denoted by the letter D (for discriminatory power) and ranges from -1.00 to +1.00. The higher the discriminating power of a question, the better the question. Dewi, et al. (2018) state that analyzing discriminating power is very important for determining how well the questions measure students' actual ability. A question with poor discriminating power cannot measure students' ability, whereas the better a question's discriminating power, the better it measures that ability.
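The index D described above, which ranges from -1.00 to +1.00, is commonly computed as the difference between the proportion of correct answers in a high-scoring group and in a low-scoring group. The sketch below assumes this upper-lower group method, with the group size set by a `fraction` parameter (27% is a frequent convention). Both the method and the parameter are assumptions for illustration, since the paper does not describe how the Proanaltes program calculates D.

```python
def discrimination_index(responses, item, fraction=0.27):
    """Return D = P_upper - P_lower for one item.

    responses: list of rows, one per student, each a list of 0/1 scores.
    item: zero-based index of the item to analyze.
    fraction: share of students placed in each of the upper and lower
        groups. ASSUMPTION: 0.27 is a conventional default, not a value
        taken from the study.
    """
    if not responses:
        raise ValueError("responses must not be empty")

    # Rank students by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))

    upper = ranked[:k]   # highest-scoring students
    lower = ranked[-k:]  # lowest-scoring students

    p_upper = sum(row[item] for row in upper) / k
    p_lower = sum(row[item] for row in lower) / k
    return p_upper - p_lower


# Hypothetical scores: 4 students, 3 items, split into halves.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
for j in range(3):
    print(j, discrimination_index(sample, j, fraction=0.5))
```

A positive D means that stronger students answer the item correctly more often than weaker students. A value near zero or below suggests the item does not separate the two groups.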

Analysis of Students' Higher Order Thinking Ability Test Results
Based on the results of the final trial in the experiment group and the control group, students' higher order thinking abilities can be measured. The results can be seen in Charts 4 and 5.
Chart 4. The result of high-order thinking skills percentage in the experiment group
Of the 20 HOTS test items analyzed in the experiment group, 17 items met the good criteria, 2 items the medium criteria, and 1 item the poor criteria. The good criteria cover values of 51-76, the medium criteria 26-50, and the poor criteria 1-25. The calculation shows that students' thinking skills in the experiment group meet the good criteria. The result of the high-order thinking skills percentage in the control group can be seen in Chart 5.
Chart 5. The result of high-order thinking skills percentage in the control group
Of the 20 HOTS question items used to measure students' high-order thinking skills in the control group, 13 met the poor criteria, 5 the fair criteria, and 2 the good criteria. The results show that the students in this group are still lacking in critical and creative thinking.
Based on the analysis, teacher responses to the development of the HOTS question instrument using WQC fall in the agree and strongly agree categories, at 16.7% and 83.3% respectively. The student response analysis yielded 27.2% strongly agree, 63.3% agree, 8.8% disagree, and 0.7% strongly disagree. These results show that the majority of students support the development of the WQC-based HOTS test instrument.

CONCLUSION
Based on the research on the development of a HOTS test instrument based on Wondershare Quiz Creator (WQC), it can be concluded that the HOTS chemistry questions developed for the solubility and solubility product material obtained an average validation score of 98.1% in the qualitative item analysis, so the HOTS test questions are very feasible to use. In the quantitative item analysis, 95% of the questions were valid and 5% invalid, and the reliability coefficient was 0.740, in the high category. In terms of difficulty, 95% of the questions were medium and 5% were difficult. In terms of discriminating power, 65% of the questions were good, 30% fair, and 5% poor. Teachers and students responded positively to the computer-based HOTS question instrument developed using WQC. Among teachers, 16.7% agreed and 83.3% strongly agreed. Among students, 27.2% strongly agreed, 63.3% agreed, 8.8% disagreed, and 0.7% strongly disagreed.