Binary Logistic Regression Analysis of Variables that Influence Poverty in Central Java

Poverty is a serious problem faced by almost every country in the world. Because Central Java has a high number of poor people, the Central Java regional government incorporated poverty issues into the Regional Medium-Term Development Plan (RPJMD) as part of its effort to reduce poverty. The main objective of this research is to identify the factors that influence poverty in Central Java. Because the response variable is a dichotomous categorical variable, binary logistic regression analysis was used. The analysis did not yield a logistic regression model: no parameter was significant, since no variable had a sig. value <0.05. Consequently, none of the variables examined could be shown to affect the level of poverty in Central Java.


INTRODUCTION
The World Bank states that "Poverty is pronounced deprivation in well-being", that is, a state of lost well-being. Poverty is a serious problem faced by almost every country in the world because it can affect various aspects of people's lives (Naranjo, 2012). According to Group (2016a), one cause of poverty is a lack of the income and assets needed to meet basic needs such as food, clothing, housing, and acceptable levels of health and education. In addition, poverty occurs when people are powerless to escape the problems they face. Therefore, community empowerment and the improvement of community welfare are a very important part of the policy strategies implemented by the regions (Aneta, 2010).
Indonesia is one of the countries with a high level of poverty (Group, 2016b; Poruschi & Ambrey, 2019). Java is the island with the largest poor population, and Central Java is one of the provinces in Java with a high number of poor people and high social inequality. For this reason, the Central Java regional government incorporated poverty issues into the Regional Medium-Term Development Plan (RPJMD) as an effort to reduce poverty. Research is therefore needed to identify the factors that most influence poverty, in order to assist the government in developing the RPJMD.
Previous research on poverty was carried out by Nirwana & Istiawan (2019) under the title "Mapping Regencies / Cities in Central Java Province Based on Poverty Levels Using the K-Harmonic Means Algorithm". That study used the following variables: population, female household heads, out-of-school children, disabled individuals, individuals with chronic diseases, unemployment, unprotected drinking water sources, lighting sources other than electricity, kerosene and wood cooking fuels, and unavailable defecation facilities (BAB). It produced two poverty clusters, namely poor and not poor (Nirwana & Istiawan, 2019). These clusters are used as the categories of the response variable in this study, with poor as the success category and not poor as the failure category. Binary logistic regression was used by Ramadhani, Faiz & Zain (2014) to identify the factors affecting the status of receiving Rice for Poor Families (Raskin) in Gunung Anyar District. Their results showed that the variables influencing Raskin recipient status are gender, family members, population status, recent education, frequency of eating, floor area of residential buildings, chapel facilities, expenditure for food, asset ownership, and frequency of buying new clothes.
To find out which variables affect poverty in Central Java when the response variable is dichotomous, binary logistic regression analysis was used. The binary logistic regression model is a logistic regression model used to analyze the effect of several explanatory variables on one response variable. The response variable is dichotomous categorical data with two categories, namely category 1 for a success event and category 0 for a failure event, and the explanatory variables may be categorical, numeric, or a combination of both.
This research aims to model poverty in Central Java with the binary logistic regression approach and to find out which variables affect poverty in Central Java under that approach.

Data Source
The data used in this study are secondary data obtained from Nirwana & Istiawan (2019). The data are quantitative and consist of ten welfare-status indicators from the Integrated Database (BDT) of Central Java for 2015. Fig. 1 shows the flow diagram of the proposed method.

Variables Used
The variables used in this study consist of a response variable and explanatory variables. The response variable (Y) is poverty, coded 1 = poor and 0 = not poor. The explanatory variables are the number of residents, female household heads, out-of-school children, disabled individuals, individuals with chronic diseases, unemployment, unprotected drinking water sources, lighting sources other than electricity, kerosene and wood cooking fuel, and unavailable defecation facilities (BAB).

Binary Logistic Regression
Binary logistic regression is a statistical modeling technique in which the probability of a categorical dependent variable is related to independent variables that are numeric or categorical in scale. Suppose the dependent variable has M categories (Agresti, 2003); in the binary case, M = 2. One value of the dependent variable (the failure event) is designated as the reference category, and the probability of membership in the other category is compared with the probability of membership in the reference category (Agresti, 2003). In general, the logistic regression probability is given by the equation
$$\pi(x) = \frac{\exp(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)}{1 + \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)}$$
where $\pi(x)$ is the probability of the success category of the dependent variable, $\beta_0$ is the intercept, and $\beta_j$ ($j = 1, \dots, p$) is the parameter to be estimated for the $j$-th independent variable $x_j$.
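As a minimal sketch of how the success probability π(x) is evaluated once parameters are available, the function below assumes the standard logistic form with an intercept. The function name, the two coefficients, and the input values are hypothetical and are not estimates from this study.

```python
import math


def logistic_probability(intercept, coefficients, x):
    """Return pi(x) = exp(eta) / (1 + exp(eta)), where
    eta = intercept + sum_j coefficients[j] * x[j]."""
    if len(coefficients) != len(x):
        raise ValueError("coefficients and x must have the same length")
    eta = intercept + sum(b * v for b, v in zip(coefficients, x))
    # Use an algebraically equivalent form per sign of eta so that
    # exp() never receives a large positive argument.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


# Hypothetical parameters for two explanatory variables (illustration only).
p_success = logistic_probability(-0.5, [0.8, -0.3], [1.0, 2.0])
print(round(p_success, 4))
```

With no explanatory variables and a zero intercept the function returns 0.5, which corresponds to equal odds of the poor and not poor categories.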

Simultaneous Significance Test Results
A simultaneous test is performed to determine whether the explanatory variables jointly affect the response variable. The significance of the model as a whole is tested with the Likelihood Ratio Test, which compares the log-likelihood function using all explanatory variables with the log-likelihood function without explanatory variables. The log-likelihood values with and without explanatory variables are obtained from the iteration output. The values obtained are as follows:
Based on (David & Stanley, 2000), the test statistic used for the likelihood ratio test is as follows:
$$G = -2 \ln\left(\frac{L_0}{L_1}\right)$$
where $L_0$ is the likelihood without explanatory variables and $L_1$ is the likelihood with all explanatory variables.
Based on the G test, the explanatory variables jointly had no significant effect on the response variable, so partial testing could not be carried out either. The data therefore need to be checked again. The check is done by examining the relationships between the explanatory variables with correlation analysis.
In Table 2, it can be seen that the..... correlation analysis results is <0.05, which means there is a correlation between explanatory variables. The resulting correlation values are also above 0.5, which means the correlation between explanatory variables is very strong. This indicates possible multicollinearity, namely linear correlation among the independent variables, which can cause the logistic regression model to be insignificant.
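The likelihood ratio test described in this section can be sketched as follows, assuming the usual form G = -2(ln L0 - ln L1) compared against a chi-square distribution whose degrees of freedom equal the number of explanatory variables. The log-likelihood values below are hypothetical placeholders, not the values from this study. The tail probability uses a closed form that holds only for even degrees of freedom, chosen here to keep the sketch free of external libraries; in practice a statistics package would normally supply it.

```python
import math


def likelihood_ratio_statistic(loglik_null, loglik_full):
    """G = -2 * (ln L0 - ln L1), computed from the two log-likelihoods."""
    return -2.0 * (loglik_null - loglik_full)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability P(X > x) for EVEN df only:
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # the i = 0 term of the sum
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Hypothetical log-likelihoods (illustration only, not study results).
loglik_without = -24.0   # model with intercept only
loglik_with = -19.0      # model with ten explanatory variables
g_stat = likelihood_ratio_statistic(loglik_without, loglik_with)
p_value = chi2_sf_even_df(g_stat, df=10)
print(g_stat, round(p_value, 4))
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

For these made-up inputs G is 10.0 and the p-value is about 0.44, so the null hypothesis that all slope parameters are zero would not be rejected at the 0.05 level.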

CONCLUSION
The binary logistic regression model cannot be used to describe poverty in Central Java, because neither the simultaneous test nor the partial tests are significant. Multicollinearity between the explanatory variables is a possible cause. It is therefore necessary to handle the multicollinearity first, for example with a principal component analysis approach.
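Before a remedy such as principal component analysis is applied, the pairwise correlations between explanatory variables can be screened. The sketch below computes Pearson correlations and flags pairs whose absolute value exceeds 0.5, the cutoff mentioned in the correlation analysis. The column names and numbers are a hypothetical toy data set, not the BDT data, and the helper names are illustrative.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_a * var_b)


def flag_correlated_pairs(columns, threshold=0.5):
    """Return (name_1, name_2, r) for every pair with |r| > threshold."""
    names = list(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(columns[first], columns[second])
            if abs(r) > threshold:
                flagged.append((first, second, r))
    return flagged


# Hypothetical toy data for three explanatory variables (illustration only).
toy_columns = {
    "population": [10.0, 20.0, 30.0, 40.0, 50.0],
    "unemployment": [12.0, 19.0, 33.0, 41.0, 48.0],
    "no_electricity": [5.0, 1.0, 4.0, 2.0, 3.0],
}
flagged_pairs = flag_correlated_pairs(toy_columns)
for first, second, r in flagged_pairs:
    print(first, second, round(r, 3))
```

Pairs flagged in this way are candidates for being combined or replaced by principal components before the logistic regression is refitted.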