K-MEDOIDS ALGORITHM CLUSTERING WITH PRINCIPAL COMPONENT ANALYSIS (PCA) (CASE STUDY: DISTRICTS/CITIES ON THE BORNEO ISLAND BASED ON POVERTY INDICATORS IN 2021)

Muhammad Yafi(1*), Rito Goejantoro(2), Andrea Tri Rian Dani(3)


(1) Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Mulawarman University, Samarinda
(2) Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Mulawarman University, Samarinda
(3) Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Mulawarman University, Samarinda
(*) Corresponding Author

Abstract


Cluster analysis is a data mining technique that groups data objects based on the information they contain. This study uses the K-Medoids algorithm, a non-hierarchical clustering method, to group districts/cities on Borneo island based on poverty indicators, with Principal Component Analysis (PCA) used to reduce the number of research variables. A cluster validity test using the Silhouette Coefficient (SC) is also carried out to determine which number of clusters gives the best grouping. The analysis yielded 3 optimal principal components (PCs) under the criterion of an eigenvalue greater than or equal to 1. The districts/cities on Borneo island were then grouped using these PCs, and 2 clusters were found to be optimal, with an SC value of 0.61. The resulting K-Medoids clusters are cluster 1, consisting of 49 districts/cities, and cluster 2, consisting of 7 cities.
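The pipeline summarized in the abstract (PCA with an eigenvalue criterion of at least 1, K-Medoids on the retained components, and selection of the number of clusters by Silhouette Coefficient) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic, the Euclidean distance and the candidate range k = 2..5 are assumptions, and the medoid update shown is a simple alternating variant rather than the full swap-based PAM procedure. All function names are hypothetical.

```python
import numpy as np

def pca_kaiser(X):
    """Standardize X and keep the principal components with eigenvalue >= 1."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= 1.0                        # eigenvalue criterion from the abstract
    return Z @ eigvecs[:, keep], eigvals

def pairwise_dist(A):
    """Euclidean distance matrix (assumed metric; the abstract does not name one)."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def k_medoids(D, k, n_iter=100, seed=0):
    """Alternating K-Medoids: assign to nearest medoid, then re-pick each medoid."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(D), size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member with the smallest total distance to the others.
                within = D[np.ix_(members, members)].sum(axis=0)
                new_medoids[c] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = D[:, medoids].argmin(axis=1)
    return medoids, labels

def silhouette_coefficient(D, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all objects."""
    clusters = np.unique(labels)
    s = np.zeros(len(D))
    for i in range(len(D)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue                             # singleton cluster: s(i) = 0
        a = D[i, own].sum() / (n_own - 1)        # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s.mean()

# Synthetic stand-in data: 56 objects, 6 hypothetical indicators (not the study's data).
rng = np.random.default_rng(42)
X = rng.normal(size=(56, 6))
X[:7] += 4.0                                     # shift a small group so clusters exist

scores, eigvals = pca_kaiser(X)
D = pairwise_dist(scores)

results = {}
for k in range(2, 6):                            # assumed candidate range
    medoids, labels = k_medoids(D, k)
    results[k] = silhouette_coefficient(D, labels)

best_k = max(results, key=results.get)
print("components kept:", scores.shape[1])
print("silhouette by k:", {k: round(float(v), 3) for k, v in results.items()})
print("best k:", best_k)
```

Because medoids are actual observations, the algorithm needs only the distance matrix `D`, which is why the sketch computes it once and reuses it for both clustering and validation.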


Keywords


K-Medoids; PCA; Poverty; Silhouette Coefficient




DOI: https://doi.org/10.26714/jsunimus.11.2.2023.31-43



Copyright (c) 2023 Jurnal Statistika Universitas Muhammadiyah Semarang

Editorial Office:
Department of Statistics
Faculty Of Mathematics And Natural Sciences
 
Universitas Muhammadiyah Semarang

Jl. Kedungmundu No. 18 Semarang Indonesia



Published by: 
Department of Statistics Universitas Muhammadiyah Semarang



This work is licensed under a Creative Commons Attribution 4.0 International License