K-Means Algorithm for Grouping Provinces in Indonesia Based on Macroeconomic and Criminality Indicators

Andrea Tri Rian Dani(1*), Fachrian Bimantoro Putra(2), Meirinda Fauziyah(3), Sifriyani Sifriyani(4), Suyitno Suyitno(5), M Fathurahman(6)


(1) Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda
(2) Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda
(3) Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda
(4) Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda
(5) Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda
(6) Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda
(*) Corresponding Author

Abstract


Cluster analysis is a multivariate method that groups n observations into K groups (K ≤ n) based on their characteristics. One of the best-known clustering algorithms is K-Means, a non-hierarchical method in which the number of groups must be specified before the algorithm is run. The K-Means algorithm can be applied to group provinces in Indonesia based on macroeconomic indicators (percentage of poor people, open unemployment rate, and Gini ratio) and the crime rate. The goal of this research is to obtain an optimal grouping. The distance measure used is the Euclidean distance. The numbers of groups tested were K = 2, 3, 4, …, 10, and the number of groups with the highest Silhouette value was selected as optimal. Based on the analysis, the optimal number of clusters is four, and each of the four clusters has characteristics that distinguish it from the others.
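The selection procedure described in the abstract (run K-Means with Euclidean distance for K = 2, …, 10 and keep the K with the highest mean Silhouette value) can be sketched as follows. This is a minimal illustration, not the authors' implementation. Several things are assumptions that do not come from the source: the toy data are made up and are not the paper's provincial data; the indicators are standardized to z-scores before clustering, which the abstract does not mention; and a deterministic farthest-point seeding replaces the usual random initialization so that the sketch is reproducible. The names `TOY`, `standardize`, `farthest_first`, `kmeans`, `silhouette`, and `choose_k` are hypothetical.

```python
import math

# Hypothetical toy data, NOT the paper's data: 12 made-up "provinces" with
# columns (poor population %, open unemployment %, Gini ratio, crime rate).
TOY = [
    [5.0, 4.00, 0.300, 50], [5.1, 4.04, 0.301, 51], [4.9, 3.96, 0.299, 49],
    [5.0, 8.00, 0.400, 150], [5.1, 7.96, 0.401, 149], [4.9, 8.04, 0.399, 151],
    [15.0, 4.00, 0.400, 50], [15.1, 4.04, 0.399, 51], [14.9, 3.96, 0.401, 49],
    [15.0, 8.00, 0.300, 150], [14.9, 8.04, 0.301, 151], [15.1, 7.96, 0.299, 149],
]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def standardize(rows):
    """Column-wise z-scores (an assumption; the abstract does not state this step)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def farthest_first(points, k):
    """Deterministic seeding (assumption): start at the first point, then
    repeatedly add the point farthest from its nearest chosen seed."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(euclidean(p, s) for s in seeds)))
    return [list(s) for s in seeds]

def kmeans(points, k, max_iter=100):
    """Lloyd's iterations: assign each point to its nearest centroid, then
    recompute centroids as cluster means, until the labels stop changing."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new = [min(range(k), key=lambda j: euclidean(p, centroids[j])) for p in points]
        if new == labels:
            break
        labels = new
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean silhouette width; points in singleton clusters score 0."""
    clusters = {}
    for i, l in enumerate(labels):
        clusters.setdefault(l, []).append(i)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(euclidean(p, points[j]) for j in idx) / len(idx)
                for l, idx in clusters.items() if l != labels[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def choose_k(points, k_values):
    """Return the K with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores

z = standardize(TOY)
best, scores = choose_k(z, range(2, 11))
for k in sorted(scores):
    print(k, round(scores[k], 3))
print("selected K:", best)
```

The toy data contain four tight, well-separated planted groups, so the sketch selects K = 4 on them; this says nothing about the paper's own result, which was obtained on real provincial data.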

Keywords


Euclidean Distance; K-Means; Criminality; Silhouette

DOI: https://doi.org/10.26714/jsunimus.11.2.2023.12-21


Copyright (c) 2023 Jurnal Statistika Universitas Muhammadiyah Semarang

Editorial Office:
Department of Statistics
Faculty of Mathematics and Natural Sciences
Universitas Muhammadiyah Semarang
Jl. Kedungmundu No. 18, Semarang, Indonesia

Published by:
Department of Statistics, Universitas Muhammadiyah Semarang

This work is licensed under a Creative Commons Attribution 4.0 International License