DECISION TREE-BASED GRADIENT BOOSTING: ALGORITHM TO APPROACH HOUSE PRICE PREDICTION IN JAKARTA, BOGOR, DEPOK, TANGERANG, BEKASI (JABODETABEK)

Intan Lisnawati(1*), Anjasmoro Adi Nugroho(2)


(1) National Central University
(2) Innopolis University
(*) Corresponding Author

Abstract


House sale prices matter to both sellers and buyers, whether the house is for personal use or investment. Buyers are commonly newly married couples, parents, or investors. House prices today are higher than in earlier years because of changing conditions over time. Forecasting estimates the price at which a house with given features fits the market. In this study, we complement previous research on house prices and analyze the results. We also break down the algorithm and sketch its steps so that readers can follow the method more easily. Exploratory data analysis is carried out to examine the characteristics of the dataset. We apply decision tree-based gradient boosting to a dataset of house prices and house features from Jakarta, Bogor, Depok, Tangerang, and Bekasi (Jabodetabek). The model attains an RMSE of Rp277,369,397 and a MAPE of 17.3%. These error values suggest that gradient boosting is competitive with other methods for predicting house prices.
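As a rough illustration of the method and the two error measures named in the abstract, the following minimal sketch fits gradient boosting with one-split regression trees (stumps) under squared-error loss and then reports RMSE and MAPE. It is not the authors' implementation: the function names, the single feature (building area), the toy prices in millions of rupiah, and the settings of 50 rounds with a learning rate of 0.1 are all assumptions made for illustration only.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_stump(xs, residuals):
    """Find the one-split regression tree that minimizes squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps
    )


# Hypothetical toy data: building area (m^2) and price (million rupiah).
areas = [36, 45, 60, 72, 90, 120, 150, 200]
prices = [350.0, 420.0, 600.0, 750.0, 900.0, 1300.0, 1700.0, 2500.0]

model = fit_gradient_boosting(areas, prices)
fitted = [predict(model, a) for a in areas]
print("Training RMSE:", rmse(prices, fitted))
print("Training MAPE (%):", mape(prices, fitted))
```

The errors printed here are training errors on made-up numbers, so they say nothing about the study's results. A real evaluation would compute both metrics on held-out data and use deeper trees over several features.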


Keywords


Boosting, prediction, ensemble method, house price, root mean square error


DOI: https://doi.org/10.26714/jsunimus.12.2.2024.1-9


Copyright (c) 2024 Jurnal Statistika Universitas Muhammadiyah Semarang

Editorial Office:
Department of Statistics
Faculty Of Mathematics And Natural Sciences
 
Universitas Muhammadiyah Semarang

Jl. Kedungmundu No. 18 Semarang Indonesia



Published by: 
Department of Statistics Universitas Muhammadiyah Semarang

This work is licensed under a Creative Commons Attribution 4.0 International License