Data Management Analysis for Predicting Stroke using RapidMiner

Hendy Tannady; Johanes Fernandes Andry; Kosasi Kosasi

doi:10.61487/jiste.v2i3.95

Authors

Hendy Tannady, Universitas Esa Unggul
Johanes Fernandes Andry, Universitas Bunda Mulia
Kosasi Kosasi, Universitas Bunda Mulia

Keywords:

stroke, data mining, decision tree

Abstract

Stroke is known as the second leading cause of death. For this reason, data mining techniques are already being used to predict which patients may have a stroke. In this study, we apply data mining techniques in RapidMiner to find information and patterns related to stroke in a dataset obtained from Kaggle. Three techniques are used: classification with decision trees, association rule mining with the FP-Growth algorithm, and clustering with the k-Means algorithm. The dataset is processed with the operators that RapidMiner provides. Because the data are imbalanced, the resulting decision tree model correctly predicted only 68.75% of the patients with stroke. The association rule technique showed that most attributes in the dataset are not strongly associated with each other. The clustering technique grouped the patients and showed that most patients who have had a stroke are, on average, 58 years old, with a BMI of 31 and an average glucose level of 201.
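The effect of imbalanced data on a classifier can be illustrated with a minimal sketch. The study itself was carried out with RapidMiner operators, so the Python code below is not the authors' workflow; the labels, the counts, and the helper names (confusion_counts, accuracy, recall) are hypothetical and chosen only to show why overall accuracy can look high while the share of correctly identified stroke cases stays low.

```python
# Illustrative sketch only: hypothetical labels, not the Kaggle dataset used in the study.

def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fn, fp, tn) for a binary classification result."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return tp, fn, fp, tn

def accuracy(tp, fn, fp, tn):
    """Share of all patients that were classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)

def recall(tp, fn):
    """Share of actual stroke patients that the model identified."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical imbalanced sample: 95 patients without stroke (0), 5 with stroke (1).
actual = [0] * 95 + [1] * 5
# Hypothetical model output: one false alarm, and only 2 of the 5 stroke cases found.
predicted = [0] * 94 + [1] + [1] * 2 + [0] * 3

tp, fn, fp, tn = confusion_counts(actual, predicted)
print(f"accuracy = {accuracy(tp, fn, fp, tn):.2f}")
print(f"recall for the stroke class = {recall(tp, fn):.2f}")
```

In this toy sample the majority class dominates the accuracy figure, so the per-class measure for stroke patients is the more informative one when the classes are imbalanced.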
