Implementation of Genetic Algorithm - Random Oversampling on the Random Forest Algorithm to Address Imbalanced Blood Donor Eligibility Data

Authors

  • Al Janatul Ulvivianti, Universitas Muhammadiyah Kalimantan Timur
  • Taghfirul Azhima Yoga Siswa, Universitas Muhammadiyah Kalimantan Timur, https://orcid.org/0000-0003-2017-8538
  • Wawan Joko Pranoto

Keywords

Blood Donation, Genetic Algorithm, Imbalanced Data, Random Forest, Random Oversampling

Abstract

Purpose: This study aims to improve the accuracy of blood donor classification by addressing data imbalance using machine learning techniques. Accurate classification of donor eligibility is crucial for maintaining a reliable blood supply. To achieve this, the research explores the integration of the Random Forest algorithm with the Genetic Algorithm (GA) for feature selection and optimization, alongside Random Oversampling (RO) for data balancing.
Methods: A Random Forest classifier is combined with GA-based feature selection and optimization, and Random Oversampling is applied to handle the class imbalance in the dataset. Model performance is evaluated with 10-fold cross-validation. The dataset consists of blood donor records from the Indonesian Red Cross (PMI) in Samarinda City for 2023–2024.
Results: Applying Random Oversampling significantly improved the model’s accuracy, which reached 99.94%. Used independently, GA feature selection and GA optimization produced no notable improvement, and when both GA techniques were applied together, accuracy decreased to 98.78%.
Conclusions: The study confirms that Random Oversampling is highly effective in improving classification accuracy for blood donor eligibility. However, the integration of GA for feature selection and optimization did not yield additional benefits and even reduced accuracy when applied together. Future research could explore alternative feature selection and optimization methods to further enhance classification performance.
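
The evaluation setup described in the Methods (Random Oversampling, Random Forest, 10-fold cross-validation) can be illustrated with a minimal sketch. It is not the authors' code: the synthetic data, the function names `random_oversample` and `cross_validate_with_oversampling`, the number of trees, and the seeds are all assumptions made for illustration. The abstract does not state whether oversampling is applied before or inside the folds; this sketch resamples only the training part of each fold, so duplicated rows never appear in the held-out part.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold


def random_oversample(X, y, rng):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = [np.arange(len(y))]  # start with every original row
    for cls, count in zip(classes, counts):
        if count < target:
            cls_idx = np.flatnonzero(y == cls)
            # Sample with replacement to fill the gap to the majority size.
            keep.append(rng.choice(cls_idx, size=target - count, replace=True))
    idx = np.concatenate(keep)
    return X[idx], y[idx]


def cross_validate_with_oversampling(X, y, n_splits=10, seed=0):
    """Mean accuracy over stratified folds. Only the training part of each
    fold is oversampled (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in folds.split(X, y):
        X_train, y_train = random_oversample(X[train_idx], y[train_idx], rng)
        model = RandomForestClassifier(n_estimators=50, random_state=seed)
        model.fit(X_train, y_train)
        # Score on the untouched, still imbalanced held-out fold.
        scores.append(model.score(X[test_idx], y[test_idx]))
    return float(np.mean(scores))


# Synthetic stand-in for the donor records: about 90% class 0, 10% class 1.
X_demo, y_demo = make_classification(
    n_samples=500, n_features=8, weights=[0.9, 0.1], random_state=0
)
mean_accuracy = cross_validate_with_oversampling(X_demo, y_demo)
print(f"Mean 10-fold accuracy: {mean_accuracy:.4f}")
```

On imbalanced data, accuracy alone can look high even when the minority class is poorly predicted, so a variant of this sketch could also record per-class recall in each fold.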

Published

2025-05-26

How to Cite

Implementation of Genetic Algorithm - Random Oversampling on the Random Forest Algorithm to Address Imbalanced Blood Donor Eligibility Data. (2025). International Journal of Artificial Intelligence and Information Technology (IJAIIT), 1(1), 33-50. https://publish.umam.edu.my/index.php/ijaiit/article/view/55