Vol. 1 No. 1 (2025)

Published: June 30, 2025

Pages: 31-37

Original Article

Utilizing Machine Learning Algorithms and SMOTE for Analyzing and Predicting Homicides

Abstract

This study analyzes United States homicide data from 1980 to 2014, using machine learning techniques to predict crime resolution and to classify victim gender. The dataset, obtained from the FBI Supplementary Homicide Report, contains 638,454 records. Preprocessing involved cleaning the data, converting categorical features to numerical values, and addressing class imbalance with the Synthetic Minority Oversampling Technique (SMOTE). Several classification algorithms were applied, including Decision Tree and Naïve Bayes. The Decision Tree model achieved 95% accuracy in predicting crime resolution and 85% accuracy in classifying victim gender, while Naïve Bayes reached 92% accuracy in predicting crime resolution. These findings indicate that machine learning can be effective for analyzing and predicting crime patterns, and that it can help law enforcement make more informed investigative decisions.
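The abstract does not say how SMOTE was implemented, so the following is only a minimal pure-Python sketch of the general idea: each synthetic minority sample is an interpolation between a minority point and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy data are assumptions made for illustration; they are not taken from the study or from any particular library.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class neighbours.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up numeric features; label 1 is the minority class.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4], [0.4, 0.1],
     [2.0, 2.1], [2.2, 1.9]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

minority = [x for x, label in zip(X, y) if label == 1]
n_needed = y.count(0) - y.count(1)  # synthetic samples required to balance classes
new_points = smote(minority, n_needed, k=5, seed=42)

X_balanced = X + new_points
y_balanced = y + [1] * len(new_points)
print(y_balanced.count(0), y_balanced.count(1))  # prints "6 6"
```

In practice, oversampling of this kind is normally applied to the training split only, after categorical features have been encoded numerically, so that synthetic points do not leak into the evaluation data.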
