Classification of Breast Cancer Using Data Mining

  • Farah Sardouk MSc. Department of Electrical Engineering, Altinbaş University, Istanbul, Turkey
  • Dr. Adil Deniz Duru Assist. Prof. Department of Physical Education and Sports, Marmara University, Istanbul, Turkey
  • Dr. Oğuz Bayat Assoc. Prof. Department of Electrical Engineering, Altinbaş University, Istanbul, Turkey
Keywords: ANN, Artificial Neural Network, BMI, KDD, k-fold cross validation, PPV, WHO.


According to the World Health Organization (WHO), breast cancer is the most widespread disease among women and can be fatal. Precautions and regular examinations are the main options for preventing this cancer, and recognizing the disease at an early stage makes it easier to combat. From a data science perspective, data mining techniques can be used to detect the disease from parameters such as BMI, age, and routine blood sugar data. Applying these techniques has produced promising results that may aid breast cancer diagnosis. In this research, the Coimbra dataset is studied using 10 predictors, which we use to estimate whether breast cancer is present. Six algorithms are compared according to their performance in WEKA and in MATLAB. The comparison shows that data mining algorithms can support medical decision-making with good precision.
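The evaluation idea named in the keywords (k-fold cross validation scored with PPV) can be illustrated with a minimal Python sketch. It is not the authors' WEKA or MATLAB pipeline: the one-feature threshold classifier, the function names, and the toy values below are all assumptions made for illustration, and the numbers are synthetic rather than taken from the Coimbra dataset.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k (train, test) pairs of contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first (n_samples % k) folds get one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def ppv(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Positive predictive value: TP / (TP + FP); 0.0 if nothing was predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def fit_threshold(xs: Sequence[float], ys: Sequence[int]) -> float:
    """Hypothetical stand-in classifier: midpoint between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def cross_validate(xs: Sequence[float], ys: Sequence[int], k: int) -> Tuple[float, float]:
    """Return (accuracy, PPV) pooled over the held-out predictions of all k folds."""
    y_true, y_pred = [], []
    for train, test in k_fold_indices(len(xs), k):
        threshold = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            y_true.append(ys[i])
            y_pred.append(1 if xs[i] > threshold else 0)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, ppv(y_true, y_pred)


# Synthetic, made-up single-feature values (NOT Coimbra data); 1 = patient, 0 = control.
TOY_X = [70, 150, 80, 140, 75, 160, 85, 145, 90, 155]
TOY_Y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

if __name__ == "__main__":
    acc, precision = cross_validate(TOY_X, TOY_Y, k=5)
    print(f"accuracy={acc:.2f} PPV={precision:.2f}")
```

Pooling the held-out predictions before computing PPV avoids averaging over folds in which no sample was predicted positive. Comparing several algorithms would amount to replacing `fit_threshold` with each candidate model while keeping the same folds.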

