Detection of Rare Events: The Need to Know the Customer

Authors

  • Ayse Humeyra Bilge, Kadir Has University, Istanbul 34083, Turkey
  • Tarkan Ozmen
  • Ayse Tosun

Keywords:

Time series analysis, unbalanced data, clustering and classification, customer scoring

Abstract

The prediction of customer complaints based on a time series of invoices is a two-stage process: determining anomalies in the sequence of invoices, and assessing the response of the customers to these anomalies. In the telecommunication sector, the average complaint rate is approximately 10?? hence the prediction of customer complaints falls in the realm of rare event detection. Detecting rare events poses a significant challenge when working with unbalanced datasets. In machine learning applications, oversampling of the minority class and undersampling of the majority class in the training set are well-known preprocessing tools for creating a more balanced set. In previous work [14], we proposed a cluster-based undersampling approach as an alternative to random undersampling of the majority class. It splits heterogeneous data into homogeneous subsets, using Principal Component Analysis, to reduce variability within clusters. In the present work, we propose a method for assessing the response of the customers to anomalies detected in the time series of invoices.
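
The general idea of cluster-based undersampling can be illustrated with a minimal sketch. It is not the procedure of [14]: the median split on the first principal component, the power-iteration estimate of that component, and the function names `first_principal_component` and `cluster_undersample` are all assumptions made for illustration only.

```python
import random


def first_principal_component(rows, iters=100):
    """Approximate the leading principal direction by power iteration.

    Returns the unit direction and the projection score of every row.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    v = [1.0] * d  # arbitrary starting vector (illustrative choice)
    for _ in range(iters):
        # w = X^T (X v), proportional to the covariance matrix applied to v
        scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


def cluster_undersample(majority, n_keep, seed=0):
    """Split majority rows at the median first-PC score, then sample each half.

    Each half contributes in proportion to its size, so the retained set
    has roughly n_keep rows (rounding may shift the total by one).
    """
    rng = random.Random(seed)
    _, scores = first_principal_component(majority)
    median = sorted(scores)[len(scores) // 2]
    clusters = [[], []]
    for row, s in zip(majority, scores):
        clusters[0 if s < median else 1].append(row)
    kept = []
    for cluster in clusters:
        share = round(n_keep * len(cluster) / len(majority))
        kept.extend(rng.sample(cluster, min(share, len(cluster))))
    return kept
```

Random undersampling would instead draw `n_keep` rows from the whole majority class at once. Sampling within each subset keeps every subset represented in the reduced training set.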

References

M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, F. Herrera, “A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches”, IEEE Trans. Syst. Man Cybern. – Part C, 42 (4), 463–484, 2012

C. Beyan and R. Fisher, “Classifying imbalanced data sets using similarity-based hierarchical decomposition,” Pattern Recognition, vol. 48, no. 5, pp. 1653–1672, 2015.

M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, F. Herrera, A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches, IEEE Trans. Syst. Man Cybern. – Part C 42 (4) (2012) 463–484.

N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.

Azhar, N.A., M.S.M. Pozi, A.M. Din, and A. Jatowt. “An Investigation of SMOTE Based Methods for Imbalanced Datasets with Data Complexity Analysis.” IEEE Transactions on Knowledge and Data Engineering 35, no. 7 (July 1, 2023): 6651–72. doi:10.1109/TKDE.2022.3179381.

Jason Van Hulse, Taghi M Khoshgoftaar, and Amri Napolitano. “An empirical comparison of repetitive undersampling techniques.” In Information Reuse & Integration, 2009. IRI’09. IEEE International Conference on, pages 29–34. IEEE, 2009.

Hasanin, T., & Khoshgoftaar, T. (2018). “The Effects of Random Undersampling with Simulated Class Imbalance for Big Data.” 2018 IEEE International Conference on Information Reuse and Integration (IRI), 70–79. doi:10.1109/IRI.2018.00018

C. Seiffert, T. Khoshgoftaar, J. Van Hulse, A. Napolitano, “RUSBoost: a hybrid approach to alleviating class imbalance”, IEEE Trans. Syst. Man Cybern. – Part A 40 (1), 185–197, 2010

R. Barandela, R.M. Valdovinos, J.S. Sanchez, “New applications of ensembles of classifiers”, Pattern Anal. Appl. 6, 245–256, 2003.

K. Randhawa, C. K. Loo, M. Seera, C. P. Lim, and A. K. Nandi, “Credit Card Fraud Detection Using AdaBoost and Majority Voting”, IEEE Access, Vol. 6, pp. 14277–14284, 2018.

M. Zareapoor and P. Shamsolmoali, “Application of Credit Card Fraud Detection: Based on Bagging Ensemble Classifier”, Procedia Computer Science, Vol. 48, pp. 679–685, 2015.

J. O. Awoyemi, A. O. Adetunmbi, and S. A. Oluwadare, “Credit card fraud detection using machine learning techniques: A comparative analysis”, In: Proc. of 2017 International Conference on Computing Networking and Informatics (ICCNI), pp. 1–9, 2017.

Bilge, A. H., Ogrenci, A. S., Carpanali, H., Aktunc, E. A., Atas, F., Ozmen, T., & Kaya, B. E. (2022). Detection of Expenditure Trends in the Telecommunication Sector. American Scientific Research Journal for Engineering, Technology, and Sciences, 90(1), 340–350.

Bilge, A., Ogrenci, A. S., Carpanali, H., Ozmen, T., Tosun, A., & Cakar, K. (2023). Detection of Rare Events: Cluster Based Preprocessing of the Training Set: The Case on Complaints for Invoice Time Series.

Published

2025-02-12

How to Cite

Bilge, A. H., Ozmen, T., & Tosun, A. (2025). Detection of Rare Events: The Need to Know the Customer. American Scientific Research Journal for Engineering, Technology, and Sciences, 101(1), 138–160. Retrieved from https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/11239

Section

Articles