Naufal, Hafizh Iman and Efendi, Achmad and Sumarminingsih, Eni (2023) Bayesian Additive Regression Trees for Classification of Unbalanced Class of Credit Collectability Data. Asian Journal of Probability and Statistics, 23 (1). pp. 16-27. ISSN 2582-0230
Efendi2312023AJPAS100802.pdf - Published Version
Abstract
Aims: This study aims to determine the classification performance of the Bayesian Additive Regression Trees (BART) method on bank credit collectability data that exhibit class imbalance.
Study Design: Quantitative Design.
Place and Duration of Study: The data are secondary data consisting of bank debtors' credit collectability records, with nine predictor variables and one response variable (credit collectability). They were collected from banks in East Java, Indonesia, from 01 May 1986 to 31 May 2018.
Methodology: The Bayesian approach is an increasingly popular estimation method in statistics, because rapid advances in technology have made its computational demands far less of an obstacle. Bayesian estimation continues to develop and can be applied to a range of statistical methods, including both regression and classification. The Classification and Regression Trees (CART) method is one of the most widely used classification methods. In a bank, debtors with delinquent credit make up a small proportion compared with debtors with current credit. Standard classifiers such as CART are poorly suited to this case, because CART is sensitive to the class with the larger proportion. Hence, additional methods such as the ensemble method BART (Bayesian Additive Regression Trees) are needed to increase classification accuracy under class imbalance.
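To illustrate why class imbalance is a problem for standard classifiers, here is a minimal sketch in Python. The 90/10 class split and the label names are hypothetical and are not taken from the study's data. The "classifier" simply predicts the majority class every time: its overall accuracy looks high, yet it never identifies a delinquent debtor.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_class(y_true, y_pred):
    """Recall per class: correct predictions of a class divided by its true count."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical labels (an assumption for illustration only):
# 90 debtors with current credit, 10 with delinquent credit.
y_true = ["current"] * 90 + ["delinquent"] * 10

# A degenerate classifier that always predicts the majority class.
y_pred = ["current"] * 100

baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall_per_class(y_true, y_pred)

print(baseline_accuracy)  # 0.9
print(baseline_recall)    # {'current': 1.0, 'delinquent': 0.0}
```

In this toy case, an accuracy of 90% coexists with zero recall on the minority class, so accuracy on imbalanced data is more informative when read alongside per-class measures.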
Results: Cross-validation of the BART model shows a consistent classification accuracy of 83.49%, indicating that BART performs consistently despite the class imbalance. The classification accuracy is 84.53% on the training data and 85.48% on the testing data. These results also suggest that BART is able to avoid overfitting, which often occurs in classification methods with very good classification ability.
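The kind of cross-validation check reported above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the abstract does not give the number of folds or say whether the folds were stratified, so `k=5` and the stratified split are assumptions. The majority-class model is a placeholder standing in for BART, and the data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each index to one of k folds, keeping class proportions roughly equal."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def cross_validated_accuracy(features, labels, fit, predict, k=5):
    """Return (mean accuracy, per-fold accuracies) over k stratified folds."""
    scores = []
    for held_out in stratified_folds(labels, k):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k, scores


# Placeholder model (NOT BART): always predicts the most common training label.
def fit_majority(train_features, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, feature_row):
    return model


# Hypothetical data: 90 current and 10 delinquent debtors, dummy features.
labels = ["current"] * 90 + ["delinquent"] * 10
features = [[0.0] for _ in labels]

mean_acc, fold_scores = cross_validated_accuracy(
    features, labels, fit_majority, predict_majority, k=5
)
print(mean_acc, fold_scores)
```

Per-fold accuracies that stay close to one another, and a mean close to the training and testing accuracies, are what "consistent" accuracy means in this setting. Replacing `fit_majority` and `predict_majority` with a real BART implementation would leave the evaluation loop unchanged.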
Conclusion: The accuracy on the testing data is close to that on the training data, which indicates that the BART method has been able to capture the patterns in the data.
Item Type: | Article |
---|---|
Subjects: | OA STM Library > Mathematical Science |
Depositing User: | Unnamed user with email support@oastmlibrary.com |
Date Deposited: | 12 Jun 2023 04:34 |
Last Modified: | 16 Sep 2024 10:17 |
URI: | http://geographical.openscholararchive.com/id/eprint/1079 |