COMPARISON OF BAGGING, BOOSTING, AND STACKING ENSEMBLE MODELS FOR AIRLINE CUSTOMER SATISFACTION ANALYSIS

Authors

  • Melvin Lee, UPH Medan - Universitas Pelita Harapan, Medan

DOI:

https://doi.org/10.19166/jstfast.v8i1.8166

Keywords:

airline satisfaction, bagging, boosting, ensemble learning, stacking

Abstract

With the end of the COVID-19 pandemic and the subsequent lockdowns last year, air travel has surged, with a year-on-year increase of 30.1% according to one report. The rise in passenger numbers gives airline carriers a good opportunity to recoup losses incurred during the lockdowns, and competition has intensified as rival carriers try to attract new and returning customers. To remain competitive, more and more companies are turning to machine learning to analyze large amounts of data and gain an edge over their competitors, with ensemble learning being one of the many methods employed for this analysis. This study compares the Decision Tree, Random Forest, Boosting, and Stacking methods on the Airline Satisfaction dataset, which was cleaned by removing null values and converting data types. The models are compared using the confusion matrix; the precision, recall, F1-score, and accuracy metrics; the ROC curve; and feature importances. The results show that the three ensemble classifiers achieve nearly identical overall success rates: 96.117% for Bagging, 96.037% for Boosting, and 96.264% for Stacking, with Stacking the highest of the three. These results indicate that the differences in efficacy among the three main ensemble learning methods are almost negligible. Additional studies with larger datasets and a wider variety of ensemble learning methods could strengthen these conclusions.
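
The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the study's actual code: it assumes scikit-learn, uses a synthetic dataset from `make_classification` as a stand-in for the Airline Satisfaction data, and picks Random Forest as the bagging model, gradient boosting as the boosting model, and a logistic-regression meta-learner for stacking. All hyperparameters and variable names are assumptions, and the scores it prints will not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary data standing in for the cleaned airline dataset.
X, y = make_classification(
    n_samples=1000, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumption: one representative model per ensemble family.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "bagging_random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "boosting_gradient": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "stacking": StackingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
            ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=3,
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]  # positive-class score for ROC AUC
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
        "confusion_matrix": confusion_matrix(y_test, pred),
    }

for name, r in results.items():
    print(f"{name}: accuracy={r['accuracy']:.3f} f1={r['f1']:.3f} auc={r['roc_auc']:.3f}")
```

Feature importances, which the study also compares, are available on the fitted tree-based models through their `feature_importances_` attribute; the stacking model does not expose one directly, so its base estimators would have to be inspected individually.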




Author Biography

Melvin Lee, UPH Medan - Universitas Pelita Harapan, Medan

A master's student in Computer Science at UPH Medan.


Published

2024-05-31

Section

Articles