Prediction Five Feature Importance for Intention to Enroll High School Student using Random Forest and Decision Tree

Authors

  • Hendra Achmadi Universitas Pelita Harapan

Keywords:

Mapping the decision of the high school student, Data Mining, Random forest, Decision tree

Abstract

The number of prospective students enrolling in higher education, especially in private universities, is a serious concern. The decline in prospective students during the COVID-19 pandemic, from 2019 to 2022, has made this a serious problem for private universities in Indonesia. This research therefore focuses on identifying the main characteristics of high school students who choose private universities in Jakarta and its surroundings. The research method is data mining on primary data obtained from questionnaires distributed to high school students in grades 11 and 12 in the area. Of the 438 respondents, 295 remained after data cleaning. The Random Forest method is used to determine the five most important features, and supervised learning is used to map which features high school students take into consideration when making their decision. The random forest algorithm achieves an accuracy of 67 percent. A decision tree algorithm is then used to map the decisions of the high school students. The results indicate that the main consideration is the father's education, followed by the school the student comes from, the mother's education, the amount of transport money the student is given, and finally the department the student is from.
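The workflow described in the abstract (rank features with a random forest, keep the top five, then fit a decision tree on them) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the author's actual code: the column names, the integer encoding, the synthetic data, the 70/30 split, and the hyperparameters are all assumptions made for the example, so its output says nothing about the study's real results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 295  # same size as the cleaned sample; the values below are synthetic

# Hypothetical, integer-encoded questionnaire columns (assumed names).
X = pd.DataFrame({
    "father_education": rng.integers(0, 5, n),
    "mother_education": rng.integers(0, 5, n),
    "school_origin": rng.integers(0, 10, n),
    "transport_money": rng.integers(0, 4, n),
    "department": rng.integers(0, 3, n),
    "gender": rng.integers(0, 2, n),
    "grade": rng.integers(11, 13, n),
})
# Synthetic stand-in for the "intention to enroll" label.
y = ((X["father_education"] + rng.integers(0, 3, n)) > 3).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Step 1: fit a random forest and rank features by impurity-based importance.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
accuracy = forest.score(X_test, y_test)  # held-out accuracy

importances = pd.Series(forest.feature_importances_, index=X.columns)
top5 = importances.sort_values(ascending=False).head(5)

# Step 2: fit a shallow decision tree on the five top-ranked features
# and print its splits as readable rules.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train[top5.index], y_train)
rules = export_text(tree, feature_names=list(top5.index))

print(top5)
print(f"accuracy: {accuracy:.2f}")
print(rules)
```

Restricting the tree to a small depth keeps the printed rules short enough to read as a decision map; a deeper tree might fit better but would be harder to interpret.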

Published

2023-12-31