WEB-BASED RESEARCH ARTICLE CLASSIFICATION USING THE RANDOM FOREST ALGORITHM
Abstract:
Purpose: This study aims to develop a web-based system that classifies research articles using the Random Forest algorithm to address mismatches between article content and journal scope.
Methodology/approach: The research followed the Waterfall model of the software development life cycle (SDLC). The dataset comprised 560 articles published by Goodwood Publishing (2019–2024) across four categories. Text preprocessing consisted of case folding, stopword removal, stemming, and tokenization, followed by TF-IDF feature extraction. The Random Forest classifier was trained on 80% of the data and tested on the remaining 20%.
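The preprocessing and feature-extraction steps named above can be illustrated with a minimal Python sketch. It is not the study's implementation: the stopword list, the suffix-stripping `stem` helper, and the `tf * log(N / df)` weighting are all assumptions chosen for illustration, since the abstract does not state which stemmer or TF-IDF variant was used. The classifier step is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "on", "for", "to", "is", "are"}


def stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Case folding, tokenization, stopword removal, then stemming."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors
```

Under this weighting, a term that appears in every document receives a weight of zero, so only terms that help distinguish one article from another contribute to the feature vector.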
Results/findings: The model achieved 91% accuracy, with high precision and recall across all categories. The system was successfully implemented as a web-based application, providing instant classification and journal recommendations.
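For readers unfamiliar with the reported metrics, the following sketch shows how accuracy and per-category precision and recall are commonly computed from true and predicted labels. The function name `per_class_metrics` and the labels used with it are hypothetical; the abstract does not name the four categories or describe the evaluation code.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and a {label: {precision, recall}} report."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this label, but it was wrong
            fn[truth] += 1  # missed the true label
    accuracy = sum(tp.values()) / len(y_true)
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        report[label] = {
            "precision": tp[label] / predicted if predicted else 0.0,
            "recall": tp[label] / actual if actual else 0.0,
        }
    return accuracy, report
```

Precision for a category is the share of articles assigned to it that truly belong there, and recall is the share of that category's articles the model found.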
Limitations: The dataset was limited to a single publisher, and Random Forest was the only algorithm applied, which may restrict the generalizability of the findings.
Contribution: This study contributes to the application of machine learning in scholarly publishing, offering a practical solution for editors to streamline article selection and improve efficiency.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.