A comparative study of machine learning algorithms and the prompting approach using GPT-3.5 turbo for questions categorization

Abstract

This study addresses text categorization, comparing the effectiveness of traditional Machine Learning (ML) models with that of Large Language Models (LLMs) such as GPT-3.5 turbo. The literature review traces the progress of text categorization from early ML algorithms to LLMs, which determine contextual features automatically and thereby simplify the process. The goal of the research is to evaluate whether LLMs used with a prompt-based approach can outperform traditional ML models in text categorization. The experiments use a dataset of 55,235 questions in nine categories, and categorization quality is measured by the F1 score. Traditional ML models such as Logistic Regression and Random Forest are compared with the LLMs curie, davinci, and GPT-3.5 turbo. Traditional ML models gave the better categorization (F1 score of 88%), whereas LLMs, particularly GPT-3.5 turbo, gave competitive but lower results without prior training (F1 score of 72%). The discussion highlights the advantages of LLMs, such as ease of use and suitability for scenarios with no historical training data, as well as their disadvantages, such as higher costs for large data volumes and potential instability of the API. The study concludes by recommending LLMs for certain cases, such as new applications or those with limited categorization needs, while traditional ML models remain more suitable for scenarios that require high accuracy or involve sensitive data.
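
As a rough illustration of the two ingredients named in the abstract, the sketch below computes an F1 score and builds a categorization prompt. It is not the authors' code. The abstract does not say how F1 was averaged across the nine categories, so macro averaging is an assumption here; the function names `macro_f1` and `build_prompt`, the prompt wording, and the example category names are all hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true.

    Assumption: macro averaging; the abstract does not state the variant used.
    """
    labels = sorted(set(y_true))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def build_prompt(question, categories):
    """Hypothetical prompt for categorizing one question without prior training."""
    return (
        "Classify the question into exactly one of these categories: "
        + ", ".join(categories)
        + ".\nQuestion: " + question
        + "\nCategory:"
    )


if __name__ == "__main__":
    # Toy labels, not data from the study.
    print(macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
    print(build_prompt("How do I reset my password?", ["account", "billing"]))
```

The same `macro_f1` function can score either approach, since both a trained classifier and a prompted LLM end up producing one predicted category per question.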

Publication type

Text

Text publication type

Article

ISSN

2367-4512

Bibliographic description

A Comparative Study of Machine Learning Algorithms and the Prompting Approach Using GPT-3.5 Turbo for Text Categorization / Oleksandr Mitsa, Yurii Voloshchuk, Oleksandr Levchuk, Vasyl Petsko // Lecture Notes on Data Engineering and Communications Technologies. 2025. Vol. 242. P. 156-167.
