Please use this identifier to cite or link to this item: https://dspace.uzhnu.edu.ua/jspui/handle/lib/66550
Title: A Hybrid Model of Cancer Diseases Diagnosis Based on Gene Expression Data with Joint Use of Data Mining Methods and Machine Learning Techniques
Authors: Babichev, Sergii
Yasinska-Damri, Lyudmyla
Liakh, Igor
Keywords: gene expression profiles; spectral clustering algorithm; convolutional neural network; inductive clustering technique; random forest classifier; alternative voting method; hybrid model; cancer disease
Publication date: 14 May 2023
Publisher: MDPI
Series/Issue: Applied Sciences; 13(10)
Abstract: One of the current focuses of modern bioinformatics is the development of hybrid models for processing gene expression data in order to create diagnostic systems for various diseases. In this study, we propose a solution to this problem that combines an inductive spectral clustering algorithm, a random forest classifier, a convolutional neural network, and an alternative voting method for making the final decision about the patient's condition. In the first stage, we apply the spectral clustering algorithm to gene expression profiles using inductive methods of objective clustering, with the calculation of internal, external, and balance clustering quality criteria. This results in clusters of mutually correlated and differently expressed gene expression profiles. In the second stage, we apply the random forest classifier and the convolutional neural network to identify the examined objects, which contain as attributes the gene expression values in the allocated clusters. The presented research solves both binary and multiclass classification tasks. The final decision about the patient's condition is made using the alternative voting method, which considers the classification results based on the gene expression data in the various clusters. The simulation results showed that the proposed technique was highly effective, achieving high accuracy in object identification with both classifiers. However, the convolutional neural network processed the data significantly more efficiently than the random forest algorithm, owing to its substantially shorter processing time.
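The two-stage scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption: the data are synthetic, the gene affinity is taken to be the absolute Pearson correlation, only the random forest branch is shown (the convolutional neural network is omitted), the inductive quality criteria for choosing the clustering are not implemented, and a plain majority vote stands in for the paper's alternative voting method, whose details this record does not give. All function names are hypothetical.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def make_toy_expression(n_samples=120, n_groups=3, genes_per_group=10, seed=0):
    """Synthetic samples x genes matrix with binary labels (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    blocks = []
    for _ in range(n_groups):
        latent = rng.normal(size=(n_samples, 1))  # factor shared by the genes of one group
        noise = 0.5 * rng.normal(size=(n_samples, genes_per_group))
        blocks.append(1.5 * y[:, None] + latent + noise)
    return np.hstack(blocks), y


def cluster_genes(X, n_clusters=3, seed=0):
    """Stage 1: spectral clustering of gene profiles (columns of X)."""
    # Assumed affinity: absolute Pearson correlation between gene profiles.
    affinity = np.abs(np.corrcoef(X, rowvar=False))
    model = SpectralClustering(
        n_clusters=n_clusters, affinity="precomputed", random_state=seed
    )
    return model.fit_predict(affinity)  # one cluster label per gene


def fit_predict_per_cluster(X_train, y_train, X_test, gene_labels, seed=0):
    """Stage 2: one random forest per gene cluster; rows are clusters, columns are test samples."""
    preds = []
    for c in np.unique(gene_labels):
        cols = np.flatnonzero(gene_labels == c)
        clf = RandomForestClassifier(n_estimators=100, random_state=seed)
        clf.fit(X_train[:, cols], y_train)
        preds.append(clf.predict(X_test[:, cols]))
    return np.vstack(preds)


def majority_vote(preds):
    """Final decision: most frequent class per sample (ties go to the smaller label)."""
    n_classes = int(preds.max()) + 1
    counts = np.stack([(preds == k).sum(axis=0) for k in range(n_classes)])
    return counts.argmax(axis=0)


if __name__ == "__main__":
    X, y = make_toy_expression()
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=0, stratify=y
    )
    gene_labels = cluster_genes(X_tr)  # clusters are found on the training samples only
    final = majority_vote(fit_predict_per_cluster(X_tr, y_tr, X_te, gene_labels))
    print("voting accuracy on the toy data:", (final == y_te).mean())
```

Clustering the genes on the training samples only keeps the held-out samples out of the feature-grouping step, which is a design choice of this sketch rather than something stated in the record.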
Type: Text
Publication type: Article
URI (Uniform Resource Identifier): https://dspace.uzhnu.edu.ua/jspui/handle/lib/66550
ISSN: 2076-3417
Appears in collections: Scientific publications of the Department of Informatics and Physical and Mathematical Disciplines (Наукові публікації кафедри інформатики та фізико-математичних дисциплін)

Files in this item:
File                       Description         Size      Format
applsci-13-06022-v2.pdf    Stattja (article)   1.62 MB   Adobe PDF


All items in the electronic archive are protected by copyright, with all rights reserved.