Please use this identifier to cite or link to this item: https://dspace.uzhnu.edu.ua/jspui/handle/lib/70221
Title: Integrating Data Mining, Deep Learning, and Gene Ontology Analysis for Gene Expression-Based Disease Diagnosis Systems
Authors: BABICHEV, SERGII
LIAKH, IGOR
ŠKVOR, JIŘÍ
Keywords: Integrating Data Mining, Deep Learning, and Gene Ontology Analysis for Gene Expression-Based Disease Diagnosis Systems, Gene expression data, gene ontology analysis, clustering, biclustering, convolutional neural network, Bayes optimization, classification, Alzheimer’s disease, cancer disease.
Issue date: 29-Jan-2025
Publisher: IEEE Access
Bibliographic description: ABSTRACT The manuscript details the outcomes of a comprehensive study on the application of cluster-bicluster analysis, gene ontology analysis, and a convolutional neural network (CNN) for diagnosing cancer and Alzheimer’s disease using gene expression data derived from both DNA microarray experiments and mRNA sequencing. It outlines a conceptual framework and provides a block diagram of the stepwise procedure for analyzing gene expression data, aiming to enhance the accuracy and objectivity of disease diagnosis. The research methodology involves initial gene ontology analysis, followed by the application of the Self-Organizing Tree Algorithm (SOTA) for clustering gene expression profiles, an ensemble algorithm for data biclustering, and a CNN for sample classification. Bayesian optimization was employed to determine the optimal hyperparameters for all models. The analysis of simulation results demonstrates the high efficacy of the proposed approach. Specifically, for the Alzheimer’s data, the number of genes analyzed was reduced from 44,662 to 4,004. Subsequent cluster-bicluster analysis divided these data into two subsets containing 1,158 and 2,846 genes, respectively. Classification accuracy for samples within these subsets reached 89.8% and 91.8%, respectively. In the cancer data analysis, the gene count was reduced from 60,660 to 10,422, with 3,955 and 6,467 genes in the first and second clusters, respectively. The classification accuracy for these subsets was 97.4% and 97.6%, respectively. In our view, the implementation of this model promises to significantly improve the efficacy of early diagnosis systems for complex diseases. INDEX TERMS Gene expression data, gene ontology analysis, clustering, biclustering, convolutional neural network, Bayes optimization, classification, Alzheimer’s disease, cancer disease.
Series/number: technical sciences; VOLUME 13, 2025
Abstract: The manuscript details the outcomes of a comprehensive study on the application of cluster-bicluster analysis, gene ontology analysis, and a convolutional neural network (CNN) for diagnosing cancer and Alzheimer’s disease using gene expression data derived from both DNA microarray experiments and mRNA sequencing. It outlines a conceptual framework and provides a block diagram of the stepwise procedure for analyzing gene expression data, aiming to enhance the accuracy and objectivity of disease diagnosis. The research methodology involves initial gene ontology analysis, followed by the application of the Self-Organizing Tree Algorithm (SOTA) for clustering gene expression profiles, an ensemble algorithm for data biclustering, and a CNN for sample classification. Bayesian optimization was employed to determine the optimal hyperparameters for all models. The analysis of simulation results demonstrates the high efficacy of the proposed approach. Specifically, for the Alzheimer’s data, the number of genes analyzed was reduced from 44,662 to 4,004. Subsequent cluster-bicluster analysis divided these data into two subsets containing 1,158 and 2,846 genes, respectively. Classification accuracy for samples within these subsets reached 89.8% and 91.8%, respectively. In the cancer data analysis, the gene count was reduced from 60,660 to 10,422, with 3,955 and 6,467 genes in the first and second clusters, respectively. The classification accuracy for these subsets was 97.4% and 97.6%, respectively. In our view, the implementation of this model promises to significantly improve the efficacy of early diagnosis systems for complex diseases.
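The gene counts quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the names `datasets` and `summarize` are hypothetical and do not come from the paper, and the numbers are copied from the abstract. It computes the fraction of genes retained after filtering and confirms that the two reported subsets add up to the retained total.

```python
# Gene counts copied from the abstract: genes before filtering, genes retained,
# and the sizes of the two subsets produced by cluster-bicluster analysis.
# The dictionary layout and function name are illustrative assumptions.
datasets = {
    "Alzheimer's": {"initial": 44_662, "retained": 4_004, "subsets": (1_158, 2_846)},
    "cancer": {"initial": 60_660, "retained": 10_422, "subsets": (3_955, 6_467)},
}


def summarize(counts):
    """Return (retained fraction, True if the subsets sum to the retained count)."""
    fraction = counts["retained"] / counts["initial"]
    partitions = sum(counts["subsets"]) == counts["retained"]
    return fraction, partitions


for name, counts in datasets.items():
    fraction, partitions = summarize(counts)
    print(f"{name}: {fraction:.1%} of genes retained; subsets sum to total: {partitions}")
```

Under these figures, roughly 9% of the genes are retained for the Alzheimer's data and roughly 17% for the cancer data, and in both cases the two subsets account for every retained gene.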
Type: Text
Publication type: Article
URI (Uniform Resource Identifier): https://dspace.uzhnu.edu.ua/jspui/handle/lib/70221
Appears in collections: Scientific publications of the Department of Informatics and Physical and Mathematical Disciplines

Files in this item:
File | Description | Size | Format
Integrating_Data_Mining_Deep_Learning_and_Gene_Ontology_Analysis_for_Gene_Expression-Based_Disease_Diagnosis_Systems.pdf | Stattja | 2.9 MB | Adobe PDF


All items in the electronic resources archive are protected by copyright, with all rights reserved.