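Identification of biomarkers associated with thalassemia syndromes using machine learning algorithms (original title: Identificación de biomarcadores asociados a los síndromes talasémicos mediante el uso de algoritmos de aprendizaje automático)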
Abstract
Thalassemia is a hereditary blood disorder that impairs hemoglobin production, resulting in abnormally low levels of this protein. In 2015 the disease was responsible for 16,800 deaths, and it affects approximately 1.5% of the global population. Diagnosis relies on blood tests and genetic testing; however, the lack of adequate records and screening programs leaves many severe cases undetected, which contributes to a high mortality rate. Although its prevalence is low, thalassemia significantly affects the quality of life of patients, who require lifelong treatment. This study proposes using gene expression profiles and machine learning methods to identify biomarkers associated with thalassemia. Classification models and an anomaly detection algorithm were applied; the classification methods proved ineffective for this problem, whereas the isolation forest model identified 72 anomalous genes. Functional validation of these genes highlighted relevant biological terms, such as cytoplasmic translation and apoptosis, suggesting molecular pathways that may be involved in thalassemia. Genes related to iron homeostasis were also identified, establishing a link between oxidative stress, apoptosis, and the disease. Comparison with previous studies revealed shared biological processes, underscoring the potential of machine learning to improve diagnosis and deepen the understanding of the underlying molecular pathways, with the aim of optimizing patient treatment.
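The abstract names an isolation forest as the anomaly detection step but gives no implementation details. The following is a minimal pure-Python sketch of the general technique, not the study's actual pipeline. It assumes each gene is represented as a vector of expression values across samples, and it runs on synthetic data with three planted outliers. All names (`c_factor`, `build_tree`, `path_length`, `anomaly_scores`, `genes`, `planted`) and all parameter values are hypothetical choices made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    values = [p[feature] for p in points]
    lo, hi = min(values), max(values)
    if lo == hi:  # this feature cannot separate the remaining points
        return {"size": len(points)}
    threshold = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < threshold]
    right = [p for p in points if p[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x):
    """Depth at which x reaches a leaf, plus a correction for unsplit leaf points."""
    depth = 0
    while "feature" in tree:
        branch = "left" if x[tree["feature"]] < tree["threshold"] else "right"
        tree = tree[branch]
        depth += 1
    return depth + c_factor(tree["size"])


def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Return one score per point; values closer to 1 mean easier to isolate.

    Assumes at least two points, so that the normalizing factor is non-zero.
    """
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    return [
        2.0 ** (-sum(path_length(t, x) for t in trees) / (n_trees * norm))
        for x in points
    ]


# Synthetic stand-in for expression data (NOT the study's data):
# 200 "genes" with 4 expression values each, plus 3 planted outliers.
data_rng = random.Random(42)
genes = [tuple(data_rng.gauss(0.0, 1.0) for _ in range(4)) for _ in range(200)]
planted = [(8.0, 8.0, 8.0, 8.0), (-8.0, -8.0, -8.0, -8.0), (8.0, -8.0, 8.0, -8.0)]
genes += planted

scores = anomaly_scores(genes)
top3 = sorted(range(len(genes)), key=scores.__getitem__, reverse=True)[:3]
print("Indices of the three highest-scoring genes:", top3)
```

Points that are isolated after few random splits receive scores closer to 1. In practice a maintained implementation such as scikit-learn's `IsolationForest` would likely be preferable to this hand-written version; the threshold used to flag a gene as anomalous is a separate choice that the abstract does not specify.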
Keywords
thalassemia, biomarkers, machine learning, gene expression, classification, anomaly detection, genes, blood disorder
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 United States