
2nd Edition

13 October 2020

Doctor

Tommaso Lo Barco

DRAVET'S SYNDROME AS A MODEL FOR DIAGNOSING AND "PHENOTYPING" OTHER CONDITIONS THROUGH NATURAL LANGUAGE PROCESSING
Abstract

The trend toward "digital health" in the US and in Europe has led to an unprecedented adoption of Electronic Health Records (EHRs): by the end of 2014, 83% of US physicians and 75% of hospitals used some form of EHR (Office of the National Coordinator for Health Information Technology Health Record Adoption: 2004-2014, 2015) (Adler-Milstein, et al., 2015).

An increasing number of hospitals are now equipped with digital data banks (Clinical Data Warehouses, CDWs), which hold the data collected through EHRs during patient care and can be used for statistical or research purposes.
Clinical and epidemiological studies that exploit big data to detect and propose interdisciplinary treatments for patients with similar medical history, diagnosis and outcomes have increased enormously in recent years.

Among the most impactful applications of Big Data and EHRs are those that intervene directly in the diagnostic process, avoiding delays or errors.
Big Data can likewise be used to build predictive models able, for example, to estimate the risk of disease.

This type of support is clearly desirable for rare diseases, where limited knowledge of the disease, owing to the small number of patients, often leads to delayed diagnosis and difficulty in predicting the outcome.
However, because effective models require large amounts of data, the models available today can be applied only to relatively common pathologies, and they rely mainly on numerical data (laboratory values or computer-based data derived from imaging techniques).

Natural Language Processing (NLP) is an innovative computational technology able to extract information directly from digital written documents. Recently developed models apply this technique to patient medical reports in order to suggest diagnostic hypotheses for common pathologies, or to describe the clinical phenotype of a rare pathology (Liang, et al., 2019) (Garcelon, et al., 2018).

The starting point for this work is the opportunity to apply this technique to a rare pathology: Dravet Syndrome.

This condition presents a significant problem of diagnostic delay. Although the clinical characteristics needed for diagnosis all appear within the second year of life, the data in the literature show that only 2 countries in the world manage a timely diagnosis; in Europe alone, the average delay is around 3-4 years from the onset of illness. Delayed diagnosis, besides being frustrating for the family, can lead to the use of contraindicated drugs that can exacerbate seizures, increase the risk of status epilepticus and worsen cognitive outcome, as well as delaying the use of helpful drugs.

The main differential diagnosis of Dravet Syndrome at onset is a benign condition characterized solely by a predisposition to seizures during fever.
This work shows that the visit reports produced within the first 2 years of life already contain some terms that are statistically more relevant in subjects with Dravet Syndrome than in subjects with "febrile convulsions", and that could be used to build alert systems designed to raise suspicion and/or advise targeted follow-up.
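As an illustration of how terms over-represented in one cohort's reports might be flagged, the following is a minimal sketch and not the method used in this work. It assumes English free-text reports, simple word tokenization, and an odds ratio with the Haldane-Anscombe correction as the measure of association. The function names and the toy reports are invented for the example.

```python
import re
from collections import Counter


def doc_terms(report):
    """Return the set of lowercase word tokens found in one report."""
    return set(re.findall(r"[a-z]+", report.lower()))


def term_odds_ratios(case_reports, control_reports):
    """Odds ratio per term; 0.5 is added to every cell (Haldane-Anscombe)."""
    case_df = Counter(t for r in case_reports for t in doc_terms(r))
    ctrl_df = Counter(t for r in control_reports for t in doc_terms(r))
    n_case, n_ctrl = len(case_reports), len(control_reports)
    ratios = {}
    for term in set(case_df) | set(ctrl_df):
        a = case_df[term] + 0.5            # case reports containing the term
        b = n_case - case_df[term] + 0.5   # case reports without the term
        c = ctrl_df[term] + 0.5            # control reports containing the term
        d = n_ctrl - ctrl_df[term] + 0.5   # control reports without the term
        ratios[term] = (a * d) / (b * c)
    return ratios


# Invented toy reports, for illustration only (not real patient data).
dravet = [
    "prolonged hemiclonic seizure with fever",
    "prolonged seizure after vaccination",
    "status epilepticus with fever",
]
febrile = [
    "brief generalized seizure with fever",
    "brief seizure with fever",
    "short seizure with fever",
]

ratios = term_odds_ratios(dravet, febrile)
top = sorted(ratios, key=ratios.get, reverse=True)[:3]
print(top)
```

In a real setting the ratio would need a significance test and correction for multiple comparisons before any term could drive an alert; the sketch only ranks terms.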

We have also demonstrated that the automatic extraction of concepts from the visit reports of a cohort of patients with a rare pathology, followed over the years, makes it possible to reconstruct a "longitudinal model of disease" compatible with what is known from the literature. The validation of this new way of deriving a longitudinal phenotype justifies its use in other rare conditions whose outcome, unlike that of Dravet Syndrome, is poorly known.

In addition to the automatic extraction of concepts from reports, NLP can also be used for the targeted search of terms within visit reports. This allowed us to easily collect data on the weight, height and head circumference of our cohort of patients with Dravet Syndrome, subsequently extended with patients from a second center. The analysis of the results documents a slowing in the rate of cranial growth in these patients, a finding not previously reported in the literature.
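A targeted search of this kind can be sketched with regular expressions. The example below is only an assumption about how such a search might look: the English keywords, the units, the pattern names and the sample sentence are all hypothetical, and real reports would need patterns adapted to their language and phrasing.

```python
import re

# Hypothetical patterns: keyword, optional ":" or "=", a number, then the unit.
NUMBER = r"(\d+(?:\.\d+)?)"
PATTERNS = {
    "weight_kg": re.compile(r"weight\s*[:=]?\s*" + NUMBER + r"\s*kg", re.I),
    "height_cm": re.compile(r"height\s*[:=]?\s*" + NUMBER + r"\s*cm", re.I),
    "head_circumference_cm": re.compile(
        r"head circumference\s*[:=]?\s*" + NUMBER + r"\s*cm", re.I
    ),
}


def extract_measurements(report):
    """Return the first value found for each measurement in one report."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(report)
        if match:
            found[name] = float(match.group(1))
    return found


# Invented sample sentence, for illustration only.
example = (
    "Visit at 18 months. Weight: 10.4 kg, height 80 cm, "
    "head circumference = 46.5 cm."
)
measurements = extract_measurements(example)
print(measurements)
```

Values extracted this way would still need to be paired with the age at each visit and checked manually before being plotted against growth charts.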

A further use of NLP could be the retrospective search of a CDW for patients with a specific condition that has not yet been diagnosed, by looking for keywords within the visit reports. Moreover, comparison with a "control" population could reveal clinical data not yet present in the literature.

In conclusion, we demonstrated that Natural Language Processing applied to narrative medical reports, through both data-driven analysis and targeted searches, makes it possible to reach a diagnosis and to reconstruct the phenotype of a specific rare condition such as Dravet Syndrome, not only confirming what is present in the literature but also helping to broaden the known phenotype.
