Data mining occurrences of infectious diseases with SNOMED CT
Date
2013-05-01
Abstract
Synonyms within the structure of SNOMED CT give meaning to the clinical terminology.
The hypothesis of this thesis is that the number of synonyms a disease has within SNOMED
CT can be used to predict the number of occurrences of that infectious disease reported by
the World Health Organization (WHO). Prediction models are built with three methods:
Simple Classification and Regression Trees (Simple CART), Naive Bayes, and Best Fit Trees.
The models use the number of synonyms of each infectious disease term in SNOMED CT, the
number of those diseases worldwide, the region in which the disease occurred, and the year
in which it occurred. In the experiments, the Simple CART method predicts the number of
occurrences of a disease correctly 67% of the time; Naive Bayes and Best Fit Trees each
predict it correctly 61% of the time.
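As a rough illustration of the tree-based approach named in the abstract, the sketch below shows the core step of a CART-style classifier: choosing the threshold on one numeric feature (here, synonym count) that minimizes weighted Gini impurity. It is a minimal sketch under stated assumptions, not the thesis's implementation. The records, the "high"/"low" occurrence classes, and the names `RECORDS`, `gini`, and `best_threshold` are all hypothetical and invented for this example; they are not WHO or SNOMED CT data.

```python
from collections import Counter

# Hypothetical toy records: (synonym_count, region, year, occurrence_class).
# All values are invented for illustration only.
RECORDS = [
    (2, "Africa", 2008, "high"),
    (3, "Africa", 2009, "high"),
    (2, "Europe", 2008, "low"),
    (7, "Europe", 2009, "low"),
    (8, "Americas", 2008, "low"),
    (9, "Africa", 2009, "low"),
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def best_threshold(records, feature_index=0, label_index=3):
    """Return (threshold, weighted_gini) for the best binary split
    'feature <= threshold' on a single numeric feature."""
    values = sorted({r[feature_index] for r in records})
    best = (None, float("inf"))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r[label_index] for r in records if r[feature_index] <= threshold]
        right = [r[label_index] for r in records if r[feature_index] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(records)
        if score < best[1]:
            best = (threshold, score)
    return best


threshold, impurity = best_threshold(RECORDS)
print(f"split on synonym_count <= {threshold}, weighted Gini = {impurity:.3f}")
```

A full CART learner would apply this search recursively to every feature, including region and year, and then prune the resulting tree; this sketch stops at a single split to keep the idea visible.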
Keywords
SNOMED CT, Data mining, World Health Organization, Infectious diseases, Simple CART theory, Naive Bayes, Best Fit Trees, World health statistics