Details: | Abstract
Electronic health records (EHRs) provide opportunities for researchers to develop clinical diagnostic tools. Conventional machine learning methods have been used to develop such tools, but they typically yield one tool per diagnosis and do not make efficient use of longitudinal patient records. We developed a risk prediction framework, WeightP2V, that builds on numeric representations of the medical records of all patients. From these representations, a numeric vector for each patient is calculated with a weighting mechanism: medical records that are more relevant to an outcome of interest are given higher weights. The resulting patient vectors are then used to predict patients' risks of developing any health outcome. Through extensive simulation studies and clinical applications predicting 1,193 diagnoses in the MIMIC-III database and 1,559 diagnoses in the EHR database of Columbia University Irving Medical Center, we demonstrated improved prediction performance of WeightP2V over competing methods.
|
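
The weighting mechanism summarized in the abstract can be illustrated with a small sketch. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: relevance is taken to be the cosine similarity between a record's vector and the outcome's vector, weights are obtained with a softmax, and the names `cosine`, `weighted_patient_vector`, and `code_vectors` are hypothetical rather than part of WeightP2V.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def weighted_patient_vector(record_codes, code_vectors, outcome_code):
    """Combine a patient's record vectors into one vector, weighted by relevance.

    ASSUMPTION: relevance = cosine similarity to the outcome vector, turned
    into weights with a softmax. The actual WeightP2V weighting may differ.
    """
    if not record_codes:
        raise ValueError("patient has no records")

    outcome_vec = code_vectors[outcome_code]
    vectors = [code_vectors[code] for code in record_codes]

    # Relevance score of each record with respect to the outcome of interest.
    sims = [cosine(vec, outcome_vec) for vec in vectors]

    # Numerically stable softmax so the weights are positive and sum to 1.
    max_sim = max(sims)
    exps = [math.exp(s - max_sim) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of record vectors, one coordinate at a time.
    dim = len(outcome_vec)
    patient = [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]
    return patient, weights


# Toy, made-up embeddings: record "A" points in the same direction as the outcome.
code_vectors = {"A": [1.0, 0.0], "B": [0.0, 1.0], "OUTCOME": [1.0, 0.0]}
patient_vec, weights = weighted_patient_vector(["A", "B"], code_vectors, "OUTCOME")
```

In this toy example the record aligned with the outcome receives the larger weight, so the patient vector leans toward it. A downstream classifier would then take such patient vectors as input to predict risk.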