Challenge: Combine Disparate Data Sets in Preprocessing for ML

Summary: Combining data sources generally gave better diagnostic performance than any single data set alone (Figures 1 & 2).

Solution: Using Augusta™, we examined features extracted from several thousand MRIs, genome-wide association screening, metabolic profiling, and family history for 334 patients in a study of the progression of Alzheimer’s disease (ADNI); see Figures 1 & 2.

Our study found that Alzheimer’s disease is diagnosed more accurately when features from genomics, imaging, and metabolomics are combined.
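As a rough illustration of the combining step, the sketch below merges per-patient features from several sources into one row per patient. It is not Augusta™'s implementation: the function name combine_sources, the data layout, and the feature names and values are all hypothetical, chosen only to show the idea.

```python
from typing import Dict

# Hypothetical layout: patient ID -> {feature name: value} for one data source.
Source = Dict[str, Dict[str, float]]


def combine_sources(sources: Dict[str, Source]) -> Dict[str, Dict[str, float]]:
    """Merge per-patient features from several sources into one row per patient.

    Only patients present in every source are kept, and each feature name is
    prefixed with its source name so columns from different sources cannot collide.
    """
    if not sources:
        return {}
    # Patients shared by all sources (an inner join on patient ID).
    shared = set.intersection(*(set(table) for table in sources.values()))
    combined = {}
    for patient_id in sorted(shared):
        row = {}
        for name, table in sources.items():
            for feature, value in table[patient_id].items():
                row[f"{name}.{feature}"] = value
        combined[patient_id] = row
    return combined


# Made-up example values, not study data.
imaging = {"p1": {"hippocampal_volume": 3.1}, "p2": {"hippocampal_volume": 2.4}}
genomics = {"p1": {"marker_a": 1.0}, "p2": {"marker_a": 0.0}, "p3": {"marker_a": 1.0}}
metabolomics = {"p1": {"metabolite_x": 0.7}, "p2": {"metabolite_x": 1.2}}

rows = combine_sources(
    {"imaging": imaging, "genomics": genomics, "metabolomics": metabolomics}
)
```

Here patient p3 is dropped because it appears only in the genomics source. Keeping only shared patients is one possible design choice; imputing missing sources would be another.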

Interestingly, different features were selected when combined data sets were used (for example, some genetic markers carry meaning only in the context of a given anatomical feature).
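A toy example can show how a feature may be uninformative alone yet useful in combination. The sketch below uses synthetic data, not study data: the label is set only when exactly one of two binary features (a hypothetical genetic marker and a hypothetical anatomical flag) is present. The helper best_accuracy is an assumption for illustration and reports in-sample accuracy only.

```python
from itertools import product

# Synthetic toy data, not study data: each sample is (marker, anatomy, label),
# and the label is 1 only when exactly one of the two binary features is set.
samples = [
    (marker, anatomy, marker ^ anatomy)
    for marker, anatomy in product([0, 1], repeat=2)
] * 25


def best_accuracy(samples, feature_indices):
    """In-sample accuracy of the best lookup-table rule using only the given features."""
    counts = {}
    for sample in samples:
        key = tuple(sample[i] for i in feature_indices)
        label = sample[-1]
        per_label = counts.setdefault(key, {0: 0, 1: 0})
        per_label[label] += 1
    # For each observed feature combination, predicting the majority label is optimal.
    correct = sum(max(per_label.values()) for per_label in counts.values())
    return correct / len(samples)


marker_only = best_accuracy(samples, [0])    # 0.5: no better than chance
anatomy_only = best_accuracy(samples, [1])   # 0.5: no better than chance
combined = best_accuracy(samples, [0, 1])    # 1.0: both features together
```

In this construction neither feature would be selected on its own merits, while the pair separates the classes perfectly. Real data is far noisier, so this is only a sketch of the effect.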

We believe it is even more important to consider which types of data to combine, depending on the problem at hand.
