The Industry’s Most Dynamic Machine Learning Platform

Augusta™ performs predictive analytics and biomedical machine learning for use cases in bioinformatics, R&D clinical informatics, target discovery, and computational biology.

Current Release: v1.3

  • Improved visualization incorporating Seaborn and Matplotlib packages
  • More machine learning model types
  • Expanded preprocessing for tabular data sets with Pandas integration

Future Release: v2.0 (Due October 2019)

  • A single syntax for processing multiple, diverse data types, integrating them, and running and comparing multiple machine learning algorithms.
  • Seamless distributed computing
  • Easy implementation and installation using a dockerized framework

Connect Multiple Data Sources

  • Any Data Source: Through integration with Python, Augusta™ can connect to and read from any data source (e.g. local storage, databases, cloud storage).
  • Any Combination of Sources: Modular and customizable pipelines for processing raw phenotypic, imaging, drug, and genomic data sets using any combination of data types.
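
As a rough illustration of what reading from several sources through Python can look like, the sketch below pulls rows from a CSV stream and from a SQL database, then joins them on a shared key. It uses only the Python standard library and is not Augusta™'s own syntax; the helper names (`read_csv_rows`, `read_sql_rows`, `join_on`), the table and column names, and the sample values are all made up for the example.

```python
import csv
import io
import sqlite3


def read_csv_rows(text_stream):
    """Read tabular rows from any text file-like object as a list of dicts."""
    return list(csv.DictReader(text_stream))


def read_sql_rows(connection, query):
    """Run a query on a DB-API connection and return rows as a list of dicts."""
    cursor = connection.cursor()
    cursor.execute(query)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def join_on(left_rows, right_rows, key):
    """Inner-join two lists of dicts on a shared key (hypothetical helper)."""
    right_by_key = {row[key]: row for row in right_rows}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left_rows
        if row[key] in right_by_key
    ]


# Source 1: clinical data, standing in for a CSV file on local storage.
clinical_csv = io.StringIO("patient_id,age\nP1,54\nP2,61\n")
clinical = read_csv_rows(clinical_csv)

# Source 2: genomic data, standing in for a database (in-memory SQLite here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genomics (patient_id TEXT, variant TEXT)")
conn.executemany(
    "INSERT INTO genomics VALUES (?, ?)",
    [("P1", "variant_a"), ("P3", "variant_b")],
)
genomics = read_sql_rows(conn, "SELECT patient_id, variant FROM genomics")
conn.close()

# Keep only patients that appear in both sources.
combined = join_on(clinical, genomics, "patient_id")
print(combined)
```

Only patient `P1` appears in both made-up sources, so the joined result contains a single record carrying both the clinical and the genomic fields.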

Available Machine Learning Data Pipelines:

  • MRI/fMRI and other imaging modalities
  • EEG
  • Genetics, Proteomics
  • Wearables data
  • EHR/EMR data
  • Integrative Analytics, e.g. combining genomics with clinical data
  • Precision medicine applications
  • Drug compound and small molecule activity prediction
  • Patient care quality and program analysis
  • Drug discovery and development analysis
  • Hospital telemetry and operational data

Easy Set-Up

  • Augusta™ can deploy where the data reside, saving valuable time and development effort.
  • Integrates seamlessly into existing business processes or embeds directly within user applications.
  • Capabilities available for IPython notebooks with Plotly-powered visualizations.

Deploy In Any Environment

  • Cloud deployment via AWS or Microsoft Azure
  • On-Premise deployment using local servers
  • Hybrid Cloud leveraging the best of both worlds

Designed for Speed

  • Augusta™ uses the most advanced parallel processors (multi-core CPUs and GPUs) to speed up model rebuilds
  • Augusta™ is integrated with Apache Spark for distributed execution
  • Connection to Apache Kafka to retrieve and organize streaming data

Proven Predictive Models

Using data available in the DrugBank database, we generated binding prediction profiles for every known protein target with five or more described ligands: 607 protein-drug binding models, 7,149 potential ligands for each, and over 4 million protein-ligand activity predictions in total.
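
The "over 4 million" total follows from the two counts quoted above, assuming every model is scored against every potential ligand. A quick check of the arithmetic (the variable names are just for illustration):

```python
# Counts quoted in the text above.
n_models = 607       # protein-drug binding models
n_ligands = 7_149    # potential ligands scored by each model

# One prediction per (model, ligand) pair.
total_predictions = n_models * n_ligands
print(f"{total_predictions:,}")  # 4,339,443
```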