BioSymetrics’ Machine Learning Platform: Augusta™

Augusta v1.3 Release


  • Improved visualization incorporating Seaborn and Matplotlib packages
  • Expanded range of machine learning model types
  • Expanded preprocessing for tabular data sets with Pandas integration


Biomedical Machine Learning Using Multiple Data Pipelines

  • Augusta™ performs predictive analytics and biomedical machine learning for use cases in bioinformatics, R&D clinical informatics, target discovery, and computational biology:
    • MRI/fMRI and other imaging modalities
    • EEG
    • EKG/ECG
    • Genetics, Proteomics
    • Wearables data
    • EHR/EMR data
    • Real Time Integrative Analytics, e.g. combining genomics with clinical data
    • Precision medicine applications in real time
    • Drug compound and small molecule activity prediction
    • Patient care quality and program analysis
    • Drug discovery and development analysis
    • Hospital telemetry and operational data

Connect to a Variety of Data Sources

  • Through integration with Python, Augusta™ can connect to and read from any data source (e.g. local storage, databases, cloud storage).
  • Modular and customizable pipelines for processing raw phenotypic, imaging, drug, and genomic data sets using any combination of data types.
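
The idea of reading several data types from different sources and combining them can be illustrated with a minimal sketch in plain Python. This is not Augusta's API: the function names (`read_clinical`, `read_genomic`, `combine`), the table layout, and the sample records are all hypothetical, and the in-memory CSV and SQLite database merely stand in for local storage and a real database.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a local CSV file of clinical records.
CLINICAL_CSV = """patient_id,age,diagnosis
p1,54,AF
p2,61,HF
"""


def read_clinical(source):
    """Read clinical rows from any file-like object, keyed by patient_id."""
    return {row["patient_id"]: row for row in csv.DictReader(source)}


def read_genomic(conn):
    """Read variant calls from a database connection, grouped by patient_id."""
    variants = {}
    query = "SELECT patient_id, gene FROM variants ORDER BY gene"
    for patient_id, gene in conn.execute(query):
        variants.setdefault(patient_id, []).append(gene)
    return variants


def combine(clinical, genomic):
    """Join both sources on patient_id; patients without variants get []."""
    return [
        {**row, "age": int(row["age"]), "genes": genomic.get(pid, [])}
        for pid, row in clinical.items()
    ]


# In-memory SQLite database standing in for a real genomic data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variants (patient_id TEXT, gene TEXT)")
conn.executemany(
    "INSERT INTO variants VALUES (?, ?)",
    [("p1", "SCN5A"), ("p1", "KCNQ1")],
)

records = combine(read_clinical(io.StringIO(CLINICAL_CSV)), read_genomic(conn))
```

Each reader hides its source behind a small function, so swapping the in-memory CSV for a file on disk or cloud storage would only change what is passed in, not the combining step.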

Integrate with Existing Business Processes

  • Augusta can deploy where the data reside, saving valuable time and development effort.
  • Integrates seamlessly into existing business processes or can be embedded directly within user applications.
  • Capabilities available for IPython notebooks with Plotly-powered visualizations.

Parallel and Distributed Computing Enabled

  • Augusta™ uses the most advanced parallel processors, both multi-core CPUs and GPUs, to speed up model rebuilds.
  • Augusta is integrated with Apache Spark for distributed execution.
  • Deployable on standalone servers or on Amazon Web Services and Microsoft Azure, with an Apache Spark distributed framework for all steps above.
  • Connection to Apache Kafka to retrieve and organize streaming data.

Using data available in the DrugBank database, we generated binding prediction profiles for every known protein target with 5 or more described ligands: 607 protein-drug binding models, 7,149 potential ligands scored by each, and over 4 million drug-ligand activity predictions in total.
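
The selection rule and the totals quoted above can be checked with a short sketch. It assumes the data can be viewed as (target, ligand) pairs; the helper names and the toy pairs are hypothetical and are not drawn from DrugBank or from Augusta itself. Only the threshold of 5 ligands and the counts 607 and 7,149 come from the text.

```python
from collections import defaultdict

MIN_LIGANDS = 5  # threshold stated in the text


def eligible_targets(pairs, min_ligands=MIN_LIGANDS):
    """Return the targets that have at least min_ligands distinct ligands."""
    ligands = defaultdict(set)
    for target, ligand in pairs:
        ligands[target].add(ligand)
    return sorted(t for t, ls in ligands.items() if len(ls) >= min_ligands)


def total_predictions(n_models, n_candidates):
    """Each model scores every candidate ligand once."""
    return n_models * n_candidates


# Made-up toy pairs: T1 has 6 known ligands, T2 has only 3.
toy_pairs = [("T1", f"L{i}") for i in range(6)]
toy_pairs += [("T2", f"L{i}") for i in range(3)]
toy_targets = eligible_targets(toy_pairs)  # only T1 meets the threshold

# Counts reported in the text: 607 models, 7,149 candidate ligands each.
reported_total = total_predictions(607, 7149)
```

With the reported counts, 607 × 7,149 = 4,339,443, which is consistent with the "over 4 million" figure.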