PURPOSE: Quantify and correct bias in high-content screening (HCS) data

INPUT: Chemical structures, morphological properties (or original images)

OUTPUT: Dynamic workflow that integrates bias removal and mechanism prediction


  • Batch effects are a common issue in high-throughput assays and often produce patterns in the data that are unrelated to assay response.
  • Machine learning (ML) models latch on to any source of regularity, including these artifacts.
  • Without Augusta™ pre-processing and Contingent-AI (patent pending), ML models will learn plate and plate group effects instead of true behavior.

Correlation data for the same assay when plates are presented in random order (left picture) versus alphabetically by plate ID (right picture).

Clear patterns that follow the plate naming scheme can be seen in the alphabetical ordering, pointing to sources of bias such as lab, freezer, or surveyor.
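The effect of plate ordering can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration: the plate IDs ("A-01" to "C-08", where the letter stands for a hypothetical lab), the profile length, and the noise levels. It is not the Augusta™ implementation. Each plate profile is a shared lab signature plus noise, and the sketch compares the average correlation between neighbouring plates under alphabetical and random ordering.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_neighbour_correlation(plate_ids, profiles):
    """Average correlation between each plate and the next one in the list."""
    pairs = zip(plate_ids, plate_ids[1:])
    values = [pearson(profiles[a], profiles[b]) for a, b in pairs]
    return sum(values) / len(values)


rng = random.Random(1)
n_features = 30  # hypothetical number of morphological features per plate

# Simulated plate profiles: plates from the same (hypothetical) lab share
# a signature, which is the batch effect hidden in the naming scheme.
profiles = {}
for lab in "ABC":
    lab_signature = [rng.gauss(0, 1) for _ in range(n_features)]
    for i in range(1, 9):
        profiles[f"{lab}-{i:02d}"] = [s + rng.gauss(0, 0.5) for s in lab_signature]

alphabetical = sorted(profiles)   # groups plates by lab prefix
shuffled = alphabetical[:]
rng.shuffle(shuffled)             # random presentation order

sorted_corr = mean_neighbour_correlation(alphabetical, profiles)
shuffled_corr = mean_neighbour_correlation(shuffled, profiles)
print(f"alphabetical: {sorted_corr:.2f}  random: {shuffled_corr:.2f}")
```

In this toy setting, neighbouring plates are much more alike when sorted by ID, which is the kind of block structure that random presentation hides.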

A variety of algorithms can be used to reduce this bias, but care must be taken not to remove assay information along with it.
  • High accuracy in predicting plate, well, or plate group indicates systematic bias
  • High accuracy in predicting the assay response is the “desired” outcome
  • Here we see that normalization methods slightly increase the ability to predict the assay response
  • But they also increase the ability to predict plate and plate group
Each Augusta™ workflow is built to accommodate several bias reduction methods and to dynamically permute their parameters, minimizing the accuracy of plate, well, and plate group prediction while maximizing the accuracy of the desired outcome.
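The selection idea described above can be sketched on simulated data. This is a toy illustration under stated assumptions, not the Augusta™ workflow: the single-feature data, the nearest-centroid classifier, the two candidate corrections, and the objective (response accuracy minus plate accuracy) are all hypothetical choices made to keep the example short.

```python
from random import Random
from statistics import mean, median


def simulate(seed=0, n_plates=4, wells_per_plate=40):
    """Toy single-feature screen: plate offset + treatment effect + noise."""
    rng = Random(seed)
    rows = []
    for plate in range(n_plates):
        for well in range(wells_per_plate):
            treated = well % 2  # balanced layout on every plate
            value = 5.0 * plate + 2.0 * treated + rng.gauss(0.0, 0.5)
            rows.append({"plate": plate, "well": well,
                         "treated": treated, "value": value})
    return rows


def nearest_centroid_accuracy(rows, label):
    """Held-out accuracy of a nearest-centroid classifier for `label`."""
    train = [r for r in rows if (r["well"] // 2) % 2 == 0]
    test = [r for r in rows if (r["well"] // 2) % 2 == 1]
    centroids = {}
    for key in {r[label] for r in train}:
        centroids[key] = mean(r["value"] for r in train if r[label] == key)
    hits = 0
    for r in test:
        predicted = min(centroids, key=lambda k: abs(r["value"] - centroids[k]))
        hits += predicted == r[label]
    return hits / len(test)


def no_correction(rows):
    return [dict(r) for r in rows]


def plate_median_center(rows):
    """Subtract each plate's median value from its wells."""
    medians = {}
    for plate in {r["plate"] for r in rows}:
        medians[plate] = median(r["value"] for r in rows if r["plate"] == plate)
    return [{**r, "value": r["value"] - medians[r["plate"]]} for r in rows]


def score(rows):
    plate_acc = nearest_centroid_accuracy(rows, "plate")       # bias signal
    response_acc = nearest_centroid_accuracy(rows, "treated")  # desired signal
    return {"plate_acc": plate_acc, "response_acc": response_acc,
            "objective": response_acc - plate_acc}  # hypothetical objective


raw = simulate()
candidates = {"none": no_correction, "plate_median_center": plate_median_center}
results = {name: score(fix(raw)) for name, fix in candidates.items()}
best = max(results, key=lambda name: results[name]["objective"])

for name, res in results.items():
    print(name, res)
print("selected:", best)
```

A real workflow would search over many methods and parameter settings and would also score well and plate group prediction. The sketch only shows how each candidate can be judged on both criteria at once, so that a correction that also raises plate predictability is penalized.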


  • Accommodate confounding factors
  • Gain more insight into compound effects and increase confidence in HCS data
  • Generate more robust conclusions of mechanism/activity

PREVIOUS APPLICATIONS: Identification and reduction of bias from cell painting data


BioSymetrics leverages a proprietary machine learning platform (Augusta), which is used to generate structure-based activity predictions. In combination with a vertebrate in vivo phenotypic profiling framework, this has allowed us to make phenotype-mechanism association predictions across a range of potential clinical applications.