Bias Reduction in HCS Data
PURPOSE: Quantify and correct bias in high-content screening (HCS) data
INPUT: Chemical structures, morphological properties (or original images)
OUTPUT: Dynamic workflow that integrates bias removal and mechanism prediction
USE CASE:
Batch effects are a common issue in high-throughput assays and often produce patterns in the data that are unrelated to assay response.
Machine learning (ML) models latch on to any source of regularity.
Without Augusta™ pre-processing and Contingent-AI (patent pending), ML models will learn plate and plate-group effects instead of true behavior.
<NOTE: images were removed from this post>
Correlation data for the same assay with plates presented in random order (left picture) versus alphabetically by plate ID (right picture).
Clear patterns tied to the naming scheme are visible in the sorted view, reflecting factors such as lab, freezer, and surveyor.
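The pattern described above can also be summarized as a single number. The sketch below is illustrative only and is not part of Augusta™: it compares the mean correlation between wells on the same plate with the mean correlation between wells on different plates. The function names, plate IDs, and synthetic profiles are all assumptions made for the example.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def plate_correlation_gap(profiles, plate_ids):
    """Mean within-plate correlation minus mean between-plate correlation."""
    within, between = [], []
    for i, j in combinations(range(len(profiles)), 2):
        r = pearson(profiles[i], profiles[j])
        (within if plate_ids[i] == plate_ids[j] else between).append(r)
    return sum(within) / len(within) - sum(between) / len(between)


# Synthetic wells (hypothetical): 3 plates x 8 wells, 8 features per well.
# Each plate adds its own fixed offset pattern to every well it contains.
rng = random.Random(0)
plate_offsets = {
    "P1": [1, -1, 1, -1, 1, -1, 1, -1],
    "P2": [1, 1, -1, -1, 1, 1, -1, -1],
    "P3": [1, 1, 1, 1, -1, -1, -1, -1],
}
profiles, plate_ids = [], []
for plate, offset in plate_offsets.items():
    for _ in range(8):
        profiles.append([o + rng.gauss(0, 0.3) for o in offset])
        plate_ids.append(plate)

gap = plate_correlation_gap(profiles, plate_ids)
print(f"within-minus-between correlation gap: {gap:.2f}")
```

A gap near zero suggests that plate membership leaves little trace in the profiles. A large positive gap corresponds to the block structure that appears when a correlation matrix is sorted by plate ID.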
A variety of algorithms can be used to reduce this bias, but care must be taken not to remove assay information.
High accuracy when predicting plate, well, or plate group indicates systematic bias.
High accuracy when predicting the assay response is the “desired” outcome.
Here we see that normalization methods slightly increase the ability to predict the assayed response, but they also increase the ability to predict plate and plate group.
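The diagnostic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Augusta™ implementation: it uses a simple nearest-centroid classifier with leave-one-out evaluation, and the plate names, response labels, and synthetic features are hypothetical.

```python
import random


def loo_nearest_centroid_accuracy(X, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    hits = 0
    for i, (x, true_label) in enumerate(zip(X, labels)):
        # Build class centroids from every well except the held-out one.
        groups = {}
        for j, (row, label) in enumerate(zip(X, labels)):
            if j != i:
                groups.setdefault(label, []).append(row)
        centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        predicted = min(
            centroids,
            key=lambda lab: sum((c - v) ** 2 for c, v in zip(centroids[lab], x)),
        )
        hits += predicted == true_label
    return hits / len(X)


# Synthetic wells (hypothetical): 4 plates x 12 wells, 4 features per well.
rng = random.Random(0)
plate_offsets = {
    "P1": [2, 0, 0, 0], "P2": [0, 2, 0, 0],
    "P3": [-2, 0, 0, 0], "P4": [0, -2, 0, 0],
}
response_effect = {"active": [0, 0, 1.5, 1.5], "inactive": [0, 0, 0, 0]}

X, plates, responses = [], [], []
for plate, offset in plate_offsets.items():
    for well in range(12):
        response = "active" if well % 2 == 0 else "inactive"
        X.append([
            o + e + rng.gauss(0, 0.5)
            for o, e in zip(offset, response_effect[response])
        ])
        plates.append(plate)
        responses.append(response)

# The same features are scored against two targets.
plate_acc = loo_nearest_centroid_accuracy(X, plates)
response_acc = loo_nearest_centroid_accuracy(X, responses)
chance_plate = 1 / len(plate_offsets)

print(f"plate accuracy:    {plate_acc:.2f} (chance {chance_plate:.2f})")
print(f"response accuracy: {response_acc:.2f} (chance 0.50)")
```

Plate accuracy well above chance means the features carry plate information that a downstream model could exploit. Running the same check before and after a normalization step shows whether that step helped, or made plate easier to predict as in the comparison above.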
Each Augusta™ workflow is built to accommodate several bias reduction methods and to permute their parameters dynamically, minimizing prediction accuracy for plate/well/group while maximizing it for the desired outcome.
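One way to picture this kind of search is a small grid over correction settings, each scored on both targets. The sketch below is an assumption-laden illustration and does not describe how Augusta™ works internally: the correction (partial per-plate centering plus optional z-scoring), the objective, the parameter names, and the synthetic data are all hypothetical.

```python
import random
from itertools import product


def loo_accuracy(X, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    hits = 0
    for i, (x, true_label) in enumerate(zip(X, labels)):
        groups = {}
        for j, (row, label) in enumerate(zip(X, labels)):
            if j != i:
                groups.setdefault(label, []).append(row)
        cents = {k: [sum(c) / len(r) for c in zip(*r)] for k, r in groups.items()}
        pred = min(cents, key=lambda k: sum((c - v) ** 2 for c, v in zip(cents[k], x)))
        hits += pred == true_label
    return hits / len(X)


def apply_correction(X, plates, shrink, standardize):
    """Subtract shrink x plate mean per feature, then optionally z-score."""
    by_plate = {}
    for row, plate in zip(X, plates):
        by_plate.setdefault(plate, []).append(row)
    means = {p: [sum(c) / len(r) for c in zip(*r)] for p, r in by_plate.items()}
    out = [[v - shrink * m for v, m in zip(row, means[p])] for row, p in zip(X, plates)]
    if standardize:
        cols = list(zip(*out))
        mu = [sum(c) / len(c) for c in cols]
        sd = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 for c, m in zip(cols, mu)]
        out = [[(v - m) / s for v, m, s in zip(row, mu, sd)] for row in out]
    return out


def objective(X, plates, responses):
    """Response accuracy minus plate accuracy in excess of chance."""
    chance = 1 / len(set(plates))
    plate_acc = loo_accuracy(X, plates)
    response_acc = loo_accuracy(X, responses)
    return response_acc - max(0.0, plate_acc - chance), plate_acc, response_acc


# Synthetic wells (hypothetical): 4 plates x 12 wells, 4 features per well.
rng = random.Random(0)
offsets = {"P1": [2, 0, 0, 0], "P2": [0, 2, 0, 0],
           "P3": [-2, 0, 0, 0], "P4": [0, -2, 0, 0]}
effect = {"active": [0, 0, 1.5, 1.5], "inactive": [0, 0, 0, 0]}
X, plates, responses = [], [], []
for plate, offset in offsets.items():
    for well in range(12):
        resp = "active" if well % 2 == 0 else "inactive"
        X.append([o + e + rng.gauss(0, 0.5) for o, e in zip(offset, effect[resp])])
        plates.append(plate)
        responses.append(resp)

# Try every combination of settings and keep the best-scoring one.
results = {}
for shrink, standardize in product((0.0, 0.5, 1.0), (False, True)):
    corrected = apply_correction(X, plates, shrink, standardize)
    results[(shrink, standardize)] = objective(corrected, plates, responses)

best = max(results, key=lambda params: results[params][0])
for params, (score, plate_acc, response_acc) in results.items():
    print(params, f"score={score:.2f} plate={plate_acc:.2f} response={response_acc:.2f}")
print("selected:", best)
```

For brevity the correction is fitted on all wells before cross-validation. A real workflow would more likely fit it inside each fold, since centering on the full data can push plate accuracy below chance. The objective only penalizes plate accuracy above chance so that such artifacts are not rewarded.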
USER BENEFITS:
Accommodate confounding factors
Gain more insight into compound effects, increase confidence in HCS data
Generate more robust conclusions of mechanism/activity
PREVIOUS APPLICATIONS: Identification and reduction of bias from cell painting data
SUMMARY
BioSymetrics uses a proprietary machine learning platform (Augusta™) to generate structure-based activity predictions. Combined with a vertebrate, in vivo phenotypic profiling framework, this has allowed us to make phenotype-mechanism association predictions across a range of potential clinical applications.