Mitigating Batch Effects in Cell Painting Data

With the advent of high-content screening methodologies (e.g. cellular imaging and transcriptomics), it has become more challenging to tease apart and visualize batch effects. The problem is compounded when building machine learning models, which can easily use these confounding variables instead of real biological signal to generate predictions, leading to poor real-world relevance.

Blog Post

Dishing Dirt About Clean Data

A daughter's desire to please her parents illustrates how a well-intentioned data scientist can cause far more harm and expense in the long run by selecting and creating the wrong features during data pre-processing.
