As with any assay, L1000 data is noisy. Experimental replicates (the same compound tested on the same cell line under the same conditions) often yield different measured expression levels. De-noising the L1000 data makes it easier to see the true assay response and to pick a representative concentration for each compound.
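One simple way to picture replicate de-noising is to collapse the replicates for each condition to a single robust value. The sketch below is illustrative only: the tuple layout, the compound and cell line names, and the choice of the median are assumptions, not the method used for the L1000 analysis described here.

```python
from collections import defaultdict
from statistics import median


def collapse_replicates(measurements):
    """Collapse replicate measurements to one median value per condition.

    `measurements` is an iterable of (compound, cell_line, dose_um, value)
    tuples. This field layout is hypothetical.
    """
    groups = defaultdict(list)
    for compound, cell_line, dose_um, value in measurements:
        groups[(compound, cell_line, dose_um)].append(value)
    # The median is less sensitive to a single outlying replicate than the mean.
    return {key: median(values) for key, values in groups.items()}


# Made-up example values for one compound at two concentrations.
replicates = [
    ("cmpd_A", "MCF7", 10.0, 1.8),
    ("cmpd_A", "MCF7", 10.0, 2.1),
    ("cmpd_A", "MCF7", 10.0, 5.9),  # outlying replicate
    ("cmpd_A", "MCF7", 1.0, 0.4),
    ("cmpd_A", "MCF7", 1.0, 0.6),
]
collapsed = collapse_replicates(replicates)
```

With one value per compound, cell line and concentration, the responses at different concentrations can be compared directly when choosing a representative one.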
When are two compounds the same? This piece examines how the Simplified Molecular Input Line Entry System (SMILES) format affects chemical database overlap, and covers best practice for canonicalization and harmonization so that the impact of these effects on a particular dataset and application can be understood.
CASE STUDY: Machine learning for activity prediction, as part of lead compound generation
The Challenge: Iterating quickly over multiple large feature sets, with the flexibility to test models at scale, is a challenge for any data scientist.
USE CASE: Value-Based Care
CLIENT: Major UK-based healthcare network in partnership with Intacare.
OVERVIEW: The annual cost of radiotherapy is escalating year-on-year, with little visibility of root cause and control. Maintaining cost-efficient healthcare for patients required an investigation of current code/claim and cost data.
GOAL: Identify and quantify potential cost savings of revising existing reimbursement mechanisms.
PROBLEM: The data was inconsistent. Each healthcare provider used different systems, taxonomies, codes and cost bases for mapping radiotherapy procedures when submitting claims.
- 75,000+ claims
- 1725+ unique narratives
- Thousands of individual codes and duplicate codes
- Data types: text, numeric
SOLUTION: Intacare used Augusta Pre-Processing workflows to quickly normalize procedure and cost data collected from multiple sources. The team also created an automatable workflow to streamline future analysis and report generation.
- Identified inefficiencies across 24,281 claims
- 39% of total claims
- Projected cost savings: $5 million
Pre-processing of data using Augusta workflows reduced the time to manage the data from multiple weeks to just hours.
Moreover, the workflows are now available within Augusta as standard packages, easily replicated for future projects or if additional data needs to be interrogated.
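The kind of normalization described in this case study, mapping each provider's own codes onto one shared taxonomy, can be sketched in a few lines. Everything in the sketch is hypothetical: the provider names, codes, labels and the `CODE_MAP` lookup are invented for illustration, and it does not represent Augusta's workflows or API.

```python
import re

# Hypothetical lookup from (provider, local code) to a shared procedure label.
CODE_MAP = {
    ("provider_a", "RT-001"): "external_beam",
    ("provider_b", "XB100"): "external_beam",
    ("provider_b", "BR200"): "brachytherapy",
}


def clean_code(raw):
    """Remove whitespace and uppercase a code so near-duplicates match."""
    return re.sub(r"\s+", "", raw).upper()


def normalize_claims(claims):
    """Split claims into (normalized, unmapped) lists.

    Normalized claims gain a shared "procedure" label. Claims whose code is
    not in CODE_MAP are returned separately for manual review.
    """
    normalized, unmapped = [], []
    for claim in claims:
        key = (claim["provider"], clean_code(claim["code"]))
        label = CODE_MAP.get(key)
        if label is None:
            unmapped.append(claim)
        else:
            normalized.append({**claim, "procedure": label})
    return normalized, unmapped


# Made-up claims from two providers that code the same procedure differently.
claims = [
    {"provider": "provider_a", "code": " rt-001 ", "cost": 1200.0},
    {"provider": "provider_b", "code": "XB100", "cost": 950.0},
    {"provider": "provider_b", "code": "zz999", "cost": 400.0},
]
normalized, unmapped = normalize_claims(claims)
```

Keeping unmapped claims in their own list, rather than dropping them, makes gaps in the lookup visible so the mapping can be extended before costs are compared across providers.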
Challenge: Combine Disparate Data Sets in Pre-Processing for ML
Summary: Results show that combining data sources generally gave better diagnostic performance than any single data set alone (Figures 1 and 2).
Report Title: Distributed Processing Frameworks for Machine Learning of Combined Biomedical Data Types
This white paper discusses the computing requirements of the combined data types that the Augusta™ platform was built to handle.
It is a must-read for understanding the compute complexities of pre-processing various data types and for identifying ideal scenarios for using and pricing Augusta™.