Fight Cancer with AI


Detect tumor in whole slide images
Segment tissue types


Predict patient recurrence or survival
Count mitoses


Infer molecular biomarkers and treatment response

Machine learning projects for pathology have unique challenges

Large & Diverse Images

Whole slide histopathology images are often more than 60,000 pixels across. They contain multiple tissue types and both tumor and non-tumor tissue. Tissue appearance is heterogeneous, both from patient to patient and sometimes within a single tumor.
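
Because an image this large rarely fits through a model in one pass, a common approach is to cut it into smaller tiles first. Here is a minimal sketch of that step in plain Python. The function name `tile_grid`, the 512-pixel tile size, and the square 60,000-pixel slide are illustrative assumptions, not a specific tool or dataset.

```python
from typing import Iterator, Tuple


def tile_grid(
    width: int, height: int, tile: int = 512, stride: int = 512
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x, y, w, h) boxes that cover a width x height image.

    Boxes at the right and bottom edges are clipped to the image bounds.
    """
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    for y in range(0, height, stride):
        for x in range(0, width, stride):
            yield x, y, min(tile, width - x), min(tile, height - y)


# Hypothetical slide: 60,000 x 60,000 pixels, cut into 512-pixel tiles.
boxes = list(tile_grid(60_000, 60_000))
print(len(boxes))  # 118 tiles per side, 13,924 in total
```

In practice many of these tiles are background and would be filtered out before modeling. That filtering step is not shown here.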

Weak Labels

For some applications, like mitosis detection and tissue segmentation, pathologists can provide detailed annotations. But for patient-level prediction tasks like molecular biomarkers, treatment response, or patient outcomes, the algorithm must learn for itself which regions of the image are important, often through multiple instance learning.
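
One common flavor of multiple instance learning treats a slide as a bag of tile embeddings and pools them with attention weights, so that a single patient-level label can train the whole model. The sketch below shows only the pooling step in plain Python. The name `attention_pool` is hypothetical, and the scores are passed in by hand as an assumption. In a real model they would come from a small learned network.

```python
import math
from typing import List, Sequence, Tuple


def attention_pool(
    embeddings: Sequence[Sequence[float]], scores: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Combine tile embeddings into one slide-level vector.

    The scores are turned into weights with a softmax, then used to take
    a weighted average of the tile embeddings.
    """
    if not embeddings or len(embeddings) != len(scores):
        raise ValueError("need one score per tile embedding")
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(embeddings[0])
    pooled = [
        sum(w * emb[d] for w, emb in zip(weights, embeddings))
        for d in range(dim)
    ]
    return pooled, weights


# Three hypothetical tiles; the third receives the highest attention score.
tiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 2.0]
slide_vector, weights = attention_pool(tiles, scores)
print(weights)
```

The weights also show which tiles the model relied on, which can help when reviewing predictions with a pathologist.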

Limited Labeled Data

Some computer vision applications draw on millions of labeled images, whereas datasets of around 1,000 patients are far more common in medical applications. Transfer learning and self-supervised methods are often critical to success in this low-sample-size regime.

Additional Modalities of Data

The most powerful models for improving patient care and outcomes often combine multiple modalities of data: histopathology, clinical, genomic, proteomic, and others. Specialized models can make use of these structured and unstructured sources of data.

Language Barrier Between Disciplines

Understanding the intricacies of a particular application and its possible clinical use cases requires the expertise of pathologists and other domain experts. Project success depends on communication in both directions: about the disease and about the machine learning solution.

Build robust and generalizable models

  1. Decipher the distribution shifts in your images
  2. Properly clean your data
  3. Create models that accommodate unique data challenges
  4. Thoroughly validate your models
  5. Iterate and improve

Work With Me