Large & Diverse Images
Whole-slide histopathology images often exceed 60,000 pixels across. A single slide contains multiple tissue types, including both tumor and non-tumor tissue. Tissue appearance is heterogeneous, varying from patient to patient and sometimes within a single tumor.
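To give a sense of scale, here is a rough sketch of how many tiles one such slide produces if it is cut into a grid of fixed-size patches before modeling. The 256-pixel tile size, the square 60,000-pixel slide, and the function names are illustrative assumptions, not values taken from a specific pipeline.

```python
import math


def count_tiles(width, height, tile_size):
    """Number of non-overlapping tiles needed to cover the slide.

    Partial tiles at the right and bottom edges are counted as full tiles.
    """
    cols = math.ceil(width / tile_size)
    rows = math.ceil(height / tile_size)
    return cols * rows


def tile_origins(width, height, tile_size):
    """Yield the (x, y) top-left pixel coordinate of each tile, row by row."""
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            yield (x, y)


# Hypothetical slide: 60,000 x 60,000 pixels, cut into 256-pixel tiles.
n_tiles = count_tiles(60_000, 60_000, 256)
print(n_tiles)  # 235 columns x 235 rows = 55225 tiles
```

Even before any tissue filtering, a single slide becomes tens of thousands of patches, which is why models typically process tiles rather than the whole image at once.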
For some applications, such as mitosis detection and tissue segmentation, pathologists can provide detailed annotations. But for patient-level tasks, such as predicting molecular biomarkers, treatment response, or patient outcomes, only a single label per patient is available. The algorithm must learn on its own which regions of the image are important, often through multiple instance learning.
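One common form of multiple instance learning is attention-based pooling: each tile gets a learned score, the scores are normalized with a softmax, and the slide-level representation is the weighted average of the tile features. The sketch below illustrates only the pooling step, in plain Python with random, untrained weights. The names `attention_pool`, `V`, and `w`, and all dimensions, are assumptions chosen for illustration; in practice these parameters would be trained from the patient-level labels.

```python
import math
import random


def attention_pool(tile_features, V, w):
    """Combine per-tile feature vectors into one slide-level vector.

    score_k  = w . tanh(V h_k)        (one scalar per tile)
    weight_k = softmax(score)_k       (weights sum to 1 over tiles)
    slide    = sum_k weight_k * h_k
    """
    scores = []
    for h in tile_features:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Numerically stable softmax over tiles.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(tile_features[0])
    slide_vector = [
        sum(a * h[j] for a, h in zip(weights, tile_features)) for j in range(dim)
    ]
    return slide_vector, weights


# Toy bag: 5 tiles with 4 features each (random stand-ins for real tile embeddings).
rng = random.Random(0)
n_tiles, feat_dim, hidden_dim = 5, 4, 3
tiles = [[rng.gauss(0, 1) for _ in range(feat_dim)] for _ in range(n_tiles)]
V = [[rng.gauss(0, 0.5) for _ in range(feat_dim)] for _ in range(hidden_dim)]
w = [rng.gauss(0, 0.5) for _ in range(hidden_dim)]

slide_vector, weights = attention_pool(tiles, V, w)
```

After training, the attention weights indicate which tiles contributed most to the prediction, which gives a rough view of the regions the model considers important.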
Limited Labeled Data
Some computer vision applications draw on millions of labeled images, whereas medical data sets of about 1,000 patients are far more common. Transfer learning and self-supervised methods are often critical to success in this low-sample-size regime.
Additional Modalities of Data
The models with the greatest potential to improve patient care and outcomes often combine multiple modalities of data: histopathology, clinical, genomic, proteomic, and others. Specialized models can make use of both structured and unstructured sources of data.
Language Barrier Between Disciplines
Understanding the intricacies of a particular application and its possible clinical use cases requires the expertise of pathologists and other domain experts. Project success depends on communication in both directions: about the disease and about the machine learning solution.