Advances in deep learning applied to digital pathology images make it possible to stratify patients by risk.

Computational pathology applications using artificial intelligence are becoming increasingly complex, moving from detecting and classifying cells and tissue to predicting biomarkers and patient outcomes. Simpler tasks rely upon pathologists’ annotations of specific features in the tissue. Biomarkers and outcomes are harder: algorithms must decipher large whole slide images without any prior knowledge of which regions of tissue, or which characteristics of its appearance, are important.

Risk stratification can already be done using cancer staging, molecular features, or clinical variables. However, improving prognostic insights is an active area of research. Prognosis refers to the likely outcome for a patient following standard treatment.

Take ductal carcinoma in situ (DCIS), a pre-invasive form of breast cancer, as an example. Many DCIS lesions do not become invasive - but which ones will? There is a great deal of inter-observer variability amongst pathologists assessing such lesions [Van Bockstal, Groen]. Researchers at Georgia State University have developed an algorithm to predict the risk of local recurrence of DCIS within ten years using digitized whole slide images [Klimov].

Risk stratification could also mean predicting whether a distant metastasis will occur or how long a patient is likely to live. Regardless of the target, the challenges in creating such an algorithm are similar.

H&E whole slide images are large, and tissue appearance is diverse. Unlike methods to find mitoses or segment tissue types, pathologists cannot annotate which regions of the tissue are associated with patient outcome - at least not with any high degree of certainty.

Limitations of Pathologist-Guided Prognostics

Traditional approaches to predicting patient outcomes from histopathology mimic the work of a pathologist. They take features that pathologists already use to stratify patients and create algorithms to extract these characteristics automatically. One major advantage of this approach is interpretability: pathologists already understand what these features look like, so the result is easy to accept when an automated method demonstrates, for example, that cellular diversity is associated with lower survival rates in non-small cell lung cancer [Lu].

If pathologists already had a superior ability to predict patient outcomes from histopathology, this approach would suffice. However, gigapixel whole slide images are visually challenging even for experts because of their size, their intricate appearance, and the heterogeneity within each.

If pathologists cannot reliably stratify tumors by risk, automated versions of the properties that they assess are unlikely to accomplish this task. Hence, more powerful image features are needed.

Capturing Prognostic Patterns with Deep Learning

Enter deep learning. Over the last decade, deep learning has revolutionized speech recognition, language translation, and facial recognition, among many other fields. The keys to its success are large amounts of data and an end-to-end method for learning models. Image features don’t need to be predefined - the model learns them. It can learn complex and abstract properties - even those beyond the capabilities of human visual processing - all based on a training set of labeled images.

Deep learning models learn to extract patterns that are predictive of some target - for instance, whether a tumor is low versus high grade. For outcome prediction, the target could be the time to a particular event - such as cancer recurrence or death.

Of course, many different factors determine how long a patient lives, and the morphology of their tumor is only one piece of the puzzle. Predicting the time to event directly is also challenging because often only a small fraction of patients will have died or had their cancer recur during follow-up. In addition, some patients leave the study before an event occurs, or the study ends before they have an event. For these patients, known as censored observations, all that is known is that they were event-free up to their last follow-up.

Instead of predicting exactly how long a patient is likely to live, most survival models take a contrastive approach: is patient A likely to live longer than patient B?

If the model incorrectly predicts which patient lived longer, it gets penalized. From each incorrect prediction, the model adapts so that it performs slightly better for each new example that it sees.
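To make the pairwise idea concrete, here is a minimal sketch in plain Python. It assumes each patient has a follow-up time, an event flag (1 if the event was observed, 0 if the patient was censored), and a model risk score where higher means an earlier expected event. The function names and the logistic form of the penalty are illustrative choices, not the method of any study cited here.

```python
import math


def comparable_pairs(times, events):
    """Yield (i, j) where patient i had an observed event before time j.

    A censored patient can only be the later member of a pair, because
    we never observed when (or whether) their event happened.
    """
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                yield i, j


def pairwise_ranking_loss(times, events, risks):
    """Average penalty over comparable pairs (illustrative logistic form).

    The penalty is small when the earlier-event patient has the higher
    risk score and grows as the ordering becomes more wrong.
    """
    penalties = [
        math.log1p(math.exp(-(risks[i] - risks[j])))
        for i, j in comparable_pairs(times, events)
    ]
    if not penalties:
        raise ValueError("no comparable pairs")
    return sum(penalties) / len(penalties)


def concordance_index(times, events, risks):
    """Fraction of comparable pairs ranked correctly (ties count as half)."""
    correct, total = 0.0, 0
    for i, j in comparable_pairs(times, events):
        total += 1
        if risks[i] > risks[j]:
            correct += 1.0
        elif risks[i] == risks[j]:
            correct += 0.5
    if total == 0:
        raise ValueError("no comparable pairs")
    return correct / total


# Toy cohort: patients 1 and 3 are censored.
times = [2, 4, 6, 8]
events = [1, 0, 1, 0]
good_risks = [0.9, 0.3, 0.5, 0.1]  # earlier events get higher risk
bad_risks = [0.1, 0.5, 0.3, 0.9]   # ordering reversed
```

In this toy cohort the well-ordered scores rank every comparable pair correctly and receive a lower loss than the reversed scores. The concordance index is a common way to report how well such a ranking holds up on held-out patients.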

From Hypothesis to Risk Prediction

Deep learning models consist of multiple layers where the higher-level concepts are built upon the lower-level ones. Each layer of the network has a set of weights that are used to compute a representation for the next layer. The weights are like a hypothesis of what properties to look for in the image. Models often have more than 100 layers and, across all layers, upwards of 10 million weights to tune.

After passing an image into the network, each layer computes a new representation based on the output from the previous layer. At the end of the network, it predicts the target - in this case, a patient risk score. The goal in training the network is to minimize erroneous predictions.

Training a model involves adjusting each weight a little bit at a time to lower the total error. This way the model’s hypothesis for the important properties in the image improves. With each new patient image and its associated survival time, the model gets a little better. After seeing many images - and artificial variations of each to provide extra examples - the model learns to predict patient risk.
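The weight-adjustment loop can be sketched on a deliberately tiny model. The example below assumes a single linear layer acting on two made-up image features per patient, trained with plain gradient descent on a pairwise logistic penalty. A real network has millions of weights and uses automatic differentiation; every name and number here is hypothetical.

```python
import math


def risk(weights, features):
    """Risk score of a one-layer linear model: a weighted sum of features."""
    return sum(w * x for w, x in zip(weights, features))


def loss_and_grad(weights, pairs):
    """Mean pairwise penalty and its gradient with respect to the weights.

    pairs holds (x_short, x_long) feature vectors, where the first patient
    had the earlier event and should therefore receive the higher risk.
    """
    loss = 0.0
    grad = [0.0] * len(weights)
    for x_short, x_long in pairs:
        margin = risk(weights, x_short) - risk(weights, x_long)
        loss += math.log1p(math.exp(-margin))
        # d/dmargin of log(1 + exp(-margin)) is -1 / (1 + exp(margin))
        scale = 1.0 / (1.0 + math.exp(margin))
        for k in range(len(weights)):
            grad[k] -= scale * (x_short[k] - x_long[k])
    n = len(pairs)
    return loss / n, [g / n for g in grad]


def train(pairs, n_features, learning_rate=0.1, steps=200):
    """Nudge each weight a little at every step to lower the total error."""
    weights = [0.0] * n_features
    history = []
    for _ in range(steps):
        loss, grad = loss_and_grad(weights, pairs)
        history.append(loss)
        weights = [w - learning_rate * g for w, g in zip(weights, grad)]
    return weights, history


# Hypothetical features for two patient pairs (earlier event listed first).
pairs = [
    ([1.0, 0.2], [0.1, 0.3]),
    ([0.8, 0.5], [0.2, 0.4]),
]
weights, history = train(pairs, n_features=2)
```

Each step moves the weights slightly in the direction that reduces the penalty, so the loss recorded in `history` falls over time. In this toy data the first feature is consistently larger for the earlier-event patient, and its weight grows positive as a result.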

The Challenge of Gigapixel Images

There is one more challenge in handling histopathology images: their size. Whole slide images are sometimes more than 100,000 pixels across. End-to-end training is generally not feasible with images this large because they won’t fit in the memory of the graphics processing unit (GPU), the specialized processor used to train deep learning models.
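A quick back-of-the-envelope calculation shows the scale of the problem. The sketch below assumes a square image 100,000 pixels on each side with three color channels, stored either as one byte per value or as 32-bit floats as a network would typically consume it. It counts only the input image, not the intermediate representations each layer must also hold.

```python
def image_bytes(width, height, channels=3, bytes_per_value=1):
    """Raw memory needed to hold an uncompressed image."""
    return width * height * channels * bytes_per_value


side = 100_000  # pixels across, as in the example above

# 8-bit RGB, as the pixels are typically stored once decompressed
uint8_gb = image_bytes(side, side) / 1e9

# 32-bit floats, a common input format for a neural network
float32_gb = image_bytes(side, side, bytes_per_value=4) / 1e9

print(f"8-bit RGB: {uint8_gb:.0f} GB, float32: {float32_gb:.0f} GB")
```

Under these assumptions the raw input alone comes to 30 GB, or 120 GB as floats, before a single layer has computed anything.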

Most solutions break the whole slide images into small patches. In some studies a pathologist identifies tumor regions for training, while others train the deep learning model on all tissue patches [Shirazi]. Another approach is to first cluster the patches by visual appearance, then use a subset of patches from each cluster [Yao].
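Tiling itself is simple to sketch. The example below assumes square 256-pixel patches on a regular grid and a hypothetical `tissue_fraction` function that reports how much of a patch is tissue rather than blank glass. In practice that information would come from a low-resolution tissue mask, and the patch size and threshold are tuning choices, not values taken from the cited studies.

```python
def patch_grid(width, height, patch_size=256, stride=256):
    """Top-left (x, y) coordinates of every full patch on a regular grid."""
    return [
        (x, y)
        for y in range(0, height - patch_size + 1, stride)
        for x in range(0, width - patch_size + 1, stride)
    ]


def keep_tissue(coords, tissue_fraction, min_fraction=0.5):
    """Drop patches that are mostly background.

    tissue_fraction is a hypothetical callable (x, y) -> float in [0, 1].
    """
    return [(x, y) for x, y in coords if tissue_fraction(x, y) >= min_fraction]


# Small example: a 1024 x 512 region yields a 4 x 2 grid of patches.
coords = patch_grid(1024, 512)

# Pretend only the left half of the region contains tissue.
tissue_coords = keep_tissue(coords, lambda x, y: 1.0 if x < 512 else 0.0)

# The same grid over a slide 100,000 pixels on each side.
n_slide_patches = len(patch_grid(100_000, 100_000))
```

For a slide 100,000 pixels across, this grid produces more than 150,000 patches before any filtering. That is why selecting or clustering patches matters.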

Risk predictions from the patches must then be aggregated to form a final risk score for the patient. Often this takes the form of a model that learns to select the most informative patches.
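Two simple aggregation schemes illustrate the idea. The sketch below assumes each patch already has a risk score. In the attention-style version, each patch also has a relevance score that a real model would learn during training; here those scores are supplied by hand. The function names are illustrative.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to one."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_risks, relevance_scores):
    """Weighted average of patch risks; more relevant patches count more."""
    weights = softmax(relevance_scores)
    slide_risk = sum(w * r for w, r in zip(weights, patch_risks))
    return slide_risk, weights


def top_k_mean(patch_risks, k):
    """Average the k highest-risk patches and ignore the rest."""
    top = sorted(patch_risks, reverse=True)[:k]
    return sum(top) / len(top)


patch_risks = [0.2, 0.9, 0.4]

# Equal relevance reduces to a plain average of the patch risks.
uniform_risk, uniform_weights = attention_pool(patch_risks, [0.0, 0.0, 0.0])

# A high relevance score on the second patch pulls the slide risk toward 0.9.
focused_risk, focused_weights = attention_pool(patch_risks, [0.0, 5.0, 0.0])

worst_two = top_k_mean(patch_risks, k=2)
```

The attention weights also show which patches drove the prediction, which is one way such models can point back to specific regions of the tissue.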

Further Advances and the Unique Power of Histopathology

Outcome prediction models have also been built to overcome additional challenges of histopathology data. Federated learning models can handle data sets located in different centers to preserve privacy [Andreux]. And models can learn to predict survival across multiple types of cancer simultaneously [Wulczyn, Vale-Silva].

Histopathology is of course not the only modality with a demonstrated ability to predict patient outcomes. Whole slide images can be combined with genomic and clinical features to improve outcome predictions, all within the same deep learning model [Hao, Chen, Vale-Silva].

Some pan-cancer studies have shown that clinical data and gene expression are most beneficial in predicting prognosis [Vale-Silva] and that histopathology features provide no additional predictive power [Zhong, Vale-Silva]. However, deep learning-based methods for whole slide images are still new, and their full potential has yet to be reached.

Histopathology provides a unique perspective that a single genomic profile cannot: a spatial view of the tumor. Researchers are just beginning to understand the role that intratumoral heterogeneity plays in tumor progression [Alizadeh, McGranahan, Natrajan]. These spatial variations can be captured from images far more efficiently than with genomic profiling.

While interpretability is still a challenge for deep learning models, they can take advantage of the spatial nature of whole slide images by indicating which regions of the tissue are most associated with a poor outcome. In some cases the highlighted regions are not even in the tumor itself but in the adjacent stroma [Courtiol]. This information can provide new insights for pathologists.

H&E histology is a routine part of the pathology pipeline. It is cheaper and faster than molecular analyses. As the transition to digital pathology accelerates, these whole slide images provide many new opportunities for artificial intelligence. Prognostic models with deep learning are just beginning to show their potential. It might just require a larger data set to find the most prognostic patterns in these gigapixel images.



[Alizadeh] Alizadeh, Ash A., et al. “Toward understanding and exploiting tumor heterogeneity.” Nature medicine 21.8 (2015): 846.

[Andreux] Andreux, Mathieu, et al. “Federated Survival Analysis with Discrete-Time Cox Models.” arXiv preprint arXiv:2006.08997 (2020).

[Chen] Chen, Richard J., et al. “Pathomic fusion: an integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis.” IEEE Transactions on Medical Imaging (2020).

[Courtiol] Courtiol, Pierre, et al. “Deep learning-based classification of mesothelioma improves prediction of patient outcome.” Nature medicine 25.10 (2019): 1519-1525.

[Groen] Groen, Emma J., et al. “Prognostic value of histopathological DCIS features in a large-scale international interrater reliability study.” Breast cancer research and treatment 183.3 (2020): 759-770.

[Hao] Hao, Jie, et al. “PAGE-Net: Interpretable and integrative deep learning for survival analysis using histopathological images and genomic data.” Pacific Symposium on Biocomputing. Vol. 25. 2020.

[Klimov] Klimov, Sergey, et al. “A whole slide image-based machine learning approach to predict ductal carcinoma in situ (DCIS) recurrence risk.” Breast Cancer Research 21.1 (2019): 83.

[Lu] Lu, Cheng, et al. “A prognostic model for overall survival of patients with early-stage non-small cell lung cancer: a multicentre, retrospective study.” The Lancet Digital Health 2.11 (2020).

[McGranahan] McGranahan, Nicholas, and Charles Swanton. “Biological and therapeutic impact of intratumor heterogeneity in cancer evolution.” Cancer cell 27.1 (2015): 15-26.

[Natrajan] Natrajan, Rachael, et al. “Microenvironmental heterogeneity parallels breast cancer progression: a histology–genomic integration analysis.” PLoS medicine 13.2 (2016): e1001961.

[Shirazi] Shirazi, Amin Zadeh, et al. “DeepSurvNet: deep survival convolutional network for brain cancer survival rate classification based on histopathological images.” Medical & Biological Engineering & Computing (2020): 1-15.

[Wulczyn] Wulczyn, Ellery, et al. “Deep learning-based survival prediction for multiple cancer types using histopathology images.” PLoS One 15.6 (2020): e0233678.

[Vale-Silva] Vale-Silva, Luis Andre, and Karl Rohr. “MultiSurv: Long-term cancer survival prediction using multimodal deep learning.” medRxiv (2020).

[Van Bockstal] Van Bockstal, Mieke R., et al. “Interobserver Variability in Ductal Carcinoma In Situ of the Breast.” American journal of clinical pathology 154.5 (2020): 596-609.

[Yao] Yao, Jiawen, et al. “Whole slide images based cancer survival prediction using attention guided deep multiple instance learning networks.” Medical Image Analysis 65 (2020): 101789.

[Zhong] Zhong, Tingyan, Mengyun Wu, and Shuangge Ma. “Examination of independent prognostic power of gene expressions and histopathological imaging features in cancer.” Cancers 11.3 (2019): 361.