Channel Embedding for Informative Protein Identification from Highly Multiplexed Images

Code [GitHub]
bioRxiv 2020 [Paper]

Figure 1: Informative channel identification. (a) Given highly multiplexed imaging data, (b) we train a neural network to encode a channel embedding and classify a label (e.g., tumor grade). (c) We then measure each channel's importance for the classification task by applying an interpretation method to the channel embedding. (d) We evaluate our system by comparing the predicted informative channels to expert knowledge, and provide new insights for clinicians and pathologists.

Abstract

Interest is growing rapidly in using deep learning to classify biomedical images, and interpreting these deep-learned models is necessary for life-critical decisions and scientific discovery. Effective interpretation techniques accelerate biomarker discovery and provide new insights into the etiology, diagnosis, and treatment of disease. Most interpretation techniques aim to discover spatially salient regions within images, but few techniques consider imagery with multiple channels of information. For instance, highly multiplexed tumor and tissue images have 30-100 channels and require interpretation methods that work across many channels to provide deep molecular insights. We propose a novel channel embedding method that extracts features from each channel. We then use these features to train a classifier for prediction. Using this channel embedding, we apply an interpretation method to rank the most discriminative channels. To validate our approach, we conduct an ablation study on a synthetic dataset. Moreover, we demonstrate that our method aligns with biological findings on highly multiplexed images of breast cancer cells while outperforming baseline pipelines.
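The pipeline described in the abstract (per-channel features, a classifier on top, then an interpretation step that ranks channels) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the toy data generator, the (mean, std) per-channel embedding, the logistic-regression classifier, and the gradient-times-input attribution are simple stand-ins for the neural network and interpretation method used in the work, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_synthetic(n=400, channels=8, size=16, informative=(2, 5)):
    """Toy multiplexed images; only the `informative` channels depend on the label."""
    y = rng.integers(0, 2, size=n)
    x = rng.normal(0.0, 1.0, size=(n, channels, size, size))
    for c in informative:
        # Label-dependent intensity shift of +/-0.5 (assumed, for illustration only).
        x[:, c] += (2.0 * y - 1.0)[:, None, None] * 0.5
    return x, y

def channel_embedding(x):
    """Stand-in embedding: one (mean, std) feature vector per channel."""
    flat = x.reshape(x.shape[0], x.shape[1], -1)
    e = np.stack([flat.mean(-1), flat.std(-1)], axis=-1)  # (n, channels, 2)
    # Standardize each feature so attributions are comparable across channels.
    return (e - e.mean(axis=0)) / e.std(axis=0)

def train_logreg(e, y, lr=0.5, steps=200):
    """Logistic regression on the concatenated channel embeddings (gradient descent)."""
    n, c, d = e.shape
    f = e.reshape(n, c * d)
    w, b = np.zeros(c * d), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(f @ w + b)))
        g = p - y                      # gradient of the log-loss w.r.t. the logit
        w -= lr * (f.T @ g) / n
        b -= lr * g.mean()
    return w.reshape(c, d), b

def channel_importance(e, w):
    """Gradient-times-input attribution, summed over each channel's features."""
    return np.abs(e * w).sum(axis=-1).mean(axis=0)  # (channels,)

x, y = make_synthetic()
emb = channel_embedding(x)
w, b = train_logreg(emb, y)
importance = channel_importance(emb, w)
ranking = np.argsort(importance)[::-1]  # most important channel first
print(ranking)
```

Because the attribution is computed on the channel embedding rather than on pixels, each score maps directly to one channel. On this toy data the two label-dependent channels should appear at the top of `ranking`, which mirrors, in a very reduced form, the idea of validating on a synthetic dataset with known informative channels.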

Highlights

(1) We introduce a novel system to automatically identify informative channels in highly multiplexed tissue images and to provide interpretable and potentially actionable insight for research and clinical applications.


(2) In our experimental results, we demonstrate that our system outperforms conventional algorithms combined with interpretation techniques on the informative channel identification task for assessing tumor grade.


(3) The informative channels identified by our novel method align with findings from a single-cell data analysis, even though our approach does not require single-cell segmentation.
