Digital staining for the automated annotation of mass spectrometry imaging (MSI) data has previously been achieved using state-of-the-art classifiers such as random forests or support vector machines (SVMs). However, training such classifiers requires an expert to label exemplary data in advance. This process is time-consuming and hence costly, especially if the tissue is heterogeneous. In theory, it may be sufficient to label only a few highly representative pixels of an MS image, but it is not known a priori which pixels to select. This motivates active learning strategies, in which the algorithm itself queries the expert by automatically suggesting promising candidate pixels of an MS image for labeling. Given a suitable querying strategy, the number of required training labels can be significantly reduced while classification accuracy is maintained. In this work, we propose active learning for convenient annotation of MSI data. We generalize a recently proposed active learning method to the multiclass case and combine it with the random forest classifier. Its superior performance over random sampling is demonstrated on secondary ion mass spectrometry data, making it an interesting approach for the classification of MS images.
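The querying step described above can be illustrated with a minimal sketch. The abstract does not specify the query criterion, so this example substitutes plain entropy-based uncertainty sampling, which is not necessarily the method generalized in the paper. It assumes a classifier (for instance a random forest) has already produced per-class vote fractions for each unlabeled pixel. The names `entropy`, `query_most_uncertain`, and the `pool` values are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, Hashable, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def query_most_uncertain(
    pool_probs: Dict[Hashable, Sequence[float]], n_queries: int = 1
) -> List[Hashable]:
    """Return ids of the unlabeled pixels whose predictions are most uncertain.

    Assumption: pool_probs maps a pixel id to class probabilities predicted
    by the current classifier (e.g. random forest vote fractions).
    """
    ranked = sorted(
        pool_probs, key=lambda pid: entropy(pool_probs[pid]), reverse=True
    )
    return ranked[:n_queries]


# Hypothetical vote fractions for four unlabeled pixels and three classes.
pool = {
    (0, 0): [0.98, 0.01, 0.01],  # confidently classified
    (0, 1): [0.40, 0.35, 0.25],  # ambiguous
    (1, 0): [0.70, 0.20, 0.10],  # fairly confident
    (1, 1): [0.34, 0.33, 0.33],  # almost uniform, most ambiguous
}

# Pixels the expert would be asked to label next.
suggested = query_most_uncertain(pool, n_queries=2)
```

In a full active learning loop, the expert's labels for the suggested pixels would be added to the training set, the classifier retrained, and the query repeated until the labeling budget is exhausted.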

Anal. Chem.

Hanselmann, M., Roeder, J., Köthe, U., Renard, B. Y., Heeren, R. M. A., & Hamprecht, F. A. (2013). Active learning for convenient annotation and classification of Mass Spectrometry images. Anal. Chem., 85(1), 147–155. doi:10.1021/ac3023313