IIIT-Delhi Institutional Repository

Deep learning assisted methods for microscopic blood cancer imaging analysis


dc.contributor.author Gehlot, Shiv
dc.contributor.author Gupta, Anubha (Advisor)
dc.date.accessioned 2022-05-21T09:04:23Z
dc.date.available 2022-05-21T09:04:23Z
dc.date.issued 2022-04-08
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1029
dc.description.abstract This thesis, “Deep Learning Assisted Methods for Microscopic Blood Cancer Imaging Analysis,” aims to develop deep learning-based CAD tools for Acute Lymphoblastic Leukemia (ALL) and Multiple Myeloma (MM). A CAD tool typically follows a hierarchical approach, starting with stain normalization followed by cell segmentation and classification. Stain normalization counters stain variation in the dataset, while segmentation extracts the region of interest (ROI) from the histopathology images, which is finally analyzed for diagnosis (classification). Accordingly, the first objective of this thesis is to develop a robust method for stain normalization. During the data collection process, a slide is prepared using the smear. A whole-slide or microscopic image is then captured using a scanner or a camera mounted on the microscope. Slide preparation also involves staining to highlight the underlying cellular/tissue structure. For example, the application of hematoxylin and eosin (H&E) results in blue staining of the nuclei (due to H) and pink staining of the cytoplasm (due to E). Staining leads to stain variation in the images due to multiple factors, such as variation in scanners at different collection centers, stain quantities, or staining time. A significant prerequisite for deep learning algorithms is that training and test conditions must match. However, due to stain variation, the data distributions of different centers may differ, and a model trained on one center's data may perform sub-optimally on another's. Stain normalization counters this variation and matches the chromatic distributions of the two images/centers. The second objective is to develop a segmentation algorithm. Segmentation aims to identify the ROI by assigning a label to each pixel. The ROI could be a nucleus, cell, or gland in a histopathology image. For example, in ALL and MM diagnoses, the ROIs are the cell nuclei in the microscopic images.
However, not all nuclei in an image correspond to the cells of interest; hence, an expert oncologist is required to identify the relevant ones. Semantic and instance segmentation are the two categories of segmentation. While the former assigns the same label to all instances of the ROI, the latter assigns a distinct label to each instance. For example, semantic segmentation will identify a cluster of nuclei as a single region, while instance segmentation will further segregate the cluster. We target instance segmentation for ALL and MM diagnosis because the analysis is performed at the nucleus level. The final objective is to develop deep learning-based classification methods for ALL and MM that discriminate the segmented structures (nuclei) into different categories, cancerous or healthy, based on the underlying patterns. The problem is challenging due to the visual similarity between malignant and normal cells. To meet the first objective, this thesis proposes the Geometry-inspired Chemical invariant and Tissue Invariant Stain Normalization (GCTI-SN) method for microscopic medical images. The proposed GCTI-SN method corrects for illumination variation, stain chemical variation, and stain quantity variation in a unified framework by exploiting the geometry of the underlying color vector space. The GCTI-SN method is benchmarked against existing methods via quantitative and qualitative results, validating its robustness across stain chemicals and cell/tissue types. Further, the utility and efficacy of the proposed GCTI-SN stain normalization method are demonstrated diagnostically in the application of breast cancer detection via a CNN-based classifier. For the second objective, the thesis develops an Encoder-Decoder based Convolutional Neural Network (CNN) with Nested-Feature Concatenation (EDNFC-Net) for automatic nuclei segmentation.
Accurate nuclei identification is an important yet complex step in diagnosis due to the heterogeneity in structure, color, and texture among different categories of cells. The problem is further complicated by overlapped/clustered nuclei. The feature concatenation cell (FCC) of EDNFC-Net comprises two stacks of convolutional filters combined with nonlinearity, followed by a concatenation of features. Apart from intra-FCC feature concatenation, a mechanism is also provided for inter-FCC feature concatenation. This arrangement leads to better feature flow and feature reusability. Similarly, direct feature flow is provided between the encoder and decoder modules, which preserves context information. A new loss function with better penalizing capability is also proposed that helps achieve better background and foreground separation. For ALL diagnosis, this thesis proposes an architecture, namely SDCT-AuxNet, a two-module framework that uses a compact CNN as the primary classifier in one module and a kernel SVM as the auxiliary classifier in the other. While the CNN classifier uses features obtained through bilinear pooling, the auxiliary classifier uses spectral-averaged features. Further, this CNN is trained on stain-deconvolved quantity images in the optical density domain instead of conventional RGB images. A novel test strategy is proposed that couples both classifiers for decision-making using the confidence scores of their predicted class labels. For MM diagnosis, this work proposes a unified framework that addresses two key challenges: the inter-class visual similarity of healthy versus cancer cells and the label noise in the dataset. To extract class-distinctive features, we propose a projection loss that maximizes the projection of a sample's activation onto its class vector while imposing orthogonality constraints on the class vectors.
This projection loss is used along with the cross-entropy loss to design a dual-branch architecture that helps achieve improved performance and provides scope for targeting the label noise problem. Based on this architecture, two methodologies have been proposed to correct the noisy labels. A coupling classifier has also been proposed to resolve conflicts in the dual-branch architecture's predictions. Next, this thesis presents an integrated framework for stain normalization, classification, and segmentation. The framework consists of a coupled network composed of two U-Net type architectures that utilize self-supervised learning. The first subnetwork (N1) learns an identity transformation, while the second (N2) learns a transformation to perform stain normalization. We also introduce classification heads in the subnetworks, trained along with the stain normalization task. To the best of our knowledge, the proposed coupling framework, in which the information from the encoders of both subnetworks is utilized by the decoders of both subnetworks and the two are trained in a coupled fashion, is introduced in this domain for the first time. Interestingly, the coupling of N1 (for identity transformation) and N2 (for stain normalization) helps N2 learn the stain normalization task while remaining cognizant of the features essential to reconstruct images. Similarly, N1 learns to extract features relevant for reconstruction that are invariant to stain color variations, due to its coupling with N2. Thus, the two subnetworks help each other, leading to improved performance on the subsequent classification task. Further, it is shown that the proposed architecture can also be used for segmentation, making it applicable to all three tasks: stain normalization, classification, and segmentation. While deep learning-based methods are data-intensive, the availability of large amounts of data is a challenge in the medical domain due to difficulties in data capture and annotation.
Consequently, imaging datasets of sufficient size are not available in the public domain for ALL and MM diagnosis. Therefore, in-house datasets of more than 100 subjects each were collected at AIIMS, New Delhi, India, for ALL and MM cancers. Of these, the curated, annotated, and segmented (cell nucleus) ALL dataset was used in the challenge Classification of Normal vs. Malignant Cells in B-ALL White Blood Cancer Microscopic Images, organized as part of the IEEE International Symposium on Biomedical Imaging (ISBI) 2019. This dataset is available in the public domain. en_US
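The projection-loss idea summarized in the abstract — maximizing the projection of a sample's activation onto its class vector while keeping class vectors mutually orthogonal — can be sketched in NumPy as follows. This is an illustrative approximation only, not the thesis's exact formulation: the function name `projection_loss`, the unit-normalization of class vectors, the weighting factor `ortho_weight`, and the squared-Frobenius orthogonality penalty are all assumptions made for the sketch.

```python
import numpy as np

def projection_loss(activations, labels, class_vectors, ortho_weight=1.0):
    """Hypothetical sketch of a projection-style loss.

    activations   : (N, D) array of sample feature activations
    labels        : (N,) integer class labels
    class_vectors : (C, D) learnable class vectors
    """
    # Normalize class vectors to unit length so the dot product
    # equals the length of the projection onto the class direction.
    W = class_vectors / np.linalg.norm(class_vectors, axis=1, keepdims=True)
    # Projection of each sample's activation onto its own class vector.
    proj = np.sum(activations * W[labels], axis=1)
    # Orthogonality penalty: the Gram matrix of the class vectors
    # should be close to the identity (mutually orthogonal vectors).
    gram = W @ W.T
    ortho = np.sum((gram - np.eye(W.shape[0])) ** 2)
    # Maximizing the projection = minimizing its negative mean,
    # plus the weighted orthogonality penalty.
    return -proj.mean() + ortho_weight * ortho

# Toy usage: two orthogonal class vectors, activations aligned with them.
W = np.eye(2)
acts = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([0, 1])
loss = projection_loss(acts, labels, W)  # perfectly aligned: -1.0
```

In a real training loop this term would be minimized jointly with cross-entropy, as the abstract describes for the dual-branch architecture.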
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Deep learning en_US
dc.subject AI in cancer diagnosis en_US
dc.subject Affordable AI in healthcare en_US
dc.subject Microscopic images en_US
dc.subject Stain normalization en_US
dc.subject Nuclei segmentation en_US
dc.subject Cell classification en_US
dc.subject Self-supervised learning en_US
dc.subject ALL diagnosis en_US
dc.subject MM diagnosis en_US
dc.subject Projection loss en_US
dc.title Deep learning assisted methods for microscopic blood cancer imaging analysis en_US
dc.type Thesis en_US

