Handling Missing MRI Data in Brain Tumors Classification Tasks: Usage of Synthetic Images vs. Duplicate Images and Empty Images

J Magn Reson Imaging. 2023 Oct 21. doi: 10.1002/jmri.29072. Online ahead of print.

ABSTRACT

BACKGROUND: Deep learning is widely used for lesion classification. However, in clinical practice, patient data often have missing images.

PURPOSE: To evaluate the use of generated, duplicate, and empty (black) images for replacing missing MRI data in AI brain tumor classification tasks.

STUDY TYPE: Retrospective.

POPULATION: 224 patients (local-dataset; low-grade-glioma (LGG) = 37, high-grade-glioma (HGG) = 187) and 335 patients (public-dataset (BraTS); LGG = 76, HGG = 259). The local-dataset was divided into training (64), validation (16), and internal-test-data (20), while the public-dataset was an independent test-set.

FIELD STRENGTH/SEQUENCE: T1WI, T1WI+C, T2WI, and FLAIR images (1.5T/3.0T-MR), obtained from different suppliers.

ASSESSMENT: Three image-to-image translation generative-adversarial-network (Pix2Pix-GAN) models were trained on the local-dataset to generate T1WI, T2WI, and FLAIR images. The rating-and-preference-judgment assessment was performed by three human readers (a radiologist (MD) and two MRI technicians). ResNet152 was used for classification, and inference was performed on both datasets, with baseline input and with missing data replaced by 1) generated images; 2) duplication of existing images; and 3) black images.
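The two non-generative replacement strategies above (duplication and black images) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fill_missing`, the representation of a study as a dict of nested-list images, and the rule of duplicating the first available sequence are all assumptions made for illustration, since the abstract does not say which existing image is duplicated. Replacement by generated images would require a trained Pix2Pix model and is left out.

```python
from typing import Dict, List, Optional

Image = List[List[float]]  # hypothetical 2D image stored as nested lists

# The four sequences named in the abstract.
SEQUENCES = ("T1WI", "T1WI+C", "T2WI", "FLAIR")


def fill_missing(study: Dict[str, Optional[Image]], strategy: str) -> Dict[str, Image]:
    """Return a new study dict in which every missing sequence is replaced.

    strategy "duplicate": copy the first available sequence (an assumption;
    the abstract does not specify the donor image).
    strategy "black": use an all-zero image with the donor's dimensions.
    """
    available = [name for name in SEQUENCES if study.get(name) is not None]
    if not available:
        raise ValueError("at least one sequence must be present")
    donor = study[available[0]]

    filled: Dict[str, Image] = {}
    for name in SEQUENCES:
        image = study.get(name)
        if image is not None:
            filled[name] = image  # keep acquired images untouched
        elif strategy == "duplicate":
            filled[name] = [row[:] for row in donor]  # independent copy
        elif strategy == "black":
            filled[name] = [[0.0] * len(row) for row in donor]
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return filled
```

In this sketch the classifier would always receive all four sequences, so the same model input shape can be used for complete and incomplete studies.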

STATISTICAL TESTS: The similarity between the generated and the original images was evaluated using the peak-signal-to-noise-ratio (PSNR) and the structural-similarity-index-measure (SSIM). Classification results were evaluated using accuracy, F1-score and the Kolmogorov-Smirnov test and distance.
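As a rough illustration of two of the metrics named above, the sketch below computes PSNR between an original and a generated image and the two-sample Kolmogorov-Smirnov distance between two sets of scores. The function names, the nested-list image format, and the default intensity ceiling of 255 are assumptions; the abstract does not state the intensity range or which distributions the Kolmogorov-Smirnov test compared. SSIM is omitted because a faithful version needs windowed local statistics.

```python
import math
from bisect import bisect_right
from typing import List, Sequence


def psnr(reference: List[List[float]], generated: List[List[float]],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value**2 / MSE).

    max_value is the assumed maximum possible pixel intensity.
    """
    ref = [v for row in reference for v in row]
    gen = [v for row in generated for v in row]
    if not ref or len(ref) != len(gen):
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - g) ** 2 for r, g in zip(ref, gen)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def ks_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov distance.

    This is the largest absolute gap between the two empirical CDFs,
    which is always attained at one of the observed values.
    """
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )
```

A higher PSNR means the generated image is closer to the original, while a Kolmogorov-Smirnov distance near 0 means the two samples have similar distributions.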

RESULTS: In the baseline state, the classification model reached accuracy = 0.93 and 0.82 on the local and public datasets, respectively. For the missing-data methods, high similarity was obtained between the generated and the original images, with mean PSNR = 35.65 and 32.94 and SSIM = 0.87 and 0.91 on the local and public datasets, respectively; 39% of the generated images were labeled as real images by the human readers. The classification model using generated images to replace missing images produced the highest results, with mean accuracy = 0.91 and 0.82, compared to 0.85 and 0.79 for duplicated images and 0.77 and 0.68 for black images.

DATA CONCLUSION: The feasibility of running inference with a classification model on an MRI dataset with missing images, using Pix2Pix-GAN generated images as replacements, was shown. The stability and generalization ability of the model were demonstrated by consistent results on two independent datasets.

LEVEL OF EVIDENCE: 3

TECHNICAL EFFICACY: Stage 5.

PMID:37864370 | DOI:10.1002/jmri.29072

By Nevin Manimala