Int J Comput Assist Radiol Surg. 2022 Jul 26. doi: 10.1007/s11548-022-02707-y. Online ahead of print.
PURPOSE: Ultrasound is the standard-of-care to guide the systematic biopsy of the prostate. During the biopsy procedure, up to 12 biopsy cores are randomly sampled from six zones within the prostate, where the histopathology of those cores is used to determine the presence and grade of the cancer. Histopathology reports only provide statistical information on the presence of cancer and do not normally contain fine-grain information of cancer distribution within each core. This limitation hinders the development of machine learning models to detect the presence of cancer in ultrasound so that biopsy can be more targeted to highly suspicious prostate regions.
METHODS: In this paper, we tackle this challenge in the form of training with noisy labels derived from histopathology. Noisy labels often result in the model overfitting to the training data, hence limiting its generalizability. To avoid overfitting, we focus on the generalization of the features of the model and present an iterative data label refinement algorithm to amend the labels gradually. We simultaneously train two classifiers, with the same structure, and automatically stop the training when we observe any sign of overfitting. Then, we use a confident learning approach to clean the data labels and continue with the training. This process is iteratively applied to the training data and labels until convergence.
RESULTS: We illustrate the performance of the proposed method by classifying prostate cancer using a dataset of ultrasound images from 353 biopsy cores obtained from 90 patients. We achieve area under the curve, sensitivity, specificity, and accuracy of 0.73, 0.80, 0.63, and 0.69, respectively.
CONCLUSION: Our approach is able to provide clinicians with a visualization of regions that likely contain cancerous tissue to obtain more accurate biopsy samples. The results demonstrate that our proposed method produces superior accuracy compared to the state-of-the-art methods.