Accurate segmentation of the prostate and nearby organs at risk (e.g., bladder and rectum) in CT images is critical for radiotherapy of prostate cancer. Currently, the leading automatic segmentation algorithms are based on Fully Convolutional Networks (FCNs), which achieve remarkable performance but usually need large-scale datasets with high-quality voxel-wise annotations for fully supervised training. Unfortunately, such annotations are difficult to acquire, which becomes a bottleneck for building accurate segmentation models in real clinical applications. In this paper, we propose a novel weakly supervised segmentation approach that needs only 3D bounding box annotations covering the organs of interest to start the training. A bounding box inevitably includes many non-organ voxels, whose noisy labels can mislead the segmentation model. To address this, we propose a label denoising module and embed it into the iterative training scheme of the label denoising network (LDnet) for segmentation. The labels of the training voxels are predicted by the tentative LDnet, while the label denoising module identifies the voxels with unreliable labels. As only the voxels with reliable labels are preserved for training, the iteratively re-trained LDnet gradually refines its segmentation capability. Our method reaches ~94% (prostate), ~91% (bladder), and ~86% (rectum) of the Dice Similarity Coefficients (DSCs) obtained by fully supervised learning on high-quality voxel-wise annotations, and is superior to several state-of-the-art approaches. To the best of our knowledge, this is the first work to achieve voxel-wise segmentation in CT images from simple 3D bounding box annotations, which can greatly reduce labeling effort and meet the demands of practical clinical applications.
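
To make the notion of noisy bounding-box labels and the DSC metric concrete, the following is a minimal sketch on a toy example, not the LDnet implementation. It assumes voxels are represented as sets of (z, y, x) coordinates; the helper names `box_to_voxels` and `dice` and the toy organ and box sizes are hypothetical and chosen only for illustration.

```python
from itertools import product


def box_to_voxels(lo, hi):
    """Return the set of (z, y, x) voxels inside an inclusive 3D bounding box."""
    return set(product(*(range(a, b + 1) for a, b in zip(lo, hi))))


def dice(pred, truth):
    """Dice Similarity Coefficient of two voxel sets: 2|A & B| / (|A| + |B|)."""
    if not pred and not truth:
        return 1.0  # two empty masks agree perfectly by convention
    return 2 * len(pred & truth) / (len(pred) + len(truth))


# Toy "organ" (assumption): a 2x2x2 cube standing in for the true voxel-wise mask.
organ = box_to_voxels((1, 1, 1), (2, 2, 2))

# Weak annotation (assumption): a box padded by one voxel on every side.
# Treating every voxel in the box as "organ" gives the initial noisy labels.
box_labels = box_to_voxels((0, 0, 0), (3, 3, 3))

# Non-organ voxels that the box wrongly labels as organ.
noisy = box_labels - organ

# DSC of the raw box labels against the true mask.
box_dice = dice(box_labels, organ)
print(len(organ), len(box_labels), len(noisy), round(box_dice, 3))
```

In this toy case most voxels inside the box carry a wrong label, which is why the raw box scores a low DSC against the true mask and why unreliable labels need to be filtered out before re-training.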

Author