
Our PIANO™ deep learning model for hippocampal segmentation significantly reduces quality control failure rates, outperforming existing methods and enabling smaller, more cost-effective clinical trials

Conceptual diagram of U-net architecture for segmentation of the hippocampus, with a T1-weighted MR image as input and a 3D segmentation as output. The left part of the U-shape represents the encoder and the right part represents the decoder. Each horizontal level represents multiple convolutional layers that extract features from the underlying data, which are condensed into a latent space at the lowest level.

Training and Test Data Set Composition by Clinical Stage

The fraction represented by each clinical stage in the training, testing, and combined datasets derived from the ADNI dataset. The clinical stages are abbreviated as follows: Cognitively Normal (CN), Subjective Memory Concern (SMC), Early Mild Cognitive Impairment (EMCI), Mild Cognitive Impairment (MCI), Late Mild Cognitive Impairment (LMCI), Alzheimer’s Disease (AD).

Hippocampus (HC) segmentation failure rates comparing PIANO™ deep learning (blue) with HippoDeep (orange) and FreeSurfer (green). Left & Right HC represents the case in which segmentation fails on both sides in the same dataset.

The hippocampus is a notoriously challenging neuroanatomical region to segment reliably. Our analyses show that routinely used methods for hippocampus segmentation can fail in up to 10% of cases without manual correction. Manual correction is undesirable because it introduces unwanted bias into clinical trials. Deep learning algorithms offer a robust alternative for generating accurate image segmentations.

Deep learning models, specifically U-net networks, are well suited for 3-dimensional (3D) region-of-interest (ROI) segmentation. U-nets are built from convolutional neural networks (CNNs), which are inspired by the human visual cortex and are powerful tools for image feature identification. Although CNNs require significant training data and processing power to develop, once trained they are efficient tools that run rapidly on standard computational resources.

As illustrated in the conceptual diagram, we used a U-net network architecture with encoder-decoder layers for segmentation of the hippocampus, with a T1-weighted MR image as input and a 3D segmentation of the hippocampus as output.
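To make the encoder-decoder idea concrete, the sketch below traces how feature-map shapes might change through a small 3D U-net. It is an illustration only, not the PIANO™ implementation: the number of levels, the base channel count, the 64-voxel input size, and the function name `trace_unet_shapes` are all assumptions chosen for readability. It also assumes padded convolutions that leave the spatial size unchanged.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int, int]  # (channels, depth, height, width)


def trace_unet_shapes(
    input_shape: Shape, base_channels: int = 16, levels: int = 3
) -> List[Tuple[str, Shape]]:
    """Trace feature-map shapes through a hypothetical 3D U-net.

    Assumes padded convolutions (spatial size unchanged), 2x downsampling
    in the encoder, and 2x upsampling plus skip concatenation in the decoder.
    """
    _, d, h, w = input_shape
    if any(size % (2 ** levels) for size in (d, h, w)):
        raise ValueError("spatial dimensions must be divisible by 2**levels")

    trace: List[Tuple[str, Shape]] = [("input", input_shape)]
    skips: List[Shape] = []
    channels = base_channels

    # Encoder (left side of the U): convolve, keep a copy for the skip
    # connection, then halve the spatial size and double the channels.
    for level in range(levels):
        features = (channels, d, h, w)
        trace.append((f"encoder{level}", features))
        skips.append(features)
        d, h, w = d // 2, h // 2, w // 2
        channels *= 2

    # Lowest level of the U: the most condensed representation.
    trace.append(("bottleneck", (channels, d, h, w)))

    # Decoder (right side of the U): upsample, concatenate the matching
    # encoder features along the channel axis, then convolve.
    for level in reversed(range(levels)):
        d, h, w = d * 2, h * 2, w * 2
        channels //= 2
        skip = skips.pop()
        trace.append((f"concat{level}", (channels + skip[0], d, h, w)))
        trace.append((f"decoder{level}", (channels, d, h, w)))

    # Final layer: one output channel, read as a per-voxel hippocampus score.
    trace.append(("output", (1, d, h, w)))
    return trace


if __name__ == "__main__":
    for name, shape in trace_unet_shapes((1, 64, 64, 64)):
        print(f"{name:>10}: {shape}")
```

The output has the same spatial size as the input, which is what allows a per-voxel segmentation of the T1-weighted image.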

Training deep learning models typically requires large datasets; however, the U-net architecture combined with data augmentation allows effective training with smaller neuroimaging datasets. We curated a training dataset of T1-weighted MRI scans from the ADNI database, spanning participants from cognitively normal to Alzheimer’s disease patients, as illustrated in the graph of training set composition. This variety is critical because it exposes the deep learning model to as many different shapes and variations of the hippocampus as possible during training. The trained model was tested on a separate dataset with no participant overlap between the training and testing sets.
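One common way to guarantee no participant overlap is to split on participant identifiers rather than on individual scans, so that repeat scans of the same person always land on the same side. The sketch below shows that idea under stated assumptions: the `(participant_id, scan_id)` record layout, the 20% test fraction, and the function name `split_by_participant` are hypothetical, and the split is not stratified by clinical stage.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Scan = Tuple[str, str]  # (participant_id, scan_id), a hypothetical layout


def split_by_participant(
    scans: Sequence[Scan], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Scan], List[Scan]]:
    """Split scans into train/test so no participant appears in both sets."""
    by_participant: Dict[str, List[Scan]] = defaultdict(list)
    for participant_id, scan_id in scans:
        by_participant[participant_id].append((participant_id, scan_id))

    # Shuffle participants (not scans) with a fixed seed for reproducibility.
    participants = sorted(by_participant)
    random.Random(seed).shuffle(participants)

    # Reserve at least one participant for testing.
    n_test = max(1, round(len(participants) * test_fraction))
    test_ids = participants[:n_test]
    train_ids = participants[n_test:]

    train = [scan for pid in train_ids for scan in by_participant[pid]]
    test = [scan for pid in test_ids for scan in by_participant[pid]]
    return train, test
```

Splitting at the scan level instead would let a participant's follow-up scan leak into the test set and could make test performance look better than it is.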

We used FreeSurfer for comparison because it is widely used by the academic community and is one of the best-performing open-source neuroimaging analysis platforms freely available. FreeSurfer is also an automated pipeline, but it may require troubleshooting and manual intervention at various processing steps, as outlined on its website, to obtain optimal results. However, manual correction introduces unwanted bias and intra-/inter-observer variability into the data and limits reproducibility. To compare the platforms on equal terms, no manual intervention was performed.

We also used HippoDeep for comparison, to highlight that curation of the training dataset is critical to achieving strong performance from a deep learning model. The hippocampus segmentations from all methods underwent quality control (QC) via visual assessment of verification images.
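Once each segmentation has a visual QC verdict, the failure rates reported per hemisphere and for both hemispheres together can be tallied as in the sketch below. The `QCRecord` structure and its field names are assumptions made for illustration; they do not describe the actual QC tooling.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class QCRecord:
    """Hypothetical QC verdict for one scan segmented by one method."""
    scan_id: str
    left_pass: bool
    right_pass: bool


def failure_rates(records: Iterable[QCRecord]) -> Dict[str, float]:
    """Return the fraction of scans failing QC on the left, right, and both sides."""
    records = list(records)
    if not records:
        raise ValueError("no QC records supplied")
    n = len(records)
    left = sum(not r.left_pass for r in records)
    right = sum(not r.right_pass for r in records)
    # "Left & Right" counts only scans where both sides fail simultaneously.
    both = sum((not r.left_pass) and (not r.right_pass) for r in records)
    return {"left": left / n, "right": right / n, "left_and_right": both / n}
```

Running the same tally on the QC verdicts for each method gives directly comparable rates, since every method is scored on the same scans with the same criteria.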

Our deep learning model outperformed existing methods, such as FreeSurfer and HippoDeep, achieving a quality control failure rate under 2% for either hemisphere and under 0.4% for both hemispheres without manual corrections. This substantial reduction in quality control failure rate directly impacts clinical trial costs, as a lower data processing failure rate potentially allows for smaller trial sample sizes. This deep learning hippocampus segmentation was applied to the FTD dataset.
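The link between failure rate and sample size can be illustrated with simple arithmetic. The sketch below assumes that a scan failing QC is excluded from analysis and that failures are unrelated to outcome, so the number of scans to acquire is the number of analyzable scans divided by one minus the failure rate. The target of 200 analyzable scans is a hypothetical figure; the 10% and 2% rates are the upper bounds quoted in the text.

```python
import math
from fractions import Fraction


def scans_to_acquire(n_analyzable: int, failure_rate: float) -> int:
    """Scans needed so that n_analyzable are expected to pass QC.

    Assumes failed scans are simply excluded (no manual correction).
    """
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    # Use an exact fraction to avoid floating-point rounding at the ceiling.
    rate = Fraction(str(failure_rate))
    return math.ceil(n_analyzable / (1 - rate))


if __name__ == "__main__":
    for rate in (0.10, 0.02):
        print(f"failure rate {rate:.0%}: acquire {scans_to_acquire(200, rate)} scans")
```

Under these assumptions, a 10% failure rate requires 223 scans to yield 200 analyzable ones, while a 2% failure rate requires 205.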
