Characterization of case-based classification repeatability and of the variability of operating points can complement measures of classification performance in artificial intelligence/computer-aided diagnosis (AI/CADx). Building upon our previous work in this area using human-engineered radiomic features extracted from dynamic contrast-enhanced magnetic resonance (DCE-MR) images, we investigated the application of these methods to features extracted from pretrained convolutional neural networks via deep transfer learning. The second post-contrast DCE-MR images of 601 unique breast lesions (194 benign, 407 malignant) were cropped and resized for input into a VGG-19 network pretrained on ImageNet. Features were extracted from the five max-pool layers and average-pooled over the spatial dimensions, yielding 1,472 features per lesion. The assignment of cases to training and test sets was varied using a 1000-iteration 0.632 bootstrap. Using a random forest classifier, we investigated (i) overall performance in distinguishing malignant from benign cases, measured by the area under the receiver operating characteristic curve (AUC) with the 0.632+ bootstrap correction; (ii) case-based classification repeatability, measured by repeatability profiles, which track the 95% confidence interval (CI) of classifier output across its range; and (iii) attainment of ‘preferred’ (95%) or ‘optimal’ target sensitivity and specificity. The AUC (median [95% CI]) was 0.862 [0.806, 0.899]. The repeatability profile and the attained sensitivity and specificity were similar to previous results obtained with human-engineered radiomic features for both the ‘preferred’ and ‘optimal’ targets. These results demonstrate that these methods can complement AI/CADx model assessment when deep transfer learning features are used.
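The feature-pooling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: simulated arrays stand in for real VGG-19 activations, and the channel counts are the standard output widths of the five VGG-19 max-pool layers (64, 128, 256, 512, 512), whose sum accounts for the 1,472 features per lesion.

```python
import numpy as np

# Standard channel counts of the five VGG-19 max-pool layers (assumption
# based on the VGG-19 architecture; not taken from the abstract itself).
POOL_CHANNELS = [64, 128, 256, 512, 512]

def pool_features(feature_maps):
    """Average-pool each (H, W, C) feature map over its spatial dimensions
    to a length-C vector, then concatenate across layers."""
    return np.concatenate([fm.mean(axis=(0, 1)) for fm in feature_maps])

rng = np.random.default_rng(0)
# Simulated activations for one lesion image; spatial size halves per block.
maps = [rng.random((112 // 2**i, 112 // 2**i, c))
        for i, c in enumerate(POOL_CHANNELS)]
features = pool_features(maps)
print(features.shape)  # (1472,) -- one 1,472-dimensional vector per lesion
```

Average pooling collapses each spatial map to a single value per channel, so the feature vector length depends only on the channel counts, not on the input image size.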
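The resampling scheme can likewise be sketched under stated assumptions: synthetic features stand in for the pooled VGG-19 features, the iteration count is reduced from 1000 for brevity, and the 0.632+ AUC correction is omitted. Each iteration draws a bootstrap training sample (the 0.632 bootstrap: N cases sampled with replacement, out-of-bag cases forming the test set), fits a random forest, and records per-case classifier outputs; the spread of those outputs across iterations is the raw material for a repeatability profile.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_cases, n_features = 200, 20
X = rng.normal(size=(n_cases, n_features))
# Synthetic labels correlated with one feature (stand-in for benign/malignant).
y = (X[:, 0] + 0.5 * rng.normal(size=n_cases) > 0).astype(int)

n_iter = 25  # the study used 1000 iterations
outputs = np.full((n_iter, n_cases), np.nan)  # output when a case is out-of-bag
aucs = []
for i in range(n_iter):
    train = rng.integers(0, n_cases, size=n_cases)   # bootstrap training sample
    test = np.setdiff1d(np.arange(n_cases), train)   # out-of-bag test cases
    clf = RandomForestClassifier(n_estimators=50, random_state=i)
    clf.fit(X[train], y[train])
    p = clf.predict_proba(X[test])[:, 1]
    outputs[i, test] = p
    aucs.append(roc_auc_score(y[test], p))

# Per-case 95% interval of classifier output across iterations; the
# repeatability profile summarizes these widths across the output range.
lo, hi = np.nanpercentile(outputs, [2.5, 97.5], axis=0)
print(f"median test-set AUC: {np.median(aucs):.3f}")
print(f"mean per-case 95% interval width: {np.nanmean(hi - lo):.3f}")
```

Because each case lands out-of-bag in roughly 36.8% of iterations, every case accumulates multiple test-set outputs, which is what makes the per-case confidence intervals estimable.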