Medical Biophysics Publications

Multisite Comparison of MRI Defacing Software Across Multiple Cohorts

Athena E. Theyers, Rotman Research Institute
Mojdeh Zamyadi, Rotman Research Institute
Mark O'Reilly, Ontario Brain Institute
Robert Bartha, Robarts Research Institute
Sean Symons, Sunnybrook Health Sciences Centre
Glenda M. MacQueen, Cumming School of Medicine
Stefanie Hassel, Cumming School of Medicine
Jason P. Lerch, Hospital for Sick Children University of Toronto
Evdokia Anagnostou, Holland Bloorview Kids Rehabilitation Hospital
Raymond W. Lam, University of British Columbia, Faculty of Medicine
Benicio N. Frey, McMaster University
Roumen Milev, Queen’s University
Daniel J. Müller, Centre for Addiction and Mental Health
Sidney H. Kennedy, University of Toronto
Christopher J.M. Scott, L.C. Campbell Cognitive Neurology Research Unit
Stephen C. Strother, Rotman Research Institute
Stephen R. Arnott, Rotman Research Institute

Document Type

Article

Publication Date

2-24-2021

Journal

Frontiers in Psychiatry

Volume

URL with Digital Object Identifier

10.3389/fpsyt.2021.617997

Abstract

With improvements to both scan quality and facial recognition software, there is an increased risk of participants being identified by a 3D render of their structural neuroimaging scans, even when all other personal information has been removed. To prevent this, facial features should be removed before data are shared or openly released. While several publicly available software algorithms do this, there has been no comprehensive review of their accuracy within the general population. To address this, we tested multiple algorithms on 300 scans from three neuroscience research projects, funded in part by the Ontario Brain Institute, covering a wide range of ages (3–85 years) and multiple patient cohorts. Although skull stripping removes identifiable features more thoroughly, we focused mainly on defacing software, because skull stripping also removes potentially useful information that may be required for future analyses. We tested six publicly available algorithms (afni_refacer, deepdefacer, mri_deface, mridefacer, pydeface, quickshear), with one skull stripper (FreeSurfer) included for comparison. Accuracy was measured through a pass/fail system with two criteria: first, that all facial features had been removed and, second, that no brain tissue had been removed in the process. A subset of defaced scans was also run through several preprocessing pipelines to check whether any of the algorithms altered the resulting outputs. Success rates varied strongly between defacers, with afni_refacer (89%) and pydeface (83%) having the highest overall rates. In both cases, the primary source of failure was a single dataset that the defacer appeared to struggle with: the youngest cohort (3–20 years) for afni_refacer and the oldest (44–85 years) for pydeface. This demonstrates that defacer performance not only depends on the data provided, but that this dependence varies between algorithms. While there were some very minor differences between the preprocessing results for defaced and original scans, none of these were significant, and all were within the range of variation seen between different NIfTI converters or when using raw DICOM files.
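The two-criterion pass/fail scoring described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the `Rating` record, the `success_rates` helper, and the placeholder defacer and cohort labels are all assumptions introduced here. A scan passes only if both criteria hold, and rates are grouped by defacer and cohort so that cohort-specific failures become visible.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """One rater judgement for one defaced scan (hypothetical record layout)."""
    defacer: str
    cohort: str
    face_removed: bool   # criterion 1: all facial features removed
    brain_intact: bool   # criterion 2: no brain tissue removed

    @property
    def passed(self) -> bool:
        # A scan passes only when both criteria are satisfied.
        return self.face_removed and self.brain_intact


def success_rates(ratings):
    """Return {(defacer, cohort): fraction of scans that passed}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [passes, total]
    for r in ratings:
        key = (r.defacer, r.cohort)
        counts[key][0] += int(r.passed)
        counts[key][1] += 1
    return {key: passes / total for key, (passes, total) in counts.items()}


if __name__ == "__main__":
    # Placeholder data for illustration only; not results from the study.
    demo = [
        Rating("defacer_a", "cohort_1", True, True),
        Rating("defacer_a", "cohort_1", True, False),
        Rating("defacer_b", "cohort_1", False, True),
    ]
    for key, rate in sorted(success_rates(demo).items()):
        print(key, f"{rate:.0%}")
```

Grouping by the (defacer, cohort) pair rather than by defacer alone is what would expose a pattern like the one reported, where a defacer performs well overall but fails mostly on one age group.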

Included in

Medical Biophysics Commons
