Background: Multiparametric magnetic resonance imaging (mpMRI) has significantly advanced prostate cancer (PCa) detection, yet the decision to perform an invasive biopsy in patients with moderate Prostate Imaging Reporting and Data System (PI-RADS) scores remains ambiguous.

Methods: To explore the decision-making capacity of Generative Pretrained Transformer-4 (GPT-4) for automated prostate biopsy recommendations, we included 2299 individuals who underwent prostate biopsy from 2018 to 2023 at 3 large medical centers, all with mpMRI available before biopsy and documented clinical-histopathological records. GPT-4 generated structured reports from given prompts. Its performance was quantified using confusion matrices, and sensitivity, specificity, and area under the curve were calculated. Multiple human evaluation procedures were conducted. Wilcoxon’s rank sum test, Fisher’s exact test, and the Kruskal-Wallis test were used for comparisons.

Results: In this cohort, the largest sample in the Chinese population, patients with moderate PI-RADS scores (3 and 4) accounted for 39.7% (912/2299) and were defined as the subset of interest (SOI). The detection rates of clinically significant PCa for PI-RADS scores 2, 3, 4, and 5 were 9.4%, 27.3%, 49.2%, and 80.1%, respectively. Nearly half of SOI patients (47.5%, 433/912) were histopathologically proven to have undergone unnecessary prostate biopsies. With the assistance of GPT-4, 20.8% (190/912) of the SOI population could avoid unnecessary biopsies, and it performed even better (28.8%, 118/410) in the most heterogeneous subgroup, PI-RADS score 3. More than 90.0% of GPT-4-generated reports were rated comprehensive and easy to understand, although accuracy was rated lower (82.8%). GPT-4 also demonstrated cognitive potential for handling complex problems. Additionally, the Chain of Thought method enabled us to better understand the decision-making logic behind GPT-4. Finally, we developed the ProstAIGuide platform to improve accessibility for both doctors and patients.

Conclusions: This multi-center study highlights the clinical utility of GPT-4 for prostate biopsy decision-making and advances our understanding of how the latest artificial intelligence can be implemented in various medical scenarios.
Funding: Supported by the Beijing Key Clinical Specialty Project (20240930), the National Natural Science Foundation of China (NSFC 82373436), the Beijing Hospitals Authority Youth Program (BHAYP, QML20230114), the Beijing Natural Science Foundation (BNSF Z200027), the Beijing Chaoyang Hospital Multi-disciplinary Team Program (CYDXK202204), the NSFC (62331001), the BNSF (Z200027), the NSFC (82202097), the BHAYP (QML20230113), the Training Fund for Open Projects at Clinical Institutes and Departments of Capital Medical University (CCMU2022ZKYXY010), and the Beijing Scholars Program (No. [2015]160).
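The proportions quoted in the Results can be re-derived from the counts given in the abstract, and the Methods mention sensitivity and specificity computed from confusion matrices. The following is a minimal sketch of both calculations, not the authors' analysis code. The helper names `pct` and `sensitivity_specificity` are hypothetical, and the confusion-matrix counts in the second part are illustrative placeholders, not study data (the abstract does not report GPT-4's confusion matrix).

```python
def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Compute sensitivity and specificity from the four confusion-matrix cells."""
    sensitivity = tp / (tp + fn)  # share of true cancers correctly flagged
    specificity = tn / (tn + fp)  # share of non-cancers correctly spared
    return sensitivity, specificity


# Counts taken directly from the abstract.
soi_share = pct(912, 2299)        # SOI patients among all biopsied patients
unnecessary = pct(433, 912)       # SOI patients with unnecessary biopsies
avoided_soi = pct(190, 912)       # SOI patients who could avoid biopsy with GPT-4
avoided_pirads3 = pct(118, 410)   # same, restricted to PI-RADS score 3

print(soi_share, unnecessary, avoided_soi, avoided_pirads3)

# Hypothetical counts for illustration only; NOT reported in the study.
sens, spec = sensitivity_specificity(tp=40, fp=10, fn=10, tn=40)
print(sens, spec)
```

The four recomputed percentages agree with the values stated in the abstract (39.7%, 47.5%, 20.8%, and 28.8%).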