ChatGPT artificial intelligence in clinical data analysis: an example comparing standard vs fusion prostate biopsy outcomes after robotic-assisted radical prostatectomy

Pier Paolo Prontera; Francesca Romana Prusciano; Marco Lattarulo; Arman Tsaturyan; Francesco Addabbo; Carmine Sciorio; Francesco Saverio Grossi

doi:10.4081/aiua.2025.13596

Authors

Pier Paolo Prontera

pierpaolo.prontera@asl.taranto.it

https://orcid.org/0000-0002-6321-946X

Department of Urology, “S.S. Annunziata” Hospital, Taranto, Italy.

Francesca Romana Prusciano

https://orcid.org/0009-0005-5965-9509

Department of Urology, “S.S. Annunziata” Hospital, Taranto; Division of Urology, Hospital “Valle D’Itria”, Martina Franca (TA), Italy.

Marco Lattarulo

Department of Urology, “S.S. Annunziata” Hospital, Taranto, Italy.

Arman Tsaturyan

Department of Urology, Yerevan State Medical University after Mkhitar Heratsi, Yerevan; Department of Urology, Erebouni Medical Center, Yerevan, Armenia.

Francesco Addabbo

Unit of Statistics and Epidemiology, Local Health Authority of Taranto, Italy.

Carmine Sciorio

Department of Urology, “Alessandro Manzoni” Hospital, Lecco, Italy.

Francesco Saverio Grossi

https://orcid.org/0000-0003-2985-6526

Department of Urology, “S.S. Annunziata” Hospital, Taranto, Italy.

Objective: To compare the statistical outputs of ChatGPT 4.0 with those of human experts, in both comparative and correlation analyses, using as a test case the evaluation of multiparametric MRI/ultrasound fusion-targeted biopsy plus random biopsy versus standard random biopsy alone in terms of upstaging.
Methods: The authors performed a retrospective evaluation of 101 patients who underwent robot-assisted radical prostatectomy (RaRP) between 2021 and 2023. Patients were divided into two groups according to the type of prostate biopsy received: combined MRI/ultrasound (MRI/US) fusion-targeted plus random biopsy versus standard random biopsy. Clinical and histological data were anonymized and analyzed using logistic regression models, ANOVA, and chi-square tests. The analyses generated by ChatGPT and by an experienced human statistician were then compared. The Q-EVAL and Q-EVA tools were used to assess the quality of user-formulated questions and AI-generated answers, respectively.
Results: The statistical outputs generated by ChatGPT and by the expert human statistician were in perfect agreement (Cohen’s kappa coefficient, κ = 1.0). Logistic regression analysis showed that fusion biopsy was associated with a reduced likelihood of upstaging, a finding that was consistent across the statistical evaluations. Additionally, user interaction assessments indicated high quality in question formulation.
Conclusions: ChatGPT (version 4.0) proved reliable for statistical analysis, showing perfect concordance with an expert human statistician (κ = 1.0) in performing logistic regression, chi-square, and ANOVA tests. The Q-EVAL tool could reduce query errors, although ChatGPT's lack of automatic citations remains a limitation. Fusion biopsy significantly lowered the risk of upstaging after RaRP. ChatGPT is a valuable assistive tool, but further research is required to optimize human-AI collaboration in clinical research.
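
As an illustration of the agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed for two analysts. It is not the authors' code. The abstract does not state which items were compared, so the sketch assumes one categorical conclusion per statistical test ("sig" or "ns"); the function name `cohen_kappa` and all labels are hypothetical and are not the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical label throughout; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels, NOT the study's data: each entry is the conclusion
# one analyst drew from one statistical test.
chatgpt_calls = ["sig", "ns", "sig", "ns", "ns", "sig"]
statistician_calls = ["sig", "ns", "sig", "ns", "ns", "sig"]

print(cohen_kappa(chatgpt_calls, statistician_calls))  # 1.0 for identical labels
```

Kappa corrects raw agreement for the agreement expected by chance, so κ = 1.0 arises only when the two analysts give the same label to every item and those labels are not all identical.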

How to Cite

ChatGPT artificial intelligence in clinical data analysis: an example comparing standard vs fusion prostate biopsy outcomes after robotic-assisted radical prostatectomy. (2025). Archivio Italiano Di Urologia E Andrologia, 97(2). https://doi.org/10.4081/aiua.2025.13596

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

PAGEPress has chosen to apply the Creative Commons Attribution NonCommercial 4.0 International License (CC BY-NC 4.0) to all manuscripts to be published.
