Literature Selections

Artificial Intelligence Shows Promise in Extracting and Curating Longitudinal Data from Surveillance Imaging

July 15, 2025

Choubey AP, Eguia E, Hollingsworth A, et al. Data Extraction and Curation from Radiology Reports for Pancreatic Cyst Surveillance Using Large Language Models. J Am Coll Surg. 2025; in press.

Longitudinal evaluation of the radiographic features of pancreatic cysts would improve understanding of this condition, but the practice is too time-consuming for common use. Large language models (LLMs) designed to extract clinical variables from radiology reports may help bring longitudinal evaluation into widespread use.

Choubey and colleagues from Memorial Sloan Kettering Cancer Center in New York City conducted a single-center retrospective study examining 3,198 longitudinal scans from 991 patients under surveillance for intraductal papillary mucinous neoplasms. They assessed the feasibility and accuracy of LLM use by comparing the output of an LLM based on a GPT-4 model with a manually annotated institutional database.

For categorical variables, LLM accuracy ranged from 97% for solid components to 99% for calcific lesions. For continuous variables, LLM accuracy ranged from 92% for cyst size to 97% for main pancreatic duct size. Although the manually curated database was regarded as ground truth, the LLM results permitted correction of several errors in it.
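
As a rough illustration of how per-variable accuracy figures like these can be computed, the Python sketch below compares LLM-extracted values with manually annotated values, report by report. It is not the authors' method: the function names, the sample values, and the tolerance-based rule for counting a continuous measurement as correct are all assumptions made for illustration, since the summary does not say how agreement was scored.

```python
# Hypothetical sketch: per-variable agreement between LLM-extracted values
# and a manually annotated reference. Names and scoring rules are assumed.

def _paired(llm_values, manual_values):
    """Pair up values report by report, rejecting empty or mismatched input."""
    if len(llm_values) != len(manual_values):
        raise ValueError("both sources must cover the same reports")
    if not llm_values:
        raise ValueError("at least one report is required")
    return list(zip(llm_values, manual_values))


def categorical_accuracy(llm_values, manual_values):
    """Fraction of reports where the two labels agree exactly."""
    pairs = _paired(llm_values, manual_values)
    return sum(llm == manual for llm, manual in pairs) / len(pairs)


def continuous_accuracy(llm_values, manual_values, tol_mm=0.0):
    """Fraction of reports where measurements differ by at most tol_mm.

    Treating a measurement as correct when it falls within a tolerance
    is an assumption of this sketch.
    """
    pairs = _paired(llm_values, manual_values)
    return sum(abs(llm - manual) <= tol_mm for llm, manual in pairs) / len(pairs)


# Made-up example values for four reports (not study data).
solid_component_llm = [True, False, True, True]
solid_component_manual = [True, False, False, True]
cyst_size_llm_mm = [12.0, 8.0, 30.0, 5.0]
cyst_size_manual_mm = [12.0, 9.0, 30.0, 5.0]

print(categorical_accuracy(solid_component_llm, solid_component_manual))
print(continuous_accuracy(cyst_size_llm_mm, cyst_size_manual_mm))
```

Disagreements flagged by a comparison like this are also natural candidates for manual review, which is one way errors in a reference database could come to light.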

The authors called their approach “promising” but emphasized the need for ongoing assessment of patients at risk for pancreatic ductal adenocarcinoma and other conditions.