The following is a summary of “Developing prompts from large language model for extracting clinical information from pathology and ultrasound reports in breast cancer,” published in the September 2023 issue of Radiation Oncology Journal by Choi et al.
Researchers performed a retrospective study to assess the time and cost required to develop prompts for a large language model (LLM) that extract clinical factors from breast cancer patient reports, and to evaluate the accuracy of those prompts.
The study gathered data from 2,931 breast cancer patients who underwent radiotherapy between 2020 and 2022. Information was extracted with the “LLM” method, which relies on the Generative Pre-trained Transformer (GPT) for Sheets and Docs extension. The researchers compared the time and cost of the LLM method with those of the traditional “full manual” method and an “LLM-assisted manual” method. To evaluate accuracy, they randomly selected 340 patients and compared the data extracted through the LLM method with the data collected through the “full manual” method.
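The accuracy check described above can be illustrated with a minimal sketch. The summary does not state how the study scored agreement, so the rule used here (exact match after trimming and lowercasing) is an assumption, and the field names and toy records are hypothetical rather than taken from the paper.

```python
def normalize(value):
    """Trim and lowercase a value so trivial formatting differences are not counted as errors."""
    return "" if value is None else str(value).strip().lower()


def field_accuracy(llm_rows, manual_rows, fields):
    """Compare LLM-extracted rows with manually collected rows, field by field.

    Returns (per_field, overall); each accuracy is matches / comparisons.
    Assumed scoring rule: exact match after normalize().
    """
    if len(llm_rows) != len(manual_rows):
        raise ValueError("both extractions must cover the same patients")
    if not manual_rows or not fields:
        raise ValueError("need at least one patient and one field")

    per_field = {}
    total_matches = 0
    for field in fields:
        matches = sum(
            normalize(llm.get(field)) == normalize(ref.get(field))
            for llm, ref in zip(llm_rows, manual_rows)
        )
        per_field[field] = matches / len(manual_rows)
        total_matches += matches
    overall = total_matches / (len(manual_rows) * len(fields))
    return per_field, overall


# Hypothetical toy data: two patients, two fields (not from the study).
manual = [
    {"lvi": "Present", "t_stage": "T2"},
    {"lvi": "Absent", "t_stage": "T1"},
]
llm = [
    {"lvi": "present", "t_stage": "T2"},
    {"lvi": "Absent", "t_stage": "T2"},
]
per_field, overall = field_accuracy(llm, manual, ["lvi", "t_stage"])
print(per_field, overall)
```

In this toy case the manual values serve as the reference standard, mirroring the role of the “full manual” method in the study; a real evaluation would likely need field-specific matching rules.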
The study developed 12 prompts for the Extract function and 12 for the Format function, achieving an overall accuracy of 87.7%, with 98.2% accuracy for lymphovascular invasion. Developing the prompts took 3.5 hours and processing them took 15 minutes. Using the ChatGPT API cost $65.8, and the total came to $95.4 once estimated wages were included. In comparison, the “LLM-assisted manual” and “LLM” methods were more time- and cost-efficient than the “full manual” method.
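As a rough worked example, the reported figures can be turned into per-patient numbers. This is only back-of-the-envelope arithmetic on the values quoted in this summary; the per-patient breakdown is not reported here and the variable names are illustrative.

```python
# Figures quoted in the summary.
API_COST_USD = 65.8
TOTAL_COST_USD = 95.4      # API cost plus estimated wages
N_PATIENTS = 2931
TOTAL_HOURS = 3.5 + 0.25   # 3.5 hours plus 15 minutes

# Portion of the total attributed to estimated wages.
wage_cost = round(TOTAL_COST_USD - API_COST_USD, 1)

# Derived per-patient figures (illustrative, not reported in the summary).
cost_per_patient = round(TOTAL_COST_USD / N_PATIENTS, 4)
seconds_per_patient = round(TOTAL_HOURS * 3600 / N_PATIENTS, 1)

print(f"Estimated wage portion: ${wage_cost}")              # $29.6
print(f"Cost per patient: ${cost_per_patient}")             # about $0.0325
print(f"Time per patient: {seconds_per_patient} seconds")   # about 4.6 seconds
```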
The analysis found that developing and using LLM prompts was an efficient way to extract important clinical data from extensive medical records. The study highlights the potential of LLM-based natural language processing for breast cancer patients. The prompts developed in this study can be applied in future research to gather clinical information.
Source: e-roj.org/journal/view.php?doi=10.3857/roj.2023.00633