Performing systematic reviews is a time-consuming and resource-intensive process.
We investigated whether a machine learning system could perform systematic reviews more efficiently.
We assessed all systematic reviews and meta-analyses of interventional randomized controlled trials cited in recent clinical guidelines from the American Diabetes Association, American College of Cardiology, American Heart Association (two guidelines), and American Stroke Association. After reproducing the primary screening dataset according to the published search strategy, we labeled each article in the dataset as correct (actually included in the review) or incorrect (not included). These two sets of articles were then used to train a neural network-based artificial intelligence engine (Concept Encoder, FRONTEO Inc.). The primary endpoint was work saved over sampling at 95% recall (WSS@95%).
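For readers unfamiliar with the primary endpoint, the following is a minimal sketch of how WSS@95% is commonly computed in the screening-automation literature, assuming the usual definition: the fraction of articles left unscreened when 95% of the correct articles have been found, minus the 5% that random sampling would save at the same recall. The function name `wss_at_recall` and the toy scores are hypothetical and are not taken from the study or from the Concept Encoder.

```python
import math


def wss_at_recall(scores, labels, recall=0.95):
    """Work saved over sampling (WSS) at a target recall.

    scores: model relevance scores, higher means more likely eligible.
    labels: 1 for a correct (included) article, 0 for an incorrect one.
    Assumes articles are screened in descending score order.
    """
    n = len(labels)
    n_pos = sum(labels)
    if n == 0 or n_pos == 0:
        raise ValueError("need at least one article and one correct article")

    # Number of correct articles that must be found to reach the target recall.
    # The small epsilon guards against floating-point overshoot in the product.
    needed = math.ceil(recall * n_pos - 1e-9)

    # Rank articles from highest to lowest score and screen until enough are found.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = 0
    screened = 0
    for _, label in ranked:
        screened += 1
        found += label
        if found >= needed:
            break

    # Fraction of articles not screened, minus what random sampling would save.
    return (n - screened) / n - (1.0 - recall)


# Toy example: 10 articles, 2 correct ones ranked 1st and 3rd by the model.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_wss = wss_at_recall(toy_scores, toy_labels)
```

In this toy case, screening stops after 3 of 10 articles, so the value is 0.7 minus 0.05. A ranking that places the correct articles last yields a slightly negative value, which is why WSS is read relative to random sampling rather than as a raw percentage.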
Among 145 candidates, 8 reviews of randomized controlled trials fulfilled the inclusion criteria. Across these 8 reviews, the machine learning system significantly reduced the literature screening workload, by at least 6-fold versus manual screening based on WSS@95%. When machine learning was initiated with two correct articles randomly selected by a researcher, it achieved a 10-fold reduction in workload versus manual screening based on WSS@95%, with high sensitivity for eligible studies. The area under the receiver operating characteristic curve increased dramatically each time the algorithm learned an additional correct article.
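The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen correct article is scored higher than a randomly chosen incorrect one. The sketch below computes it by direct pairwise comparison, with ties counted as half. It illustrates the metric only; the function name `roc_auc` and the example scores are hypothetical, and the study's own evaluation code is not described in the source.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: model relevance scores, higher means more likely eligible.
    labels: 1 for a correct (included) article, 0 for an incorrect one.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect article")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # correct article ranked above incorrect one
            elif p == q:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy example: two correct and two incorrect articles.
example_auc = roc_auc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of articles, which is acceptable for a sketch; a rank-based implementation would be preferable for a full screening dataset.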
The Concept Encoder achieved a 10-fold or better reduction of the screening workload for systematic review after learning two randomly selected studies on the target topic. However, few meta-analyses of randomized controlled trials were included. The Concept Encoder could facilitate the acquisition of evidence for clinical guidelines.
UMIN clinical trials registry (UMIN000032663).
