Data mining versus manual screening to select papers for inclusion in systematic reviews: a novel method to increase efficiency

Elena Ierardi, J. Chris Eilbeck, Frederike van Wijck*, Myzoon Ali, Fiona Coupar

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Systematic reviews rely on identification of studies, initially through electronic searches yielding potentially thousands of studies, followed by reviewer-led screening of studies for inclusion. This standard method is time- and resource-intensive. We designed and applied an algorithm written in Python involving computer-aided identification of keywords within each paper for an exemplar systematic review of arm impairment after stroke. The standard method involved reading each abstract to search for these keywords. We compared the methods in terms of accuracy in identification of keywords, abstracts' eligibility, and time taken to make a decision about eligibility. For external validation, we adapted the algorithm for a different systematic review, and compared the studies identified as eligible by the algorithm with those included in that review. For the exemplar systematic review, the algorithm failed on 72 out of 2,789 documents retrieved (2.6%). Both methods identified the same 610 studies for inclusion. Based on a sample of 21 randomly selected abstracts, the standard screening took 1.58 ± 0.26 min per abstract. Computer output screening took 0.43 ± 0.14 min per abstract. The mean difference between the two methods was 1.15 min (P < 0.0001), a time saving of 73% per abstract. For the other systematic review, use of the algorithm resulted in the same studies being identified. One study was excluded based on the interpretation of the comparison intervention. Our purpose-built software was an accurate and significantly time-saving method for identifying eligible abstracts for inclusion in systematic reviews. This novel method could be adapted for other systematic reviews in future for the benefit of authors, reviewers and editors.
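As an illustration of the kind of keyword identification described in the abstract, the following is a minimal sketch in Python. It is not the authors' software: the keyword groups, the function names, and the eligibility rule (at least one hit in every keyword group) are all assumptions made for this example, since the abstract does not specify them.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword groups for a review of arm impairment after stroke.
# The actual keyword list used in the paper is not given in the abstract.
KEYWORD_GROUPS = {
    "population": ["stroke", "hemiparesis"],
    "body_part": ["arm", "upper limb", "hand"],
}


@dataclass
class ScreeningResult:
    matches: dict   # keyword group -> list of terms found in the abstract
    eligible: bool  # True if every group has at least one hit (assumed rule)


def find_keywords(abstract, groups=KEYWORD_GROUPS):
    """Return, per group, the terms that occur as whole words in the abstract."""
    text = abstract.lower()
    found = {}
    for group, terms in groups.items():
        # Word boundaries stop "arm" from matching inside "pharmacological".
        found[group] = [
            term for term in terms
            if re.search(r"\b" + re.escape(term) + r"\b", text)
        ]
    return found


def screen(abstract, groups=KEYWORD_GROUPS):
    """Flag an abstract as potentially eligible under the assumed rule."""
    matches = find_keywords(abstract, groups)
    eligible = all(matches[group] for group in groups)
    return ScreeningResult(matches=matches, eligible=eligible)


if __name__ == "__main__":
    result = screen("Arm impairment after stroke: a randomised trial.")
    print(result.matches, result.eligible)
```

In a workflow like the one the abstract reports, output of this kind would still be checked by a reviewer, who reads the listed keyword hits rather than the full abstract.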

Original language: English
Pages (from-to): 284-292
Journal: International Journal of Rehabilitation Research
Issue number: 3
Publication status: Published - 30 Sept 2023


  • Text data mining
  • Algorithm
  • Systematic Review
  • Abstract
  • Screening
  • Keywords
  • Selection

ASJC Scopus subject areas

  • General Social Sciences
  • General Psychology
  • General Medicine
  • General Health Professions
  • General Decision Sciences
  • Rehabilitation
  • Physical Therapy, Sports Therapy and Rehabilitation

