Early Lung Cancer Screening using NLP and AI

  • Research type

    Research Study

  • Full title

    Application of the Clinithink Natural Language Processing tool and Machine Learning methods to Clinical Notes for the Screening of Lung Cancer.

  • IRAS ID

    320934

  • Contact name

    John Conibear

  • Contact email

    john.conibear@nhs.net

  • Sponsor organisation

    Barts Health NHS Trust

  • Duration of Study in the UK

    0 years, 7 months, 16 days

  • Research summary

    Identifying lung cancer early may make treatment easier and more effective. Once symptoms begin to present, the cancer may already have developed and begun to spread. At present, GPs use their own clinical acumen to refer patients they perceive to be at high risk for a chest x-ray, and then refer them on a 2-week-wait pathway based on the findings of the x-ray or because a negative x-ray finding does not provide sufficient reassurance. The referral process is therefore heavily reliant on the referrer’s personal judgment and ability to recognise symptoms, which can delay a diagnosis or lead to the cancer being misdiagnosed as a less serious ailment.

    Medical notes are rich sources of information, containing more nuanced insights into a patient’s health than are captured in tabular data. However, due to their unstructured nature, extracting relevant information from text-based inputs is a time- and labour-intensive process. For this reason, the use of unstructured data for the screening of many diseases is largely unexplored.

    Clinithink is a natural language processing tool that takes medical notes and extracts the requested information, formatting the output into a structured dataset which can be more easily analysed. By implementing a tool such as Clinithink, the limitations associated with unstructured data could be mitigated, enabling the use of the information-dense free-text notes. The intention of this study is therefore to investigate the viability of using the Clinithink tool to extract information from the medical notes of a targeted demographic of patients, and to test the viability of applying predictive models to the extracted information to identify which patients are at risk of lung cancer.

  • REC name

    London - Harrow Research Ethics Committee

  • REC reference

    23/LO/0120

  • Date of REC Opinion

    23 Mar 2023

  • REC opinion

    Further Information Favourable Opinion