Abstract
Background: Cystic fibrosis (CF) is a multisystem disease in which the assessment of disease severity based on lung function alone may not be appropriate. The aim of the study was to develop a comprehensive machine-learning algorithm to assess clinical status independent of lung function in children.
Methods: A comprehensive prospectively collected clinical database (Toronto, Canada) was used to apply unsupervised cluster analysis. The defined clusters were then compared by current and future lung function, risk of future hospitalisation, and risk of future pulmonary exacerbation treated with oral antibiotics. A k-nearest-neighbours (KNN) algorithm was used to prospectively assign clusters. The methods were validated in a paediatric clinical CF dataset from Great Ormond Street Hospital (GOSH).
Results: The optimal cluster model identified four (A–D) phenotypic clusters based on 12 200 encounters from 530 individuals. Two clusters (A and B) consistent with mild disease were identified with high forced expiratory volume in 1 s (FEV1), and low risk of both hospitalisation and pulmonary exacerbation treated with oral antibiotics. Two clusters (C and D) consistent with severe disease were also identified with low FEV1. Cluster D had the shortest time to both hospitalisation and pulmonary exacerbation treated with oral antibiotics. The outcomes were consistent in 3124 encounters from 171 children at GOSH. The KNN cluster allocation error rate was low, at 2.5% (Toronto) and 3.5% (GOSH).
Conclusion: Machine learning-derived phenotypic clusters can predict disease severity independent of lung function and could be used in conjunction with functional measures to predict future disease trajectories in CF patients.
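The KNN assignment step and the allocation error rate described in the Methods and Results can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which clustering method, features, distance metric or value of k were used, so the Euclidean distance, k=5, the two-dimensional features and the synthetic data below are all assumptions made for illustration. The function names `knn_assign` and `allocation_error_rate` are hypothetical.

```python
import math
import random
from collections import Counter


def knn_assign(train_X, train_labels, x, k=5):
    """Assign x to the majority cluster among its k nearest labelled neighbours.

    Assumption: Euclidean distance on already-scaled features.
    """
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def allocation_error_rate(train_X, train_labels, test_X, test_labels, k=5):
    """Fraction of held-out encounters whose KNN-assigned cluster differs
    from the cluster label they were originally given."""
    wrong = sum(
        knn_assign(train_X, train_labels, x, k) != y
        for x, y in zip(test_X, test_labels)
    )
    return wrong / len(test_X)


# Synthetic stand-in data (NOT study data): four well-separated groups
# labelled A-D, each encounter described by two made-up numeric features.
random.seed(0)
centres = {"A": (0.0, 0.0), "B": (5.0, 0.0), "C": (0.0, 5.0), "D": (5.0, 5.0)}


def sample(n_per_cluster):
    X, y = [], []
    for label, (cx, cy) in centres.items():
        for _ in range(n_per_cluster):
            X.append((random.gauss(cx, 0.5), random.gauss(cy, 0.5)))
            y.append(label)
    return X, y


train_X, train_y = sample(50)  # encounters with known cluster labels
test_X, test_y = sample(20)    # new encounters to allocate
error = allocation_error_rate(train_X, train_y, test_X, test_y, k=5)
print(f"allocation error rate: {error:.1%}")
```

In this sketch the cluster labels are taken as given; in the study they came from the unsupervised cluster analysis, and KNN was then used to allocate new encounters to those existing clusters.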
Machine learning-derived clusters can be used to define clinical status in children with cystic fibrosis https://bit.ly/3nudlPG
Footnotes
This article has supplementary material available from erj.ersjournals.com
Conflict of interest: N. Filipow has nothing to disclose.
Conflict of interest: G. Davies reports personal fees for lectures from Chiesi Limited, outside the submitted work.
Conflict of interest: E. Main has nothing to disclose.
Conflict of interest: N.J. Sebire has nothing to disclose.
Conflict of interest: C. Wallis has nothing to disclose.
Conflict of interest: F. Ratjen reports grants and personal fees for consultancy from Vertex, Calithera, Proteostasis, TranslateBio, Genentech, Bayer and Boehringer Ingelheim, outside the submitted work.
Conflict of interest: S. Stanojevic reports grants from SickKids Foundation and European Respiratory Society, during the conduct of the study.
Support statement: G. Davies was supported by a grant from the UCL's Wellcome Institutional Strategic Support Fund 3 (grant reference 204841/Z/16/Z). S. Stanojevic received funding from the Program for Individualized Cystic Fibrosis Therapy Synergy Grant and the European Respiratory Society. N. Filipow received funding from a UCL, GOSH and Toronto SickKids studentship. All research at Great Ormond Street Hospital NHS Foundation Trust and UCL Great Ormond Street Institute of Child Health is made possible by the NIHR Great Ormond Street Hospital Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health. Funding information for this article has been deposited with the Crossref Funder Registry.
- Received July 23, 2020.
- Accepted December 22, 2020.
- Copyright © The authors 2021. For reproduction rights and permissions contact permissions@ersnet.org