Latest Research
Data mining-based model and risk prediction of colorectal cancer by using secondary health data: A systematic review

2020-05-02



Authors: Hailun Liang, Lei Yang, Lei Tao, Leiyu Shi, Wuyang Yang, Jiawei Bai, Da Zheng, Ning Wang, Jiafu Ji


Abstract: 

Objective

Prevention and early detection of colorectal cancer (CRC) can increase the chances of successful treatment and reduce the disease burden. Various data mining technologies have been used to strengthen the early detection of CRC in primary care, but evidence synthesis on the effectiveness of these models is scant. This systematic review synthesizes studies that examine the effect of data mining on improving risk prediction of CRC.


Methods

The PRISMA framework guided the conduct of this study. We obtained papers via PubMed, Cochrane Library, EMBASE and Google Scholar. Quality appraisal was performed using Downs and Black’s quality checklist. To evaluate the performance of the included models, the values of specificity and sensitivity were compared, the values of area under the curve (AUC) were plotted, and the median of the overall AUC across included studies was computed.
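The summary measures named above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the confusion-matrix counts and AUC values in the usage note are made-up placeholders rather than data from the included studies, and the "inclusive" quartile method is an assumption, since the abstract does not say how the interquartile range was computed.

```python
from statistics import quantiles


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of true cases the model flags.
    specificity = TN / (TN + FP): share of non-cases the model clears.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def median_and_iqr(values):
    """Return (median, (q1, q3)) for a list of per-study AUC values.

    Assumption: quartiles use the "inclusive" method; other conventions
    give slightly different bounds on small samples.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)
```

For example, `sensitivity_specificity(tp=80, fn=20, tn=90, fp=10)` describes a hypothetical model that detects 80 of 100 cases and clears 90 of 100 non-cases, and `median_and_iqr([0.70, 0.75, 0.80, 0.85, 0.90])` summarizes five placeholder AUC values in the same median-and-IQR form the review reports.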


Results

A total of 316 studies were reviewed in full text, and seven articles were included. The included studies implemented techniques including artificial neural networks, Bayesian networks and decision trees. Six articles reported the overall model accuracy. Overall, the median AUC was 0.8243 [interquartile range (IQR): 0.8050–0.8886]. In the two articles that reported comparisons with traditional models, the data mining methods performed better than the traditional models, with a best AUC improvement of 10.7%.


Conclusions

The adoption of data mining technologies for CRC detection is at an early stage. The limited number of included articles and the heterogeneity of those studies imply that more rigorous research is needed to further investigate the effects of these techniques.


Link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7219096/