Grant Details

Grant Number: 1R01CA246418-01
Primary Investigator: Bian, Jiang
Organization: University Of Florida
Project Title: The Benefits and Harms of Lung Cancer Screening in Florida
Fiscal Year: 2020


PROJECT SUMMARY/ABSTRACT Lung cancer is the leading cause of cancer-related death in both men and women in the United States. Currently, approximately 70% of lung cancer patients are diagnosed at advanced stages, and the 5-year survival rate of advanced-stage lung cancer is very low, at only 16%. Investigators have been searching for effective screening modalities for the early detection of lung cancer so that patients can receive curative treatments at an early stage. After the National Lung Screening Trial (NLST) demonstrated the effectiveness of low-dose computed tomography (LDCT) scans for lung cancer screening (LCS), researchers and physicians hoped to save lives from lung cancer by screening the high-risk population: individuals aged 55 to 77 years with a 30 pack-year smoking history who currently smoke or are former smokers who quit within the past 15 years. Since the release of the landmark NLST results, many medical associations have published guidelines recommending LDCT-based screening for individuals at high risk for lung cancer, and the Centers for Medicare and Medicaid Services (CMS) decided to cover LCS for Medicare beneficiaries at high risk for lung cancer. While many efforts have been made to accelerate the dissemination of beneficial LCS, concerns over the high false-positive rate (96.4% of positive results), invasive diagnostic procedures, postprocedural complications, and health care costs may hinder the utilization of lung cancer screening. These concerns were magnified as researchers and policymakers started questioning whether the complication and false-positive rates in real-world settings would be even higher than those reported in the NLST, which was conducted in a setting with well-established facilities and proficiency in cancer care. Therefore, we propose to understand the contemporary use of lung cancer screening and the associated health care outcomes and costs using data from a real-world setting.
Our study has three goals: 1) to develop an innovative computable phenotype algorithm to identify high-risk and low-risk individuals for LCS from both structured and unstructured (i.e., clinical notes) electronic health record (EHR) data, and to develop advanced natural language processing (NLP) methods to extract LCS-related clinical information from clinical notes such as radiology reports; 2) to determine the appropriate and inappropriate use of LDCT among high-risk and low-risk individuals in Florida and to examine LDCT test results, the rates of invasive diagnostic procedures, postprocedural complications, and incidental findings in real-world settings; and 3) to develop and validate a microsimulation model of the clinical course of LCS that incorporates real-world LCS data to estimate the long-term benefits and cost-effectiveness of LCS. Our proposed study has the potential to reduce lung cancer incidence and mortality by informing policymakers and practitioners about the appropriateness of contemporary use of LCS. This knowledge will help both patients and physicians better understand the harm-benefit tradeoff of lung cancer screening and translate that knowledge into practice to prevent avoidable postprocedural complications.
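As a rough illustration of the structured-data side of a computable phenotype such as the one named in goal 1, the high-risk criteria stated in the abstract (age 55 to 77 years, a 30 pack-year smoking history, and current smoking or quitting within the past 15 years) can be written as a simple rule. This is a minimal sketch and not the project's algorithm: the `SmokingRecord` fields and the `is_high_risk` function are hypothetical names, and treating the thresholds as inclusive ("at least 30 pack-years", "15 years or fewer since quitting") is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SmokingRecord:
    """Hypothetical structured fields; not taken from any real EHR schema."""
    age: int
    pack_years: float
    current_smoker: bool
    years_since_quit: Optional[float] = None  # None if unknown or never quit


def is_high_risk(record: SmokingRecord) -> bool:
    """Apply the abstract's high-risk criteria with assumed inclusive bounds."""
    age_ok = 55 <= record.age <= 77
    exposure_ok = record.pack_years >= 30
    if record.current_smoker:
        status_ok = True
    else:
        # Former smokers qualify only with a known quit time within 15 years.
        status_ok = (
            record.years_since_quit is not None
            and record.years_since_quit <= 15
        )
    return age_ok and exposure_ok and status_ok
```

In practice, fields such as pack-years are often missing from structured data, which is why the abstract pairs the phenotype with NLP extraction from clinical notes.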


An efficient and accurate distributed learning algorithm for modeling multi-site zero-inflated count outcomes.
Authors: Edmondson M.J., Luo C., Duan R., Maltenfort M., Chen Z., Locke K., Shults J., Bian J., Ryan P.B., Forrest C.B., et al.
Source: Scientific reports, 2021-10-04; 11(1), p. 19647.
EPub date: 2021-10-04.
PMID: 34608222

Comparing the downstream costs and healthcare utilization associated with the use of low-dose computed tomography (LDCT) in lung cancer screening in patients with and without Alzheimer's disease and related dementias (ADRD).
Authors: Zhang Y., Bian J., Huo J., Yang S., Guo Y., Shao H.
Source: Current medical research and opinion, 2021 10; 37(10), p. 1731-1737.
EPub date: 2021-07-26.
PMID: 34252317

Optimizing Identification of People Living with HIV from Electronic Medical Records: Computable Phenotype Development and Validation.
Authors: Liu Y., Siddiqi K.A., Cook R.L., Bian J., Squires P.J., Shenkman E.A., Prosperi M., Jayaweera D.T.
Source: Methods of information in medicine, 2021 09; 60(3-04), p. 84-94.
EPub date: 2021-09-30.
PMID: 34592777

Challenges in replicating secondary analysis of electronic health records data with multiple computable phenotypes: A case study on methicillin-resistant Staphylococcus aureus bacteremia infections.
Authors: Jun I., Rich S.N., Chen Z., Bian J., Prosperi M.
Source: International journal of medical informatics, 2021 09; 153, p. 104531.
EPub date: 2021-07-16.
PMID: 34332468

The role of sex and rurality in cancer fatalistic beliefs and cancer screening utilization in Florida.
Authors: Guo Y., Szurek S.M., Bian J., Braithwaite D., Licht J.D., Shenkman E.A.
Source: Cancer medicine, 2021 Sep; 10(17), p. 6048-6057.
EPub date: 2021-07-13.
PMID: 34254469

The application of artificial intelligence and data integration in COVID-19 studies: a scoping review.
Authors: Guo Y., Zhang Y., Lyu T., Prosperi M., Wang F., Xu H., Bian J.
Source: Journal of the American Medical Informatics Association : JAMIA, 2021-08-13; 28(9), p. 2050-2067.
PMID: 34151987

Applications of artificial intelligence in drug development using real-world data.
Authors: Chen Z., Liu X., Hogan W., Shenkman E., Bian J.
Source: Drug discovery today, 2021 05; 26(5), p. 1256-1264.
EPub date: 2020-12-24.
PMID: 33358699

Bagged random causal networks for interventional queries on observational biomedical datasets.
Authors: Prosperi M., Guo Y., Bian J.
Source: Journal of biomedical informatics, 2021 03; 115, p. 103689.
EPub date: 2021-02-04.
PMID: 33548542

Geographic Variation in Knowledge of Palliative Care Among US Adults: Findings From 2018 Health Information National Trends Survey.
Authors: Chen G., Hong Y.R., Wilkie D.J., Kittleson S., Huo J., Bian J.
Source: The American journal of hospice & palliative care, 2021 Mar; 38(3), p. 291-299.
EPub date: 2020-08-06.
PMID: 32757758

When text simplification is not enough: could a graph-based visualization facilitate consumers' comprehension of dietary supplement information?
Authors: He X., Zhang R., Alpert J., Zhou S., Adam T.J., Raisa A., Peng Y., Zhang H., Guo Y., Bian J.
Source: JAMIA open, 2021 Jan; 4(1), p. ooab026.
EPub date: 2021-04-04.
PMID: 33855274

International Classification of Diseases, Tenth Revision, Clinical Modification social determinants of health codes are poorly used in electronic health records.
Authors: Guo Y., Chen Z., Xu K., George T.J., Wu Y., Hogan W., Shenkman E.A., Bian J.
Source: Medicine, 2020-12-24; 99(52), p. e23818.
PMID: 33350768

Extracting Family History of Patients From Clinical Narratives: Exploring an End-to-End Solution With Deep Learning Models.
Authors: Yang X., Zhang H., He X., Bian J., Wu Y.
Source: JMIR medical informatics, 2020-12-15; 8(12), p. e22982.
EPub date: 2020-12-15.
PMID: 33320104

Statin Use for Atherosclerotic Cardiovascular Disease Prevention Among Sexual Minority Adults.
Authors: Guo Y., Wheldon C.W., Shao H., Pepine C.J., Handberg E.M., Shenkman E.A., Bian J.
Source: Journal of the American Heart Association, 2020-12-15; 9(24), p. e018233.
EPub date: 2020-12-02.
PMID: 33317368

An ontology-based documentation of data discovery and integration process in cancer outcomes research.
Authors: Zhang H., Guo Y., Prosperi M., Bian J.
Source: BMC medical informatics and decision making, 2020-12-14; 20(Suppl 4), p. 292.
EPub date: 2020-12-14.
PMID: 33317497

Assessing the practice of data quality evaluation in a national clinical data research network through a systematic scoping review in the era of real-world data.
Authors: Bian J., Lyu T., Loiacono A., Viramontes T.M., Lipori G., Guo Y., Wu Y., Prosperi M., George T.J., Harle C.A., et al.
Source: Journal of the American Medical Informatics Association : JAMIA, 2020-12-09; 27(12), p. 1999-2010.
PMID: 33166397

Clinical concept extraction using transformers.
Authors: Yang X., Bian J., Hogan W.R., Wu Y.
Source: Journal of the American Medical Informatics Association : JAMIA, 2020-12-09; 27(12), p. 1935-1942.
PMID: 33120431

A Natural Language Processing Tool to Extract Quantitative Smoking Status from Clinical Narratives.
Authors: Yang X., Yang H., Lyu T., Yang S., Guo Y., Bian J., Xu H., Wu Y.
Source: medRxiv : the preprint server for health sciences, 2020-11-05.
EPub date: 2020-11-05.
PMID: 33173920

A Natural Language Processing Tool to Extract Quantitative Smoking Status from Clinical Narratives.
Authors: Yang X., Yang H., Lyu T., Yang S., Guo Y., Bian J., Xu H., Wu Y.
Source: IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics, 2020 Nov-Dec; 2020.
EPub date: 2021-03-12.
PMID: 33786419

User-centered design of a web-based crowdsourcing-integrated semantic text annotation tool for building a mental health knowledge base.
Authors: He X., Zhang H., Bian J.
Source: Journal of biomedical informatics, 2020 10; 110, p. 103571.
EPub date: 2020-09-19.
PMID: 32961307

Learning from local to global: An efficient distributed algorithm for modeling time-to-event data.
Authors: Duan R., Luo C., Schuemie M.J., Tong J., Liang C.J., Chang H.H., Boland M.R., Bian J., Xu H., Holmes J.H., et al.
Source: Journal of the American Medical Informatics Association : JAMIA, 2020-07-01; 27(7), p. 1028-1036.
PMID: 32626900

Explainable artificial intelligence models using real-world electronic health record data: a systematic scoping review.
Authors: Payrovnaziri S.N., Chen Z., Rengifo-Moreno P., Miller T., Bian J., Chen J.H., Liu X., He Z.
Source: Journal of the American Medical Informatics Association : JAMIA, 2020-07-01; 27(7), p. 1173-1185.
PMID: 32417928

Are Preregistration and Registered Reports Vulnerable to Hacking?
Authors: Bian J., Min J.S., Prosperi M., Wang M.
Source: Epidemiology (Cambridge, Mass.), 2020 05; 31(3), p. e32.
PMID: 31985501

Identifying Clinical Risk Factors for Opioid Use Disorder using a Distributed Algorithm to Combine Real-World Data from a Large Clinical Data Research Network.
Authors: Tong J., Chen Z., Duan R., Lo-Ciganic W.H., Lyu T., Tao C., Merkel P.A., Kranzler H.R., Bian J., Chen Y.
Source: AMIA ... Annual Symposium proceedings. AMIA Symposium, 2020; 2020, p. 1220-1229.
EPub date: 2021-01-25.
PMID: 33936498

Developing and Validating a Computable Phenotype for the Identification of Transgender and Gender Nonconforming Individuals and Subgroups.
Authors: Guo Y., He X., Lyu T., Zhang H., Wu Y., Yang X., Chen Z., Markham M.J., Modave F., Xie M., et al.
Source: AMIA ... Annual Symposium proceedings. AMIA Symposium, 2020; 2020, p. 514-523.
EPub date: 2021-01-25.
PMID: 33936425

Leverage Real-world Longitudinal Data in Large Clinical Research Networks for Alzheimer's Disease and Related Dementia (ADRD).
Authors: Duan R., Chen Z., Tong J., Luo C., Lyu T., Tao C., Maraganore D., Bian J., Chen Y.
Source: AMIA ... Annual Symposium proceedings. AMIA Symposium, 2020; 2020, p. 393-401.
EPub date: 2021-01-25.
PMID: 33936412

An ontology-guided semantic data integration framework to support integrative data analysis of cancer survival.
Authors: Zhang H., Guo Y., Li Q., George T.J., Shenkman E., Modave F., Bian J.
Source: BMC medical informatics and decision making, 2018-07-23; 18(Suppl 2), p. 41.
EPub date: 2018-07-23.
PMID: 30066664
