Publications
Selected publications from the Gong Lab. A full bibliography is available on Dr. Gong’s Yale profile.
In press & forthcoming
Learning from Literature: Integrating LLMs and Bayesian Hierarchical Modeling for Oncology Trial Design
Gong G, Roychoudhury S, Meisner A, Pusztai L, Goldberg SB, Wei W.
JCO Clinical Cancer Informatics. In press.
arXiv
Integrates large language models with Bayesian hierarchical modeling to support oncology clinical trial design driven by evidence from published literature.
The Fundamentals of Digital Oncology: Key Technologies in Digital Oncology
Liu J, Lustberg M, Gong G.
In: Digital Oncology (series editors: Matti Aapro, Pietro Presti). Elsevier/Academic Press, 2026. Under review.
Book chapter on foundational technologies in digital oncology.
Peer-reviewed original research
2026
Evaluating Underrepresentation in Cancer Clinical Trial Enrollment across Three NCI-Designated Cancer Centers: A Retrospective Demographic Study
Gong G, Liu J, Syed M, Worley K, Pandya S, Taborda C, Mendez L, Battaglia T, Osterman TJ, Reid SA, Mazo Canola M, Creighton SL, Nahm Zozus M, Park BH, Padalecki SS, Kunz PL, Lustberg M.
JCO Oncology Advances 2026; 3:e2500159.
DOI: 10.1200/OA-25-00159
Retrospective analysis of demographic underrepresentation in cancer clinical trial enrollment across three NCI-designated cancer centers.
CTPM: A Real-Time, Common Data Model and Artificial Intelligence–Driven System for Automated Patient Pre-Screening in Cancer Clinical Trials
Gong G, Liu J, Pandya S, Taborda C, Wiesendanger N, Price N, Byran W, Coppi A, Young P, Wiess C, Barganier C, Brodeur R, Fischbach N, LoRusso P, Pusztai L, Kim SY, Rozenblit M, Cecchini M, Mongiu A, Mendez L, Kaftan E, Torre C Jr, Krumholz H, Krop I, Schulz W, Lustberg M, Kunz PL.
JCO Clinical Cancer Informatics 2026; 10:e2500262.
PubMed · DOI: 10.1200/CCI-25-00262
Developed and validated a hybrid rules- and NLP-based CTPM pipeline using OMOP-standardized EHR data across 29 oncology trials.
Assessment of the Integrity of Real-Time Electronic Health Record Data Used in Clinical Research
Liu J, Pandya S, Coppi A, Young HP, Krumholz H, Schulz W, Gong G.
PLOS ONE 2026; 21:e0340287.
PubMed · DOI: 10.1371/journal.pone.0340287
Benchmarked the accuracy and reliability of real-time EHR data for secondary use in clinical research.
A Natural Language Processing Framework for Structuring and Visualizing Clinical Trial Eligibility Criteria at Scale: Protocol for a Quantitative Study
Xie J, Parikh J, Liu J, Pandya S, Gong G.
JMIR Research Protocols 2026; 15:e86425.
Full text · DOI: 10.2196/86425
Protocol for an LLM-enabled pipeline to cluster, summarize, and visualize eligibility criteria from ClinicalTrials.gov oncology trials.
2021
Clinical Characteristics and Outcomes for 7,995 Patients with SARS-CoV-2 Infection
McPadden J, Warner F, Young HP, Hurley NC, Pulk RA, Singh A, Durant TJS, Gong G, Desai N, Haimovich A, Taylor RA, Gunel M, Dela Cruz CS, Farhadian SF, Siner J, Villanueva M, Churchwell K, Hsiao A, Torre CJ Jr, Velazquez EJ, Herbst RS, Iwasaki A, Ko AI, Mortazavi BJ, Krumholz HM, Schulz WL.
PLOS ONE 2021; 16:e0243291.
PubMed · DOI: 10.1371/journal.pone.0243291
Cohort study of clinical characteristics and outcomes for 7,995 patients with SARS-CoV-2 infection at Yale New Haven Health.
Next Generation Phenotyping in Clinical Trial Based on Real-World Data
Gong G.
PhD dissertation, Yale Graduate School of Arts and Sciences, 2021.
Yale EliScholar
Dissertation on computational phenotyping for clinical trials using real-world data. Advisor: Harlan Krumholz, MD, SM.
2020
Patient Factors Associated with SARS-CoV-2 in an Admitted Emergency Department Population
Haimovich A, Warner F, Young HP, Ravindra NG, Sehanobish A, Gong G, Wilson FP, van Dijk D, Schulz W, Taylor RA.
J Am Coll Emerg Physicians Open 2020; 1:569–577.
PubMed · DOI: 10.1002/emp2.12145
Bridging the Collaboration Gap: Real-Time Identification of Clinical Specimens for Biomedical Research
Durant TJS, Gong G, Price N, Schulz WL.
J Pathol Inform 2020; 11:14.
PubMed · DOI: 10.4103/jpi.jpi_15_20
Preprints & under review
LEAD-ONC: An AI-Assisted Framework for Automated Extraction and Harmonization of Clinical Trial Data from Oncology Literature
Song M, Wei W, Gong G.
Under review. arXiv:2602.08172
Framework for automated extraction and harmonization of oncology clinical trial data from published literature.
TrialChain: A Blockchain-Based Platform to Validate Data Integrity in Large Biomedical Research Studies
Dai H, Young HP, Durant TJS, Gong G, Kang M, Krumholz HM, Schulz WL, Jiang L.
arXiv 2018. arXiv:1807.03662
Selected conference abstracts & presentations
Improving Identification and Enrollment in HR+/HER2− Breast Cancer Trials Using AI Clinical Trial Patient Matching Tool
Gong G, Liu J, Pandya S, Xie J, Parikh J, Fischbach N, Kunz P, Pusztai L, Lustberg M.
San Antonio Breast Cancer Symposium, San Antonio, TX, 2025.
Evaluating Underrepresentation in Breast Cancer Clinical Trial Enrollment at Yale Cancer Center: A Retrospective Demographic Study
Gong G, Liu J, Taylor M, Pandya S, Taborda C, Xie J, Parikh J, Wei W, Stefanou M, Kunz P, Fischbach N, Battaglia T, Mendez L, Gaddy J, Krop I, LoRusso P, Lustberg M, Silber A.
San Antonio Breast Cancer Symposium, San Antonio, TX, 2025.
Early Outcomes from the IMPACCT Project (Improving Participation in Cancer Clinical Trials)
Taylor M, Stefanou M, Gong G, Lustberg M, Silber A.
San Antonio Breast Cancer Symposium, San Antonio, TX, 2025.
Prospective Evaluation of an AI-Powered Clinical Trial Patient Matching (CTPM) System in Myelodysplastic Syndromes and Multiple Myeloma
Taborda C, Gong G, Douglas G, Liu J, Incoom A, Stahl M, Podoltsev N, Bewersdorf JP, Getz T, Stempel J, Kewan T, Lanino L, Bidikian A, Parker T, Bar N, Browning S, Halene S, Neparidze N, Zeidan A, Mendez L.
67th ASH Annual Meeting. Blood 2025; 146(Suppl 1):1086.
Real-World Persistence with CDK4/6 Inhibitors in Early and Metastatic HR-Positive, HER2-Negative Breast Cancer
Gong G, Brown BR, Pandya S, Caetano M, Legare R, Taghzout S, Ramos M, Liu J, Hood A, Zummo M, Lustberg M.
NCCN Annual Conference, 2026.
Automated Eligibility Screening for Adjuvant CDK4/6 Inhibitors in High-Risk HR+/HER2− Early Breast Cancer Using Natural Language Processing
Liu J, Pandya S, Brown BR, Caetano ML, Taghzout S, Ramos M, Hood A, Zummo M, Wei W, Legare R, Lustberg M, Gong G.
ASCO Annual Meeting, 2026.
Learning from the Literature: An AI-Assisted Bayesian Framework for Evidence-Driven Oncology Trial Design
Gong G, Roychoudhury S, Meisner A, Pusztai L, Goldberg SB, Wei W.
ASCO Annual Meeting, 2026.