Two faculty in the Department of Medical Informatics & Clinical Epidemiology (DMICE), Professor Melissa Haendel, PhD, and new Instructor Anita Walden, MS, are among the leaders of the National COVID Cohort Collaborative (N3C), an open-science community focused on analyzing patient-level data from many clinical centers to reveal patterns in COVID-19 patients. The grand opening of the N3C Data Enclave took place in the first week of September 2020. To create N3C, all parties involved had to overcome technical, regulatory, policy, and governance barriers to sharing patient-level clinical data from many institutions. Within months, they developed solutions to acquire and harmonize data across organizations and created a secure data environment to enable transparent and reproducible collaborative research.
The N3C Data Enclave supports collaborative analytics across a broad range of clinical and translational domains related to COVID-19 infection, such as acute kidney injury, diabetes, pregnancy, cancer, immunosuppression, social determinants of health, and many other conditions, in order to target mechanisms, drug discovery, and best care practices for COVID-19. Currently, the Enclave contains data on 304,000 persons and 25,905 COVID-19 cases documented from 11,000 visits, and these numbers are growing rapidly now that 59 clinical centers have signed regulatory agreements to submit their data. This effort has approximately 900 members across 260 organizations in 47 states and 14 countries, according to Ms. Walden. She also noted, “people with various expertise and experience coming together to collaborate on such an effort is one of the most significant examples of team science and is remarkable to witness.”
Dr. Haendel remarked, “The regulatory work (and data harmonization) was such a lift that we often were not sure if we would be able to make N3C happen. So it is very exciting to finally be able to provide critical data access to everyone.”
OHSU Chief Research Information Officer and DMICE Professor David Dorr, MD, MS, described the advantages of N3C: “One of the most appealing aspects of N3C is the ability to get synthetic data extracts to train models that then can be used to improve our understanding of the disease; these models can then be tested and applied both within the Enclave and even at local health systems to predict outcomes and reduce the harm from this terrible pandemic.”
The N3C Data Enclave is anticipated to be one of the largest collections of data on COVID-19 patients in the United States. Data analysis within the Enclave is supported by a variety of tools, including R and Python, the most widely used open-source platforms for statistical analysis and data science (Watch a demonstration of the platform). Researchers requesting access to, or working within, the Enclave are encouraged to assemble collaborative teams with diverse expertise in areas such as clinical research, statistical analysis, and informatics to make the best use of the N3C Data Enclave. “One of the most exciting things about the N3C Enclave is its ability to track the provenance and contributions, and thereby provide robust attribution to all,” said Prof. Haendel. This attribution is especially valuable for early-career and non-traditional contributors.
Researchers interested in accessing the data will need to register with N3C and submit a Data Use Request for review by the N3C Data Access Committee. Learn more about the data access process and requirements, including data security training.