Speaker
Description
The use of real-world data (RWD) to generate evidence has gained increasing importance since the COVID-19 pandemic. The crisis gave rise to data management challenges, particularly in data standardization. PIONEER, a European research project under the auspices of the IMI that is aimed at patient-centered outcomes research in prostate cancer, has built a network of databases using the OMOP Common Data Model (CDM), a standardized format for organizing and analyzing health data from disparate sources.
However, there is a major difference between data coming from clinical trials and RWD. Participants in clinical trials are usually selected based on strict inclusion and exclusion criteria, and interventions are administered in a standardized manner. Data is collected in a predetermined, standardized format at prescribed intervals, capturing all the relevant variables needed for analysis. In contrast, RWD, which is gathered from real-world patient experiences, is highly heterogeneous, reflecting the diversity of patient experiences as they work their way through the healthcare system.
Our aim is to create homogeneous sub-groups of prostate cancer patients based on RWD, which will enable better assessment of patient outcomes from competing treatment strategies. We use a three-step approach. First, we pre-process the data using the hierarchical structure of the SNOMED vocabulary, with entropy as a measure of randomness across the vocabulary levels. Next, we reduce the dimension of the processed list of conditions, which gives a more manageable dataset. Finally, we apply clustering to obtain medically relevant patient profiles.
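The abstract does not spell out how entropy is computed across vocabulary levels. As a minimal sketch of one possible reading, the Python snippet below rolls each condition up a toy parent hierarchy and computes the Shannon entropy of the resulting labels at each level. The `PARENT` mapping, the concept names, and the function names are hypothetical placeholders, not PIONEER's actual vocabulary tables or method.

```python
import math
from collections import Counter

# Hypothetical toy hierarchy (child -> parent). Real SNOMED concepts and
# their relationships would come from the OMOP vocabulary tables.
PARENT = {
    "type 2 diabetes": "diabetes mellitus",
    "type 1 diabetes": "diabetes mellitus",
    "diabetes mellitus": "endocrine disorder",
    "hypothyroidism": "thyroid disorder",
    "thyroid disorder": "endocrine disorder",
}


def roll_up(concept, steps):
    """Walk up to `steps` levels up the hierarchy, stopping at the root."""
    for _ in range(steps):
        if concept not in PARENT:
            break
        concept = PARENT[concept]
    return concept


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_by_level(conditions, max_steps):
    """Entropy of the condition list after rolling up 0..max_steps levels."""
    return [
        shannon_entropy([roll_up(c, k) for c in conditions])
        for k in range(max_steps + 1)
    ]


# Assumed example input: one recorded condition per entry.
conditions = [
    "type 2 diabetes",
    "type 2 diabetes",
    "type 1 diabetes",
    "hypothyroidism",
]
levels = entropy_by_level(conditions, 2)
print(levels)
```

In this toy input the entropy falls as conditions are merged into broader ancestors. A pre-processing step could use such a profile to choose the level of granularity at which to represent conditions before dimension reduction and clustering, although the selection rule itself is an assumption here.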
Classification | Both methodology and application |
---|---|
Keywords | Real world data, prostate cancer, clustering |