Dutch primary care data for research
In the Netherlands, healthcare is highly accessible. More than 99% of the population has health insurance, and almost all citizens are registered with a general practitioner (GP) practice. People are free to choose their practice, although enrolment can be refused because of capacity limitations or because the patient lives too far from the practice. The GP forms the point of care and acts as a gatekeeper in a two-way exchange of information with secondary care. Over the course of a year, around 78% of the population has at least one contact with their GP. Patient data in the GP records include demographic information, patient complaints and symptoms, diagnoses, lab results, lifestyle factors, referral notes from consultants, and hospital admissions. A patient’s medical record contains all medical information from primary care and part of that from secondary care, such as referral and discharge letters.
The Integrated Primary Care Information database
The Integrated Primary Care Information (IPCI) database is a longitudinal observational database containing routinely collected data from the computer-based patient records of a selected group of GPs throughout the Netherlands. IPCI was started in 1992 by the Department of Medical Informatics of the Erasmus University Medical Center in Rotterdam with the objective of enabling better post-marketing surveillance of drugs. The current database includes patient records from 2006 onward, when the size of the database started to increase significantly. In 2016, IPCI was certified as a Regional Data Center. Since 2019, the data have also been standardized to the Observational Medical Outcomes Partnership common data model (OMOP CDM), enabling collaborative research in a large network of databases within the Observational Health Data Sciences and Informatics (OHDSI) community.
The primary goal of IPCI is to enable medical research. In addition, reports are generated to inform GPs and their organizations about the care they provide. Contributing GPs are encouraged to use this information for their internal quality evaluation.
The IPCI database is registered on the European Medicines Agency (EMA) ENCePP resources database (http://www.encepp.eu).
Data resource area and population coverage
GP practices included in IPCI are mainly located in the middle part of the country, including the most densely populated area (the ‘Randstad’) but also some non-urban areas. IPCI is a dynamic database: patients are included from the date they register at a GP practice and remain in the database until they die or leave the practice. As of January 1, 2021, the database contains 2.5M patient records with a median follow-up duration of 4.7 years. The number of active patients is 1.4M, which comprises 8.1% of the Dutch population of 17M. The age distribution of the patients in IPCI is a good representation of the national age distribution, as shown in the image on the right.
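The entry and exit rules of a dynamic database can be illustrated with a minimal Python sketch. The patients, dates, and the function name follow_up_years are hypothetical and are not taken from IPCI. The sketch only assumes that follow-up runs from the registration date to the earliest of death, leaving the practice, or the data extraction date.

```python
from datetime import date
from statistics import median
from typing import Optional


def follow_up_years(registered: date, end: Optional[date], extraction: date) -> float:
    """Years from registration to the earliest of exit (death or leaving) or extraction."""
    stop = extraction if end is None else min(end, extraction)
    # Guard against registrations after the extraction date.
    return max((stop - registered).days, 0) / 365.25


# Hypothetical patients: (registration date, exit date or None if still registered)
patients = [
    (date(2010, 1, 1), date(2014, 1, 1)),
    (date(2015, 6, 1), None),
    (date(2019, 1, 1), date(2020, 1, 1)),
]
extraction = date(2021, 1, 1)

durations = [follow_up_years(reg, end, extraction) for reg, end in patients]
median_follow_up = median(durations)

# A patient counts as active if no exit was recorded before the extraction date.
active = sum(1 for _, end in patients if end is None or end > extraction)
```

In this toy example only the second patient is still active at the extraction date; the other two contribute follow-up time up to their exit date.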
Measures to ensure data quality
Prior to each data release, extensive quality control steps are performed, e.g., comparison of patient characteristics between practices, and checks to identify abnormal temporal data patterns in practices. For each practice, around 200 quality indicators are obtained. A quarter of these indicators refer to population characteristics, e.g., the numbers of births and deaths relative to practice size, and temporal consistency. The other indicators are based on medical data, e.g., the distribution of measurement values, the frequencies of diagnoses and procedures relative to age, and the completeness of data. The indicators are combined into a small number of quality scores for each practice. For these scores, cut-off values for acceptable quality have been defined. Practices with a score below a cut-off are excluded from research. This approach has proven to be very important, for example to check whether data from practices that have just joined the database are of acceptable quality. The details of the approach, such as the cut-off values for acceptance, are based on years of experience.
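The idea of combining indicators into scores and applying cut-offs can be sketched as follows. This is an illustration only: the indicator names, the scoring rule (fraction of indicators within the expected range per group), and the cut-off values are all assumptions, since the actual IPCI rules are not described here.

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    name: str     # hypothetical indicator name
    group: str    # "population" or "medical"
    passed: bool  # True if the value falls within the expected range


def group_scores(indicators):
    """Fraction of passed indicators per group (an assumed scoring rule)."""
    totals, passed = {}, {}
    for ind in indicators:
        totals[ind.group] = totals.get(ind.group, 0) + 1
        passed[ind.group] = passed.get(ind.group, 0) + int(ind.passed)
    return {group: passed[group] / totals[group] for group in totals}


# Hypothetical cut-off values; the real ones are based on years of experience.
CUT_OFFS = {"population": 0.75, "medical": 0.75}


def accept(indicators, cut_offs=CUT_OFFS):
    """A practice is accepted only if every score reaches its cut-off."""
    scores = group_scores(indicators)
    return all(scores.get(group, 0.0) >= cut for group, cut in cut_offs.items())


practice_a = [
    Indicator("births_per_1000", "population", True),
    Indicator("deaths_per_1000", "population", True),
    Indicator("measurement_distribution", "medical", True),
    Indicator("diagnoses_by_age", "medical", True),
    Indicator("procedures_by_age", "medical", True),
    Indicator("data_completeness", "medical", False),
]
practice_b = [
    Indicator("births_per_1000", "population", True),
    Indicator("deaths_per_1000", "population", False),
    Indicator("measurement_distribution", "medical", True),
]

accepted = {"A": accept(practice_a), "B": accept(practice_b)}
```

Here practice A reaches both cut-offs, while practice B fails on its population score and would be excluded from research.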
Standardization to the OMOP CDM
In addition to the native format, the IPCI data are converted to the OMOP Common Data Model (OMOP CDM). This involves harmonising the data structure and mapping the source terminologies to standardized concepts. The original codes are also retained in the data model. The OMOP CDM is developed and maintained by the Observational Health Data Sciences and Informatics (OHDSI) initiative and is described in detail at https://ohdsi.github.io/CommonDataModel and in The Book of OHDSI: http://book.ohdsi.org. Vocabularies of the OMOP CDM are available in the OHDSI vocabulary repository Athena (https://athena.ohdsi.org/). For IPCI, an Extract, Transform, and Load (ETL) process has been developed following the established best practices of the OHDSI community, including the use of the Data Quality Dashboard and CDM inspection R packages.
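The mapping step can be sketched in Python. This is not the IPCI ETL: the source codes and concept identifiers below are placeholders rather than real Athena concepts, and the function name to_condition_row is hypothetical. The field names follow the condition_occurrence table of the OMOP CDM, and the sketch uses the OMOP convention that an unmapped code receives concept id 0 while its source value is kept.

```python
# Placeholder mapping from source codes to standard concept ids.
# Neither the codes nor the ids are real; they only illustrate the lookup.
SOURCE_TO_STANDARD = {
    "SRC-A": 9000001,
    "SRC-B": 9000002,
}


def to_condition_row(person_id, source_code, mapping=SOURCE_TO_STANDARD):
    """Build a simplified condition_occurrence row for one source record."""
    return {
        "person_id": person_id,
        # Unmapped codes get concept id 0, following the OMOP convention.
        "condition_concept_id": mapping.get(source_code, 0),
        # The original code is retained alongside the standard concept.
        "condition_source_value": source_code,
    }


rows = [
    to_condition_row(1, "SRC-A"),
    to_condition_row(2, "SRC-UNKNOWN"),
]
unmapped = [r for r in rows if r["condition_concept_id"] == 0]
```

Keeping the source value on every row means that unmapped records remain traceable and can be remapped when the vocabularies are updated.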