Use Case PheP

Methodological Use Case PheP

Phenotyping Pipeline supporting Clinical Evaluation Projects

Through the Medical Informatics Initiative (MII) and the development of the Data Integration Centers (DIC), clinical health-care data from various sources of the Hospital Information System (HIS) are made available for medical research. This creates a unique and rich repository of clinical data that is precisely defined across all participating sites. With the methodological Use Case Phenotyping Pipeline, PheP for short, the SMITH Consortium supports the construction, qualitative enrichment and evaluation of this data pool. The University of Leipzig is in charge of the project.

Use Case PheP

Development of innovative methods for processing, extracting and structuring clinical care data for research and care provision

Provision of a platform for the execution of distributed analyses

Development, qualitative enrichment and evaluation of the clinical data pool

Participation of all university hospitals represented in SMITH under the leadership of the University of Leipzig

The PheP idea: Enriching health data and supplying it to science in the best way

PheP is a platform that enables clinical researchers, statisticians and computer scientists to collaborate across disciplines on research questions that previously seemed economically and technologically out of reach. To this end, data sets must be built that can be used for clinical-epidemiological and health-economic questions.

Phenotypes are determinable characteristics of patients. Through phenotyping, further characteristics can be derived from them and made available. PheP also supports record linkage, which combines data on a patient from different information sources, for example data from health insurance companies or death records from civil registers.
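The two ideas in this paragraph, deriving a further characteristic and linking records, can be illustrated with a minimal sketch. It is not the PheP implementation: the `Patient` fields, the salted-hash `linkage_key` and the `link_records` helper are hypothetical names chosen for illustration, and exact matching on a hashed key is only one simple stand-in for a real record linkage procedure. The body mass index serves as an example of a derived phenotype.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical minimal patient record; real HIS data is far richer.
    name: str
    birth_date: str  # ISO format, e.g. "1950-03-14"
    height_m: float
    weight_kg: float


def derive_bmi(patient: Patient) -> float:
    """Derived phenotype: body mass index from two recorded characteristics."""
    return patient.weight_kg / patient.height_m ** 2


def linkage_key(name: str, birth_date: str, secret: str) -> str:
    """Assumed linkage key: salted SHA-256 hash of normalized identifiers."""
    normalized = f"{name.strip().lower()}|{birth_date}"
    return hashlib.sha256((secret + normalized).encode("utf-8")).hexdigest()


def link_records(patients, registry_rows, secret):
    """Join hospital patients with external registry rows on the linkage key."""
    index = {
        linkage_key(row["name"], row["birth_date"], secret): row
        for row in registry_rows
    }
    linked = []
    for patient in patients:
        match = index.get(linkage_key(patient.name, patient.birth_date, secret))
        linked.append({
            "bmi": round(derive_bmi(patient), 1),
            "death_date": match["death_date"] if match else None,
        })
    return linked
```

Called with a list of `Patient` objects and a list of registry dictionaries containing `name`, `birth_date` and `death_date`, `link_records` returns one entry per patient, with `death_date` set to `None` when no registry row matches.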

One challenge in this context is that too little clinical information is available in machine-readable form. Admission letters, findings and operating room reports in particular contain valuable information such as diagnoses, medications, side effects and laboratory data, which can only be extracted with Natural Language Processing (NLP) and semantic text analysis. PheP therefore applies NLP to documents from the Hospital Information System (HIS). This work is led academically by the Jena University Language & Information Engineering Lab (JULIE Lab) in collaboration with leading companies in the field of language processing.

Building a data treasure for the health care of tomorrow

PheP focuses on supporting the development and standardized introduction of new Data Use Projects (DUPs). DUPs serve a variety of tasks: quality assurance in health care, linkage with external data, dynamic enrichment of the data pool, scientific hypothesis generation and statistical analysis of medical questions. We call the bundling of these processes the PheP Factory.

The technical basis is the PheP Engine, a platform set up at all sites. This secure technology enables distributed analyses on semantically and technically standardized data at all sites. Sensitive patient data remain in the clinic; the algorithms come to the data. This allows a flexible, data-protection-compliant approach to different clinical questions.
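The principle that the algorithms come to the data can be sketched as follows. This is an illustrative assumption, not the PheP Engine interface: `local_summary`, `run_at_sites` and `pooled_mean` are hypothetical names, and a pooled mean is only one simple example of a distributed analysis. Each site runs the same function on its own data and returns aggregate values only; the coordinator never sees individual patient values.

```python
from typing import Callable, Dict, List


def local_summary(values: List[float]) -> Dict[str, float]:
    """Runs inside a site: returns aggregates only, never the raw values."""
    return {"n": len(values), "sum": sum(values)}


def run_at_sites(
    sites: Dict[str, List[float]],
    algorithm: Callable[[List[float]], Dict[str, float]],
) -> List[Dict[str, float]]:
    """Simulates shipping the same algorithm to every site.

    In a real deployment each call would execute within the site's own
    infrastructure; here the sites are plain in-memory lists.
    """
    return [algorithm(data) for data in sites.values()]


def pooled_mean(summaries: List[Dict[str, float]]) -> float:
    """Coordinator step: combine the per-site aggregates into a global mean."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["sum"] for s in summaries)
    return total / n if n else float("nan")
```

For example, `pooled_mean(run_at_sites(sites, local_summary))` yields the same mean as pooling all values centrally, while only counts and sums leave each site.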

The clinical Use Cases ASIC and HELP of the SMITH Consortium exemplify the new possibilities. The methodological Use Case PheP is now establishing processes and an infrastructure to enable cross-site collaboration to answer future research questions.

The PheP Concept also forms the basis of the cross-MII Use Case POLAR (Polypharmacy, Drug Interactions and Risks), which was launched in early 2020 and involves all four consortia of the MII.

“SMITH focuses on the current challenges of digitization. Through the sustainable use of care data in medical research, decisive steps are taken to improve diagnosis, prevention and therapy. With these steps health care can be taken to a new level.”

Prof. Dr. Markus Löffler

Head of the SMITH Consortium
Head of the PheP Use Case
Director of the Institute for Medical Informatics, Statistics and Epidemiology (IMISE) | Leipzig University