First foundations for automatic processing of patient care texts are being laid | 2nd GeMTeX plenary meeting

Work on the GeMTeX method platform, a project of the Medical Informatics Initiative (MII) for processing clinical patient care texts, has been underway for several months. At the end of September, about 40 GeMTeX contributors met in a web meeting to discuss initial progress since the kick-off in June. One of the main topics on the agenda was the current technical status of the Data Integration Centres at the six participating university hospitals in Berlin, Dresden, Erlangen, Essen, Leipzig and Munich.

The technical leadership of the GeMTeX project is in the hands of Dr Frank Meineke from the Institute for Medical Informatics, Statistics and Epidemiology (IMISE) at Leipzig University. He presented the results of a status survey of the Data Integration Centres, which was carried out in the form of interviews. Among other things, the survey examined the technical basis for processing doctors’ letters, which is needed to build the text corpus.

Patient consent is essential for building a text corpus

“The technical requirements vary from place to place, but the Data Integration Centres are already well positioned,” Dr Meineke concluded. In particular, it is important to monitor how the number of patients giving Broad Consent develops. The Broad Consent, coordinated by the Medical Informatics Initiative (MII), forms the basis for the construction of the text corpus. GeMTeX only uses clinical data whose use for research purposes patients have consented to.

In addition to the technical aspects, strategies for anonymising and pseudonymising the doctors’ letters in the text corpus and work on the study protocols were an essential part of the discussions at the plenary meeting. These steps are a prerequisite for processing text documents from healthcare.

Annotations make clinical documents readable for artificial intelligence

Annotation work in the GeMTeX project is due to start next spring. Medical students will read the letters and mark them up according to structure and content. These markings are called annotations and form the basis for artificial intelligence applications.

In GeMTeX, these annotations are made with the INCEpTION annotation tool. In the plenary session, Serwar Basch, a research associate at the Technical University of Darmstadt, presented an extension to INCEpTION that can be used to track progress in the annotation process in more detail.

GeMTeX project leader Professor Martin Boeker from the Technical University of Munich then gave an outlook on the organisational steps to be taken in the coming months. The focus was on the further expansion of the technology and annotation working groups by the project partners as well as the presentation of milestones.

At the end of the meeting, another hot topic was on the agenda: the use of generative language models, or chatbots. The participants discussed how the developments around ChatGPT can be used for the construction of the GeMTeX text corpus.

The next GeMTeX closed meeting will take place on 20 and 21 November 2023 at the TU Munich.