{"id":18212,"date":"2023-05-15T12:07:45","date_gmt":"2023-05-15T10:07:45","guid":{"rendered":"https:\/\/www.smith.care\/?page_id=18212"},"modified":"2025-08-19T14:29:21","modified_gmt":"2025-08-19T12:29:21","slug":"about-gemtex","status":"publish","type":"page","link":"https:\/\/www.smith.care\/en\/gemtex_mii\/about-gemtex\/","title":{"rendered":"About GeMTeX"},"content":{"rendered":"<div class=\"wpb-content-wrapper\"><p>[vc_row css_animation=&#8221;&#8221; row_type=&#8221;row&#8221; use_row_as_full_screen_section=&#8221;no&#8221; type=&#8221;grid&#8221; angled_section=&#8221;no&#8221; text_align=&#8221;left&#8221; background_image_as_pattern=&#8221;without_pattern&#8221; z_index=&#8221;&#8221; css=&#8221;.vc_custom_1620639780558{background-color: #f5f6f6 !important;}&#8221;][vc_column][qode_elements_holder number_of_columns=&#8221;one_column&#8221;][qode_elements_holder_item advanced_animations=&#8221;no&#8221; item_padding=&#8221;5% 0&#8243; item_padding_480_600=&#8221;20px 0px 20px 0px&#8221; item_padding_480=&#8221;20px 0px 20px 0px&#8221;][vc_column_text]<\/p>\n<h2>GeMTeX \u2013 German Medical Text Corpus<\/h2>\n<p>[\/vc_column_text][vc_empty_space height=&#8221;15px&#8221;][vc_column_text css=&#8221;.vc_custom_1688458780272{padding-bottom: 5% !important;}&#8221;]<\/p>\n<h3>Automatic indexing of medical texts for research<\/h3>\n<p>[\/vc_column_text][vc_column_text]In everyday clinical practice, large amounts of text are produced, such as doctors&#8217; letters or medical reports, which contain valuable information on the background, course and treatment of diseases for healthcare and research. Natural language processing (NLP) programmes could support the work of doctors and researchers on the basis of these texts. However, due to the lack of standardisation of free medical text, the potential of this treasure trove of data cannot be fully exploited. The structure and language of clinical notes are highly dependent on the individuals who write them. 
In addition, medical language differs markedly from everyday and scientific language. Clinical texts are characterised by jargon, brevity and economy of language; they are written under time pressure, incompletely coded and loosely structured.[\/vc_column_text][vc_empty_space height=&#8221;15px&#8221;][vc_column_text]This is where the GeMTeX methodological use case comes in: six Data Integration Centres from the four medical informatics consortia <a href=\"https:\/\/difuture.de\/\" target=\"_blank\" rel=\"noopener\">DIFUTURE<\/a>, <a href=\"https:\/\/www.highmed.org\/de\/home\" target=\"_blank\" rel=\"noopener\">HiGHmed<\/a>, <a href=\"https:\/\/www.miracum.org\/\" target=\"_blank\" rel=\"noopener\">MIRACUM<\/a> and <a href=\"https:\/\/www.smith.care\/de\/\">SMITH<\/a> are contributing data and methods to make medical texts from patient care available for research projects. The aim is to create the largest medical text corpus in the German language. The office of the SMITH Consortium coordinates the project. 
The work in GeMTeX is based on the methodological Use Case <a href=\"https:\/\/www.smith.care\/de\/forschung\/use-case-phep\/\">PheP<\/a>\/NLP, which was implemented by the SMITH Consortium from 1 January 2018 to 31 May 2023.[\/vc_column_text][\/qode_elements_holder_item][\/qode_elements_holder][\/vc_column][\/vc_row][vc_row css_animation=&#8221;&#8221; row_type=&#8221;row&#8221; use_row_as_full_screen_section=&#8221;no&#8221; type=&#8221;grid&#8221; angled_section=&#8221;no&#8221; text_align=&#8221;left&#8221; background_image_as_pattern=&#8221;without_pattern&#8221; z_index=&#8221;&#8221; css=&#8221;.vc_custom_1620639790782{background-color: #f5f6f6 !important;background-position: center !important;background-repeat: no-repeat !important;background-size: cover !important;}&#8221;][vc_column css=&#8221;.vc_custom_1620729543205{padding-top: 0px !important;padding-right: 0px !important;padding-bottom: 0px !important;padding-left: 0px !important;background-color: #ffffff !important;}&#8221;][qode_elements_holder number_of_columns=&#8221;two_columns&#8221;][qode_elements_holder_item vertical_alignment=&#8221;middle&#8221; advanced_animations=&#8221;no&#8221; item_padding=&#8221;5%&#8221;][vc_single_image image=&#8221;18633&#8243; img_size=&#8221;1000&#215;1000&#8243; css=&#8221;.vc_custom_1721376418009{background-color: #efefef !important;}&#8221; qode_css_animation=&#8221;&#8221;][\/qode_elements_holder_item][qode_elements_holder_item vertical_alignment=&#8221;middle&#8221; horizontal_alignment=&#8221;left&#8221; advanced_animations=&#8221;no&#8221; item_padding=&#8221;2% 5% 0 5%&#8221;][vc_column_text css=&#8221;.vc_custom_1688458852730{padding-bottom: 10% !important;}&#8221;]<\/p>\n<h3>GeMTeX Use Case<\/h3>\n<p>[\/vc_column_text]<div class='q_icon_with_title tiny circle left_from_title '><div class=\"icon_text_holder\" style=\"\"><div class=\"icon_text_inner\" style=\"\"><div class=\"icon_title_holder\"><div class=\"icon_holder \" style=\" \"><span 
data-icon-type=\"circle\"   class=\"qode_iwt_icon_holder fa-stack fa-lg  \" style=\"border-color: #66C4D8;background-color: #66C4D8;\"><i class=\"qodef-icon-dripicons dripicon dripicons-search qode_iwt_icon_element\" style=\"color: #ffffff;\" ><\/i><\/span><\/div><h5 class=\"icon_title\" style=\"\">Creation of a large database for medical research projects as well as for AI models with the aim of clinical application.<\/h5><\/div><p style=''><\/p><\/div><\/div><\/div>[vc_empty_space height=&#8221;15px&#8221;]<div class='q_icon_with_title tiny circle left_from_title '><div class=\"icon_text_holder\" style=\"\"><div class=\"icon_text_inner\" style=\"\"><div class=\"icon_title_holder\"><div class=\"icon_holder \" style=\" \"><span data-icon-type=\"circle\"   class=\"qode_iwt_icon_holder fa-stack fa-lg  \" style=\"border-color: #66C4D8;background-color: #66C4D8;\"><i class=\"qodef-icon-dripicons dripicon dripicons-folder-open qode_iwt_icon_element\" style=\"color: #ffffff;\" ><\/i><\/span><\/div><h5 class=\"icon_title\" style=\"\">Extensive annotation of this corpus - in addition to basic annotation (e.g. diagnoses, medications), also deep domain-specific annotation (e.g. 
pathology, oncology, neurology, cardiology).<\/h5><\/div><p style=''><\/p><\/div><\/div><\/div>[vc_empty_space height=&#8221;15px&#8221;]<div class='q_icon_with_title tiny circle left_from_title '><div class=\"icon_text_holder\" style=\"\"><div class=\"icon_text_inner\" style=\"\"><div class=\"icon_title_holder\"><div class=\"icon_holder \" style=\" \"><span data-icon-type=\"circle\"   class=\"qode_iwt_icon_holder fa-stack fa-lg  \" style=\"border-color: #66C4D8;background-color: #66C4D8;\"><i class=\"qodef-icon-dripicons dripicon dripicons-toggles qode_iwt_icon_element\" style=\"color: #ffffff;\" ><\/i><\/span><\/div><h5 class=\"icon_title\" style=\"\">Establishment of technical and organisational standards for the mapping of text and annotations with the expansion of the MII core dataset.<\/h5><\/div><p style='color: #000000'><\/p><\/div><\/div><\/div>[vc_empty_space height=&#8221;15px&#8221;]<div class='q_icon_with_title tiny circle left_from_title '><div class=\"icon_text_holder\" style=\"\"><div class=\"icon_text_inner\" style=\"\"><div class=\"icon_title_holder\"><div class=\"icon_holder \" style=\" \"><span data-icon-type=\"circle\"   class=\"qode_iwt_icon_holder fa-stack fa-lg  \" style=\"border-color: #66C4D8;background-color: #66C4D8;\"><i class=\"qodef-icon-dripicons dripicon dripicons-home qode_iwt_icon_element\" style=\"color: #ffffff;\" ><\/i><\/span><\/div><h5 class=\"icon_title\" style=\"\">Cross-consortium project of the Medical Informatics Initiative with 17 partners from science, IT and healthcare.<\/h5><\/div><p style='color: #000000'><\/p><\/div><\/div><\/div>[\/qode_elements_holder_item][\/qode_elements_holder][\/vc_column][\/vc_row][vc_row css_animation=&#8221;&#8221; row_type=&#8221;row&#8221; use_row_as_full_screen_section=&#8221;no&#8221; type=&#8221;grid&#8221; angled_section=&#8221;no&#8221; text_align=&#8221;left&#8221; background_image_as_pattern=&#8221;without_pattern&#8221; z_index=&#8221;&#8221; 
css=&#8221;.vc_custom_1620639850406{background-color: #f5f6f6 !important;}&#8221;][vc_column][qode_elements_holder number_of_columns=&#8221;one_column&#8221;][qode_elements_holder_item advanced_animations=&#8221;no&#8221; item_padding=&#8221;4% 0 0 0&#8243;][vc_column_text]<\/p>\n<h4>Creation of a large collection of German-language medical texts used in everyday patient care<\/h4>\n<p>[\/vc_column_text][vc_empty_space height=&#8221;25px&#8221;][vc_column_text]Computer-based natural language processing can be used to build machine-learning models that automatically make the information in clinical texts visible.[\/vc_column_text][vc_empty_space height=&#8221;15px&#8221;][vc_column_text]The use of natural language processing (NLP) thus provides the necessary basis for making text documents usable for medical research. Progress in clinical NLP will depend crucially on specially trained language models that require realistic clinical documents. To realise the full potential of NLP, it is therefore necessary to have access to large amounts of annotated text from everyday patient care.<\/p>\n<p>Annotated texts are documents enriched with systematic annotations, such as information on diagnoses or medications. The annotations are manually reviewed by physician trainees and serve as a reference for further improvement of the automatic annotation. Information structured in this way can be combined with existing data for analysis and statistical modelling.[\/vc_column_text][vc_empty_space height=&#8221;15px&#8221;][vc_column_text]The IT infrastructure that was built during the development and networking phase of the Medical Informatics Initiative (MII) between 2018 and 2022 offers the possibility of making clinical documents accessible on a large scale and enriching them with systematic annotations. 
The MII method platform GeMTeX aims to solve the two major bottlenecks of current language models: data accessibility and data annotation.[\/vc_column_text][vc_empty_space height=&#8221;15px&#8221;][vc_column_text]With the consent of the patients, the GeMTeX project collects documents from the electronic patient files (ePA) of the six university medical centres in Munich, Leipzig, Essen, Berlin, Dresden and Erlangen. Using natural language processing, the documents are edited and made available in anonymised form for shared use. This creates a valuable text repertoire for research and development.[\/vc_column_text][\/qode_elements_holder_item][\/qode_elements_holder][\/vc_column][\/vc_row][vc_row css_animation=&#8221;&#8221; row_type=&#8221;row&#8221; use_row_as_full_screen_section=&#8221;no&#8221; type=&#8221;grid&#8221; angled_section=&#8221;no&#8221; text_align=&#8221;left&#8221; background_image_as_pattern=&#8221;without_pattern&#8221; z_index=&#8221;&#8221; css=&#8221;.vc_custom_1688115184553{margin-bottom: 2% !important;margin-left: % !important;background-color: #f5f6f6 !important;}&#8221;][vc_column][qode_elements_holder number_of_columns=&#8221;one_column&#8221;][qode_elements_holder_item advanced_animations=&#8221;no&#8221; item_padding=&#8221;3% 0 0 0&#8243; item_padding_480_600=&#8221;3% 0 10% 0&#8243; item_padding_480=&#8221;3% 0 10% 0&#8243;][vc_column_text]<strong>Central structures enable broad enrichment and use of clinical text documents<\/strong>[\/vc_column_text][vc_empty_space height=&#8221;25px&#8221;][vc_column_text]In its implementation, GeMTeX will create a central technical and organisational structure to collect anonymised texts and process them according to guidelines. GeMTeX thus covers a wide range of annotation tasks. These will be tested, verified and applied on a large scale to create a unique database. It can be used to train AI models and then test their usefulness in clinical practice. 
The enriched text documents and models will be made publicly available via the <a href=\"https:\/\/www.zbmed.de\/en\/\" target=\"_blank\" rel=\"noopener\">German National Library of Medicine (ZBMED)<\/a> and the DFG-funded <a href=\"https:\/\/www.nfdi4health.de\/en\/\" target=\"_blank\" rel=\"noopener\">NFDI4Health<\/a> project.[\/vc_column_text][vc_empty_space height=&#8221;15px&#8221;][vc_column_text css=&#8221;&#8221;]The GeMTeX Use Case started on 1 June 2023 and is funded by the German Federal Ministry of Research, Technology and Space (BMFTR) with around seven million euros until 31 August 2026.[\/vc_column_text][\/qode_elements_holder_item][\/qode_elements_holder][\/vc_column][\/vc_row][vc_row css_animation=&#8221;&#8221; row_type=&#8221;row&#8221; use_row_as_full_screen_section=&#8221;no&#8221; type=&#8221;grid&#8221; angled_section=&#8221;no&#8221; text_align=&#8221;left&#8221; background_image_as_pattern=&#8221;without_pattern&#8221; z_index=&#8221;&#8221; css=&#8221;.vc_custom_1674469702467{padding-bottom: 2% !important;background-color: #f5f6f6 !important;}&#8221;][vc_column css=&#8221;.vc_custom_1644578899976{background-color: #ffffff !important;}&#8221;][qode_elements_holder number_of_columns=&#8221;three_columns&#8221;][qode_elements_holder_item vertical_alignment=&#8221;top&#8221; advanced_animations=&#8221;no&#8221; item_padding=&#8221;5%&#8221; item_padding_768_1024=&#8221;0&#8243;][vc_column_text]<\/p>\n<h4>GeMTeX Fact Sheet<\/h4>\n<p>[\/vc_column_text][vc_empty_space height=&#8221;15px&#8221;][vc_column_text]As of 02\/2024[\/vc_column_text][\/qode_elements_holder_item][qode_elements_holder_item horizontal_alignment=&#8221;center&#8221; advanced_animations=&#8221;no&#8221; item_padding=&#8221;5%&#8221; item_padding_768_1024=&#8221;0&#8243;][vc_single_image image=&#8221;21422&#8243; img_size=&#8221;200&#215;150&#8243; alignment=&#8221;center&#8221; onclick=&#8221;custom_link&#8221; img_link_target=&#8221;_blank&#8221; 
qode_css_animation=&#8221;&#8221; link=&#8221;https:\/\/www.smith.care\/wp-content\/uploads\/2024\/03\/GeMTeX_Faktenblatt_DE_RGB.pdf&#8221;][vc_empty_space height=&#8221;15px&#8221;][vc_column_text]<\/p>\n<p><a href=\"https:\/\/www.smith.care\/wp-content\/uploads\/2024\/03\/GeMTeX_Faktenblatt_DE_RGB.pdf\" target=\"_blank\" rel=\"noopener\">GeMTeX Fact Sheet (German)<\/a><\/p>\n<p>[\/vc_column_text][\/qode_elements_holder_item][qode_elements_holder_item horizontal_alignment=&#8221;center&#8221; advanced_animations=&#8221;no&#8221; item_padding=&#8221;5%&#8221; item_padding_768_1024=&#8221;0&#8243;][vc_single_image image=&#8221;21422&#8243; img_size=&#8221;200&#215;150&#8243; alignment=&#8221;center&#8221; onclick=&#8221;custom_link&#8221; img_link_target=&#8221;_blank&#8221; qode_css_animation=&#8221;&#8221; link=&#8221;https:\/\/www.smith.care\/wp-content\/uploads\/2024\/03\/GeMTeX_Faktenblatt_EN_RGB.pdf&#8221;][vc_empty_space height=&#8221;15px&#8221;][vc_column_text]<\/p>\n<p><a href=\"https:\/\/www.smith.care\/wp-content\/uploads\/2024\/03\/GeMTeX_Faktenblatt_EN_RGB.pdf\" target=\"_blank\" rel=\"noopener\">GeMTeX Fact Sheet (English)<\/a><\/p>\n<p>[\/vc_column_text][\/qode_elements_holder_item][\/qode_elements_holder][\/vc_column][\/vc_row][vc_row css_animation=&#8221;&#8221; row_type=&#8221;row&#8221; use_row_as_full_screen_section=&#8221;no&#8221; type=&#8221;full_width&#8221; angled_section=&#8221;no&#8221; text_align=&#8221;left&#8221; background_image_as_pattern=&#8221;without_pattern&#8221;][vc_column][vc_column_text]<\/p>\n<p>[\/vc_column_text][\/vc_column][\/vc_row][vc_row css_animation=&#8221;&#8221; row_type=&#8221;row&#8221; use_row_as_full_screen_section=&#8221;no&#8221; type=&#8221;grid&#8221; angled_section=&#8221;no&#8221; text_align=&#8221;left&#8221; background_image_as_pattern=&#8221;without_pattern&#8221; z_index=&#8221;&#8221; css=&#8221;.vc_custom_1677759775702{margin-top: 0px !important;margin-right: 0px !important;margin-bottom: 50px 
!important;margin-left: 0px !important;border-top-width: 0px !important;border-right-width: 0px !important;border-bottom-width: 0px !important;border-left-width: 0px !important;padding-top: 0px !important;padding-right: 0px !important;padding-bottom: 0px !important;padding-left: 0px !important;}&#8221;][vc_column css=&#8221;.vc_custom_1697624367875{margin-top: 0px !important;margin-right: 0px !important;margin-bottom: 0px !important;margin-left: 0px !important;border-top-width: 0px !important;border-right-width: 0px !important;border-bottom-width: 0px !important;border-left-width: 0px !important;padding-top: 0px !important;padding-right: 0px !important;padding-bottom: 5% !important;padding-left: 0px !important;background-color: #ffffff !important;}&#8221;][qode_elements_holder number_of_columns=&#8221;one_column&#8221;][qode_elements_holder_item horizontal_alignment=&#8221;left&#8221; advanced_animations=&#8221;no&#8221; item_padding=&#8221;5%&#8221;][vc_column_text css=&#8221;.vc_custom_1697624832372{margin-top: 0px !important;margin-right: 0px !important;margin-bottom: 0px !important;margin-left: 0px !important;border-top-width: 0px !important;border-right-width: 0px !important;border-bottom-width: 0px !important;border-left-width: 0px !important;padding-top: 0px !important;padding-right: 0px !important;padding-bottom: 0px !important;padding-left: 0px !important;}&#8221;]<\/p>\n<h3>Participating consortia of the Medical Informatics Initiative<\/h3>\n<p>[\/vc_column_text][\/qode_elements_holder_item][\/qode_elements_holder][qode_elements_holder number_of_columns=&#8221;one_column&#8221;][qode_elements_holder_item vertical_alignment=&#8221;middle&#8221; horizontal_alignment=&#8221;center&#8221; advanced_animations=&#8221;no&#8221; item_padding=&#8221;0 5%&#8221;][vc_single_image image=&#8221;20362&#8243; img_size=&#8221;&#8221; alignment=&#8221;center&#8221; onclick=&#8221;custom_link&#8221; img_link_target=&#8221;_blank&#8221; qode_css_animation=&#8221;&#8221; 
link=&#8221;https:\/\/www.smith.care\/de\/&#8221;][\/qode_elements_holder_item][qode_elements_holder_item vertical_alignment=&#8221;middle&#8221; horizontal_alignment=&#8221;center&#8221; advanced_animations=&#8221;no&#8221; item_padding=&#8221;0 5%&#8221;][vc_single_image image=&#8221;20368&#8243; img_size=&#8221;&#8221; alignment=&#8221;center&#8221; onclick=&#8221;custom_link&#8221; img_link_target=&#8221;_blank&#8221; css=&#8221;&#8221; qode_css_animation=&#8221;&#8221; link=&#8221;https:\/\/www.difuture.de\/en\/home-2\/&#8221;][\/qode_elements_holder_item][qode_elements_holder_item vertical_alignment=&#8221;middle&#8221; horizontal_alignment=&#8221;center&#8221; advanced_animations=&#8221;no&#8221; item_padding=&#8221;0 5%&#8221;][vc_single_image image=&#8221;20366&#8243; img_size=&#8221;&#8221; alignment=&#8221;center&#8221; onclick=&#8221;custom_link&#8221; img_link_target=&#8221;_blank&#8221; qode_css_animation=&#8221;&#8221; link=&#8221;https:\/\/www.highmed.org\/&#8221;][\/qode_elements_holder_item][qode_elements_holder_item vertical_alignment=&#8221;top&#8221; horizontal_alignment=&#8221;center&#8221; advanced_animations=&#8221;no&#8221; item_padding=&#8221;0 5%&#8221;][vc_single_image image=&#8221;20364&#8243; img_size=&#8221;&#8221; alignment=&#8221;center&#8221; onclick=&#8221;custom_link&#8221; img_link_target=&#8221;_blank&#8221; qode_css_animation=&#8221;&#8221; link=&#8221;https:\/\/www.miracum.org\/&#8221;][\/qode_elements_holder_item][\/qode_elements_holder][\/vc_column][\/vc_row]<\/p>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>[vc_row css_animation=&#8221;&#8221; row_type=&#8221;row&#8221; use_row_as_full_screen_section=&#8221;no&#8221; type=&#8221;grid&#8221; angled_section=&#8221;no&#8221; text_align=&#8221;left&#8221; background_image_as_pattern=&#8221;without_pattern&#8221; z_index=&#8221;&#8221; css=&#8221;.vc_custom_1620639780558{background-color: #f5f6f6 !important;}&#8221;][vc_column][qode_elements_holder 
number_of_columns=&#8221;one_column&#8221;][qode_elements_holder_item advanced_animations=&#8221;no&#8221; item_padding=&#8221;5% 0&#8243; item_padding_480_600=&#8221;20px 0px 20px 0px&#8221; item_padding_480=&#8221;20px 0px 20px 0px&#8221;][vc_column_text] GeMTeX \u2013 German Medical Text Corpus [\/vc_column_text][vc_empty_space height=&#8221;15px&#8221;][vc_column_text css=&#8221;.vc_custom_1688458780272{padding-bottom: 5% !important;}&#8221;] Automatic indexing of medical texts for research [\/vc_column_text][vc_column_text]In everyday clinical&#8230;<\/p>\n","protected":false},"author":13,"featured_media":0,"parent":20504,"menu_order":1,"comment_status":"closed","ping_status":"closed","template":"full_width.php","meta":{"footnotes":""},"class_list":["post-18212","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/pages\/18212","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/users\/13"}],"replies":[{"embeddable":true,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/comments?post=18212"}],"version-history":[{"count":14,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/pages\/18212\/revisions"}],"predecessor-version":[{"id":24961,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/pages\/18212\/revisions\/24961"}],"up":[{"embeddable":true,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/pages\/20504"}],"wp:attachment":[{"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/media?parent=18212"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}