{"id":18452,"date":"2023-06-01T16:01:21","date_gmt":"2023-06-01T14:01:21","guid":{"rendered":"https:\/\/www.smith.care\/?p=18452"},"modified":"2023-06-02T10:33:30","modified_gmt":"2023-06-02T08:33:30","slug":"annotation-guidelines","status":"publish","type":"post","link":"https:\/\/www.smith.care\/en\/2023\/06\/01\/annotation-guidelines\/","title":{"rendered":"SMITH project helps to make medical texts usable for automated analysis"},"content":{"rendered":"\n<p>Routine care generates large volumes of medical text that contain valuable information about patients. However, the wording, content and structure of medical documentation can vary greatly between different institutions, making it unusable for digital programmes or analysis across different locations. In the SMITH consortium, the NLP project has addressed this problem by publishing guidelines for preparing medical texts so that they can be used for medical research and care, using methods of natural language processing (NLP).<\/p>\n\n\n\n<p>In all German hospitals, doctors&#8217; letters are written at the time of transfer and discharge to provide information about the patient to the doctors who continue to treat him or her. These letters are an essential part of the medical record and contain information about the reason for the treatment, details of the patient&#8217;s medical history, allergies, previous illnesses, family diagnoses, therapies already carried out, medications prescribed to date and also information about further treatment. This information can be of great value not only to the doctors treating the patient, but also to medical research and to the patients themselves.<\/p>\n\n\n\n<p>In order for these medical documentation texts to be used across sites, they must be readable by digital programmes. However, the wording of medical texts is highly dependent on the institution, the medical speciality and also the person who writes them. 
In addition, they are not standardised either structurally or in terms of content. Automated capture of details from texts such as doctors&#8217; letters and discharge summaries, for example descriptions of medicines, their frequency of use (daily, three times a day) or form of administration (as tablets, drops), is therefore hardly possible without prior preparation.<\/p>\n\n\n\n<p>Automated text analysis methods can make the content of such texts usable for technical information systems as well as for treating doctors and patients. However, this requires that such NLP systems have access to sufficient text material to enable automatic analysis.<\/p>\n\n\n\n<p>With the help of special computer programmes, so-called annotation tools such as <a href=\"https:\/\/github.com\/nlplab\/brat\">Brat<\/a> or <a href=\"https:\/\/github.com\/inception-project\/inception\">INCEpTION<\/a>, trained personnel manually mark text passages according to defined content specifications. These markings, also known as annotations, contain clues about the structure and content of the texts and form the basis for the statistical models that modern NLP systems use for their analyses. The creation of such annotations is governed by annotation guidelines. During the first funding phase of the Medical Informatics Initiative (MII), several such guidelines were developed for the annotation of German discharge summaries. They focus on the following tasks:<\/p>\n\n\n\n<p>1) Structural labels for text passages that indicate whether a passage contains, for example, a salutation, an anamnesis, the administration of medication or the course of a hospital stay.<a href=\"#_ftn1\" id=\"_ftnref1\"><sup>[1]<\/sup><\/a><\/p>\n\n\n\n<p>2) Person-identifying characteristics, that is, all descriptions that allow conclusions to be drawn about an individual patient and must therefore be subsequently anonymised for data protection reasons (e.g. 
personal names, address data or dates).<a href=\"#_ftn2\" id=\"_ftnref2\"><sup>[2]<\/sup><\/a><\/p>\n\n\n\n<p>3) Descriptions of key medical categories such as diagnoses, symptoms and findings;<a id=\"_ftnref3\" href=\"#_ftn3\"><sup>[3]<\/sup><\/a><\/p>\n\n\n\n<p>4) Descriptions of medications, including dosage (e.g., 50 mg, 1\/2 tablet), frequency (e.g., three times a day), route of administration (e.g., oral), duration, and reason;<a id=\"_ftnref4\" href=\"#_ftn4\"><sup>[4]<\/sup><\/a><\/p>\n\n\n\n<p>5) Additional medical content categories (e.g. descriptions of anatomical locations, medical tests and procedures, treatment methods) and the relations that link these categories to each other;<\/p>\n\n\n\n<p>6) Temporal references between categories and their relationships &#8211; all references to points in time and the sequence of clinical events &#8211; with the aim of being able to automatically map the information contained in the doctor&#8217;s letter onto a timeline;<\/p>\n\n\n\n<p>7) Descriptions of the certainty or uncertainty and the exclusion (negation) of statements, for example whether a diagnosis is formulated on a tentative basis or excluded altogether.<\/p>\n\n\n\n<p>The first four of these seven annotation guidelines have now been published at the end of the first phase of the MII. They can serve as a starting point for the cross-consortium project German Medical Textcorpus (GeMTeX). 
GeMTeX starts in June 2023 and will build a German clinical reference corpus at six university hospitals (Leipzig, TU Munich, Essen, Berlin, Dresden and Erlangen).<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.smith.care\/wp-content\/uploads\/2023\/05\/Screenshot-Annotationsguideline-4.png\" alt=\"\" class=\"wp-image-18398\" width=\"796\" height=\"364\" srcset=\"https:\/\/www.smith.care\/wp-content\/uploads\/2023\/05\/Screenshot-Annotationsguideline-4.png 796w, https:\/\/www.smith.care\/wp-content\/uploads\/2023\/05\/Screenshot-Annotationsguideline-4-300x137.png 300w, https:\/\/www.smith.care\/wp-content\/uploads\/2023\/05\/Screenshot-Annotationsguideline-4-768x351.png 768w, https:\/\/www.smith.care\/wp-content\/uploads\/2023\/05\/Screenshot-Annotationsguideline-4-700x320.png 700w\" sizes=\"auto, (max-width: 796px) 100vw, 796px\" \/><\/figure><\/div>\n\n\n\n<pre class=\"wp-block-verse has-text-align-center\"><figure class=\"wp-block-table has-small-font-size\"><em>Source: <a rel=\"noreferrer noopener\" href=\"https:\/\/zenodo.org\/record\/7707947\" target=\"_blank\">Matthies et al. Annotationsleitlinien f\u00fcr deutschsprachige Medizintexte. Teil 4: Annotation von Medikationsgaben<\/a>, S. 
15<\/em><\/figure><\/pre>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td class=\"has-text-align-right\" data-align=\"right\">_______________<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<figure class=\"wp-block-table has-small-font-size\"><table><tbody><tr><td class=\"has-text-align-left\" data-align=\"left\"><a id=\"_ftn3\" href=\"#_ftnref3\"><sup>[1]<\/sup><\/a> Annotation guideline: <a rel=\"noreferrer noopener\" href=\"https:\/\/doi.org\/10.5281\/zenodo.7707756\" target=\"_blank\">https:\/\/doi.org\/10.5281\/zenodo.7707756<\/a> \/ Publication: Christina Lohr, Stephanie Luther, Franz Matthies, Luise Modersohn, Danny Ammon, Kutaiba Saleh, Andreas G. Henkel, Michael Kiehntopf, and Udo Hahn:<a href=\"https:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6371337\/\"> <\/a><a rel=\"noreferrer noopener\" href=\"https:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6371337\/\" target=\"_blank\">CDA-Compliant Section Annotation of German-Language Discharge Summaries: Guideline Development, Annotation Campaign, Section Classification.<\/a> In: AMIA Annual Symposium Proceedings 2018, San Francisco, USA, Nov 3-7.<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a id=\"_ftn4\" href=\"#_ftnref4\"><sup>[2]<\/sup><\/a> Annotation guideline: <a rel=\"noreferrer noopener\" href=\"https:\/\/doi.org\/10.5281\/zenodo.7707882\" target=\"_blank\">https:\/\/doi.org\/10.5281\/zenodo.7707882<\/a> \/ Publication: Tobias Kolditz, Christina Lohr, Johannes Hellrich, Luise Modersohn, Boris Betz, Michael Kiehntopf, Udo Hahn:<a rel=\"noreferrer noopener\" href=\"http:\/\/ebooks.iospress.nl\/volumearticle\/51977\" target=\"_blank\"> Annotating German Clinical Documents for De-Identification<\/a> (<a rel=\"noreferrer noopener\" href=\"https:\/\/www.iospress.nl\/book\/medinfo-2019-health-and-wellbeing-e-networks-for-all\/\" target=\"_blank\">MedInfo 2019 Aug 25-30 Lyon France<\/a>) [<a rel=\"noreferrer noopener\" 
href=\"https:\/\/julielab.de\/downloads\/publications\/slides\/lohr-2019-medinfo-de-id-slides.pdf\" target=\"_blank\">Slides<\/a>]<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a id=\"_ftn5\" href=\"#_ftnref5\"><sup>[3]<\/sup><\/a> Annotation guideline: <a rel=\"noreferrer noopener\" href=\"https:\/\/doi.org\/10.5281\/zenodo.7707917\" target=\"_blank\">https:\/\/doi.org\/10.5281\/zenodo.7707917<\/a> \/ Publication: Christina Lohr, Luise Modersohn, Johannes Hellrich, Tobias Kolditz, Udo Hahn:<a rel=\"noreferrer noopener\" href=\"http:\/\/ebooks.iospress.nl\/publication\/54118\" target=\"_blank\"> An Evolutionary Approach to the Annotation of Discharge Summaries<\/a>. In: Studies in Health Technology and Informatics, Vol. 270: Digital Personalized Health and Medicine &#8211; Proceedings of <a rel=\"noreferrer noopener\" href=\"https:\/\/efmi.org\/2020\/06\/11\/mie2020-conference-proceedings-and-materials\/\" target=\"_blank\">MIE 2020<\/a><\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a id=\"_ftn6\" href=\"#_ftnref6\"><sup>[4]<\/sup><\/a> Annotation guideline: <a rel=\"noreferrer noopener\" href=\"https:\/\/doi.org\/10.5281\/zenodo.7707947\" target=\"_blank\">https:\/\/doi.org\/10.5281\/zenodo.7707947<\/a> \/ Publication: Udo Hahn, Franz Matthies, Christina Lohr, Markus L\u00f6ffler.<a rel=\"noreferrer noopener\" href=\"http:\/\/ebooks.iospress.nl\/volumearticle\/48747\" target=\"_blank\"> 3000PA-Towards a National Reference Corpus of German Clinical Language<\/a>. In: Studies in Health Technology and Informatics, Vol. 247: Building Continents of Knowledge in Oceans of Data: The Future of Co-Created eHealth &#8211; Proceedings of MIE 2018, Gothenburg, Sweden, April 24-26 2018. 
[<a rel=\"noreferrer noopener\" href=\"https:\/\/julielab.de\/downloads\/publications\/slides\/lohr2018-mie-3000PA-slides.pdf\" target=\"_blank\">Slides<\/a>]<\/td><\/tr><\/tbody><\/table><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>Routine care generates large volumes of medical text that contain valuable information about patients. However, the wording, content and structure of medical documentation can vary greatly between different institutions, making it unusable for digital programmes or analysis across different locations. In the SMITH consortium, the&#8230;<\/p>\n","protected":false},"author":14,"featured_media":18402,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-18452","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/posts\/18452","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/users\/14"}],"replies":[{"embeddable":true,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/comments?post=18452"}],"version-history":[{"count":7,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/posts\/18452\/revisions"}],"predecessor-version":[{"id":18467,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/posts\/18452\/revisions\/18467"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/media\/18402"}],"wp:attachment":[{"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/media?parent=18452"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/categories?post=18452"},{"taxonomy":"post
_tag","embeddable":true,"href":"https:\/\/www.smith.care\/en\/wp-json\/wp\/v2\/tags?post=18452"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}