Re: [Gnumed-devel] data pack creator wanted


From: Busser, Jim
Subject: Re: [Gnumed-devel] data pack creator wanted
Date: Thu, 10 Nov 2011 17:08:14 +0000

On 2011-11-10, at 12:54 AM, Eric MAEKER wrote:

>> Could you explain a little more? Do you imply that this is not a good
>> source?
> 
> Well, I tried last year without any success... But maybe the service has
> been updated since then.

Sounds like the service may not have performed well, technically.

I have completed a licence, identifying 'GNUmed project' as the organization
but giving my own address and contact information. I could attach it to the
wiki, or it could live with the data packs.

The terms of the licence seem acceptable. I would draw attention to the
appended clauses #2 and #3:

re # 2 - reporting: my thinking is that (if WHO were to ask each January) I
could post an email to the -announce list asking anyone who is using ICD-10 to
please visit a simple Google form, through which I could collect how
respondents are using ICD-10 and any suggested improvements.

re # 3 - copyright: this could be included (as a comment) in the to-be-created
GNUmed data pack. Should it also be added to the GNUmed server or client
license information?

However, before we can do anything with the downloads (list and data structure
appended), an important question arises about ICD-10... how to implement it? I
have never used it. Judging from the files, is it more complex than selecting a
code from a 'flat' list? Must any selection be informed by the interaction of
hierarchical or contextual data? Can anyone please explain more about how it is
used?

=====================================================

2. Reporting. In lieu of charges, usage fees or royalties Licensee undertakes
to fulfil the following Reporting obligations to WHO: within 30 days from the
end of any calendar year in which Licensee makes use of the Classifications,
Licensee agrees to provide WHO with a brief report on the use of the
Classifications in the research project, any difficulties encountered in using
the Classifications and suggestions that would make the Classifications more
useful to Licensee and its user groups.

3. Copyright. Licensee agrees that in order to protect WHO’s copyright and to
enable the user to order or refer to the published volumes, WHO shall be
acknowledged with an appropriate reference to the Classifications, as follows:

Abbreviated form:
    ICD-10 codes, terms and text © World Health Organization, Third Edition. 2007.
    ICD-O-3 codes, terms and text © World Health Organization, Third Edition. 2007.
    ICF codes, terms and text © World Health Organization, First Edition, 2001.
 

Available Classifications and Formats for Download

Classification                                            Language  Format
ICD-10 2010 version                                       English   ClaML (more info)
ICD-10 2010 version                                       English   Plain text tabular (more info)
ICD-10 2nd Edition / 2008                                 English   ClaML (more info)
ICD-10 2nd Edition / 2008                                 English   Plain text tabular (more info)
ICD-10 Training Package for offline use (148MB zip file)  English   -
ICD-O-2                                                   English   Comma separated text
ICD-O-3


=====================================================
More info: ClaML format and its use in the dissemination of WHO Classifications
 

The "Classification Mark-up Language (ClaML)" is an XML based format designed 
specifically for classifications. It was accepted in 2007 as European norm 
(CEN/TS 14463). Additional details on the specification and use can be found in 
the respective CEN document (www.cen.eu). WHO decided to use this format to 
share its classifications such as the ICD.
 
This format allows us to capture

    * information on the classification hierarchy (i.e. parent-child relations),
    * a level of granularity that allows identifying different rubrics within
      the classification categories (title, includes, excludes, definitions,
      coder instructions, etc. are separated from each other),
    * cross references.
 
 
The main XML elements used in this format are the following:

    * The main element (root element) is "ClaML", for the definition of the
      classification as such.
    * The element "Class" is used for the definition and structuring of the
      chapters, groups and categories. The classes define their parent and
      children classes by using "SuperClass" and "SubClass" elements so that
      the hierarchical representation is captured.
    * Each "Class" may have one or more "Rubric"s, which are used to define
      different aspects of that class. For example, title, inclusions and
      exclusions are separate Rubrics under a "Class" element.
    * A "Reference" tag can be used to identify the cross references within
      the classification.
    * The elements "ModifierClass" and "Modifier" are used for the definition
      and integration of subclassifications (Modifiers); in the ICD-10 these
      are lists of codes for the fourth and possible fifth character of the
      codes.
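
To get a feel for whether ICD-10 is more than a 'flat' list, here is a minimal
Python sketch that reads the "Class", "SuperClass", "SubClass" and "Rubric"
elements described above and walks up the hierarchy. The attribute names
("code", "kind"), the "Label" child element and the sample document are my
assumptions about how ClaML is usually laid out, not something taken from the
WHO download, so they would need checking against the real file.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; the attribute names and <Label> are assumptions.
SAMPLE = """<ClaML version="2.0.0">
  <Class code="A00-A09" kind="block">
    <SubClass code="A00"/>
    <Rubric kind="preferred"><Label>Intestinal infectious diseases</Label></Rubric>
  </Class>
  <Class code="A00" kind="category">
    <SuperClass code="A00-A09"/>
    <SubClass code="A00.0"/>
    <Rubric kind="preferred"><Label>Cholera</Label></Rubric>
    <Rubric kind="exclusion"><Label>illustrative exclusion note</Label></Rubric>
  </Class>
  <Class code="A00.0" kind="category">
    <SuperClass code="A00"/>
    <Rubric kind="preferred"><Label>Cholera due to Vibrio cholerae 01, biovar cholerae</Label></Rubric>
  </Class>
</ClaML>"""


def load_classes(xml_text):
    """Return {code: {"parents": [...], "children": [...], "rubrics": {kind: [text]}}}."""
    root = ET.fromstring(xml_text)
    classes = {}
    for cls in root.iter("Class"):
        rubrics = {}
        for rubric in cls.findall("Rubric"):
            # Collect all text inside the rubric, whatever its child markup.
            text = "".join(rubric.itertext()).strip()
            rubrics.setdefault(rubric.get("kind"), []).append(text)
        classes[cls.get("code")] = {
            "parents": [e.get("code") for e in cls.findall("SuperClass")],
            "children": [e.get("code") for e in cls.findall("SubClass")],
            "rubrics": rubrics,
        }
    return classes


def ancestors(classes, code):
    """Follow the first SuperClass link upwards until none is left."""
    path = []
    while code in classes and classes[code]["parents"]:
        code = classes[code]["parents"][0]
        path.append(code)
    return path


classes = load_classes(SAMPLE)
print(ancestors(classes, "A00.0"))
print(classes["A00"]["rubrics"])
```

If the real files look like this, a picker in GNUmed could show the title
rubric and still have the inclusion and exclusion rubrics of the class and its
parents at hand.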
=====================================================
More info: ICD-10 Meta Data Format
 
ICD-10 meta data is a special excerpt of the Four Character Version of ICD-10.
The files contain all three and four character codes with their titles and the
blocks and chapters. This information is linked to the Special Tabulation Lists
from the appendix of Volume 1.
 
Meta data can be used e.g. to
 
    * translate codes into texts,
    * tabulate codes based on ICD blocks, chapters and Special Tabulation Lists,
    * check codes for formal correctness.
 
Meta data should not be used as coding software because important parts of
ICD-10 are missing: without the inclusion and exclusion notes, coding may be
incorrect.
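
As a rough illustration of the three uses listed above (translate, tabulate,
check), here is a small Python sketch. The in-memory table META, the helper
names and the three sample entries are hypothetical stand-ins for whatever
would be loaded from the meta data files; this is a lookup aid only, not
coding software.

```python
from collections import Counter

# Hypothetical lookup table: code -> (chapter number, title).
META = {
    "A00.0": ("01", "Cholera due to Vibrio cholerae 01, biovar cholerae"),
    "I10": ("09", "Essential (primary) hypertension"),
    "J45.9": ("10", "Asthma, unspecified"),
}


def title_of(code):
    """Translate a code into its title, or None if the code is unknown."""
    entry = META.get(code)
    return entry[1] if entry else None


def is_known(code):
    """Formal check only: does the code exist in the loaded meta data?"""
    return code in META


def tabulate_by_chapter(codes):
    """Count occurrences per chapter, silently skipping unknown codes."""
    return Counter(META[c][0] for c in codes if c in META)


print(title_of("I10"))
print(is_known("Z99.99"))
print(tabulate_by_chapter(["I10", "I10", "J45.9", "bogus"]))
```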
 
Data Structure
 
All files are available in extended ASCII code and can be used for import into
relational data bases. Meta data are split into several files:
 
    * CHAPTERS.TXT
          * Field 1: chapter number, 2 characters
          * Field 2: chapter title, up to 110 characters
    * BLOCKS.TXT
          * Field 1: first three character code of a block, 3 characters
          * Field 2: block title, up to 210 characters
    * MORBL.TXT: special tabulation list for morbidity
          * Field 1: code
          * Field 2: title
    * MORTL1_1.TXT: blocks of the special tabulation list for mortality 1
          * Field 1: block code
          * Field 2: block title
    * MORTL1_2.TXT: special tabulation list for mortality 1
          * Field 1: code
          * Field 2: block code
          * Field 3: title
    * MORTL2.TXT: special tabulation list for mortality 2
          * Field 1: code
          * Field 2: title
    * MORTL3_1.TXT: blocks of the special tabulation list for mortality 3
          * Field 1: block code
          * Field 2: block title
    * MORTL3_2.TXT: special tabulation list for mortality 3
          * Field 1: code
          * Field 2: block code
          * Field 3: title
    * MORTL4.TXT: special tabulation list for mortality 4
          * Field 1: code
          * Field 2: title
    * CODES.TXT
          * Field 1: level in the hierarchy of the classification, 1 character
                * 3 = three character code
                * 4 = four character code
                * 5 = five character code
          * Field 2: place in the classification tree, 1 character
                * T = terminal node (leaf node, valid for coding)
                * N = non-terminal node (not valid for coding)
          * Field 3: type of code in WHO edition, 1 character
                * P = valid as primary code
                * O = only valid as secondary, optional code
                * V = not valid for coding
          * Field 4: type of terminal node, 1 character
                * X = explicitly listed in the classification (pre-combined)
                * S = derived from a subclassification (post-combined)
          * Field 5: chapter number, 2 characters
          * Field 6: first three character code of a block, 3 characters
          * Field 7: code without possible dagger, up to 6 characters
          * Field 8: like field 7, without possible asterisk
          * Field 9: like field 8, without dot
          * Field 10: title, up to 255 characters (four character subdivisions
            of Y83 had to be abbreviated)
          * Field 11: reference to special tabulation list for mortality 1
          * Field 12: reference to special tabulation list for mortality 2
          * Field 13: reference to special tabulation list for mortality 3
          * Field 14: reference to special tabulation list for mortality 4
          * Field 15: reference to special tabulation list for morbidity
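
For a data pack, the first ten CODES.TXT fields could be read roughly as
sketched below in Python. The description above does not state the field
separator, so the semicolon default is an assumption to verify against the
actual download, as is reading the "extended ASCII" files as latin-1. The
IcdCode type, the helper names and the two sample rows (with the tabulation
list references left empty) are made up for illustration.

```python
import csv
import io
from typing import NamedTuple


class IcdCode(NamedTuple):
    level: str             # field 1: 3, 4 or 5
    node: str              # field 2: T = terminal, N = non-terminal
    code_type: str         # field 3: P, O or V
    terminal_type: str     # field 4: X or S
    chapter: str           # field 5
    block: str             # field 6
    code_no_dagger: str    # field 7
    code_no_asterisk: str  # field 8
    code_no_dot: str       # field 9
    title: str             # field 10


def parse_codes(stream, delimiter=";"):
    """Yield IcdCode records; the delimiter is an assumption, not documented above."""
    reader = csv.reader(stream, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 10:
            continue  # skip blank or malformed lines
        yield IcdCode(*(field.strip() for field in row[:10]))


def is_codable(rec):
    """Terminal node (field 2) and not marked invalid for coding (field 3)."""
    return rec.node == "T" and rec.code_type != "V"


# Illustrative rows only; fields 11-15 are deliberately left empty.
SAMPLE = (
    "3;N;P;X;01;A00;A00;A00;A00;Cholera;;;;;\n"
    "4;T;P;X;01;A00;A00.0;A00.0;A000;"
    "Cholera due to Vibrio cholerae 01, biovar cholerae;;;;;\n"
)

records = list(parse_codes(io.StringIO(SAMPLE)))
print([r.code_no_asterisk for r in records if is_codable(r)])
```

For the real file one would pass something like
open("CODES.TXT", encoding="latin-1", newline="") instead of the StringIO
sample, after checking the separator and encoding by hand.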
 

-- Jim

