Healthbase blog: musings on ehealth...

NEHTA publishes flawed pathology terminology

On its shiny new website, under About NEHTA, NEHTA now starts its strategy statement with: “With the foundations built, the infrastructure in place…”.

Nothing can be further from the truth. No amount of shouting from the rooftops, even by NEHTA’s Chairman, can change the facts. Most of us involved at the coalface of e-health know only too well that the foundations are not built and that the infrastructure simply is not in place!

Over the past decade, NEHTA has really struggled to develop and introduce useful terminology products for the Australian e-health community. The few it has developed, such as the Australian Medicines Terminology, have been largely ignored by vendors, jurisdictions, systems integrators and others. When it comes to supporting diagnostic tests, NEHTA walked away from leading the terminology foundation development for years and years.

In an attempt to keep some momentum going for the pathology sector, the Royal College of Pathologists of Australasia (RCPA) championed, with some DoH funding, a project to develop Australian standards for pathology requesting and reporting terminologies, including units. The project spanned several years, from 2011 through 2014, and has some published documentation and value sets at https://www.rcpa.edu.au/Library/Practising-Pathology/PTIS. It represents a considerable corpus of work undertaken collaboratively by many individuals, often voluntarily. Michael Legg and The College are to be congratulated for bringing so many people together to produce that corpus.

Yet never in that period do I recall seeing a single statement from NEHTA about its involvement in the project. It seemed, from the outside, that pathology was a no-go area. If that is the “lead organisation supporting a national vision for eHealth for Australia”, then it has certainly been wearing a very thick blindfold.

So it came as an exciting surprise yesterday evening, after I downloaded my monthly release of the NEHTA AMT, to find at the bottom of their new Terminology Access page for Implementers a collection of 6 value sets under the heading RCPA derived value sets. I went looking for the documentation, but could find none. Because I had previewed some of the products on the RCPA site over summer, I thought I’d look at what NEHTA had done. Had they fixed any of the errors in the original material? Had they turned the incomplete products into quality products that the pathology sector could safely and reliably adopt? As it turns out, I think they have made things worse.

The single biggest problem is that there is simply no documentation to explain to customers what the purpose of each product is, who the audience is, what the provenance is, how the product is to be used, how it relates to the corresponding artefact on the RCPA web-site, and so on. With that caveat, I can flag the following issues concerning just a single product, the Organism value set, containing RCPA preferred names with mappings to SNOMED CT-AU. The naming of organisms, particularly pathogens reported for infection control, communicable disease monitoring and surveillance, and epidemiological studies, is both complex and extremely important. Using automated techniques to speed up communication relies on consistent and published names and codes.

Issues with publication forms

NEHTA have changed the distribution format to a tab-separated values (TSV) file, altering the order of columns and stripping leading and trailing space characters from the RCPA preferred names. This may well be an improved distribution form, but in doing so NEHTA have created subtly different RCPA-preferred terms for some 66 organisms. Sure, the RCPA spreadsheet was faulty in this respect and should never have been published with those flaws, but NEHTA’s “improvements” are undocumented. There now seem to be two separate sources of truth.
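To make the whitespace problem concrete, here is a minimal Python sketch of how an implementer might list the names that differ between the two distributions only by leading or trailing spaces. The function name and the sample names are hypothetical; neither file's actual layout is assumed, only that each source can be read into a list of preferred-name strings.

```python
def whitespace_only_differences(rcpa_names, nehta_names):
    """Return RCPA names whose stripped form appears in the NEHTA list
    but whose raw form (with stray spaces) does not match it exactly."""
    nehta = set(nehta_names)
    return [
        name
        for name in rcpa_names
        if name != name.strip() and name.strip() in nehta
    ]


# Made-up placeholder names, not taken from either published file.
rcpa = ["Examplea alpha ", " Examplea beta", "Examplea gamma"]
nehta = ["Examplea alpha", "Examplea beta", "Examplea gamma"]

# repr() makes the stray spaces visible when printing.
for name in whitespace_only_differences(rcpa, nehta):
    print(repr(name))
```

A receiving system that compares names with simple string equality would treat each reported name as a mismatch between the two sources.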

The RCPA XLSX spreadsheet has, for each organism (bar Neisseria meningitidis group D), the SNOMED code + preferred name pair. NEHTA instead report the SNOMED concept code and Fully Specified Name (FSN) in separate columns. Some implementers may take the SNOMED preferred terms from the RCPA spreadsheet, whilst others may take the “official” SNOMED FSN from the NEHTA distribution. The two sets do not match. The RCPA SNOMED preferred terms seem to have spelling errors.

Issues with SNOMED mapping

I found 3 pairs of duplicate SNOMED conceptID entries. That may well be the intention of the PTIS project, but it is confusing for implementers. Are these errors? Does it simply mean that in each case the RCPA would like to nominate 2 preferred terms for the same SNOMED concept? If implementers are unaware, some may simply load the organisms into a dict or map variable, {conceptID, RCPA_name} keyed by conceptID, thus overwriting/deleting 3 RCPA preferred names. There is no documentation, from either RCPA or NEHTA, on what the integrity constraints should be.
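The overwriting risk is easy to reproduce. The Python sketch below uses made-up concept IDs and names (not values from the published value set) to show how a naive dict load silently drops a preferred name, and one way to detect the duplicates before loading. The helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (conceptID, RCPA preferred name) rows; values are invented.
rows = [
    ("1001", "Organism alpha"),
    ("1002", "Organism beta"),
    ("1001", "Organism alpha variant"),  # same conceptID, second name
]

# Naive load: the later row replaces the earlier one without warning.
naive = dict(rows)


def group_by_concept(rows):
    """Keep every preferred name seen for each conceptID."""
    grouped = defaultdict(list)
    for concept_id, name in rows:
        grouped[concept_id].append(name)
    return dict(grouped)


def duplicated_concepts(rows):
    """Return only the conceptIDs that carry more than one preferred name."""
    return {
        concept_id: names
        for concept_id, names in group_by_concept(rows).items()
        if len(names) > 1
    }


print(len(rows), "rows loaded into", len(naive), "dict entries")
print(duplicated_concepts(rows))
```

Whether a conceptID should be allowed more than one preferred name is exactly the integrity constraint that needs documenting; the check only makes the question visible.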

I found that 251 of the 2960 RCPA preferred organism names have been mapped to SNOMED concepts that are inactive in the current Australian SNOMED CT-AU release! I didn’t bother checking whether these mappings were already incorrect at the time of initial mapping by the PTIS project, but NEHTA should have caught them before publishing their version.
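A check of this kind can be automated. The sketch below is one possible approach, not the method used for the count above: it assumes a SNOMED CT RF2 concept snapshot file (tab-separated, with id, effectiveTime, active, moduleId and definitionStatusId columns) and a list of (conceptID, name) mappings. The function names and all sample values are hypothetical.

```python
import csv
import io


def active_concept_ids(rf2_concept_snapshot):
    """Collect ids of concepts flagged active ('1') in an RF2 concept snapshot.

    `rf2_concept_snapshot` is any iterable of text lines, e.g. an open file.
    """
    reader = csv.DictReader(
        rf2_concept_snapshot, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return {row["id"] for row in reader if row["active"] == "1"}


def inactive_mappings(mappings, active_ids):
    """Return mappings whose conceptID is not active.

    Note: this also flags ids that are missing from the snapshot entirely.
    """
    return [(cid, name) for cid, name in mappings if cid not in active_ids]


# Invented two-row snapshot for illustration only.
SAMPLE = (
    "id\teffectiveTime\tactive\tmoduleId\tdefinitionStatusId\n"
    "1001\t20020131\t1\t900\t901\n"
    "1002\t20150131\t0\t900\t901\n"
)

active = active_concept_ids(io.StringIO(SAMPLE))
print(inactive_mappings([("1001", "Organism alpha"), ("1002", "Organism beta")], active))
```

Run against each new SNOMED CT-AU release, a check like this would show which mappings need review as concepts are retired.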

Issues with RCPA preferred names

These aren’t issues caused by NEHTA, but do appear to indicate that NEHTA provided no additional quality control steps or review and simply took the RCPA names at face value.

RCPA provides no consistency in the way it designates species, subspecies, etc. This is despite the PTIS project’s own published guidelines: “G5.15 For species, sp (italicised) should be used to identify a single species whose identity is not known. For groups of species spp (italicised) should be used to identify the set of species within a genus. …” It is unacceptable to place the burden onto implementers to deal with a mishmash of spellings and abbreviations for common qualifiers. A quick scan of the literature suggests to me that abbreviations should be used at least for

  • sp. species (singular)
  • spp. species (plural)
  • subsp. subspecies (singular)
  • subspp. subspecies (plural)

Instead, I have seen in the RCPA organism list: species, sp, sp., spp, Species, spp., Sp., ssp, subsp, subsp., subspecies, susbsp
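Until the source is cleaned up, an implementer could normalise these qualifiers on load. The following Python sketch is one possible workaround under my own assumptions: the mapping table is mine, not an RCPA or NEHTA rule, and the spelled-out word "species" is deliberately left untouched because it cannot be told from the text alone whether sp. or spp. is meant. The genus name in the examples is a placeholder.

```python
import re

# Assumed mapping from observed variants to the abbreviations suggested above.
CANONICAL = {
    "sp": "sp.",
    "spp": "spp.",
    "ssp": "subsp.",
    "subsp": "subsp.",
    "susbsp": "subsp.",  # misspelling seen in the list
    "subspecies": "subsp.",
}

# Match a whole-word variant, case-insensitively, plus an optional trailing dot.
_QUALIFIER = re.compile(
    r"\b(sp|spp|ssp|subsp|susbsp|subspecies)\b\.?", re.IGNORECASE
)


def normalise_qualifiers(name):
    """Rewrite rank qualifiers in an organism name to one canonical form."""
    return _QUALIFIER.sub(lambda m: CANONICAL[m.group(1).lower()], name)


for raw in ["Examplea sp", "Examplea Sp.", "Examplea spp", "Examplea a susbsp b"]:
    print(raw, "->", normalise_qualifiers(raw))
```

Normalising on the consumer side only hides the inconsistency; every implementer doing it slightly differently is the burden described above.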

I’m no bacteriologist, but I did find quite a list of other suspect RCPA preferred organism names, including:

  • Alcaligenes johnsonii
  • Alcaligenes baumanii
  • Salmonella Badagry
  • Peptostreptococcus anaerobiu
  • Citrobacter intermedia

as well as suspect mappings (SNOMED conceptID | SNOMED FSN | RCPA preferred name):

  • 113982002 | Streptococcus dysgalactiae subspecies dysgalactiae (organism) | Streptococcus equisimilis
  • 91288006 | Acinetobacter baumannii (organism) | Alcaligenes baumanii

Summary

There are many in the pathology sector crying out for quality value sets. They have been for more than a decade. There was some progress with the PTIS projects. Unfortunately, the quality is not sufficient for these to be usable in practice, and there are no defined processes for improving and managing these artefacts. They need to be improved. They need to be curated and managed to support the ever changing science, technology, needs and expectations of users. We seem to be at the mercy of national governance bodies who simply do not have the expertise, nor the willingness, nor the vision to help.

Acknowledgement: I was only able to make as much progress as I did today in analysing the value sets because of the beautiful and efficient SHRIMP SNOMED CT Browser developed by Dr Michael Lawley and colleagues and provided free by the CSIRO.
